
518 F.3d 800 (2008)
UNITED STATES of America, Plaintiff-Appellant,
v.
Christopher Wayne SMART, Defendant-Appellee.
No. 06-6120.
United States Court of Appeals, Tenth Circuit.
March 4, 2008.
*801 Randal A. Sengel, Office of the United States Attorney, Oklahoma City, Oklahoma (John C. Richter, United States Attorney, with him on the briefs), for the Plaintiff-Appellant.
Fred L. Staggs, Oklahoma City, Oklahoma, for the Defendant-Appellee.
Before HENRY, LUCERO, and HARTZ, Circuit Judges.
LUCERO, Circuit Judge.
Christopher Wayne Smart was convicted of inducing a minor to engage in sexually explicit conduct for the purpose of producing videotapes depicting such conduct in violation of 18 U.S.C. § 2251(a). Exercising its discretion under United States v. Booker, 543 U.S. 220, 125 S.Ct. 738, 160 L.Ed.2d 621 (2005), Smart's sentencing court concluded that his United States Sentencing Guidelines ("Guidelines") range of 168 to 210 months' imprisonment overstated the seriousness of his offense, and varied downward, imposing a sentence of 120 months' imprisonment. The government appeals.
*802 We review this exercise of district court sentencing discretion under the recent Supreme Court holdings in Gall v. United States, ___ U.S. ___, 128 S.Ct. 586, 169 L.Ed.2d 445 (2007), and Kimbrough v. United States, ___ U.S. ___, 128 S.Ct. 558, 169 L.Ed.2d 481 (2007), which substantially invalidate the rigorous form of review our circuit announced in United States v. Garcia-Lara, 499 F.3d 1133 (10th Cir.2007). Applying a deferential abuse of discretion standard, we AFFIRM.

I
Smart was indicted, along with his codefendants Kevin "Tiny" Fields and Robert Rousey, on August 17, 2005, by a grand jury in the Western District of Oklahoma. He was charged with a single count of producing videotapes depicting the sexual abuse of a minor in violation of 18 U.S.C. § 2251(a). All charges in the case stemmed from a March 2001 investigation by officers in the El Reno, Oklahoma, police department of suspected sexual abuse by Rousey of a 13-year-old girl who lived with him intermittently. In November 2000, the defendants had videotaped the girl having sex with the three of them and a woman, Johnita Lynn Wheeler.
Rousey pleaded guilty to two counts under § 2251(a) and was sentenced to the statutory minimum of 120 months' imprisonment. His Guidelines range was 110 to 137 months, based on a total offense level of 26 and a criminal history category of V. Unlike Rousey, Smart chose to exercise his right to trial. At trial, the victim testified to a jury that Smart, Rousey, and Fields all knew that she was less than 18 years old at the time of the offense. Wheeler testified that she was concerned about engaging in group sex with someone who looked so young, and that she asked Rousey about the victim's age. She further testified that Rousey told her the girl was 16 and this had been confirmed by Smart. Smart took the stand in his own defense and testified that he had never spoken with either Fields or Rousey about the victim's age, and that he had not told Wheeler that the victim was 16 years old. Smart was convicted as charged.
At sentencing, the district court accepted the Guidelines calculation in Smart's presentence report ("PSR"), which determined that Smart's total adjusted offense level was 31. That offense level reflected a base level of 27, a two-point enhancement due to the age of the victim, see U.S.S.G. § 2G2.1(b)(1), and a further two-point increase for obstruction of justice, see § 3C1.1. Smart's criminal history category was V. He requested, but did not receive, a two-point reduction for acceptance of responsibility under § 3E1.1. Accordingly, Smart's Guidelines range was 168 to 210 months' imprisonment.
The district court sentenced Smart to 120 months' imprisonment, a downward variance of 48 months below the bottom of his Guidelines range. After referring to the factors set forth in 18 U.S.C. § 3553(a), the court offered its reasons for granting Smart a downward variance. Initially, the court addressed the difference in culpability between Smart and Rousey, and the nature and seriousness of Smart's conduct, stating:
I think [Smart's] conduct . . . does fall somewhere between Mr. Fields and Mr. Rousey. It was obvious to me that Mr. Rousey was I guess the lead instigator of this in trying to make some sex films and so forth.
The court then expressed discomfort with imposing a higher sentence on Smart than on his codefendant based on Smart's decision to proceed to trial:
I also agree that while you do get the benefit of accepting responsibility and avoiding trial and that in the Guidelines *803  . . . that's a consideration the court takes, I also agree that you do have a right to go to trial. I guess I would put it this way, that while you get the benefit of your plea agreement if you plead, I don't necessarily think that you should be punished because you exercised your right to a trial by jury.
Finally, the court elaborated further on its initial point:
I feel it would violate [18 U.S.C. § 3553] . . . if you received a far greater sentence than Mr. Rousey. I believe that the disparity would be a violation of that section. And I find, in reviewing the overall case, your involvement as opposed to Mr. Fields' and Mr. Rousey's and the others, I do not feel that you should receive a greater sentence than Mr. Rousey. As I stated, he was obviously the instigator and the promoter of this whole event and got others involved, including the under-age girl, and it was his contact with her which created the whole situation. And to avoid any disparity, great unwarranted disparity in the sentences among the defendants based upon their involvement in this episode, and in meeting the other standards, the court finds a reasonable sentence should be that Christopher Wayne Smart is hereby committed to the custody of the Bureau of Prisons to be imprisoned for a term of 120 months.
The district court explained, "I feel that that sentence is reasonable and that it reflects the seriousness of the offense, promotes respect for the law, and provides just punishment. Certainly, that 10-year sentence would afford adequate deterrence to criminal conduct and the public would be protected." The government argues that the sentence is unreasonable.

II
Since the Supreme Court's decision in Booker, which relegated the Sentencing Guidelines to an advisory status, district courts have been free to apply any sentence that is "reasonable" under the sentencing factors listed at 18 U.S.C. § 3553(a). See 543 U.S. at 261, 125 S.Ct. 738. Our appellate review for reasonableness includes both a procedural component, encompassing the method by which a sentence was calculated, as well as a substantive component, which relates to the length of the resulting sentence. United States v. Kristl, 437 F.3d 1050, 1055 (10th Cir.2006).
The government does not specify whether it challenges the procedural or substantive component of Smart's sentence. Rather, it contends generally that the district court's reliance on two "legally erroneous" sentencing factors rendered Smart's sentence unreasonable. We conclude that this assertion raises a challenge to both aspects of the reasonableness of Smart's sentence.
In Gall, the Supreme Court identified "failing to consider the § 3553(a) factors" and "failing to adequately explain the chosen sentence" as forms of procedural error. 128 S.Ct. at 597. Section 3553(a) lists in broad and general terms the factors which district courts must account for during sentencing, and encompasses the vast majority of considerations courts have traditionally treated as relevant in setting sentences. The error asserted here is not a failure to evaluate these factors, but rather, a related error: consideration by the district court of legally erroneous factors.
We agree that if a district court bases a sentence on a factor not within the categories set forth in § 3553(a), this would indeed be one form of procedural error. Section 3553(a) mandates consideration of its enumerated factors, and implicitly *804 forbids consideration of factors outside its scope. § 3553(a) ("The court, in determining the particular sentence to be imposed, shall consider [the listed factors]."); see also United States v. Roberson, 517 F.3d 990, 994, 2008 WL 323223, at *1 (8th Cir. Feb.7, 2008) (procedural sentencing error includes "giving significant weight to an irrelevant or improper factor").[1]
Because the government also questions whether Smart's sentence can be supported in the absence of the allegedly "improper" factors it identifies, it has also raised a substantive reasonableness challenge. A challenge to the sufficiency of the § 3553(a) justifications relied on by the district court implicates the substantive reasonableness of the resulting sentence. United States v. Conlan, 500 F.3d 1167, 1169 (10th Cir.2007); see also Gall, 128 S.Ct. at 597 (in conducting substantive reasonableness review, appellate courts must deferentially examine the district court's determination that "the § 3553(a) factors, on a whole, justify the extent of the variance").

III
As directed by the Supreme Court, we begin by considering the procedural reasonableness of the sentence imposed. Gall, 128 S.Ct. at 597 (appellate courts "must first ensure that the district court committed no significant procedural error"). The government raises two factors that, it contends, were improper under § 3553(a). First, the government argues that under § 3553(a)(6), the district court could not take into account the disparity between Smart's sentence and that of his codefendant. Section 3553(a)(6) mandates consideration of "the need to avoid unwarranted sentence disparities among defendants with similar records who have been found guilty of similar conduct."
After Gall, it is clear that codefendant disparity is not a per se "improper" factor, such that its consideration would constitute procedural error. The Court approvingly noted that the district court below had inquired about the sentences Gall's codefendants received, and stated that district courts may "consider[ ] the need to avoid unwarranted similarities among [codefendants] who [are] not similarly situated," despite falling under the same or similar Guidelines sentencing ranges. 128 S.Ct. at 600. It follows that a district court may also properly account for unwarranted disparities between codefendants who are similarly situated, and that the district court may compare defendants when deciding a sentence.[2] We emphasize *805 that whether any such disparity justifies a sentencing variance in a given case raises a separate question, one of substantive reasonableness, which we address in Part IV.
Second, the government argues that the district court improperly relied on its view that Smart should not be "punished" for exercising his right to a trial by jury. See United States v. Portillo-Valenzuela, 20 F.3d 393, 395 (10th Cir.1994) ("[D]enying the reduction for acceptance of responsibility is not a penalty for exercising any rights. The reduction is simply a reward for those who take full responsibility."). When read in context, however, it is clear that the court's statement was not offered as a justification for its ultimate sentencing decision. The court unremarkably noted that while a defendant gets the benefit of acceptance of responsibility if he pleads guilty, the converse is not necessarily true: "I don't necessarily think that you should be punished because you exercised your right to a trial by jury." Moreover, the court denied a reduction for acceptance of responsibility, and approved a two-point Guidelines enhancement for obstruction of justice based on Smart's testimony during trial. These decisions are not indicative of an irrational sympathy for Smart's decision to proceed to trial. Because the district court plainly did not rely on Smart's decision to go to trial as a justification for its downward variance, we need not decide whether such a consideration would constitute procedural error after Gall.
We conclude that the district court relied on no improper factors in sentencing Smart,[3] and proceed to consider the substantive reasonableness of the sentence it ultimately imposed.

IV

A
Following the Supreme Court's decision in Rita v. United States, ___ U.S. ___, 127 S.Ct. 2456, 168 L.Ed.2d 203 (2007), it has been well settled that we review a district court's sentencing decisions solely for abuse of discretion. See id. at 2465 ("[A]ppellate `reasonableness' review merely asks whether the trial court abused its discretion."). Significantly less settled has been the question of how exactly we should apply abuse of discretion review to individual sentences. The Supreme Court has recently provided considerable guidance in Gall and Kimbrough.[4] *806 These cases modify the application of our existing substantive reasonableness review and clarify the amount of deference we must afford to a district court's weighing of § 3553(a) sentencing factors. Briefly stated, we now review "all sentences -- whether inside, just outside, or significantly outside the Guidelines range -- under a deferential abuse-of-discretion standard." Gall, 128 S.Ct. at 591.
Following Rita, but before the Supreme Court's decisions in Gall and Kimbrough, our substantive reasonableness review of outside-Guidelines sentences was governed by United States v. Garcia-Lara. In Garcia-Lara we stated that, since Booker, "we have implicitly acknowledged that we employ an abuse-of-discretion standard," and held that "Rita says nothing new about the standard of review." 499 F.3d at 1136; but see id. at 1141-42 (Lucero, J., dissenting).
Under Garcia-Lara, review of a sentencing variance began with mathematical calculation of both the absolute amount and the relative percentage of the variance from a Guidelines baseline. Based on this determination, we required "more compelling reasons" "the farther the trial court diverge[d] from the advisory guideline range." Id. at 1138-39 (quotation omitted). Second, we conducted our own detailed review of the record to assess the district court's factual bases for its § 3553(a) conclusions. See id. at 1139-40 ("recount[ing] Mr. Garcia-Lara's criminal history in . . . detail" and factually concluding that this history "illustrate[d] his demonstrated propensity to break the law"). Third, we considered de novo the weight assigned by the district court to various § 3553(a) sentencing factors, because we considered this weighing process to be a question of law. Id. at 1137, 1141 n. 4 (refusing to "credit" the district court's "legal conclusion" that "a particular sentence fulfills the sentencing purposes of § 3553(a)").
Fourth, we permitted a variance only if we agreed that it was "justified by `particular characteristics of the defendant' that [were] `sufficiently uncommon'" to distinguish him from the ordinary defendant contemplated by the Sentencing Commission during drafting of the Guidelines. Id. at 1141 (quoting United States v. Mateo, 471 F.3d 1162, 1169 (10th Cir.2006)); see also United States v. Hildreth, 485 F.3d 1120, 1129 (10th Cir.2007). Finally, we did not allow a district court to assign different weight to certain § 3553(a) factors than the weight already given to those factors in the Guidelines. If a district court did so, we sometimes characterized such judgment as "ignor[ing] Congress's policy" choices or "ignor[ing] the Guidelines . . . and instead adopt[ing] its own sentencing philosophy." Garcia-Lara, 499 F.3d at 1140, 1141. Each of these five features of our review of non-Guidelines sentences has now been invalidated by the Supreme Court, and accordingly, Garcia-Lara is no longer binding on this panel. See United States v. Torres-Duenas, 461 F.3d 1178, 1183 (10th Cir.2006) ("intervening Supreme Court precedent" allows a panel of our court to overturn another panel's decision).
The Court has now more clearly defined the term "abuse of discretion." The details of our review, in practice, must now afford substantial deference to district courts. In assessing the standard of review applied by the Eighth Circuit in Gall, the Court looked past that circuit's terminology to the practical application of its "abuse of discretion" review, and concluded that the latter belied the former:
The Court of Appeals gave virtually no deference to the District Court's decision. . . . Although the Court of Appeals correctly stated that the appropriate *807 standard of review was abuse of discretion, it engaged in an analysis that more closely resembled de novo review of the facts presented and determined that, in its view, the degree of variance was not warranted.
128 S.Ct. at 600.
It is clear that Gall and Kimbrough cannot be reconciled with the five previously identified features of our circuit's standard of review applicable to non-Guidelines sentences. Following Gall, sentencing review may not be based on "a rigid mathematical formula that uses the percentage of a departure as the standard for determining the strength of the justifications required for a specific sentence." Id. at 595. Although the degree of variance from the Guidelines range remains a consideration on appeal, id., it may not define our threshold standard of review. We may no longer seek a certain mathematical precision by requiring that § 3553(a) factors reach some specific level of evidentiary weight. Id. at 594 ("[T]he Court of Appeals' rule requiring `proportional' justifications for departures from the Guidelines is not consistent with . . . Booker."); contra Garcia-Lara, 499 F.3d at 1138-39. To do so would be to "apply[ ] a heightened standard of review to sentences outside the Guidelines range" in practice, no matter how we might describe that standard in theory. Gall, 128 S.Ct. at 596. Moreover, although a district court must provide reasoning sufficient to support the chosen variance, it need not necessarily provide "extraordinary" facts to justify any statutorily permissible sentencing variance, even one as large as the 100% variance in Gall, 128 S.Ct. at 595; contra Mateo, 471 F.3d at 1169-70 (requiring "extraordinary circumstances" to support some variances).
None of this means that the extent of the variance is unimportant. Of course, as the Court noted, in making an "individual assessment," if the district court
decides that an outside-Guidelines sentence is warranted, [it] must consider the extent of the deviation and ensure that the justification is sufficiently compelling to support the degree of the variance. We find it uncontroversial that a major departure should be supported by a more significant justification than a minor one. After settling on the appropriate sentence, [it] must adequately explain the chosen sentence to allow for meaningful appellate review and to promote the perception of fair sentencing.
Gall, 128 S.Ct. at 597. We further recognize, of course, that "closer review may be in order when the sentencing judge varies from the Guidelines based solely on the judge's view that the Guidelines range fails properly to reflect § 3553(a) considerations even in a mine-run case." Kimbrough, 128 S.Ct. at 575 (emphasis added).
Our Garcia-Lara practice of relying upon our own reading of the factual record, in place of the district court's reading, ignores the trial judge's "superior position to find facts and judge their import under § 3553(a) in the individual case." Gall, 128 S.Ct. at 597 (quotation omitted). As Gall observes, a sentencing "judge sees and hears the evidence, makes credibility determinations, has full knowledge of the facts and gains insights not conveyed by the record." Id. (quotation omitted). We lack the district courts' institutional experience of imposing large numbers of sentences, the vast majority of which are never appealed. See id. at 598 n. 7. Thus, our standard of review must comport with this reality by giving both formal and practical deference to the sentencing court's assessment of the facts under § 3553(a). It is not enough to solely credit raw findings of individual fact.
*808 We may not examine the weight a district court assigns to various § 3553(a) factors, and its ultimate assessment of the balance between them, as a legal conclusion to be reviewed de novo. Instead, we must "give due deference to the district court's decision that the § 3553(a) factors, on a whole, justify the extent of the variance." Compare id. at 597, with Garcia-Lara, 499 F.3d at 1137, 1141 n. 4 (refusing to "credit" the district court's § 3553(a) balancing). An appellate court's "disagree[ment] with the District Judge's conclusion that consideration of the § 3553(a) factors justified . . . a marked deviation from the Guidelines range" is simply not enough to support a holding that the district court abused its discretion. Gall, 128 S.Ct. at 602. "[I]t is not for the Court of Appeals to decide de novo whether the justification for a variance is sufficient or the sentence reasonable," id., and we must therefore defer not only to a district court's factual findings but also to its determinations of the weight to be afforded to such findings.
Gall and Kimbrough end our practice of permitting a variance only if the district court "first distinguish[es] [the defendant's] characteristics and history from those of the ordinary . . . offender" contemplated by the Guidelines.[5] Garcia-Lara, 499 F.3d at 1140 n. 5. As the Court explained in Kimbrough, the Sentencing Commission and sentencing courts play complementary roles in fine-tuning an individual sentence: Although "the Commission's recommendation of a sentencing range will reflect a rough approximation of sentences that might achieve § 3553(a)'s objectives" and the Guidelines give a district court a measure of national practice to use as a starting point, the district court will have "greater familiarity with the individual case and the individual defendant before him than the Commission." 128 S.Ct. at 574 (quotation omitted) (citing Rita, 127 S.Ct. at 2465, 2469).[6] Allowing sentencing variances only on the existence of extraordinary defendant "characteristics and history" assumes not that the Guidelines give a "rough approximation" of appropriate sentences, but that they dictate the only appropriate sentence in the absence of extraordinary facts.
Our requirement that district courts distinguish an offender from the "ordinary" offender has given the Guidelines more weight than other § 3553(a) factors, and has effectively required every sentencing variance to be justified by "extraordinary facts." To perform their individualizing role, district courts are now allowed to contextually evaluate each § 3553(a) factor, including those factors the relevant guideline(s) already purport to take into account, even if the facts of the case are less than extraordinary. Compare Gall, *809 128 S.Ct. at 596-97, 602, and Kimbrough, 128 S.Ct. at 570 ("In sum, while [§ 3553(a)] still requires a court to give respectful consideration to the Guidelines, Booker permits the court to tailor the sentence in light of other statutory concerns as well." (citations omitted)), with Garcia-Lara, 499 F.3d at 1137-38 ("A court's conclusion that the Guidelines are simply `wrong' or an inadequate reflection of the statutory sentencing purposes is an unreasonable application of [§ 3553(a) factors]."). In many cases, the Guidelines recommendation and the district court's individualized determination will continue to overlap, but we may not assume that the Guidelines perfectly express the § 3553(a) factors present in an individual case in the face of a district court's conclusion to the contrary.
We may not conclude that simply by diverging from the Guidelines, a district court has disregarded the policy considerations which led the Commission to create a particular Guideline. Compare Kimbrough, 128 S.Ct. at 564 (stating that "the Court of Appeals erred in holding the crack/powder disparity effectively mandatory"), with Garcia-Lara, 499 F.3d at 1140 ("[T]o the extent the District Court believed the career offender enhancement over-represented Mr. Garcia-Lara's prior criminal history, it ignored Congress's policy of targeting recidivist drug offenders for more severe punishment."). Such a review could lead to excessive deference to the macro-level § 3553(a) determinations reached by the Sentencing Commission, and too little deference to the micro-level determinations reserved for the district courts. If district courts are required to balance the Guidelines against the other § 3553(a) considerations, then we cannot say they disregard the Guidelines simply by striking a different balance and imposing a variance in a particular case.[7] See Gall, 128 S.Ct. at 602 ("[T]he Guidelines are only one of the factors to consider when imposing a sentence.").
In sentencing defendants, district courts exercise a guided discretion within a range specified by Congress. As Justice Cardozo wrote, a "judge . . . is to exercise a discretion informed by tradition, methodized by analogy, disciplined by system, and subordinated to the primordial necessity of order in the social life." Benjamin N. Cardozo, The Nature of the Judicial Process 141 (1921) (quotation omitted). Mindful of the statutory limitations of § 3553(a), a sentencing judge's determination within that framework is an exercise of discretion and must be reviewed as such.

B
In sentencing Smart, the district court began by stating that it "gave great weight to the guidelines" recommendation, which had been properly calculated in Smart's PSR. In justifying its downward variance from the Guidelines range, the district court relied primarily on Smart's relatively minor role in the offense. On point:
I feel it would violate [18 U.S.C. § 3553] . . . if you received a far greater sentence than Mr. Rousey. . . . As I stated, he was obviously the instigator and the promoter of this whole event and got others involved, including the under-age *810 girl, and it was his contact with her which created the whole situation.
Although noting that Rousey got the "benefit of accepting responsibility and avoiding trial" and Smart rightfully did not, the court also recognized that Rousey had played a much larger role than Smart in the underlying incident, and sentenced Smart to the same 120-month term imposed on Rousey.[8] This sentence is equivalent to the statutory minimum under § 2251(a). In setting the sentence, the district court cited the need for the sentence imposed to reflect the seriousness of the offense, to promote respect for the law, and to provide just punishment for the offense, § 3553(a)(2)(A), and the need to avoid unwarranted sentencing disparities, § 3553(a)(6).
As discussed above, Gall allows district courts to consider disparities between codefendants. 128 S.Ct. at 600. The district court considered this factor in conjunction with the seriousness of Smart's offense, using codefendant Rousey's conduct largely to illustrate why Smart was less culpable than the Guidelines range would suggest. See United States v. Shaw, 471 F.3d 1136, 1140-41 (10th Cir.2006) (affirming an above-Guidelines sentence where the district court considered a codefendant's sentence in the course of discussing several § 3553(a) factors, including the defendant's criminal history and the seriousness of his offense). In effect, the court concluded that Smart's lesser culpability, offset by his failure to accept responsibility, supported the same term of imprisonment as Rousey's greater culpability and acceptance of responsibility. Given the district court's firsthand observation of Smart's trial, its credibility determinations, and insights not necessarily conveyed by the record, its conclusion was not an abuse of discretion. See Gall, 128 S.Ct. at 597 (recognizing that these "practical considerations" underlie the legal principle that sentencing courts must be reviewed deferentially).
Moreover, the district court's determination that Smart's Guidelines range overstated the seriousness of his offense is supported by evidence in the record tending to show that both Smart and Fields were lesser players in the charged offense. For example, the district court heard testimony from the victim that only codefendant Rousey coerced her -- through payment and promises of a place to stay -- into participating in various sexual acts. These included those captured on the videotape which led to Smart's conviction. Rousey also initiated the plan to videotape this episode, while Smart was only a follower in the scheme.
Additionally, the district court found that Smart's sentence was justified by several other § 3553(a) factors. It found that the 10-year sentence would be sufficient to deter future criminal conduct, see § 3553(a)(2)(B), and to protect the public, see § 3553(a)(2)(C). These § 3553(a)(2) factors are of particular importance, as district courts are bound to "impose a sentence sufficient, but not greater than necessary" to comply with them. § 3553(a); see also Kimbrough, 128 S.Ct. at 570.
Applying the appropriate legal standard of review provided by Gall and Kimbrough, and crediting the district court's reasoned consideration of these multiple sentencing factors, we fail to perceive any abuse of discretion in sentencing Smart to a below-Guidelines sentence.


*811 V
AFFIRMED.
HARTZ, Circuit Judge, dissenting:
I respectfully dissent.
Before explaining why I would reverse and remand for resentencing, I should note my concern about the scope of the majority opinion. That opinion discusses at length the meaning of abuse-of-discretion review of the substantive reasonableness of a sentence in light of Gall v. United States, ___ U.S. ___, 128 S.Ct. 586, 169 L.Ed.2d 445 (2007), and Kimbrough v. United States, ___ U.S. ___, 128 S.Ct. 558, 169 L.Ed.2d 481 (2007). The opinion justifies this discussion on the ground that the government's argument on appeal includes a claim of substantive unreasonableness. In my view, however, the majority opinion has mischaracterized the government's arguments in this case and has misconceived the meanings of substantive and procedural error. As I read the government's briefs, they argue only that the district court took into account two improper considerations in arriving at Mr. Smart's sentence. And that, I believe, is a matter of procedural, not substantive, error, as those terms have been used in this context.
I will begin by distinguishing between procedural and substantive error. In arriving at a sentence the district court must (1) correctly find contested facts, (2) properly calculate the Guidelines sentencing range, (3) listen to the arguments of the parties, (4) determine what matters should be considered in imposing sentence (a process guided largely by 18 U.S.C. § 3553(a)), (5) weigh those considerations to arrive at a sentence, and (6) explain the sentencing decision to the parties. As I understand what the Supreme Court has said, substantive error is an error at step 5 -- weighing the proper considerations and fixing the length of the sentence. Error in the performance of the other five steps is termed procedural error. When a procedural error has been committed, we would ordinarily need to reverse for resentencing, regardless of whether the length of the sentence was substantively reasonable.
Although the Supreme Court has not had occasion to draw a precise line between substantive and procedural reasonableness, its opinions provide significant guidance. In Gall the Court writes:
[The appellate court] must first ensure that the district court committed no significant procedural error, such as [2] failing to calculate (or improperly calculating) the Guidelines range, treating the Guidelines as mandatory, [4] failing to consider the § 3553(a) factors, [1] selecting a sentence based on clearly erroneous facts, or [6] failing to adequately explain the chosen sentence -- including an explanation for any deviation from the Guidelines range. Assuming that the district court's sentencing decision is procedurally sound, the appellate court should then consider the substantive reasonableness of the sentence imposed under an abuse-of-discretion standard.
128 S.Ct. at 597. As shown by the bracketed numbers that I have inserted, what I call steps 1, 2, 4, and 6 are included as procedural matters. With respect to step 4, the Court specifically mentions "failing to consider the § 3553(a) factors," id. Although it does not include "considering an improper factor" as a procedural error, I do not see how such an error could be characterized differently from failing to consider a proper factor. Perhaps "treating the Guidelines as mandatory," id., might be viewed as consideration of an improper factor, but I would view it as simply a special case of not considering all the § 3553(a) factors  of which only one, *812 paragraph (4), relates to the Guidelines range.
The structure and content of Part IV of Gall reaffirms my view of the meanings of procedural and substantive. The first paragraph of Part IV states that the discussion to follow will relate to procedural error, and it lists several procedural requirements (including my step 3, listening to argument by the parties):
As an initial matter, we note that the District Judge committed no significant procedural error. He correctly calculated the applicable Guidelines range, allowed both parties to present arguments as to what they believed the appropriate sentence should be, considered all of the § 3553(a) factors, and thoroughly documented his reasoning. The Court of Appeals found that the District Judge erred in failing to give proper weight to the seriousness of the offense, as required by § 3553(a)(2)(A), and failing to consider whether a sentence of probation would create unwarranted disparities, as required by § 3553(a)(6).
Id. at 598. Part IV continues with the Court's refutation of the arguments that the sentencing judge had ignored the health risks of the drug ecstasy and the need to avoid unwarranted disparities in sentences. The first sentence of the concluding paragraph of Part IV then states:
Since the District Court committed no procedural error, the only question for the Court of Appeals was whether the sentence was reasonable -- i.e., whether the District Judge abused his discretion in determining that the § 3553(a) factors supported a sentence of probation and justified a substantial deviation from the Guidelines range.
Id. at 600. Again, the Court appears to view substantive error as error in the weighing of proper factors and fixing the sentence.[1]
The concluding portion of Kimbrough also supports my view that consideration of an improper factor is a procedural error. Part V begins: "Taking account of the foregoing discussion [regarding the crack-cocaine Guidelines] in appraising the District Court's disposition in this case, we conclude that the 180-month sentence imposed on Kimbrough should survive appellate inspection." 128 S.Ct. at 575. The Court then notes that the district court "properly calculat[ed] and consider[ed] the advisory Guidelines range," and "addressed the relevant § 3553(a) factors." Id. "[T]he District Court," it says, "thus rested its sentence on the appropriate considerations and committed no procedural *813 error." Id. at 575-76 (emphasis added; internal quotation marks omitted). The next paragraph then begins: "The ultimate question in Kimbrough's case is whether the sentence was reasonable -- i.e., whether the District Judge abused his discretion in determining that the § 3553(a) factors supported a sentence of 15 years and justified a substantial deviation from the Guidelines range." Id. at 576 (brackets and internal quotation marks omitted). I would infer from the quoted language that the first paragraph of Part V addressed procedural error and the next (and final) paragraph addressed substantive error. Thus, "resting [the] sentence on the appropriate considerations," id. at 575-76, is a matter of procedural reasonableness and whether "the § 3553(a) factors supported [the] sentence" is a matter of substantive reasonableness, id. at 576. Accordingly, an appellate court's determination of whether it was proper for the lower court to consider a factor is part of the review of procedural reasonableness.
My understanding of the Court's terminology is also supported by Justice Scalia's use of the term substantive in this context. Justice Scalia has endorsed appellate review of procedural error in sentencing, but he has condemned substantive review because of his concern that it can create the error barred by the holding of United States v. Booker, 543 U.S. 220, 125 S.Ct. 738, 160 L.Ed.2d 621 (2005), regarding the Sixth Amendment. See Rita v. United States, ___ U.S. ___, 127 S.Ct. 2456, 2474-84, 168 L.Ed.2d 203 (2007) (Scalia, J., concurring). The Booker issue that concerns him arises only from review of the lengths of sentences. Booker reaffirmed the proposition that "[a]ny fact (other than a prior conviction) which is necessary to support a sentence exceeding the maximum authorized by the facts established by a plea of guilty or a jury verdict must be admitted by the defendant or proved to a jury beyond a reasonable doubt." 543 U.S. at 244, 125 S.Ct. 738. In Justice Scalia's view, that rule would be violated if an appellate court said that the length of a sentence of a defendant who went to trial was reasonable only because the sentencing judge found, say, that the quantity of drugs involved in the crime was ten times the amount alleged in the indictment. See Rita, 127 S.Ct. at 2476-72 (Scalia, J., concurring). Because Justice Scalia views such determinations by appellate courts as inevitable if the lengths of sentences must be reviewed for reasonableness, see id. at 2478, 2480-81, he concludes that "substantive reasonableness review [would] cause judge-found facts to justify greater punishment than the jury's verdict or the defendant's guilty plea would sustain," id. at 2482. 
The substantive review on which his concern is focused must be only review of the ultimate sentence, not review of matters such as whether the sentencing judge properly considered or failed to consider certain factors, because Booker places no Sixth Amendment constraint on what a sentencing judge may consider, so long as the lawfulness of the length of a sentence is not dependent on judicial finding of a particular fact. Indeed, Justice Scalia's Rita concurrence specifically labels as "procedural review" an appellate court's reversal of a sentence on the ground that the district court "consider[ed] impermissible factors." Id. at 2483.
Accordingly, I conclude that a district court commits procedural, not substantive, error when it (1) takes into account an improper consideration or (2) fails to take into account a mandated consideration. Yet that is precisely the type of error alleged by the government in this case. Its brief is devoted to arguing that the district court committed legal error by deciding that a defendant's sentence should not be increased because he went to trial *814 and that it should be decreased to be consistent with a codefendant's sentence. Its "Statement of the Issue Presented for Review" is:
Whether the district court's departure [read, `variance'] below the Guideline range for Smart who testified falsely at trial and received an enhancement for obstruction of justice was reasonable when the court's justification was that Smart should not be punished for going to trial and should not receive a greater sentence than codefendant Rousey who pled guilty prior to trial.
Aplt.'s Br. at 1. The heading for the opening brief's Argument section is: "The District Court's Departure below the Guideline Range for Smart was not reasonable when it was based upon Legal Error." Id. at I. And its Summary of the Argument is:
When a district court imposes a sentence below the Guideline range, the resulting sentence is not presumptively reasonable. A district court must provide an appropriate justification for the Court of Appeals to determine whether the departure is reasonable. The district court's justification in this case rested on two erroneous grounds. First, the district court concluded that a higher sentencing range for Smart than co-defendant Rousey, who pled guilty prior to trial, would punish Smart for going to trial. Second, the district court concluded that a greater term of imprisonment for Smart than co-defendant Rousey would contravene the unwarranted sentencing disparities provision in 18 U.S.C. § 3553(a)(6). The justification that a higher sentence for Smart would punish him for going to trial contravenes established authority. Furthermore, the unwarranted sentencing disparity provision in § 3553 is intended to be applied on a national level -- not among co-defendants in the same case. Because the district court's justification was legally erroneous, the resulting sentence is unreasonable. The case should be remanded to the district court for resentencing.
Id. at 8.
As I read these passages, the government is contending that the district court committed legal error by including two improper considerations in its calculus. It is not contending simply that the two considerations were given improper weight. It makes no argument regarding what would be a reasonable length for Mr. Smart's sentence. Its argument is thus a claim of procedural, not substantive, error. Although the government did not use the term procedural or substantive error, that is not surprising in a brief filed on August 22, 2006.
The majority opinion acknowledges that a sentencing judge's consideration of an improper factor may be characterized as a procedural error, but, for reasons I do not understand, limits that characterization to consideration of a factor not set forth in § 3553(a). Substantive error, in its view, includes (1) unreasonable weighing of proper considerations in arriving at the length of the sentence and (2) consideration of a factor not set forth in § 3553(a). What a strange beast substantive error has become. To a familiar animal (the first type of substantive error) has been added an awkward appendage. I see nothing in the Supreme Court's opinions that creates this creature. As previously noted, Kimbrough appears to state that "rest[ing] [the] sentence on the appropriate considerations," 128 S.Ct. at 575-76, is a matter of procedural reasonableness. And certainly Justice Scalia's use of the terms substantive and procedural in this context is inconsistent with the majority opinion's view: reversing a sentence on the *815 ground that the sentencing judge (under the post-Booker regime) took into account an improper consideration (whatever the ground on which it was found improper) could in itself have no Booker Sixth Amendment implications.
Moreover, when an appeal challenges the sentencing judge's taking into account an allegedly improper consideration, I would think that the nature of our review would necessarily be the same regardless of whether the consideration could be said to be encompassed by the § 3553(a) factors. If, for example, the sentencing judge considered the defendant's race (a "characteristic[s] of the defendant" encompassed by § 3553(a)(1)) when imposing sentence, we would treat the error just as we would if the judge had considered the latest quotation for the Dow Jones Industrial Average (not a factor encompassed by § 3553(a)). Although there may be limited circumstances in which the error would not require reversal (such as, when the court imposed the statutory mandatory minimum sentence), we would ordinarily reverse; and that reversal would be regardless of whether our substantive-reasonableness review showed that the sentence imposed was a reasonable one in light of the proper considerations taken into account by the judge. Substantive-reasonableness review is so deferential to the sentencing judge only because that review is performed after the appellate court has assured itself that the judge committed no procedural error before undertaking the weighing process that led to the sentence.
Accordingly, I am not persuaded by the majority opinion that the government has raised a claim of substantive reasonableness on this appeal. The government's argument that the district court took into account two improper considerations raises a question of procedural reasonableness. And the government's briefs do not support the majority opinion's statement that "the government also questions whether Smart's sentence can be supported in the absence of the allegedly `improper' factors it identifies." Op. at 804.
I now turn to the issues actually before us. The government contends, in essence, that the district court did not "rest[ ] [the] sentence on the appropriate considerations," Kimbrough, 128 S.Ct. at 575-76. In the government's view the court committed legal error by taking into account two improper considerations when determining Mr. Smart's sentence: (1) that the sentence of a defendant who exercises the right to go to trial should not be greater than it would be if the defendant had pleaded guilty, and (2) that 18 U.S.C. § 3553(a)(6) supports the imposition of identical sentences on codefendants even when they are not similarly situated. I am not certain that the district court committed either of these errors. But a court's language in imposing sentence is important, and the words employed at Mr. Smart's sentencing sufficiently suggest these two errors that I believe remand for resentencing to be the proper disposition.
Beginning with the right-to-trial issue, I share the government's concern with the district court's statement, "I don't necessarily think that you should be punished because you exercised your right to a trial by jury." Aplt. App. at 105. The court's statement does not express an absurd point of view, but it is contrary to the policy of the United States Sentencing Guidelines. Under USSG § 3E1.1 a defendant who accepts responsibility for the charged crime, which almost always requires pleading guilty to the charge, see id. cmt. n. 2, receives a lower offense level, and hence a lighter sentence. I recognize that district courts are not bound by Guidelines policy. In particular, Kimbrough *816 held that the sentencing judge need not follow a guideline that did "not exemplify the [Sentencing] Commission's exercise of its characteristic institutional role." 128 S.Ct. at 563. But here, unlike in Kimbrough, performance of that characteristic role, which entails analysis of empirical sentencing data to establish national norms, almost certainly led to § 3E1.1. It is therefore appropriate to recognize Kimbrough's statement that "closer review may be in order when the sentencing judge varies from the Guidelines based solely on the judge's view that the Guidelines range fails properly to reflect § 3553(a) considerations even in a mine-run case." Id. (internal quotation marks omitted). Unfortunately for us, the Court in Kimbrough had "no occasion for elaborative discussion of this matter." Id. I would think, however, that a district court's disagreement with one of the Commission's core policies at least requires an explanation beyond a mere assertion of a contrary point of view. In particular, the district court did not point to anything special about this case that would suggest the inapplicability of the Guidelines policy.
The government's second contention is not as compelling but should be addressed upon remand. Under 18 U.S.C. § 3553(a), "[t]he court, in determining the particular sentence to be imposed shall consider -- . . . (6) the need to avoid unwarranted sentence disparities among defendants with similar records who have been found guilty of similar conduct[.]" In sentencing Mr. Smart, the district court said: "I feel it would violate 3553 from the standpoint if you received a far greater sentence than Mr. Rousey. I believe that the disparity would be a violation of that section." Aplt. App. at 106. My concern is that the use of the term disparity may suggest that the court was invoking § 3553(a)(6). Doing so would have been error. "[T]he kind of `disparity' with which § 3553(a)(6) is concerned is an unjustified difference across judges (or districts) rather than among defendants to a single case." United States v. Boscarino, 437 F.3d 634, 638 (7th Cir.2006). The purpose of § 3553(a)(6) would be defeated if district courts used it as the rationale for imposing the same sentences on two codefendants despite finding that one, but not the other, had obstructed justice and failed to accept responsibility (as was true in this case). If such a practice prevailed, the sentences of two absolutely identical defendants (I'll call them A and B) in two different cases could differ solely because of the conduct of their codefendants. If A's codefendant obstructed justice and refused to accept responsibility, whereas B's codefendant accepted responsibility and did not obstruct justice, then A's codefendant would receive a harsher sentence than B's codefendant, and a practice of equalizing the sentences of codefendants would result in A's sentence being harsher than B's, contrary to the like-sentence mandate of § 3553(a)(6).
Gall does not suggest otherwise. The majority opinion quotes Gall's statement that sentencing judges may "`consider[ ] the need to avoid unwarranted similarities among [codefendants] who [are] not similarly situated,' despite falling under the same or similar Guidelines sentencing ranges." Op. at 804 (quoting Gall, 128 S.Ct. at 600) (alterations in majority opinion). But all that this statement in Gall does is affirm that differently situated defendants can be sentenced differently. I am sure that the Supreme Court would also say that it is proper for a judge (as may have happened in this case) to impose identical sentences on two defendants who are not similarly situated but whose dissimilarities (some favoring one defendant and some favoring the other) cancel out. The mere fact that two persons being sentenced are codefendants is, however, not a *817 proper ground for imposing identical sentences, and Gall does not say that it is.[2] The district court should make clear on remand that it is not invoking that ground in support of Mr. Smart's sentence.
Accordingly, I would reverse Mr. Smart's sentence and remand for resentencing.
NOTES
[1]  Invoking what is perhaps the most extreme hypothetical case, the dissent contends that a sentence based on a defendant's race would amount to consideration of an "improper" or "legally erroneous" sentencing factor reversible for procedural error. Dissent at 814-15. We have no doubt that such a sentence would represent grave legal error, but not due to procedural infirmity. Section 3553(a)(1) allows district courts to consider the "characteristics of the defendant," and from a purely procedural perspective, race is a characteristic of the defendant. Our review of the substantive reasonableness of such a sentence would surely reveal, however, that the district court abused its discretion, as this characteristic could not provide a logical justification for imposing a particular sentence. In addition, the resulting sentence would violate the Equal Protection Clause of the Fourteenth Amendment, which makes the specific characteristic of race an impermissible government consideration in the absence of compelling reasons to the contrary. See, e.g., Johnson v. California, 543 U.S. 499, 505, 125 S.Ct. 1141, 160 L.Ed.2d 949 (2005).
[2]  Even before Gall, we never squarely held that a district court could not consider disparities between codefendants. We held only that a defendant was not entitled to a variance from a presumptively reasonable Guidelines sentence merely because the defendant received a larger sentence than a codefendant. See United States v. Verdin-Garcia, 516 F.3d 884 (10th Cir.2008); United States v. Davis, 437 F.3d 989, 997 (10th Cir.2006) ("[A] criminal defendant alleging a disparity between his sentence and that of a co-defendant is not entitled to relief from a sentence that is properly within the sentencing guidelines and statutory requirements."); see also United States v. Parker, 462 F.3d 273, 278 (3d Cir.2006) ("Where appropriate to the circumstances of a given case, a sentencing court may reasonably consider sentencing disparity of co-defendants in its application of [§ 3553(a)] factors.").
[3]  Accordingly, we express no view as to the consequences of such a procedural error. Cf. Roberson, 517 F.3d at 993, 2008 WL 323223, at *1 (holding that reversal is required only where an improper factor was given "significant weight"); Fed.R.Crim.P. 52(a) (providing for review for harmless error).
[4]  In Gall, the Court overturned the Eighth Circuit and reinstated a variance imposed by the district court, holding that the appellate court failed to use an abuse of discretion standard in concluding that the district court incorrectly applied § 3553(a) sentencing factors. 128 S.Ct. at 602. In Kimbrough, the Court overturned a per se rule forbidding a district court from "disagreeing" with the Guidelines' 100:1 crack-to-cocaine sentencing ratio, and held that district courts must be allowed to consider whether other § 3553(a) policies outweigh the Guidelines in a given case. 128 S.Ct. at 564.
[5]  The defendant in Kimbrough was, by all accounts, an ordinary offender with characteristics fully contemplated by the Guidelines; the Court nonetheless reinstated a below-Guidelines sentence which the district court had concluded was "clearly long enough" to accomplish the objectives of § 3553(a). See 128 S.Ct. at 564-65. As noted above, "closer review may be in order when the sentencing judge varies from the Guidelines based solely on the judge's view that the Guidelines range fails properly to reflect § 3553(a) considerations even in a mine-run case." Id. at 575 (emphasis added).
[6]  District courts are also better positioned to evaluate "ordinariness" than are we. A case that seems "ordinary" to us compared only to the cases we see on review, as opposed to the entire range of sentences imposed at the trial court level but not appealed, may, from the district court's vantage, be anything but ordinary. See Gall, 128 S.Ct. at 598 & n. 7 (stating that district courts see "many more Guidelines sentences than appellate courts do" (quotation omitted)).
[7]  Moreover, once a district court correctly calculates and considers a defendant's Guidelines range, we must assume that it has "necessarily" given consideration to the need for uniformity in sentencing. Gall, 128 S.Ct. at 599; contra Hildreth, 485 F.3d at 1130. Based on this holding, we might conclude that correct calculation of the Guidelines range indicates consideration of all of the sentencing policies animating a particular guideline.
[8]  Smart and Rousey had the same criminal history category of V and the same base offense level, before enhancements and reductions, of 27.
[1]  Part V of Gall, which addresses the substantive reasonableness of the sentence, may appear to treat consideration of an improper factor as a matter of substantive error. The discussion includes a response to the view of the court of appeals in that case that the sentencing judge had given "significant weight to an improper factor" -- namely, "compar[ing] Gall's sale of ecstasy when he was a 21-year-old adult to the impetuous and ill-considered actions of persons under the age of 18," id. at 601 (internal quotation marks omitted). The argument, however, was not that immaturity is an improper factor to consider in sentencing but that the record did not provide factual support for the factor -- that is, the record failed to show that Gall's behavior was "impetuous or ill-considered," id. (internal quotation marks omitted). The Court rejected the appellate court's view, citing authority that someone of Gall's age would not have a fully mature brain. Perhaps the Court's treatment of this argument would have been better placed in the Part IV discussion of procedural error, but whatever the rhetorical reasons for placing it in Part V, that placement hardly suggests that factual errors are not procedural errors, see id. at 597 (including "selecting a sentence based on clearly erroneous facts" in a list of possible procedural errors). And the discussion certainly does not imply that consideration of an improper factor -- such as race -- is a substantive error.
[2]  Of course, when codefendants are similarly situated, they should receive similar sentences.
