
617 F.3d 1029 (2010)
UNITED STATES of America, Appellee,
v.
Earl FOY, Jr., Appellant.
No. 09-3027.
United States Court of Appeals, Eighth Circuit.
Submitted: April 15, 2010.
Filed: August 20, 2010.
*1032 Angela L. Campbell, argued, Des Moines, IA, Alexander M. Esteves, on the brief, Sioux City, IA, for appellant.
Rebecca Goodgame Ebinger, argued, Cedar Rapids, IA, Kevin Craig Fletcher, I, AUSA, on the brief, Sioux City, IA, for appellee.
Before LOKEN, BRIGHT, and MELLOY, Circuit Judges.
MELLOY, Circuit Judge.
Earl Foy, Jr. pled guilty to three counts of mailing threatening communications, in violation of 18 U.S.C. § 876(c), and two counts of mailing threatening communications to extort money, in violation of 18 U.S.C. § 876(b). Foy subsequently moved to withdraw his plea. The district court[1] denied Foy's motion and ultimately sentenced him to 480 months' imprisonment. Foy appeals his sentence and the denial of the motion to withdraw his plea. We affirm.

I. Background
In May 2007, Foy was charged with sending threatening letters to "S.K.," his ex-girlfriend and the mother of his son, between November and December 2005. Foy was incarcerated in state prison at the time the letters were sent. All three letters contained death threats against S.K. and others. Two of the letters also contained demands for sums equaling thousands of dollars, including one demand for a total of approximately $90,000 from S.K., her daughter, and her friend.
A jury trial commenced on April 20, 2009. At the close of the government's case, Foy pled guilty to all charges without the benefit of a plea agreement. The court[2] accepted the plea, finding that Foy was competent and capable of entering an informed plea, that he was aware of the nature of the charges and the consequences of the plea, and that he made the plea knowingly and voluntarily. Three days later, through trial counsel, Foy filed a motion to withdraw the plea and to appoint new counsel. Foy subsequently submitted pro se memoranda in which he maintained that he wished to withdraw the *1033 plea primarily because the government presented evidence at trial that had been tampered with or forged. New counsel was eventually appointed in June. In July, the district court denied Foy's motion to withdraw his plea.
The case proceeded to sentencing. The presentence report ("PSR") calculated Foy's combined adjusted base offense level as twenty-five. Because Foy qualified as a career offender, his total offense level increased to thirty-two. See United States Sentencing Guidelines ("U.S.S.G.") § 4B1.1. The PSR recommended against a two-level reduction pursuant to U.S.S.G. § 3E1.1(a) for acceptance of responsibility. Foy objected to the recommendation against an acceptance-of-responsibility reduction. The PSR computed his criminal history category as VI on two grounds: Foy's accumulation of twenty-one criminal history points and his status as a career offender. Consequently, the PSR scored Foy's advisory Guidelines sentencing range as 210 to 262 months' imprisonment.
At sentencing, the district court overruled Foy's objection to not receiving acceptance-of-responsibility credit, a decision that he does not appeal. The district court also gave notice at the beginning of the hearing that it intended to vary upwardly from the Guidelines. Foy's counsel objected to the lack of advance notice. The court listed several reasons it believed warranted a variance. Defense counsel argued in response that Foy had a history of mental health and substance abuse issues from a young age and a difficult childhood that militated against a substantial sentence. The government offered as evidence Foy's pro se requests to withdraw his plea, letters Foy sent to S.K. at work post-plea, and reports of disruptive behavior while awaiting sentencing. The court also admitted to the sentencing record a forensic competency evaluation prepared prior to trial as an indicator of Foy's mental health. Upon consideration of the record, the court agreed with the PSR's determination of an advisory Guidelines range, denied Foy's motion for a downward variance, and imposed an upward variance, sentencing him to 480 months' imprisonment. The court achieved the federal sentence by running the twenty-year statutory maximum sentences on the two § 876(b) counts consecutively, with the sixty-month sentences for the § 876(c) counts running concurrently to one another and to the § 876(b) counts. It also ordered the federal sentence to run consecutively to his incomplete state sentence. See U.S.S.G. § 5G1.3(a). The district court subsequently filed a sentencing memorandum addressing its reasoning for varying upwardly.

II. Discussion

A. Withdrawal of the Guilty Plea[3]
Foy argues that the district court should have allowed him to withdraw his guilty plea. In support of his position, he presents arguments that he failed to raise before the district court and brings for the first time on appeal. First, he asserts that some of his responses during the plea colloquy demonstrate that his mental state was impaired at the time. To the extent he presents this argument to establish his plea was unknowing or involuntary, "such a claim would not be cognizable *1034 on direct appeal where he failed to present it to the district court in the first instance by a motion to withdraw his guilty plea." United States v. Washington, 515 F.3d 861, 864 (8th Cir.2008) (citing United States v. Murphy, 899 F.2d 714, 716 (8th Cir.1990)); see also United States v. Young, 927 F.2d 1060, 1061 (8th Cir.1991). Second, he contends that the district court failed to inform him that the twenty-year statutory maximum sentences for the § 876(b) charges could be run consecutively. He alleges this omission was a violation of the requirement in Federal Rule of Criminal Procedure 11 to advise him of the maximum possible penalty he faced. See Fed.R.Crim.P. 11(b)(1)(H). Instances of noncompliance with Rule 11 may be raised for the first time on appeal, but our review is for plain error. United States v. Vonn, 535 U.S. 55, 59, 122 S.Ct. 1043, 152 L.Ed.2d 90 (2002).
To succeed on plain error review in this context, a defendant must show "not only an error in the failure to follow Rule 11 but also a `reasonable probability that but for the error, he would not have entered a guilty plea.'" United States v. Garcia, 604 F.3d 575, 578 (8th Cir.2010) (quoting United States v. Luken, 560 F.3d 741, 745 (8th Cir. 2009)). "Even if he establishes such a probability, relief is discretionary and `the court should not exercise that discretion unless the error seriously affect[ed] the fairness, integrity or public reputation of judicial proceedings.'" Id. (quoting United States v. Olano, 507 U.S. 725, 732, 113 S.Ct. 1770, 123 L.Ed.2d 508 (1993)). In determining whether a Rule 11 error affected a defendant's substantial rights, the reviewing court considers the entire record, not merely the plea proceedings. Vonn, 535 U.S. at 74-75, 122 S.Ct. 1043.
Regarding the maximum penalties, the district court stated in relevant part: "Under Counts 1, 2, and 4 [the § 876(c) charges], there is a maximum possible fine of $250,000, a maximum possible imprisonment of 5 years. . . . On Counts 3 and 5 [the § 876(b) charges] there's a maximum possible fine of $250,000. There's a maximum possible imprisonment of 20 years in prison[.]" There is no dispute that the district court's statements did not explicitly alert Foy to the possibility of consecutive sentencing. Foy cites our decision in United States v. Burney, 75 F.3d 442 (8th Cir.1996), in support of his claim that the district court's disclosure was insufficient. In Burney, we stated that "[t]o the extent that the sentencing court is obligated [in accepting a plea agreement] to disclose the possibility of consecutive sentencing . . . we believe that the district court implicitly did so by telling [the defendant] that ten years was the maximum term of imprisonment for each of the three counts." Id. at 445. We agree that the district court's statements were less clear than those in Burney. However, even assuming, without deciding, that the district court should have done more in this case, we do not believe a reasonable probability exists that he would have continued with trial but for the court's misstep.
First, the record indicates that Foy actually knew early on that consecutive sentences could result in an even more onerous total sentence than the one eventually imposed. According to the competency report prepared prior to trial, Foy told the evaluator that he would be incarcerated possibly for fifty-five years if convicted, a sentence achieved by running the sentences consecutively on all five counts. Moreover, Foy does not appear to argue here that the district court's statements during the plea colloquy confused his understanding or actually misled him about his potential sentence, as did the defendant in Burney. See id. at 444. Nor has Foy asserted that he thought a twenty-year sentence was the absolute maximum he *1035 could receive. In sum, because the record shows Foy was actually aware of the possibility of consecutive sentencing, any Rule 11 violation was unlikely to impact his decision to plead guilty. See Young, 927 F.2d at 1062.
Additionally, Foy had notice prior to sentencing of the possibility of consecutive sentencing and did not object. The PSR referred to consecutive sentencing pursuant to U.S.S.G. § 5G1.2(d), which directs consecutive sentencing beyond the highest statutory maximum "to the extent necessary to produce a combined sentence equal to the total punishment," as well as § 5G1.3(a). Foy did not object to these portions of the PSR. The district court also addressed the possibility of consecutive sentences specifically for the § 876(b) charges during the sentencing hearing. Although Foy's attorney objected generally to the lack of advance notice of an upward variance, Foy said nothing about the alleged Rule 11 violation. Rather, Foy went on to indicate during his allocution that he had attempted to withdraw his guilty plea because he regretted pleading without a plea agreement. Were his decision to plead guilty closely tied to court error during the plea colloquy, we think it likely he would have mentioned any misunderstanding resulting from it in the district court. See Garcia, 604 F.3d at 578 ("Had [the defendant's] decision to plead guilty been closely dependent on the mistaken view that he faced a maximum of three years' of supervised release, we believe it likely that he would have spoken up in the district court" where he had notice of the period of supervised release and opportunities to object). At heart, Foy's arguments appear to state a belief that the district court should have warned him as to how consecutive sentences could be used to achieve the exact sentence eventually imposed. Rule 11 does not, however, guarantee a defendant the right to know his actual sentence before pleading guilty. Burney, 75 F.3d at 445. We reject this challenge to his plea.

B. Sentencing Issues
Foy first argues that the district court gave insufficient notice of its intent to apply an upward variance. As the district court correctly noted, however, it was not required to provide advance notice of its intent to vary upwardly. Irizarry v. United States, 553 U.S. 708, 128 S.Ct. 2198, 2202-03, 171 L.Ed.2d 28 (2008); see also United States v. Levine, 477 F.3d 596, 606 (8th Cir.2007); United States v. Sitting Bear, 436 F.3d 929, 932-33 (8th Cir. 2006); United States v. Long Soldier, 431 F.3d 1120, 1122 (8th Cir.2005); United States v. Egenberger, 424 F.3d 803, 805 (8th Cir.2005). As we have explained, Federal Rule of Criminal Procedure 32(h) "provides that under certain circumstances the district court must give notice to the parties that it is contemplating a departure from the guidelines range. However, notice pursuant to Rule 32(h) is not required when the adjustment to the sentence is effected by a variance, rather than by a departure." Long Soldier, 431 F.3d at 1122; see also Irizarry, 128 S.Ct. at 2203-04.
The Supreme Court has recognized that "there will be some cases in which the factual basis for a particular sentence will come as a surprise to a defendant or the Government." Irizarry, 128 S.Ct. at 2203. In those cases, "[t]he more appropriate response" is for the district court "to consider granting a continuance when a party has a legitimate basis for claiming that the surprise was prejudicial." Id. Here, Foy's attorney objected to the lack of advance notice, but did not request a continuance to respond. Furthermore, counsel indicated he was not surprised that the court was considering an upward variance and did not identify any specific prejudice resulting *1036 from the timing of the notice. On these facts, we conclude resentencing is not required based on a lack of notice.
Foy also argues that the district court erred in varying upwardly to 480 months. Absent a procedural error by the district court, such as "failing to calculate (or improperly calculating) the Guidelines range, treating the Guidelines as mandatory, failing to consider the § 3553(a) factors, selecting a sentence based on clearly erroneous facts, or failing to adequately explain the chosen sentence," we review substantive reasonableness under an abuse-of-discretion standard. United States v. Feemster, 572 F.3d 455, 461 (8th Cir.2009) (en banc). To determine whether a district court abuses its discretion, we consider broadly whether it "(1) fails to consider a relevant factor that should have received significant weight; (2) gives significant weight to an improper or irrelevant factor; or (3) considers only the appropriate factors but in weighing those factors commits a clear error of judgment." Id. (quotation omitted). "[W]e are to take into account the totality of the circumstances, including the extent of any variance from the Guidelines range." Id. (quotation omitted). We may not, however, consider a sentence outside the range presumptively unreasonable. Id. In considering the extent of a variance, we give "due deference to the district court's decision that the § 3553(a) factors, on a whole, justify the extent of the variance." Gall v. United States, 552 U.S. 38, 50-51, 128 S.Ct. 586, 169 L.Ed.2d 445 (2007) (noting a major deviation from a Guidelines sentence should be supported by "more significant justification than a minor one."). Finally, we note that when a district court achieves an upward variance through consecutive sentencing, it must indicate why the above-Guidelines sentence is sufficient, but not greater than necessary under § 3553(a), and why consecutive terms of imprisonment are reasonable under § 3584, in light of the § 3553(a) factors. United States v. Jarvis, 606 F.3d 552, 554 (8th Cir.2010).
Here, the district court recognized the variance from the top of the Guidelines range was substantial and stated that the factors outlined in § 3553(a)(1) and (2) primarily drove its decision. The court was concerned by the nature of Foy's criminal conduct, which includes unscored assaults as a juvenile and several other violent crimes as a young adult, a number of them against women; interference with official acts, such as assaulting officers and fleeing from arrest; and assaultive conduct and violations of prison rules, some of which occurred while he was awaiting sentencing on the instant charges. The court also noted Foy had accumulated twenty-one criminal history points by the age of twenty-seven, substantially more than the minimum number of points associated with career offender status, see U.S.S.G. § 4A1.1(a), and eight points over the number needed to reach criminal history category VI. As a result, the court concluded that by this objective measurement, the Guidelines did not adequately account for his criminal history. Additionally, the court found a Guidelines sentence gave substantially too little weight to the nature and circumstances of the crime charged, in particular because Foy had threatened to kill multiple people, including S.K., who had been the victim of some of his prior assaultive conduct. The court characterized Foy as a "menace to society" and stated its belief that it was only a "matter of time" before Foy would kill someone. Our careful review of the record leads us to conclude that the district court's concerns were justified.
As for the need for the sentence imposed, the court reiterated the serious nature of the offenses and Foy's disrespect for the law as evidenced by his criminal history. Although the court believed a lengthy sentence was unlikely to deter *1037 Foy's recidivism, it determined a long sentence would provide general deterrence to similarly situated persons. See 18 U.S.C. § 3553(a)(2)(B). Most importantly, the court found a sentence that would imprison Foy for most of his adult life was necessary to protect the public. See id. § 3553(a)(2)(C). The court considered the remaining § 3553(a) factors and determined them to be "at best, neutral." Based on its weighing of the relevant factors, the court believed that the sentence imposed was sufficient but not greater than necessary.
Foy argues that the district court erred by failing to fully consider and give sufficient weight to the "mitigating" factors in his history. We disagree. The district court expressly considered his mental health, but disagreed with Foy as to its significance and impact on his case. Instead of helping his case, the court found the record indicated a high level of anti-social behavior that placed Foy in the "top 5 or 10" of the approximately 2,500 defendants the court had sentenced. The court did find "substantially mitigating" the lack of parental guidance and supervision he received as a child and his introduction at a young age to drugs and alcohol. It determined, however, that these factors did not sufficiently dispel a belief that a 480-month sentence was necessary. We cannot say the court erred in its consideration of mitigating factors in Foy's history.
Foy next claims that his lengthy sentence was the product of the district court's "predisposition" or "bias" against violent offenders, particularly those with a history of domestic abuse. In probing whether Foy was prejudiced by a lack of advance notice of an upward variance, the court frankly acknowledged its own history of "being harder on violent defenders [sic] than what the government often recommends" and for giving defendants with a "demonstrated lengthy recidivist history of violence and violence against women" a "very bad day." The court also used strong language during the hearing and in its written order to characterize Foy's criminal history and threat to the public. It was not improper, however, for the court to consider the type and nature of Foy's prior criminal activities, and the court clearly did so within its analysis of the § 3553(a)(1) and (a)(2) factors. Foy further acknowledges that the post-Booker sentencing scheme has bestowed wide latitude to individual district court judges in weighing relevant factors. Accordingly, some variation in sentencing will result depending on the identity of the sentencing judge. See United States v. Booker, 543 U.S. 220, 263, 125 S.Ct. 738, 160 L.Ed.2d 621 (2005) ("We cannot and do not claim that the use of a `reasonableness' standard will provide the uniformity that Congress originally sought to secure.").
In any case, we do not think "personal distaste" for the crime, as Foy asserts, ultimately improperly influenced the district court's analysis in this case. Notwithstanding its statements, the district court permitted Foy a full opportunity to argue for a different sentence, engaged with defense counsel to clarify Foy's position, acknowledged factors in Foy's background were significant mitigators, and ultimately addressed his main arguments in its written order. The record reflects the court's decision was the result of serious, reasoned consideration of factors for and against a lengthy sentence based on Foy's individual circumstances.
That being said, the extent of the variance coupled with the length of the resultant sentence gives us pause in this case.[4]
*1038 We are not entitled, however, under our deferential review to overturn a sentencing decision because we might have reasonably concluded a different sentence was appropriate. Feemster, 572 F.3d at 462. Our determination that the district court did not abuse its discretion in this case is bolstered by two factors. First, the district court provided "as our precedent requires, substantial insight into the reasons for its determination." Id. at 464 (quotation omitted). Second, we believe the district court's explicit justifications rest largely on "the kind of defendant-specific determinations that are within the special competence of sentencing courts." Id. (quotation omitted). We conclude the sentence was procedurally sound and substantively reasonable.

III. Conclusion
For the foregoing reasons, we affirm.
BRIGHT, Circuit Judge, dissenting.
I respectfully dissent.
The majority acknowledges that the extent of the variance, coupled with the length of the sentence, gives it pause. Maj. op. at 1037-38. But it reasons that it cannot overturn the sentence because the court did not abuse its discretion.
I disagree. While I support the district court's sentencing discretion, and hold this particular judge in the highest regard, I cannot agree with the imposition of such an excessive sentence on the record before this court. The sentence is substantively unreasonable because the variance from the sentencing guidelines lacks sufficient justification. See United States v. Feemster, 572 F.3d 455, 461 (8th Cir.2009) (en banc) (holding that a district court must "consider the extent of the deviation and ensure that the justification is sufficiently compelling to support the degree of the variance." (internal quotations omitted)).
Under the sentencing guidelines, Foy faced a 210-262 month (17.5 years-21 years, 10 months) sentence. The district court imposed a sentence of 480 months (40 years), approximately double that of the recommended guideline range. The district court justified its forty-year sentence mostly on the nature and circumstances of Foy's present crime and his criminal history.
The court considered the 18 U.S.C. § 3553(a) factors, but the record does not support this variance of two times the guideline range. The district court expressed the most concern over Foy's criminal history. That record reflects that over the past decade, Foy engaged in several altercations, leading to various assault charges. He also has been cited for several minor offenses. His most serious crimes include second- and third-degree burglary. Without question, Foy's criminal history is lengthy. But his offenses are not excessively severe or extraordinary to justify what amounts to a life sentence.
Foy's difficult childhood should be weighed against his criminal background. The presentence report documents a turbulent and unstable childhood. Foy's parents were never married and suffered from a history of substance abuse. Foy's mother, absent from much of his childhood, played little role in his upbringing. At age eight, Foy moved into a boys' home, beginning a series of juvenile placements. By age nine, Foy started receiving treatment for psychiatric problems, including adjustment disorder and mild retardation.
Mental health professionals later diagnosed him with conduct disorder of adolescence and impulse control disorder. Professionals *1039 more recently diagnosed him with antisocial personality disorder and borderline personality traits. Foy's mental health concerns were magnified by his substance abuse problems. Foy began smoking marijuana at age twelve, and within one year, he used the substance on a daily basis. Foy began consuming alcohol at age fourteen and admits to snorting cocaine.
The district court did find the lack of parental guidance and substance abuse "substantially mitigating." At the time of sentencing, Foy was nearly twenty-eight years of age. He had never finished high school. What we have here is an African American man who did not receive much of an education, nor has he received many opportunities to make a useful life for himself. Adding his forty-year sentence to his existing incarceration for the crime of second-degree burglary likely amounts to a life sentence.
The district judge decided in essence that Mr. Foy had led a worthless criminal life and would continue to do so. Thus, the court adopted the principle: jail him now and throw away the key.
I disagree with the district court that committing the crime of mailing threatening communications warrants a forty-year sentence, even with Foy's background. While this is a serious offense, it did not result in physical harm to another individual. Yet the district court imposed a longer sentence than that which many murderers might receive.
Although the district court was not required to provide notice that it intended to vary upward from the guideline range, see Irizarry v. United States, 553 U.S. 708, 128 S.Ct. 2198, 2202, 171 L.Ed.2d 28 (2008), I comment that in this case, where the sentence is almost double that of the guideline range, notice would have improved the sentencing process. The record reflects that the district court did not inform Foy that he could be subject to forty years' imprisonment. At Foy's change-of-plea hearing, the district court stated that "[u]nder Counts 1, 2, and 4 . . . [Foy faced] a maximum possible imprisonment of 5 years" and that "[o]n Counts 3 and 5 there's a maximum possible . . . imprisonment of 20 years in prison." The district court did not inform Foy that the twenty-year sentence could be run consecutively to another twenty-year sentence. Had Foy received notice of the district court's intent to vary so significantly from the sentencing guidelines, he could have introduced evidence that the record did not support an extreme variance. Cf. United States v. Rutherford, 599 F.3d 817, 822 (8th Cir.2010) (affirming a sentence based on consecutive sentences because the district court at the change-of-plea hearing alerted the defendant to the possibility of consecutive sentencing and sufficiently justified the sentence).
Moreover, the heavy sentence rests on consecutive sentences of twenty years for counts 3 and 5. Those counts recite extortion crimes, in which Foy, while in prison, wrote letters to S.K., threatening to kill her and others unless she paid him money. These were serious crimes. Yet these letters were part of related transactions, written and delivered within two weeks of each other, and the prison terms were imposed at the same sentencing hearing. It would seem such a relationship of the crimes would usually call for concurrent sentences.
In this case, we are dealing with a difficult subject: recidivism. Certainly the district court possesses more experience on this subject than appellate judges. But in this case experts with specialized knowledge about recidivism may have aided the district court in sentencing Foy. This record does not provide enough justification to sentence Foy to forty years in prison. I *1040 would vacate the sentence. The case should be remanded so that Foy and the government may have an opportunity to introduce expert testimony addressing the risks of recidivism.
NOTES
[1]  The Honorable Mark W. Bennett, United States District Judge for the Northern District of Iowa.
[2]  The Honorable James E. Gritzner, United States District Judge for the Southern District of Iowa, presided at the trial and at Foy's plea hearing.
[3]  Foy also raises the possibility of an ineffective assistance claim due to prior counsel's failure to present argument to the district court on the plea withdrawal. He acknowledges, however, that a direct appeal is typically an inappropriate forum for such a claim and requests that we preserve this issue for post-conviction proceedings. See United States v. McAdory, 501 F.3d 868, 872 (8th Cir.2007) ("We ordinarily defer ineffective assistance of counsel claims to 28 U.S.C. § 2255 proceedings."). We do not entertain it further here.
[4]  We note that the degree of upward variance in and of itself is not unheard of in our case law. See, e.g., United States v. Gnavi, 474 F.3d 532 (8th Cir.2007) (affirming upward variance to 120 months from advisory Guidelines range of 63 to 78 months).
