                     United States Court of Appeals
                             FOR THE EIGHTH CIRCUIT
                                 ________________

                                    No. 05-3106
                                 ________________

Roger L. Baker,                            *
                                           *
             Appellee,                     *
                                           *       Appeal from the United States
      v.                                   *       District Court for the Northern
                                           *       District of Iowa.
Jo Anne B. Barnhart,                       *
Commissioner of Social Security,           *
                                           *
             Appellant.                    *

                                 ________________

                             Submitted: March 14, 2006
                                 Filed: June 13, 2006
                                ________________

Before COLLOTON, HEANEY, and GRUENDER, Circuit Judges.
                       ________________

GRUENDER, Circuit Judge.

        The Commissioner of Social Security (“Commissioner”) appeals the district
court’s order reversing the Commissioner’s denial of Supplemental Security Income
(“SSI”) disability benefits to Roger L. Baker. The Commissioner argues that the
district court erred in relying on materials not contained in the administrative record,
that substantial evidence supports the Commissioner’s decision to deny benefits and
that even if substantial evidence does not support the Commissioner’s decision, the
proper remedy is a remand to the Social Security Administration (“SSA”) rather than
an order directing the award of benefits. We reverse the district court and remand
with instructions to reinstate the Commissioner’s denial of benefits.

I.    BACKGROUND

       In February 2000, Baker slipped and fell on ice and injured his back. Baker’s
family physician, Dr. Kissel, ordered an MRI after the muscle relaxants and pain
killers he prescribed failed to improve Baker’s condition. The MRI showed
degenerative disc disease, bulging disc material and “neuroforamen stenosis.” Dr.
Kissel referred Baker to a neurologist, Dr. Case. Dr. Case confirmed Baker’s back
problems and also diagnosed Baker with moderate carpal tunnel syndrome and tennis
elbow in his left arm. Dr. Case referred Baker to a surgeon, Dr. Durward, for further
treatment of his lumbar disc disease. Dr. Durward scheduled Baker for back surgery.

       In September 2000, Baker informed his employer, Enterprise Rent-a-Car, that
he could no longer work because of the pain in his back and legs. Dr. Durward and
his associate, Dr. Noel, performed surgery on Baker in November 2000 to fuse two
lumbar vertebrae. Regarding a follow-up examination on January 8, 2001, Dr.
Durward stated in a memorandum, “As far as his back goes he is doing extremely
well. Lost pretty well all of his preoperative pain syndrome. . . . Moves his back
freely. His straight leg raising is unrestricted. . . . The films that were done last month
demonstrate perfect fusion occurring [in the fused lumbar vertebrae].” After
conducting an examination on January 29, 2001, Dr. Noel noted that Baker was
“[d]oing well” and “[d]enies any complaints.” On April 23, 2001, Dr. Noel noted that
he “encouraged [Baker] to get back to work and he is excited by that.” Dr. Noel
recommended a functional capacity evaluation (“FCE”) “to determine [Baker’s] final
restrictions.” Meanwhile, Baker applied for SSI disability benefits on February 28,
2001, but this application was denied on May 10, 2001 because his condition was “not
expected to prevent [him] from working for a continuous period of 12 months from
the date of [his] surgery.”

       On June 15, 2001, Baker was examined by Dr. Durward. This time, Dr.
Durward noted that Baker was “still complaining of significant pain” that was
“[w]orse with physical activity, even light housework in a bent forward position
exacerbates it.” Dr. Durward characterized Baker’s case as a “[p]uzzling situation.
Seems to have an inordinate amount of pain despite what appears to be a satisfactory
x-ray result.” Dr. Durward referred Baker to Dr. Keppen, a pain specialist, and
recommended delaying the FCE until Baker’s pain situation was addressed. Baker
returned to Dr. Durward on August 6, 2001, and Dr. Durward noted that Baker’s
reported pain had not improved and that physical activity such as raking exacerbated
the pain. However, Dr. Durward also noted that Baker could walk and perform
straight leg raising without restriction. Dr. Durward released Baker for light-to-
medium work, restricting him to a maximum occasional lifting limit of 25 pounds, a
frequent lifting limit of 15 pounds and to limited bending at the waist. Dr. Durward
also stated that an FCE would enable him to implement more accurate work
restrictions.

      Baker next visited Dr. Durward on November 11, 2001, complaining of
continuing pain in his back and left forearm. Baker stated that he had raked leaves the
previous day and that doing so exacerbated his back pain. Dr. Durward noted:

      I believe that the fusion is solid and [Baker] is not any longer getting any
      pain from it. I have released him previously for work with lifting limits
      but he has not gone back to work. He has applied for Social Security.
      I think I would give the man the benefit of the doubt here and try one
      more method of treating his residual symptoms. What I would
      recommend is referral to Dr. Mike Donohue for consideration of an
      isokinetic rehabilitation program.

       Baker returned to Dr. Durward on January 4, 2002 with the same complaints
of pain. This time Dr. Durward noted:



      Very difficult situation. We had referred him for isokinetic rehab but
      that was not undertaken. I think there are 2 things that we need to pursue
      here. Firstly, it would be worthwhile getting a second opinion to see
      whether there was something we have missed or something else that may
      be done to help his residual pain. . . . The second thing is that I think it
      would be of value to have a [FCE] done after he has had the second
      opinion to really try to identify exactly what his limitations are. This
      man is only 49. He tells that Social Security was turned down because
      his wife does work. I think he is motivated to do some kind of work if
      we can find something that does not exacerbate his residual pain
      symptoms.

      Baker received the recommended second opinion from Dr. Lynn, but Dr. Lynn
could not identify any cause for Baker’s pain and did not recommend any new
treatment. Baker returned to Dr. Durward on May 29, 2002, after Dr. Durward had
a chance to review the findings of Dr. Lynn as well as those of Dr. Keppen, the pain
specialist. Baker reported that his pain had not decreased and that prolonged sitting,
such as during an extended drive in an automobile, exacerbated his pain. Baker also
reported that he was taking no pain medication other than “intermittent Tylenol.” Dr.
Durward noted:

      This man does describe a significant disabling pain syndrome. At this
      point in time I do not feel confident that there is any structural cause for
      it that I could help. . . . As far as the pain goes, I am going to give him
      samples of a non steroidal Bextra or Mobic to see if that gives him some
      relief. My recommendations are that he can only work on a light duty
      occupation. I would recommend a maximum lifting limit of 25 lbs.,
      frequent lifting limit 10 lbs. Avoid bending at the waist. However
      because this is a complicated situation and he has significant pain I
      would like to get [an FCE] to try and determine more accurately what his
      limitations are.

      Baker participated in an FCE with licensed physical therapist Terry F. Nelson
on July 23, 2002. The FCE results showed evidence that Baker was exaggerating his
symptoms and giving less than full effort during testing. Because the district court
found serious fault with these FCE results, we discuss the FCE methodology in detail.

       The FCE includes multiple evaluations designed to produce an objective
measure of the patient’s effort and cooperation with the goal of the test. The Pain
Replication Test (“PRT”) is designed to “determine whether or not an organic pain
syndrome is present and how much it is limiting normal function.” The detailed FCE
report describes the nature of the PRT as follows:

      The rationale behind the PRT is that a static effort up to a barely
      perceptible increase in pain is reproducible within a small margin of
      error, a 15% Coefficient of Variation (CV). . . . [If] a barely perceptible
      increase in pain is experienced, then that force is not enough to produce
      an injury or an aggravation to the injury, but that level of force is
      reproducible. However, if the patient is exaggerating their [sic]
      symptoms and are not exerting their best effort up to a barely perceptible
      increase in their pain, then it will be difficult for them to produce static
      exertions with good consistency.
      ....
      [The patients then lift] as hard as they are able to do safely. They must
      stay within their perceived safety zone, while attempting to determine
      their maximum tolerable pain level of force. . . .
      ....
      The person with an organic pain syndrome in their Low Back and Upper
      Body will be able to produce two distinctly different levels of force
      exertion, but the symptom exaggerator will not be able to produce two
      distinct levels of force and they will occasionally produce results that are
      impossible.

Nelson concluded in the detailed FCE report that “Mr. Baker demonstrated a non
organic pain response during which he was unable to consistently replicate two levels
of force based on his pain perception. The result is usually consistent with voluntary
submaximal effort on this test.”
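
The consistency check described in the FCE report can be illustrated with a short sketch. It assumes that the "Coefficient of Variation" is the sample standard deviation of repeated exertions divided by their mean, and that the 15% figure is applied as a simple cutoff; the report excerpt does not spell out either point, and the function names and force readings below are hypothetical rather than taken from the record.

```python
from statistics import mean, stdev

CV_THRESHOLD = 0.15  # the 15% figure quoted in the FCE report


def coefficient_of_variation(forces):
    """Sample standard deviation divided by the mean (an assumed definition)."""
    return stdev(forces) / mean(forces)


def is_reproducible(forces, threshold=CV_THRESHOLD):
    """True when repeated exertions stay within the CV threshold."""
    return coefficient_of_variation(forces) <= threshold


# Hypothetical readings in pounds of force, not data from the record.
consistent_trials = [40.0, 42.0, 38.0]
erratic_trials = [20.0, 40.0, 60.0]

print(is_reproducible(consistent_trials))
print(is_reproducible(erratic_trials))
```

Under these assumptions, tightly clustered readings fall within the threshold and widely scattered readings do not, which is the kind of inconsistency the report associates with symptom exaggeration.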


       In addition to the PRT, the FCE included a Blankenship Reliability Profile. The
detailed FCE report states, “The Blankenship Reliability Profile includes two
objective components, Non Organic Signs and Validity. Patients scoring invalid on
both components are felt to be attempting to control the test results to demonstrate a
greater level of disability than what is actually present, the motivation of which is not
known.” The Non Organic Signs measure is based on comparisons of the patient’s
movement patterns with the patient’s own description of his pain, and also on
“distraction” observations of those movements:

      The Distraction category is defined as any improvement in the
      Movement Dysfunction when the patient is not aware that they [sic] are
      being observed, compared back to observations . . . when the patient was
      fully aware that they were being observed. . . . A positive Distraction
      Category is evidence that the patient is attempting to demonstrate a
      greater level of pain disability than are [sic] actually present.

       In his summary letter to Dr. Durward, Nelson stated, “Mr. Baker exhibited
distracted movement patterns throughout the [FCE]. This patient demonstrated trunk
range of motion that improved significantly with distraction.” Nelson also noted,
“When comparing this patient’s movement patterns throughout the [FCE] to the
patient’s complaints of pain throughout the [FCE], it is felt by this evaluator, that the
patient demonstrated a poor correlation between movement patterns and his
complaints of pain.”

     Finally, the detailed FCE report describes the Validity component of the
Blankenship Reliability Profile:

      The Validity Profile is comprised of a cohort of individual tests that
      collectively help determine whether or not the patient is exerting their
      [sic] best effort during all of the FCE tests. . . . If the patient exerts effort
      up to the point of a barely perceptible pain increase, or slightly below
      that level so there is no pain increase at all, then they will pass the
      overall Validity Profile. If the patient does not pass the overall Validity
      Profile, then they [sic] have not exerted their best effort. . . . Current
      research, submitted . . . [for publication], shows that a strong indicator
      of whether or not an evaluee is cooperating with and exerting their [sic]
      best effort on a functional test is the Evaluator’s professional judgment;
      however, most of the Validity Criteria of The Blankenship System FCE
      are based on published research.

      ....

      There is also an empirical relationship between the number of validity
      criteria passed and the degree of effort exerted during testing and the
      reliability of the test results for predicting work performance. This
      algorithm was developed by K. Blankenship published in the revised
      edition of, The Blankenship System Functional Capacity Evaluation: The
      Protocol Manual, revision began in 1996, Macon, GA. That revised
      algorithm is shown below.

      Validity Criteria Passed   Degree of Effort    Word Descriptor
      90-100%                    Excellent Effort    Valid Results
      80-89%                     Good Effort         Valid Results
      70-79%                     Fair Effort         Valid Results
      70-75%                                         Borderline Valid, Results are Conservative
      60-69%                     Poor Effort         Borderline Invalid Results
      <60%                       Very Poor Effort    Invalid Results
      < 20 Criteria              May Be Unreliable, Professional Judgment Required

      Nelson’s summary letter to Dr. Durward read, “This patient exhibited a sub
maximal effort throughout the [FCE]. Mr. Baker passed 62% of the validity criteria[,]
36/58 validity criteria scored. . . . The results of the [FCE] will not provide an
accurate aid in the medical management and vocational planning for this patient.”
Nelson concluded that the FCE results, taken at face value, showed an ability to work
at a “sedentary” level. However, taking into account Baker’s less-than-maximal effort
and symptom exaggeration, Nelson estimated that Baker could handle “light-medium”
work.1
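
The arithmetic behind the 62% figure, and its place in the quoted validity table, can be checked with a minimal sketch. The function name is hypothetical, the band boundaries are read directly from the quoted table, and the overlapping "70-75%" row is deliberately left out because the excerpt does not explain how it interacts with the "70-79%" row. The sketch also assumes that "< 20 Criteria" refers to the number of criteria scored.

```python
def classify_effort(passed, scored):
    """Map validity criteria passed to the effort bands in the quoted table."""
    pct = 100.0 * passed / scored
    if pct >= 90:
        band = ("Excellent Effort", "Valid Results")
    elif pct >= 80:
        band = ("Good Effort", "Valid Results")
    elif pct >= 70:
        band = ("Fair Effort", "Valid Results")
    elif pct >= 60:
        band = ("Poor Effort", "Borderline Invalid Results")
    else:
        band = ("Very Poor Effort", "Invalid Results")
    # Assumption: "< 20 Criteria" means fewer than 20 criteria were scored.
    needs_professional_judgment = scored < 20
    return round(pct), band[0], band[1], needs_professional_judgment


# Figures from Nelson's summary letter: 36 of 58 validity criteria passed.
print(classify_effort(36, 58))
```

On the figures in Nelson's letter, 36 of 58 rounds to 62%, which falls in the table's 60-69% band.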

      Baker returned to Dr. Durward on August 5, 2002. Dr. Durward, based on his
own judgment, agreed with the FCE report that Baker could work at the light-medium
physical demand category. However, Dr. Durward stated, “He has an eighth grade
education. I am very doubtful that he is going to be able to find meaningful work
within the light job classification. It may be reasonable for him to apply for Social
Security.”

       Baker returned to Dr. Kissel for a check-up on October 24, 2002. Dr. Kissel
noted that Baker “says he is unable to do anything. He can’t bend over. He can’t tie
his sho[e]s. He can’t even brush his teeth without having severe pain in his back.”
Dr. Kissel reviewed Dr. Durward’s notes and recognized that Dr. Durward felt
Baker’s functional capacity “was better than [Baker] thinks it is.” Dr. Kissel agreed
that applying for SSI disability benefits was “a good idea.” Dr. Kissel also noted
Baker’s borderline high blood pressure.

       In December 2002, Baker filed a new application for SSI disability benefits,
stating that he suffered from lower back pain, carpal tunnel syndrome, and high blood
pressure. In the application, Baker stated that he operated his own tree-trimming
business but did none of the physical labor. Baker added that because of the pain, he
no longer mowed or raked his yard and that he had trouble using an eight-pound chain
saw to cut tree branches. In a pain questionnaire, Baker stated that he vacuumed the
floor and helped his wife with the dishes “once in a while,” but avoided any bending
and leaning while doing so. He also stated that he drove his son to and from work and
occasionally drove to the grocery store, but that the pressure of the car seat aggravated
his pain.

      1
      Nelson referenced the terms “sedentary” and “light-medium” to the Dictionary
of Occupational Titles, U. S. Department of Labor, 1991.

       Baker received a hearing before an administrative law judge (“ALJ”) on October
30, 2003. He testified that he was born in 1952, had completed his education through
the eighth grade, had not taken a GED test, and was married with one adult son who
currently lived with Baker and his wife. Baker also stated that he was self-employed,
devoting ten to twelve hours per week to a tree-trimming service for which he
solicited customers through local ads, drove himself to customers’ sites and gave
estimates for the jobs, and then hired part-time labor to do the physical work. He was
no longer able to wield his small chain saw for more than a few minutes. He took no
pain medications because none of them helped. In general, he was able to sit
comfortably for only about half an hour at a time and then needed to stand up and
walk around. He was able to sleep only two to three hours at a time without
interruption from the pain. Driving 100 miles caused him serious pain.

      In an opinion issued January 28, 2004, the ALJ ruled that Baker was not
disabled. Applying the five-step evaluation set forth in 20 C.F.R. § 404.1520,2 the
ALJ determined that Baker’s tree-trimming business did not qualify as “substantial
gainful activity” for the purposes of step (i). The ALJ also found that Baker’s lower
back impairment, while severe for purposes of step (ii), did not meet or equal one of
the listings in 20 C.F.R. pt. 404, subpt. P, app. 1, as required in step (iii). Therefore,
the ALJ proceeded to analyze Baker’s residual functional capacity3 (“RFC”) for step
(iv):

      [T]he claimant’s subjective complaints are inconsistent with the medical
      evidence of record. A functional capacity evaluation found evidence of
      non-valid criteria and evidence of deconditioning. Moreover, he takes
      no medication for his pain in spite of rather significant complaints. He
      is capable of full self-care, does a wide variety of household chores and
      outdoor tasks, including car washing, mowing the lawn, and raking
      leaves. He drives a car every day, shops and runs a number of errands.
      The undersigned finds that the claimant performs a significant amount
      of activities of daily living for an individual with such longstanding
      intractable pain.

      All the physicians in the file said the claimant could perform light, (or
      medium), work or that there was no etiology for the alleged pain
      symptoms.

      2
       The five-step evaluation is as follows:

      (i) At the first step, we consider your work activity, if any. If you are
      doing substantial gainful activity, we will find that you are not disabled.
      ...

      (ii) At the second step, we consider the medical severity of your
      impairment(s). If you do not have a severe medically determinable
      physical or mental impairment that meets the duration requirement in §
      404.1509, or a combination of impairments that is severe and meets the
      duration requirement, we will find that you are not disabled. . . .

      (iii) At the third step, we also consider the medical severity of your
      impairment(s). If you have an impairment(s) that meets or equals one of
      our listings in appendix 1 of this subpart and meets the duration
      requirement, we will find that you are disabled. . . .

      (iv) At the fourth step, we consider our assessment of your residual
      functional capacity and your past relevant work. If you can still do your
      past relevant work, we will find that you are not disabled. . . .

      (v) At the fifth and last step, we consider our assessment of your residual
      functional capacity and your age, education, and work experience to see
      if you can make an adjustment to other work. If you can make an
      adjustment to other work, we will find that you are not disabled. If you
      cannot make an adjustment to other work, we will find that you are
      disabled. . . .

20 C.F.R. § 404.1520(a)(4).

      3
       “Residual functional capacity” is defined as the most an individual can still do
despite the “physical and mental limitations that affect what [the individual] can do
in a work setting” and is assessed based on all medically determinable impairments,
including those not found to be “severe.” 20 C.F.R. § 404.1545.
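
The order of the five-step evaluation quoted in footnote 2 can be sketched as a simple decision sequence. This is an illustration only: the function and parameter names are hypothetical, each input stands in for a finding that in practice requires detailed factual and medical analysis, and the step (iv) input in the example is an assumption inferred from the fact that the ALJ went on to step (v).

```python
def sequential_evaluation(doing_sga, severe_impairment, meets_listing,
                          can_do_past_work, can_adjust_to_other_work):
    """Return (finding, step) following the order quoted in footnote 2."""
    if doing_sga:                    # step (i): substantial gainful activity
        return ("not disabled", 1)
    if not severe_impairment:        # step (ii): severity and duration
        return ("not disabled", 2)
    if meets_listing:                # step (iii): meets or equals a listing
        return ("disabled", 3)
    if can_do_past_work:             # step (iv): RFC and past relevant work
        return ("not disabled", 4)
    if can_adjust_to_other_work:     # step (v): adjustment to other work
        return ("not disabled", 5)
    return ("disabled", 5)


# Inputs loosely mirroring the ALJ's findings as described in this opinion.
# can_do_past_work=False is an assumption; the opinion does not state it.
print(sequential_evaluation(
    doing_sga=False,
    severe_impairment=True,
    meets_listing=False,
    can_do_past_work=False,
    can_adjust_to_other_work=True,
))
```

The early returns reflect that a finding at any step ends the inquiry, so later steps are reached only when the earlier ones do not resolve the claim.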

       The ALJ recited the results of Baker’s various medical tests and evaluations and
found that Baker retained the RFC to perform light work, specifically finding that “the
claimant’s allegations regarding his limitations are not totally credible.” Finally, for
step (v), the ALJ applied the Medical-Vocational Guidelines of 20 C.F.R. pt. 404,
subpt. P, app. 2 to Baker’s RFC, age, education, and work experience. Rule 202.10
of the Medical-Vocational Guidelines directed a finding of “not disabled.” The
Appeals Council denied administrative review of the ALJ’s decision, making the
ALJ’s decision the final decision of the Commissioner. Baker sought review in
federal district court.

       The district court reversed the ALJ’s decision and ordered that SSI disability
benefits be awarded to Baker. The district court ruled that the ALJ’s finding that
Baker “does a wide variety of household chores and outdoor tasks, including car
washing, mowing the lawn, and raking leaves” and “performs a significant amount of
activities of daily living” was not supported by substantial evidence on the record as
a whole because Baker indicated that merely attempting such activities exacerbated
his pain. The district court also faulted the ALJ’s reliance on the fact that Baker took
no pain medication. However, the district court devoted most of its analysis to what
it deemed to be a flawed FCE report. The district court did not accept that the FCE
could show Baker to be giving valid effort 62 percent of the time and “cheating” 38
percent of the time. The district court stepped through the results from the individual
FCE tests and repeatedly made comments such as “[f]or some reason, this cheater did
not cheat on this category” and:


       They have been testing him for a number of things. According to the test
       administrator, he has cheated. They are now testing him for static
       strength and he made a good effort and passed twenty-two out of twenty-
       five. This demonstrates, from the test administrator’s point of view, that
       the plaintiff is now well into the tests and has been cheating all along but
       he then decides, “hey, maybe I am cheating too much” so he did not
       cheat at all in this category and the test administrator concludes, “it was
       a good effort on his part.”

       In summary, the district court discredited as a “hard-to-believe scenario” any
conclusion that “a person who admittedly has not been a great student, whose
education is limited” could “put on [such] a real show of deceit.” The district court
also found, without citing to any evidence, that Nelson did not give proper
consideration to Baker’s demonstrated poor cardiovascular fitness or “deconditioning”
in evaluating Baker’s level of effort on the FCE tests. Acknowledging that neither
Baker’s doctors nor the ALJ saw any problems with the FCE, the district court stated
that “[n]ot only did the ALJ not know what was in the test, the doctors seem to have
bought some of it without knowing how Nelson arrived at his scores.”

       Finally, the district court took judicial notice of an article4 (“Soderberg article”)
not contained in the administrative record to discredit Baker’s FCE results. The
district court characterized the article as a treatise. Relying on a brief general criticism
in the Soderberg article of the use of functional capacity evaluations, the district court
found that the FCE methods employed by Nelson were not reliable and that “[t]he ALJ
relied too heavily on this seemingly unreliable test.”



       4
        Gary L. Soderberg, “A Note on Methods of Testing for Human Performance
Capacity,” available at http://www.costreductiontech.com/pdf/validitypaper.pdf (last
visited May 2, 2006).
       With the FCE results discredited, the district court found that the record
overwhelmingly supported a finding of disability and directed the Commissioner to
compute and award benefits. The Commissioner appeals, arguing that the district
court erred in relying on the extra-record Soderberg article and that substantial
evidence supports the ALJ’s decision. In the alternative, the Commissioner argues
that even if substantial evidence does not support the ALJ’s decision, the proper
remedy is a remand to the SSA rather than an order directing the award of benefits.

II.   DISCUSSION

      A.     Consideration of Evidence Outside the Administrative Record

        The parties contest the role played in the district court’s decision by the extra-
record Soderberg article, noticed sua sponte by the district court. Baker contends that
the district court did not ascribe evidentiary value to the extra-record article, but
instead characterized it as a treatise. In doing so, the district court relied on our
decision in United States v. Eagleboy, 200 F.3d 1137, 1140 (8th Cir. 1999), in which
we held that judicial notice of a document was permissible where the document “[did]
not present new evidence on a disputed question of fact.” We stated further that
“judicial opinions, treatises, law review articles, public records, and the like . . . may
be cited for the first time on appeal in support of a legal theory that was raised in the
trial court.” Id.

       The Commissioner, on the other hand, contends that the Soderberg article
constitutes extra-record expert opinion evidence. In the context of judicial review of
a decision of the Commissioner regarding SSI disability benefits, evidence outside the
administrative record generally is precluded from consideration by the court. Delrosa
v. Sullivan, 922 F.2d 480, 483 (8th Cir. 1991); see also Johnson v. Chater, 108 F.3d
942, 946 (8th Cir. 1997) (holding, where the claimant urged the court to take judicial
notice of the weight of a gallon of gasoline to controvert the ALJ’s finding regarding
his lifting ability, that “to take judicial notice of a fact such as the one [the claimant]
suggests would undermine the ALJ’s role as the factfinder under the Social Security
Act”); 42 U.S.C. § 405(g) (“The court shall have power to enter, upon the pleadings
and transcript of the record, a judgment affirming, modifying, or reversing the
decision of the Commissioner of Social Security, with or without remanding the cause
for a rehearing.” (emphasis added)).5

       We find that the district court abused its discretion by taking judicial notice of
the Soderberg article. See Johnson, 108 F.3d at 946 (noting that the court “has
discretion as to whether to take judicial notice”). The district court took judicial
notice of the article as a treatise. A treatise is defined by Merriam-Webster’s
Collegiate Dictionary (11th ed.) as “a systematic exposition or argument in writing
including a methodical discussion of the facts and principles involved and conclusions
reached.” In addressing the treatise exception to the hearsay rule, the Second Circuit
stated:

      Learned treatises are considered trustworthy because “they are written
      primarily for professionals and are subject to scrutiny and exposure for
      inaccuracy, with the reputation of the writer at stake.” Fed. R. Evid.
      803(18) advisory committee note. Failure, therefore, to lay a foundation
      as to the authoritative nature of a treatise requires its exclusion from
      evidence because the court has no basis on which to view it as
      trustworthy.

Schneider v. Revici, 817 F.2d 987, 991 (2d Cir. 1987).



      5
       If a party shows good cause that new material evidence should be considered,
the district court may remand to the agency for consideration of that evidence.
Delrosa, 922 F.2d at 483-84; see 42 U.S.C. § 405(g).
       This logic applies with equal force where a court takes judicial notice of a
treatise. The district court did not identify, and we cannot discern, any indication that
the Soderberg article qualifies as a treatise of an authoritative nature. The article itself
states, “This note provides a brief overview of various systems or techniques used in
assessing a patient’s capacity to perform.” (Emphasis added.) The article, which is
only five pages long, makes conclusory statements about several methods of testing
human muscle performance and cites published sources for those statements, but the
article does not include any “methodical discussion of the facts and principles
involved.”

       Furthermore, there is no indication that the article itself has been published in
a peer-reviewed journal. The article apparently is available only on the internet at the
web site of Cost Reduction Technologies, LLC and is written by that company’s Chief
Scientist and Technical Director. That company promotes, and sells testing equipment
for, “isokinetic” testing, an alternative to the FCE methods employed by Nelson.
Given the limited depth of the article and the author’s and publisher’s interests in
selling alternative testing equipment, there is no basis upon which we may conclude
that the article’s brief criticism of the FCE is trustworthy in the nature of an
authoritative treatise. Therefore, the district court abused its discretion by taking
judicial notice of the Soderberg article and relying upon it to discredit the FCE results.

       B.     The Decision of the ALJ

       The ALJ found that Baker retained the RFC to do light work and then applied
the Medical-Vocational Guidelines to determine that Baker was not disabled. “The
findings of the Commissioner of Social Security as to any fact, if supported by
substantial evidence, shall be conclusive.” 42 U.S.C. § 405(g). Our standard of review
of the ALJ’s decision is as follows:



      Our role on review is to determine whether the Commissioner’s findings
      are supported by substantial evidence on the record as a whole.
      Substantial evidence is less than a preponderance, but is enough that a
      reasonable mind would find it adequate to support the Commissioner’s
      conclusion. In determining whether existing evidence is substantial, we
      consider evidence that detracts from the Commissioner’s decision as well
      as evidence that supports it. As long as substantial evidence in the
      record supports the Commissioner’s decision, we may not reverse it
      because substantial evidence exists in the record that would have
      supported a contrary outcome or because we would have decided the
      case differently.

McKinney v. Apfel, 228 F.3d 860, 863 (8th Cir. 2000) (internal citations omitted).

       Baker challenges the ALJ’s rejection of Baker’s assertions regarding his
physical limitations, such as his claim that he must alternate between sitting and
standing every half-hour. However, substantial evidence in the record as a whole
supports the ALJ’s finding that Baker’s “allegations regarding his limitations are not
totally credible.” The ALJ was entitled to draw conclusions about Baker’s credibility
based on the FCE pain-replication and distraction analyses indicating that Baker was
exaggerating symptoms and giving less than his full effort.6 See Clay v. Barnhart, 417
F.3d 922, 930 n.2 (8th Cir. 2005) (noting that two psychologists’ findings that the
claimant was “malingering” on her IQ tests cast suspicion on the claimant’s
motivations and credibility); Jones v. Callahan, 122 F.3d 1148, 1152 (8th Cir. 1997)
(holding that a physician’s observation “of the discrepancies in [the claimant’s]
appearance in the examining room and those outside when he did not know that he
was observed” supported an ALJ’s finding that the claimant’s complaints were not
fully credible).

      6
        With regard to the FCE’s conclusions about effort and symptom exaggeration,
we find that the district court’s characterization of the physical therapist as merely
watching each FCE test and deciding on an ad hoc basis if the subject is “cheating”
on that test is not supported by the record. Instead, the record indicates that the FCE’s
conclusions about overall effort and symptom exaggeration are drawn in an empirical
fashion by comparing the results of a large number of tests and observations. Because
Baker did not submit any evidence to the ALJ challenging the reliability of the FCE
methods employed by Nelson, and because those methods were accepted by Dr.
Durward and Dr. Kissel, Baker’s treating physicians, we see no reason not to accept
the FCE results.

       Baker also alleges that the ALJ improperly discredited Baker’s subjective
complaints of pain due to the lack of objective evidence. While “an ALJ may not
disregard subjective pain allegations solely because they are not fully supported by
objective medical evidence, an ALJ is entitled to make a factual determination that a
Claimant’s subjective pain complaints are not credible in light of objective medical
evidence to the contrary.” Ramirez v. Barnhart, 292 F.3d 576, 581 (8th Cir. 2002)
(internal citation omitted). In this case, the ALJ discredited Baker’s subjective
complaints of pain after considering the indications of symptom exaggeration during
the FCE, Baker’s choice not to take pain medication, his ability to do a wide variety
of chores and otherwise perform “a significant amount of activities of daily living,”
and the failure of repeated examinations by Dr. Durward, Dr. Noel and Dr. Lynn to
uncover any physical explanation for Baker’s reported lower back and hip pain.

       Baker specifically challenges the ALJ’s finding that he “does a wide variety of
household chores and outdoor tasks, including car washing, mowing the lawn, and
raking leaves.” Baker contends that the record merely reflects that when he attempted
to rake leaves or do other chores, his pain prevented him from doing so. An
examination of the record reveals that Baker told Dr. Durward during his various
visits that “even light housework in a bent forward position exacerbates” the pain
(June 15, 2001); that raking exacerbated the pain (August 6, 2001); and, again, that
raking exacerbated the pain (November 11, 2001). In his December 2002 application
for SSI disability benefits, Baker stated that he no longer mowed or raked his yard,
that he had trouble using an eight-pound chain saw to cut tree branches, and that he
vacuumed the floor and helped his wife with the dishes “once in a while.” Finally,
Baker reiterated before the ALJ that he was no longer able to wield his small chain
saw for more than a few minutes and that he was able to help with the dishes in a
minimal fashion for 15 or 20 minutes at a time.

       We find that this evidence does not constitute substantial evidence on the record
to support the ALJ’s finding that Baker “does a wide variety of household chores and
outdoor tasks, including car washing, mowing the lawn, and raking leaves,” at least
to the extent that the finding suggests Baker performed those activities for more than
a few minutes at a time. However, that finding was just one component of the ALJ’s
finding that Baker’s subjective complaints were not consistent with the record
evidence. The record does contain substantial evidence to support the ALJ’s findings
that Baker “is capable of full self-care” and “drives a car every day, shops and runs
a number of errands.” These findings, especially those regarding the activities
associated with Baker’s tree-trimming business, support the conclusion that Baker
performed “a significant amount of activities of daily living.”

       In addition, the ALJ also discredited Baker’s subjective complaints of pain
based on the indications of symptom exaggeration during the FCE, Baker’s choice not
to take pain medication and the absence of an etiology for the alleged pain symptoms.
All these grounds are supported by substantial evidence on the record as a whole.
First, as discussed above, the report of symptom exaggeration on the FCE provides
good cause for the ALJ to discredit Baker’s subjective complaints of pain. Second,
Baker’s decision not to take pain medication was a valid factor for the ALJ to
consider.7 See Curran-Kicksey v. Barnhart, 315 F.3d 964, 969 (8th Cir. 2003)
(“[E]vidence that [the claimant] did not regularly require prescription medication or
physical therapy could create doubt in a reasonable adjudicator’s mind with regard to
her testimony about the extent of her pain.”). Third, the numerous medical opinions
from Baker’s treating physicians agree that Baker’s reports of pain were inconsistent
with his physical condition. “A treating physician’s opinion is due controlling weight
if that opinion is well-supported by medically acceptable clinical and laboratory
diagnostic techniques and is not inconsistent with the other substantial evidence in the
record.” Ellis v. Barnhart, 392 F.3d 988, 995 (8th Cir. 2005) (quoting Hogan v. Apfel,
239 F.3d 958, 961 (8th Cir. 2001)) (internal quotation omitted).

      7
       Although not cited by the ALJ, Dr. Durward’s notes of January 4, 2002 reflect
that Baker also did not undertake the recommended course of physical therapy.

       “We ‘will not disturb the decision of an ALJ who considers, but for good cause
expressly discredits, a claimant’s complaints of disabling pain.’” Goff v. Barnhart,
421 F.3d 785, 792 (8th Cir. 2005) (quoting Gowell v. Apfel, 242 F.3d 793, 796 (8th
Cir. 2001)). In this case, the ALJ considered Baker’s subjective complaints of pain
but found independent reasons, each supported by substantial evidence, to discredit
those complaints. Therefore, we find that the ALJ did not improperly discredit
Baker’s subjective complaints of pain.

       After finding that substantial evidence supports the ALJ’s findings that Baker’s
assertions of his limitations and his subjective complaints of pain are not wholly
credible, we also find that substantial evidence supports the ALJ’s finding that Baker
retained the RFC to do light work. Nelson concluded in the FCE report that Baker
was capable of working in the light-medium category. In addition, after receiving the
FCE report, Dr. Durward examined Baker again and agreed, independent of the FCE,
that Baker was capable of work in the light-medium category. The FCE results and
Dr. Durward’s opinion provide strong support for the ALJ’s finding that Baker
retained the RFC to perform light work.



       Baker contends that the ALJ should have given greater weight to the fact that
both of his treating physicians, Dr. Kissel and Dr. Durward, encouraged him to apply
for Social Security disability benefits. This argument fails. Dr. Durward agreed that
Baker was capable of light/medium work but stated, “[Baker] has an eighth grade
education. I am very doubtful that he is going to be able to find meaningful work
within the light job classification.” Dr. Kissel reviewed Dr. Durward’s notes and
agreed with his conclusion. However, a physician’s opinion regarding a claimant’s
ability to find work within a particular classification is not a “medical opinion.” See
Stormo v. Barnhart, 377 F.3d 801, 806 (8th Cir. 2004) (“[T]reating physicians’
opinions are not medical opinions that should be credited when they simply state that
a claimant can not be gainfully employed, because they are merely opinions on the
application of the statute, a task assigned solely to the discretion of the
[Commissioner].” (internal quotation marks omitted)).

       Finally, the ALJ properly applied the Medical-Vocational Guidelines to
determine if Baker could make an adjustment to other work. Generally, where the
claimant suffers from a nonexertional impairment such as pain, the ALJ must obtain
the opinion of a vocational expert instead of relying on the Medical-Vocational
Guidelines. Ellis, 392 F.3d at 996. However, the Guidelines still may be used where
the nonexertional impairments “do[] not diminish or significantly limit the claimant’s
residual functional capacity to perform the full range of Guideline-listed activities.”
Id. (quotation omitted). In particular, “[w]hen a claimant’s subjective complaints of
pain ‘are explicitly discredited for legally sufficient reasons articulated by the ALJ,’
the Secretary’s burden [at the fifth step] may be met by use of the
[Medical-Vocational Guidelines].” Naber v. Shalala, 22 F.3d 186, 189-90 (8th Cir.
1994) (quoting Hutsell v. Sullivan, 892 F.2d 747, 750 (8th Cir. 1989)). In this case,
as discussed above, the ALJ expressly discredited Baker’s subjective complaints of
pain for legally sufficient reasons; therefore, use of the Guidelines was proper.8

       We conclude that the ALJ’s determination that Baker was not disabled is
supported by substantial evidence in the record as a whole. Therefore, we do not
reach the issue of whether the district court erred in directing the award of benefits
rather than remanding to the SSA for further proceedings.

III.   CONCLUSION

       We conclude that the district court erred in relying on materials not contained
in the administrative record and in finding that the Commissioner’s denial of benefits
was not supported by substantial evidence. Therefore, we reverse the district court
and remand with instructions to reinstate the Commissioner’s denial of benefits.

HEANEY, Circuit Judge, dissenting.

      I would remand this case for rehearing because the ALJ failed to fully and fairly
develop the record prior to determining Baker’s eligibility for social security disability
payments. Thus, I respectfully dissent from the majority opinion affirming the
Commissioner’s denial of benefits.


       8
        The dissent notes that the ALJ did not expressly find that Baker could not
return to his past relevant work. However, this finding was made implicitly when the
ALJ proceeded past step (iv) and reached step (v) of the 20 C.F.R. § 404.1520
analysis. See footnote 2, ante. The absence of an express finding did not prejudice
Baker with respect to burden-shifting at step (v) because use of the
Medical-Vocational Guidelines is sufficient to meet the Commissioner’s burden. See
Naber, 22 F.3d at 189-90 (“[T]he Secretary’s burden [at the fifth step] may be met by
use of the [Medical-Vocational Guidelines].” (quoting Hutsell, 892 F.2d at 750)
(alterations in Naber)).
      It is settled law in this circuit that social security hearings are nonadversarial,
and the ALJ is responsible, independent of the claimant’s burden, for fully and fairly
developing the record. Snead v. Barnhart, 360 F.3d 834, 838 (8th Cir. 2004). The
duty to develop the record extends to cases like this one where the claimant is
represented by counsel. Id. “The ALJ possesses no interest in denying benefits and
must act neutrally in developing the record.” Id. Here, the ALJ failed in several
instances to fully and fairly develop the record.

       The ALJ found that Baker had not engaged in substantial gainful employment
since the alleged onset of his disability in February 2002. The ALJ further found that
Baker had a combination of impairments considered severe, but not medically equal
to a listed impairment, and finally that, pursuant to the Medical-Vocational rule
200.10, he was not disabled.

       The ALJ found that Baker could “perform the full range of light work,” (App.
at 63), but failed to make a finding regarding Baker’s ability to return to his own past
relevant work. This omission was serious and prejudicial, because if Baker could not
return to his past relevant work, the burden would shift to the Commissioner to prove
that there was other work in the national economy that Baker could perform in light
of his severe impairments.9 Baumgarten v. Chater, 75 F.3d 366, 368 (8th Cir. 1996).

        Baker’s principal occupation from May 1983 to September 2000 was that of a
tree trimmer. (Admin. R. at 106.) Until Baker fell and was injured in September
2000, he did all of the tree work himself. (Id. at 384.) Thereafter, he was limited to
supervising others in the tree trimming operation. (Id. at 377-382.) His work as a tree
trimmer is classified as heavy (Dictionary of Occupational Titles, § 408.664-010),
and clearly beyond his current functional capacity. Had the ALJ found that tree
trimming was Baker’s past relevant work, the burden would have shifted to the
Commissioner, as noted above.

      9
       The initial examiner for the Social Security Administration determined that
Baker “is not able to return to his previous vocation.” (Admin. R. at 140.)

       Prior to working as a tree trimmer, Baker worked as an assembly line worker
in a hog processing plant from October 1970 to May 1983. (Admin. R. at 106, 110.)
This job is also classified as heavy, and therefore beyond Baker’s present capabilities.
(Dictionary of Occupational Titles, § 525.381-014.) Thus, from 1970 to 2000,
Baker’s past relevant work was in positions classified as “heavy,” and beyond his
current functional capacity.



       In addition to the full-time occupations discussed above, Baker supplemented
his income with part-time jobs. From December 1994 to December 1999, Baker
worked approximately fifteen hours a week as a laundry attendant. (Admin. R. at 106,
109.) From December 1999 to September 2002, Baker worked approximately thirty-
three hours per week as an automobile detailer for Enterprise Rent A Car. (Id. at 106,
108.) The ALJ failed to determine whether Baker was capable of performing the
duties of either of these past jobs.

       The ALJ also erred in holding that Baker was not disabled because he took no
pain medication. The ALJ’s statement that Baker “takes no medications for his pain
in spite of rather significant complaints” is only half true. (App. at 61.) The record
is replete with references to pain medications prescribed by Baker’s treating
physicians and nothing supports an inference that he did not take the prescribed
drugs.10 Baker did report on a Social Security Administration form that he was not
“presently” taking pain medication. (Admin. R. at 128.) But Baker also testified that
he quit taking the pain medication because it did not relieve his pain. (Id. at 384.)
Quitting pain medication because it does not work is far different from quitting it
because he was not in pain. In Baumgarten, 75 F.3d at 369, the claimant testified that
she quit taking prescription pain medication because it was no more effective than
Tylenol. This court held that the ALJ’s “erroneous assertion” that the claimant
provided no reasonable explanation for discontinuing the pain medication called the
ALJ’s ultimate decision into doubt. Id.

      10
       See, e.g., Admin. Rec. at 143-145, 148-150, 157-158, 160, 175, 180, 184, 189,
192, 202, 216.

      As the majority notes, the ALJ also incorrectly determined that Baker’s
complaints of pain were inconsistent with his performance of activities of daily living.
The ALJ improperly embellished Baker’s activities, noting that he did a wide variety
of household chores and outdoor tasks, including car washing, mowing the lawn, and
raking leaves. As to household chores, Baker testified:

      I don’t do much around the house. I help my wife barely with the dishes.
      I take them out of the sink and I stand at the counter. I hand them – the
      dishes to my wife, she bends over and leans over to put the dishes in the
      dishwasher. And once in a while I’ll wash the pots and pans at the
      kitchen sink. And about 15, 20 minutes of doing that it’s just – it’s really
      hard on me. I got to quit doing it.

(Admin. R. at 381.) The ALJ asked no further questions regarding household chores.
 The ALJ made its determination regarding car washing, mowing the lawn, and raking
leaves based on the daily activities questionnaire. In answering the questionnaire,
Baker stated that he “rarely” washed the car, “rarely” mowed the lawn, and “rarely”
raked the leaves. (Id. at 124 (emphasis added).) Certainly it was improper for the
ALJ to omit the fact that Baker only rarely did these chores.

      Furthermore, this court has consistently held that “the ability to do activities
such as light housework and visiting with friends provides little or no support for the
finding that a claimant can perform full-time competitive work.” Hogg v. Shalala, 45
F.3d 276, 278 (8th Cir. 1995); see also Baumgarten, 75 F.3d at 369 (finding the
claimant’s ability to make the bed, prepare food, perform light housekeeping, grocery
shop, and visit friends an unpersuasive reason to deny benefits). Accordingly, Baker’s
failure to take prescription pain medication and his rare performance of daily activities
do not support the ALJ’s conclusion that Baker is not entitled to disability benefits.

      The ALJ gave significant weight to the residual functional capacity evaluation,
which stated in part:



      1.     This patient exhibited a sub maximal effort throughout the
             functional capacity evaluation. Mr. Baker passed 62% of the
             validity criteria 36/58 validity criteria scored. . . . The results of
             the functional capacity evaluation will not provide an accurate aid
             in the medical management and vocational planning for this
             patient.

      2.     Mr. Baker demonstrated during the functional capacity evaluation,
             an ability to work at a Sedentary physical demand level. This
             would allow an individual to perform occasional material handling
             activities with 10 lbs and less weight, 1-33% of the day. This is
             according to the Dictionary of Occupational Titles, US
             Department of Labor, 1991. The results are considered this
             patient’s minimal level of function and his maximal level of
             function must be left to conjecture. . . .

      3.     Estimated Physical Demand Level – The results of the functional
             capacity evaluation demonstrated an invalid result. Mr. Baker
             during the Occasional Material handling tests demonstrated the
             ability to perform material handling activities at a Sedentary
             Level, 10 lbs and less, however, the results of the Occasional
             material handling tests were invalid. . . . This patient will be
             placed at a Light-Medium Physical Demand Level. This would
             allow an individual to perform occasional material handling
             activities with 35 lbs and less. This is according to the Dictionary
             of Occupational Titles, U.S. Department of Labor, 1991.


(Commissioner’s Factual Addendum for Appellant’s Br. at 69B.)

       The district court rejected the ALJ’s finding and questioned the validity of the
test. The district court concluded that it was highly improbable that a 50-year-old man
with an eighth grade education could manipulate the test, that he could have a valid
effort on 62% of the tests and invalid on the remainder. In plain language, the court
opined that it could not accept the tester’s conclusion that Baker had selectively
cheated on the test, and refused to give the test weight in its decision. To buttress its
opinion, the district court improperly considered and quoted from a “treatise” by Gary
L. Soderberg. Irrespective of the Soderberg article, the record sufficiently raises
concerns regarding the FCE as a basis for awarding or denying disability benefits.
This question deserves careful study and consideration by the ALJ and the
Commissioner in the first instance.

       Thus, I would remand to the ALJ for a hearing consistent with this dissent, that
includes an opportunity to fully explore the propriety of the use of the FCE in this
case. The ALJ has the responsibility to fully and fairly develop the record,
recognizing that because Baker cannot return to his past relevant work, the burden
shifts to the Commissioner to show that there are jobs in the national or regional
economy in meaningful numbers that Baker can perform despite his disabilities.
                       ______________________________




