
407 F.Supp.2d 351 (2006)
UNITED STATES of America
v.
Amando MONTEIRO, Valdir Fernandes, Angelo Brandao, Brima Wurie, Luis Rodrigues, Manuel Lopes, Defendants.
No. CRIM. 03-10329-PBS.
United States District Court, D. Massachusetts.
January 6, 2006.
*352 *353 *354 Theodore V. Heinrich, Donald A. Cabell, Glenn A. MacKinlay, United States Attorney's Office, Boston, MA, for U.S.
Kevin S. Nixon, Boston, MA, E. Peter Parker, Law Office of E. Peter Parker, Boston, MA, for Amando B. Monteiro.

MEMORANDUM AND ORDER
SARIS, District Judge.

I. INTRODUCTION
Pursuant to Fed.R.Evid. 702, the defendants seek to exclude expert testimony that cartridge cases found at several crime scenes match firearms linked to defendants based on "unique" toolmarks transferred from the firearms to the ammunition. Specifically, defendants seek to exclude the testimony of a firearms examiner from the Massachusetts State Police who examined cartridge casings found at the scenes of the shooting of Dinho Fernandes and the attempted shootings of Alcides DePina and Antonio Diaz.
Defendants argue: (1) that the standard methodology of firearms identification is unreliable under Daubert v. Merrell Dow, *355 509 U.S. 579, 113 S.Ct. 2786, 125 L.Ed.2d 469 (1993) and Kumho Tire Co. v. Carmichael, 526 U.S. 137, 119 S.Ct. 1167, 143 L.Ed.2d 238 (1999); (2) that even if the methodology of firearms identification is reliable, the examiner is not qualified as an expert in the field; (3) that the examiner did not apply that established methodology adequately; and (4) that, in any event, with respect to one gun, the FEG FP 9, the use of replacement parts to test fire the weapon rendered the match to recovered ammunition unreliable.
At a six-day evidentiary hearing, Special Agent Timothy Curtis, operations officer for the forensic laboratories at the Bureau of Alcohol, Tobacco, Firearms and Explosives in Maryland and former chief of the Firearms Section, and Mary Kate McGilvray, the quality manager with the Massachusetts State Police Crime Laboratory, testified for the government. Witnesses called by defendants included David J. LaMagna, an engineer with a Masters in Materials Science; Mary-Jacque Mann, a former firearms examiner with the National Fish and Wildlife Forensics Laboratory, who holds a Masters of Forensic Science, and is a scanning electron microscopist; Catherine Doherty, commander of the ballistics unit of the Boston Police Department; and Sgt. Douglas Weddleton, the Massachusetts State Police firearms examiner whose testimony is being challenged. The defense also submitted the affidavit of Adina Schwartz, an Associate Professor of Law at the John Jay College of Criminal Justice, and the government submitted the affidavit of FBI Special Agent Philip Ball.
Based on the extensive documentary record replete with photographs, demonstratives, and journal articles, this Court holds that the underlying scientific principle behind firearm identification -- that firearms transfer unique toolmarks to spent cartridge cases -- is valid under Daubert. However, the process of deciding that a cartridge case was fired by a particular gun is based primarily on a visual inspection of patterns of toolmarks, and is largely a subjective determination based on experience and expertise. Because of the subjective nature of the matching analysis, a firearms examiner must be qualified through training, experience, and/or proficiency testing to provide expert testimony. Moreover, an examiner must follow the established standards for intellectual rigor in the toolmark identification field with respect to documentation of the reasons for concluding there is a match (including, where appropriate, diagrams, photographs or written descriptions), and peer review of the results by another trained examiner in the laboratory. These standards ensure the reliability of the expert's results and the testability of the opinion.
If the government meets these standards at trial, the expert may give an opinion of a match to a reasonable degree of certainty in the ballistics field. However, the expert may not testify that there is a match to an exact statistical certainty.
The Court concludes that (1) the methodology is reliable; (2) the examiner is qualified by reason of training, experience and proficiency testing; (3) the expert opinion is inadmissible because it fails to comport with the standards for documentation and peer review in the ballistics field; and (4) the dispute over the effect of replacement parts does not render the testimony inadmissible but goes to the weight of the evidence. The motion in limine is ALLOWED without prejudice to the government's re-submission of test results that comply with the standards in the ballistics field.

II. BACKGROUND
Defendants have been indicted for violations of the Racketeer Influenced and Corrupt *356 Organizations Act (RICO), 18 U.S.C. § 1961 et seq., related to their alleged membership in a violent street gang known as Stonehurst. The government intends to prove these allegations, in part, through the use of expert testimony by firearms examiner Sgt. Weddleton of the Massachusetts State Police. Sgt. Weddleton seeks to offer his opinion that cartridge cases recovered from the scenes of these various shootings match cartridge cases test-fired from guns linked to the defendants. In particular, Sgt. Weddleton will opine that these cartridge cases were fired by a 9 mm. Ruger and a 9 mm. FEG FP 9 Browning High-Power.
With respect to the FEG FP 9, police recovered part of the gun in a sewer in a state of disrepair. Sgt. Weddleton reconstructed the gun using numerous replacement parts and test-fired it. He then examined the test-fired cartridge case and the cartridge case recovered from the crime scene, and declared them a match.

III. DISCUSSION
A. The Court's Gatekeeper Role Under Daubert and Kumho Tire
The admission of expert evidence is governed by Fed.R.Evid. 702, which codified the Supreme Court's holding in Daubert v. Merrell Dow and its progeny. See United States v. Diaz, 300 F.3d 66, 73 (1st Cir. 2002); see also Fed.R.Evid. 702 advisory committee's note. Rule 702 states:
If scientific, technical, or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training, or education, may testify thereto in the form of an opinion or otherwise, if (1) the testimony is based upon sufficient facts or data, (2) the testimony is the product of reliable principles and methods, and (3) the witness has applied the principles and methods reliably to the facts of the case.
Fed.R.Evid. 702.
The trial court must determine whether the expert's testimony "both rests on a reliable foundation and is relevant to the task at hand" and whether the expert is qualified. Daubert, 509 U.S. at 597, 113 S.Ct. 2786; Diaz, 300 F.3d at 73 ("[A] proposed expert witness must be sufficiently qualified to assist the trier of fact, and [] his or her expert testimony must be relevant to the task at hand and rest on a reliable basis."). Because "the admissibility of all expert testimony is governed by the principles of Rule 104(a)," the proponents of the expert testimony must establish these matters by a preponderance of the evidence. Fed.R.Evid. 702 advisory committee's note (citing Bourjaily v. United States, 483 U.S. 171, 107 S.Ct. 2775, 97 L.Ed.2d 144 (1987)); see also Moore v. Ashland Chem., Inc., 151 F.3d 269, 276 (5th Cir.1998).
As a threshold matter, the government suggests that because toolmark identification evidence has been deemed admissible by many other courts, the burden of proving such evidence to be unreliable should shift to the defendants. I disagree. Because reliability under Daubert is among the preliminary inquiries a court must address under Fed.R.Evid. 104(a), the burden of proof with respect to reliability remains on the proponent of the evidence. See Daubert, 509 U.S. at 592 n. 10, 113 S.Ct. 2786; Moore, 151 F.3d at 276 ("The proponent need not prove to the judge that the expert's testimony is correct, but she must prove by a preponderance of the evidence that the testimony is reliable.").
First, the Court must determine whether all proffered expert testimony is sufficiently reliable to be admitted, whether *357 "scientific" or not. See Kumho Tire, 526 U.S. at 147, 119 S.Ct. 1167 (noting that "[t]he trial judge's effort to assure that the specialized testimony is reliable and relevant can help the jury evaluate that [testimony], whether the testimony reflects scientific, technical, or other specialized knowledge"). The Court must make a determination as to whether "the reasoning or methodology underlying the testimony is scientifically valid and of whether that reasoning or methodology properly can be applied to the facts in issue." Daubert, 509 U.S. at 592-93, 113 S.Ct. 2786; Gen. Elec. Co. v. Joiner, 522 U.S. 136, 142, 118 S.Ct. 512, 139 L.Ed.2d 508 (1997). Daubert itself listed five factors which should guide judges in this determination: (1) whether the theory or technique can be and has been tested; (2) whether the technique has been subject to peer review and publication; (3) the technique's known or potential rate of error; (4) the existence of standards controlling the technique's operation; and (5) the level of the theory's or technique's acceptance within the relevant discipline. Daubert, 509 U.S. at 593-94, 113 S.Ct. 2786. "These factors, however, are not definitive or exhaustive, and the trial judge enjoys broad latitude to use other factors to evaluate reliability." United States v. Mooney, 315 F.3d 54, 62 (1st Cir.2002) (citing Kumho Tire, 526 U.S. at 153, 119 S.Ct. 1167).
In Kumho Tire, the Supreme Court was careful to emphasize that the trial judge must exercise her gatekeeping role with respect to all expert evidence, but that how she might exercise that role would necessarily vary depending on the type of testimony at issue. See Kumho Tire, 526 U.S. at 150, 119 S.Ct. 1167; United States v. Frazier, 387 F.3d 1244, 1262 (11th Cir.2004) ("Exactly how reliability is evaluated may vary from case to case, but what remains constant is the requirement that the trial judge evaluate the reliability of the testimony before allowing its admission at trial."); Amorgianos v. Amtrak, 303 F.3d 256, 266 (2d Cir. 2002) (recognizing that "the Daubert inquiry is fluid and will necessarily vary from case to case"). Numerous courts have recognized that the particular factors the Court outlined in Daubert may not perfectly fit every type of expert testimony, particularly technical testimony based primarily on the training and experience of the expert. See, e.g., Thomas v. City of Chattanooga, 398 F.3d 426, 431 (6th Cir.2005) (noting that "lower courts have flexibility in the application of the factors, because it may not make sense to apply some of the Daubert factors, such as the rate of error analysis, to non-scientific testimony"); United States v. Hankey, 203 F.3d 1160, 1169 (9th Cir.2000) (noting that the "Daubert factors . . . simply are not applicable to . . . testimony, whose reliability depends heavily on the knowledge and experience of the expert, rather than the methodology or theory behind it"). Under Kumho Tire, it is clear that testimony based on experience must rest on a reliable foundation. The critical inquiry is whether the expert "employs in the courtroom the same level of intellectual rigor that characterizes the practice of an expert in the relevant field." 526 U.S. at 156, 119 S.Ct. 1167; Rider v. Sandoz Pharm. Corp., 295 F.3d 1194, 1197 (11th Cir.2002).
Should the Court find the general methodology underlying the expert's proposed testimony sufficiently reliable, it then must turn to the proffered expert testimony. The Court must, of course, ensure that the witness is qualified to offer an expert opinion. Poulis-Minott v. Smith, 388 F.3d 354, 359 (1st Cir.2004). After determining whether the witness is a qualified proponent of a scientific or technical methodology, the Court must then *358 query "whether those principles and methods have been properly applied to the facts of the case." Fed.R.Evid. 702 advisory committee's note. "In other words, Rule 702, as visualized through the Daubert prism, `requires a valid scientific connection to the pertinent inquiry as a precondition to admissibility.'" Ruiz-Troche v. Pepsi Cola, 161 F.3d 77, 81 (1st Cir.1998) (quoting Daubert, 509 U.S. at 592, 113 S.Ct. 2786). As such, this Court must evaluate the reliability of not only the general field of toolmark identification but also the application by Sgt. Weddleton. See Amorgianos, 303 F.3d at 267 ("In deciding whether a step in an expert's analysis is unreliable, the district court should undertake a rigorous examination of the facts on which the expert relies, the method by which the expert draws an opinion from those facts, and how the expert applies the facts and methods to the case at hand.").
The Court's vigilant exercise of this gatekeeper role is critical because of the latitude given to expert witnesses to express their opinions on matters about which they have no firsthand knowledge, and because an expert's testimony may be given greater weight by the jury due to the expert's background and approach. See Daubert, 509 U.S. at 595, 113 S.Ct. 2786; Kumho Tire, 526 U.S. at 152, 119 S.Ct. 1167 (noting that experts enjoy "testimonial latitude unavailable to other witnesses"); Frazier, 387 F.3d at 1263 ("Simply put, expert testimony may be assigned talismanic significance in the eyes of lay jurors, and, therefore, the district courts must take care to weigh the value of such evidence against its potential to mislead or confuse."); United States v. Hines, 55 F.Supp.2d 62, 64 (D.Mass.1999) (noting that "a certain patina attaches to an expert's testimony unlike any other witness: this is `science,' a professional's judgment, the jury may think, and give more credence to the testimony than it may deserve").
The Court must, however, keep in mind the Supreme Court's admonition that, "[v]igorous cross-examination, presentation of contrary evidence, and careful instruction on the burden of proof are the traditional and appropriate means of attacking shaky but admissible evidence." Daubert, 509 U.S. at 596, 113 S.Ct. 2786; see also 4 Joseph M. McLaughlin, Jack B. Weinstein & Margaret A. Berger, Weinstein's Federal Evidence § 702.02[5], at 702-20 (2d ed. 2005) ("Trial courts should be aware of the curative powers of the adversary system when faced with an objection that is solely on the basis of confusion."). Furthermore, the Court must bear in mind that:

Daubert does not require that a party who proffers expert testimony carry the burden of proving to the judge that the expert's assessment of the situation is correct. As long as an expert's scientific testimony rests upon "good grounds, based on what is known," Daubert, 509 U.S. at 590, 113 S.Ct. 2786 (internal quotation marks omitted), it should be tested by the adversary process -- competing expert testimony and active cross-examination -- rather than excluded from jurors' scrutiny for fear that they will not grasp its complexities or satisfactorily weigh its inadequacies. In short, Daubert neither requires nor empowers trial courts to determine which of several competing scientific theories has the best provenance. It demands only that the proponent of the evidence show that the expert's conclusion has been arrived at in a scientifically sound and methodologically reliable fashion.
Ruiz-Troche, 161 F.3d at 85 (quoting Daubert, 509 U.S. at 590, 113 S.Ct. 2786) (internal citations omitted). It is with these *359 principles in mind that the Court assesses the multi-level objection by the defendants.
B. Reliability of the Underlying Methodology
1. A "Primer" on Firearm Identification
In order to perform a Daubert/Kumho analysis properly, the Court must define the challenged methodology, in this case matching a cartridge case to a particular gun based on a comparison of grooves and markings, collectively referred to as toolmarks, left by the firearm on spent cartridge cases. The Court will first briefly sketch the process a firearms examiner uses when comparing cartridge cases and then examine whether that process passes muster under Daubert/Kumho.
The underlying principle of firearm identification is that each firearm will transfer a unique set of marks, known as "toolmarks," to ammunition fired from that gun. By using a "comparison microscope" to compare ammunition test-fired from a recovered gun with spent ammunition from a crime scene, a trained firearms examiner determines whether the recovered ammunition was fired from that particular gun. The following scientific principles support this approach.
When a firearm is manufactured, the "process of cutting, drilling, grinding, hand-filing, and, very occasionally, hand-polishing . . . will leave individual characteristics" on the components of the firearm. See Brian J. Heard, Handbook of Firearms and Ballistics 127 (1997). Although modern manufacturing methods have reduced the amount of handiwork performed on an individual gun, the final step in production of most firearm parts requires some degree of hand-filing which imparts individual characteristics to the firearm part. See id. at 128. This process results in "randomly produced patterns of individual stria," or thin grooves or markings, being left on firearm parts. Id. These parts are assembled to compose the final firearm.

When a round (a single "shot") of ammunition is fired from a particular firearm, the various components of the ammunition *360 come into contact with the firearm at very high pressures. As a result, the individual markings on the firearm parts are transferred to the ammunition. Id. The ammunition is composed primarily of the bullet and the cartridge case. The bullet is the missile-like component of the ammunition that is actually projected from the firearm, through the barrel, toward the target. (Bullets are not at issue in this challenge.) The cartridge case is the part of the ammunition situated behind the bullet containing the primer and propellant, the explosive mixture of chemicals that causes the bullet to be projected through the barrel. Id. at 42. In the case of a semi-automatic handgun, once a round of ammunition is loaded into the chamber, and the gun is cocked, the shooter pulls the trigger, and the firing pin is released. The firing pin strikes the back of the cartridge case, igniting the primer in the ammunition, thus starting a chemical reaction, leading to
the bullet being pushed down the barrel by the expanding gases. These gases also exert an equal and opposite force on the cartridge case which forces the slide and breechblock to the rear. This ejects the spent cartridge case through a port in the side, or occasionally top, of the slide.
Id. at 19.
During this process, which occurs in a fraction of a second, the cartridge case comes into contact with several parts of the firearm, most notably the firing pin, as explained above, and the breech face, a flat surface behind the cartridge case against which the cartridge case is pushed by the expanding gases. When the cartridge case is "slammed into the standing breech face," some of the individual toolmarks left on the breech face in the manufacturing process are replicated on the surface of the cartridge case. Id. at 131. These toolmarks are referred to as "impressed" toolmarks. Other marks might be left on the ammunition when parts of the firearm, like the firing pin, the extractor, or the ejector, are moved across the cartridge case, and these are referred to as "striated" toolmarks. See Theory of Identification (hereinafter "AFTE Theory"), Association of Firearm and Toolmark Examiners ("AFTE"), 30 AFTE J. 86, 88 (1998) (Ex. 24).
Each of the marks left on a particular cartridge case, whether "impressed" or "striated," may or may not be unique to that particular firearm. Firearms examiners classify these types of marks in three categories: class, subclass, and individual characteristics. See id. at 87-88. Class characteristics are defined as "family resemblances which will be present in all weapons of the same make and model." Heard, supra, at 132. A class characteristic will potentially reproduce similar marks on all ammunition fired from a particular make and model of a firearm. (Daubert Hr'g Tr. 13, Oct. 27, 2005.)
Subclass characteristics appear on a smaller subset of a particular make and model of a firearm. They are "produced incidental to manufacture" and "can arise from a source which changes over time." AFTE Theory, supra, at 88. Subclass characteristics, then, may be present on a group of guns within a certain make or model, such as those manufactured at a particular time and place.
Individual characteristics are defined as "[r]andom imperfections produced during manufacture or caused by accidental damage . . . which are unique to that object and distinguish it from all others." Heard, supra, at 132. A variety of each type of mark is left on a spent round of ammunition. Some of the individual characteristics of toolmarks are comprised of non-unique marks. Adina Schwartz, A Systemic *361 Challenge to the Reliability and Admissibility of Firearms and Toolmark Identification, 6 Colum. Sci. & Tech. L.Rev. 2, 6 (2005) (citing Jack D. Gunther and Charles Gunther, The Identification of Firearms, 90-91 (1935) ("It is probably true that no two firearms with the same class characteristics will produce the same signature, but it is likewise true that each element of a firearm's signature may be found in the signatures of other firearms.")). Also, individual characteristics of toolmarks change somewhat over time due to wear and tear.
Now enters the firearms examiner. The basic underpinnings of firearm identification by comparing patterns produced on cartridge cases have been ably described by a firearm specialist for the FBI:
The theory underlying firearms identification is that no two firearms should produce the same microscopic features on bullets and cartridge cases such that they could be falsely identified as having been fired from the same firearm. . . . Patterns produced on bullets and cartridge cases from contact with [barrels and breech faces] can be microscopically compared to determine if they have originated from a common source.
Erich D. Smith, Cartridge Case and Bullet Comparison Validation Study with Firearms Submitted in Casework, 36 AFTE J. 130 (2004) (Ex. 20).
With this principle in mind, a firearms examiner presented with a handgun and spent cartridge cases will test fire the weapon using the same type of ammunition as that recovered in the case. The examiner will look at the test-fired cartridge cases and the recovered cartridge cases simultaneously using an instrument called a comparison microscope, which is necessary to overlay the images of the two shell casings. First put into use in 1925, the comparison microscope allows the examiner to compare the tiny markings left on the two cartridge cases. In theory, if the test cartridges and recovered cartridges were fired from the same gun, the examiner would see sufficient patterns of matching marks, supposedly leading to "a result as conclusive as fingerprints." Julian S. Hatcher, Frank J. Jury & Jac Weller, Firearms Investigation, Identification, and Evidence 15 (2d ed.1957).
*362 
[Figure omitted.] Heard, supra, at 129.
2. The Challenge to the Methodology
Defendants levy their strongest attack on the process by which an expert comes to the conclusion that a cartridge case was fired from a particular gun. A perfect correspondence between the lines on a test-fired cartridge and the evidence recovered from the scene is impossible; in the real world, there is no such thing as a "perfect match." Alfred A. Biasotti, A Statistical Study of the Individual Characteristics of Fired Bullets, 4 J. Forensic Sci. 34, 44 (1959) (Ex. 27) (noting the "erroneous conception of a `perfect match' which is actually only a theoretical possibility and a practical impossibility"). The government's expert at the Daubert hearing, whom I found qualified and credible, Special Agent Curtis of the ATF, confirmed that all the marks will not match up even when two cartridges have been fired from the same gun. (Daubert Hr'g Tr. 88, Sept. 16, 2005.) Rather, he echoed the findings of a 1959 study by Alfred Biasotti which found that only 21-38 percent of the marks will match up on bullets fired from the same gun. (Id. at 90.) These differences may be due to possible changes in toolmarks over time. Schwartz, supra, at 8. Moreover, when bullets fired by two different .38 special Smith & Wesson revolvers of the same make and model were compared, 15-20 percent of the lines matched up. (Daubert Hr'g Tr. 90-91, Sept. 16, 2005.) Accord Biasotti, supra, at 37-40; Ronald G. Nichols, Firearm and Toolmark Identification Criteria: A Review of the Literature, 42 J. Forensic Sci. 466, 467 (1997) (Ex. 17). Therefore, there can be a pattern of matching marks on cartridge cases fired from different guns.
The conclusion that a recovered cartridge case matches a test-fired cartridge case is based on a subjective "threshold currently held in the minds eye of the examiner and . . . based largely on training and experience in observing the difference between known matching and *363 known non-matching impression toolmarks." Richard Grzybowski et al., Firearm/Toolmark Identification: Passing the Reliability Test Under Federal and State Evidentiary Standards, 35 AFTE J. 209, 213 (2003) (Ex. 18). A recent article has highlighted the complexity of comparing patterns because of the difficulty in distinguishing between class, subclass, and individual characteristics, noting that a firearm "may be wrongly identified as the source of a toolmark it did not produce if an examiner confuses subclass characteristics shared by more than one tool with individual characteristics unique to one and only one tool." Schwartz, supra, at 8. Both experts seem to agree that most examiners do not accept quantitative standards for determining whether two cartridge cases were fired from the same gun. Moreover, Special Agent Curtis testified that there was "no application" of "probability studies and statistics" to the field of firearm identification. (Daubert Hr'g Tr. 48-49, Sept. 16, 2005.)
To respond to this challenge, Special Agent Curtis advanced the methodology embodied in the AFTE Theory. AFTE is the leading professional organization in the field and publishes standards for ensuring reliability and proficiency. (Daubert Hr'g Tr. 45, Sept. 16, 2005.) The AFTE Theory "as it pertains to the comparison of toolmarks enables opinions of common origin to be made when the unique surface contours of two toolmarks are in `sufficient agreement'." AFTE Theory, supra, at 86. The document defines "sufficient agreement" as follows:
This "sufficient agreement" is related to the significant duplication of random toolmarks as evidenced by the correspondence of a pattern or combination of patterns of surface contours. Significance is determined by the comparative examination of two or more sets of surface contour patterns comprised of individual peaks, ridges and furrows. Specifically, the relative height or depth, width, curvature and spatial relationship of the individual peaks, ridges and furrows within one set of surface contours are defined and compared to the corresponding features in the second set of surface contours. Agreement is significant when it exceeds the best agreement demonstrated between toolmarks known to have been produced by different tools and is consistent with agreement demonstrated by toolmarks known to have been produced by the same tool. The statement that "sufficient agreement" exists between two toolmarks means that the likelihood that another tool could have made the mark is so remote as to be considered a practical impossibility.
Id. The theory follows this qualitative guideline, candidly acknowledging: "Currently the interpretation of individualization/identification is subjective in nature, founded on scientific principles and based on the examiner's training and experience." Id.
The examiner then, after evaluation of the samples under a microscope, declares whether or not the cartridge case was fired from the firearm in question. Under the AFTE Theory, the examiner may declare an "identification" after making the following finding:
Agreement of a combination of individual characteristics and all discernable class characteristics where the extent of agreement exceeds that which can occur in the comparison of toolmarks made by different tools and is consistent with the agreement demonstrated by toolmarks known to have been produced by the same tool.
AFTE Theory, supra, at 86. The examiner may also conclude that the comparison *364 is "inconclusive," that the firearm did not fire the ammunition in question, or that the sample is unsuitable for microscopic examination. Id. at 87.
This conclusion is not based on any quantitative standard for how many striations or marks need to match or line up. Instead, it is based on a holistic assessment of what the examiner sees. See Grzybowski et al., supra, at 214 ("The AFTE Theory of Identification is based on an assessment of both quality and quantity of agreement observed between toolmarks being compared. This is how toolmark identifications have always been made."). Special Agent Curtis characterized this process: "It's just from examining all the marks and then interpreting." (Daubert Hr'g Tr. 88, Sept. 16, 2005.)
Then, as Special Agent Curtis testified, additional AFTE standards require the examiner to document the findings underlying his identifications through notes, sketches, or photographs. (Daubert Hr'g Tr. 43-44, Sept. 16, 2005; see infra part III.D.) Subsequently, the examiner should have a second examiner review his work and his findings. (Daubert Hr'g Tr. 46, Sept. 16, 2005; see infra part III.D.E.)
3. Prior Judicial Analyses of Admissibility of the Methodology
For decades, both before and after the Supreme Court's seminal decisions in Daubert and Kumho Tire, admission of the type of firearm identification testimony challenged by the defendants has been semi-automatic; indeed, no federal court has yet deemed it inadmissible. See, e.g., United States v. Hicks, 389 F.3d 514, 526 (5th Cir.2004) (dealing with a similar challenge, noting that "[w]e have not been pointed to a single case in this or any other circuit suggesting that the methodology . . . is unreliable"); United States v. Santiago, 199 F.Supp.2d 101, 111 (S.D.N.Y.2002) ("The Court has not found a single case in this Circuit that would suggest that the entire field of ballistics identification is unreliable.").
Courts have understandably been gun-shy about questioning the reliability of firearm identification evidence. See Santiago, 199 F.Supp.2d at 111-12 ("The Court . . . can only imagine the number of convictions that have been based, in part, on expert testimony regarding the match of a particular bullet to a gun seized from a defendant or his apartment."). Accord United States v. Foster, 300 F.Supp.2d 375, 377 n. 1 (D.Md.2004) (noting that "[b]allistics evidence has been accepted in criminal cases for many years"); United States v. O'Driscoll, 2003 WL 1402040 at *1, 2003 U.S. Dist. LEXIS 3370 at *4 (M.D.Pa. Feb. 10, 2003). Storm clouds, however, are gathering.[1] See Sexton v. State, 93 S.W.3d 96 (Tex.Crim.App.2002) (rejecting matching of cartridge cases based on magazine marks alone without recovery of underlying magazine); Ramirez v. State, 810 So.2d 836 (Fla.2001) (rejecting toolmark analysis matching knife to fatal stab wounds). One commentator from within the firearm identification profession has cited Daubert objections as "perhaps the biggest challenge facing the firearms discipline since it was firmly established in the 1920's." Sgt. Gerard Dutton, Ethics in Forensic Firearms Investigation, 37 AFTE J. 79, 82 (2005) (Ex. 6).
*365 4. Firearm Identification Through the Prism of Rule 702
Defendants allege that this pattern-based methodology, as embodied in the AFTE standard, even when followed correctly by a qualified examiner, is not sufficiently reliable to be admissible under Rule 702 and Daubert/Kumho Tire.
One issue to note at the outset is whether the expert testimony proffered in this case is "scientific," like the disputed epidemiological studies in Daubert, or "technical," like the testimony of the tire failure expert in Kumho Tire. In either case, the Court must still ensure the testimony is sufficiently reliable. See Kumho Tire, 526 U.S. at 152, 119 S.Ct. 1167; Berry v. City of Detroit, 25 F.3d 1342, 1350 (6th Cir. 1994) (noting that expert testimony based on experience, such as that of a beekeeper testifying that bees always take off into the wind, may be admissible under Rule 702, but a court still must ensure its reliability).
Firearm identification evidence straddles the line between testimony based on science and experience. As the AFTE Theory describes it, the methodology is "subjective in nature, founded on scientific principles and based on the examiner's training and experience." Supra, at 86. Science is in the background, at the core of the theory, but its application is based on experience and training. The Court must, therefore, determine the reliability of both the underlying science and its application.
Initially, defendants mounted an interesting attack, arguing that modern manufacturing methods have reduced the number of individual toolmarks that might be transferred from the firearm to the ammunition. However, this attack fizzled at the hearing. Defendants did not offer any evidence that these modern manufacturing methods had this effect on the guns at issue in this case. Moreover, recent scientific studies have demonstrated that the underlying principle that firearms leave unique marks on ammunition has continuing viability. See, e.g., Smith, supra, at 130-31 (noting that studies have confirmed that consecutively manufactured gun barrels leave unique marks on bullets fired through them); Amy C. Coody, Consecutively Manufactured Ruger P-89 Slides, 35 AFTE J. 157 (2003) (Ex. 19) (finding that with respect to consecutively manufactured Ruger pistols "variations, combined with other imperfections and irregularities that occurred during the manufacturing process, result in unique, individual breechface marks that can be positively identified.").
Additionally, with respect to the case at hand, the government presented an affidavit from FBI Special Agent Philip Ball, who had visited the Hungarian plant where the FEG FP 9 handgun in this case had been constructed. (Aff. of Philip Ball at 4-5) (Ex. 32). The agent stated that the "plant was archaic, and the manufacturing process used was not modern but rather more than 50 years old." (Id.) Moreover, Special Agent Ball stated that the breech face of this brand of firearm was milled and finished by hand. (Id. at 5.)
In his testimony, defense expert LaMagna, distancing himself from the original position that recently manufactured firearms did not leave unique marks on cartridge cases, testified that those marks could be better identified and analyzed using more modern equipment such as a scanning electron microscope ("SEM"). (Daubert Hr'g Tr. 109, Nov. 17, 2005.) Another impressive witness, Mary-Jacque Mann, formerly of the Bureau of Fish and Wildlife, testified that she had used such a microscope in firearms identification with excellent results, but conceded this was a non-traditional approach. *366 (Daubert Hr'g Tr. 47, July 29, 2005.) See Mary-Jacque Mann & Edgard O. Espinoza, Firearms Examinations By Scanning Electron Microscopy: Observations and an Update on Current and Future Approaches, 24 AFTE J. 294 (1992) (Ex. 2) (pointing out that SEM provides better depth of field, magnification, and imaging than conventional optical microscopy). It may well be that other methods not generally used in the field will prove to be the best method of analysis. However, Daubert and Kumho Tire do not make the perfect the enemy of the reliable; an expert need not use the best method of evaluation, only a reliable one.
There has also been no credible challenge to the underlying physical theory of how marks are transferred from the firearm to the cartridge case. The government has met its burden with regard to demonstrating that the underlying scientific principle that firearms leave unique marks on ammunition is reliable under Rule 702.
The question of whether the methodology of identifying a match between a particular cartridge case and gun is reliable requires far more analysis. This process is admittedly "subjective" and based on experience and training of the individual examiner. The advisory committee's note to Rule 702 counsels:
If the witness is relying solely or primarily on experience, then the witness must explain how that experience leads to the conclusion reached, why that experience is a sufficient basis for the opinion, and how that experience is reliably applied to the facts. The trial court's gatekeeping function requires more than simply "taking the expert's word for it."
Fed.R.Evid. 702 advisory committee's note (citing Daubert v. Merrell Dow, 43 F.3d 1311, 1319 (9th Cir.1995)). Daubert demands that the expert's "`knowledge' connote[] more than subjective belief or unsupported speculation." 509 U.S. at 590, 113 S.Ct. 2786; see also Ambrosini v. Labarraque, 101 F.3d 129, 134 (D.C.Cir.1996).
The Court will now assess this methodology for matching patterns of toolmarks on cartridge casings using Daubert and Kumho Tire as a guide, taking into account defendants' numerous objections.
a. Peer Review and Publication
Daubert counsels that "publication in a peer reviewed journal [is] a relevant, though not dispositive, consideration in assessing the validity of a particular technique or methodology on which an opinion is premised." Daubert, 509 U.S. at 594, 113 S.Ct. 2786. The First Circuit has elaborated, noting that publication and peer review "serve as independent indicia of the reliability of the . . . technique" and "demonstrate a measure of acceptance of the methodology within the scientific community." Ruiz-Troche, 161 F.3d at 84.
The AFTE publishes a peer reviewed journal, aptly named the AFTE Journal, which contains numerous articles validating the current technique of firearm identification. See, e.g., Smith, supra, at 132 (concluding that "the absence of false positives or false negatives indicates that the theory of firearms identification, using pattern recognition, is an accurate and precise method for determining a common source for bullets and cartridge cases for firearms collected from casework"); Nichols, supra (for a bibliography of various studies). The Journal maintains a formal pre-publication peer review process, including "assignment of manuscripts to other experts within the scientific community for a technical review, returning of manuscripts to authors for clarification and re-write, and a final review by the Editorial Committee." *367 Grzybowski et al., supra, at 220; see also Dominic J. Denio, The History of the AFTE Journal, the Peer Review Process, and Daubert Issues, 34 AFTE J. 210 (2002) (Ex. 23).
Other peer reviewed articles have not universally been laudatory of the current technique of identification. The above-referenced Biasotti study in the Journal of Forensic Science opens by noting an "almost complete lack of factual and statistical data pertaining to the problem of establishing identity in the field of firearm identification." Biasotti, supra, at 34. Professor Schwartz, who, it bears mentioning, is not a firearms examiner, also attacks the methodology in a peer reviewed publication. Schwartz, supra. Although there appears to be a disagreement in the peer reviewed literature as to the reliability of the AFTE method of identification, consensus is not necessary.
b. Known Error Rate
Daubert directs that "in the case of a particular scientific technique, the court ordinarily should consider the known or potential rate of error." 509 U.S. at 594, 113 S.Ct. 2786. In the case of firearm toolmark identification, because the process is so subjective and qualitative, it "is not possible to calculate an absolute error rate for routine casework." Grzybowski et al., supra, at 219.
One article cited by the government actually posits that the error rate is 1.4 percent based on a 1978-1991 study of results of proficiency tests given by a private firm, Collaborative Testing Services (CTS). Grzybowski et al., supra, at 216. The authors note, however, that this figure may not stand up to scrutiny due to variations in the difficulty of the tests and the conditions under which they were taken. Id. at 218. The government also relies on a study by Bunch and Murphy in which FBI examiners, all of whom knew they were being tested, conducted 360 cartridge case examinations, and no false positives and no false negatives were reported. Stephen G. Bunch & Douglas P. Murphy, A Comprehensive Validity Study for the Forensic Examination of Cartridge Cases, 35 AFTE J. 201, 203 (2003) (Ex. 22). The lack of false positives is an especially important figure because it indicates a somewhat reduced risk of wrongfully accusing a defendant. See United States v. Mitchell, 365 F.3d 215, 259 (3d Cir.2004) (noting that the rate of false positives is the critical figure in Daubert analysis).
The testimony at the hearing raised a note of caution about the proficiency testing on which this figure is based. First, proficiency testing is not required of all firearms examiners, only those working in labs voluntarily seeking to be certified by the American Society of Crime Laboratory Directors (ASCLD), meaning that the sample is self-selecting and may not be representative of the complete universe of firearms examiners. (Daubert Hr'g Tr. 71, Oct. 27, 2005). Second, examiners know when they are being tested. (Id.) There is also some variation depending on whether the incorrect conclusion that a result is "inconclusive" is counted. (Id. at 71.)
Mary Kate McGilvray, quality assurance manager at the Massachusetts State Crime Lab, testified that in the 2005 CTS cartridge case examination, none of the 255 test-takers nationwide answered incorrectly. (Daubert Hr'g Tr. 10, Oct. 28, 2005.) One could read these results to mean that the technique is foolproof, but the results might instead indicate that the test was somewhat elementary. See United States v. Lewis, 220 F.Supp.2d 548, 554 (S.D.W.Va.2002) (finding proficiency testing for handwriting analysis "not meaningful" when "all of [the expert's] peers always *368 passed"); United States v. Plaza, 188 F.Supp.2d 549, 564-65 (E.D.Pa.2002) (noting that the FBI's fingerprint examiner proficiency tests, on which examiners "scored spectacularly well," were "less demanding than they should be" and therefore "can be of little assistance in providing the test makers with a discriminating measure of the relative competence of the test takers."). Nonetheless, there is no evidence that the tests are inaccurate or otherwise deficient.
Based on the record before me, the government has established that the known error rate is not unacceptably high.
c. Testability
Because of the subjectivity of the analysis, Special Agent Curtis underscored the standards in the ballistics field ensuring the intellectual rigor of the examiner in performing the identifications and the testability of his results. Firearms examiners are required to document their results and have their work reviewed by another examiner. These requirements ensure the reliability and the reproducibility of the examiner's results. See United States v. Crisp, 324 F.3d 261, 269 (4th Cir.2003) (upholding admission of fingerprint evidence despite lack of a universal standard for identification due to requirements of "professional training, peer review, presentation of conflicting evidence, and double checking"); Plaza, 188 F.Supp.2d at 571 (noting that such standards in fingerprint analysis create a "substantially more restricted compass" on subjective opinion).
Special Agent Curtis indicated AFTE standards require an examiner to document his or her findings by notes, sketches, or photographs. (Daubert Hr'g Tr. 43-44, Sept. 16, 2005.) In lieu of a picture the examiner should take notes as to which markings on the cartridge case caused the examiner to declare a match. (Daubert Hr'g Tr. 46-47, Sept. 16, 2005.) As one firearms examiner indicated in the AFTE Journal in 2003:
In other words, for our work to be valid, it must be verifiable to other examiners. This means that other examiners must be able to repeat the work and come to the same conclusions. Therefore, the data that we gather should provide a well-defined "roadmap" as to what experiments we performed to answer the question(s) posed, what data was gathered, and a clear demonstration of the evidence from which we supported our conclusion(s). This mechanism of communication among scientists is a substantial part of the process of verification.
Bruce Moran, Photo Documentation of Toolmark Identifications - An Argument in Support, 35 AFTE J. 174, 181 (2003) (Ex. 14). Although that article does not speak directly to AFTE guidelines, that language corresponds to the purpose of documentation as described by Special Agent Curtis.
Furthermore, ASCLD states in its Laboratory Accreditation Board Manual, excerpts of which were introduced as Exhibit 49 in the evidentiary hearing, that "documentation to support conclusions must be such that in the absence of the examiner, another competent examiner or supervisor could evaluate what was done and interpret the data." American Society of Crime Laboratory Directors, Laboratory Accreditation Board Manual, 29 (1997) (Ex. 49). In addition, the leading treatise in the field states that the "firearms expert must not only do his work meticulously, accurately, and efficiently; he must also report his findings in the same manner." Hatcher et al., supra, at 445.
Special Agent Curtis also indicated that it was the standard in the field to have a second examiner independently review the *369 findings of the first examiner. (Daubert Hr'g Tr. 46, Sept. 16, 2005 ("Once an examiner does his examinations, if they have a second person do technical review of it, that helps cut down on . . . any errors being performed . . . during that examination.").) On cross-examination, Special Agent Curtis reiterated that the standard in the field would be to have a second examiner verify a match. (Daubert Hr'g Tr. 33, Oct. 27, 2005.) Moreover, the definitive treatise in the field indicates that a second examiner must review the first examiner's work and conclusions. See Hatcher et al., supra, at 383 ("A positive match should be confirmed by a second examination. The usual laboratory personnel should check the comparison."); see also Grzybowski et al., supra, at 219 (noting that ASCLD requires peer and administrative review of an examiner's work). The Massachusetts State Police expert who testified at the Daubert hearing, Mary Kate McGilvray, agreed. (Daubert Hr'g Tr. 25, Oct. 28, 2005.)
Documentation and peer review ensure that defense counsel will be able to challenge the results through their own testing and effective cross-examination. Defense experts may testify about both the limitations of the methodology and the evidence in a particular case in a way accessible to the jury. See Currier v. United Techs. Corp., 393 F.3d 246, 252 (1st Cir.2004) (noting that "weakness of [an expert's] analysis [was] a matter of weight rather than admissibility and thus properly a subject of argument and jury judgment"); Mooney, 315 F.3d at 63 ("flaws in [an expert's] opinion may be exposed through cross-examination or competing expert testimony").
Accordingly, although the process of rendering an opinion is primarily subjective and based on the expertise of the examiner, the requirements of peer review and documentation provide sufficient testability and reproducibility to ensure that the results of the technique are reliable.
d. Standards For the Technique's Operation
Daubert directs courts to consider "the existence and maintenance of standards controlling the technique's operation." 509 U.S. at 594, 113 S.Ct. 2786. As noted above, the requirements of documentation and peer review of examiners' results are industry standards which help to ensure reliability and testability of the expert opinion. Maintenance of these standards is a strong factor in favor of admissibility. The government argues, however, that even these standards need not be religiously followed because they only reflect emerging trends and include protocols not used by many laboratories. The government erroneously believes it should be held to no standard at all, an argument which even the ballistics field, per the AFTE, rejects. Even if many examiners do not currently follow these guidelines, they have been put forward as standards by the leading professional organization in the field. I find that the AFTE standards of documentation and peer review were adopted by the ballistics industry to ensure the reliability of test results and examiners at a minimum must comply with them.
The field, however, continues to search for a universal standard for when an examiner may declare a "match." Special Agent Curtis put forward the AFTE Theory, but this theory leaves much to be desired. It bears re-emphasizing that the AFTE Theory is not a numeric or statistical standard, but is based on the individual examiner's expertise. As one author in the field has stated, "Due to the subjective nature of the processes involved in the *370 elimination of insignificant detail, the criteria used by the expert to ascertain the degree of concordance of stria cannot be mathematically quantified." Heard, supra, at 140.
Instead, the AFTE Theory, upon which the government relies, is tautological: it requires each examiner to decide when there is "sufficient agreement" of toolmarks to constitute an "identification." AFTE Theory, supra, at 86. This threshold is surpassed when the examiner finds that the agreement of toolmarks "exceeds the best agreement demonstrated between toolmarks known to have been produced by different tools and is consistent with agreement demonstrated by toolmarks known to have been produced by the same tool." Id. Toolmark analysis does not follow an objective standard requiring, say, a certain percentage of marks to match. Rather, as noted, this "threshold is currently held in the minds eye of the examiner and is based largely on training and experience." Grzybowski et al., supra, at 213.
Since Daubert, many examiners understandably have been concerned about the lack of an objective standard. The Grzybowski article, on which the government relies, also states that an "increasing number of toolmark examiners have applied `conservative criteria'" which involves counting consecutively matching striae (CMS) to a proposed numeric threshold between identity and nonidentification as proposed by Biasotti and Murdoch, The Scientific Basis of Firearms and Toolmark Identification, 516-517 (1997) (in 3 David L. Faigman et al., Modern Scientific Evidence 490 (2002)). The Grzybowski et al. article also states that CMS is generally accepted as a concept in toolmark identity and is an extension of the currently prevailing "pattern match" technique. Supra, at 213. Professor Schwartz describes the CMS protocol:
Under CMS, the threshold for identifying a particular tool as the source of a three-dimensional toolmark is a match between evidence and test toolmarks of one group of six consecutive matching stria or two different groups of at least three consecutive matching striae in the same relative position. The threshold for two-dimensional toolmarks is one group of eight consecutive matching striae or two groups of at least five consecutive matching striae in the same relative position.
Schwartz, supra, at 15. In some experts' views, the Biasotti CMS method provides a more objective standard for determining when a sufficient number of striae are consistent to declare a match. Id. See also Dutton, supra, at 83 (opining that CMS is a "positive step forward in developing a more objective approach"); Schwartz Aff. ¶ 16.
Although CMS is a widely accepted protocol which has been scientifically validated, it is not the predominant standard in the field according to AFTE. See Grzybowski et al., supra, at 215. Special Agent Curtis confirmed this (Daubert Hr'g Tr. 30, Oct. 27, 2005), and Sgt. Weddleton stated that he does not follow that method either, opting instead for a more holistic approach (Daubert Hr'g Tr. 30, Oct. 28, 2005). So, we are left with the AFTE Theory.
At least in Massachusetts, however, the AFTE Theory does not appear to be a broadly recognized document. Apparently, while the AFTE Theory appears to be widely accepted by trained firearms examiners, it is not universally followed. Sgt. Weddleton testified that he had never before even seen or heard of it. (Daubert Hr'g Tr. 41, Oct. 28, 2005.) Not only that, Mary Kate McGilvray, of the Massachusetts *371 State Police Crime Lab, also testified that she had never before read the AFTE Theory and that it was not the policy in her lab. (Daubert Hr'g Tr. 18, Oct. 28, 2005.) That said, the AFTE Theory appears to be more a description of the process of firearm identification than a strictly followed charter for the field.
As pointed out above, one critical problem with the AFTE Theory is the lack of objective standards for deciding whether a particular mark is a subclass or individual characteristic. Again, a subclass characteristic is one which is present on a subset of a make and model of a gun, such as a batch manufactured at a particular time and place. The AFTE Theory states only that "[c]aution should be exercised in distinguishing subclass characteristics from class characteristics," but it offers no additional guidance. Supra, at 88. Special Agent Curtis added that the AFTE Theory offers no guidance on telling the difference between subclass and individual characteristics. (See Daubert Hr'g Tr. 48, Oct. 27, 2005.)[2] Thus, there is no generally accepted standard for distinguishing between class, subclass, and individual characteristics.
The question, then, is whether a method that relies on the individual examiner's training and experience to distinguish between characteristics on a cartridge casing is fatal to the reliability of the technique on the whole. Based on the peer reviewed articles and the testimony of the witnesses, particularly Special Agent Curtis, I conclude that the trained eye will be able to distinguish among the class, subclass, and individual characteristics produced by the firearms. As Judge Pollak noted in Plaza, "there are many situations in which an expert's manifestly subjective opinion (an opinion based . . . on `one's personal knowledge, ability, and experience') is regarded as admissible evidence in an American courtroom." 188 F.Supp.2d at 570 (deeming admissible under Daubert the "ACE-V" method of fingerprint analysis, which relies on a holistic assessment rather than a required number of "points" of identification). He continues: "In each instance the expert is operating within a vocational framework that may have numerous objective components, but the expert's ultimate opining is likely to depend in some measure on experiential factors that transcend precise measurement and quantification." Id. at 571; accord Mitchell, 365 F.3d at 246 (affirming admission of fingerprint evidence even though some testimony indicated that the "examination process is irreducibly subjective"); Bitler v. A.O. Smith Corp., 391 F.3d 1114, 1122 (10th Cir.2004) (admitting fire investigator's testimony despite finding it "not susceptible to testing or peer review," and stating that the expert's "personal experience, training, method of observation, and deductive reasoning sufficiently reliable to constitute `scientifically valid' methodology").
As such, I find that the maintenance of standards with respect to documentation and peer review weighs in favor of admissibility. The lack of a universal standard for declaring a match is troubling but not fatal under Daubert/Kumho because a court may admit well-founded testimony based on specialized training and experience.
e. General Acceptance Within the Relevant Community
In Daubert and Kumho Tire, the Supreme Court affirmed the continued relevance *372 of whether the technique at issue is generally accepted within the scientific community. 509 U.S. at 594, 113 S.Ct. 2786. "Widespread acceptance can be an important factor in ruling particular evidence admissible, and `a known technique which has been able to attract only minimal support within the community' may properly be viewed with skepticism." Id. It is clear that the community of firearm and toolmark examiners accepts the current identification methodology as reliable. Grzybowski et al., supra, at 220-21. Certainly, some authors have argued that the technique might be better performed through the use of improved technology or the application of statistical methods. See, e.g., Mann & Espinoza, supra, at 294 (noting possible improvements to the field through use of more advanced instruments); Biasotti, supra, at 34 (noting desire for a more quantitative approach). Cf. United States v. Lowe, 954 F.Supp. 401, 406-08 (D.Mass.1996) (discussing rigorous scientific and statistical basis for DNA identification). Although these authors have suggested possible improvements, the community of toolmark examiners seems virtually united in its acceptance of the current technique. See Grzybowski et al., supra, at 220-21.
5. The Bottom Line
Based on the factors outlined in Daubert and Kumho Tire, the Court concludes that the methodology of firearms identification is sufficiently reliable. Therefore, a qualified examiner who has documented and had a second qualified examiner verify her results may testify based on those results that a cartridge case matches a particular firearm to a reasonable degree of ballistic certainty. See Ruiz-Troche, 161 F.3d at 82 (allowing accident reconstruction expert to testify to reasonable degree of scientific certainty); Baker v. Dalkon Shield Claimants Trust, 156 F.3d 248, 253 (1st Cir. 1998) (allowing doctor to testify about the cause of a disease to a "reasonable degree of medical certainty," acknowledging that "little in diagnosis is certain"); McGuire v. Davidson Mfg. Corp., 238 F.Supp.2d 1096, 1101 (N.D.Iowa 2003) (allowing experts to testify to a "reasonable degree of wood science certainty" in product liability case involving wooden ladders).
One important caveat: during the testimony at the hearing, the examiners testified to the effect that they could be 100 percent sure of a match. Because an examiner's bottom line opinion as to an identification is largely a subjective one, there is no reliable statistical or scientific methodology which will currently permit the expert to testify that it is a "match" to an absolute certainty, or to an arbitrary degree of statistical certainty. Allowing the firearms examiner to testify to a reasonable degree of ballistic certainty permits the expert to offer her findings, but does not allow her to say more than is currently justified by the prevailing methodology. See Burke v. Town of Walpole, 405 F.3d 66, 91 (1st Cir.2005) (defining, in context of bite mark identification, reasonable degree of scientific certainty as "a showing that the injury was more likely than not caused by a particular stimulus, based on the general consensus of recognized [scientific] thought"). The lack of absolute certainty on the part of the expert does not render her opinion unreliable under Daubert. See Baker, 156 F.3d at 253 (allowing medical causation testimony even if not certain); United States v. McGlory, 968 F.2d 309, 346 (3d Cir.1992) (allowing handwriting expert to testify even if not "absolutely certain"). The opinion of a qualified firearms examiner who has followed industry guidelines goes far beyond the type of "unsupported speculation" barred by Daubert, 509 U.S. at 590, 113 S.Ct. 2786.
*373 Another court in this district articulated similar concerns in United States v. Green, No. 02-10301-NG, slip op. at 6 (D.Mass. Dec. 20, 2005) (Gertner, J.). If the firearms and toolmark profession meets the challenge of Daubert and Kumho Tire by developing a statistical methodology using better technology or following a more objective methodology like CMS, testimony quantifying the likelihood of a match may become reliable and admissible. See generally Schwartz, supra, at 4 (explaining the need for adequate statistical empirical foundation to determine "the likelihood that the toolmarks made by a randomly selected tool of the same type would do as good a job as the toolmarks made by the suspect tool at matching the characteristics of the evidence toolmark[.]"). As of the writing of this opinion, however, such a standard is not prevailing in the field, and an expert may not assert any degree of statistical certainty, 100 percent or otherwise, as to a match.
C. Qualifications
Fed.R.Evid. 702 requires the judge to ensure that the proposed expert witness is qualified by "knowledge, skill, experience, training, or education." See also Poulis-Minott, 388 F.3d at 359 ("It is the responsibility of the trial judge to act as gatekeeper and ensure that the expert is qualified before admitting expert testimony."). "It is well-settled that `trial judges have broad discretionary powers in determining the qualification, and thus, admissibility, of expert witnesses.'" Diefenbach v. Sheridan Transp., 229 F.3d 27, 30 (1st Cir.2000) (quoting Richmond Steel, Inc. v. Puerto Rican Am. Ins. Co., 954 F.2d 19, 20 (1st Cir.1992)).
Although Sgt. Weddleton has not yet attained a college degree (he is still taking courses), education is not the sine qua non of qualification as an expert witness. See Fed.R.Evid. 702 advisory committee's note (noting that the "text of Rule 702 expressly contemplates that an expert may be qualified on the basis of experience"); McLaughlin, et al., supra, at § 702.04[1][a]. See also Poulis-Minott, 388 F.3d at 360 (affirming trial court's allowing testimony from a fishing boat captain as to the ability of a captain to respond to certain emergencies under the circumstances). Sgt. Weddleton does have significant training and experience as a firearm examiner. A former highway patrolman, Sgt. Weddleton was transferred to the firearms identification unit in 1993. He then underwent on-the-job training by an experienced examiner in firearms identification and attended various armorer schools. (Daubert Hr'g Tr. 33, Oct. 28, 2005.) He has conducted hundreds of examinations of firearms using a comparison microscope. (Id. at 140.)
To be sure, Sgt. Weddleton's scientific and academic credentials are underwhelming. He apparently has no formal scientific training, is neither certified by nor a member of any professional organization, reads no literature in the field, and had not undertaken any proficiency testing at the time he performed the tests at issue in this case. (Daubert Hr'g Tr. 33-35, Oct. 28, 2005.) However, he has performed hundreds of such examinations, and is, by the standard in the field, qualified. Furthermore, the government introduced evidence that Sgt. Weddleton took a nationally administered proficiency test in July 2005 and passed. (Daubert Hr'g Tr. 95, Oct. 27, 2005.) Although the American Society of Crime Laboratory Directors (ASCLD) lists a bachelor's degree with science courses as a "desirable" qualification for firearm examiners, it does not list it as "essential." American Society of Crime Laboratory Directors, Laboratory *374 Accreditation Board Manual, 29 (1997) (Ex. 49).
D. Documentation
That Sgt. Weddleton is qualified, however, does not relieve the government of its burden of proving that his methodology in this particular case was reliable, even if the general methodology of toolmark identification passes muster under Daubert.
Special Agent Curtis described the prevailing and established standard of reliability in the field of toolmark identification, with which Sgt. Weddleton's examination must comport. See In re Paoli R.R. Yard PCB Litig., 35 F.3d 717, 745 (3d Cir.1994) (explaining that "any step that renders the analysis unreliable . . . renders the expert's testimony inadmissible. This is true whether the step completely changes a reliable methodology or merely misapplies that methodology."); Fed.R.Evid. 702 advisory committee's note. Sgt. Weddleton's examination falls short of the mark in two major areas: documentation and peer review.
With respect to documentation, Special Agent Curtis indicated that the guidelines of the Association of Firearm and Tool Mark Examiners require examiners to document identifications by notes, sketches, or photographs. (Daubert Hr'g Tr. 43-4, Sept. 16, 2005.)[3]
Sgt. Weddleton did not make any sketches or take any photographs, so the question is whether his notes provide adequate documentation of the identification. (Daubert Hr'g Tr. 65, 72, Oct. 28, 2005.) The three reports of identifications in this case, entered as Exhibits 38, 40, and 41, contain no description of what led Sgt. Weddleton to his conclusions. Indeed, all the reports indicate is that there was a "positive ID." Even Sgt. Weddleton seems to acknowledge that current standards would require more description of his examination than he provided in this case. (Daubert Hr'g Tr. 78, Oct. 28, 2005.)
Until the basis for the identification is described in such a way that the procedure performed by Sgt. Weddleton is reproducible and verifiable, his testimony is inadmissible under Rule 702.
E. Review By a Second Qualified Examiner
There is no evidence that Sgt. Weddleton had an independent second examiner from his lab review his work or conclusions in accordance with the generally accepted standard in the field. (Daubert Hr'g Tr. 73, Oct. 28, 2005.) This is particularly important since Weddleton used replacement parts when test-firing the FEG FP 9 firearm. Until Sgt. Weddleton's work has been peer reviewed and his conclusions verified, his testimony is inadmissible under Rule 702.
The government has indicated its intention to have Sgt. Weddleton's work reviewed by additional expert witnesses prior to trial and to have those experts testify at trial. (Docket No. 1094.) Review and verification of Sgt. Weddleton's results by a second qualified examiner, and proper documentation of the results of both that review and Sgt. Weddleton's original review, will render Sgt. Weddleton's testimony admissible under Rule 702.
F. Replacement Parts
Defendants have specifically attacked Sgt. Weddleton's identification of cartridge *375 cases found near the scene of the fatal shooting of Dinho Fernandes as coming from a 9 mm. FEG FP 9 Browning Hi-Power handgun. Defendants allege that Sgt. Weddleton's identification is unreliable because he test fired the firearm in question only after substantially reconstructing it with replacement parts. Among other parts, Sgt. Weddleton replaced the firing pin, recoil spring, barrel, and trigger lever (but, significantly, not the breech face) before test firing the firearm. After test firing the gun, Sgt. Weddleton compared the spent cartridge cases with those found near the scene of the Fernandes shooting and declared them a match. Defendants argue that the use of replacement parts, particularly the recoil spring, would so affect the marks transferred from the breech face to the cartridge case as to make any identification unreliable. However, defense expert LaMagna used different ammunition than Sgt. Weddleton, undermining LaMagna's conclusion that the use of a different recoil spring would substantially affect the breech face markings. Moreover, LaMagna used a new spring rather than a used one. Special Agent Curtis disagreed with Mr. LaMagna's analysis, contending that the replacement parts would have no impact on the breech face marks transferred to the cartridge casings in this case. (Daubert Hr'g Tr. 61, Sept. 16, 2005.) Curtis explained that the replaced recoil spring, a cornerstone of the defendants' argument, does not come into play in the action of the firearm until after the breech face is marked.
Special Agent Curtis has substantially more specific experience in the field than Mr. LaMagna, who, while trained in engineering, had no training in ballistics other than a correspondence course. Curtis demonstrated that the replacement parts should not make a significant difference in breech face marks. Extensive cross-examination of the government's experts and, if desired, testimony from the defense expert, will highlight the alleged shortcomings of this procedure for the jury.

ORDER
Defendants' motion in limine to exclude ballistics evidence is ALLOWED without prejudice to supplementation by the government (Docket No. 940). The government must ensure that its proffered firearms identification testimony comports with the established standards in the field for peer review and documentation. If the expert opinion meets these standards, the expert may testify that the cartridge cases were fired from a particular firearm to a reasonable degree of ballistic certainty. However, the expert may not testify that there is a match to an exact statistical certainty.
NOTES
[1]  In a recent opinion, Judge Gertner of this District expressed "serious reservations" regarding the reliability of firearm toolmark identification evidence. See United States v. Green, No. 02-10301-NG, slip op. at 5 (D.Mass. Dec. 20, 2005). Judge Gertner admitted the evidence with some limitations, noting that "precedent plainly points in favor of admissibility." Id. at 39.
[2]  Of serious concern, Sgt. Weddleton indicated that he does not even consider subclass characteristics when he examines breech face markings. (Daubert Hr'g Tr. 49, Oct. 28, 2005.)
[3]  Although the AFTE guidelines to which Special Agent Curtis referred were not offered into evidence, the Court accepts as credible his testimony that the standard in the field is for the examiner to document his or her findings through the use of notes, sketches, or photographs.
