
433 F.Supp.2d 157 (2006)
Jacob BRADLEY, Noah Bradley, Keith Ridley, and Jared Thomas, Plaintiffs,
v.
CITY OF LYNN, et al., Defendants.
Civil Action No. 05-10213-PBS.
United States District Court, D. Massachusetts.
June 2, 2006.
*158 Alfred Gordon, Pyle, Rome, Lichten & Ehrenberg, P.C., Harold L. Lichten, Shannon E. Liss-Riordan, Pyle, Rome Lichten, Ehrenberg & Liss-Riordan, P.C., Boston, MA, for Plaintiffs.
George S. Markopoulos, James Lamanna, City Solicitor's Office, Lynn, MA, Sookyoung Shin, Ronald F. Kehoe, Attorney General's Office, Boston, MA, for Defendants.

MEMORANDUM AND ORDER RE: MOTION FOR PRELIMINARY INJUNCTION
SARIS, District Judge.

I. INTRODUCTION
Intervenor Plaintiffs New England Area Conference of the NAACP and the Boston Society of the Vulcans move for preliminary and permanent injunctive relief to require Defendant Commonwealth of Massachusetts, Division of Human Resources (the "HRD") to reorder Boston Firefighter Certification List No. 260302, which is to be used for the hiring of firefighters in Boston.
As background, the class plaintiffs allege that the civil service cognitive examination used to qualify and rank applicants has a disparate impact on minorities in violation of Title VII of the Civil Rights Act of 1964, 42 U.S.C. § 2000e-2(k) (2006), and the federal consent decree in Boston Chapter, NAACP, Inc. v. Beecher, 371 F.Supp. 507 (D.Mass.1974) (the "Beecher decree"). Pursuant to Fed.R.Civ.P. 23(a) and (b)(2), the Court certified the plaintiff firefighter and police officer classes for minorities (defined as Black and Hispanic) on March 24, 2006. (Docket No. 81.) A six-day bench trial for the liability phase began on April 11, 2006, and the parties rested on May 4, 2006. At the request of all counsel, the Court ordered that closing briefs be submitted by June 1, 2006, and scheduled oral argument for June 9, 2006.
Boston has recently requested a new list from the HRD to hire fifty more firefighters based on the 2004 examination, thus igniting this last-minute motion. This would be the third class hired from the 2004 examination. The new list was certified on April 11, 2006 and expanded on April 21, 2006. The expanded list was produced to the plaintiffs on May 4, 2006, the last day of trial.
The class plaintiffs have not joined in the intervenor plaintiffs' motion for preliminary and permanent injunctive relief and state that:
[G]iven the urgent need of the Boston Fire Department to achieve staffing levels necessary to guard and protect the public safety, including all residents of the City, [they] seek no action, order, or remedy that will in any way impair the City's ability to complete the screening and hiring process without any delay.
(Pls.' Letter, May 16, 2006.) After hearing, the motion is DENIED.

II. LEGAL DISCUSSION
In deciding whether to grant a preliminary injunction, a court must weigh the following four factors:

*159 (1) the likelihood of success on the merits; (2) the potential for irreparable harm [to the movant] if the injunction is denied; (3) the balance of relevant impositions, i.e., the hardship to the nonmovant if enjoined as contrasted with the hardship to the movant if no injunction issues; and (4) the effect (if any) of the court's ruling on the public interest.
Wine & Spirits Retailers, Inc. v. Rhode Island, 418 F.3d 36, 46 (1st Cir.2005). Unfortunately, this trial record involves complex statistical issues that have not been fully vetted, briefed, or argued. Accordingly, I make the following conclusions regarding this motion for preliminary relief without prejudice to my final analysis in the bench trial, which will contain detailed findings of fact.
A. Likelihood of Success
I find that the plaintiffs have demonstrated a likelihood of success on their claim that the 2004 cognitive civil service examination has a disparate and adverse impact on minorities in Boston in violation of Title VII and the Beecher decree. See EEOC v. Steamship Clerks Union, Local 1066, 48 F.3d 594, 601-02 (1st Cir.1995) (setting forth standards for disparate impact analysis).
The commonly accepted, longstanding benchmark for evaluating disparate impact is the "four-fifths rule" of the Uniform Guidelines on Employee Selection Procedures adopted by the Equal Employment Opportunity Commission ("EEOC"), 29 C.F.R. § 1607.4(D) (2005) (the "EEOC Guidelines"), which provides that a selection rate that "is less than [80%] of the rate for the group with the highest rate will generally be regarded" as evidence of adverse impact. The plaintiffs have submitted evidence showing that the civil service examination has had a disparate impact on the hiring of minorities in Boston, particularly within the subcategory of candidates with veteran status. Since Boston was released from the Beecher decree three years ago, seven minority veterans have been hired out of twenty-seven minority veterans who passed the 2004 examination; during the same period, ninety non-minority veterans have been hired out of 177 non-minority veterans who passed the examination.[1] (Ex. 33D). These Boston hiring statistics result in a violation of the four-fifths rule of the EEOC Guidelines. Moreover, the evidence indicates that the hiring process continues to have a disparate impact on minorities: of the 156 names contained on the new Boston Firefighter Certification List No. 260302 at issue in this motion, only twenty-one candidates are Black or Hispanic. (See Exs. 33H, M, Q.)
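The four-fifths comparison described above can be checked with simple arithmetic. The following is a minimal sketch, not part of the record: it uses only the veteran hiring counts cited from Ex. 33D, and the function names (`selection_rate`, `impact_ratio`) are hypothetical labels chosen for illustration.

```python
from fractions import Fraction

# Counts taken from the veteran hiring figures cited above (Ex. 33D).
MINORITY_HIRED, MINORITY_PASSED = 7, 27
NON_MINORITY_HIRED, NON_MINORITY_PASSED = 90, 177

FOUR_FIFTHS = Fraction(4, 5)  # the 80% benchmark of 29 C.F.R. § 1607.4(D)


def selection_rate(hired: int, passed: int) -> Fraction:
    """Share of candidates who passed the examination and were then hired."""
    return Fraction(hired, passed)


def impact_ratio(rate_a: Fraction, rate_b: Fraction) -> Fraction:
    """Lower selection rate divided by the higher selection rate."""
    low, high = sorted((rate_a, rate_b))
    return low / high


minority_rate = selection_rate(MINORITY_HIRED, MINORITY_PASSED)
non_minority_rate = selection_rate(NON_MINORITY_HIRED, NON_MINORITY_PASSED)
ratio = impact_ratio(minority_rate, non_minority_rate)
violates_four_fifths = ratio < FOUR_FIFTHS

print(f"minority veterans:     {float(minority_rate):.1%}")
print(f"non-minority veterans: {float(non_minority_rate):.1%}")
print(f"impact ratio:          {float(ratio):.2f}")
print(f"below four-fifths:     {violates_four_fifths}")
```

On these counts the minority veteran selection rate is roughly 26%, the non-minority veteran rate roughly 51%, and the ratio of the two is about 0.51, well under the 0.80 benchmark.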
The HRD presented evidence to show that these Boston hiring discrepancies may be due to factors other than race. Specifically, the HRD's expert Dr. Rick Jacobs submitted a response comparing the pool of candidates reached for consideration with those actually hired. Dr. Jacobs believes that the statistics show that minorities in Boston have a higher "dropout rate." According to Dr. Jacobs, from 2004 to date, minorities have dropped out at a rate of 58.8% and non-minorities at a rate of 41%. By "drop out rate," Dr. Jacobs refers to those candidates who have been certified by the HRD and reached for consideration by Boston, but "drop out" because *160 they are unwilling to serve (maybe they accepted another job or moved); fail the background check (maybe they have a disqualifying conviction, flunk the drug test, or fail to meet the residency requirement); or are screened out by interview. Plaintiffs' expert Dr. Frank Landy responds that while minorities who are certified and reached do "drop out" at a somewhat higher rate, the difference in "drop out rates" is not statistically significant in Boston, and any differential would not erase the adverse impact of the cognitive ability exam.
Even if Dr. Jacobs were correct, the "drop out" data would not be dispositive in a disparate impact analysis of Boston hiring. Boston has hired multiple classes from the 2004 examination. Because all candidates are ranked based on scores, candidates who have lower scores are reached in later classes than those with higher scores. Dr. Jacobs has not analyzed whether the drop out rates vary from class to class or the reason for the "drop out." An analysis of the veteran candidates appointed in Boston in the June 2005 and January 2006 classes shows that white veterans were two times more likely to be appointed in the earlier wave of hiring than minority veterans. (Ex. 33D, at 3.) Based on the literature in the field, Dr. Landy points out: "[T]he length of time an applicant has to wait for appointment (white or minority) has a substantial impact on drop out likelihood - the longer you wait for appointment, the more likely it is that you will have taken another job or lost interest in the job in question if and when the employment offer comes." (Ex. 33T, at 3.) It is not surprising, then, that because minorities are bunched at the bottom of the scores as discussed below, they would be more likely to have a higher "drop out rate."
Plaintiffs argue that both Title VII and the Beecher decree require that this Court examine the disparate impact of the cognitive examination on the employment opportunities of minorities rather than the bottom-line gross hiring statistics, which may be driven by factors such as "drop out rate." In Connecticut v. Teal, 457 U.S. 440, 102 S.Ct. 2525, 73 L.Ed.2d 130 (1982), the Supreme Court reaffirmed that Title VII prohibits "procedures or testing mechanisms that operate as `built-in headwinds' for minority groups." Id. at 448-49, 102 S.Ct. 2525 (quoting Griggs v. Duke Power Co., 401 U.S. 424, 432, 91 S.Ct. 849, 28 L.Ed.2d 158 (1971)). Commenting on congressional concern about the widespread use by state and local governmental agencies of invalid selection techniques that had a discriminatory impact, the Court stated:
In considering claims of disparate impact under [Title VII], this Court has consistently focused on employment and promotion requirements that create a discriminatory bar to opportunities. This Court has never read [Title VII] as requiring the focus to be placed on the overall number of minority or female applicants actually hired or promoted.
Id. at 450, 102 S.Ct. 2525. In Teal, the pass-fail examination for a public agency had a disparate impact on minorities' eligibility for promotion but no effect on the bottom-line promotion statistics because of an affirmative action program for promoting those who passed. See id. at 443-44, 102 S.Ct. 2525. Rejecting the agency's "bottom-line defense," the Supreme Court admonished, "The suggestion that disparate impact should be measured only at the bottom line ignores the fact that Title VII guarantees these individual respondents the opportunity to compete equally with white workers on the basis of job-related criteria." Id. at 451, 102 S.Ct. 2525.
*161 Other courts have applied Teal to reject examinations used to rank-order candidates. For example, in Waisome v. Port Auth. of N.Y. & N.J., 948 F.2d 1370, 1378 (2d Cir.1991), involving a composite score of tests to rank police officers for promotion, the Second Circuit held:
Moreover, our prior case law lends support to the use of the [effective cutoff score]. Where a written test served, as here, both as a passing "gate" to further consideration for promotion, and as a major component of the ultimate score required for promotion, we indicated there was no disparate impact in the pass rate, but the disparity in actual promotions established that the written test had a prohibited disparate impact. In [Kirkland v. New York Dep't of Correctional Servs., 711 F.2d 1117 (2d Cir. 1983)], as in the present case, evidence demonstrated that, though there was no disparity in the rate at which minority candidates for promotion passed an examination, their representation on the eligibility list was disproportionately low at the top of the list and high at its bottom. Hence, remand is required for the district court to develop a full record against which to evaluate the evidence of bunching and to determine whether the written examination had a disparate impact when these statistics and all the surrounding facts and circumstances are considered.
Id. at 1378 (citations omitted).
Here, Teal is not an exact fit because the passing score of seventy was chosen specifically, and the test results were even manipulated by dropping questions and by weighting others differently, so that the examination would pass muster under the EEOC Guidelines.[2] Therefore, unlike Teal, there is no showing of discrimination under the four-fifths rule against minorities based on the nominal passing score. Nonetheless, as in Waisome, the evidence is undisputed that there is a disparate impact under the four-fifths rule caused by the test scores of passing minorities. By using test scores as a ranking mechanism, minorities were bunched at the bottom of the scores, in both the veteran and nonveteran categories. Thus, the 2004 examination served as an artificial barrier to employment opportunities for lower-ranked minority candidates who are reached at a later time than non-minorities - or not reached at all - unless the employer can demonstrate it measures skills related to effective job performance.
Therefore, while the plaintiffs' statistics are not fireproof, I find that the plaintiffs have demonstrated a likelihood of success in making out a prima facie case that the 2004 examination has had a disparate and adverse impact on hiring minority firefighters in Boston.
The burden thus shifts to the HRD to show that the test has been validated as job related and consistent with business necessity. Steamship Clerks, 48 F.3d at 601-02 (describing Title VII burden-shifting framework); Beecher, 371 F.Supp. at 521 (ordering that "entrance examinations in the future for the purpose of selecting firefighters . . . shall be demonstrably job-related and validated"). Unfortunately, the HRD was lax in meeting this requirement, perhaps because the Beecher decree readily provided a shortcut to a diverse firefighter force by imposing certification quotas.
While the HRD conducted job analyses throughout the 1990s to the present, the last time the entry-level firefighter cognitive examination was validated was in 1992 *162 (based on 1986 data from Ohio), and all of the testimony indicated that the examination should be validated about every five years. Even if use of the 1992 validation were still appropriate, the Court finds the testimony of the plaintiffs' expert Dr. Landy, who supervised the 1992 validation, compelling. Specifically, Dr. Landy testified that the 1992 validation recommended rank ordering by score only when a cognitive test constituted 40% and a physical test constituted 60% of the overall score. Indeed, this is what the HRD implemented for the hiring process that followed the 1992 validation.[3] Yet, to screen and rank candidates from the 2004 examination, the HRD now uses only the written cognitive examination.
While the experts do not dispute that cognitive ability tests predict job performance in public safety positions, and I agree, physical abilities also play an important role. Significantly, the HRD does not now test for cardiovascular fitness, muscle strength, or muscular endurance until after a candidate has been ranked and certified based on cognitive ability and given a conditional offer of employment. Because physical ability is not part of a composite test score, the use of the cognitive score as the sole basis of ranking has not been validated.
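The difference between ranking on a 40%/60% composite and ranking on the cognitive score alone can be illustrated with a short sketch. The weights come from the 1992 validation as described in the testimony; everything else is an assumption made for illustration: the three candidates and their scores are hypothetical, both tests are assumed to be scored on a common 0-100 scale, and `composite_score` is an invented name rather than any formula used by the HRD.

```python
# Weights described in the testimony about the 1992 validation.
COGNITIVE_WEIGHT = 0.40
PHYSICAL_WEIGHT = 0.60


def composite_score(cognitive: float, physical: float) -> float:
    """Weighted composite, assuming both tests share a 0-100 scale."""
    return COGNITIVE_WEIGHT * cognitive + PHYSICAL_WEIGHT * physical


# Hypothetical candidates: name -> (cognitive score, physical score).
candidates = {
    "A": (98, 70),
    "B": (90, 95),
    "C": (85, 88),
}

# Ranking on the written cognitive examination alone.
by_cognitive = sorted(
    candidates, key=lambda name: candidates[name][0], reverse=True
)

# Ranking on the 40/60 composite.
by_composite = sorted(
    candidates, key=lambda name: composite_score(*candidates[name]), reverse=True
)

print("cognitive only:", by_cognitive)
print("40/60 composite:", by_composite)
```

In this invented example the candidate ranked first on the written examination falls to last once the physical score carries 60% of the weight, which is why a validation of the composite does not carry over to a ranking based on the cognitive score alone.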
Moreover, plaintiffs have demonstrated likely success on the availability of alternative methods with less adverse impact. Rank ordering by exact score is suspect in light of the persuasive testimony by all expert witnesses that differences in test scores are not significant within as much as an eight-point spread, and yet, the HRD has not adopted score banding. Boston Police Superior Officers Fed'n v. City of Boston, 147 F.3d 13, 23-24 (1st Cir.1998) (finding that one-point difference in promotional exam was "as a matter of testing accuracy, negligible" where evidence showed that "candidates who scored within a three-point band should be considered functionally equivalent . . . and equally qualified to successfully perform the job as any other person in that score band"). Nor does it test for desirable psychological traits. While there is no uniformly followed standard in the field, the Commonwealth is out-of-step with the majority of other jurisdictions which give greater weight to these other physical and psychological characteristics and ranking approaches. (Ex. 33N.)
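Score banding, as referenced above, treats candidates whose scores fall within the same band as functionally equivalent rather than ranking them by exact score. The sketch below shows one simple variant only, fixed-width bands measured down from the top score; the scores are hypothetical, the eight-point width is borrowed from the spread mentioned in the expert testimony purely as an illustrative parameter, and `band_index` is an invented name. Other banding schemes, such as sliding bands, exist and are not shown.

```python
from collections import defaultdict

BAND_WIDTH = 8  # illustrative width, in score points


def band_index(score: int, top_score: int, width: int = BAND_WIDTH) -> int:
    """Band number counted from the top score; band 0 is the highest band."""
    return (top_score - score) // width


# Hypothetical passing scores.
scores = [100, 97, 93, 92, 88, 84, 79]
top = max(scores)

bands = defaultdict(list)
for score in scores:
    bands[band_index(score, top)].append(score)

for index in sorted(bands):
    print(f"band {index}: {bands[index]}")
```

Under this scheme the scores 100, 97, and 93 fall in the same band and would be treated as equivalent for ranking purposes, while 92 begins the next band.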
Accordingly, the plaintiffs have demonstrated a likelihood of success in showing that the 2004 civil service examination, as it is used to rank candidates, has not been properly validated and violates Title VII and the Beecher decree.
B. Irreparable Harm, Balance of Hardships, and Public Interest
The problem for the intervenor plaintiffs is that they fail to meet the other requirements for a preliminary injunction. The intervenor plaintiffs have shown no irreparable harm if the Court denies the injunction. Even if the Court were ultimately to conclude that the rights of members in the plaintiff class residing in Boston have been violated, the Court could devise an appropriate prospective remedy as in Beecher and Quinn, including back pay and preferential certification, hiring, and seniority. Beecher, 371 F.Supp. at 507 (establishing consent decree with preferential certification quotas); Quinn v. City of Boston, 279 F.Supp.2d 51, 52-53 (D.Mass.2003) (ordering *163 that plaintiffs be hired conditionally and awarded back pay).
Moreover, the requested preliminary injunction would likely slow down the hiring of firefighters. The city received Boston Firefighter Certification List No. 260302 over a month ago, and has already begun its hiring procedure by administering questionnaires, conducting orientation sessions, and performing background checks. An injunction requiring a reordering would thus also interfere with the settled expectations of innocent non-minorities on the list who have already begun the Boston hiring process.
The intervenor plaintiffs have not proposed a clear remedy. At the hearing, they conceded that banding the scores (i.e., within an eight-point spread) would not work to improve minority representation, and did not propose job performance reordering based on, for example, weighting the cognitive scores at 40% or any of the alternative approaches recommended by the class plaintiffs' expert. Instead, when pressed by the Court at the hearing to specify how the list should be reordered, the intervenor plaintiffs suggested that all of the minorities who passed the examination should be added to the certified list and that the list should be reordered to rank candidates for hiring not by examination score but by mathematical ratio according to race to ensure no disparate impact. Any such race-based reordering would be tantamount to requiring a certification quota without engaging in heightened constitutional scrutiny. See Gratz v. Bollinger, 539 U.S. 244, 268-70, 123 S.Ct. 2411, 156 L.Ed.2d 257 (2003) (holding that automatic, non-individualized racial preference is not narrowly tailored to achieve compelling interest of diversity); Quinn v. City of Boston, 325 F.3d 18, 38-39 (1st Cir.2003) (finding that Beecher certification quotas were no longer justified because compelling interest of remedying past effects of discrimination had been achieved in Boston firefighter hiring).
Finally, Intervenor Plaintiff NAACP waited two years to challenge the 2004 examination, and the Beecher decree wisely stated that any resolution regarding entrance examinations "shall be accomplished before any such test is put into use for the purpose of qualifying or selecting." 371 F.Supp. at 521.
Because three of the four factors in this case weigh against the issuance of equitable relief, I decline to issue the requested injunction to reorder the certified list to Boston.

ORDER
Intervenor Plaintiffs' motion for preliminary injunctive relief is DENIED. (Docket No. 109.)
NOTES
[1]  The intervenor plaintiffs propose different statistics based on a Boston Globe article. (Intervenor Pls.' Mem. 4.) The state defendants point out that the difference between the trial exhibit and this article is that the Globe article includes two all-white classes hired as a result of a court ordered remedy. (State Defs.' Mem. 7 n. 2.) While the record is not completely transparent as to the second all-white group, the first all-white group was not based on the 2004 examination. (Trial Tr. 114-16, May 4, 2006.)
[2]  Dr. Landy attacks the use of seventy as a passing score even though it meets the four-fifths rule. I do not reach that argument for purposes of this preliminary ruling.
[3]  HRD abandoned the physical exam as a ranking mechanism when some candidates had medical problems without making a serious effort to find an appropriate and safe alternative physical abilities test.
