251 F.3d 315 (2nd Cir. 2001)
THE HERRICK COMPANY, INC. AND NORTON HERRICK, PLAINTIFFS-COUNTER-DEFENDANTS-APPELLANTS-CROSS-APPELLEES, v. SCS COMMUNICATIONS, INC. AND STEPHEN C. SWID, DEFENDANTS-COUNTER-CLAIMANTS-APPELLEES-CROSS-APPELLANTS, VETTA SPORTS, INC., HENRY N. CHAN, RICHARD SHEINBERG AND STEPHEN D. WEINROTH, DEFENDANTS-COUNTER-CLAIMANTS, SKADDEN, ARPS, SLATE, MEAGHER & FLOM, LLP AND MARK SMITH, DEFENDANTS.
Docket No. 99-7976(L), 99-7996 & 99-7998(XAP)
August Term, 1999
UNITED STATES COURT OF APPEALS FOR THE SECOND CIRCUIT
Argued August 7, 2000
Decided May 23, 2001

Appeal and cross-appeal from a grant of partial summary judgment and from a subsequent jury verdict as to those issues not decided on summary judgment. VACATED and REMANDED for further determinations concerning federal subject matter jurisdiction.
Gerald Walpin (Joseph Zuckerman & Barry Michael Okun on the brief), Rosenman & Colin LLP, New York, N.Y. for Plaintiffs-Counter-Defendants-Appellants-Cross-Appellees.
Gregory P. Joseph, Fried, Frank, Harris, Shriver & Jacobson, New York, N.Y. (David Spears, Richards, Spears, Kibbe & Orbe, on the brief) (Einer Elhauge & David Rosenberg, of counsel, on the brief), for Defendants-Counter-Claimants-Appellees-Cross-Appellants.
Before: Calabresi, Cabranes, and Pooler, Circuit Judges.

Calabresi, Circuit Judge

1
The Herrick Company, Inc. and Norton Herrick (collectively "Herrick" or "plaintiffs") and SCS Communications, Inc. and its principal Stephen C. Swid (collectively "SCS/Swid") appeal and cross-appeal in this diversity suit from a judgment, entered by the United States District Court for the Southern District of New York (Patterson, J.), following a jury verdict awarding Herrick compensatory damages against SCS/Swid for breach of contract and breach of fiduciary duty. On appeal, Herrick attacks a setoff the district court granted against the jury award. SCS/Swid appeals the judgment below more generally, arguing (a) that the district court erred in deciding at summary judgment that a contract existed and therefore sending only questions of breach and damages to the jury, (b) that the jury trial on breach and damages was infected with error, and (c) that there is no federal jurisdiction over the lawsuit. Because we find an absence of federal subject matter jurisdiction over the lawsuit as presently constituted and because, as an appellate court, we cannot determine whether the defect in jurisdiction may be cured, we do not reach the merits of the parties' substantive contentions. We therefore vacate the judgment of the district court and remand the case to that court for further consideration.

I. BACKGROUND

2
This complicated appeal arises out of a much simpler business deal that went wrong. In that deal, Herrick and SCS/Swid planned the joint acquisition of the Orleander Group, a manufacturer of bicycle accessories. At the heart of the lawsuit is an August 16, 1993 Letter Agreement (the "Letter Agreement") executed by Herrick Company, Inc., SCS Communications, Inc., and TOG Acquisition Co. (the entity to be used as the acquisition vehicle). Plaintiffs claim that the Letter Agreement is a legally binding contract creating a joint venture to acquire the Orleander Group.


3
Negotiations about the structure of the acquisition vehicle and about the rights and responsibilities of the parties thereto broke down on November 21, 1993, and, eight days later, on November 29, SCS and TOG Acquisition Co. proceeded, with the assistance of the law firm Skadden, Arps, Slate, Meagher & Flom ("Skadden") and one of its partners, Mark Smith, to complete the acquisition of the Orleander Group on their own. Herrick responded by bringing the lawsuit now before us, in which it claims it was improperly deprived of its half-share in the joint acquisition. Specifically, Herrick sued the businesses and businesspeople involved in the deal -- (1) SCS/Swid, (2) SCS Communications director Stephen Weinroth, (3) Vetta Sports, Inc. (as successor to TOG Acquisition Co.), and (4) TOG Acquisition Co. officers Richard Scheinberg & Henry N. Chan -- (collectively, the "business defendants") asserting causes of action, based on the Letter Agreement, for breach of contract, breach of fiduciary duty, fraudulent inducement, and knowing participation in a breach of fiduciary duty. In addition, Herrick sued the law firm and lawyer involved in the deal -- Skadden and Skadden partner Mark Smith -- asserting a variety of causes of action involving breaches of fiduciary duties allegedly created by Skadden's role as attorney to the joint venture and to Herrick as one of the joint venturers. Plaintiffs sought both compensatory and punitive damages.


4
The several parties all moved and cross-moved for summary judgment. In relevant part, (1) the business defendants moved for summary judgment dismissing plaintiffs' complaint on the ground (among others) that the Letter Agreement was not an enforceable contract and that it did not create a joint venture; and (2) plaintiffs cross-moved for summary judgment against all defendants on all claims, and in particular against the business defendants on the ground that the Letter Agreement was an enforceable contract creating a joint venture, which the business defendants breached.1


5
The district court decided the motions for summary judgment in an Opinion and Order dated December 3, 1996. In that Opinion and Order, the district court rejected the business defendants' contention that the Letter Agreement was no more than "a classic agreement to agree which did not even create enforceable obligations let alone a joint venture," and found instead that the Letter Agreement was, as a matter of law, a valid and enforceable contract creating a joint venture to acquire the Orleander Group. At the same time, however, the district court refused to hold that the business defendants had breached the contract created by the Letter Agreement, finding that the questions of breach and damages "appear to be genuine issues of material fact for the jury to determine," and hence not appropriate for summary judgment.2


6
The case therefore went to trial (before a jury) on the issues of the business defendants' breach of the Letter Agreement, Skadden's breach of fiduciary duties, and damages. The trial began on January 12, 1999, and on January 15 (the fourth day of the trial), counsel for plaintiffs and for all defendants other than SCS/Swid announced in open court that they had reached a settlement agreement (whose terms were not revealed to the jury and remain sealed or redacted from the district court's unsealed orders). Consequently, all defendants other than SCS/Swid were excused from the remainder of the trial. The district court, although not until July 23, entered stipulations and orders dismissing plaintiffs' case against the settling defendants but retaining jurisdiction over any disputes relating to the settlement.


7
Plaintiffs continued to present their case against SCS/Swid, and, on February 10, 1999, the jury returned a verdict finding (1) SCS liable to plaintiffs for breach of contract and breach of fiduciary duty, (2) Swid, as SCS's alter ego, liable to plaintiffs for breach of contract and breach of fiduciary duty, and (3) Swid liable to plaintiffs for knowing participation in a breach of fiduciary duty. The jury awarded plaintiffs $10,549,000 in damages, and on February 25, the district court entered a judgment in the amount of $10,549,000 plus prejudgment and post-judgment interest, subject to SCS/Swid's right to a setoff due to plaintiffs' settlement with the other defendants.


8
Uncertainty about the amount of this setoff caused the district court to vacate entry of that judgment. On August 12, 1999, the district court entered a new judgment, (1) awarding Herrick damages in the amount of $10,549,000 plus pre-judgment interest, and (2) granting SCS/Swid's motion to amend their answers (pursuant to Fed. R. Civ. P. 15) in order (a) to claim a setoff in respect of the settlement between Herrick and the other defendants and (b) to establish the present discounted value of the settlement and hence the amount of the setoff. In addition, the district court denied several other motions made by SCS/Swid pursuant to Fed. R. Civ. P. 50 and 59, seeking judgment notwithstanding the verdict, a new trial, and alteration of the judgment.3 Most importantly, from the perspective of this appeal, the district court rejected SCS/Swid's argument, which was presented to the court only after the trial had been completed and the jury's verdict had been entered, that Skadden's presence in the lawsuit destroyed diversity and therefore deprived the federal courts of their only source of subject matter jurisdiction over the case. The district court found that even if, arguendo, Skadden and Herrick were in fact not diverse, this challenge to subject matter jurisdiction failed as a matter of law, primarily because of Skadden's settlement and SCS/Swid's delay in raising the challenge.


9
Both Herrick and SCS/Swid filed timely notices of appeal. Herrick seeks reversal of the district court's order that the present discounted value of the settlement should be set off against the jury verdict. Specifically, Herrick argues that the district court (1) misapplied (in several ways) New York state law governing setoffs, (2) violated Herrick's constitutional right to a jury trial, and (3) improperly credited toward the setoff settlement payments that Herrick would receive only in the future. SCS/Swid seeks reversal of "each and every portion" of the August 12 judgment, "as well as all prior decisions and orders" in the case. SCS/Swid argues (1) that the district court's finding, on summary judgment, that the August 16, 1993 Letter Agreement constituted a binding contract creating a joint venture was erroneous, (2) that the district court made several erroneous and prejudicial evidentiary rulings at trial, (3) that the district court issued erroneous damage instructions, and (4) that the jury's verdict holding Swid liable as SCS's alter ego and also for knowingly participating in a breach of fiduciary duty was unsupported by the evidence. Finally, in response to an order issued by our Court on June 15, 2000, the parties submitted letter briefs addressing the question, first raised by SCS/Swid in its post-judgment motions below, of whether federal subject matter jurisdiction is proper in this case.

II. DISCUSSION

10
Our disposition of the present appeal focuses on only the last of these questions -- whether we enjoy federal subject matter jurisdiction. Because we find that federal subject matter jurisdiction is lacking as the case is presently structured and that, as an appellate court, we cannot determine whether the defect in jurisdiction may properly be cured, our inquiry does not pass beyond that question. We therefore remand the case to the district court for further consideration.


11
We are aware, in ordering the remand, that this has been an arduous and expensive lawsuit, but subject matter jurisdiction remains "an unwaivable sine qua non for the exercise of federal judicial power," Curley v. Brignoli, Curley & Roberts Assocs., 915 F.2d 81, 83 (2d Cir. 1990). We have recently observed that, "jurisdiction is not a game," and that, "[a]s the Supreme Court has made abundantly clear, it is one of the fundamental tenets of our Constitution that only some cases may be brought in federal court." E.R. Squibb & Sons, Inc. v. Accident & Cas. Ins. Co., 160 F.3d 925, 929 (2d Cir. 1998). We cannot avoid addressing the threshold question of jurisdiction simply because our finding that federal jurisdiction does not exist threatens to prove burdensome and costly, or because it may undermine an expensive and substantially completed litigation. Squibb, 160 F.3d at 929-30. Accordingly, we must consider arguments attacking federal jurisdiction whenever they arise, and in doing so, we review the district court's legal conclusions de novo. Viacom Int'l, Inc. v. Kearney, 212 F.3d 721, 726 (2d Cir. 2000).

A. Diversity

12
In the case before us, it is unquestioned that the only source of federal subject matter jurisdiction is diversity of citizenship pursuant to 28 U.S.C. § 1332. In relevant part, that statute establishes that diversity jurisdiction exists over civil actions (1) between "citizens of different States" and (2) between "citizens of a State and citizens or subjects of a foreign state." 28 U.S.C. § 1332(a)(1) & (2). We must, therefore, consider whether the parties in the case before us fall into either of these categories. We focus, in particular, on whether Skadden's participation in the case destroys diversity under this statute as it has been interpreted.


13
We begin this enquiry with the axiomatic observation that diversity jurisdiction is available only when all adverse parties to a litigation are completely diverse in their citizenships. See Owen Equip. & Erection Co. v. Kroger, 437 U.S. 365, 373-74 (1978); Squibb, 160 F.3d at 930 (citing Strawbridge v. Curtiss, 7 U.S. (3 Cranch) 267 (1806)). In addition, we note two further, equally well-settled, principles governing diversity jurisdiction. First, for purposes of establishing diversity, a partnership has the citizenship of each of its partners. See Carden v. Arkoma Assoc., 494 U.S. 185, 192-95 (1990). And second, United States citizens "domiciled abroad are neither citizens of any state of the United States nor citizens or subjects of a foreign state," so that "§ 1332(a) does not provide that the courts have jurisdiction over a suit to which such persons are parties." Cresswell v. Sullivan & Cromwell, 922 F.2d 60, 68 (2d Cir. 1990).4


14
Putting these two principles together and applying them to Skadden generates the conclusion that if Skadden has among its partners any U.S. citizens who are domiciled abroad, then Skadden and Herrick (which is a citizen of Florida) are non-diverse. And this is precisely the conclusion SCS/Swid urges upon us, providing -- by way of identification of possible U.S. citizens domiciled abroad -- biographical materials, compiled by Skadden itself, that describe several Skadden partners as educated in the United States and belonging to American State bars, but permanently living in a foreign city and working at a foreign office.


15
Furthermore, it is well established that "[t]he party seeking to invoke jurisdiction under 28 U.S.C. § 1332 bears the burden of demonstrating that the grounds for diversity exist and that diversity is complete." Advani Enter., Inc. v. Underwriters at Lloyds, 140 F.3d 157, 160 (2d Cir. 1998) (citing McNutt v. General Motors Acceptance Corp., 298 U.S. 178, 189 (1936) and Strawbridge, 7 U.S. (3 Cranch) 267). Given the evidence SCS/Swid presents that indicates that the Skadden partners in question are U.S. citizens domiciled abroad and the absence of any evidence overcoming or even countervailing this suggestion, Herrick cannot be said to have carried its burden. In the ordinary case, therefore, a straightforward application of the rules for allocating the burden of proving diversity would lead us to find that Skadden and Herrick are not diverse.


16
A somewhat unusual feature of this case, however, precludes that conclusion in the absence of further analysis. The evidence that SCS/Swid has marshaled against Skadden's diversity (involving, as it does, Skadden partners who appear to have U.S. roots and more recent long-term residence abroad) indicates that if these partners have indeed established a foreign domicile, such a domicile constitutes a change from an earlier, although unidentified, U.S. domicile. This, Herrick argues, implicates the venerable rule that "[w]here a change of domicile is alleged, the burden of proof rests upon the party making the allegation." Desmare v. United States, 93 U.S. (3 Otto.) 605, 610 (1876). Herrick claims, in essence, that SCS/Swid's offer of proof -- which amounts to no more than publicity materials produced by Skadden claiming that the partners in question have worked at foreign offices since the late 1980's or early 1990's -- is insufficient to carry SCS/Swid's burden of establishing the change in domicile on which the lack of diversity depends.


17
On the specific combination of facts before us -- under which the allegation of diversity relies on old domiciles and the denial of diversity relies on changes in domicile -- the rules articulated by McNutt (that the party invoking federal jurisdiction bears the burden of proving diversity) and by Desmare (that the party alleging a change of domicile bears the burden of proving that change) cut in opposite directions. We must, therefore, decide how to resolve the tension between them. As it happens, this combination of circumstances, though somewhat unusual, is far from unprecedented.


18
One response to the potential conflict between the McNutt and Desmare rules has been to hold that the ultimate burden of persuasion concerning diversity and hence domicile remains with the party invoking federal jurisdiction and to treat (at least when the two rules are in tension) the Desmare rule that the party alleging a change in domicile bears the burden of proving that change to refer to the onus of production only. Thus the Fifth Circuit has said that "[w]hile some opinions seem to imply that the burden of persuasion rests with the party attempting to show a change of domicile, this is an overstatement. The proper rule is that the party attempting to show a change assumes the burden of going forward on that issue. The ultimate burden on the issue of jurisdiction rests with the plaintiff or the party invoking federal jurisdiction." Coury v. Prot, 85 F.3d 244, 250 (5th Cir. 1996); see also Lew v. Moss, 797 F.2d 747, 751 (9th Cir. 1986); Slaughter v. Toye Bros. Yellow Cab Co., 359 F.2d 954, 956 (5th Cir. 1966).


19
We have not wholly joined in this approach, noting that "it misconceives the purpose of [the] hallowed [Desmare] presumption, which, unlike evidentiary presumptions, is premised not on probabilities or on which party has more ready access to pertinent information, but rather on a judicial policy determination that in ascertaining diversity jurisdiction in a highly mobile society there is a need to fix domicile with some reasonable certainty at the threshold of litigation." Gutierrez v. Fox, 141 F.3d 425, 427 n. 1 (2d Cir. 1998) (internal citations omitted). And we concluded, in Gutierrez, that "[a]s a corollary to this presumption, the person alleging a change of domicile [in connection with ascertaining diversity] has the burden of proving it." Id. at 427.


20
The Gutierrez holding, however, is expressly tied to (indeed it is explicitly a "corollary" of) the "purpose" it was designed to serve, namely the need, in a mobile society, "to fix domicile with some reasonable certainty at the threshold of litigation." Id. at 427 n. 1. And for this reason, a party seeking to challenge diversity by alleging a change of domicile does not, even under Gutierrez, bear the burden of proving that change if the party seeking to establish diversity has not carried its (prior) burden of establishing a specific initial domicile from which the change would be a departure.5 If the party invoking diversity jurisdiction has not established such an original (default) domicile, then placing the burden of proof on the party asserting a change will in no way help "to fix domicile with some reasonable certainty at the threshold of litigation." Id. Indeed, and to the contrary, it will serve only to complicate and confuse judicial efforts to ascertain diversity jurisdiction, by engaging the parties and the court in an uncertain effort to determine an unclear departure from an unknown beginning.


21
Even under Gutierrez, therefore, the party invoking diversity jurisdiction continues to bear, as McNutt teaches, the burden of persuasion in establishing specific initial domiciles that support the existence of diversity jurisdiction (and from which the alleged changes in domicile represent a departure). Accordingly, the burden of establishing diversity in the case at bar remains (for the present) with Herrick, which must establish actual initial, diverse, domestic domiciles for the partners SCS/Swid claims now live and work abroad. Only after such original domiciles were demonstrated would SCS/Swid bear the burden of proving that they had been abandoned for the foreign domiciles that it alleges currently apply.


22
The sole evidence Herrick has presented in support of its assertion that Skadden is diverse is Skadden's pre-trial admission that diversity jurisdiction existed. This admission is a "strong factor in favor of a similar judicial finding," Gutierrez, 141 F.3d at 427, and may well establish a prima facie showing of diversity. SCS/Swid, however, has presented specific evidence (also based on Skadden's own statements) that several Skadden partners have long-term residences and professional focuses abroad. This evidence (involving, as it does, allegations of residence only and not domicile) might not be enough to establish lack of diversity were the burden of persuasion to lie with SCS/Swid. But it is sufficiently powerful to create serious doubts about Skadden's diversity, doubts Herrick has been unable to address fully on appeal. Because the burden of establishing diversity remains with Herrick, this failure is fatal to Herrick's arguments that it and Skadden are diverse.

B. Supplemental Jurisdiction

23
It is conceivable, nevertheless, that the federal courts, in connection with the dispute between Herrick and SCS/Swid (who are unquestionably diverse), might entertain supplemental jurisdiction over the dispute involving non-diverse Skadden.6 It is possible, first, that the federal courts might enjoy supplemental jurisdiction over the initial lawsuit between Herrick and Skadden. And second, even if this more aggressive assertion of supplemental jurisdiction fails, the federal courts might nevertheless enjoy supplemental jurisdiction over the narrower subject matter of the settlement involving Skadden, which is the only context in which Skadden remains in the case as it has reached us. We address each alternative in turn.


24
Consideration of the first of these need not detain us long. In the (diversity) case at bar, supplemental jurisdiction is alleged not merely over a pendent claim but rather over a claim involving a party as to whom there would be no independent federal subject matter jurisdiction. See Aldinger v. Howard, 427 U.S. 1, 15 (1976). In such cases, "the [supplemental] jurisdiction of the federal courts is limited not only by the provision of Art. III of the Constitution [extending the federal judicial power to "Cases" and "Controversies" only] but also by Acts of Congress," Kroger, 437 U.S. at 372, including, most critically in the case before us, the diversity statute, 28 U.S.C. § 1332, which mandates complete diversity.7 See Kroger, 437 U.S. at 373 (noting that "[o]ver the years Congress has repeatedly re-enacted or amended that statute conferring diversity jurisdiction, leaving intact this rule of complete diversity").8 Thus it is well-settled that the existence of diversity between a plaintiff and one defendant cannot support a federal court's exercise of supplemental jurisdiction over another non-diverse defendant. See 13B Wright, Miller & Cooper, Federal Practice and Procedure: Jurisdiction 2d 150 (1984) (citing Kroger and stating that the suggestion that such supplemental jurisdiction is available has been "authoritatively rebuffed"). Any other result would "allow the requirement of complete diversity to be circumvented" and "would simply flout the congressional command." Kroger, 437 U.S. at 377.


25
But even though the first effort to bring the dispute involving Skadden within the supplemental jurisdiction of the federal courts fails, the second effort might still succeed. Thus, although there may have been no federal jurisdiction over the lawsuit at the time it was filed, the subsequent settlement with Skadden may have cured the earlier defect. And indeed, Herrick argues precisely this. It contends that regardless of what may have been true earlier, at least from the time of the settlement onwards, there has been no defect of federal jurisdiction.


26
In presenting this argument, Herrick does not deny that, although the district court dismissed the case against Skadden, it expressly retained jurisdiction over "any dispute, controversy, or claim arising out of or relating directly or indirectly to the [Skadden] settlement." Instead, relying on Kokkonen v. Guardian Life Ins. Co., 511 U.S. 375 (1994), Herrick claims that application of the principles of supplemental jurisdiction to the settlement (rather than to Skadden's role in the original lawsuit), reveals that at least from the time of the settlement onwards (1) the district court could properly exercise supplemental jurisdiction over the settlement without an independent jurisdictional basis for doing so and (2) notwithstanding this extension of the district court's supplemental jurisdiction, all disputes between Herrick and Skadden concerning the settlement now constitute a separate action. This case, it claims, is jurisdictionally distinct from the remaining dispute between Herrick and SCS/Swid, and does not infect it. Herrick thus argues that the district court's retention of jurisdiction over the settlement was proper and that even if it was not, any jurisdictional errors the district court made concerning the settlement do not undermine the district court's exercise of jurisdiction over the main lawsuit against SCS/Swid.9 We reject both these contentions.


27
Our conclusion follows directly from the logic of Kokkonen itself. In Kokkonen, a party to a settlement which had been entered by the United States District Court for the Eastern District of California requested an order of enforcement despite the fact that the district court did not expressly retain jurisdiction over the settlement when the settlement was entered. Although the other party objected that the district court lacked subject matter jurisdiction over the claim, the district court entered the enforcement order, asserting an inherent power to do so, and the Ninth Circuit affirmed. Kokkonen, 511 U.S. at 377. A unanimous Supreme Court reversed, holding that enforcement of the settlement agreement in that case was "more than just a continuation or renewal of the dismissed suit, ... hence [it] require[d] its own basis for jurisdiction," and no such independent basis of jurisdiction existed. Id. at 378. In fact, the Supreme Court said, in these circumstances, a suit to enforce a settlement merely "involves a claim for breach of a contract, part of the consideration for which was dismissal of an earlier federal suit." Id. at 381. Because "[n]o federal statute makes that connection (if it constitutionally could) the basis for federal-court jurisdiction over the contract dispute," enforcement of the settlement agreement is for the state courts. Id.


28
In reaching this conclusion, the Court noted that "[t]he situation would be quite different if the parties' obligation to comply with the terms of the settlement agreement had been made part of the order of dismissal -- either by separate provision (such as a provision `retaining jurisdiction' over the settlement agreement) or by incorporating the terms of the settlement agreement in the order." Id. In such a case, the Court noted, "a breach of the agreement would be a violation of the [court's] order." Id. And under the familiar principle that a court has ancillary jurisdiction "relating to the court's power to protect its proceedings and vindicate its authority," id. at 380, "ancillary jurisdiction to enforce the agreement would therefore exist." Id. at 381. See also Grimes v. Chrysler Motors Corp., 565 F.2d 841, 844 (2d Cir. 1977) (per curiam) ("jurisdiction over the distribution of the settlement funds can be sustained as ancillary to jurisdiction over the claim itself").


29
Kokkonen thus establishes the straightforward principle that in order for a federal court to retain ancillary jurisdiction to enforce a settlement agreement, the retention of that jurisdiction must serve or connect to a prior legitimate exercise of the court's authority. And this idea reveals why the two propositions Herrick advances -- (1) that the district court could exercise supplemental jurisdiction over the settlement agreement, and (2) that this exercise of jurisdiction is distinct from, and cannot imperil, the district court's jurisdiction over the main dispute -- are both mistaken.


30
First, the district court's exercise of continuing jurisdiction over the settlement involving Skadden depends on the district court's prior exercise of jurisdiction over the lawsuit involving Skadden. Where, as here, there was no proper federal jurisdiction over the initial lawsuit, the reason by reference to which Kokkonen justifies the exercise of supplemental jurisdiction over the settlement -- the vindication of the court's prior authority -- falls away. Because there is no properly exercised prior authority to vindicate, the district court necessarily erred in retaining jurisdiction over the settlement.


31
Second, the district court's mistake in retaining jurisdiction over the settlement cannot be isolated from the remainder of the lawsuit, precisely because that jurisdiction depends on the district court's mistaken assertion of subject matter jurisdiction (of authority) over the settled dispute. The assertion of supplemental jurisdiction over the settlement, precisely because it involves no independent jurisdictional basis separate from the original dispute, must be understood to imply an assertion of continued involvement in and jurisdiction over that original dispute. Accordingly, the jurisdictional defect caused by Skadden's inclusion as a defendant remains in place -- and continues to destroy federal jurisdiction over the original suit -- right up through the present day.10


32
Finally, this conclusion is in keeping with the firm rule, articulated in Kroger and affirmed by 28 U.S.C. § 1367(b), that assertions of ancillary jurisdiction may not be used to avoid the requirement of complete diversity expressed in 28 U.S.C. § 1332. This requirement is imposed on the "original jurisdiction" of the federal district courts, 28 U.S.C. § 1332(a), that is, not only on the final decision-rendering capacity of these courts, but also on every exercise of their judicial power. If problems of incomplete diversity could be cured simply by allowing plaintiffs to settle with non-diverse defendants subject to the continuing jurisdiction of the district court, then plaintiffs could secure all the services of the federal courts -- including the supervision of discovery, the issuance of partial summary judgment, and, perhaps most critically, the enforcement of settlements -- in cases over which the federal courts do not have proper subject matter jurisdiction, so long as the plaintiffs did not require final decision-making by these courts. This result would, to a substantial degree, "allow the requirement of complete diversity to be circumvented" and "would simply flout the congressional command." Kroger, 437 U.S. at 377.11 For all these reasons, Skadden's role in the lawsuit continues to destroy diversity even today, and federal subject matter jurisdiction over the case is as defective now as it was when Herrick filed its complaint.

C. Curing Jurisdiction

33
If there are no facts sufficient to overcome SCS/Swid's allegations that Skadden has jurisdiction-destroying, foreign-domiciled partners, then the federal courts cannot establish jurisdiction over the case by means of the doctrine of supplemental jurisdiction. In this event, the inquiry must turn to the question of whether the continuing defect in federal jurisdiction may be cured. It is possible that such a cure is available. Cf. Squibb, 160 F.3d at 935. Pursuing this inquiry, however, requires us to address questions about the scope of the federal juridical power and about our institutional competence as an appellate court. We must ask, that is, whether defects of federal jurisdiction may ever be cured so late in the day and, if they may in some cases be cured, whether we, as an appellate court, are suited to determine whether this is such a case.


34
The existence of federal jurisdiction over a case initially filed in federal court ordinarily depends on the facts as they stood when the complaint was filed. See, e.g., Smith v. Sperling, 354 U.S. 91, 93 n.1 (1957). In the case at bar, there is no question that the requirements of complete diversity were not demonstrated at the time Herrick filed its initial complaint, so that under this general rule, federal jurisdiction over the case would be invalid ab initio. There are, however, several well-recognized exceptions to this rule, which allow federal courts, under certain circumstances, to cure defects of federal jurisdiction (a) by establishing ex post the original existence of the required jurisdictional facts, see Jacobs v. Patent Enforcement Fund, Inc., 230 F.3d 565, 567 (2d Cir. 2000), or (b) by dismissing jurisdictional spoilers, nunc pro tunc, pursuant to Fed. R. Civ. P. 21, see Newman-Green, Inc. v. Alfonzo-Larrain, 490 U.S. 826 (1989). These are the doctrines to which we now turn.


35
The first of these exceptions, although completely straightforward, is not available in the case before us given the facts that have at this point been established. We have indeed held that an adequate pleading of diversity, rather than being itself a necessary element of diversity jurisdiction, is "merely an allegation informing the court that diversity jurisdiction independently exists." Jacobs, 230 F.3d at 567. And we concluded, for this reason, that even though "a complaint must present certain quite particular allegations of diversity jurisdiction in order to be adequate, the actual existence of diversity jurisdiction, ab initio, does not depend on the complaint's compliance with these procedural requirements." Id. at 567-68 (emphasis in original). As a result, where the facts necessary to the establishment of diversity jurisdiction are subsequently determined to have obtained all along, a federal court may simply allow a complaint to be amended to assert those necessary facts and then treat diversity jurisdiction as having existed from the beginning. But no such amendment is possible when the underlying facts (and not merely the pleadings) are inadequate to support federal jurisdiction. For curing jurisdiction in such a circumstance requires more than changing just the pleadings. Cf. Newman-Green.


36
Our earlier discussion of the burden of proving diversity showed that Herrick, up to date, has not established the facts necessary to demonstrate that diversity jurisdiction existed at the time the complaint in this case was filed. Accordingly, we cannot at this stage employ the first method of curing the jurisdictional defect.


37
With respect to the second way of curing such defects, the Supreme Court, in Caterpillar, Inc. v. Lewis, 519 U.S. 61, 64 (1996), held that an initial failure of diversity was "not fatal to the ensuing adjudication if federal jurisdictional requirements [were] met at the time judgment [was] entered."12 But, since the jurisdictional defect in this case perdures, a broader rule than the Caterpillar holding is needed to salvage jurisdiction here. Herrick urges that we find this rule in Newman-Green, 490 U.S. at 833 n. 7 (1989), in which the Supreme Court (citing with approval our earlier holding in Caspary v. Louisiana Land & Exploration Co., 725 F.2d 189, 191-92 (2d Cir. 1984) (per curiam)) held that, even on appeal, a court may salvage jurisdiction by removing, pursuant to Fed. R. Civ. P. 21, a dispensable non-diverse party from a suit. See Newman-Green, 490 U.S. at 832-33 (extending Fed. R. Civ. P. 21 to appellate courts); see also Squibb, 160 F.3d at 935-40 (commenting on and applying Newman-Green). Herrick argues that under the broad jurisdiction-saving power promoted by Newman-Green, even Skadden's continued presence in the case as a jurisdictional spoiler does not require us to dismiss the suit before us for want of jurisdiction. Instead, Herrick claims, we can and should cure jurisdiction by dismissing Skadden from the case, leaving the judgment below -- insofar as it concerns the remaining parties -- intact.


38
Newman-Green does indeed give the appellate courts power to cure jurisdictional defects even on appeal, and this power is backed by weighty reasons. As the Supreme Court has remarked, "[o]nce a diversity case has been tried in federal court with rules of decision supplied by state law... considerations of finality, efficiency, and economy become overwhelming." Caterpillar, 519 U.S. at 75 (internal citation omitted). Similarly, the Court has emphasized that "requiring dismissal after years of litigation would impose unnecessary and wasteful burdens on the parties, judges, and other litigants waiting for judicial attention." Newman-Green, 490 U.S. at 836. At the same time, however, the problems of defective jurisdiction that this power was designed to address are themselves weighty, being tied to the fundamental constitutional idea that federal courts have only limited jurisdiction, see Squibb, 160 F.3d at 929. And both the circumstances in which Newman-Green was decided and the language of the decision itself reveal that power to cure jurisdictional defects at the appellate level, although there, is to be used conservatively. 490 U.S. at 837.


39
Significantly, Newman-Green involved an appeal from a decision on summary judgment concerning defendants who were jointly and severally liable to the plaintiff. As a result, the Supreme Court noted, none of the diverse defendants could be prejudiced by a dismissal of the non-diverse defendant, see Newman-Green, 490 U.S. at 838. And a dismissal of the entire decision on jurisdictional grounds would only engender a new federal lawsuit against the diverse defendants, which would "proceed to a preordained [summary] judgment," id. at 837. Moreover, in addition to emphasizing these features of the case before it and their intimate connection to its holding, the Newman-Green Court also took pains to emphasize that the power to cure jurisdictional defects at the appellate level and to affirm a lower court adjudication by creating federal jurisdiction nunc pro tunc should be limited to similarly appropriate circumstances. "[W]e emphasize that such authority should be exercised sparingly," the Court said, and that "[i]n each case, the appellate court should carefully consider whether the dismissal of a nondiverse party will prejudice any of the parties in the litigation." Id. at 837-38. Indeed, the Court concluded, if factual disputes about prejudice arise, "it might be appropriate to remand the case to the district court, which would be in a better position to make the prejudice determination." Id. at 838.


40
The facts of the case at bar, and the differences between the circumstances they present and those at issue in Newman-Green, make it much more difficult to conclude at the appellate level that jurisdiction may properly be salvaged in this case and require, instead, that the case be remanded to the district court for further findings. To begin with, while it was uncontested in Newman-Green that the diverse and nondiverse defendants would be jointly and severally liable for any damages the plaintiff recovered, in the case before us it is possible, and Herrick vigorously argues, that at least some of the damages paid by Skadden reflect claims Herrick could bring against Skadden alone, and therefore should not be set-off against the jury verdict that Herrick has obtained against SCS/Swid. Furthermore, whereas in Newman-Green, the judgment saved by creating jurisdiction nunc pro tunc was a summary judgment, that is, a judgment that would have to be agreed to by any reasonable jury, in the instant case, the judgment Herrick asks us to save is a verdict rendered by an actual jury that had for four days of trial seen SCS/Swid presented side-by-side with Skadden, described to the jury as "one of the largest, most powerful and sophisticated law firms in this country."


41
In elaborating on the idea that prejudice caused by the presence of a non-diverse party might preclude salvaging jurisdiction in the manner of Newman-Green, the Supreme Court noted that "[i]t may be that the presence of the nondiverse party produced a tactical advantage for one party or another." Newman-Green, 490 U.S. at 838. Where the possibility of such a tactical advantage exists, a jurisdictional defect that might have produced the advantage cannot properly be cured on appeal, and jurisdiction cannot be created nunc pro tunc by an appellate court. It is usually best in such cases to let the district court weigh the advantage -- if any -- that exists, the prejudice -- if any -- that advantage causes, and also the existence of countervailing factors, before deciding -- in the first instance -- on the propriety of curing, ex post, the jurisdictional defect. In that way the appellate court can review the decision below with the benefit of the findings made by the lower court. Similarly, and even more clearly, when an appellate court cannot, because of the limitations of its institutional competence, readily determine whether or not a jurisdictional defect created a tactical advantage, it should, under Newman-Green, remand that question to the district court.


42
In the case at bar, SCS/Swid claims that it has suffered prejudice as a result of Skadden's inclusion in the lawsuit. Although these assertions may in the end turn out to be meritless, and although we do not, of course, adjudicate their merits here, we cannot dismiss them as frivolous. Accordingly, and in light of the Supreme Court's admonition to caution, we remand the question of the existence of jurisdiction to the district court for further proceedings as described by Newman-Green.


43
This conclusion is not altered by the fact that SCS/Swid failed to raise the question of federal jurisdiction until after the jury had rendered its verdict. In this regard, the case at bar must be distinguished from the line of cases founded on Grubbs v. General Electric Credit Corp., 405 U.S. 699 (1972), which address the question whether the improper removal of a suit to federal district court requires vacating the judgment of that court. In Grubbs, the Supreme Court held that "the validity of the removal procedure... may not be raised for the first time on appeal." Id. at 700. Although the district court concluded, based on this precedent, that SCS/Swid's delay in raising the question of diversity jurisdiction estopped it from asserting its challenge when it did, Grubbs does not support this result.


44
We believe that the Grubbs holding is not so sweeping as the district court's treatment of it suggested. In deciding Grubbs, the Supreme Court emphasized that the district court properly had federal jurisdiction over the case at the time it entered its judgment. As a result, the issue in Grubbs was not the existence of federal jurisdiction, per se, but the propriety of the removal procedure employed. Indeed, the parties in Grubbs conceded that original federal jurisdiction would have been present had the case been filed directly in federal court. Id. at 704. The Supreme Court underscored this distinction when it noted that "where after removal a case is tried on the merits without objection and the federal court enters judgment, the issue in subsequent proceedings on appeal is not whether the case was properly removed, but whether the federal district court would have had original jurisdiction of the case had it been filed in that court." Id. at 702. Accordingly, Grubbs does not stand for the proposition that a party may be estopped from raising objections to federal jurisdiction itself if these objections are not timely presented to the district court, but only for the notion that where original federal jurisdiction would exist a party is estopped from raising objections to removal unless they are presented below.13 This proposition, as the Fifth Circuit has commented, does no more than "simply apply[] the rule that objections to procedure are deemed to be waived unless brought to the district court's attention." Paxton v. Weaver, 553 F.2d 936, 942 (5th Cir. 1977).


45
In the case at bar, and in contrast to Grubbs, Herrick chose the federal forum in which to file its initial complaint, and SCS/Swid is not merely attacking a removal procedure but is instead challenging precisely what Grubbs assumed, namely that the pre-requisites for federal jurisdiction obtain. Under these circumstances, the fundamental principle that the limits on federal subject matter jurisdiction cannot be waived, and may be challenged at any time, governs. See Curley, 915 F.2d at 83.


46
This does not mean, however, that delay in raising such a challenge may not be relevant to the question of whether the challenger can claim prejudice. Indeed, timeliness may well constitute a countervailing factor to the argument that the other party obtained a tactical advantage from the existence of improper jurisdiction. But that is, of course, a very different question from that of waiver or estoppel.

III. CONCLUSION

47
We hold that because Herrick has failed to establish specific domestic domiciles for Skadden's partners from which SCS/Swid's allegations of foreign domiciles might represent a departure, McNutt rather than Desmare governs this case. Accordingly the burden of persuasion concerning diversity rests with Herrick, the party asserting the subject matter jurisdiction of the federal courts. We also hold that Herrick has failed, on present evidence, to carry this burden, so that it must be assumed that Herrick and Skadden are not diverse and that there is no independent source of federal jurisdiction over Herrick's lawsuit against Skadden. We conclude, therefore, that the district court could not properly exercise supplemental jurisdiction either over Herrick's initial lawsuit against Skadden or over the narrower subject matter of the settlement involving Skadden. We further hold that by asserting supplemental jurisdiction over the settlement, the district court has for jurisdictional purposes retained Skadden as a party to the lawsuit right through this appeal, so that the defect in federal subject matter jurisdiction caused by Skadden's participation in the case perdures. Finally, we hold that although the federal courts may in appropriate instances cure jurisdictional defects of the type presented by Skadden (by dismissing jurisdictionally improper parties pursuant to Fed. R. Civ. P. 21), and may do so even on appeal, this power may be exercised by appellate courts only when the presence of the jurisdictional spoiler has clearly not prejudiced the remaining parties in the lawsuit.


48
Because we conclude that we cannot, on the facts of this case, determine whether or not SCS/Swid has been prejudiced by Skadden's presence in the suit, we are unable to decide the ultimate jurisdictional question presented by the case ourselves but must instead remand the case to the district court for further consideration of this question. And because we neither have jurisdiction over the case as it stands before us nor can cure the jurisdictional defect ourselves, we do not reach any of the parties' remaining contentions on appeal.


49
On remand, the district court should decide whether to conduct a further evidentiary hearing concerning Skadden's domicile in order to determine whether Skadden is in fact diverse. If the district court decides that Skadden is not diverse, the court should determine whether SCS/Swid was unduly prejudiced by Skadden's participation in the lawsuit and consequently whether jurisdiction over the main case may be salvaged, under Newman-Green, by eliminating the court's previously asserted jurisdiction over the settlement involving Skadden. If jurisdiction is salvaged, the district court may then consider whether it should appropriately reinstate part or all of its prior judgment.


50
Accordingly, the judgment below is VACATED and the case is REMANDED for further proceedings consistent with this opinion.



NOTES:


1
  The parties also made a series of further requests for summary judgment that are not relevant to the present appeal. Specifically, the business defendants asked for summary judgment dismissing the fraud claim against them; plaintiffs asked for summary judgment against Skadden on all claims; and Skadden asked for summary judgment dismissing the claims alleging that they had breached their fiduciary duty and that they substantially assisted the business defendants' breach of the Letter Agreement.


2
  In addition, the district court granted the business defendants' motion for summary judgment dismissing the claim against them for fraud; denied plaintiffs' motion for summary judgment against Skadden; denied Skadden's motion for summary judgment dismissing the claim against them for breach of fiduciary duty; and granted Skadden's motion for summary judgment dismissing the claim against them for substantially assisting the business defendants' breach of the Letter Agreement.


3
  The district court explained these denials in a Revised Opinion and Order filed on August 27.


4
  Although, as we have remarked, we are "unclear as to Congress's rationale for not granting United States citizens domiciled abroad rights parallel to those it accords to foreign nationals, the language of § 1332(a) is specific and requires the conclusion that a suit by or against United States citizens domiciled abroad may not be premised on diversity." Cresswell, 922 F.2d at 68.


5
  In Gutierrez, there was no question that the plaintiff's initial domicile was New Jersey, and the argument focused exclusively on whether it had been adequately established that the plaintiff had abandoned this domicile and established a new one in New York. See id., 141 F.3d at 428-29.


6
  This sort of jurisdiction had variously been called ancillary or pendent. "Traditionally, ancillary jurisdiction refers to joinder, usually by a party other than the plaintiff, of additional claims and parties added after the plaintiff's claim has been filed. It is mainly a tool for defendants and third parties whose interests would be injured if their jurisdictionally insufficient claims could not be heard in an ongoing action in federal court. Pendent jurisdiction traditionally refers to the joinder of a state-law claim by a party already presenting a federal question claim against the same defendant." Baylis v. Marriott Corp., 843 F.2d 658, 663-64 (2d Cir. 1988) (internal citations omitted). Nevertheless, 28 U.S.C. § 1367, enacted in 1990, has effectively eliminated any difference that might once have existed between ancillary and pendent jurisdiction. Indeed, as the Supreme Court said long before the adoption of § 1367:
Given the complexities of the many manifestations of federal jurisdiction... there is little profit in attempting to decide... whether there are any "principled" differences between pendent and ancillary jurisdiction.
Aldinger v. Howard, 427 U.S. 1, 13 (1976).


7
  The supplemental jurisdiction statute, 28 U.S.C. § 1367, also expressly recognizes the limit the requirement of complete diversity imposes on the scope of the supplemental jurisdiction the statute creates. Although as a general matter, § 1367 expands supplemental jurisdiction to all claims and all parties that are part of the same constitutional case over which there exists independent federal jurisdiction, see 28 U.S.C. § 1367(a), it retains, as one of several exceptions to this broad rule, the principle that in cases in which original federal jurisdiction is founded solely on diversity, there shall generally be no supplemental jurisdiction over claims by plaintiffs against persons made parties under Fed. R. Civ. P. 14, 19, 20 or 24 "when exercising supplemental jurisdiction over such claims would be inconsistent with the jurisdictional requirements of section 1332," including the requirement of complete diversity. See 28 U.S.C. § 1367(b). For an interesting discussion of the legislative history of § 1367 and of problems arising under this statute, see Christopher M. Fairman, Abdication to Academia: The Case of the Supplemental Jurisdiction Statute, 28 U.S.C. § 1367, 19 Seton Hall Legis. J. 157.


8
  There do exist some precedents allowing supplemental jurisdiction even in cases in which complete diversity is lacking. See, e.g., Krippendorf v. Hyde, 110 U.S. 276 (1884) (allowing ancillary jurisdiction to be exercised when the pendent party asserted a claim on contested assets within the federal court's exclusive control); Local Loan Co. v. Hunt, 292 U.S. 234, 239 (1934) (allowing ancillary jurisdiction when necessary to give effect to the federal court's judgment); Moor v. County of Alameda, 411 U.S. 693, 714-15 (1973) (allowing ancillary jurisdiction over pendent parties brought into an action through a compulsory counter-claim under Fed. R. Civ. P. 13(a) & (h)). But these cases typically involve "claims by a defending party haled into court against his will, or by another person whose rights might be irretrievably lost unless he could assert them in an ongoing action in a federal court." Kroger, 437 U.S. at 376. (The effect of Kroger was therefore "to limit ancillary jurisdiction primarily to claims asserted by parties in a defensive posture, or who did not choose the federal forum. Therefore, at least in diversity cases, ancillary jurisdiction usually is not available for claims asserted by the plaintiff." 13 Wright, Miller and Cooper, Federal Practice and Procedure, 104.) By contrast, a plaintiff like Herrick "cannot complain if ancillary jurisdiction does not encompass all of his possible claims in a case such as this one, since it is he who has chosen the federal rather than the state forum and must thus accept its limitations." Kroger, 437 U.S. at 376.


9
  The parties disagree vigorously about when the settlement took effect, and consequently about when this approach to salvaging federal jurisdiction became available to Herrick. Specifically, Herrick contends that the settlement took effect on January 15, 1999, when it was announced and the district court stated, in open court, that it would "consider the case terminated" as to the settling defendants (Skadden included). SCS/Swid, by contrast, claims that the settlement took effect only on July 23, 1999, when the stipulation and order of dismissal of Herrick's case against the settling defendants was entered pursuant to Fed. R. Civ. P. 41(a). If the settlement did, in fact, remove Skadden from the case for jurisdictional purposes and did thereby cure the defect in federal jurisdiction, then it might become necessary to resolve this question because the federal courts' capacity for curing jurisdictional defects may depend upon how early in the history of a case the defects are addressed. See infra. But because we conclude that Skadden remains a jurisdictional spoiler to the main lawsuit despite the settlement, we need not decide, at this time, when the settlement took effect.


10
  This reasoning is consistent with the Supreme Court's conclusion, in Caterpillar, Inc. v. Lewis, 519 U.S. 61 (1996), discussed infra, that the pre-trial dismissal, pursuant to a settlement, of the only non-diverse defendant in that case meant that all jurisdictional defects in the case had been cured by the time of judgment. The record in Caterpillar reveals that when the district court dismissed the settling defendant, it neither retained jurisdiction over the settlement nor "so-ordered" the settlement's terms as part of the dismissal. See Order of the United States District Court for the Eastern District of Kentucky in Lewis v. Caterpillar Inc., Cv. 90-84 (June 8, 1993); Order of the United States District Court for the Eastern District of Kentucky in Lewis v. Caterpillar Inc., Cv. 90-84 (Sept. 24, 1990). Accordingly, by the settlement, the non-diverse defendant was genuinely eliminated from the case.


11
  Similarly, allowing a settlement (that maintains the court's supervisory powers) to cure problems of incomplete diversity would have the effect of allowing the settling defendant to waive any objection to diversity jurisdiction and would therefore violate the principle that subject matter jurisdiction is "an unwaivable sine qua non for the exercise of federal judicial power." Curley, 915 F.2d at 83.


12
  At several other places in the Caterpillar opinion, the Supreme Court emphasized that the jurisdictional defect in that case had been cured "prior to trial," or by "the time of trial and judgment," or "before the trial commenced." Id. at 67, 73. Because we have concluded, see supra, that the jurisdictional defect in the case at bar survives through the present appeal, we need not address this discrepancy between the Supreme Court's broader statement of its holding and the narrower focus implicated by the facts of the case.
We note, furthermore, that this conclusion saves us from having to address the complicated question, hotly disputed by the parties, of the proper application of the (possibly uncertain) Caterpillar rule to the (unquestionably confusing) circumstances of this case. In particular, we need not decide whether the involved procedural history of this case (the January 15 settlement announcement, February 25 entry of judgment, April 13 vacatur of that judgment, July 23 Stipulation and Order dismissing the settling parties, and August 12 entry of a new judgment as to the remaining parties) should be interpreted as creating a settlement before or after the entry of judgment.


13
  Indeed, in discussing the principle that where original federal jurisdiction over a case would have been proper, a defendant who has removed the action is estopped from protesting that there was no right to removal, Grubbs noted, citing American Fire & Casualty Co. v. Finn, 341 U.S. 6, 17 (1951), that this principle has "no application to a case where at the time of judgment citizens of the same State were on both sides of the litigation." Grubbs, 405 U.S. at 704.


