What does falsification look like anyway?

Vulcan vs Neptune

There’s an argument that plays out every so often in linguistics that goes as follows:

Critic: This data falsifies theory T.
Proponent: Not necessarily, if you consider arguments X, Y, and Z.
Critic: Well, then theory T seems to be unfalsifiable!

This is obviously a specious argument on the part of the critic, since unfalsified does not entail unfalsifiable, but I think it stems from a very understandable frustration: theorists often have an uncanny ability to wriggle free of data that appears to falsify their theories, even though falsificationism is assumed by a large majority of linguists. The problem is that the logic of falsificationism, while being quite sound, maybe unimpeachable, turns out to be fiendishly difficult to apply.

At its simplest, the logic of falsificationism says that a theory is scientific insofar as one can construct a basic statement (i.e., a statement of fact) that would contradict the theory. This, of course, is an oversimplification of Karl Popper’s idea of Critical Rationalism in a number of ways. For one, falsifiability is not an absolute notion. Rather, we can compare the relative falsifiability of two theories by looking at what Popper calls their empirical content: the number of basic statements that would contradict them. So if a simple theoretical statement P has a particular empirical content, then the conjunction P & Q will have a greater empirical content, and the disjunction P v Q will have a lesser empirical content. This is a useful heuristic when constructing or criticizing a theory internally, and seems like a straightforward guide to testing theories empirically. Historically, though, this is not the case, largely because it is often difficult to recognize when we’ve arrived at and accurately formulated a falsifying fact. In fact, it is often, maybe always, the case that we don’t recognize a falsifying fact as such until after one theory has been superseded by another.
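The comparison between P, P & Q, and P v Q can be checked on a toy model. The sketch below is only an illustration, not Popper’s own formalism: it assumes that a “basic statement” is a complete truth assignment to two atomic propositions, and the names `WORLDS`, `falsifiers`, and `empirical_content` are made up for this example.

```python
from itertools import product

# Assumption for illustration: a "basic statement" is a full truth
# assignment to the atoms P and Q. A theory is a function that says
# whether a given assignment is compatible with it.
ATOMS = ("P", "Q")
WORLDS = [dict(zip(ATOMS, values))
          for values in product([True, False], repeat=len(ATOMS))]


def falsifiers(theory):
    """Return the assignments that contradict the theory."""
    return [world for world in WORLDS if not theory(world)]


def empirical_content(theory):
    """Count the assignments that would falsify the theory."""
    return len(falsifiers(theory))


theories = {
    "P": lambda w: w["P"],
    "P & Q": lambda w: w["P"] and w["Q"],
    "P v Q": lambda w: w["P"] or w["Q"],
}

for name, theory in theories.items():
    print(name, empirical_content(theory))
# P 2
# P & Q 3
# P v Q 1
```

In this toy setting the conjunction forbids the most assignments and the disjunction the fewest, which is the ordering described above. What the sketch cannot capture is the hard part: deciding whether an observed fact really is one of the falsifiers.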

Take for instance the case of the respective orbits of Mercury and Uranus. By the 19th century, Newtonian mechanics had allowed astronomers to make very precise predictions about the motions of the planets, and based on those predictions, there was a problem: two of the planets were misbehaving. First, it was discovered that Uranus, then the last known planet from the sun, wasn’t showing up where it should have been. Basically, Newton’s mechanics predicted that on such-and-such a day and time Uranus would be in a particular spot in the sky, but the facts were otherwise. Rather than cry “falsification!”, though, the astronomers of the day hypothesized an object on the far side of Uranus that was affecting its orbit. One such astronomer, Urbain Le Verrier, was even able to work backwards and predict where that object could be found. So in September of 1846, armed with Le Verrier’s calculations, Johann Gottfried Galle was able to observe an eighth planet: Neptune. Thus, an apparent falsification became corroboration.

Urbain Le Verrier (1811-1877)
Johann Galle (1812-1910)

I’ve previously written about this story as a vindication of the theory-first approach to science. What I didn’t write about, and what is almost never discussed in this context, is Le Verrier’s work on the misbehaving orbit of Mercury. Again, armed with Newton’s precise mechanics, Le Verrier calculated the Newtonian prediction for Mercury’s orbit, and again[1] Mercury didn’t behave as expected. Again, rather than throw out Newtonian mechanics, Le Verrier hypothesized the planet Vulcan between Mercury and the sun, and set about trying to observe it. While many people claimed to observe Vulcan, none of these observations were reliably replicated. Le Verrier was undeterred, though, perhaps because observing a planet that close to the sun was quite tricky. Of course, it would be easy to paint Le Verrier as an eccentric (indeed, his Vulcan hypothesis is somewhat downplayed in his legacy), but he doesn’t seem to have been treated so by his contemporaries. The Vulcan hypothesis wasn’t universally believed, but neither does it seem to have been the Flat-Earth theory of its day.

It was only when Einstein used his General Theory of Relativity to accurately calculate Mercury’s orbit that the scientific community seems to have abandoned the search for Vulcan. Mercury’s orbit is now considered a classical successful test of General Relativity, but why don’t we consider it a refutation of Newtonian Mechanics? Strict falsificationism would seem to dictate that, but then a strict falsificationist would have thrown out Newtonian Mechanics as soon as we noticed Uranus misbehaving. So, falsificationism of this sort leads us to something of a paradox: if a single basic statement contradicts a theory, there’s no way of knowing if there is some second basic statement that, in conjunction with the first, could save the theory.

Still, it’s difficult to toss out falsification entirely, because a theory that doesn’t reflect reality may be interesting but isn’t scientific.[2] Also, any reasonable person who has ever tried to give an explanation of any phenomenon probably rejects most of their own ideas rather quickly on empirical bases. We should instead adopt falsificationism as a relative notion and use it when comparing multiple theories. So, Le Verrier was ultimately wrong, but acted reasonably: he had a pretty good theory of mechanics, so he worked to reconcile it with some problematic data. Had someone developed General Relativity in Le Verrier’s time, then it would have been unreasonable to insist that a hypothesized planet was a better explanation than an improved theory.

Returning to the hypothetical debate between the Critic and the Proponent, then, I think a reasonable albeit slightly rude response for the proponent would be “Well, do you have a better theory?”

References
1 Technically though, Le Verrier’s work on Mercury predated his work on Uranus
2 Though sometimes, theories which seem to be empirically idle end up being scientifically important (cf. non-Euclidean geometry)

31 Comments
Omer Preminger
17 days ago

I think there is an important difference between the Vulcan/Mercury scenario, and the kind of minimalist theorizing that often comes under the kind of criticism you touch on here. With respect to the orbit of Mercury, the two theories in question are:

1. Vulcan exists, and there is a variegated set of explanations for the unreliability in observing it.

2. General Relativity.

For quite a while, (2) wasn’t available, and so, for that period of time, (1) was probably the best available theory.

But these must be contrasted with (3):

3. The perturbations in Mercury’s orbit violate Newtonian mechanics, therefore I need not concern myself with them.

Now, obviously, (3) is fair game insofar as no scientific theory is a Theory of Everything, and there is no intrinsic moral value, positive or negative, in heuristically ignoring a given set of data. But – and this, I think, is the part that gets lost by most of minimalism’s defenders – (3) is *not* a player in the arena in (1-2). That is: in discussions of the orbits of Mercury, (3) is not a contender, and its existence is not an argument against (1).

To take an example from within linguistics, consider the theory of Abstract Case. Let us take for granted that Abstract Case is a no-go in accounting for the morphosyntax of Basque, Icelandic, Sakha, and Shipibo. Proponents of Abstract Case are well within their rights to ignore these data if they want to! But the theory of Abstract Case is not a player in the field of theories of these data, and its existence is irrelevant to adjudicating among theories that do have something to say about them. At best, the theory [Abstract Case plus a very long series of construction-specific stipulations] is a contender (arguably, one that is roughly on a par with (1)). And it is not difficult to imagine why people prefer Marantz’s theory to that one.

Omer Preminger
17 days ago
Reply to  Dan Milway

The issue is a bit more subtle. It is true that there is a version of the theory of Abstract Case that makes no claims whatsoever about the morphological form that nominals take. (Fun fact: when I recently pointed this out in a draft, a reviewer – an otherwise very nice one! – thought this to be polemical criticism of that theory.) Let’s assume we’re talking about that version of the theory. If so, then lots of the more well-known data from, say, Icelandic is indeed irrelevant. (And some data from Icelandic, e.g. obligatory A-movement under passivization, can even be seen as validation of the theory in question.) However, I included Basque in that list for a reason. As I’ve discussed on the Faculty of Language blog, Basque provides us with counterevidence even to this weaker version of the theory. (See also pp. 929-930 of this paper.) In a nutshell, we can show that datives are interveners in A-relations targeting the absolutive argument, and that datives are located higher than absolutives (in the construction in question), meaning even v is not a viable candidate for assigning the absolutive its abstract case (because it’s too high, the dative intervening between the two).

So, to finally answer your question: no, Abstract Case theory is not silent on the data I referenced as a whole. Abstract Case Theory is to the morphosyntax of these languages as Newtonian Mechanics is to… the orbit of Mercury.

(NB: This dative-absolutive configuration in Basque is not the only such problem for Abstract Case theory in the four languages mentioned. There’s at least one different but equally problematic configuration in Icelandic, and at least two in Sakha. They’re just a bit more complicated to lay out, so I went with the Basque one.)

AJD
17 days ago

The remark “theorists often have an uncanny ability to wriggle free of data that appears to falsify their theories” reminds me of the linguist version of the “How do you prove all odd numbers are prime?” joke:

How does a linguist prove all odd numbers are prime?
“3 is prime, 5 is prime, 7 is prime, 9… hmm, yeah, I can get 9 to be prime.”

Omer Preminger
17 days ago
Reply to  AJD

The (or a) more complete version of that joke:

A mathematician, an engineer, and a linguist are discussing whether all odd numbers are prime.

The mathematician says, “3 is prime, 5 is prime, 7 is prime… yeah I’m pretty sure I can prove this by induction.”

The engineer says, “3 is prime, 5 is prime, 7 is prime, 9… that’s gotta be a sampling error, 11 is prime, 13 is prime – see?”

The linguist says, “3 is prime, 5 is prime, 7 is prime, 9… 9. 9 (with emphasis). *pauses* yeah, I think I can get 9 as prime.”

At that moment, another linguist walks into the room asking them all what they’ve been talking about. They bring the second linguist up to speed, explaining that they’re currently debating whether 9 is prime. The first linguist says, “I think it is!”

To this, the second linguist responds: “I’m not sure I can get 9 as prime, but there’s definitely a contrast with 6.”

David Marjanović
17 days ago

“Mercury’s orbit is now considered a classical successful test of General Relativity, but why don’t we consider it a refutation of Newtonian Mechanics?”

Uh… we do?

Newtonian mechanics is wrong. It’s a good enough approximation for a lot of applications, so we keep using it because it’s so much simpler mathematically – but it is wrong.

But the point holds that a lot of parsimony is hidden in falsification. For example, we might not yet have noticed a Vulcan composed entirely of dark matter; that’s just a less parsimonious hypothesis than that it simply doesn’t exist.

David Marjanović
16 days ago
Reply to  Dan Milway

Oh, that’s what you mean. It’s *a* falsification of Newtonian mechanics (along with countless others, like every use of GPS). Whether it’s *the* historically significant one, I don’t even know. (I think that’s usually believed to be Eddington’s measurement of how the sun’s gravity bends light – which later turned out to have an error margin too great for that purpose…)

Yes, parsimony is necessarily relative; a hypothesis can’t just be parsimonious, it has to be more or less parsimonious than another.

I’m not in philosophy myself, I’m in phylogenetics where parsimony is basically all we can do anyway, so I haven’t encountered a lot of actual falsificationist arguments…

Omer Preminger
15 days ago
Reply to  Dan Milway

The current stance within orthodox minimalist circles strikes me less like an adherence to Newtonian mechanics in the face of the Mercury/Vulcan problem, and more like the many folks who refused to accept Newton’s theory in the first place on the grounds that it required “action at a distance” (=gravity), something ruled out by their prior theoretical commitments. (We could call it the Strong Contactist Thesis.)

Omer Preminger
15 days ago
Reply to  Dan Milway

Is it beside the point, though? I don’t think so. Your post starts with a (hypothetical) dialogue explicitly labeled as “in linguistics”; and it ends as follows:

I think a reasonable albeit slightly rude response for the proponent would be “Well, do you have a better theory?”

Assumptions like the Strong Contactist Thesis (jokingly labeled so in my previous comment, but an entirely real facet of contemporary theoretical convictions in Newton’s time) are baked into what the proponent presumably envisions when they say “a better theory.” And, assuming we all agree that in the harsh light of hindsight, the Strong Contactist Thesis was an impediment to theoretical progress insofar as it impeded the adoption of Newton’s proposals, I think my point is in fact directly relevant to your point: it highlights the folly of the proponent’s proposed reply.

(As a side note, there is of course active work trying to understand gravity *without* action at a distance, showing that the Strong Contactist Thesis was not, in fact, bulls**t. But the lesson I take from this is that things like the Strong Minimalist Thesis can be sensible while, simultaneously, it can still be a bad idea to apply them to evaluate what counts as a “better” theory at a given point in the development of a science.)

Omer Preminger
15 days ago
Reply to  Dan Milway

And my point is that “Is X a better theory than Y?” is just as muddy a question as “Does data D count as a falsification of theory X?” – the note on which the original post ends makes it seem as if this is not the case.

Re: my thesis, the predictions (and what would falsify them) are discussed therein. For example: an instance of failed agree that did not involve a counterfeeding relation with movement and nevertheless caused a crash would be a problem for this theory. The very data discussed in the thesis contains one very prominent example of precisely this (the so-called Person Licensing Condition effects), and the thesis goes into a discussion of to what extent this can be viewed as a falsification! Another thing it rules out, for example, is a theory where canonical subjects can bear only a proper subset of the set of available cases in the language, but it’s a *different* proper subset than the set of cases that can stand on the other side of an agreement relation with a verb. (I’m not aware of any violations of this prediction, btw.) And there’s more where that came from.

In any event, my thesis is not a great example to use in this discussion, for the following reason. The point of departure for my thesis is that Chomsky’s crash-based theory of Agree is probably a “better theory” – that is to say, I concede more or less from the start that that theory, could it have been made to work, would be the one we’d want – and the thesis is a treatise on why it doesn’t work. That is: I explicitly say Chomsky’s theory rules out more things than mine does, it’s just that the data speak really loudly in favor of the latter. Now, of course, if my theory ruled out *nothing*, then it wouldn’t be much of a theory. But thankfully, there’s plenty it still rules out.

Omer Preminger
14 days ago
Reply to  Dan Milway

Doesn’t the story of (what I jokingly called) the Strong Contactist Thesis show that that’s not the case? The people who subscribed to the SCT took Newton’s proposal to be obviously worse, since, whatever its empirical merits, it violated the SCT. In hindsight, we view Newton’s theory as more explanatory, not less, and that’s *even* if we put some stock in the SCT (which Newton’s theory violated). This seems very far from extreme clarity, to me. But maybe I’m missing something.

Dmitrii Maksimovich Zelenskii
4 days ago
Reply to  Omer Preminger

Does “different” here mean “including cases which are not in the former subset”? Because if we’re going with the literal, technical definition of different, quirky case is a violation if, for whatever reason, at least one case is not found as quirky (I think Russian qualifies? Datives and accusatives can be canonical subjects in at least some idiolects and constructions (%Меня от себя тошнит ‘I am sick of myself’; %Мне себя жалко ‘I feel sorry for myself’), but, whatever tricks you pull, prepositional case will not be a subject argument for the obvious reason hidden in its name. Agreement, on the other hand, is always to nominative. So the former set is {NOM, DAT, ACC} (and maybe something else but definitely not PREP), the latter is {NOM}, and both are different proper subsets of {NOM, ACC, GEN, DAT, INSTR, PREP, LOC} (or whatever you believe to be the right case set for Russian). Full disclosure: I don’t quite remember the whole set of arguments for canonicity of a subject which sets Icelandic apart from German and Russian, only the anaphor test, so maybe Russian won’t qualify after all).

(To be fair, I wouldn’t even expect an agreement theory to make predictions about sets of cases, given that most of them are basically bound prepositions. Tangentially related – while Deal (if I understood her comment correctly) shows some list of people who don’t buy the Abstractness Gambit, it can be considered a winner among those who cling to abstract Case: “Case” is a misnomer gifted to us by Vergnaud, and the feature is probably something verbal (like T, Pesetsky & Torrego-style?).)

Last edited 4 days ago by Dmitrii Maksimovich Zelenskii
Omer Preminger
4 days ago

A few quick points:

1. I don’t think in the end, Russian has what would qualify as “quirky subjects” in the sense that Icelandic does. (At last count, I think there were 17 different ways in which these Icelandic datives/accusatives/genitives behaved as subjects, with the ability to antecede subject-oriented anaphors being just one of the 17.)

2. Of course, “quirky subject” is a piece of terminology, and we could go back and forth about what the necessary and sufficient conditions should be to declare something a member of this class. Ethan Poole has a conference paper from a while ago called “Deconstructing Quirky Subjects” which is the best attempt I know of to rigorously approach that project.

3. Bound prepositions (as you suggest for Russian) are explicitly set aside for what counts as the “relevant set of cases” for agreement and subjecthood, in the context of the prediction I was discussing.

4. I personally think the Abstractness Gambit is suspect on a couple of fronts. You talked about one (the one that is being discussed in the FoL post you linked to). But another one is the fact that so-called “morphological case” (i.e., the kind Marantz’s/Bobaljik’s theories are about) is itself *abstract*, in the sense that it cannot possibly be a theory of case forms; see e.g. pages 3-4 (“the third point”) in lingbuzz/005463 for a brief exposition of this. Of course, there is no a priori reason why there couldn’t be these *two* abstract systems running alongside each other, especially if one of them is “Case” only in name. But it’s at least worth pausing to appreciate that that’s indeed what is on the table.

Dmitrii Maksimovich Zelenskii
3 days ago
Reply to  Omer Preminger
  1. As I said, I don’t remember the other 16 😉
  2. A theory that treats a theoretical element (be it called “quirky subject”, “canonical subject”, or whatever) as a list of pluses and minuses in a table – especially a seventeen-element list – would be deeply unsatisfying; (ETA: cut off at the wrong time, now I look like a moron who hasn’t read the second sentence of your point 2… I meant to add something like “we want to know what the properties – and their clustering if there is any – stem from, and perhaps Poole achieves that”.)
  3. OK, then… do you believe there is a syntactic sense in which English “with” is a preposition but, say, Russian or Sanskrit instrumental is not? Because, the further I look, the less evidence I find for that.
  4. I mean, yes, morphological case is not about phonological exponents per se, not even in their UR, that much is clear. However, this does not make it any more abstract than, say, its neighbor, inflection class (or than prepositions with varying exponents). Inflection class is known to have some correlation with gender; however, I don’t see much suspicion about having gender in syntax sufficiently divorced from morphological class. Relation between m-case and “Case” is even more tangential (because of numerous empirical problems to make it not tangential you know better than I do). So yes, this is what is on the table – having a morphological system (even if it is in syntax in the wider sense suggested by Y-model, it is evidently a system quite different from something akin to Agree, features and all that stuff) alongside a core syntactic system which are kinda somewhat related sometimes but maybe not.
Last edited 3 days ago by Dmitrii Maksimovich Zelenskii
Omer Preminger
3 days ago

3. This is indeed a subtle issue, but I’m not as sure as you are (or seem to be?) that the issue is hopeless. So, for example, there is the issue of whether the suspected {case marker / adposition} is obligatory on each conjunct in a coordination, or it can be omitted (or live “outside” the coordination, so to speak). Now of course, one might question to what extent this is a valid diagnostic, or if it is, what exactly it diagnoses. The phenomenon known as “suspended affixation” in Turkic languages suggests that we might expect lots of crosslinguistic variation, regardless of the P vs. case issue, about what can and cannot be “stranded” outside of a coordination. Nevertheless, it does seem that this particular diagnostic provides a useful cut between “case suspects” and “adposition suspects”.

Dmitrii Maksimovich Zelenskii
3 days ago
Reply to  Omer Preminger

I am very much afraid this diagnostic will turn up something very general like “truly bound affixes (not clitics) are not subject to ellipsis”. Ditto for another plausible diagnostic, namely “there is – in some languages at least – agreement in case but not agreement in prepositions” (not quite true: Old Russian texts, for one, show duplicated prepositions quite often, and it seems to have also been the case in some other old IE languages).

More importantly, though, I do not see whether we can really gain anything by setting up the distinction – whether we have different expectations for external syntactic behaviour of “with a knife” as opposed to its Russian translation “ножом” (or, to put it within one language, between German “dem Messer” and “mit dem Messer”). Do you expect to find something like that?

Omer Preminger
3 days ago

Well, I think we have one potential example right here. Predictions like the one above (concerning the relation between “subject cases” and “agreement cases”) only apply to cased DPs, not PPs. So Icelandic with-phrases are not part of the prediction, but datives are… (The prediction could be false, of course, but the difference between with-phrases and datives has this consequence for what the prediction even is.)

Dmitrii Maksimovich Zelenskii
3 days ago
Reply to  Omer Preminger

Yes – and this, for me, is a huge reason why this prediction is suspect. Though I would need to see the reason for this prediction as right now I can’t even understand why it works – and why on specific cases rather than on Bobaljik-Marantz unmarked-dependent-lexical case hierarchy.

Omer Preminger
3 days ago

One person’s modus ponens is another’s modus tollens, I suppose. That is: the prediction – on the terms I’ve stated above – seems to hold exceptionlessly, suggesting (to me at least) that there’s something to this way of carving up the empirical picture.

Dmitrii Maksimovich Zelenskii
3 days ago
Reply to  Omer Preminger

If you are talking about (205) in Preminger 2014, the suggested examples seem to be on unmarked < dependent < lexical scale rather than on the whole language-particular case set – and that does not seem to require that (at least lexical) cases are distinct from prepositions. Unless we have (the relevant kind of) φ-agreement with PPs?

Omer Preminger
3 days ago

Well, the prediction can only be as fine-grained as the case-hierarchy it sits atop. But I’ll note that others (Caha, Demirok, McFadden, Zompi) have shown the hierarchy you mentioned can be further articulated to capture distinctions within the third (“lexical”) member. But no, PPs are excluded by definition (and regardless of whether they are agreed with), with reference to the kinds of diagnostics I alluded to earlier.

Dmitrii Maksimovich Zelenskii
3 days ago
Reply to  Omer Preminger

At least Caha’s approach seems to explicitly require (“functional”) prepositions and cases to be the same heads syntactically though, judging by his dissertation. (Also, since you mentioned Caha, I don’t buy the Anchor Condition – how bizarre a coincidence is it that all cases where he needs it to actually do anything are instrumental on top of the dependent case?)

Dmitrii Maksimovich Zelenskii
3 days ago

I wholeheartedly agree with your general gist: data don’t falsify by themselves, but different theories may still be compared by how they handle the data. However, if we apply the idea in the post as described, a question looms: is it true that if there is only one fighter in the ring, the fighter always wins? Or, to give a more realistic example, is any theory with some degree of generality and simplicity better than the theory of just having a list of things (or, at least, a list of phases which can have empty places for other phases – essentially a version of Construction Grammar)? Common sense suggests the answer is no. But why not? Can we formulate it in a practical, not philosophy-of-science-based way?