Piantadosi and MLMs again (I)

Last spring, Steven Piantadosi, professor of psychology and neuroscience, posted a paean to Modern Language Models (MLMs) entitled Modern language models refute Chomsky’s approach to language on LingBuzz. This triggered a wave of responses from linguists, including one from me, pointing out the many ways that he was wrong. Recently, Prof. Piantadosi attached a postscript to his paper in which he responds to his critics. The responses are so shockingly bad that I felt I had to respond—at least to those that stem from my critiques—which I will do, spaced out across a few short posts.

In my critique, I brought up the problem of impossible languages, as did Moro et al. in their response. In addressing this critique, Prof. Piantadosi surprisingly begins with a brief diatribe against “poverty of the stimulus.” I say surprisingly not because it’s surprising for an empiricist to mockingly invoke “poverty of stimulus” (much in the same way as creationists mockingly ask why there are still apes if we evolved from them), but because poverty of stimulus is completely irrelevant to the problem of impossible languages, and neither I nor Moro et al. even use the phrase “poverty of stimulus.”[1]

This irrelevancy expressed, Prof. Piantadosi moves on to a more on-point discussion. He argues that it would be wrong-headed to encode the constraints that make some languages impossible into our model from the start. Rather, if we start with an unconstrained model, we can discover the constraints naturally:

If you try to take constraints into account too early, you might have a harder time discovering the key pieces and dynamics, and could create a worse overall solution. For language specifically, what needs to be built in innately to explain the typology will interact in rich and complex ways with what can be learned, and what other pressures (e.g. communicative, social) shape the form of language. If we see a pattern and assume it is innate from the start, we may never discover these other forces because we will, mistakenly, think innateness explained everything

p36 (v6)

This makes a certain intuitive sense. The problem is that it’s refuted both by the history of generative syntax and the history of science more broadly.

In early theories, a constraint like “No mirroring transformations!” would have to be stated explicitly. Current theories, though, are much simpler, with most constraints being derivable from the theory rather than tacked onto it.

A digression on scholarly responsibility: Your average engineer working on MLMs could be forgiven for not being up on the latest theories in generative syntax, but Piantadosi is an Associate Professor who has chosen to write a critique of generative syntax, so he really ought to know these things. In fact, he could only fail to know these things through laziness or a conscious choice not to know.

Furthermore, the natural sciences have progressed thus far in precisely the opposite direction to the one Piantadosi prescribes—they have started with highly constrained theories, and progress has generally occurred when some constraint is questioned. Copernicus questioned the constraint that Earth stood still, Newton questioned the constraint that all action was local, and Friedrich Wöhler questioned the constraint that organic and inorganic substances were inherently distinct.

None of this, of course, means that we couldn’t do science in the way that Piantadosi suggests—I think Feyerabend was correct that there is no singular Scientific Method—but the proof of the pudding is in the eating. Piantadosi is effectively making a promise that if we let MLM research run its course we will find new insights[2] that we could not find had we stuck with the old direction of scientific progress, and he may be right—just as AGI may actually be 5 years away this time—but I’ll believe it when I see it.


After expressing his methodological objections to considering impossible languages, Piantadosi expresses skepticism as to the existence of impossible languages, stating “More troubling, the idea of ‘impossible languages’ has never actually been empirically justified.” (p37, v6) This is a truly astounding assertion on his part, considering that both Moro et al. and I explicitly cite experimental studies that arguably provide exactly the empirical justification that Piantadosi claims does not exist. The cited studies present participants with two types of made-up languages—one which follows and one which violates the rules of language as theorized by generative syntax—and observe their responses as they try to learn the rules of the particular languages. The study I cite (Smith and Tsimpli 1995) compares the behavioural responses of a linguistic savant to those of neurotypical participants, while the studies cited by Moro et al. (Tettamanti et al., 2002; Musso et al., 2003) use neuro-imaging techniques. Instead, Prof. Piantadosi refers to every empiricist’s favourite straw-man argument—the alleged lack of embedding structures in Pirahã.

This bears repeating. Both Moro et al. and I expressly point to experimental evidence of impossible languages, and Piantadosi’s response is that no one has ever provided evidence of impossible languages.

So, either Prof. Piantadosi commented on my and Moro et al.’s critiques without reading them, or he read them and deliberately misrepresented them. It is difficult to see how this could be the result of laziness or even willful ignorance rather than dishonesty.

I’ll leave off here, and return to some of Prof. Piantadosi’s responses to my critiques at a later time.

Notes
1 For my part, I didn’t mention it because empiricists are generally quite assiduous in their refusal to understand poverty of stimulus arguments.
2 He seems to contradict himself later on when he asserts that the “science” of MLMs may never be intelligible to humans. More on this in a later post.

The Descriptivist Fallacy

A recent hobby-horse of mine—borrowed from Norbert Hornstein—is the idea that the vast majority of what is called “theoretical generative syntax” is not theoretical, but descriptive. The usual response when I assert this seems to be bafflement, but I recently got a different response—one that I wasn’t able to respond to in the moment, so I’m using this post to sort out my thoughts.

The context of this response was that I had hyperbolically expressed anger at the title of one of the special sessions at the upcoming NELS conference—“Experimental Methods In Theoretical Linguistics.” My anger—more accurately described as irritation—was that, since experiment and theory are complementary terms in science, the title of the session was contradictory unless the NELS organizers were misusing the terms. My point, of course, was that the organizers of NELS—one of the most prestigious conferences in the field of generative linguistics—were misusing the terms because the field as a whole has taken to misusing the terms. A colleague, however, objected, saying that generative linguists were a speech community and that it was impossible for a speech community to systematically misuse words of its own language. My colleague was, in effect, accusing me of the worst offense in linguistics—prescriptivism.

This was a jarring rebuttal because, on the one hand, they aren’t wrong: I was being prescriptive. But, on the other hand, and contrary to the first thing students are taught about linguistics, a prescriptive approach to language is not always bad. To see this, let’s consider the two basic rationales for descriptivism as an ethos.

The first rationale is purely practical—if we linguists want to understand the facts of language, we must approach them as they are, not as we think they should be. This is nothing more than standard scientific practice.

The second rationale is a moral one, stemming from the observation that language prescription tends to be directed at groups that lack power in society—Black English has historically been treated as “broken”, features of young women’s speech (“up-talk” in the 90s and “vocal fry” in the 2010s) are always policed, and rural dialects are mocked. Thus, prescriptivism is seen as a type of oppressive action. Many linguists make it no further in thinking about prescriptivism, unfortunately, but there are many cases in which prescriptivism is not oppressive. Some good instances of prescriptivism—assuming they are done in good faith—are as follows:

  1. criticizing the use of obfuscatory phrases like “officer-involved shooting” by mainstream media
  2. calling out racist and antisemitic dog-whistling by political actors
  3. discouraging the use of slurs
  4. encouraging inclusive language
  5. recommending that a writer avoid ambiguity
  6. asking an actor to speak up

Examples 1 and 2 are obviously non-oppressive uses of prescriptivism, as they are directed at powerful actors; 3 and 4 can be acceptable even if not directed at a powerful person, because they attempt to address another oppressive act; and 5 and 6 are useful prescriptions, as they help the addressee perform the task at hand more effectively.

Now, I’m not going to try to convince you that the field of generative syntax is some powerful institution, nor that the definition of “theory” is an issue of social justice. Here my colleague was correct—members of the field are free to use their terminology as they see fit. My prescription is of the third variety—a helpful suggestion from a member of the field who wants it to advance. So, while my prescription may be wrong, I’m not wrong to offer it.

Using anti-prescriptivism as a defense against critique is not surprising—I’m sure I’ve had that reaction to editorial suggestions on my work. In fact, I’d say it’s a species of a phenomenon common among folks who care about social justice, in which a formal transgression is mistaken for a violation of an underlying principle. In this case, the formal act of prescription occurred, but without any violation of the principle of anti-oppression.

Why are some ideas so sticky? A hypothesis

Anyone who has tried to articulate a new idea or criticize old ones may have noticed that some ideas are washed away relatively easily, while others seem to actively resist even the strongest challenges—some ideas are stickier than others. In some cases, there’s an obvious reason for this stickiness—sometimes even a good one. Some ideas are sticky because they’ve never really been interrogated. Some are sticky because there are powerful parts of society that depend on them. Some are sticky because they’re true, or close to true. But I’ve started to think there’s another reason an idea can be sticky—the amount of mental effort people put into understanding the idea as students.

Take, for instance, X-bar theory. I don’t think there’s some powerful cabal propping it up, it’s not old enough to just be taken for granted, and Chomsky’s Problems of Projection papers showed that it was not really tenable. Yet X-bar persists, and not just in how syntacticians draw trees or how they informally talk about them. For instance, commentary on my definition of minimal search here involved puzzlement about why I didn’t simply formalize the idea that specifiers were invisible to search, followed by more puzzlement when I explained that the notion of specifier was unformulable.

In my experience, the stickiness of X-bar theory—and syntactic projection/labels more broadly—doesn’t manifest itself in attempts to rebut arguments against it, but in attempts to save it—to reconstitute it in a theory that doesn’t include it.[1] This is very strange behaviour—X-bar is a theoretical construct; it’s valid insofar as it is coherent and empirically useful. Why are syntacticians fighting for it? I wondered about this for a while, and then I remembered my experience learning X-bar and teaching it—it’s a real challenge. It’s probably the first challenging theoretical construct that syntax students are exposed to. It tends to be presented as a fait accompli, so students just have to learn how it functions. As a result, those students who do manage to figure it out are proud of it and defend it like someone protecting their cherished possessions.[2]

Of course, it’s a bit dangerous to speculate about the psychological motivations of others, but I’m certain I’ve had this reaction in the past when someone’s challenged an idea that I at one point struggled to learn. And I’ve heard students complain about the fact that every successive level of learning syntax starts with “everything you learned last year is wrong”—or at least that’s the sense they get. So, I have a feeling there’s at least a kernel of truth to my hypothesis. Now, how do I go about testing it?


Addendum

As I was writing this, I remembered something I frequently think when I’m preparing tests and exams that I’ve thus far only formulated as a somewhat snarky question:

How much of our current linguistic theory depends on how well it lends itself to constructing problem sets and exam questions?

Notes
1 My reading of Zeijstra’s chapter in this volume is as one such attempt.
2 I think I may be describing “effort justification,” but I’m basing this just on the Wikipedia article.

Unmoored theory

I’ve written before about the dichotomy of descriptive vs theoretical sciences, but I’ve recently noticed another apparent dichotomy within theoretical sciences—expansionary vs focusing sciences. Expansionary sciences are those whose domain tends to expand—(neo)classical economics seems to claim all human interaction in its domain; formal semantics now covers pragmatics, hand gestures, and monkey communication—while focusing sciences tend to keep a rather constant domain or even a shrinking one—chemistry today is about pretty much the same things as it was in the 17th century; generative syntactic theory is still about the language faculty. Assuming this is true,[1] the question is whether it reflects some underlying difference between these sciences. I’d like to argue that the distinction follows from how firm a science’s foundations are, and in particular from what I’ll call its empirical conjecture.

Every scientific theory, I think, basically takes the form of a conjoined sentence: “There are these things/phenomena in the world and they act like this.” The second conjunct is the formal system that gives a theory its deductive power. The first conjunct is the empirical conjecture, and it turns the deductions of the formal system into predictions. While every science that progresses does so by positing new sorts of invisible entities, categories, etc., they all start with more or less familiar entities, categories, etc.—planets, metals, persons, and so on. This link to the familiar is the empirical foundation of a science. Sciences with a firm foundation are those whose empirical conjecture can be uncontroversially explained to a lay person or even an expert critic operating in good faith.
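
To make the two conjuncts explicit, here is a crude first-order schema; the notation is mine and purely illustrative, with $D$ standing for a theory’s posited domain and $\mathcal{F}$ for its formal law:

$$T \;=\; \underbrace{\exists x\, D(x)}_{\text{empirical conjecture}} \;\wedge\; \underbrace{\forall x\,\big(D(x) \rightarrow \mathcal{F}(x)\big)}_{\text{formal system}}$$

Note that if nothing in the world actually satisfies $D$, the second conjunct is vacuously satisfied no matter what $\mathcal{F}$ says, a point that will matter below.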

Contemporaries of, say, Robert Boyle might have thought the notion of corpuscles insanity, but they wouldn’t disagree that matter exists, exists in different forms, and that some of those forms interact in regular ways. Even the fiercest critic of UG, provided they are acting in good faith, would acknowledge that humans have a capacity for language and that that capacity probably has to do with our brains.

The same, I think, cannot be said about (neo)classical economics or formal semantics.[2] Classical economics starts with the conjecture that there are these members of the species homo economicus—the perfectly rational, self-interested, utility-maximizing agent—and derives theorems from there. This is obviously a bad characterization of humans. It is simultaneously too dim a view of humans—we behave altruistically and non-individualistically all the time—and one that gives us far too much credit—we are far from perfectly rational. Formal semantics, on the other hand, starts with the conjecture that meaning is reference—that words have meaning only insofar as they refer to things in the world. While not as obviously false as the homo economicus conjecture, the referentialist conjecture is still false—most words, upon close inspection, do not refer,[3] and there is a whole universe of meaning that has little to do with reference.

Most economists and semanticists would no doubt object to what the previous paragraph says about their discipline, and the objections would take one of two forms. Either they would defend homo economicus/referentialism, or they would downplay the importance of the conjecture in question—“Homo economicus is just a useful teaching tool for undergrads. No one takes it seriously anymore!”[4] “Semanticists don’t mean reference literally, we use model theory!”—and it’s this sort of response that I think can explain the expansionary behaviour of these disciplines. Suppose we take these objections to be honest expressions of what people in the field believe—that economics isn’t about homo economicus and formal semantics isn’t about reference. Well then, what are they about? The rise of behavioural economics suggests that economists are still looking for a replacement model of human agency, and model theory is basically just reference delayed.

The theories, then, seem to be about nothing at all—or at least nothing that exists in the real world—and as a result, they can be about anything at all—they are unmoored.

Furthermore, there’s an incentive to expand your domain when possible. A theory of nothing obviously can’t be justified by giving any sort of deep explanation of any one aspect of nature, so it has to be justified by appearing to offer explanations for a breadth of topics. Neoclassical economics can’t seem to predict when a bubble will burst, or what will cause inflation, but it can give what looks like insight into family structures. Formal semantics can’t explain why “That pixel is red and green” is contradictory, but it provides a formal language to translate pragmatics into.

There’s a link here to my past post about falsification, because just as a theory about nothing can be a theory about anything, a theory about nothing cannot be false. So, watch out—if your empirical domain seems to be expanding, you might not be doing science any more.

Notes
1 It’s pretty much a tautology that a science’s domain will either grow, stay constant, or shrink over time.
2 Now obviously, there’s a big difference between the two fields—neoclassical economics is extremely useful to the rich and powerful since it lets them justify just about any horrendous crimes they would want to commit in the name of expanding their wealth and power, while formal semantics is a subdiscipline of a minor oddball discipline on the boundaries of humanities, social science, and cognitive science. But I’m a linguist, and I think mostly linguists read this.
3 I could point you to my own writing on this, the works of Jerrold Katz, and arguments from Noam Chomsky on referentialism, or I could point out that one of the godfathers of referentialism, Ludwig Wittgenstein, seems to have repudiated it in his later work.
4 Though, as the late David Graeber pointed out, economists never object when homo economicus is discussed in a positive light.

What does falsification look like anyway?

Vulcan vs Neptune

There’s an argument that plays out every so often in linguistics that goes as follows:

Critic: This data falsifies theory T.
Proponent: Not necessarily, if you consider arguments X,Y, and Z.
Critic: Well, then theory T seems to be unfalsifiable!

This is obviously a specious argument on the part of the critic, since unfalsified does not entail unfalsifiable, but I think it stems from a very understandable frustration—theorists often have an uncanny ability to wriggle free of data that appears to falsify their theories, even though falsificationism is assumed by a large majority of linguists. The problem is that the logic of falsificationism, while quite sound, maybe even unimpeachable, turns out to be fiendishly difficult to apply.

At its simplest, the logic of falsificationism says that a theory is scientific insofar as one can construct a basic statement—i.e., a statement of fact—that would contradict the theory. This, of course, is an oversimplification of Karl Popper’s idea of Critical Rationalism in a number of ways. For one, falsifiability is not an absolute notion. Rather, we can compare the relative falsifiability of two theories by looking at what Popper calls their empirical content—the set of basic statements that would contradict them. So if a simple theoretical statement P has a particular empirical content, then the conjunction P & Q will have a greater empirical content, and the disjunction P v Q will have a lesser empirical content. This is a useful heuristic when constructing or criticizing a theory internally, and seems like a straightforward guide to testing theories empirically. Historically, though, this has not been the case, largely because it is often difficult to recognize when we’ve arrived at and accurately formulated a falsifying fact. In fact, it is often, maybe always, the case that we don’t recognize a falsifying fact as such until after one theory has been superseded by another.
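
To see why this heuristic works, here is a minimal formalization, identifying a statement’s empirical content with its class of potential falsifiers; the notation $\mathrm{CT}(\cdot)$ is mine, not Popper’s:

$$\mathrm{CT}(P) \,\subseteq\, \mathrm{CT}(P \wedge Q) \qquad\qquad \mathrm{CT}(P \vee Q) \,\subseteq\, \mathrm{CT}(P)$$

Any basic statement that contradicts $P$ alone also contradicts $P \wedge Q$, while a basic statement must contradict both $P$ and $Q$ in order to contradict $P \vee Q$. The more a theory asserts, the more ways it can be wrong.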

Take for instance the case of the respective orbits of Mercury and Uranus. By the 19th century, Newtonian mechanics had allowed astronomers to make very precise predictions about the motions of the planets, and based on those predictions, there was a problem: two of the planets were misbehaving. First, it was discovered that Uranus—then the furthest known planet from the sun—wasn’t showing up where it should have been. Basically, Newton’s mechanics predicted that on such and so day and time Uranus would be in a particular spot in the sky, but the facts were otherwise. Rather than cry “falsification!”, though, the astronomers of the day hypothesized an object on the other side of Uranus that was affecting its orbit. One such astronomer, Urbain Le Verrier, was even able to work backwards and predict where that object could be found. So in September of 1846, armed with Le Verrier’s calculations, Johann Gottfried Galle was able to observe an eighth planet—Neptune. Thus, an apparent falsification became corroboration.

Urbain Le Verrier (1811-1877)
Johann Galle (1812-1910)

I’ve previously written about this story as a vindication of the theory-first approach to science. What I didn’t write about, and what is almost never discussed in this context, is Le Verrier’s work on the misbehaving orbit of Mercury. Again, armed with Newton’s precise mechanics, Le Verrier calculated the Newtonian prediction for Mercury’s orbit, and again[1] Mercury didn’t behave as expected. Again, rather than throw out Newtonian mechanics, Le Verrier hypothesized the planet Vulcan between Mercury and the sun, and set about trying to observe it. While many people claimed to observe Vulcan, none of these observations were reliably replicated. Le Verrier was undeterred, though, perhaps because observing a planet that close to the sun was quite tricky. Of course, it would be easy to paint Le Verrier as an eccentric—indeed, his Vulcan hypothesis is somewhat downplayed in his legacy—but he doesn’t seem to have been treated as one by his contemporaries. The Vulcan hypothesis wasn’t universally believed, but neither does it seem to have been the Flat-Earth theory of its day.

It was only when Einstein used his General Theory of Relativity to accurately calculate Mercury’s orbit that the scientific community seems to have abandoned the search for Vulcan. Mercury’s orbit is now considered a classic successful test of General Relativity, but why don’t we consider it a refutation of Newtonian Mechanics? Strict falsificationism would seem to dictate that, but then a strict falsificationist would have thrown out Newtonian Mechanics as soon as we noticed Uranus misbehaving. So, falsificationism of this sort leads us to something of a paradox—if a single basic statement contradicts a theory, there’s no way of knowing whether there is some second basic statement that, in conjunction with the first, could save the theory.

Still, it’s difficult to toss out falsification entirely, because a theory that doesn’t reflect reality may be interesting but isn’t scientific.[2] Also, any reasonable person who has ever tried to give an explanation of any phenomenon probably rejects most of their own ideas rather quickly on empirical bases. We should instead adopt falsificationism as a relative notion—use it when comparing multiple theories. So, Le Verrier was ultimately wrong, but acted reasonably—he had a pretty good theory of mechanics, so he worked to reconcile it with some problematic data. Had someone developed General Relativity in Le Verrier’s time, it would have been unreasonable to insist that a hypothesized planet was a better explanation than an improved theory.

Returning to the hypothetical debate between the Critic and the Proponent, then, I think a reasonable albeit slightly rude response for the proponent would be “Well, do you have a better theory?”

Notes
1 Technically though, Le Verrier’s work on Mercury predated his work on Uranus.
2 Though sometimes, theories which seem to be empirically idle end up being scientifically important (cf. non-Euclidean geometry).

Internal unity in science again

Or, how to criticize a scientific theory

Recently, I discovered a book called The Primacy of Grammar by philosopher Nirmalangshu Mukherji. The book is basically an extended, and in my opinion quite good, apologia for biolinguistics as a science. The book is very readable and covers a decent amount of ground, including an entire chapter discussing the viability of incorporating a faculty of music into biolinguistic theory. I highly recommend it.

At one point, while defending biolinguistics from the charge of incompleteness levied by semanticists and philosophers, Mukherji makes the following point.

[D]uring the development of a science, a point comes when our pretheoretical expectations that led to the science in the first place have changed enough, and have been accommodated enough in the science for the science to define its objects in a theory-internal fashion. At this point, the science—viewed as a body of doctrines—becomes complete in carving out some specific aspect of nature. From that point on, only radical changes in the body of theory itself—not pressures from common sense—force further shifting of domains (Mukherji 2001). In the case of grammatical theory, either that point has not been reached or … the point has been reached but not yet recognized.

Mukherji (2010, 122-3)

There are two interesting claims that Mukherji is making about linguistic theory and scientific theory in general. One is that theoretical objects are solely governed by theory-internal considerations. The other is that the theory itself determines what in the external world it applies to.

The first claim reminded me of a meeting I had with my doctoral supervisor while I was writing my thesis. My theoretical explanation rested on the hypothesis that even the simplest of non-function words, like coffee, were decomposable into root objects (√COFFEE) and categorizing heads (n0). I had a dilemma though. It was crucial to my argument that, while categorizing heads had discrete features, roots were treated as featureless blobs by the grammar, but I couldn’t figure out how to justify such a claim. When I expressed my concern to my supervisor, she immediately put my worries to rest. I didn’t need to justify that claim, she pointed out, because roots by their definition have no features.

I had fallen into a very common trap in syntax—I had treated a theory-internal object as an empirical object. Empirical objects can be observed and sensibly argued about. Take, for instance, English specificational clauses (e.g., The winner is Mary). Linguists can and do argue about the nature of these—i.e., whether or not they are truly the inverse of predicational clauses (e.g., Mary is the winner)—and cite facts as they do so. This is because empirical objects and phenomena are out there in the real world, regardless of whether we study them. Theory-internal objects, on the other hand, are not subject to fact-based argument, because, unless the Platonists are right, they have no objective reality. As long as my theory is internally consistent, I can define its objects however I damn please. The true test of any theory is how well it can be mapped onto some aspect of reality.

This brings me to Mukherji’s second assertion, that the empirical domain of a theory is determined by the theory itself. In the context of his book, this assertion is about linguistic meaning. The pretheoretic notion of meaning is what he calls a “thick” notion—a multifaceted concept that is very difficult to pin down. The development of a biolinguistic theory of grammar, though, has led to a thinner notion of meaning, namely, the LF of a given expression. Now obviously, this notion of meaning doesn’t include notions of reference, truth, or felicity, but why should we expect it to? Yes, those notions belong to our common-sense ideas of meaning, but surely at this stage of human history, we should expect that scientific inquiry will reveal our common-sense notions to be flawed.

As an analogy, Aristotle and his contemporaries didn’t distinguish between physics, biology, chemistry, geology, and so on—they were all part of physics. One of the innovations of the scientific revolutions, then, was to narrow the scope of investigation—to develop theories of a sliver of nature. If Aristotle saw our modern physics departments, he might look past all of their fantastic theoretical advances and wonder instead why no one in the department was studying plants and animals. Most critiques of internalist/biolinguistic notions of semantics by modern philosophers and formal semanticists echo this hypothetical time-travelling Aristotle—they brush off any advances and wonder where the theory of truth is.

Taken together, these assertions imply a general principle: Scientific theories should be assessed on their own terms. Criticizing grammatical theory for its lack of a theory of reference makes as much sense as criticizing Special Relativity for its lack of a theory of genetic inheritance. While this may seem to render any theory beyond criticism, the history of science demonstrates that this isn’t the case. Consider, for instance, quantum mechanics, which has been subject to a number of criticisms in its own terms—see: Einstein’s criticisms of QM, Schrödinger’s cat, and the measurement problem. In some cases these criticisms are insurmountable, but in others addressing them head-on and modifying or clarifying the theory is what leads to advances in the theory. Chomsky’s Label Theory, I think, is one of the latter sorts of cases—a theory-internal problem was identified and addressed and as a result two unexplained phenomena (the EPP and the ECP) were given a theoretical explanation. We can debate how well that explanation generalizes and whether it leans too heavily on some auxiliary hypotheses, but what’s important is that a theory-internal addressing of a theory-internal problem opened up the possibility of such an explanation. This may seem wildly counter-intuitive, but as I argued in a previous post, this is the only practical way to do science.

The principle that a theory should be criticized in its own terms is, I think, what irks the majority of linguists about biolinguistic grammatical theory the most. It bothers them because it means that very few of their objections to the theory ever really stick. Ergativity, for instance, is often touted as a serious problem for Abstract Case Theory, but since grammatical theory has nothing to say about particular case alignments, theorists can just say “Yeah, that’s interesting” and move on. Or, to take a more extreme case, recent years have seen all-out assaults on grammatical theory from people who bizarrely call themselves “cognitive linguists”, people like Vyvyan Evans and Daniel Everett, who claim to have evidence that roundly refutes the very notion of a language faculty. The response of biolinguists to this assault: mostly a resounding shrug as we turn back to our work.

So, critics of biolinguistic grammatical theory dismiss it in a number of ways. They say it’s too vague or slippery to be any good as a theory, which usually means they refuse to seriously engage with it; they complain that the theory keeps changing—a peculiar complaint to lodge against a scientific theory; or they accuse theorists of arrogance—a charge that, despite being occasionally true, is not a criticism of the theory. This kind of hostility can be bewildering, especially because a corollary of the idea that a theory defines its own domain is that everything outside that domain is a free-for-all. It’s hard to imagine a geneticist being upset that their data is irrelevant to Special Relativity. I have some ideas about where the hostility comes from, but they’ll take me pretty far afield, so I’ll save them for a later post and leave it here.

Play, games, science and bureaucracy

In the titular essay of his 2015 book The Utopia of Rules, David Graeber argues for a distinction between play and games. Play, according to Graeber is free, creative, and open-ended, while games are rigid, repetitive, and closed-off. Play underlies art, science, conversation, and community, while games are the preferred method of bureaucracy. This idea really resonated with me, partially because I’m someone who doesn’t really like games, but also because I think it’s perfectly consonant with something I’ve written about previously: the distinction between theoretical and descriptive science. In this post, I’ll explore that intuition, and argue that theoretical scientific research tends to center play, while descriptive research tends to center games.

The key distinction between games and play, according to Graeber, is rules. While both are leisure activities done for sheer enjoyment, games are defined by their rules. These rules can be rather simple (e.g., checkers), fiendishly complex (e.g., Settlers of Catan), or something in between, but whatever they are, the rules make the game. What’s more, Graeber argues, it’s the rules that make games an enjoyable respite from the ambiguities of real life. At any given point in a game, there are a finite number of possible moves and a fixed objective. If only we had that same certainty when navigating interactions with neighbours, co-workers, and romantic interests!

If games are defined by their rules, then play is defined by its lack of rules. The best example—one used by Graeber—is that of children playing. There are no rules to how children play. In fact, as Graeber observes, a good portion of play between children involves negotiating the rules. What’s more, there’s no winning at play, only enjoyment. Play is open-ended—set a group of children to play and there’s no knowing what they’ll come up with.

Yet, play is not random. It follows principles—such as the innate social instincts of children—which are a different sort of thing from the rules that govern games. Rules must be explicit, determinate, and preferably compiled in some centralized place so that, in a well-designed game, disputes can always be settled by consulting some authority, usually a rule-book. Principles are often implicit—no one teaches kids how to play—can be quite vague—a main principle of improv is “Listen”—and are arguably somehow internal—if there are principles to playing a musical instrument, they come from the laws of physics, the form and material of the instrument, and our own innate sense of melody, harmony, and rhythm.

As this description might suggest, rules and principles, games and play, are often in conflict with each other. Taking a playful activity and turning it into a game can eliminate some of the enjoyment. Take, for instance, flirtation—a playful activity for which an anthropologist might be able to discover some principles. People who treat flirtation as a game understandably tend to be judged as creepy. Understandably, because gaming assumes determinate rules—if I do X in situation Y, then Z will happen—and no-one likes to be treated like a robot. Or, consider figures like Chelsea Manning or Edward Snowden. Each was faced with a conflict between the external rules of an institution and their internal principles, and chose the latter.

This conflict, however, need not be an overall negative. Any art form at any given time follows a number of rules and conventions that, at their best, define the space in which an artist can play. Eventually, though, the rules and conventions of a given art form or genre become too fixed and end up stifling the playfulness of the artists. I remember my cousin, who was a cinema studies major in undergrad, talking about watching Citizen Kane for a class. The students were confused—this is widely lauded as one of the greatest films ever made, but they couldn’t see what was so special. The instructor explained that Citizen Kane was groundbreaking when it came out: it broke all the rules, but it ended up replacing them with new ones. Now those new rules are so commonplace that they are completely unremarkable. While I don’t think we could develop an entire theory of aesthetics based solely on the balance between games and play, the opposition seems to be active in how we judge art.

But what does this have to do with science? Well, thus far I’ve suggested that games are defined by external rules, while play is guided by internal principles. This contrast lines up quite nicely with Husserl’s definitions of descriptive and theoretical sciences, respectively. Descriptive sciences are sets of truths grouped together by some externally imposed categorization, while theoretical sciences are sets of truths which have an internal cohesion. If I’m on the right track, then descriptive sciences should share some characteristics with games, while theoretical sciences should share some with play.

Much as games impose rules on their participants, descriptive sciences impose methods on their researchers. Oftentimes they are quite explicit about this. Noam Chomsky, for instance, often says of linguistics education in the mid-20th century that it was almost exclusively devoted to learning and practicing elicitation procedures (read: methods). The cognitive revolution that Chomsky was at the center of changed this, allowing theory to take center-stage, but we are currently in the midst of a shift back towards method. Graduate students are now expected or even required to take courses in “experimental” or quantitative methods. Job ads for tenure-track positions are rarely simply for a phonologist or a semanticist; rather, they invariably ask for experience with quantitative analysis, experimental methods, and the like.

The problem with this is that methods in science, like rules in games, serve to fence in possibilities. When you boil it down to its essence, a well-run experiment or corpus study is nothing but an attempt to frame and answer a yes-or-no question. What’s more, each method is quite restricted as to what sort of questions it can even answer. Even the presentation of method-driven research tends to be rather rigidly formatted—experimental reports follow the IMRaD format, as do many observational studies, and grammars, the output of field methods, tend to start with the phonetics and end with the syntax/semantics. So when someone says they’re going to perform an eye-tracking study, or some linguistic fieldwork, you can be fairly certain as to what their results will look like, just like you can be certain of what a game of chess will look like.

Contrast this with theoretical work, which tends to start with sometimes horribly broad questions and often ends up somewhere no-one would have expected. So, asking what language is yielded results in computer science, inquiring about the motion of the planets led to a new understanding of tides, and asking about the nature of debt revealed profound truths about human society. No game could have these kinds of results—if you sat down to play Pandemic and ended up robbing a bank, it probably means, at the least, that you read the rules wrong. But theory is not like a game; it’s inherently playful.

Now, anyone who has read any scientific theory might object to this, as the writing produced by theorists tends to be rather abstract and inaccessible, but writing theory is like retelling a fun conversation—the fun is found in the moment and can never be fully recreated. The playful nature of theory, I think, can be seen in two of the main criticisms leveled at theoretical thinkers by non-theorists: that theoreticians can’t make up their minds and that they just make it up as they go along. These criticisms, however, tend to crop up whenever there is serious theoretical progress being made. In fact, many advances in scientific theories are met with outright hostility by the scientific community (see atomic theory, relativity, the theory of grammar, etc.), likely, I think, because a new theory tends to invalidate a good portion of what the contemporary community spent years, decades, or centuries getting accustomed to, or, worse yet, a theoretical advance might appear to render certain empirical results irrelevant or even meaningless.

Compare this to children playing. If children make up some rules while playing, only a fool would take those to be set in stone. Almost certainly, the children would come up against those rules and decide to toss them by the wayside.

As I mentioned, Graeber discusses games and play as a way of analyzing bureaucracy and our relationship to it. Bureaucracy—be it in government, corporations, or academia—is about creating games that aren’t fun. They are also impersonal power structures, what Hannah Arendt calls “rule by no-one”. And just as games are, bureaucracies are designed to hem in playfulness, because allowing people to be playful might lead them to realize that a better life is possible without those bureaucracies.

Within science, too, we can see bureaucracies being aligned with strictly methodical empirical work and somewhat hostile to theoretical work. We can see this in how the respective researchers organize themselves. Empirical work is done in a lab, which does not just refer to a physical space, but to a hierarchical organization with a PI (principal investigator) supervising and directing post-docs and grad students, who often in turn supervise and direct undergraduate research assistants—a chief executive, middle management, and workers. Theoretical work, on the other hand, is done in a wide array of spontaneously organized affinity groups. So, for instance, neither the Vienna Circle, in philosophy, nor the Bourbaki group, in mathematics, had any particular hierarchical structure, and both were quite broad in their interests.

The distinction can even be seen in how theoretical and descriptive sciences interact with time and space. Experimental work must be done in specially designed rooms, sometimes made just for that one experiment, and observational work must be done in the natural habitat of the phenomena to be observed—just as a chess game must be limited to an 8×8 grid. Theoretical work can be done almost anywhere: in a cafe, a bar, on a train, in a dark basement, or in a spacious detached house. The less specialized the better. In fact, the only limiting factor is the theorist themself. As for time, nowhere is this clearer than in the timelines given by PhD candidates in their thesis proposals. While not all games are on a clock, all games must account for all of their time—each moment of a game has a purpose. This is what a timeline for a descriptive project looks like: “Next month I’ll travel to place X where I’ll conduct Y hours of interviews. The following month I will organize and code the data…” and so on. It’s impossible to provide such detail in the plan for a theoretical work, for several related reasons: the time spent working tends to be unstructured; you never know when inspiration or some kind of moment of clarity will strike; you can’t possibly know what the next step is until you complete the current step; and so on. Certainly, the playful work of theory can sometimes benefit from some structure, but descriptive work, like a game, absolutely depends on structured time and space.

This alignment can also be seen in how theory and method interact with the superstructures of scientific research, that is, the funding apparatuses—granting agencies and corporations. Both sorts of structures are bureaucratic and tend to be structurally opposed to theoretical (read: playful) work. In both cases, funders must evaluate a bunch of proposals and choose to fund those that are most likely to yield a significant result. Suppose you’re a grant evaluator and you have two proposals in front of you: Proposal A is to do linguistic fieldwork on some understudied and endangered language, focusing on some possibly interesting aspect of that language, and Proposal B is to attempt to reconcile two seemingly contradictory theoretical frameworks. Assuming each researcher is eminently qualified to carry out their respective plans, which would you fund? Proposal A is all but guaranteed to have some results—they may be underwhelming, but they could be breakthroughs (though this is very unlikely)—a guarantee that’s implicit in the method: it’s always worked before. If Proposal B is successful, it is all but guaranteed to be a major breakthrough; however, there is absolutely no guarantee that it will be successful—if the researcher cannot reconcile the two frameworks, then we cannot draw any particular conclusion from it. So which one do you choose? The guarantee, or the conditional guarantee? The conditional guarantee is a gamble, and bureaucrats aren’t supposed to gamble, so we go with the guarantee.

So, bureaucratic funding structures are more inclined to fund methods-based research. That’s fine as far as it goes—theoretical research is dirt cheap, only requiring a nourished mind and some writing materials—but grants aren’t just about the money. Today, grants are used as a metric for research capability. If you can get a lot of grants, then you must be a good researcher. Set aside the fact that virtually any academic will tell you that grant-writing is a particular skill that isn’t directly related to research ability, or that many researchers delegate grant-writing to their post-docs; the logic here is particularly twisted: granting agencies use past grants as an indication of a good researcher, as do hiring committees. This makes sense—previous success in a process is a good indicator of future success—provided everything stays more or less the same. Thus the grant system and other bureaucratic systems are likely to defend the status quo, by funding descriptive rather than theoretical work.

If my analysis is correct, then the sciences are being held back by the bureaucracies that are supposed to enable them, such as university administration and funding agencies. They’re also held back by their own mythology—the “scientific method”—which promises breakthroughs if only they keep playing the game. This should not be too surprising to anyone who considers how bureaucracies hold them back in their day-to-day lives. What’s frustrating about this, though, is that academia, more than any sector of modern society, is supposed to be self-organized. University administrators (Deans, Presidents, Provosts, etc.) are supposed to be drawn from the faculty of that university, and funding organizations are supposed to be run by researchers. So, unlike the bureaucracies that demean the poor and outsource jobs, the bureaucracies that stifle academics are self-imposed. The positive side of this is that, if academics wanted to, they could dismantle many of their bureaucracies tomorrow.

Instrumentalism in Linguistics

(Note: Unlike my previous posts, this one is not aimed at a general audience. This one’s for linguists.)

As a generative linguist, I like to think of myself as a scientist. Certainly, my field is not as mature and developed as physics, chemistry, and biology, but my fellow linguists and I approach language and its relation to human psychology scientifically. This is crucial to our identity. Sure, our universities consider linguistics a member of the humanities, and we often share departments with literary theorists, but we’re scientists!

Because it’s so central to our identity, we’re horribly insecure about our status as scientists. As a result of our desire to be seen as a scientific field, we’ve adopted a particular philosophy of science without even realizing it: Instrumentalism.

But what is instrumentalism? It’s the belief that the sole, or at least primary, purpose of a scientific theory is its ability to generate empirical tests and predict their outcomes. So, one theory is preferable to another if and only if the former better predicts the data than the latter. A theory’s simplicity, intelligibility, or consistency is at best a secondary consideration. Two theories that have the same empirical value can then be compared according to these standards. Generative linguistics seems to have adopted this philosophy, to its detriment.

What’s wrong with instrumentalism? Nothing per se. It definitely has its place in science. It’s perfectly reasonable for a chemist in a lab to view quantum mechanics as an experiment-generating machine. In fact, it might be an impediment to their work to worry about how intelligible QM is. They would be happy to leave that kind of thinking to the theorists and philosophers while they, the experimenter, used the sanitized mathematical expressions of QM to design and carry out their work.

“Linguistics is a science,” the linguist thinks to themself. “So, linguists ought to behave like scientists.” Then, with a glance at the experimental chemist, the linguist adopts instrumentalism. But there’s a fallacy in that line of thinking: the fact that instrumentalism is an appropriate attitude for some people in a mature science, like chemistry, does not mean it should be the default attitude for people in a nascent science, like linguistics. In fact, there are good reasons for instrumentalism to be only a marginally acceptable attitude in linguistics. Rather, we should judge our theories on the more humanistic measures of intelligibility, simplicity, and self-consistency, in addition to consistency with experience.

What’s wrong with instrumentalism in linguistics?

So why can’t linguists be like the chemist in the lab? Why can’t we read the theory, develop the tests of the theory, and run them? There are a number of reasons. First, as some philosophers of science have argued, it is never the case that a theoretical statement is put to the test by an empirical statement alone; rather, the former is tested by the latter in light of a suite of background assumptions. So, chemists can count the number of molecules in a sample of gas if they know its pressure, volume, and temperature. How do they know, say, the temperature of the gas sample? They use a thermometer, of course—an instrument they trust by virtue of their background assumptions regarding how matter, in general, and mercury, in particular, are affected by temperature changes. Lucky for chemists, those assumptions have centuries’ worth of testing and thinking behind them. No such luck for generative linguists: we’ve only got a few decades of testing and thinking behind our assumptions, which is reflected in how few empirical tools we have and how unreliable they are. Our tests for syntactic constituency are pretty good in a few cases—good enough to provide evidence that syntax traffics in constituency—but they give way too many false positives and negatives. Their unreliability means real syntactic work must develop diagnostics which are more intricate and which carry much more theoretical baggage. If a theory is merely a hypothesis-machine, and the tools for testing those hypotheses depend on the theory, how can we avoid rigging the game in our favour?

Suppose we have two theories, T1 and T2, which are sets of statements regarding an empirical domain D. T1 has been rigorously vetted and found to be internally consistent, simple, and intelligible, and predicts 80% of the facts in D. T2 is rife with inconsistencies, hidden complexities, and opaque concepts, but covers 90% of the facts in D. Which is the better theory? Instrumentalism would suggest T2 is the superior theory due to its empirical coverage. Non-dogmatic people might disagree, but I suspect they would all be uncomfortable with instrumentalism as the sole arbiter in this case.

The second problem, which exacerbates the first, is that there’s too much data, and it’s too easy to get even more. This has resulted in subdisciplines being further divided into several niches, each devoted to a particular phenomenon or group of languages. Such a narrowing of the empirical domain, coupled with an instrumentalist view of theorizing, has frequently led to the development of competing theories of that domain—theories which are largely impenetrable to those conversant with the general theory but uninitiated in the niche in question. This is a different situation from the one described above. In this situation, T1 and T2 might each cover 60% of a subdomain D’, but those 60% are overlapping. Each has a core set of facts that the other cannot, as yet, touch, so the two sides take turns claiming parts of the overlap as their sole territory, and no progress is made.

Often it’s the case that one of the competing specific theories is inconsistent with the general theory, but proponents of the other theory don’t use that fact in their arguments. In their estimation the data always trumps theory, regardless of how inherently theory-laden the description of the data is. It’s as if two factions were fighting each other with swords despite the fact that one side had a cache of rifles and ammunition that they decided not to use.

The third problem, one that has been noted by other theory-minded linguists here and here, is that the line between theoretical and empirical linguistics is blurry. To put it a bit more strongly, what is called “theoretical linguistics” is often empirical linguistics masquerading as theoretical. This assertion becomes clear when we look at the usual structure of a “theoretical syntax” paper in the abstract. First, a grammatical phenomenon is identified and demonstrated. After some discussion of previous work, the author presents the results of some diagnostics and from those results gives a formal analysis of the phenomenon. If we translated this into the language of a mature science, it would be indistinguishable from an experimental report. A phenomenon is identified and discussed, the results of some empirical techniques are reported, and an analysis is given.

You might ask “So what? Who cares what empirical syntacticians call themselves?” Well, if you’re a “theoretical syntactician,” then you might propose a modification of syntactic theory to make your empirical analysis work, and other “theoretical syntacticians” will accept those modifications and propose some modifications of their own. It doesn’t take too long in this cycle before the standard theory is rife with inconsistencies, hidden complexities, and opaque concepts. None of that matters, however, if your goal is just to cover the data.

Or, to take another common “theoretical” move, suppose we find an empirical generalization, G (e.g., all languages that allow X also allow Y). The difficult task of the theoretician is to show that G follows from independently motivated theoretical principles. The “theoretician,” on the other hand, has another path available, which is to restate G in “theoretical” terms (e.g., functional head H is responsible for both X and Y), and then (maybe) go looking for some corroboration. Never mind that restating G in different terms does nothing to expand our understanding of why G holds; understanding is always secondary for instrumentalism.

So, what’s to be done?

Reading this, you might think I don’t value empirical work in linguistics, which is simply not the case. Quite frankly, I am constantly in awe of linguists who can take a horrible mess of data and make even a modicum of sense out of it. Empirical work has value, but linguistics has somehow managed to both over- and under-value it. We over-value it by tacitly embracing instrumentalism as our guiding philosophy. We under-value it by giving the title “theoretical linguist” a certain level of prestige: we think empirical work is easier and less-than. This has led us to under-value theoretical work as well, and to view theoretical arguments as just gravy when they’re in our favour, and irrelevancies when they’re against us.

What we should strive for is an appropriate balance between empirical and theoretical work. To get to that balance, we must do the unthinkable and look to the humanities. To develop as a science, we ought to look at mature sciences, not as they are now, but as they developed. Put another way, we need to think historically. If we truly want our theory to explain the human language faculty, we need to accept that we will be explaining it to humans, and designing a theory that another human can understand requires us to embrace our non-rational qualities, like intuition and imagination.

In sum, we could all use a little humility. Maybe we’ll reach a point when instrumentalism will work for empirical linguistics, but we’re not there yet, and pretending we are won’t make it so.