The DP Hypothesis—a case study of a sticky idea

Recently, in service of a course I’m teaching, I had a chance to revisit and fully engage with what might be the stickiest idea in generative syntax—The DP hypothesis. For those of you who aren’t linguists, the DP hypothesis, though highly technical, is fairly simple to get the gist of based on a couple of observations:

Observation 1: Words in sentences naturally cluster together into phrases like “the toys”, “to the store”, or “eat an apple.”

Observation 2: In every phrase, there is a single main word called the head of the phrase. So, for instance, the head of the phrase “eat an apple” is the verb “eat.”

These observations are formalized in syntactic theory, so that “eat an apple” is labeled a VP (Verb Phrase), while “to the store” is a PP (Preposition Phrase). Which leads us to the DP hypothesis: Phrases like “the toys,” “a red phone,” or “my dog” should be labelled as DPs (Determiner Phrases) because their heads are “the,” “a,” and “my,” which are called determiners in modern generative syntax.

This is fairly counterintuitive, to say the least. The intuitive hypothesis—the one that pretty much every linguist accepted until the 1980s—is that those phrases are NPs (Noun Phrases), but if we only accepted intuitive proposals, there’d be no science to speak of. Indeed, all the good scientific theories start off counterintuitive and become intuitive only by force of argument. One of the joys of theory is experiencing that shift of mind-set—it can feel like magic when done right.

So it was quite unnerving when I started reading the actual arguments for the DP hypothesis—which I had, at one point, fully bought into—and found myself less convinced by each one. It didn’t feel like magic, it felt like a con.

My source for this is a handbook chapter by Judy Bernstein that summarizes the basic argument for the DP Hypothesis—a twofold argument consisting of a parallelism argument and purported direct evidence for the DP Hypothesis—as previously advanced and developed by Szabolcsi, Abney, Longobardi, Kayne, Bernstein herself, and others.

The parallelism argument is based on another counterintuitive theory, developed in the mid-20th century, which states that clauses, previously considered either headless or VPs, are actually headed by abstract (i.e., silent) words. That is, they are variously considered TPs (Tense Phrases), IPs (Inflection Phrases), or CPs (Complementizer Phrases). The parallelism argument states that “if clauses are like that, then ‘noun phrases’ should be like that too” and then finds data where “noun phrases” look like clauses in some way. This might seem reasonable on its face, but it’s a complete non sequitur. Maybe the structure of a “noun phrase” parallels that of a clause, but maybe it doesn’t. In fact, there’s probably good reason to think that the structure of “noun phrases” is the inverse of the structure of the clause—the clause “projects” from the verb, and verbs and nouns are complementary, so shouldn’t the noun have complementary properties to the verb?

Following through on parallelism, if extended VPs are actually CPs, then extended NPs are DPs. Once you have that hypothesis, you can start making “predictions” and checking if the data supports them. And of course there is data that becomes easy to explain once we have the DP Hypothesis. Again, this is good as far as it goes, but there’s a key word missing—”only.” We need data that only becomes easy to explain once we have the DP Hypothesis. And while I don’t have competing analyses for the data adduced for the DP Hypothesis at the ready—though Ben Bruening has one for at least one such phenomenon—I’m not really convinced that none exist.

And that’s the foundation of the DP Hypothesis, a weak argument resting on another weak argument. Yet, it’s a sticky one—I can count on one hand the contemporary generative syntacticians that have expressed skepticism about it. Why is it so sticky? My hypothesis is that it’s useful as a shibboleth and as a “project pump”.

Its usefulness as a shibboleth is fairly straightforward—there’s no quicker way to mark yourself as a generative syntactician than to put DPs in your tree diagrams. Even I find it jarring to see NPs in trees.

To see the utility of the DP Hypothesis as a “project pump”, one need only look at the Cartography/Nanosyntax literature. Once you open up a space for invisible functional heads between N and D, you seem to find them everywhere. This, I think, is what Chomsky meant when he described the DP Hypothesis as “…very fruitful, leading to a lot of interesting work” before saying “I’ve never really been convinced by it.” Who cares if it’s correct, it contains infinite dissertations!

Now maybe I’m being too hard on the DP and its fans. After all, as far as theoretical avenues go, the DP Hypothesis is something of a cul-de-sac, albeit a large one—the core theory doesn’t really care whether “the bee” is a DP or an NP, so what’s the harm? I could point out that by making such a feeble hypothesis our standard, we’ve opened ourselves to being dunked on by anti-generativists. Or I could bore you with such Romantic notions as “calling all things by their right names.” Instead, I’ll be practical and point out that, contrary to contemporary digital wisdom, the world is not infinite, and every bit of real estate given to the DP cul-de-sac in the form of journal articles, conference presentations, tenure-track hires, etc. is space that could be used otherwise. And, to torture the metaphor further, shouldn’t we try to use our real estate for work with a stronger foundation?

The “science” of modern “AI”

(or Piantadosi and MLMs again (II)—continuation of this post)

In my critique of Prof. Piantadosi’s manuscript “Modern language models refute Chomsky’s approach to language,” I point out that regardless of the respective empirical results of Generative Linguistics and MLMs, the latter does not supersede the former because the two have fundamentally different goals. Generative Linguistics aims to provide a rational explanation of a natural phenomenon, while MLMs are designed to simulate human language use. Piantadosi does not dispute this, but rather states that

… there is an interesting debate about the nature of science lurking here. The critics’ position seems to be that in order for something to be a scientific theory, it must be intuitively comprehensible to us. I disagree because there are many phenomena in nature which probably will never admit a simple enough description for us to comprehend. We cannot just exclude these things from scientific inquiry.

p37 of v7 (emphasis in original)

Being one of the “critics” referred to here, I can grant that the professor’s description of my position is basically accurate, if a bit glib. But what is his position? He doesn’t say precisely, but we can make some inferences. In lieu of a clear statement of his position, for instance, Piantadosi follows the above quote with this:

There probably is no simple theory of a stock market (why IBM takes on a particular value) or dynamics in complex systems (why an O2 molecule hits a particular place on my eyeball). Certainly there are local, proximate causes (Tom Jones bid $142 for IBM; the O2 molecule was bumped by another), but when you start to trace these causes back into the complex system, you will quickly exceed our ability to understand the complex network of interactions.

p37 of v7

These are slightly bizarre comments, as we do have comprehensible (i.e., simple) theories of stock markets—the efficient markets hypothesis, for instance[1]—and gases—the kinetic theory, for instance—which can give approximate predictions regarding real-life events like the examples given. The professor’s view can be narrowed down slightly based on his assertion that Rawski & Baumont (2023) “seem to misunderstand the linkage between experiment and theory” (p34 of v7)[2] when they state that “Explanatory power, not predictive adequacy, forms the core of physics and ultimately all modern science.” It would seem clear, then, that, for Piantadosi at least, a “theory” is scientific only insofar as it has predictive power.

This may seem like a reasonable characterization—despite myriad insinuations to the contrary, virtually no one believes that predictive power is unimportant—but as soon as one attempts to develop that characterization, things get dicey. What, for instance, is the required level of accuracy and precision for science? And what sort of things should a true science be able to predict? To use one of Piantadosi’s examples, individual molecules are the primitives of the kinetic theory of gases, and the theory makes precise predictions about the behaviour of a gas—i.e., gas molecules in aggregate—but it is highly doubtful that it would make predictions about the actual motion of a particular molecule in any situation. Surely, this would be too much to ask of any theory of physics, yet Piantadosi seems to believe it is within the realm of scientific inquiry.
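To make the aggregate-vs-individual point concrete, the sort of prediction the kinetic theory licenses looks like the ideal gas law, stated here purely as an illustration:

$$PV = N k_B T$$

Here P, V, and T are the pressure, volume, and temperature of the gas as a whole, N is the number of molecules, and k_B is Boltzmann’s constant. Nothing in the equation picks out the trajectory of any particular O2 molecule; the theory quantifies over the ensemble, not the individual.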

There’s also a question of what it means to “predict” something. Piantadosi’s argument boils down to “MLMs are better than Chomsky’s approach, because they make more correct predictions,” yet nowhere does he explicitly say what those predictions are, nor does he document any tests of those predictions. Instead, we are treated to his prompts to a chatbot followed by the chatbot’s response. Perhaps these are the predictions. Perhaps they predict how a human would respond to such prompts. If so, then so much the worse for MLMs qua scientific theories because, even if MLMs were indistinguishable from humans, the odds of any two humans answering a single question the same way are vanishingly slim, and any way of determining a general similarity between utterances would almost certainly be either arbitrary or dependent on some theoretical framework. At best, MLMs simulate human language use, meaning they no more predict facts of language than a compass predicts facts of geometry.

Chomsky’s approach to theories of language, on the other hand, makes clear predictions if one bothers to engage with it. The predictions are of the form “Given theoretical statement T, a competent speaker of language L will judge expression S as (un)acceptable in context C.” This is exactly the sort of prediction that one finds in other sciences—”if one performs precisely this action under precisely these conditions, one will observe precisely this reaction”—and the sort of prediction that is absent in Piantadosi’s paper.

Indeed, these predictions seem to be absent in the entire contemporary “AI” discourse, and with good reason—”AI” is not a scientific enterprise. It’s an engineering project, a fact that is immediately obvious when one considers how it measures success: against a battery of predetermined, arbitrary tests. MLM researchers, then, aren’t discovering truths; they’re building tools to spec, like good engineers.

This is not to cast aspersions on engineers, but it does raise a question—the core question: How exactly can an engineering project like MLMs refute a scientific theory like Generative Grammar?

Notes

1 This should not be taken as an endorsement of the efficient markets hypothesis—or any part of (neo)classical economics—as correct. A theory’s scientific-ness is no guarantee of its correctness.
2 This is a bold claim for Piantadosi to make given that he is a psychologist, while Lucie Baumont—the latter half of Rawski & Baumont—is an empirical astrophysicist.

Piantadosi and MLMs again (I)

Last spring, Steven Piantadosi, professor of psychology and neuroscience, posted a paean to Modern Language Models (MLMs) entitled Modern language models refute Chomsky’s approach to language on LingBuzz. This triggered a wave of responses from linguists, including one from myself, pointing out the many ways that he was wrong. Recently, Prof. Piantadosi attached a postscript to his paper in which he responds to his critics. The responses are so shockingly bad, I felt I had to respond—at least to those that stem from my critiques—which I will do, spaced out across a few short posts.

In my critique, I brought up the problem of impossible languages, as did Moro et al. in their response. In addressing this critique, Prof. Piantadosi surprisingly begins with a brief diatribe against “poverty of the stimulus.” I say surprisingly, not because it’s surprising for an empiricist to mockingly invoke “poverty of stimulus” much in the same way as creationists mockingly ask why there are still apes if we evolved from them, but because poverty of stimulus is completely irrelevant to the problem of impossible languages and neither I nor Moro et al. even use the phrase “poverty of stimulus.”[1]

This irrelevancy expressed, Prof. Piantadosi moves on to a more on-point discussion. He argues that it would be wrong-headed to encode the constraints that make some languages impossible into our model from the start. Rather, if we start with an unconstrained model, we can discover the constraints naturally:

If you try to take constraints into account too early, you might have a harder time discovering the key pieces and dynamics, and could create a worse overall solution. For language specifically, what needs to be built in innately to explain the typology will interact in rich and complex ways with what can be learned, and what other pressures (e.g. communicative, social) shape the form of language. If we see a pattern and assume it is innate from the start, we may never discover these other forces because we will, mistakenly, think innateness explained everything

p36 (v6)

This makes a certain intuitive sense. The problem is that it’s refuted both by the history of generative syntax and the history of science more broadly.

In early theories, a constraint like “No mirroring transformations!” would have to be stated explicitly. Current theories, though, are much simpler, with most constraints being derivable from the theory rather than tacked onto it.

A digression on scholarly responsibility: Your average engineer working on MLMs could be forgiven for not being up on the latest theories in generative syntax, but Piantadosi is an Associate Professor who has chosen to write a critique of generative syntax, so he really ought to know these things. In fact, he could only fail to know these things through a conscious choice not to know, or through laziness.

Furthermore, the natural sciences have progressed thus far in precisely the opposite direction from the one Piantadosi prescribes—they have started with highly constrained theories, and progress has generally occurred when some constraint is questioned. Copernicus questioned the constraint that the Earth stood still, Newton questioned the constraint that all action was local, and Friedrich Wöhler questioned the constraint that organic and inorganic substances were inherently distinct.

None of this, of course, means that we couldn’t do science in the way that Piantadosi suggests—I think Feyerabend was correct that there is no singular Scientific Method—but the proof of the pudding is in the eating. Piantadosi is effectively making a promise that if we let MLM research run its course we will find new insights[2] that we could not find had we stuck with the old direction of scientific progress, and he may be right—just as AGI may actually be 5 years away this time—but I’ll believe it when I see it.


After expressing his methodological objections to considering impossible languages, Piantadosi expresses skepticism as to the existence of impossible languages, stating “More troubling, the idea of ‘impossible languages’ has never actually been empirically justified” (p37, v6). This is a truly astounding assertion on his part considering that both Moro et al. and I explicitly cite experimental studies that arguably provide exactly the empirical justification that Piantadosi claims does not exist. The studies cited present participants with two types of made-up languages—one which follows and one which violates the rules of language as theorized by generative syntax—and observe their responses as they try to learn the rules of the particular languages. The study I cite (Smith and Tsimpli 1995) compares the behavioural responses of a linguistic savant to those of neurotypical participants, while the studies cited by Moro et al. (Tettamanti et al., 2002; Musso et al., 2003) use neuro-imaging techniques. Instead, Prof. Piantadosi refers to every empiricist’s favourite straw-man argument—the alleged lack of embedding structures in Pirahã.

This bears repeating. Both Moro et al. and I expressly point to experimental evidence of impossible languages, and Piantadosi’s response is that no one has ever provided evidence of impossible languages.

So, either Prof. Piantadosi commented on my and Moro et al.’s critiques without reading them, or he read them and deliberately misrepresented them. It is difficult to see how this could be the result of laziness or even willful ignorance rather than dishonesty.

I’ll leave off here, and return to some of Prof. Piantadosi’s responses to my critiques at a later time.

Notes

1 For my part, I didn’t mention it because empiricists are generally quite assiduous in their refusal to understand poverty of stimulus arguments.
2 He seems to contradict himself later on when he asserts that the “science” of MLMs may never be intelligible to humans. More on this in a later post.

The Descriptivist Fallacy

A recent hobby-horse of mine—borrowed from Norbert Hornstein—is the idea that the vast majority of what is called “theoretical generative syntax” is not theoretical, but descriptive. The usual response when I assert this seems to be bafflement, but I recently got a different response—one that I wasn’t able to respond to in the moment, so I’m using this post to sort out my thoughts.

The context of this response was that I had hyperbolically expressed anger at the title of one of the special sessions at the upcoming NELS conference—”Experimental Methods In Theoretical Linguistics.” My anger—more accurately described as irritation—was that, since experiment and theory are complementary terms in science, the title of the session was contradictory unless the NELS organizers were misusing the terms. My point, of course, was that the organizers of NELS—one of the most prestigious conferences in the field of generative linguistics—were misusing the terms because the field as a whole has taken to misusing the terms. A colleague, however, objected, saying that generative linguists were a speech community and that it was impossible for a speech community to systematically misuse words of its own language. My colleague was, in effect, accusing me of the worst offense in linguistics—prescriptivism.

This was a jarring rebuttal because, on the one hand, they aren’t wrong: I was being prescriptive. But, on the other hand, and contrary to the first thing students are taught about linguistics, a prescriptive approach to language is not always bad. To see this, let’s consider the two basic rationales for descriptivism as an ethos.

The first rationale is purely practical—if we linguists want to understand the facts of language, we must approach them as they are, not as we think they should be. This is nothing more than standard scientific practice.

The second rationale is a moral one, stemming from the observation that language prescription tends to be directed at groups that lack power in society—Black English has historically been treated as “broken”, features of young women’s speech (“up-talk” in the 90s and “vocal fry” in the 2010s) are always policed, rural dialects are mocked. Thus, prescriptivism is seen as a type of oppressive action. Many linguists make it no further in thinking about prescriptivism, unfortunately, but there are many cases in which prescriptivism is not oppressive. Some good instances of prescriptivism—assuming they are done in good faith—are as follows:

  1. criticizing the use of obfuscatory phrases like “officer-involved shooting” by mainstream media
  2. calling out racist and antisemitic dog-whistling by political actors
  3. discouraging the use of slurs
  4. encouraging inclusive language
  5. recommending that a writer avoid ambiguity
  6. asking an actor to speak up

Examples 1 and 2 are obviously non-oppressive uses of prescriptivism, as they are directed at powerful actors; 3 and 4 can be acceptable even if not directed at a powerful person, because they attempt to address another oppressive act; and 5 and 6 are useful prescriptions, as they help the addressee to perform their task at hand more effectively.

Now, I’m not going to try to convince you that the field of generative syntax is some powerful institution, nor that the definition of “theory” is an issue of social justice. Here my colleague was correct—members of the field are free to use their terminology as they see fit. My prescription is of the third variety—a helpful suggestion from a member of the field who wants it to advance. So, while my prescription may be wrong, I’m not wrong to offer it.

Using anti-prescriptivism as a defense against critique is not surprising—I’m sure I’ve had that reaction to editorial suggestions on my work. In fact, I’d say it’s a species of a phenomenon common among folks who care about social justice: mistaking a formal transgression for a violation of an underlying principle. In this case the formal act of prescription occurred, but without any violation of the principle of anti-oppression.

How do we get good at using language?

Or: What the hell is a figure of speech anyway?

At a certain level I have the same level of English competence as Katie Crutchfield, Josh Gondelman, and Alexandria Ocasio-Cortez. This may seem boastful of me to a delusional degree, but we’re all native speakers of a North American variety of English, of a similar age, and this is the level of competence that linguists tend to care about. Indeed, according to our best theories of language, the four of us are practically indistinguishable.

Of course, outside of providing grammaticality judgements, I wouldn’t place myself anywhere near those three, each of whom could easily be counted among the most skilled users of English living. But what does it mean for people to have varied levels of skill in their language use? And is this even something that linguistic theory should be concerned about?

Linguists, of course, have settled on 5 broad levels of description of a given language:

  1. Phonetics
  2. Phonology
  3. Morphology
  4. Syntax
  5. Semantics

It seems quite reasonable to say we can break down language skill along these lines. So, skilled speakers can achieve a desired effect by manipulating their phonetics, say by raising their voices, hitting certain sounds in a particular way, or the like. Likewise, phonological theory can provide decent analyses of rhyme, alliteration, rhythm, etc. Skilled users of a language also know when to use (morphologically) simple vs complex words, and which word best conveys the meaning they intend. Maybe a phonetician, phonologist, morphologist, or semanticist will disagree, but these seem fairly straightforward to formalize, because they all involve choosing from among a finite set of possibilities—a language only has so many lexical entries to choose from. What does skill mean in the infinite realm of syntax? What does it mean to choose the correct figure of speech? Or even more basically, how does one express any figure of speech in the terms of syntactic theory?

It’s not immediately obvious that there is any way to answer these questions in a generative theory for the simple reason that figures of speech are global properties of expressions, while grammatical theory deals in local interactions between parts of expressions. Take an example from Abraham Lincoln’s second inaugural address:

(1) Fondly do we hope—fervently do we pray—that this mighty scourge of war may speedily pass away.

There are three syntactic processes employed by Lincoln here that I can point out:

(2) Right Node Raising
Fondly do we hope that this mighty scourge of war may speedily pass away, and fervently do we pray that this mighty scourge of war may speedily pass away. -> (1)

(3) Subject-Aux Inversion
Fondly we hope … -> (1)

(4) Adverb fronting
We hope fondly… -> (1)

Each of these represents a choice—conscious or otherwise—that Lincoln made in writing his speech and, while most generative theories allow for choices to be made, the choices they allow are not all at the same level.

Minimalist theories, for instance, allow for choices at each stage of sentence construction—you can either move a constituent, add a constituent, or stop the derivation. Each of (3) and (4) could conceivably be represented as a single choice, but it seems highly unlikely that (2) could. In fact, there is nothing approaching a consensus as to how right node raising is achieved, but it is almost certainly a complex phenomenon. It’s not as if we have a singular operation RNR(X) which changes a mundane sentence into something like (1), yet Lincoln and other writers and orators seem to have it as a tool in their rhetorical toolboxes.

Rhetorical skill of this kind suggests the possibility of a meta-grammatical knowledge, which all speakers of a language have to some extent, and which highly skilled users have in abundance. But what could this meta-grammatical knowledge consist of? Well, if the theoretical representation of a sentence is a derivation, then the theoretical representation of a figure of speech would be a class of derivations. This suggests an ability to abstract over derivations in some way, and therefore it suggests that we are able to acquire not just lexical items, but also abstractions of derivations.

This may seem to contradict the basic idea of Minimalism by suggesting two grammatical systems, and indeed it might be a good career move on my part to declare that the fact of figures of speech disproves the SMT, but I don’t see any inherent contradiction here. In fact, what I’m suggesting here, and have argued for elsewhere, is a fairly basic observation from computer science and mathematical logic—that the distinction between operations and operands is not that sharp. I am merely suggesting that part of a mature linguistic knowledge is higher-order grammatical functions—functions that operate on other functions and/or yield other functions—and that, since any recursive system is probably able to represent higher-order functions, we should absolutely expect our grammars to allow for them.
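To make the operations-as-operands point a little more concrete, here is a minimal sketch in Python. Everything in it is invented for illustration (the operation labels, the idea of a derivation as a mere list of labels); it is meant only to show what a higher-order function over derivations could look like, not to implement any actual grammatical theory:

```python
from typing import Callable, List

# A toy "derivation": just an ordered list of operation labels. The labels
# (merge, front_adverb, invert_subj_aux) are purely illustrative and are not
# drawn from any actual Minimalist formalism.
Derivation = List[str]

def front_adverb(d: Derivation) -> Derivation:
    """A first-order step: extend a derivation with adverb fronting."""
    return d + ["front_adverb"]

def invert_subj_aux(d: Derivation) -> Derivation:
    """A first-order step: extend a derivation with subject-aux inversion."""
    return d + ["invert_subj_aux"]

# A "figure of speech", on this view, is not one more step but a function from
# derivation-builders to derivation-builders: it abstracts over a whole class
# of derivations rather than over lexical items.
def emphatic_fronting(step: Callable[[Derivation], Derivation]) -> Callable[[Derivation], Derivation]:
    def rhetorical_step(d: Derivation) -> Derivation:
        return invert_subj_aux(step(d))
    return rhetorical_step

plain = ["merge:we+hope"]
styled = emphatic_fronting(front_adverb)(plain)
print(styled)  # ['merge:we+hope', 'front_adverb', 'invert_subj_aux']
```

The point of the sketch is just that once operations can themselves be arguments and return values, acquiring a figure of speech looks like acquiring one of these higher-order objects rather than another lexical item.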

Assuming this sort of abstraction is available and responsible for figures of speech, our task as theorists then is to figure out what form the abstraction takes, and how it is acquired, so I can stop comparing myself to Katie Crutchfield, Josh Gondelman, and AOC.

De re/De dicto ambiguities and the class struggle

If you follow the news in Ontario, you likely heard that our education workers are demanding an 11.7% wage raise in the current round of bargaining with the provincial government. If, however, you are more actively engaged with this particular story—i.e., you read past the headline, or you read the union’s summary of bargaining proposals—you may have discovered that, actually, the education workers are demanding a flat annual $3.25/hr increase across the board. On the surface, these seem to be two wildly different assertions that can’t both be true. One side must be lying! Strictly speaking, though, neither side is lying, but one side is definitely misinforming.

Consider a version of the headline (1) that supports the government’s line.

(1) Union wants 11.7% raise for Ontario education workers in bargaining proposal.

This sentence is ambiguous. More specifically, it shows a de re/de dicto ambiguity. The classic example of such an ambiguity is in (2).

(2) Alex wants to marry a millionaire.

There is one way of interpreting this in which Alex wants to get married and one of his criteria for a spouse is that they be a millionaire. This is the de dicto (lit. “of what is said”) interpretation of (2). The other way of interpreting it is that Alex is deeply in love with a particular person and wants to marry them. It just so happens that Alex’s prospective spouse is a millionaire—a fact which Alex may or may not know. This is the de re (lit. “of the thing”) interpretation of (2). Notice how (2) can describe wildly different realities—for instance, Alex can despise millionaires as a class, but unknowingly want to marry a millionaire.
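For readers who like the scope difference spelled out, a rough, textbook-style way to represent the two readings treats “want” as an intensional operator that either outscopes or is outscoped by the existential (this is only a sketch; a full intensional semantics would be more careful about worlds and times):

$$\text{de dicto:}\quad \text{want}(\text{alex},\ \exists x\,[\text{millionaire}(x) \wedge \text{marry}(\text{alex}, x)])$$

$$\text{de re:}\quad \exists x\,[\text{millionaire}(x) \wedge \text{want}(\text{alex},\ \text{marry}(\text{alex}, x))]$$

On the de dicto reading, “a millionaire” is part of the content of the desire; on the de re reading, a particular individual is picked out first and the desire is about that individual.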

Turning back to our headline in (1), what are the different readings? The de dicto interpretation is one in which the union representatives sit down at the bargaining table and say something like “We demand an 11.7% raise”. The de re interpretation is one in which the union representatives demand, say, a flat raise that happens to come out to an 11.7% raise for those workers with the lowest wages when you do the math. The de re interpretation is compatible with the assertions made by the union, so it’s probably the accurate interpretation.

So, (1) is, strictly speaking, not false under one interpretation. It is misinformation, though, because it deliberately introduces a substantive ambiguity in a way that the alternative headline in (3) does not.

(3) Union wants $3.25/hr raise for Ontario education workers in bargaining proposal

Of course (3) has the de re/de dicto ambiguity—all expressions of desire do—but both interpretations would accurately describe the actual situation. Someone reading the headline (3) would be properly informed regardless of how they interpreted it, while (1) leads some readers to believe a falsehood.

What’s more, I think it’s reasonable to call the headline in (1) deliberate misinformation.

The simplest way to report the union’s bargaining positions would be to simply report them—copy and paste from their official summary. To report the percentage increase as they did, someone had to do the arithmetic to convert absolute terms to relative terms—a simple step, but an extra step nonetheless. Furthermore, to report a single percentage increase, they had to look only at one segment of education workers—the lowest-paid segment. Had they done the calculation on all education workers, they would have come up with a range of percentages, because $3.25 is 11.7% of $27.78, but 8.78% of $37.03, and so on. So, misinforming the public by publishing (1) instead of (3) involved at least two deliberate choices.

It’s worth asking why misinform in this way. A $3.25/hr raise is still substantial and the government could still argue that it’s too high, so why misinform? One reason is that it puts workers in the position of explaining that the figure is not a bald-faced lie but is nevertheless misleading, making us seem like pedants. But I think there’s another reason for the government to push the 11.7% figure: it plays into and furthers an anti-union trope that we’re all familiar with.

Bosses always paint organized labour as lazy, greedy, and corrupt—”Union leaders only care about themselves; only we bosses care about workers and children.” They especially like to claim that unionized workers, since they enjoy higher wages and better working conditions, don’t care about poor working folks.[1] The $3.25/hr raise demand, however, reveals these tropes as lies.

For various reasons, different jobs, even within a single union, have unequal wages. These inequalities can be used as a wedge to keep workers fighting amongst themselves rather than together against their bosses. Proportional wage increases maintain and entrench those inequalities—if everyone gets a 5% bump, the gap between the top and bottom stays effectively the same. Absolute wage increases, however, shrink those inequalities. Taking the example from above, a $37.03/hr worker makes 1.33x what the $27.78/hr worker makes, but after a $3.25/hr raise for both, the gap narrows slightly to 1.29x, and it continues to narrow with each subsequent flat raise. So, contrary to the common trope, union actions show solidarity rather than greed.[2]
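Here is the back-of-the-envelope arithmetic as a quick sketch in Python, using the wage figures from the example above (the figures are purely illustrative):

```python
# Hourly wages from the example above (illustrative figures only)
low, high = 27.78, 37.03
flat_raise = 3.25

# The same flat raise is a different percentage at different wage levels
print(f"{flat_raise / low:.1%}")   # 11.7% of the lower wage
print(f"{flat_raise / high:.1%}")  # 8.8% of the higher wage

# Flat raises narrow the gap; proportional raises preserve it
print(f"{high / low:.3f}")                                # 1.333 before any raise
print(f"{(high + flat_raise) / (low + flat_raise):.3f}")  # 1.298 after a flat $3.25/hr raise for both
print(f"{(high * 1.05) / (low * 1.05):.3f}")              # 1.333 after a 5% raise for both
```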

So what’s the takeaway here? It’s frankly unreasonable to expect ordinary readers to do a formal semantic analysis of their news, though journalists could stand to be a bit less credulous of claims like (1). My takeaway is that this is just more evidence of my personal maxim that people in positions of power lie and mislead whenever it suits them as long as no one questions them. Also, maybe J-schools should have required Linguistics training.

Notes

1 Indeed there are cases in which some union bosses have pursued gains for themselves at the expense of other workers—e.g., construction unions endorsing the intensely anti-worker Ontario PC Party because they love building pointless highways and sprawling suburbs.
2 Similar remarks can be made about job actions, which are often taken as proof that workers are inherently lazy. On the contrary, strikes are physically and emotionally grueling and rarely taken on lightly.

Some good news on the publication front

Today I woke up to an email from the editor of Biolinguistics informing me that my manuscript “A parallel derivation theory of adjuncts” had been accepted for publication. I was quite relieved, especially since I had been expecting some news about my submission for a couple of days—the ability to monitor the progress of submissions on a journal’s website is a decidedly mixed blessing—and there was a definite possibility in my mind that it could have been rejected.

It was also a relief because it’s been a long road with this paper. I first wrote about the kernel of its central idea—that syntactic adjuncts were entirely separate objects from their “hosts”—in my thesis, and I presented it a few times within the University of Toronto Linguistics Department. I first realized that it had some legs when it was accepted as a talk at the 2020 LSA Meeting in New Orleans, and I started working on it in earnest in the spring and summer of 2020, submitting the first manuscript version to a different journal in August 2020.

If you follow me on Twitter, you saw my reactions to the peer-review process in real time, but it’s worth summarizing. Versions of this manuscript underwent peer review at multiple journals, and in every case there were one or two constructive reviews—some positive, and some negative but nonetheless pointing out serious yet fixable issues—but invariably there was also one reviewer who was clearly hostile to the manuscript, often resorting to sarcasm and vague comments.

I’m sure the manuscript improved over the various submissions, but I believe that the main reason that the paper will finally be published is because the editor of Biolinguistics, Kleanthes Grohmann, recognized and agreed with me that one of the reviewers was being unreasonable, so I definitely owe him my gratitude.

There are more edits to go, but you can look forward to seeing my paper in Biolinguistics in the near future.

Why are some ideas so sticky? A hypothesis

Anyone who has tried to articulate a new idea or criticize old ones may have noticed that some ideas are washed away relatively easily, while others seem to actively resist even the strongest challenges—some ideas are stickier than others. In some cases, there’s an obvious reason for this stickiness—in some cases there’s even a good reason for it. Some ideas are sticky because they’ve never really been interrogated. Some are sticky because there are powerful parts of society that depend on them. Some are sticky because they’re true, or close to true. But I’ve started to think there’s another reason an idea can be sticky—the amount of mental effort people put into understanding the idea as students.

Take, for instance, X-bar theory. I don’t think there’s some powerful cabal propping it up, it’s not old enough to just be taken for granted, and Chomsky’s Problems of Projection papers showed that it was not really tenable. Yet X-bar persists, not just in how syntacticians draw trees or in how they informally talk about them. I remember that commentary on my definition of minimal search here involved puzzlement about why I didn’t simply formalize the idea that specifiers were invisible to search, followed by more puzzlement when I explained that the notion of specifier was unformulable.

In my experience, the stickiness of X-bar theory—and syntactic projection/labels more broadly—doesn’t manifest itself in attempts to rebut arguments against it, but in attempts to save it—to reconstitute it in a theory that doesn’t include it.[1] This is very strange behaviour—X-bar is a theoretical construct; it’s valid insofar as it is coherent and empirically useful. Why are syntacticians fighting for it? I wondered about this for a while and then I remembered my experience learning X-bar and teaching it—it’s a real challenge. It’s probably the first challenging theoretical construct that syntax students are exposed to. It tends to be presented as a fait accompli, so students just have to learn how it functions. As a result, those students who do manage to figure it out are proud of it and defend it like someone protecting their cherished possessions.[2]

Of course, it’s a bit dangerous to speculate about the psychological motivations of others, but I’m certain I’ve had this reaction in the past when someone’s challenged an idea that I at one point struggled to learn. And I’ve heard students complain about the fact that every successive level of learning syntax starts with “everything you learned last year is wrong”—or at least that’s the sense they get. So, I have a feeling there’s at least a kernel of truth to my hypothesis. Now, how do I go about testing it?


Addendum

As I was writing this, I remembered something I frequently think when I’m preparing tests and exams that I’ve thus far only formulated as a somewhat snarky question:

How much of our current linguistic theory depends on how well it lends itself to constructing problem sets and exam questions?

Notes

1 My reading of Zeijstra’s chapter in this volume is as one such attempt.
2 I think I may be describing “effort justification,” but I’m basing this just on the Wikipedia article.

Some idle thoughts on the arguments for semantic externalism/internalism

This semester I’m teaching an intro semantics course for the first time and I decided to use Saeed’s Semantics as a textbook. It seems like a good textbook; it gives a good survey of all the modern approaches to semantics—internalist, externalist, even so-called cognitive semantics—though the externalist bias is clear if you know what to look for. For instance, the text is quick to bring up the famous externalist thought experiments—Putnam’s robotic cats, Quine’s gavagai, etc.—to undercut the internalist approaches, but doesn’t really seem to present the internalist critiques and counterarguments. So, I’ve been striving to correct that in my lectures.

While I was preparing my most recent lecture, something struck me. More precisely, I was suddenly able to put words to something that’s bothered me for a while about the whole debate: The externalist case is strongest for natural kinds, but the internalist case is strongest for human concepts. Putnam talks about cats and water, Kripke talks about tigers and gold, while Katz talks about bachelors and sometimes artifacts. This is not to say that the arguments on either side are unanswerable—Chomsky, I think, has provided pretty good arguments that, even for natural kinds, our internal concepts are quite complicated, and there are many thorny issues for internalist approaches too—but the two sides do have slightly different empirical bases, which no doubt inform their approaches—if your theory can handle artifact concepts really well, you might be tempted to treat everything that way.

I don’t quite know what to make of this observation yet, but I wanted to write it down before I forgot about it.


There’s also a potential, but maybe half-baked, political implication to this observation. Natural kinds are more or less constant in that, while they can be tamed and used by humans, we can’t really change them that much, and thinking that you can, say, turn lead into gold would mark you as a bit of a crackpot. Artifacts and social relations, on the other hand, are literally created by free human action. If you view the world with natural kinds at the center, you may be led to the view that the world has its own immutable laws that we can maybe harness, maybe adapt to, but never change.

If, on the other hand, your theory centers artifacts and social relations, then you might be led to the conclusion, as expressed by the late David Graeber, that “the ultimate hidden truth of the world is that it is something we make and could just as easily make differently.”

But, of course, I’m just speculating here.

Unmoored theory

I’ve written before about the dichotomy of descriptive vs theoretical sciences, but I’ve recently noticed another apparent dichotomy within theoretical sciences—expansionary vs focusing sciences. Expansionary sciences are those whose domain tends to expand—(neo)classical economics seems to claim all human interaction in its domain; formal semantics now covers pragmatics, hand gestures, and monkey communication—while focusing sciences tend to have a rather constant domain or even a shrinking one—chemistry today is about pretty much the same things as it was in the 17th century; generative syntactic theory is still about the language faculty. Assuming this is true,[1] the question is whether it reflects some underlying difference between these sciences. I’d like to argue that the distinction follows from how firm a science’s foundations are, and in particular from what I’ll call its empirical conjecture.

Every scientific theory, I think, basically takes the form of a conjoined sentence: “There are these things/phenomena in the world and they act like this.” The second conjunct is the formal system that gives a theory its deductive power. The first conjunct is the empirical conjecture, and it turns the deductions of the formal system into predictions. While every science that progresses does so by positing new sorts of invisible entities, categories, etc., they all start with more or less familiar entities, categories, etc.—planets, metals, persons, etc. This link to the familiar is the empirical foundation of a science. Sciences with a firm foundation are those whose empirical conjecture can be uncontroversially explained to a lay person or even an expert critic operating in good faith.

Contemporaries of, say, Robert Boyle might have thought the notion of corpuscles insanity, but they wouldn’t disagree that matter exists, exists in different forms, and that some of those forms interact in regular ways. Even the fiercest critic of UG, provided they are acting in good faith, would acknowledge that humans have a capacity for language and that that capacity probably has to do with our brains.

The same, I think, cannot be said about (neo)classical economics or formal semantics.[2] Classical economics starts with the conjecture that there are these members of the species homo economicus—the perfectly rational, self-interested, utility-maximizing agent—and derives theorems from there. This is obviously a bad characterization of humans. It is simultaneously too dim a view of humans—we behave altruistically and non-individualistically all the time—and one that gives us far too much credit—we are far from perfectly rational. Formal semantics, on the other hand, starts with the conjecture that meaning is reference—that words have meaning only insofar as they refer to things in the world. While not as obviously false as the homo economicus conjecture, the referentialist conjecture is still false—most words, upon close inspection, do not refer,[3] and there is a whole universe of meaning that has little to do with reference.

Most economists and semanticists would no doubt object to what the previous paragraph says about their discipline, and the objections would take one of two forms. Either they would defend homo economicus/referentialism, or they would downplay the importance of the conjecture in question—“Homo economicus is just a useful teaching tool for undergrads. No one takes it seriously anymore!”[4] “Semanticists don’t mean reference literally, we use model theory!”—and it’s this sort of response that I think can explain the expansionary behaviour of these disciplines. Suppose we take these objections to be honest expressions of what people in the field believe—that economics isn’t about homo economicus and formal semantics isn’t about reference. Well then, what are they about? The rise of behavioural economics suggests that economists are still looking for a replacement model of human agency, and model theory is basically just reference delayed.

The theories, then, seem to be about nothing at all—or at least nothing that exists in the real world—and as a result, they can be about anything at all—they are unmoored.

Furthermore, there’s an incentive to expand your domain when possible. A theory of nothing obviously can’t be justified by giving any sort of deep explanation of any one aspect of nature, so it has to be justified by appearing to offer explanations to a breadth of topics. Neoclassical economics can’t seem to predict when a bubble will burst, or what will cause inflation, but it can give what looks like insight into family structures. Formal semantics can’t explain why “That pixel is red and green.” is contradictory, but it provides a formal language to translate pragmatics into.

There’s a link here to my past post about falsification, because just as a theory about nothing can be a theory about anything, a theory about nothing cannot be false. So, watch out—if your empirical domain seems to be expanding, you might not be doing science any more.

Notes

1 It’s pretty much a tautology that a science’s domain will either grow, stay constant, or shrink over time.
2 Now obviously, there’s a big difference between the two fields—neoclassical economics is extremely useful to the rich and powerful since it lets them justify just about any horrendous crime they would want to commit in the name of expanding their wealth and power, while formal semantics is a subdiscipline of a minor oddball discipline on the boundaries of humanities, social science, and cognitive science. But I’m a linguist, and I think mostly linguists read this.
3 I could point you to my own writing on this, the works of Jerrold Katz, and arguments from Noam Chomsky on referentialism, or I could point out that one of the godfathers of referentialism, Ludwig Wittgenstein, seems to have repudiated it in his later work.
4 Though, as the late David Graeber pointed out, economists never object when homo economicus is discussed in a positive light.