How do we get good at using language?

Or: What the hell is a figure of speech anyway?

At a certain level, I have the same English competence as Katie Crutchfield, Josh Gondelman, and Alexandria Ocasio-Cortez. This may seem boastful of me to a delusional degree, but we’re all native speakers of a North American variety of English, we’re all of a similar age, and this is the level of competence that linguists tend to care about. Indeed, according to our best theories of language, the four of us are practically indistinguishable.

Of course, outside of providing grammaticality judgements, I wouldn’t place myself anywhere near those three, each of whom could easily be counted among the most skilled users of English living. But what does it mean for people to have varied levels of skill in their language use? And is this even something that linguistic theory should be concerned about?

Linguists, of course, have settled on five broad levels of description for a given language:

  1. Phonetics
  2. Phonology
  3. Morphology
  4. Syntax
  5. Semantics

It seems quite reasonable to say we can break down language skill along these lines. So, skilled speakers can achieve a desired effect by manipulating their phonetics, say by raising their voices, hitting certain sounds in a particular way, or the like. Likewise, phonological theory can provide decent analyses of rhyme, alliteration, rhythm, etc. Skilled users of a language also know when to use (morphologically) simple vs complex words, and which word best conveys the meaning they intend. Maybe a phonetician, phonologist, morphologist, or semanticist will disagree, but these seem fairly straightforward to formalize, because they all involve choosing from among a finite set of possibilities—a language only has so many lexical entries to choose from. What does skill mean in the infinite realm of syntax? What does it mean to choose the correct figure of speech? Or, even more basically, how does one express any figure of speech in the terms of syntactic theory?

It’s not immediately obvious that there is any way to answer these questions in a generative theory for the simple reason that figures of speech are global properties of expressions, while grammatical theory deals in local interactions between parts of expressions. Take an example from Abraham Lincoln’s second inaugural address:

(1) Fondly do we hope—fervently do we pray—that this mighty scourge of war may speedily pass away.

There are three syntactic processes employed by Lincoln here that I can point out:

(2) Right Node Raising
Fondly do we hope that this mighty scourge of war may speedily pass away, and fervently do we pray that this mighty scourge of war may speedily pass away. -> (1)

(3) Subject-Aux Inversion
Fondly we hope … -> (1)

(4) Adverb fronting
We hope fondly… -> (1)

Each of these represents a choice—conscious or otherwise—that Lincoln made in writing his speech, and, while most generative theories allow for choices to be made, those choices are not all represented at the same level.

Minimalist theories, for instance, allow for choices at each stage of sentence construction—you can either move a constituent, add a constituent, or stop the derivation. Each of (3) and (4) could conceivably be represented as a single choice, but it seems highly unlikely that (2) could. In fact, there is nothing approaching a consensus as to how right node raising is achieved, but it is almost certainly a complex phenomenon. It’s not as if we have a singular operation RNR(X) which changes a mundane sentence into something like (1), yet Lincoln and other writers and orators seem to have it as a tool in their rhetorical toolboxes.

Rhetorical skill of this kind suggests the possibility of a meta-grammatical knowledge, which all speakers of a language have to some extent, and which highly skilled users have in abundance. But what could this meta-grammatical knowledge consist of? Well, if the theoretical representation of a sentence is a derivation, then the theoretical representation of a figure of speech would be a class of derivations. This suggests an ability to abstract over derivations in some way, and therefore that we are able to acquire not just lexical items, but also abstractions over derivations.

This may seem to contradict the basic idea of Minimalism by suggesting two grammatical systems, and indeed, it might be a good career move on my part to declare that the fact of figures of speech disproves the SMT, but I don’t see any contradiction inherent here. In fact, what I’m suggesting here, and have argued for elsewhere, is a fairly basic observation from computer science and mathematical logic—that the distinction between operations and operands is not that sharp. I am merely suggesting that part of a mature linguistic knowledge is higher-order grammatical functions—functions that operate on other functions and/or yield other functions—and that, since any recursive system is probably able to represent higher-order functions, we should absolutely expect our grammars to allow for them.
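To make the idea of higher-order grammatical functions slightly more concrete, here is a toy sketch in Python. Everything in it (the tuple representation of a clause, the operation names, the particular decomposition of the fronting) is invented for illustration; it is a sketch of what abstracting over derivations could look like, not a serious grammar fragment.

    # Derivational operations as functions from clauses to clauses. The tuple is a
    # deliberately crude stand-in for a real syntactic object.
    def adverb_fronting(clause):
        subj, verb, adv = clause           # ("we", "hope", "fondly") -> fronted adverb
        return (adv, subj, verb)

    def subj_aux_inversion(clause):
        adv, subj, verb = clause           # ("fondly", "we", "hope") -> insert and invert "do"
        return (adv, "do", subj, verb)

    # A higher-order function: it takes operations and yields a new operation,
    # i.e., an abstraction over derivations.
    def compose(*ops):
        def derived(clause):
            for op in ops:
                clause = op(clause)
            return clause
        return derived

    rhetorical_fronting = compose(adverb_fronting, subj_aux_inversion)
    print(rhetorical_fronting(("we", "hope", "fondly")))
    # ('fondly', 'do', 'we', 'hope')

The point of the sketch is just that compose is itself a grammatical object, an operation built from operations, which is the kind of thing a speaker would need to acquire in order to have a figure of speech in their rhetorical toolbox.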

Assuming this sort of abstraction is available and responsible for figures of speech, our task as theorists then is to figure out what form the abstraction takes, and how it is acquired, so I can stop comparing myself to Katie Crutchfield, Josh Gondelman, and AOC.

De re/De dicto ambiguities and the class struggle

If you follow the news in Ontario, you likely heard that our education workers are demanding an 11.7% wage raise in the current round of bargaining with the provincial government. If, however, you are more actively engaged with this particular story—i.e., you read past the headline, or you read the union’s summary of bargaining proposals—you may have discovered that, actually, the education workers are demanding a flat annual $3.25/hr increase across the board. On the surface, these seem to be two wildly different assertions that can’t both be true. One side must be lying! Strictly speaking, though, neither side is lying, but one side is definitely misinforming.

Consider a version of the headline (1) that supports the government’s line.

(1) Union wants 11.7% raise for Ontario education workers in bargaining proposal.

This sentence is ambiguous. More specifically, it shows a de re/de dicto ambiguity. The classic example of such an ambiguity is in (2).

(2) Alex wants to marry a millionaire.

There is one way of interpreting this in which Alex wants to get married and one of his criteria for a spouse is that they be a millionaire. This is the de dicto (lit. “of what is said”) interpretation of (2). The other way of interpreting it is that Alex is deeply in love with a particular person and wants to marry them. It just so happens that Alex’s prospective spouse is a millionaire—a fact which Alex may or may not know. This is the de re (lit. “of the thing”) interpretation of (2). Notice how (2) can describe wildly different realities—for instance, Alex can despise millionaires as a class, but unknowingly want to marry a millionaire.
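For concreteness, the two readings are often represented as a scope difference: whether the quantifier over millionaires sits inside or outside the desire predicate. In rough notation (my simplification, not any particular textbook’s):

(2a) de dicto: want(alex, ∃x[millionaire(x) ∧ marry(alex, x)])

(2b) de re: ∃x[millionaire(x) ∧ want(alex, marry(alex, x))]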

Turning back to our headline in (1), what are the different readings? The de dicto interpretation is one in which the union representatives sit down at the bargaining table and say something like “We demand an 11.7% raise”. The de re interpretation is one in which the union representatives demanded, say, a flat raise that happens to come out to an 11.7% raise for those workers with the lowest wages when you do the math. The de re interpretation is compatible with the assertions made by the union, so it’s probably the accurate interpretation.

So, (1) is, strictly speaking, not false under one interpretation. It is misinformation, though, because it deliberately introduces a substantive ambiguity in a way that the alternative headline in (3) does not.

(3) Union wants $3.25/hr raise for Ontario education workers in bargaining proposal

Of course (3) has the de re/de dicto ambiguity—all expressions of desire do—but both interpretations would accurately describe the actual situation. Someone reading the headline (3) would be properly informed regardless of how they interpreted it, while (1) leads some readers to believe a falsehood.

What’s more, I think it’s reasonable to call the headline in (1) deliberate misinformation.

The simplest way to report the union’s bargaining position would be to simply report it—copy and paste from their official summary. To report the percentage increase as they did, someone had to do the arithmetic to convert absolute terms to relative terms—a simple step, but an extra step nonetheless. Furthermore, to report a single percentage increase, they had to look at only one segment of education workers—the lowest-paid segment. Had they done the calculation on all education workers, they would have come up with a range of percentages, because $3.25 is 11.7% of $27.78, but only 8.6% of $37.78, and so on. So, misinforming the public by publishing (1) instead of (3) involved at least two deliberate choices.

It’s worth asking why misinform in this way. A $3.25/hr raise is still substantial and the government could still argue that it’s too high, so why misinform? One reason is that it puts workers in the position of explaining that, while it’s not a bald-faced lie, it is misleading, which makes us seem like pedants. But I think there’s another reason for the government to push the 11.7% figure: it plays into and furthers an anti-union trope that we’re all familiar with.

Bosses always paint organized labour as lazy, greedy, and corrupt—“Union leaders only care about themselves; only we bosses care about workers and children.” They especially like to claim that unionized workers, since they enjoy higher wages and better working conditions, don’t care about poor working folks.[1] The $3.25/hr raise demand, however, reveals these tropes as lies.

For various reasons, different jobs, even within a single union, have unequal wages. These inequalities can be used as a wedge to keep workers fighting amongst themselves rather than together against their bosses. Proportional wage increases maintain and entrench those inequalities—if everyone gets a 5% bump, the gap between the top and bottom stays effectively the same. Absolute wage increases, however, shrink those inequalities. Taking the example from above, a $37.78/hr worker makes about 1.36x what the $27.78/hr worker makes, but after a $3.25/hr raise for both, the gap narrows to about 1.32x, and it continues to narrow with each subsequent flat raise. So, contrary to the common trope, union actions show solidarity rather than greed.[2]
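For anyone who wants to check the arithmetic, here’s a quick sketch in Python (the wage figures are just the illustrative ones from above, not actual bargaining-unit data):

    low, high, flat_raise = 27.78, 37.78, 3.25

    # the same flat raise expressed as a percentage of two different wages
    print(round(100 * flat_raise / low, 1))    # 11.7
    print(round(100 * flat_raise / high, 1))   # 8.6

    # ratio of the higher wage to the lower, before and after the flat raise
    print(round(high / low, 2))                                 # 1.36
    print(round((high + flat_raise) / (low + flat_raise), 2))   # 1.32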

So what’s the takeaway here? It’s frankly unreasonable to expect ordinary readers to do a formal semantic analysis of their news, though journalists could stand to be a bit less credulous of claims like (1). My takeaway is that this is just more evidence of my personal maxim that people in positions of power lie and mislead whenever it suits them as long as no one questions them. Also, maybe J-schools should have required Linguistics training.

Notes
1 Indeed, there are cases in which some union bosses have pursued gains for themselves at the expense of other workers—e.g., construction unions endorsing the intensely anti-worker Ontario PC Party because they love building pointless highways and sprawling suburbs.
2 Similar remarks can be made about job actions, which are often taken as proof that workers are inherently lazy. On the contrary, strikes are physically and emotionally grueling and rarely taken on lightly

Why are there no Cartesian products in grammar?

This post, I think, doesn’t rise above the level of “musings.” I think there’s something here, but I’m not sure if I can articulate it properly.

An adequate scientific theory is one in which facts about nature are reflected in facts about the theory. Every entity in the theory should have an analogue in nature, relations in the theory should be found in nature, and simple things in the theory should be ubiquitous in nature. This last concern is at the core of minimalist worries about movement—early theories saw movement as complex and had to explain its ubiquity, while later theories see it as simple and have to explain the constraints on it. But my concern here is not minimalist theories of syntax, but model-theoretic semantics.

Model theories of semantics often use set theory as their formal system,[1] so if they are adequate, then ubiquitous semantic phenomena should be simply expressible in set theory, and simple set-theoretic notions should be ubiquitous in semantics. For the most part this seems to be the case—you can do a lot of semantics with membership, subset, intersection, etc.—but obviously it’s not perfect. One point of mismatch is the notion of the Cartesian product (X × Y = {⟨x, y⟩ | x ∈ X, y ∈ Y}), a very straightforward notion in set theory, but one that does not have a neat analogue in language.

What do I mean by this? Well, consider the set-theoretic statement in (1) and its natural language translation in (2).

(1) P × P ⊆ R

(2) Photographers respect themselves and each other.

What set theory expresses in a simple statement, language does in a compound one. Or consider (3) and (4), which invert the situation.

(3) (P × P) − {⟨p, p⟩ | p ∈ P} ⊆ R

(4) Photographers respect each other.

The natural language expression has gotten simpler at the expense of its set-theoretic translation. This strikes me as a problem.

If natural language semantics is best expressed as set theory (or something similar), why isn’t there a simple bound expression like each-selves with the denotation in (5)?

(5) λX.λY (Y × Y ⊆ X)
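Here’s a toy model in Python, just to make the contrast concrete. The sets and names are invented; it is only a sketch of the statements in (1), (3), and (5), not a claim about how the semantics is actually computed.

    from itertools import product

    P = {"ana", "bo", "cy"}           # the photographers
    R = set(product(P, P))            # "respect": everyone respects everyone

    def each_selves(X, Y):
        # (5): λX.λY.(Y × Y ⊆ X) -- the unattested "each-selves"
        return set(product(Y, Y)) <= X

    def each_other(X, Y):
        # (3): (Y × Y) − {⟨y, y⟩ | y ∈ Y} ⊆ X -- the attested reciprocal
        return {(a, b) for a, b in product(Y, Y) if a != b} <= X

    print(each_selves(R, P))   # True: reflexive and reciprocal respect both hold
    print(each_other(R, P))    # True: would remain True even without self-respect

The simpler set-theoretic condition (each_selves) is the one no language seems to lexicalize, which is exactly the mismatch at issue.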

What’s more, this doesn’t seem to be a quirk of English. When I first noticed this gap, I asked some native speakers of languages other than English—I got data from Spanish, French (Canadian and Metropolitan), Dutch, Italian, Cantonese, Mandarin, Persian, Korean, Japanese, Hungarian, Kurdish, Tagalog, Western Armenian, and Russian[2]—and got fairly consistent results. Occasionally there was ambiguity between plural reflexives and reciprocals—French se, for instance, seemed to be ambiguous—but none of the languages had an each-selves.

My suspicion—i.e. my half-formed hypothesis—is that the “meanings” of reflexives and reciprocals are entirely syntactic. We don’t interpret themselves or each other as expressions of set theory or whatever. Rather, sentences with reflexives and reciprocals are inherently incomplete, and the particular reflexive or reciprocal tells the hearer how to complete them—themselves says “derive a sentence for each member of the subject where that member is also the object”, while each other says “for each member of the subject, derive a set of sentences where each object is one of the other members of the subject.” Setting aside the fact that this proposal is, even to me, mostly nonsense, it still predicts that there should be an each-selves. Perhaps making it sensible would fix this issue, or vice versa. Or maybe it is just nonsense, but plenty of theories started as nonsense.
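Here’s an equally rough sketch of the completion idea described above, again in Python and again with invented names—it is only meant to show what the two “recipes” would generate for a three-member subject:

    subject = ["ana", "bo", "cy"]
    frame = "{x} respects {y}"

    # "themselves": one sentence per subject member, with that member as object
    reflexive = [frame.format(x=m, y=m) for m in subject]

    # "each other": for each subject member, a sentence for each *other* member
    reciprocal = [frame.format(x=m, y=n) for m in subject for n in subject if n != m]

    print(reflexive)    # ['ana respects ana', 'bo respects bo', 'cy respects cy']
    print(reciprocal)   # ['ana respects bo', 'ana respects cy', 'bo respects ana', ...]

On this view, the missing each-selves would just be the union of the two lists, which makes its absence all the more puzzling.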

Notes
1 Yes, I know that there are many other types of model theories put forth
2 I’d be happy to get more data if you have it. You can email me, put it in the comments, or fill out this brief questionnaire.

Some good news on the publication front

Today I woke up to an email from the editor of Biolinguistics informing me that my manuscript “A parallel derivation theory of adjuncts” had been accepted for publication. I was quite relieved, especially since I had been expecting some news about my submission for a couple of days—the ability to monitor the progress of submissions on a journal’s website is a decidedly mixed blessing—and there was a definite possibility in my mind that it could have been rejected.

It was also a relief because it’s been a long road with this paper. I first wrote about the kernel of its central idea—that syntactic adjuncts were entirely separate objects from their “hosts”—in my thesis, and I presented it a few times within the University of Toronto Linguistics Department. I first realized that it had some legs when it was accepted as a talk at the 2020 LSA Meeting in New Orleans, and I started working on it in earnest in the spring and summer of 2020, submitting the first manuscript version to a different journal in August 2020.

If you follow me on Twitter, you saw my reactions to the peer-review process in real time, but it’s worth summarizing. Versions of this manuscript underwent peer review at multiple journals, and in every case there were one or two constructive reviews—some positive, and some negative but nevertheless pointing out serious yet fixable issues—but invariably there was one reviewer who was clearly hostile to the manuscript—their comments were often sarcastic and vague.

I’m sure the manuscript improved over the various submissions, but I believe the main reason the paper will finally be published is that the editor of Biolinguistics, Kleanthes Grohmann, recognized and agreed with me that one of the reviewers was being unreasonable, so I definitely owe him my gratitude.

There are more edits to go, but you can look forward to seeing my paper in Biolinguistics in the near future.

Why are some ideas so sticky? A hypothesis

Anyone who has tried to articulate a new idea or criticize old ones may have noticed that some ideas are washed away relatively easily, while others seem to actively resist even the strongest challenges—some ideas are stickier than others. In some cases, there’s an obvious reason for this stickiness—in some cases there’s even a good reason for it. Some ideas are sticky because they’ve never really been interrogated. Some are sticky because there are powerful parts of society that depend on them. Some are sticky because they’re true, or close to true. But I’ve started to think there’s another reason an idea can be sticky—the amount of mental effort people put into understanding the idea as students.

Take, for instance, X-bar theory. I don’t think there’s some powerful cabal propping it up, it’s not old enough to just be taken for granted, and Chomsky’s Problems of Projection papers showed that it was not really tenable. Yet X-bar persists—not just in how syntacticians draw trees, or how they informally talk about them. I remember that commentary on my definition of minimal search here involved puzzlement about why I didn’t simply formalize the idea that specifiers were invisible to search, followed by more puzzlement when I explained that the notion of specifier was unformulable.

In my experience, the stickiness of X-bar theory—and syntactic projection/labels more broadly—doesn’t manifest itself in attempts to rebut arguments against it, but in attempts to save it—to reconstitute it in a theory that doesn’t include it.[1] This is very strange behaviour—X-bar is a theoretical construct; it’s valid insofar as it is coherent and empirically useful. Why are syntacticians fighting for it? I wondered about this for a while and then I remembered my experience learning X-bar and teaching it—it’s a real challenge. It’s probably the first challenging theoretical construct that syntax students are exposed to. It tends to be presented as a fait accompli, so students just have to learn how it functions. As a result, those students who do manage to figure it out are proud of it and defend it like someone protecting their cherished possessions.[2]

Of course, it’s a bit dangerous to speculate about the psychological motivations of others, but I’m certain I’ve had this reaction in the past when someone’s challenged an idea that I at one point struggled to learn. And I’ve heard students complain about the fact that every successive level of learning syntax starts with “everything you learned last year is wrong”—or at least that’s the sense they get. So, I have a feeling there’s at least a kernel of truth to my hypothesis. Now, how do I go about testing it?


Addendum

As I was writing this, I remembered something I frequently think when I’m preparing tests and exams that I’ve thus far only formulated as a somewhat snarky question:

How much of our current linguistic theory depends on how well it lends itself to constructing problem sets and exam questions?

Notes
1 My reading of Zeijstra’s chapter in this volume is as one such attempt
2 I think I may be describing “effort justification,” but I’m basing this just on the Wikipedia article

Bad omens for generative syntax

In the last few weeks there have been a couple of posts in the generative linguistics blogosphere that don’t bode well for the field.

The first is the sad announcement from Omer Preminger that he is leaving academia in order to live in the same town as his wife. This news is also rather shocking, since Preminger is a fairly prominent syntactician—someone whose work, though I didn’t always agree with it, had to be addressed seriously—and if a scholar of his prominence and ability can’t negotiate something as reasonable as a spousal hire, what hope does anyone else have of having both a life and an academic career? I’m just a sessional lecturer, so treating me like a robot is still the norm, but to hear that faculty members are also expected to be robots is disconcerting, to be sure.

Omer promises more reflections on his time in academia, which I will read with some interest when they come out, but I am sorry to see him leave.

The second concerning report comes from Chris Collins. Collins, it seems, applied to some of the same tenured/tenure-track jobs as me this past year, and got the same boilerplate rejection emails as I did. That a tenured professor is in the same job market as me is not especially surprising, but it should be surprising that no university wanted to hire him, since he not only has a fairly strong empirical program, but has also made important contributions to syntactic theory—while the idea of label-free syntax is commonly attributed to Chomsky (2013; 2015), Chomsky cites Collins for the idea; slightly more recently, Collins’ work with Ed Stabler formalizing minimalist syntax in a few ways predicted Chomsky’s most recent move to workspace-based MERGE and, on a personal note, has been an invaluable resource for my own work.

Collins’ explanation of his unsuccessful applications is twofold and both parts suggest bad trends in generative syntax.

The first explanation is one that I gather is common across all academic fields[1]—department budgets are too tight to hire a senior scholar like Collins when junior candidates are available and cheaper. Collins is probably right on this, but he is unfortunately commenting on the last war. While it’s probably true that junior hires are preferred over senior hires for budgetary reasons, junior tenure-track faculty are not the floor. Why hire an expensive faculty member who you have to provide with an office, a small research budget, and a pension, when you can hire a few precarious adjuncts for cheaper?

As an aside of sorts, I remember having arguments in grad school with my fellow grad students about whether our department should hire tenured faculty away from other departments. The standing wisdom was that that was the trajectory—smaller departments hired junior faculty, and once they’d proved themselves they’d move on to bigger and better places, opening up a spot at their old place. There was a feeling that, sure, there was no growth in faculty positions, but departments were at least going to replace faculty that left or retired. I was skeptical of that line. University administrators had adopted the neoliberal model almost entirely—The Market reigned supreme—and The Market was clear: Linguistics, along with the broader humanities, was useless, so why not take every opportunity to squeeze those useless departments, say, by delaying replacement hires?

All of this is to say that I think Collins has identified a trend, but not a new one. The lower echelons of academia have been enduring this trend for some time now. Perhaps now that it’s reaching the upper echelons, we can see about stopping or reversing it … perhaps.

Collins’ second explanation is that, while he has made valuable contributions in recent years, the field doesn’t appreciate those contributions, and I think he might add the qualifier “yet” to that assessment. Again, I think he’s correct, and he’s identified a trend that I first saw identified by Norbert Hornstein, namely that much of what we call “theoretical syntax” is actually empirical/analytical work. This trend, I think, has morphed first to the point where so-called theoretical syntacticians were puzzled by actual theoretical work, then to the point where they are hostile to it. I suspect Collins has been a victim—though he in no way frames himself as a victim—of this hostility.

So, while there is a decided difference in degree between these two career setbacks, I think they are both part of the same trend, a trend which has been affecting more marginalized and vulnerable parts of academia for some time. The fact that this trend is now directly affecting established generative syntacticians should make the field as a whole take notice. At least I hope it does.

Notes
1 At least those fields that modern capitalism deems useless.

Unmoored theory

I’ve written before about the dichotomy of descriptive vs theoretical sciences, but I’ve recently noticed another apparent dichotomy within theoretical sciences—expansionary vs focusing sciences. Expansionary sciences are those whose domain tends to expand—(neo)classical economics seems to claim all human interaction in its domain; formal semantics now covers pragmatics, hand gestures, and monkey communication—while focusing sciences tend to have a rather constant domain or even a shrinking one—chemistry today is about pretty much the same things as it was in the 17th century; generative syntactic theory is still about the language faculty. Assuming this is true,[1] the question is whether it reflects some underlying difference between these sciences. I’d like to argue that the distinction follows from how firm a science’s foundations are, and in particular from what I’ll call its empirical conjecture.

Every scientific theory, I think, basically takes the form of a conjoined sentence: “There are these things/phenomena in the world and they act like this.” The second conjunct is the formal system that gives a theory its deductive power. The first conjunct is the empirical conjecture, and it turns the deductions of the formal system into predictions. While every science that progresses does so by positing new sorts of invisible entities, categories, etc., they all start with more or less familiar entities, categories, etc.—planets, metals, persons, and so on. This link to the familiar is the empirical foundation of a science. Sciences with a firm foundation are those whose empirical conjecture can be uncontroversially explained to a lay person or even an expert critic operating in good faith.

Contemporaries of, say, Robert Boyle might have thought the notion of corpuscles insanity, but they wouldn’t disagree that matter exists, exists in different forms, and that some of those forms interact in regular ways. Even the fiercest critic of UG, provided they are acting in good faith, would acknowledge that humans have a capacity for language and that that capacity probably has to do with our brains.

The same, I think, cannot be said about (neo)classical economics or formal semantics.[2] Classical economics starts with the conjecture that there are these members of the species homo economicus—the perfectly rational, self-interested, utility-maximizing agent—and derives theorems from there. This is obviously a bad characterization of humans. It is simultaneously too dim a view of humans—we behave altruistically and non-individualistically all the time—and one that gives us far too much credit—we are far from perfectly rational. Formal semantics, on the other hand, starts with the conjecture that meaning is reference—that words have meaning only insofar as they refer to things in the world. While not as obviously false as the homo economicus conjecture, the referentialist conjecture is still false—most words, upon close inspection, do not refer,[3] and there is a whole universe of meaning that has little to do with reference.

Most economists and semanticists would no doubt object to what the previous paragraph says about their discipline, and the objections would take one of two forms. Either they would defend homo economicus/referentialism, or they would downplay the importance of the conjecture in question—“Homo economicus is just a useful teaching tool for undergrads. No one takes it seriously anymore!”[4] “Semanticists don’t mean reference literally, we use model theory!”—and it’s this sort of response that I think can explain the expansionary behaviour of these disciplines. Suppose we take these objections to be honest expressions of what people in the field believe—that economics isn’t about homo economicus and formal semantics isn’t about reference. Well then, what are they about? The rise of behavioural economics suggests that economists are still looking for a replacement model of human agency, and model theory is basically just reference delayed.

The theories, then, seem to be about nothing at all—or at least nothing that exists in the real world—and as a result, they can be about anything at all—they are unmoored.

Furthermore, there’s an incentive to expand your domain when possible. A theory of nothing obviously can’t be justified by giving any sort of deep explanation of any one aspect of nature, so it has to be justified by appearing to offer explanations to a breadth of topics. Neoclassical economics can’t seem to predict when a bubble will burst, or what will cause inflation, but it can give what looks like insight into family structures. Formal semantics can’t explain why “That pixel is red and green.” is contradictory, but it provides a formal language to translate pragmatics into.

There’s a link here to my past post about falsification, because just as a theory about nothing can be a theory about anything, a theory about nothing cannot be false. So, watch out—if your empirical domain seems to be expanding, you might not be doing science any more.

Notes
1 It’s pretty much a tautology that a science’s domain will either grow, stay constant, or shrink over time
2 Now obviously, there’s a big difference between the two fields—neoclassical economics is extremely useful to the rich and powerful since it lets them justify just about any horrendous crimes they would want to commit in the name of expanding their wealth and power, while formal semantics is a subdiscipline of a minor oddball discipline on the boundaries of humanities, social science, and cognitive science. But I’m a linguist, and I think mostly linguists read this.
3 I could point you to my own writing on this, the works of Jerrold Katz, and arguments from Noam Chomsky on referentialism, or I could point out that one of the godfathers of referentialism, Ludwig Wittgenstein, seems to have repudiated it in his later work.
4 Though, as the late David Graeber pointed out, economists never object when homo economicus is discussed in a positive light.

What does falsification look like anyway?

Vulcan vs Neptune

There’s an argument that plays out every so often in linguistics that goes as follows:

Critic: This data falsifies theory T.
Proponent: Not necessarily, if you consider arguments X,Y, and Z.
Critic: Well, then theory T seems to be unfalsifiable!

This is obviously a specious argument on the part of the critic, since unfalsified does not entail unfalsifiable, but I think it stems from a very understandable frustration—theorists often have an uncanny ability to wriggle free of data that appears to falsify their theories, even though falsificationism is assumed by a large majority of linguists. The problem is that the logic of falsificationism, while quite sound, maybe even unimpeachable, turns out to be fiendishly difficult to apply.

At its simplest, the logic of falsificationism says that a theory is scientific insofar as one can construct a basic statement—i.e., a statement of fact—that would contradict the theory. This, of course, is an oversimplification of Karl Popper’s idea of Critical Rationalism in a number of ways. For one, falsifiability is not an absolute notion. Rather, we can compare the relative falsifiability of two theories by looking at what Popper calls their empirical content—the number of basic statements that would contradict them. So if a simple theoretical statement P has a particular empirical content, then the conjunction P & Q will have a greater empirical content, and the disjunction P v Q will have a lesser empirical content. This is a useful heuristic when constructing or criticizing a theory internally, and seems like a straightforward guide to testing theories empirically. Historically, though, this has not been the case, largely because it is often difficult to recognize when we’ve arrived at and accurately formulated a falsifying fact. In fact, it is often, maybe always, the case that we don’t recognize a falsifying fact as such until after one theory has been superseded by another.
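To see why this heuristic holds, note (in notation of my own, not Popper’s) that if F(T) is the set of basic statements that would contradict T, then a statement contradicts P & Q as soon as it contradicts either conjunct, but contradicts P v Q only if it contradicts both:

F(P & Q) = F(P) ∪ F(Q) ⊇ F(P) ⊇ F(P) ∩ F(Q) = F(P v Q)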

Take for instance the case of the respective orbits of Mercury and Uranus. By the 19th century, Newtonian mechanics had allowed astronomers to make very precise predictions about the orbits of the planets, and based on those predictions, there was a problem: two of the planets were misbehaving. First, it was discovered that Uranus—then the outermost known planet—wasn’t showing up where it should have been. Basically, Newton’s mechanics predicted that on such and so day and time Uranus would be in a particular spot in the sky, but the facts were otherwise. Rather than cry “falsification!”, though, the astronomers of the day hypothesized an object on the other side of Uranus that was affecting its orbit. One such astronomer, Urbain Le Verrier, was even able to work backwards and predict where that object could be found. So in September of 1846, armed with Le Verrier’s calculations, Johann Gottfried Galle was able to observe an eighth planet—Neptune. Thus, an apparent falsification became corroboration.

Urbain Le Verrier (1811-1877)
Johann Galle (1812-1910)

I’ve previously written about this story as a vindication of the theory-first approach to science. What I didn’t write about, and what is almost never discussed in this context, is Le Verrier’s work on the misbehaving orbit of Mercury. Again, armed with Newton’s precise mechanics, Le Verrier calculated the Newtonian prediction for Mercury’s orbit, and again[1] Mercury didn’t behave as expected. Again, rather than throw out Newtonian mechanics, Le Verrier hypothesized the planet Vulcan between Mercury and the sun, and set about trying to observe it. While many people claimed to observe Vulcan, none of these observations were reliably replicated. Le Verrier was undeterred, though, perhaps because observing a planet that close to the sun was quite tricky. Of course, it would be easy to paint Le Verrier as an eccentric—indeed, his Vulcan hypothesis is somewhat downplayed in his legacy—but he doesn’t seem to have been treated as one by his contemporaries. The Vulcan hypothesis wasn’t universally believed, but neither does it seem to have been the Flat-Earth theory of its day.

It was only when Einstein used his General Theory of Relativity to accurately calculate Mercury’s orbit that the scientific community seems to have abandoned the search for Vulcan. Mercury’s orbit is now considered a classic successful test of General Relativity, but why don’t we consider it a refutation of Newtonian mechanics? Strict falsificationism would seem to dictate that, but then a strict falsificationist would have thrown out Newtonian mechanics as soon as we noticed Uranus misbehaving. So, falsificationism of this sort leads us to something of a paradox—if a single basic statement contradicts a theory, there’s no way of knowing whether there is some second basic statement that, in conjunction with the first, could save the theory.

Still, it’s difficult to toss out falsification entirely, because a theory that doesn’t reflect reality may be interesting but isn’t scientific.[2] Also, any reasonable person who has ever tried to give an explanation of any phenomenon probably rejects most of their own ideas rather quickly on empirical grounds. We should instead adopt falsificationism as a relative notion—use it when comparing multiple theories. So, Le Verrier was ultimately wrong, but he acted reasonably—he had a pretty good theory of mechanics, so he worked to reconcile it with some problematic data. Had someone developed General Relativity in Le Verrier’s time, then it would have been unreasonable to insist that a hypothesized planet was a better explanation than an improved theory.

Returning to the hypothetical debate between the Critic and the Proponent, then, I think a reasonable albeit slightly rude response for the proponent would be “Well, do you have a better theory?”

Notes
1 Technically though, Le Verrier’s work on Mercury predated his work on Uranus
2 Though sometimes, theories which seem to be empirically idle end up being scientifically important (cf. non-Euclidean geometry)

Chris Collins interviews Noam Chomsky about formal semantics

Over on his blog, Chris Collins has posted the text of a conversation he had over email with Noam Chomsky on the topic of formal semantics. While Chomsky has been very open about his views on semantics for a long time, this interview is worth reading for working linguists because Collins frames the conversation around work by linguists—Heim & Kratzer, and Larson & Segal—rather than philosophers—Quine, Austin, Wittgenstein, Frege, et al.

You should read it for yourself, but I’d like to highlight one passage that jumped out at me. Of the current state of the field, Chomsky says:

Work in formal semantics has been some of the most exciting parts of the field in recent years, but it hasn’t been treated with the kind of critical analysis that other parts of syntax (including generative phonology) have been within generative grammar since its origins. Questions about explanatory power, simplicity, learnability, generality, evolvability, and so. More as a descriptive technology. That raises questions.

p 5. (emphasis mine)

It’s true that formal semantics today is a vibrant field. There are always new analyses, the methods of formal semantics are being applied to new sets of data, and, indeed, it’s virtually impossible to even write a paper on syntax without a bit of formal semantics. Yet it is also true that almost no one has been thinking about the theory underpinning the analytical technology. As a result, I don’t think many working semanticists are even aware that there is such a theory, or if they are aware, they tend to wave their hands, saying “that’s philosophy”. Formal semanticists, it seems, have effectively gaslit themselves.

Chomsky’s framing here is interesting, too. He could be understood as suggesting that formal semantics could engage in theoretical inquiry while maintaining its vibrancy. It’s not clear that this is the case though. Currently, formal semantics bears a striking similarity to the machine-learning/neural-nets style of AI, in that both are being applied to a very wide array of “problems” but a closer look at the respective technologies very likely would cause us to question whether they should be. Obviously, the stakes are different—no one’s ever been injured in a car crash because they used lambdas to analyze a speech act—but the principle is the same.

But I digress. Collins and Chomsky’s conversation is interesting and very accessible to anyone who is familiar with Heim & Kratzer-style semantics. It’s well worth a read.

The Poverty of Referentialist Semantics

(What follows is a bit of a rant. I hope it holds together a bit. If you make it past the inflammatory title, let me know what you think.)

When Gregor Mendel first discovered his Laws of Inheritance, it was a great revelation. To be sure, humanity has perhaps always known that many of a person’s (or plant’s, animal’s, bacterium’s, etc.) traits are inherited from their parents, but Mendel was able to give that knowledge a quantitative expression. Of course, this was just the beginning of the modern study of genetics, as scientists asked the next obvious question: How are traits inherited? This question persisted for the better part of a century until a team of scientists showed experimentally that inheritance proceeds via DNA. Again, this raised a question that has spurred research to this day: How does DNA encode physical traits? But why am I writing about genetics in a post about semantics? Well, to make a point of contrast with the theory that has dominated the field of linguistic semantics for the past few decades: Formal Semantics.

As in the case of inheritance, we’ve always known that words, phrases, and sentences have meanings, but we’ve had a tougher time understanding this fact. In the late 19th and early 20th century, philosophers, psychologists, and linguists seemed to settle on a way of understanding linguistic meaning: linguistic expressions are meaningful by virtue of the fact that they refer to objects in the world. So, “dog” has a meaning for modern English speakers because it refers to dogs. This principle has led the modern field of semantics, although not in the same way as the discoveries of genetics led that field. If semanticists had proceeded as the geneticists had, they would have immediately asked the obvious question: How do linguistic expressions refer to objects in the world? Instead of pursuing this question, semanticists seem to have banished it and, in fact, virtually any question about the reference relation, and have done so, I believe, to the detriment of the field.

At first blush, it might seem that semanticists should be forgiven for not centring this question in their inquiry. Curiosity about genetic inheritance, to continue my comparison, is quite natural, likely because we can observe its facts objectively. Certainly, it’s a cliché that no one likes to admit that they’re like their parents. There is very little resistance, on the other hand, to seeing such a similarity in other people. The facts of inheritance are unavoidable, but they are not coupled with anything approaching intuition about them. In fact, many of the facts are fundamentally unintuitive: How can a trait skip a generation? Why does male pattern baldness come from the mother’s side? How can a long line of brown-eyed people produce a blue-eyed child? This dearth of intuition about an abundance of evidence means that no one objects to follow-up questions to any scientific advance in the field. In fact, the right kind of follow-up questions are welcomed.

On the other hand, linguistics, especially generative linguistics, faces the opposite situation. In many ways, the object of generative inquiry is our intuitive knowledge about our own language. It should be obvious here that the average person’s intuitions about language vastly outweigh the objective facts about language.* Our intuitions about language are so close to our core that it is very uncomfortable for us to entertain questions about it. We like to think that we know our own minds, but a question like what is language?—properly pursued—highlights just how little we understand that mind. This is not to say it’s an unanswerable or ill-formed question; it’s not a species of zen kōan. Language exists and we can distinguish it from other things, so, unlike the sound of one hand clapping, it has a nature that we can perhaps gain some understanding of. In fact, the field of generative syntax shows us that language is amenable to rational inquiry, provided researchers are open to follow-up questions: Chomsky’s initial answer to the question was that language is a computational procedure that generates an infinite array of meaningful expressions, which raised the obvious question: What sort of computational procedure? In many ways this is the driving question of generative syntactic theory, but it has also raised a number of additional questions, some of which are still open.

Just as what is language? is a difficult question, so are what is meaning? and how do words refer? So semanticists can be forgiven for balking at them initially. But, again, this is not to say that these are unanswerable questions in principle. What’s more, I don’t think semanticists even attempt to argue that the questions are too hard. On the contrary, they treat the answers as so obvious that the questions don’t warrant a response. Are they right? Is it a boring, obvious question? I don’t think so. I think it is an interesting question whose surface simplicity masks a universe of complexity. In fact, I can demonstrate that complexity with some seemingly simple examples.

Before I demonstrate the complexity of reference in language, let’s look at some simple cases of reference to get a sense of what sort of relation it is. Consider, for instance, longitude and latitude. The string 53° 20′ 57.6″ N, 6° 15′ 39.87″ W refers to a particular location on earth. Specifically, it refers to the location of the Dublin General Post Office. That sequence of symbols is not intrinsically linked to that spot on earth; it is linked by the convention of longitude and latitude, which is to say it is linked arbitrarily. Despite its arbitrary nature, though, the link is objective; it doesn’t matter who is reading it, it still refers to that particular location. Similar remarks apply to variables in computer programs, which are arbitrarily linked to locations in a computer’s RAM, or numerals like 4 or IV, which are arbitrarily linked to a particular number (assuming numbers have objective reality). These seem to suggest the following definition of the reference relation.

(R) reference is the arbitrary and objective mapping between symbols and objects or sets of objects.

For a moment, let’s set aside two types of expressions: subjective expressions like my favourite book, or next door, and proper names like The Dublin General Post Office, or Edward Snowden. For the purposes of this post, I will grant that the question of how the latter refer is already solved,† and the question of how the former refer is too difficult to answer at this point. Even if we restrict ourselves to common nouns that ostensibly refer to physical objects, we run into interesting problems.

Consider the word “chair”. English speakers are very good at correctly identifying certain masses of matter as chairs, and identifying others as not chairs. This seems like a textbook case of reference, but how are we able to do it?

A chair

Not a chair

In order for reference to obtain here, there must be some intrinsic property (or constellation of properties) that marks the thing on the left as a chair and is lacking in the thing on the right. Let’s skip some pointless speculation and settle on shape as the determining factor. That is, chairs are chairs by virtue of their shape. And let’s grant that that chair-shape can be codified in such a way as to allow reference to obtain. That would be great, except that it still doesn’t fully capture the meaning of “chair”.

Suppose, for instance, a sculptor creates an object that looks exactly like a chair, and an art gallery buys it to display as part of its collection. Is that object a chair? No, it’s a sculpture. Why? Because it no longer serves the function of a chair. So the objective shape of an artifact is not sufficient to determine its chair-ness; we need to say something about its function, and function, I would argue, is subjective.

Or consider the following narrative:

Sadie has just moved into her first apartment in a major Western city and she needs to furnish it. Being less than wealthy, she opts to buy furniture from Ikea. She goes online and orders her Ikea furniture. The next day three flat-pack boxes arrive at her door: one contains a bookshelf, one contains a bed, and the other contains a chair.

In what sense does that box contain a chair? It contains prefabricated parts which can be assembled to form a chair. Neither the box nor its contents are chair-shaped, yet we’re happy to call the contents a chair. What if Sadie were a skilled woodworker and wanted to build her own furniture from, say, several 2-by-4s? Would we call those uncrafted 2-by-4s a chair? I don’t think so. Let’s continue the narrative.

Sadie assembles her furniture and enjoys it for a year, at which point her landlord decides to evict her in order to double the rent. Sadie finds another apartment and packs up her belongings. In order to facilitate the move, she disassembles her furniture and puts the pieces in the trunks of her various helpful siblings’ cars. Her bookcase goes with Rose, her bed with Declan, and her chair with Violet.

Again, we refer to a bundle of chair parts as a chair. What if Sadie had taken out her anger at being evicted by her greedy landlord on the chair, hacking it to pieces with an axe? Would the resulting pile of rubble be a chair? Certainly not.

What does this tell us about how the word “chair” is linked to the object that I’m sitting on as I write this? That link cannot be reference as defined above in (R), because it’s not purely objective. The chair-ness of an object depends not only on its objective form, but also on its subjective function. And this problem will crop up with any artifact-word (e.g., “table”, “book”, “toque”). If we were to shift our domain away from artifact-words, no doubt we’d find more words that don’t refer in the sense of (R). Maybe we’d find real honest-to-goodness referring words, but we’d still be left with a language that contains a sizable chunk of non-referential expressions. Worse still, modern formal semanticists have expanded the universe of “real objects” to which expressions can refer to include situations, events, degrees, and so on. What’s the objective nature of an event? or a situation? or a degree? No idea, but I know them when I see them.

“So what?” you might say. “Formal semantics works. Just look at all of the papers published, problems raised and solved, linguistic phenomena described. Who cares if we don’t know how reference works?” Well, if semantics is the study of meaning, and meaning is reference, then how can there be any measure of the success of semantics that isn’t a measure of its understanding of what reference is?

Again, consider a comparison with genetics. What if, instead of asking follow-ups to Mendel’s laws, geneticists had merely developed the laws to greater precision? Our current understanding of genetics would be wildly impoverished. We certainly would not have all of the advances that currently characterize genetic science. Quite obviously genetics is much the richer for asking those follow-up questions.

No doubt semantics would be much richer if it allowed follow-up questions.


* It is precisely this situation that makes it so difficult to communicate the aims of generative linguistics, and why the main type of linguistics that gains any sort of traction in the mainstream press is the type that looks at other people’s language. Consider the moral panic about the speech patterns of young women that surfaces every so often, the NY Times bestseller Because Internet, by Gretchen McCulloch, which looks at linguistic innovation on the internet, or even the current discussion about the origins of Toronto slang. To paraphrase Mark Twain, nothing so needs research and discussion as other people’s language.

† I’m being generous here. In fact, most paradoxes of reference are about proper names (see Katz, J. J. (1986). Why intensionalists ought not be Fregeans. Truth and interpretation, 59-91).