What does falsification look like anyway?

Vulcan vs Neptune

There’s an argument that plays out every so often in linguistics that goes as follows:

Critic: This data falsifies theory T.
Proponent: Not necessarily, if you consider arguments X, Y, and Z.
Critic: Well, then theory T seems to be unfalsifiable!

This is obviously a specious argument on the part of the critic, since unfalsified does not entail unfalsifiable, but I think it stems from a very understandable frustration—theorists often have an uncanny ability to wriggle free of data that appears to falsify their theories, even though falsificationism is assumed by a large majority of linguists. The problem is that the logic of falsificationism, while quite sound, maybe even unimpeachable, turns out to be fiendishly difficult to apply.

At its simplest, the logic of falsificationism says that a theory is scientific insofar as one can construct a basic statement (i.e., a statement of fact) that would contradict the theory. This, of course, is an oversimplification of Karl Popper’s idea of Critical Rationalism in a number of ways. For one, falsifiability is not an absolute notion. Rather, we can compare the relative falsifiability of two theories by looking at what Popper calls their empirical content—the number of basic statements that would contradict them. So if a simple theoretical statement P has a particular empirical content, then the conjunction P & Q will have a greater empirical content, and the disjunction P v Q will have a lesser empirical content. This is a useful heuristic when constructing or criticizing a theory internally, and seems like a straightforward guide to testing theories empirically. Historically, though, this is not the case, largely because it is often difficult to recognize when we’ve arrived at and accurately formulated a falsifying fact. In fact, it is often, maybe always, the case that we don’t recognize a falsifying fact as such until after one theory has been superseded by another.
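To make the comparison of empirical content concrete, here is a toy sketch in Python. It rests on an illustrative simplification that is not part of Popper’s own formalism: a basic statement is taken to be a complete assignment of truth values to two atomic claims, P and Q, and a theory’s empirical content is the number of such assignments that contradict it. The names WORLDS and falsifiers are hypothetical.

```python
from itertools import product

# Toy model (an assumption for illustration): a "basic statement" is a full
# assignment of truth values to the two atomic claims P and Q.
WORLDS = list(product([True, False], repeat=2))

def falsifiers(theory):
    """Return the assignments (p, q) under which the theory is false."""
    return [(p, q) for p, q in WORLDS if not theory(p, q)]

theories = {
    "P": lambda p, q: p,
    "P & Q": lambda p, q: p and q,
    "P v Q": lambda p, q: p or q,
}

# Empirical content in this toy sense: how many assignments contradict the theory.
content = {name: len(falsifiers(t)) for name, t in theories.items()}
for name, n in content.items():
    print(f"{name}: contradicted by {n} of {len(WORLDS)} assignments")
```

In this toy model the conjunction is contradicted by more assignments than P alone, and the disjunction by fewer, which is the ordering described above.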

Take for instance the case of the respective orbits of Mercury and Uranus. By the 19th century, Newtonian mechanics had allowed astronomers to make very precise predictions about the motions of the planets, and based on those predictions, there was a problem: two of the planets were misbehaving. First, it was discovered that Uranus—then the farthest known planet from the sun—wasn’t showing up where it should have been. Basically, Newton’s mechanics predicted that on such and such a day and time Uranus would be in a particular spot in the sky, but the facts were otherwise. Rather than cry “falsification!”, though, the astronomers of the day hypothesized an object on the other side of Uranus that was affecting its orbit. One such astronomer, Urbain Le Verrier, was even able to work backwards and predict where that object could be found. So in September of 1846, armed with Le Verrier’s calculations, Johann Gottfried Galle was able to observe an eighth planet—Neptune. Thus, an apparent falsification became corroboration.

Urbain Le Verrier (1811-1877)
Johann Galle (1812-1910)

I’ve previously written about this story as a vindication of the theory-first approach to science. What I didn’t write about, and what is almost never discussed in this context, is Le Verrier’s work on the misbehaving orbit of Mercury. Again, armed with Newton’s precise mechanics, Le Verrier calculated the Newtonian prediction for Mercury’s orbit, and again[1] Mercury didn’t behave as expected. Again, rather than throw out Newtonian mechanics, Le Verrier hypothesized the planet Vulcan between Mercury and the sun, and set about trying to observe it. While many people claimed to observe Vulcan, none of these observations were reliably replicated. Le Verrier was undeterred, though, perhaps because observing a planet that close to the sun was quite tricky. Of course, it would be easy to paint Le Verrier as an eccentric—indeed, his Vulcan hypothesis is somewhat downplayed in his legacy—but he doesn’t seem to have been treated so by his contemporaries. The Vulcan hypothesis wasn’t universally believed, but neither does it seem to have been the Flat-Earth theory of its day.

It was only when Einstein used his General Theory of Relativity to accurately calculate Mercury’s orbit that the scientific community seems to have abandoned the search for Vulcan. Mercury’s orbit is now considered a classic successful test of General Relativity, but why don’t we consider it a refutation of Newtonian Mechanics? Strict falsificationism would seem to dictate that, but then a strict falsificationist would have thrown out Newtonian Mechanics as soon as we noticed Uranus misbehaving. So, falsificationism of this sort leads us to something of a paradox—if a single basic statement contradicts a theory, there’s no way of knowing if there is some second basic statement that, in conjunction with the first, could save the theory.

Still, it’s difficult to toss out falsification entirely, because a theory that doesn’t reflect reality may be interesting but isn’t scientific.[2] Also, any reasonable person who has ever tried to give an explanation of any phenomenon probably rejects most of their own ideas rather quickly on empirical grounds. We should instead adopt falsificationism as a relative notion—use it when comparing multiple theories. So, Le Verrier was ultimately wrong, but acted reasonably—he had a pretty good theory of mechanics so he worked to reconcile it with some problematic data. Had someone developed General Relativity in Le Verrier’s time, then it would have been unreasonable to insist that a hypothesized planet was a better explanation than an improved theory.

Returning to the hypothetical debate between the Critic and the Proponent, then, I think a reasonable, albeit slightly rude, response for the Proponent would be “Well, do you have a better theory?”

References

1 Technically though, Le Verrier’s work on Mercury predated his work on Uranus
2 Though sometimes, theories which seem to be empirically idle end up being scientifically important (cf. non-Euclidean geometry)

Chris Collins interviews Noam Chomsky about formal semantics

Over on his blog, Chris Collins has posted the text of a conversation he had over email with Noam Chomsky on the topic of formal semantics. While Chomsky has been very open about his views on semantics for a long time, this interview is worth reading for working linguists because Collins frames the conversation around work by linguists—Heim & Kratzer, and Larson & Segal—rather than philosophers—Quine, Austin, Wittgenstein, Frege, et al.

You should read it for yourself, but I’d like to highlight one passage that jumped out at me. Of the current state of the field, Chomsky says:

Work in formal semantics has been some of the most exciting parts of the field in recent years, but it hasn’t been treated with the kind of critical analysis that other parts of syntax (including generative phonology) have been within generative grammar since its origins. Questions about explanatory power, simplicity, learnability, generality, evolvability, and so. More as a descriptive technology. That raises questions.

p 5. (emphasis mine)

It’s true that formal semantics today is a vibrant field. There are always new analyses, the methods of formal semantics are being applied to new sets of data, and, indeed, it’s virtually impossible to even write a paper on syntax without a bit of formal semantics. Yet it is also true that almost no one has been thinking about the theory underpinning the analytical technology. As a result, I don’t think many working semanticists are even aware that there is such a theory, or if they are aware, they tend to wave their hands, saying “that’s philosophy”. Formal semanticists, it seems, have effectively gaslit themselves.

Chomsky’s framing here is interesting, too. He could be understood as suggesting that formal semantics could engage in theoretical inquiry while maintaining its vibrancy. It’s not clear that this is the case though. Currently, formal semantics bears a striking similarity to the machine-learning/neural-nets style of AI, in that both are being applied to a very wide array of “problems” but a closer look at the respective technologies very likely would cause us to question whether they should be. Obviously, the stakes are different—no one’s ever been injured in a car crash because they used lambdas to analyze a speech act—but the principle is the same.

But I digress. Collins and Chomsky’s conversation is interesting and very accessible to anyone who is familiar with Heim & Kratzer-style semantics. It’s well worth a read.

Break up Big University; Create Jobs

The argument in this article (tweeted out by Shit Academics Say) is just designed to pit one group of workers (sessional lecturers) against another (tenured faculty). This is because it ignores the fact that the number of faculty positions, at least in Canada, is kept artificially low.

Notice that there is no mention whatsoever of class sizes. Coming from UofT, I can tell you, class sizes have been out of control for years. My intro bio class was so big that no lecture hall could house it. Lectures were in the 1730-seat Convocation Hall.

Convocation Hall is a beautiful building but it is designed for ceremony, not pedagogy. There is no chalkboard or whiteboard, and if there were, the students in the upper balcony wouldn't be able to read them. What's more, the seats have no writing surface for note-taking.

More recently, I taught a "general interest" linguistics course (a "bird course") so big that it also couldn't be housed in a proper lecture hall. Instead we had what was basically a movie theatre. The lights were perpetually dimmed, and again, no chalkboard.

These sorts of non-classrooms really only allow for one type of teaching style, possibly the worst type: A lecturer droning on about a slide deck.

Beyond just the lectures, it's quite impossible for all 1000+ students of such a class to have direct access to their professors in office hours. There aren't enough hours in the day.

(Of course, most students don't go to office hours. It might make a good action, though, for student unions to organize students to go to office hours en masse. Not to shout slogans at professors, but just to ask for help.)

Clearly, UofT, the largest university in Canada, has reached its capacity of students.

Imagine, though, if we kept tenure and the researcher/teacher model of academia and put hard limits on class sizes. Say, 200 for 1st yr classes, 100 for 2nd yr and so on. How would that affect things?

The neoliberal response would probably be "well, you'd have to have fewer students, probably only well-off white students."
But there's another possibility: Expand the faculty size by creating new universities.

This could mean founding a brand new university, or it could mean splitting up oversized universities. UofT, for instance, has three campuses: Downtown, Scarborough, and Mississauga. Why not spin them off from each other?

There are definitely ways to do this that I haven't thought about, and none of them are perfect, and all of them require public funding. But that's true of any societal problem.

But we can't really expect to solve the problem without an adequate diagnosis of the problem's source.
There's no shortage of qualified educators, nor is there a shortage of people who want/need an education. The problem is infrastructure.

So, whenever someone makes an argument pitting workers against workers, it can only really serve to obscure the fact that the problem is elsewhere—with management, with bureaucracy, with politicians.

Originally tweeted by Dan Milway (@thrilway) on June 26, 2021.

But it’s obvious, isn’t it?

As a linguist or, more specifically, as a theoretical syntactician, I hold and often express some minority opinions.[1] Often these opinions are met with bafflement and an assertion like “We’ve known for years that that’s not the case” because of this phenomenon, or that piece of data—“Control is derived by movement? But what about de se interpretation??” “Merge is free? But what about c-selection??” “Long-distance Agree isn’t real? But what about English existential clauses??”[2] These sorts of objections are often tossed out as if the data speaks for itself when really, the thing that makes scientific inquiry so tough is that the data rarely speaks for itself, and when it does, it doesn’t do so clearly.

Take, for instance, the case of English existential clauses like (1) and (2) and how they are used as absolute proof of the existence of Long-Distance Agree.

(1) There ?seems/seem to be several fish in the tank.
(2) There seems/*seem to be a fish in the tank.

In both sentences, the grammatical subject is the expletive there, but the verb agrees with a DP[3] that appears to be structurally “lower” in the clause. Therefore, there must be some non-movement way of getting features from a lower object onto a higher object—Long-Distance Agree. This is often presented as the obvious conclusion, the only conclusion, or the simplest conclusion. “Obvious” is in the eye of the beholder and doesn’t usually mean “correct”; Norbert Hornstein, in his A Theory of Syntax, proposes three alternative analyses to Long-Distance Agree; only “simplest” has legs, although that’s debatable.

Occam’s razor says “entities should not be multiplied without necessity,” and any analysis of (1) and (2) without Long-Distance Agree will have to say that in both cases, the agreeing DP is covertly in subject position. These covert subjects are argued to constitute an unnecessary multiplication of entities, but one could just as easily argue that Long-Distance Agree is an unnecessary entity. What’s more, covert movement and silent elements both have independent arguments in their favour.

Of course, the covert subject analysis of (1) and (2) is not without its flaws. Chief among them, in my opinion, is that it would seem to wrongly predict that (1) and (2) mean the same thing as (3) and (4), respectively.

(3) Three fish seem to be in the tank.
(4) A fish seems to be in the tank.

These sentences differ from (1) and (2) in that they—(3) and (4)—presuppose the existence of three fish or a single fish, while (1) and (2) merely assert it. This contrast is clearest in (5)-(8) which are examples that Chomsky has been using for several decades.

(5) There’s a fly in my soup.
(6) There’s a flaw in my argument.
(7) A fly is in my soup.
(8) *?A flaw is in my argument.

Likewise, Long-Distance Agree has its own problems, some of which I discuss in my latest paper. Indeed, it is vanishingly rare in any field of inquiry—or life itself—to find an unproblematic solution to a problem.

My goal here isn’t to argue that Long-Distance Agree is wrong,[4] but to point out that it’s not a foregone conclusion. In fact, if we listed the hypotheses/theories/notions that most syntacticians take to be (nearly) unquestionable and honestly assessed the arguments in their favour, I doubt that many would turn out to be as robust as they seem. This doesn’t mean that we need to reject every idea that is less than 100% solid, just that we should hold on to them a little more loosely. As a rule, we should all carry with us the idea that we could very well be wrong about almost everything. The world’s more interesting that way.

References

1 Outside of syntactic theory too
2 I have a hypothesis that the vehemence with which someone will defend a theory or analysis is correlated with how much they struggled to understand it in school. Basically, we’re more likely to die on a hill if we had to fight to summit that hill. This has some interesting implications that I might get into in a later post.
3 I still think I buy the DP hypothesis, but I’m also intrigued by Chomsky’s recent rejection of it and amused by the reaction to this rejection.
4 Though, I do think it is.

New LingBuzz Paper

(or “How I’ve been spending my unemployment*”)

Yesterday I finished and posted a paper to LingBuzz. It’s titled “Agree as derivational operation: Its definition and discontents” and its abstract is given below. If it sounds interesting, have a look and let me know what you think.

Using the framework laid out by Collins and Stabler (2016), I formalize Agree as a syntactic operation. I begin by constructing a formal definition of a version of long-distance Agree in which a higher object values a feature on a lower object, and modify that definition to reflect several versions of Agree that have been proposed in the “minimalist” literature. I then discuss the theoretical implications of these formal definitions, arguing that Agree (i) muddies our understanding of the evolution of language, (ii) requires a new conception of the lexicon, (iii) objectively and significantly increases the complexity of syntactic derivations, and (iv) unjustifiably violates NTC in all its non-vacuous forms. I conclude that Agree, as it is commonly understood, should not be considered a narrowly syntactic operation.

*Thanks to the Canada Recovery Benefit, I was able to feed myself and make rent while I wrote this.

On the notion of an intellectual coup

In chapter nine of his book Goliath: The 100-Year War Between Monopoly Power and Democracy, Matt Stoller recounts the story of the genesis of the Chicago School of law & economics—the school of thought which has come to dominate virtually every aspect of the Western power structure since the 1970s. In Stoller’s telling, it truly could be considered an epochal moment in economics, law, political science, and related disciplines, much as Copernican heliocentrism was for physics, or Mendel’s laws were for biology, or Generative Grammar was for psychology. The shift in thinking brought on by the Chicago School was perhaps as drastic and far-reaching as those brought on by these intellectual revolutions. Yet, in reading it, it struck me that it would be wrong to describe the founding of the Chicago School as a revolution because it wasn’t one—it was an intellectual coup.

But what makes something an intellectual revolution? What makes it an intellectual coup? To stick with the analogy to political processes, the difference is legitimacy—revolutions are legitimate changes, while coups are illegitimate. Legitimacy, of course, is hard to judge objectively, but still, to call something a revolution is to judge it to be legitimate. The violent 1973 overthrow of the democratically elected Allende government in Chile is commonly called a “coup” rather than a revolution. Similarly, historian Michael J. Klarman refers to the US Constitutional Convention as a coup to indicate that he judges it to have been illegitimate. And importantly, the revolution-coup distinction doesn’t boil down to the simple subjective value judgement that revolutions are good and coups are bad. So, while conservatives the world round likely agree that the American Revolution was good, many argue that the French and Russian revolutions were bad. Interestingly, though, I don’t know that many people would think that a coup could be good. So, while most Americans would probably say the Constitutional Convention was good, they probably wouldn’t describe it as a coup, perhaps because illegitimacy is per se bad.

So what makes a shift of ideas illegitimate—what makes it an intellectual coup? To see this we should look at what a legitimate shift looks like. The stories we’re used to hearing involve a disinterested person (or possibly a group) proposing a new idea in an open forum, while making an honest critical argument that it is superior to a contemporaneously widely-accepted idea. The proposal must be open, so that fair criticisms can be aired. The proposer should be disinterested in the sense that the proposed idea is not a means to some other material end (e.g., money or political influence), but rather an end in itself. The discourse around the idea should acknowledge and address the idea’s antecedents and rivals, because doing so allows the larger community to accurately assess the merits of the new idea.

We can see all of these criteria in the great shifts in the history of ideas. Even Galileo and Copernicus, whose work predated any of the modern intellectual institutions—like peer-reviewed journals, conferences, or universal primary education—that we all take for granted, opened their work to criticism—not by their peers primarily, but by the Inquisition—and did so, not as a means to an end but for the sake of the ideas themselves—what self-interested person would open themselves to the punishment that a Renaissance inquisition could dole out? Finally, it would be hard to credibly suggest that the early heliocentrists could ignore or misrepresent their intellectual competitors, which had been taken as religious dogma, uncritically believed by their contemporaries. The very story of the Copernican revolution is one of competing ideas.

An illegitimate shift would go against one or more of these criteria. It would develop an idea in a less-than-open way; it would be put forth on behalf of some interest group, or as a means to an end for the proposer; or it would either ignore or caricature its competitor-ideas. And more often than not, the last of these infractions will be the most characteristic feature of an intellectual coup. Taking the rise of the Chicago School, and its views on monopoly and antitrust, as Stoller recounts it, as our prototype, we can see all of these features in play.

The story starts with wealthy businessman and New Deal enemy Harold Luhnow using his foundation, the Volker Fund, to finance a right-wing research project at the University of Chicago, and continues with the project’s leading academic, Aaron Director, gathering a cadre of acolytes and eventually using private funds to start a journal that would be friendly to their ideas. What really allowed the Chicago School to change from a fringe endeavour to the dominant school of thought in the Western social sciences, in Stoller’s assessment, was a pair of rhetorical misappropriations: adopting “the language of Jeffersonian democracy” and “the apolitical language of science.”

Jeffersonian democracy was in favour of the rights of the individual in opposition to centralized power, a stance that comes from Classical Liberalism and that the Chicago School loudly endorsed. The rhetorical trick, though, is that the Chicago School (and modern right-libertarians) treated authoritarian institutions like corporations as individuals and democratic institutions like labour unions as centralized power. Yet, even a cursory glance at many of the paragons of classical liberalism shows a number of views that we would now associate with a radical left-wing position. Some of Marx’s economic ideas come almost directly from Adam Smith, ideas like the labour theory of value, or the essentially parasitic nature of landlords. Of course, these views of Smith that don’t jibe with the right-wing caricature of him are either ignored or treated as a source of embarrassment. This move, of course, was aided by the fact that, by the time the right-wing Chicago School was appropriating the classical liberal tradition, the American left seemed to be pushing that tradition away. In fact, a recurring theme in Stoller’s book is that the left has largely ceded populism to the right and embraced elitism.

Using the rhetoric of “science”, though, has probably been a much more powerful trick, because the attitude of the general public, including much of the elite, toward science is about as positive as its understanding of the term is murky. Nearly everyone—even flat-earthers, anti-vaxxers, and climate deniers—thinks science is good, but no one could define it. Sure, some would say something about experimental methods, or falsificationism, or spout some Kuhnian nonsense, and everyone would probably agree that quantum physics is a science, while film criticism is not, but few probably realize that philosophers of science have been consistently unable to pin down what constitutes a science. So, when an economist throws graphs and equations at us and declares scientific a statement that offends common sense, very few people are intellectually equipped to dispute them. In the case of the Chicago School, they were at an advantage because, until they adopted it, the claim that economics (along with politics, law, and history) could be a science like physics was probably only held by strict Marxists. The opposing position was one that worried about notions like power and democracy—hardly the kinds of ideas amenable to scientific analysis. If you think that Google doesn’t really compete in an open market, but uses its market power to crush all competition, then you probably also think the sun revolves around the earth.

While the moneyed interests backing the Chicago School and its insular nature in the early days certainly indicate that it was not likely to lead a legitimate intellectual shift, its rhetorical tricks, I believe, are what make its success a coup rather than a revolution, and what has made its ideas so stubborn. It fosters the oppressive slogan “There is no alternative.” By co-opting the great thinkers of the Enlightenment, the Chicago School can paint any opponents as anti-rational romantics, and by misappropriating the language of science, it can group dissenters with conspiracy theorists and backwards peasants. This makes it seem like a difficult position to argue against, but as many have discovered recently, it’s a surprisingly brittle position.

Take, for instance, the Chicago School position on antitrust laws—that they were intended as a consumer protection. This has been the standard position of antitrust enforcers in the U.S. and it’s based on an article by Robert Bork. It’s how obvious monopolists like Google and Facebook have escaped enforcement thus far. But, as Stoller’s book documents, the actual legislative intent of U.S. antitrust laws had nothing to do with consumer welfare, and everything to do with power. Bork’s article, then, was a work of fiction, and once you understand that, the entire edifice of modern antitrust thinking begins to crumble.

So, the Chicago School carried out an intellectual coup—one that struck virtually every aspect of our society—but have there been intellectual coups in other fields? Two spring to mind for me—one in physics, and one in my own field of linguistics. Before I describe them, though, a brief word on motivations as an aspect of intellectual coups is in order.

One of the features of an intellectual coup that I described above is that of an ulterior motive driving it. In the case of the Chicago School it was driven by capitalists set on dismantling the New Deal for their own financial interests. Does that mean that everyone who subscribes to the Chicago School does so so that billionaires can make more money? Not at all. There are definitely Chicago Schoolers who are true believers. Indeed, I would wager that most, if not all, of them are. Hell, even political coups have true believers in them. What about the particular ulterior motives? Are all intellectual coups done on behalf of capital? No. Motivations take all sorts of forms, and are often subconscious. Bold claims are often rewarded with minor celebrity or notoriety which might have material benefits like job offers or the like. They are also sometimes correct. So, if a researcher makes a bold claim, are they doing so to stand out among their peers or are they doing so because they truly believe the claim? It’s almost never possible to tell. Since intellectual coups are essentially based on intellectual dishonesty, and it’s probably safe to assume that those who enact an intellectual coup are capable and well-meaning people, discussions of motivations are useful for understanding how a capable and well-meaning person could get caught up in a coup. As such, I will focus more on the means than the motive when diagnosing a coup.

The Copenhagen Quantum Coup

If you’re at all interested in the history of science, you may have heard of the Bohr-Einstein debate. The narrative that you likely heard was that in the early 20th century, the world community of physicists had accepted quantum mechanics with a single holdout, Albert Einstein, who engaged Niels Bohr in a debate at the 5th Solvay Conference in 1927. Einstein made a valiant argument, capping it with the declaration that “God does not play dice!” When it was Bohr’s turn, he wiped the floor with Einstein, showing that the old man was past his prime and out of step with the new physics. He even used Einstein’s own theory of relativity against him! And with that, quantum mechanics reigned supreme, relegating all critics to the dustbin of history.

It’s a good story and even has a good moral about the fallibility of even a genius like Einstein. The trouble, though, at least according to Adam Becker in his excellent book What is Real?, is that the debate didn’t go down like that. For starters, Einstein wasn’t skeptical about quantum mechanics, but rather had questions about how we are to interpret it. Bohr was advocating for what’s misleadingly called “the Copenhagen Interpretation”, which basically says that there is no way to give quantum theory a realist interpretation; all we can do is solve the equations and compare the solutions to experimental results. Furthermore, as Becker recounts, Einstein’s arguments weren’t out of step with contemporary physics. In fact, they were brilliantly simple thought experiments that struck at the very core of quantum mechanics. Their simplicity, however, meant that they sailed over the heads of Bohr and his cadre. It was Bohr’s response that missed the point. And finally, that famous quote from Einstein was in a letter to his friend Max Born, not at the conference in question.

This certainly has the hallmarks of an intellectual coup—it depends on a rhetorical trick of manipulating a narrative to favour one outcome, it shuts down debate by lumping dissenters in with the anti-rationalists, and it’s rather brittle—but it’s not quite as bald-faced as the Chicago School coup. Even as Becker tells it, the scientists in Bohr’s camp probably believed that Einstein was losing it and that he’d missed the point entirely. What’s more, the Copenhagen perspective, which the popularized telling of the debate supports, is not a pack of falsehoods like the Chicago School, but rather an overly narrow conception of the nature of scientific inquiry—a conception called “instrumentalism” which tends to banish humanistic questions of truth, reality, and interpretation to the realm of philosophy and views “philosophy” as a term of abuse.

But where is the dishonesty that I said every coup was based on? It seems to have come in the form of laziness—Bohr and his compatriots should have made a better effort to understand Einstein’s critique. This laziness, I believe, rises to the level of dishonesty, because it ended up benefiting the Copenhagen perspective in a predictable way. As Becker describes, Bohr, for various reasons, wanted to show that Quantum Mechanics as formulated in the 1920s was complete and closed—a perfect theory. Paradoxes and interpretive issues, such as the ones that Einstein was raising, revealed imperfections, which had to be ignored. Whether Bohr had all of this in his mind at the Solvay Conference is beside the point. His, and his followers’, was a sin of omission.

The Formal Semantics Coup

The standard theoretical framework of contemporary semantics, at least within the generativist sphere, is known as formal semantics. Few semanticists would likely agree that there is such a thing as a standard theory, but those same semanticists probably agree on the following:

  1. The meaning of a word or a phrase is the thing or set of things that that word or phrase refers to.
  2. The meaning of a sentence is its truth conditions.
  3. Linguistic meanings can be expressed by translating expressions of a Natural Language into formulas of formal logic.
  4. Any aspect of language that doesn’t meet the requirements of 1-3 is outside the domain of semantics.

The origins of these standard tenets of formal semantics, though, are not some empirical discovery, or the results of some reasoned debate, but rather the declarations of a handful of influential logicians and philosophers. The ascendency of formal semantics, then, is due not to a revolution, but a coup. Since linguistic theory doesn’t get the same amount of press as economics and physics, the historical contours of the shift to formal semantics are at best murky. As such, I’ll explain my coup diagnosis through a series of personal anecdotes—not the ideal method, but the best I can do right now.

I was first exposed to formal semantics in my graduate coursework. The four numbered statements above were what I took for granted for a while. I was aware that there were other ways of looking at meaning, and that formal semantics was a relatively recent addition to the generative grammar family of theories, and I guess I assumed that the advent of formal semantics was an intellectual revolution and there must’ve been a great debate between the formalists and the non-formalists and the formalists came out on top. Of course, no one ever talked about that debate—I knew about the ongoing debates between behaviourists and generativists, and the “wars” between Generative Semantics and interpretive semantics, but no one told the tales of the Great Formal Semantics Debates. This should have been my first red flag—academics aren’t shy about their revolutionary arguments.

I first began to have qualms about formal semantics when I heard Noam Chomsky’s lucid critiques of referentialism (tenet #1 above) in the Michel Gondry documentary Is the Man Who Is Tall Happy? Here was the man who founded Generative Syntax, who’s often considered a genius, and whose publications are usually major events in the field arguing that we’ve been doing semantics all wrong. As I better familiarized myself with his arguments, it became clear that he was holding a reasonable position. If I ever brought it up to a working semanticist, though, they would first brush it off saying basically “Chomsky needs to stay in his lane,” but when I put the arguments to them, they would acknowledge that they might be sound arguments, but that formal semantics was the only game in town (i.e., There is no alternative). One even told me straight out that, sure I could go against formal semantics, but if I did, I’d never get hired by any linguistics department (Of course, given the prevailing political and economic environment surrounding academic institutions, the odds of me getting hired regardless of my stance on formal semantics are pretty long anyway). This was when I first started to suspect something was amiss—the only defense that could be mustered for formal semantics was that everyone else was doing it and we can’t imagine an alternative.

I had to admit, though, that, despite my misgivings, I had no alternative to formal semantics and, being a syntactician, I didn’t really have the inclination to spend a lot of time coming up with one. As luck would have it, though, I happened upon exactly the sort of alternative that wasn’t supposed to exist: Jerrold Katz’ Semantic Theory. Published in 1972, the theory Katz proposed was explicitly non-referentialist, formal (in the sense of having a formalism), and opposed to what we now call formal linguistics. It was quite a surprise because I had heard of Katz—I read a paper he co-authored with Jerry Fodor for a syntax course—but strangely, he was always associated with the Generative Semantics crew—strangely, because he explicitly argues against them in his book. So, contrary to what I’d been told, there was an alternative, but why was I just finding out about it now? Unfortunately, Jerrold Katz died a few years before I ever picked up his book, as had his occasional co-author Jerry Fodor, so I couldn’t get their accounts of why his work had fallen out of favour. I asked the semanticists I knew about him and they recognized the name but had no idea about his work. The best explanation I got was from Chomsky, who said that he did good work, but semanticists were no longer interested in the questions he was asking. No stories of an LSA where Katz squared off against the new upstarts and was soundly beaten, no debates in the pages of Language or Linguistic Inquiry, Katz was just brushed aside and never spoken of again. Instead, the very fiats of philosophers and logicians (Carnap, Lewis, Quine, etc.) that Katz had argued against became the unexamined cornerstones of the field.

So, while the givenness of formal semantics was probably not the result of the schemes of a cabal of moneyed academics, like the Chicago School was, it doesn’t seem to have been the result of an open debate based on ideas and evidence, and it’s held in place, not by reason, but basically by sociopolitical forces. Thus I feel comfortable suggesting that it was the result of an intellectual coup.

Summing up: There’s always an alternative

I’ve offered a few potential features of an intellectual coup here, but nothing like an exhaustive diagnostic checklist. One important feature, though, is the “there is no alternative” attitude that they seem to foster. Any progress that we’ve made as a species, be it political, social, intellectual, or otherwise, stems from our ability to imagine a different way of doing things. So, for an intellectual community to be open to progress, it has to accept that there are other ways of thinking about the world. Some of those alternatives are worse, some are better, but the only sure-fire way not to make progress is to declare that there is no alternative.

A Response to some comments by Omer Preminger on my comments on Chomsky’s UCLA Lectures

On his blog, Omer Preminger posted some comments on my comments on Chomsky’s UCLA Lectures, in which he argues that “committing oneself to the brand of minimalism that Chomsky has been preaching lately means committing oneself to a relatively strong version of the Sapir-Whorf Hypothesis.” His argument goes as follows.

Language variation exists. To take Preminger’s example, “in Kaqchikel, the subject of a transitive clause cannot be targeted for wh-interrogation, relativization, or focalization. In English, it can.” 21st century Chomskyan minimalism, and specifically the SMT, says that this variation comes from (a) variation in the lexicon and (b) the interaction of the lexical items with either the Sensory-Motor system or the Conceptual-Intentional system. Since speakers of a language can process and pronounce some ungrammatical expressions—some Kaqchikel speakers can pronounce an equivalent of (1) but judge it as unacceptable—some instances of variation are due to the interaction of the Conceptual-Intentional system with the lexicon.

(1) It was the dog who saw the child.

It follows from this that either (a) the Conceptual-Intentional systems of English-speakers and Kaqchikel-speakers differ from each other or (b) English-speakers can construct Conceptual-Intentional objects that Kaqchikel-speakers cannot (and vice-versa, I assume). Option a, Preminger asserts, is the Sapir-Whorf hypothesis, while option b is tantamount to (a non-trivial version of) it. So, the SMT leads unavoidably to the Sapir-Whorf hypothesis.

I don’t think Preminger’s argument is sound, and even if it were, its conclusion isn’t as dire as he makes it out to be. Let’s take these one at a time in reverse order.

The version of the Sapir-Whorf hypothesis that Preminger has deduced from the SMT is something like the following—the Conceptual-Intentional (CI) content of a language is the set of all (distinct) CI objects constructed by that language and different languages have different CI content. This hypothesis, it seems, turns on how we distinguish between CI objects—far from a trivial question. Obviously contradictory, contrary, and logically independent sentences are CI-distinct from each other, as are non-mutually entailing sentences and co-extensive but non-co-intensive expressions, but what about true paraphrases? Assuming there is some way in Kaqchikel of expressing the proposition expressed by (1), then we can avoid Sapir-Whorf by saying that paraphrases express identical CI-objects. This avoidance, however, is only temporary. Take (2) and (3), for instance.

(2) Bill sold secrets to Karla.
(3) Karla bought secrets from Bill.

If (2) and (3) map to the same CI object, what does that object “look” like? Is (2) the “base form” and (3) is converted to it or vice versa? Do some varieties of English choose (2) and others (3), and wouldn’t that make these varieties distinct languages?

If (2) and (3) are distinct, however, it frees us—and more importantly, the language learner—from having to choose a base form, but it leads us immediately to the question of what it means to be a paraphrase, or a synonym. I find this a more interesting theoretical question than any of those raised above, but I’m willing to listen if someone thinks otherwise.

So, we end up with some version of the Sapir-Whorf hypothesis no matter which way we go. I realize this is a troubling result for many generative linguists as linguistic relativity, along with behaviourism and connectionism, is one of the deadly sins of linguistics. For me, though, Sapir-Whorf suffers from the same flaw that virtually all broad hypotheses of the social sciences suffer from—it’s so vague that it can be twisted and contorted to meet any data. In the famous words of Wolfgang Pauli, it’s not even wrong. If we were dealing with atoms and quarks, we could just ignore such a theory, but since Sapir-Whorf deals with people, we need to be a bit more careful. One need not think very hard to see how Sapir-Whorf or any other vague social hypothesis can be used to excuse, or even encourage, all varieties of discrimination and violence.

The version of Sapir-Whorf that Preminger identifies, the one that I discuss above, seems rather trivial to me, though.

There are also a few problems with Preminger’s argument that jumped out at me, of which I’ll highlight two. First, in his discussion of the Sensory-Motor (SM) system, he seems to assume that any expression that is pronounceable by a speaker is a-ok with that speaker’s SM system—he seems to assume this because he asserts that any argument to the contrary is specious. Since the offending Kaqchikel string is a-ok with the SM system it must run afoul of either the narrow syntax (unlikely according to SMT) or the CI system. This line of reasoning, though, is flawed, as we can see by applying its logic to a non-deviant sentence, like the English version of (1). Following Preminger’s reasoning, the SM system tells us how to pronounce (1) and the CI system uses the structure of (1) generated by Merge for internal thought. This, however, leaves out the step of mapping the linear pronunciation of (1) to its hierarchical structure. Either (a) the Narrow Syntax does this mapping, (b) the SM system does this mapping, or (c) some third system does this mapping. Option a, of course, violates SMT, while option b contradicts Preminger’s premise. This leaves option c. Proposing a system in between pronunciation and syntax would allow us to save both SMT and Preminger’s notion of the SM system, but it would also invalidate Preminger’s overall argument.

The second issue is the assumption that non-SM ungrammaticality means non-generation. This is a common way of thinking of formal grammars, but very early on in the generative enterprise, researchers (including Chomsky) recognized that it was far too rigid—that there was a spectrum from perfect grammaticality to word salad that couldn’t be captured by the generated/not-generated dichotomy. Even without considering degrees of grammaticality, though, we can find examples of ungrammatical sentences that can be generated. Consider (4) as compared to (5).

(4) *What did who see?
(5) Who saw what?

Now, (4) is ungrammatical because wh-movement prefers to target the highest wh-expression, which suggests that in order to judge (4) as ungrammatical, a speaker needs to generate it. So, the Kaqchikel version of (1) might be generated by the grammar, but such generation would be deviant somehow.

Throughout his argument, though, Preminger says that he is only “tak[ing] Chomsky at his word”—I’ll leave that to the reader to judge. Regardless, though, if Chomsky had made such assumptions in an argument, it would be a flawed argument, but it wouldn’t refute the SMT.

A note on an equivocation in the UCLA Lectures

In his recent UCLA Lectures, Chomsky makes the following two suggestive remarks which seem to be contradictory:

. . . [I]magine the simplest case where you have a lexicon of one element and we have the operation internal Merge. [. . . ] You have one element: let’s just give it the name zero (0). We internally merge zero with itself. That gives us the set {0, 0}, which is just the set zero. Okay, we’ve now constructed a new element, the set zero, which we call one.

p24

We want to say that [X], the workspace which is a set containing X is distinct from X.
[X] ≠ X
We don’t want to identify a singleton set with its member. If we did, the workspace itself would be accessible to MERGE. However, in the case of the elements produced by MERGE, we want to say the opposite.
{X} = X
We want to identify singleton sets with their members.

p37

So in the case of arithmetic, a singleton set ({0}, one) is distinct from its member (0), but the two are identical in the case of language. This is either a contradiction—in which case we need to eliminate one of the statements—or it’s an equivocation—in which case we need to find and understand the source of the error. The former option would be expedient, but the latter is more interesting. So, I’ll go with the latter.
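The arithmetic half of this can be checked mechanically. What follows is only a minimal sketch, assuming that Python's frozenset is a fair stand-in for the sets Merge builds; the helper name merge and the choice of the string "0" as the lone lexical item are my own illustrative assumptions, not anything taken from the lectures.

```python
def merge(x, y):
    """Hypothetical stand-in for Merge: form the set containing x and y."""
    return frozenset([x, y])


# The one-element lexicon. Any hashable object would do; "0" is arbitrary.
zero = "0"

# Internal Merge of zero with itself: {0, 0}, which collapses to {0}.
one = merge(zero, zero)

# {0, 0} is just the singleton {0} ...
assert one == frozenset([zero])
assert len(one) == 1

# ... and, taken purely as a set, the singleton is distinct from its member.
assert one != zero
```

Nothing in the sketch can make one and zero come out equal, which is why the second quote needs some other measure of identity to be consistent with the first.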

The source of the equivocation, in my estimation, is the notion of identity—Chomsky’s remarks become consistent when we take him to be using different measures of identity and, in order to understand these distinctions, we need to dust off a rarely used dichotomy—form vs substance.

This dichotomy is perhaps best known to syntacticians due to Chomsky’s distinction between “formal universals” and “substantive universals” in Aspects, where formal universals were constraints on the types of grammatical rules in the grammar and substantive universals were constraints on the types of grammatical objects in the grammar. Now, depending on what aspect of grammar or cognition we are concerned with, the terms “form” and “substance” will pick out different notions and relations, but since we’re dealing with syntax here we can say that “form” picks out purely structural notions and relations, such as are derived by merge, while “substance” picks out everything else.

By extension, then, two expressions are formally identical if they are derived by the same sequences of applications of merge. This is a rather expansive notion. Suppose we derived a structure from an arbitrary array A of symbols. Any structure whose derivation can be expressed by swapping the symbols in A for distinct symbols will be formally identical to the original structure. So, “The sincerity frightened the boy” and “*The boy frightened the sincerity” would be formally identical, but, obviously, substantively distinct.
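To make the symbol-swapping idea concrete, here is a rough sketch under some loud assumptions: structures are written as nested Python tuples (which are ordered, unlike the sets Merge produces), the bracketings are simplified, and the function name formally_identical is hypothetical. It checks whether one structure can be turned into the other by a consistent, one-to-one swap of symbols.

```python
def formally_identical(a, b, fwd=None, bwd=None):
    """True if a and b have the same shape up to a one-to-one symbol swap."""
    fwd = {} if fwd is None else fwd  # symbol in a -> symbol in b
    bwd = {} if bwd is None else bwd  # symbol in b -> symbol in a
    a_is_leaf = not isinstance(a, tuple)
    b_is_leaf = not isinstance(b, tuple)
    if a_is_leaf != b_is_leaf:
        return False  # a symbol on one side, a merged structure on the other
    if a_is_leaf:
        # The swap must be consistent in both directions.
        return fwd.setdefault(a, b) == b and bwd.setdefault(b, a) == a
    return len(a) == len(b) and all(
        formally_identical(x, y, fwd, bwd) for x, y in zip(a, b)
    )


# Simplified bracketings, assumed for illustration only.
s1 = (("the", "sincerity"), ("frightened", ("the", "boy")))
s2 = (("the", "boy"), ("frightened", ("the", "sincerity")))
s3 = ("the", ("boy", "left"))

assert formally_identical(s1, s2)      # same form, different substance
assert not formally_identical(s1, s3)  # different form
```

The check never looks at what the symbols are, only at where they sit and whether the swap is consistent, which is the sense in which this notion of identity ignores substance.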

Substantive identity, though, is more complex. If substance picks out everything except form, then it would pick out everything to do with the pronunciation and meaning of an expression. So, from the pronunciation side, a structurally ambiguous expression is a set of (partially) substantively identical but formally distinct sentences, as are paraphrases on the meaning side.

Turning back to the topic at hand, the distinction between a singleton set and its member is purely formal, and therein lies the resolution of the apparent contradiction. Arithmetic is purely formal, so it traffics in formal identity/distinctness. Note that Chomsky doesn’t suggest that zero is a particular object—it could be any object. Linguistic expressions, on the other hand, have form and substance. So a singleton set {LI} and its member LI are formally distinct but, since they would mean and be pronounced the same, are substantively identical.

It follows from this, I believe, that the narrow faculty of language, if it is also responsible for our faculty of arithmetic, must be purely formal—constructing expressions with no regard for their content. So, the application of merge cannot be contingent on the contents of its input, nor could an operation like Agree, which is sensitive to the substance of an expression, be part of that same faculty. These conclusions, incidentally, can also be drawn from the Strong Minimalist Thesis.

Internal unity in science again

Or, how to criticize a scientific theory

Recently, I discovered a book called The Primacy of Grammar by philosopher Nirmalangshu Mukherji. The book is basically an extended, and in my opinion quite good, apologia for biolinguistics as a science. The book is very readable and covers a decent amount of ground, including an entire chapter discussing the viability of incorporating a faculty of music into biolinguistic theory. I highly recommend it.

At one point, while defending biolinguistics from the charge of incompleteness levied by semanticists and philosophers, Mukherji makes the following point.

[D]uring the development of a science, a point comes when our pretheoretical expectations that led to the science in the first place have changed enough, and have been accommodated enough in the science for the science to define its objects in a theory-internal fashion. At this point, the science—viewed as a body of doctrines—becomes complete in carving out some specific aspect of nature. From that point on, only radical changes in the body of theory itself—not pressures from common sense—force further shifting of domains (Mukherji 2001). In the case of grammatical theory, either that point has not been reached or … the point has been reached but not yet recognized.

Mukherji (2010, 122-3)

There are two interesting claims that Mukherji is making about linguistic theory and scientific theory in general. One is that theoretical objects are solely governed by theory-internal considerations. The other is that the theory itself determines what in the external world it applies to.

The first claim reminded me of a meeting I had with my doctoral supervisor while I was writing my thesis. My theoretical explanation rested on the hypothesis that even the simplest of non-function words, like coffee, were decomposable into root objects (√COFFEE) and categorizing heads (n⁰). I had a dilemma though. It was crucial to my argument that, while categorizing heads had discrete features, roots were treated as featureless blobs by the grammar, but I couldn’t figure out how to justify such a claim. When I expressed my concern to my supervisor, she immediately put my worries to rest. I didn’t need to justify that claim, she pointed out, because roots by their definition have no features.

I had fallen into a very common trap in syntax—I had treated a theory-internal object as an empirical object. Empirical objects can be observed and sensibly argued about. Take, for instance, English specificational clauses (e.g., The winner is Mary). Linguists can and do argue about the nature of these—i.e., whether or not they are truly the inverse of predicational clauses (e.g., Mary is the winner)—and cite facts to do so. This is because empirical objects and phenomena are out there in the real world, regardless of whether we study them. Theory-internal objects, on the other hand, are not subject to fact-based argument, because, unless the Platonists are right, they have no objective reality. As long as my theory is internally consistent, I can define its objects however I damn please. The true test of any theory is how well it can be mapped onto some aspect of reality.

This brings me to Mukherji’s second assertion, that the empirical domain of a theory is determined by the theory itself. In the context of his book, this assertion is about linguistic meaning. The pretheoretic notion of meaning is what he calls a “thick” notion—a multifaceted concept that is very difficult to pin down. The development of a biolinguistic theory of grammar, though, has led to a thinner notion of meaning, namely, the LF of a given expression. Now obviously, this notion of meaning doesn’t include notions of reference, truth, or felicity, but why should we expect it to? Yes, those notions belong to our common-sense ideas of meaning, but surely at this stage of human history, we should expect that scientific inquiry will reveal our common-sense notions to be flawed.

As an analogy, Aristotle and his contemporaries didn’t distinguish between physics, biology, chemistry, geology, and so on—they were all part of physics. One of the innovations of the scientific revolutions, then, was to narrow the scope of investigation—to develop theories of a sliver of nature. If Aristotle saw our modern physics departments, he might look past all of their fantastic theoretical advances and wonder instead why no one in the department was studying plants and animals. Most critiques of internalist/biolinguistic notions of semantics by modern philosophers and formal semanticists echo this hypothetical time-travelling Aristotle—they brush off any advances and wonder where the theory of truth is.

Taken together, these assertions imply a general principle: Scientific theories should be assessed on their own terms. Criticizing grammatical theory for its lack of a theory of reference makes as much sense as criticizing Special Relativity for its lack of a theory of genetic inheritance. While this may seem to render any theory beyond criticism, the history of science demonstrates that this isn’t the case. Consider, for instance, quantum mechanics, which has been subject to a number of criticisms in its own terms—see: Einstein’s criticisms of QM, Schrödinger’s cat, and the measurement problem. In some cases these criticisms are insurmountable, but in others addressing them head-on and modifying or clarifying the theory is what leads to advances in the theory. Chomsky’s Label Theory, I think, is one of the latter sorts of cases—a theory-internal problem was identified and addressed and as a result two unexplained phenomena (the EPP and the ECP) were given a theoretical explanation. We can debate how well that explanation generalizes and whether it leans too heavily on some auxiliary hypotheses, but what’s important is that a theory-internal addressing of a theory-internal problem opened up the possibility of such an explanation. This may seem wildly counter-intuitive, but as I argued in a previous post, this is the only practical way to do science.

The principle that a theory should be criticized in its own terms is, I think, what irks the majority of linguists about biolinguistic grammatical theory the most. It bothers them because it means that very few of their objections to the theory ever really stick. Ergativity, for instance, is often touted as a serious problem for Abstract Case Theory, but since grammatical theory has nothing to say about particular case alignments, theorists can just say “Yeah, that’s interesting” and move on. Or to take a more extreme case, recent years have seen all-out assaults on grammatical theory from people who bizarrely call themselves “cognitive linguists”, people like Vyvyan Evans and Daniel Everett, who claim to have evidence that roundly refutes the very notion of a language faculty. The response of biolinguists to this assault: mostly a resounding shrug as we turn back to our work.

So, critics of biolinguistic grammatical theory dismiss it in a number of ways. They say it’s too vague or slippery to be any good as a theory, which usually means they refuse to seriously engage with it; they complain that the theory keeps changing—a peculiar complaint to lodge against a scientific theory; or they accuse theorists of arrogance—a charge that, despite being occasionally true, is not a criticism of the theory. This kind of hostility can be bewildering, especially because a corollary of the idea that a theory defines its own domain is that everything outside that domain is a free-for-all. It’s hard to imagine a geneticist being upset that their data is irrelevant to Special Relativity. I have some ideas about where the hostility comes from but they’ll take me pretty far afield, so I’ll save them for a later post and leave it here.

Two freedoms

As I sit at home
I am presented with
An infinite sparkling sea 
Of choices at my fingertips

Such new delights offered
To arrive at my demand
And take me through from
Day to night and back again

But I would give anything to sit
And lean back in the creaking chairs
Of the same old bar
To drink the same old beer

With the same old friends
And new ones also
While we talk about our todays
And dream up our tomorrows