Some good news on the publication front

Today I woke up to an email from the editor of Biolinguistics informing me that my manuscript “A parallel derivation theory of adjuncts” had been accepted for publication. I was quite relieved, especially since I had been expecting some news about my submission for a couple of days—the ability to monitor the progress of submissions on a journal’s website is a decidedly mixed blessing—and there was a definite possibility in my mind that it could have been rejected.

It was also a relief because it’s been a long road with this paper. I first wrote about the kernel of its central idea, that syntactic adjuncts were entirely separate objects from their “hosts,” in my thesis, and I presented it a few times within the University of Toronto Linguistics Department. I first realized that it had some legs when it was accepted as a talk at the 2020 LSA Meeting in New Orleans, and I started working on it in earnest in the spring and summer of 2020, submitting the first manuscript version to a different journal in August 2020.

If you follow me on Twitter, you saw my reactions to the peer-review process in real time, but it’s worth summarizing. Versions of this manuscript underwent peer review at multiple journals, and in every case there were one or two constructive reviews: some positive, and some negative ones that nevertheless pointed out serious but fixable issues. Invariably, though, there was one reviewer who was clearly hostile to the manuscript, often resorting to sarcasm and vague comments.

I’m sure the manuscript improved over the various submissions, but I believe that the main reason that the paper will finally be published is that the editor of Biolinguistics, Kleanthes Grohmann, recognized and agreed with me that one of the reviewers was being unreasonable, so I definitely owe him my gratitude.

There are more edits to go, but you can look forward to seeing my paper in Biolinguistics in the near future.

Why are some ideas so sticky? A hypothesis

Anyone who has tried to articulate a new idea or criticize old ones may have noticed that some ideas are washed away relatively easily, while others seem to actively resist even the strongest challenges—some ideas are stickier than others. In some cases, there’s an obvious reason for this stickiness—in some cases there’s even a good reason for it. Some ideas are sticky because they’ve never really been interrogated. Some are sticky because there are powerful parts of society that depend on them. Some are sticky because they’re true, or close to true. But I’ve started to think there’s another reason an idea can be sticky—the amount of mental effort people put into understanding the idea as students.

Take, for instance, X-bar theory. I don’t think there’s some powerful cabal propping it up, it’s not old enough to just be taken for granted, and Chomsky’s Problems of Projection papers showed that it was not really tenable. Yet X-bar persists, and not just in how syntacticians draw trees or how they informally talk about them. I remember that commentary on my definition of minimal search here involved puzzlement about why I didn’t simply formalize the idea that specifiers were invisible to search, followed by more puzzlement when I explained that the notion of specifier was unformulable.

In my experience, the stickiness of X-bar theory (and of syntactic projection/labels more broadly) doesn’t manifest itself in an attempt to rebut arguments against it, but in attempts to save it, to reconstitute it in a theory that doesn’t include it.[1] This is very strange behaviour: X-bar is a theoretical construct, and it’s valid only insofar as it is coherent and empirically useful. Why are syntacticians fighting for it? I wondered about this for a while and then I remembered my experience learning X-bar and teaching it: it’s a real challenge. It’s probably the first challenging theoretical construct that syntax students are exposed to. It tends to be presented as a fait accompli, so students just have to learn how it functions. As a result, those students who do manage to figure it out are proud of it and defend it like someone protecting their cherished possessions.[2]

Of course, it’s a bit dangerous to speculate about the psychological motivations of others, but I’m certain I’ve had this reaction in the past when someone’s challenged an idea that I at one point struggled to learn. And I’ve heard students complain about the fact that every successive level of learning syntax starts with “everything you learned last year is wrong”—or at least that’s the sense they get. So, I have a feeling there’s at least a kernel of truth to my hypothesis. Now, how do I go about testing it?


Addendum

As I was writing this, I remembered something I frequently think when I’m preparing tests and exams that I’ve thus far only formulated as a somewhat snarky question:

How much of our current linguistic theory depends on how well it lends itself to constructing problem sets and exam questions?

References

1 My reading of Zeijlstra’s chapter in this volume is as one such attempt.
2 I think I may be describing “effort justification,” but I’m basing this just on the Wikipedia article

Some idle thoughts on the arguments for semantic externalism/internalism

This semester I’m teaching an intro semantics course for the first time and I decided to use Saeed’s Semantics as a textbook. It seems like a good textbook; it gives a good survey of all the modern approaches to semantics (internalist, externalist, even so-called cognitive semantics), though the externalist bias is clear if you know what to look for. For instance, the text is quick to bring up the famous externalist thought experiments (Putnam’s robotic cats, Quine’s gavagai, etc.) to undercut the internalist approaches, but doesn’t really seem to present the internalist critiques and counterarguments. So, I’ve been striving to correct that in my lectures.

While I was preparing my most recent lecture, something struck me. More precisely, I was suddenly able to put words to something that’s bothered me for a while about the whole debate: the externalist case is strongest for natural kinds, but the internalist case is strongest for human concepts. Putnam talks about cats and water, Kripke talks about tigers and gold, while Katz talks about bachelors and sometimes artifacts. This is not to say that the arguments on either side are unanswerable. Chomsky, I think, has provided pretty good arguments that, even for natural kinds, our internal concepts are quite complicated, and there are many thorny issues for internalist approaches too. But the two sides do have slightly different empirical bases, which no doubt inform their approaches: if your theory can handle artifact concepts really well, you might be tempted to treat everything that way.

I don’t quite know what to make of this observation yet, but I wanted to write it down before I forgot about it.


There’s also a potential, but maybe half-baked, political implication to this observation. Natural kinds are more or less constant in that, while they can be tamed and used by humans, we can’t really change them that much, and thinking that you can, say, turn lead into gold would mark you as a bit of a crackpot. Artifacts and social relations, on the other hand, are literally created by free human action. If you view the world with natural kinds at the center, you may be led to the view that the world has its own immutable laws that we can maybe harness, maybe adapt to, but never change.

If, on the other hand, your theory centers artifacts and social relations, then you might be led to the conclusion, as expressed by the late David Graeber, that “the ultimate hidden truth of the world is that it is something we make and could just as easily make differently.”

But, of course, I’m just speculating here.

Unmoored theory

I’ve written before about the dichotomy of descriptive vs theoretical sciences, but I’ve recently noticed another apparent dichotomy within theoretical sciences: expansionary vs focusing sciences. Expansionary sciences are those whose domain tends to expand. (Neo)classical economics seems to claim all human interaction in its domain; formal semantics now covers pragmatics, hand gestures, and monkey communication. Focusing sciences, by contrast, tend to have a rather constant domain or even a shrinking one. Chemistry today is about pretty much the same things as it was in the 17th century; generative syntactic theory is still about the language faculty. Assuming this is true,[1] the question is whether it reflects some underlying difference between these sciences. I’d like to argue that the distinction follows from how firm a science’s foundations are, and in particular from what I’ll call its empirical conjecture.

Every scientific theory, I think, basically takes the form of a conjoined sentence “There are these things/phenomena in the world and they act like this.” The second conjunct is the formal system that gives a theory its deductive power. The first conjunct is the empirical conjecture, and it turns the deductions of the formal system into predictions. While every science that progresses does so by positing new sorts of invisible entities, categories, etc., they all start with more or less familiar entities, categories, etc.: planets, metals, persons, and so on. This link to the familiar is the empirical foundation of a science. Sciences with a firm foundation are those whose empirical conjecture can be uncontroversially explained to a lay person or even an expert critic operating in good faith.

Contemporaries of, say, Robert Boyle might have thought the notion of corpuscles insanity, but they wouldn’t disagree that matter exists, exists in different forms, and that some of those forms interact in regular ways. Even the fiercest critic of UG, provided they are acting in good faith, would acknowledge that humans have a capacity for language and that that capacity probably has to do with our brains.

The same, I think, cannot be said about (neo)classical economics or formal semantics.[2] Classical economics starts with the conjecture that there are these members of the species homo economicus (the perfectly rational, self-interested, utility-maximizing agent) and derives theorems from there. This is obviously a bad characterization of humans. It is simultaneously too dim a view of humans, since we behave altruistically and non-individualistically all the time, and one that gives us far too much credit, since we are far from perfectly rational. Formal semantics, on the other hand, starts with the conjecture that meaning is reference: that words have meaning only insofar as they refer to things in the world. While not as obviously false as the homo economicus conjecture, the referentialist conjecture is still false: most words, upon close inspection, do not refer,[3] and there is a whole universe of meaning that has little to do with reference.

Most economists and semanticists would no doubt object to what the previous paragraph says about their discipline, and the objections would take one of two forms. Either they would defend homo economicus/referentialism, or they would downplay the importance of the conjecture in question: “Homo economicus is just a useful teaching tool for undergrads. No one takes it seriously anymore!”[4] “Semanticists don’t mean reference literally, we use model theory!” It’s this second sort of response that I think can explain the expansionary behaviour of these disciplines. Suppose we take these objections to be honest expressions of what people in the field believe: that economics isn’t about homo economicus and formal semantics isn’t about reference. Well then, what are they about? The rise of behavioural economics suggests that economists are still looking for a replacement model of human agency, and model theory is basically just reference delayed.

The theories, then, seem to be about nothing at all—or at least nothing that exists in the real world—and as a result, they can be about anything at all—they are unmoored.

Furthermore, there’s an incentive to expand your domain when possible. A theory of nothing obviously can’t be justified by giving any sort of deep explanation of any one aspect of nature, so it has to be justified by appearing to offer explanations to a breadth of topics. Neoclassical economics can’t seem to predict when a bubble will burst, or what will cause inflation, but it can give what looks like insight into family structures. Formal semantics can’t explain why “That pixel is red and green.” is contradictory, but it provides a formal language to translate pragmatics into.

There’s a link here to my past post about falsification, because just as a theory about nothing can be a theory about anything, a theory about nothing cannot be false. So, watch out—if your empirical domain seems to be expanding, you might not be doing science any more.

References

1 It’s pretty much a tautology that a science’s domain will either grow, stay constant, or shrink over time
2 Now obviously, there’s a big difference between the two fields: neoclassical economics is extremely useful to the rich and powerful since it lets them justify just about any horrendous crimes they would want to commit in the name of expanding their wealth and power, while formal semantics is a subdiscipline of a minor oddball discipline on the boundaries of humanities, social science, and cognitive science. But I’m a linguist, and I think mostly linguists read this.
3 I could point you to my own writing on this, the works of Jerrold Katz, and arguments from Noam Chomsky on referentialism, or I could point out that one of the godfathers of referentialism, Ludwig Wittgenstein, seems to have repudiated it in his later work.
4 Though, as the late David Graeber pointed out, economists never object when homo economicus is discussed in a positive light.

But it’s obvious, isn’t it?

As a linguist or, more specifically, as a theoretical syntactician, I hold and often express some minority opinions.[1] Often these opinions are met with bafflement and an assertion like “We’ve known for years that that’s not the case” because of this phenomenon, or that piece of data: “Control is derived by movement? But what about de se interpretation??” “Merge is free? But what about c-selection??” “Long-distance Agree isn’t real? But what about English existential clauses??”[2] These sorts of objections are often tossed out as if the data speaks for itself when really, the thing that makes scientific inquiry so tough is that the data rarely speaks for itself, and when it does, it doesn’t do so clearly.

Take, for instance, the case of English existential clauses like (1) and (2) and how they are used as absolute proof of the existence of Long-Distance Agree.

(1) There ?seems/seem to be several fish in the tank.
(2) There seems/*seem to be a fish in the tank.

In both sentences, the grammatical subject is the expletive there, but the verb agrees with a DP[3] that appears to be structurally “lower” in the clause. Therefore, there must be some non-movement way of getting features from a lower object onto a higher object: Long-Distance Agree. This is often presented as the obvious conclusion, the only conclusion, or the simplest conclusion. “Obvious” is in the eye of the beholder and doesn’t usually mean “correct”; “only” doesn’t hold up, since Norbert Hornstein, in his A Theory of Syntax, proposes three alternative analyses to Long-Distance Agree; only “simplest” has legs, although that’s debatable.

Occam’s razor says “entities should not be multiplied without necessity,” and any analysis of (1) and (2) without Long-Distance Agree will have to say that in both cases, the agreeing DP is covertly in subject position. These covert subjects are argued to constitute an unnecessary multiplication of entities, but one could just as easily argue that Long-Distance Agree is an unnecessary entity. What’s more, covert movement and silent elements both have independent arguments in their favour.

Of course, the covert subject analysis of (1) and (2) is not without its flaws. Chief among them, in my opinion, is that it would seem to wrongly predict that (1) and (2) mean the same thing as (3) and (4), respectively.

(3) Three fish seem to be in the tank.
(4) A fish seems to be in the tank.

Sentences (3) and (4) differ from (1) and (2) in that they presuppose the existence of three fish or a single fish, while (1) and (2) merely assert it. This contrast is clearest in (5)-(8), which are examples that Chomsky has been using for several decades.

(5) There’s a fly in my soup.
(6) There’s a flaw in my argument.
(7) A fly is in my soup.
(8) *?A flaw is in my argument.

Likewise, Long-Distance Agree has its own problems, some of which I discuss in my latest paper. Indeed, it is vanishingly rare in any field of inquiry—or life itself—to find an unproblematic solution to a problem.

My goal here isn’t to argue that Long-Distance Agree is wrong,[4] but to point out that it’s not a foregone conclusion. In fact, I think that if we listed the hypotheses/theories/notions that most syntacticians take to be (nearly) unquestionable and honestly assessed the arguments in their favour, not many would turn out to be as robust as they seem. This doesn’t mean that we need to reject every idea that is less than 100% solid, just that we should hold on to them a little more loosely. As a rule, we should all carry with us the idea that we could very well be wrong about almost everything. The world’s more interesting that way.

References

1 Outside of syntactic theory too
2 I have a hypothesis that the vehemence with which someone will defend a theory or analysis is correlated with how much they struggled to understand it in school. Basically, we’re more likely to die on a hill if we had to fight to summit that hill. This has some interesting implications that I might get into in a later post.
3 I still think I buy the DP hypothesis, but I’m also intrigued by Chomsky’s recent rejection of it and amused by the reaction to this rejection.
4 Though, I do think it is.

New LingBuzz Paper

(or “How I’ve been spending my unemployment*”)

Yesterday I finished and posted a paper to LingBuzz. It’s titled “Agree as derivational operation: Its definition and discontents” and its abstract is given below. If it sounds interesting, have a look and let me know what you think.

Using the framework laid out by Collins and Stabler (2016), I formalize Agree as a syntactic operation. I begin by constructing a formal definition of a version of long-distance Agree in which a higher object values a feature on a lower object, and modify that definition to reflect several versions of Agree that have been proposed in the “minimalist” literature. I then discuss the theoretical implications of these formal definitions, arguing that Agree (i) muddies our understanding of the evolution of language, (ii) requires a new conception of the lexicon, (iii) objectively and significantly increases the complexity of syntactic derivations, and (iv) unjustifiably violates NTC in all its non-vacuous forms. I conclude that Agree, as it is commonly understood, should not be considered a narrowly syntactic operation.

*Thanks to the Canada Recovery Benefit, I was able to feed myself and make rent while I wrote this.

A Response to some comments by Omer Preminger on my comments on Chomsky’s UCLA Lectures

On his blog, Omer Preminger posted some comments on my comments on Chomsky’s UCLA Lectures, in which he argues that “committing oneself to the brand of minimalism that Chomsky has been preaching lately means committing oneself to a relatively strong version of the Sapir-Whorf Hypothesis.” His argument goes as follows.

Language variation exists. To take Preminger’s example, “in Kaqchikel, the subject of a transitive clause cannot be targeted for wh-interrogation, relativization, or focalization. In English, it can.” 21st century Chomskyan minimalism, and specifically the Strong Minimalist Thesis (SMT), says that this variation comes from (a) variation in the lexicon and (b) the interaction of the lexical items with either the Sensory-Motor system or the Conceptual-Intentional system. Since speakers of a language can process and pronounce some ungrammatical expressions (some Kaqchikel speakers can pronounce an equivalent of (1) but judge it as unacceptable), some instances of variation are due to the interaction of the Conceptual-Intentional system with the lexicon.

(1) It was the dog who saw the child.

It follows from this that either (a) the Conceptual-Intentional systems of English-speakers and Kaqchikel-speakers differ from each other or (b) English-speakers can construct Conceptual-Intentional objects that Kaqchikel-speakers cannot (and vice-versa, I assume). Option a, Preminger asserts, is the Sapir-Whorf hypothesis, while option b is tantamount to (a non-trivial version of) it. So, the SMT leads unavoidably to the Sapir-Whorf hypothesis.

I don’t think Preminger’s argument is sound, and even if it were, its conclusion isn’t as dire as he makes it out to be. Let’s take these one at a time in reverse order.

The version of the Sapir-Whorf hypothesis that Preminger has deduced from the SMT is something like the following: the Conceptual-Intentional (CI) content of a language is the set of all (distinct) CI objects constructed by that language, and different languages have different CI content. This hypothesis, it seems, turns on how we distinguish between CI objects, which is far from a trivial question. Obviously, contradictory, contrary, and logically independent sentences are CI-distinct from each other, as are non-mutually entailing sentences and co-extensive but non-co-intensive expressions, but what about true paraphrases? Assuming there is some way in Kaqchikel of expressing the proposition expressed by (1), then we can avoid Sapir-Whorf by saying that paraphrases express identical CI-objects. This avoidance, however, is only temporary. Take (2) and (3), for instance.

(2) Bill sold secrets to Karla.
(3) Karla bought secrets from Bill.

If (2) and (3) map to the same CI object, what does that object “look” like? Is (2) the “base form” and (3) is converted to it or vice versa? Do some varieties of English choose (2) and others (3), and wouldn’t that make these varieties distinct languages?

If (2) and (3) are distinct, however, it frees us (and, more importantly, the language learner) from having to choose a base form, but it leads us immediately to the question of what it means to be a paraphrase, or a synonym. I find this a more interesting theoretical question than any of those raised above, but I’m willing to listen if someone thinks otherwise.

So, we end up with some version of the Sapir-Whorf hypothesis no matter which way we go. I realize this is a troubling result for many generative linguists as linguistic relativity, along with behaviourism and connectionism, is one of the deadly sins of linguistics. For me, though, Sapir-Whorf suffers from the same flaw that virtually all broad hypotheses of the social sciences suffer from: it’s so vague that it can be twisted and contorted to meet any data. In the famous words of Wolfgang Pauli, it’s not even wrong. If we were dealing with atoms and quarks, we could just ignore such a theory, but since Sapir-Whorf deals with people, we need to be a bit more careful. One need not think very hard to see how Sapir-Whorf or any other vague social hypothesis can be used to excuse, or even encourage, all varieties of discrimination and violence.

The version of Sapir-Whorf that Preminger identifies (the one that I discuss above) seems rather trivial to me, though.

There are also a few problems with Preminger’s argument that jumped out at me, of which I’ll highlight two. First, in his discussion of the Sensory-Motor (SM) system, he seems to assume that any expression that is pronounceable by a speaker is a-ok with that speaker’s SM system. He seems to assume this because he asserts that any argument to the contrary is specious. Since the offending Kaqchikel string is a-ok with the SM system, it must run afoul of either the narrow syntax (unlikely according to SMT) or the CI system. This line of reasoning, though, is flawed, as we can see by applying its logic to a non-deviant sentence, like the English version of (1). Following Preminger’s reasoning, the SM system tells us how to pronounce (1) and the CI system uses the structure of (1) generated by Merge for internal thought. This, however, leaves out the step of mapping the linear pronunciation of (1) to its hierarchical structure. Either (a) the Narrow Syntax does this mapping, (b) the SM system does this mapping, or (c) some third system does this mapping. Option a, of course, violates SMT, while option b contradicts Preminger’s premise. This leaves option c. Proposing a system in between pronunciation and syntax would allow us to save both SMT and Preminger’s notion of the SM system, but it would also invalidate Preminger’s overall argument.

The second issue is the assumption that non-SM ungrammaticality means non-generation. This is a common way of thinking of formal grammars, but very early on in the generative enterprise, researchers (including Chomsky) recognized that it was far too rigid: there was a spectrum from perfect grammaticality to word salad that couldn’t be captured by the generated/not-generated dichotomy. Even without considering degrees of grammaticality, though, we can find examples of ungrammatical sentences that can be generated. Consider (4) as compared to (5).

(4) *What did who see?
(5) Who saw what?

Now, (4) is ungrammatical because wh-movement prefers to target the highest wh-expression, which suggests that in order to judge (4) as ungrammatical, a speaker needs to generate it. So, the Kaqchikel version of (1) might be generated by the grammar, but such generation would be deviant somehow.

Throughout his argument, though, Preminger says that he is only “tak[ing] Chomsky at his word”; I’ll leave that to the reader to judge. Regardless, though, if Chomsky had made such assumptions in an argument, it would be a flawed argument, but it wouldn’t refute the SMT.

A note on an equivocation in the UCLA Lectures

In his recent UCLA Lectures, Chomsky makes the following two suggestive remarks which seem to be contradictory:

. . . [I]magine the simplest case where you have a lexicon of one element and we have the operation internal Merge. [. . . ] You have one element: let’s just give it the name zero (0). We internally merge zero with itself. That gives us the set {0, 0}, which is just the set zero. Okay, we’ve now constructed a new element, the set zero, which we call one.

p24

We want to say that [X], the workspace which is a set containing X is distinct from X.
[X] ≠ X
We don’t want to identify a singleton set with its member. If we did, the workspace itself would be accessible to MERGE. However, in the case of the elements produced by MERGE, we want to say the opposite.
{X} = X
We want to identify singleton sets with their members.

p37

So in the case of arithmetic, a singleton set ({0}, one) is distinct from its member (0), but the two are identical in the case of language. This is either a contradiction, in which case we need to eliminate one of the statements, or it’s an equivocation, in which case we need to find and understand the source of the error. The former option would be expedient, but the latter is more interesting. So, I’ll go with the latter.
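To make the arithmetic half of the contrast concrete, here is a minimal sketch in Python. It rests on my own simplifying assumptions rather than anything in the lectures: Merge is modelled as plain set formation over hashable objects, and the names merge, ZERO, and ONE are invented for the illustration.

```python
def merge(x, y):
    """Toy Merge (an assumption of this sketch): the unordered set containing x and y."""
    return frozenset({x, y})


ZERO = "0"               # the single lexical item; any symbol would do
ONE = merge(ZERO, ZERO)  # internal Merge of zero with itself

# {0, 0} collapses to the singleton {0} ...
assert ONE == frozenset({ZERO})
assert len(ONE) == 1

# ... and, read as arithmetic, that singleton is a new object, not zero itself.
assert ONE != ZERO
```

On this toy encoding the singleton and its member can never be equal, which is exactly the arithmetic reading; nothing in the encoding by itself delivers the {X} = X identification that Chomsky wants for linguistic objects.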

The source of the equivocation, in my estimation, is the notion of identity—Chomsky’s remarks become consistent when we take him to be using different measures of identity and, in order to understand these distinctions, we need to dust off a rarely used dichotomy—form vs substance.

This dichotomy is perhaps best known to syntacticians due to Chomsky’s distinction between “formal universals” and “substantive universals” in Aspects, where formal universals were constraints on the types of grammatical rules in the grammar and substantive universals were constraints on the types of grammatical objects in the grammar. Now, depending on what aspect of grammar or cognition we are concerned with, the terms “form” and “substance” will pick out different notions and relations, but since we’re dealing with syntax here we can say that “form” picks out purely structural notions and relations, such as are derived by merge, while substance picks out everything else.

By extension, then, two expressions are formally identical if they are derived by the same sequences of applications of merge. This is a rather expansive notion. Suppose we derive a structure from an arbitrary array A of symbols. Any structure whose derivation can be expressed by swapping the symbols in A for distinct symbols will be formally identical to the original structure. So, “The sincerity frightened the boy.” and “*The boy frightened the sincerity” would be formally identical, but, obviously, substantively distinct.
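The symbol-swapping test can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: structures are written as nested tuples (so, unlike the sets built by merge, they are ordered), leaves are strings, and the function name formally_identical and the bracketings below are hypothetical.

```python
def formally_identical(a, b):
    """True if b is a with its leaf symbols consistently swapped, one-to-one."""
    forward, backward = {}, {}

    def walk(x, y):
        x_leaf, y_leaf = isinstance(x, str), isinstance(y, str)
        if x_leaf or y_leaf:
            if not (x_leaf and y_leaf):
                return False  # a leaf can only correspond to a leaf
            # The swap must be consistent in both directions.
            return forward.setdefault(x, y) == y and backward.setdefault(y, x) == x
        if len(x) != len(y):
            return False
        return all(walk(p, q) for p, q in zip(x, y))

    return walk(a, b)


# Illustrative bracketings only; the real structures would be richer.
s1 = (("the", "sincerity"), ("frightened", ("the", "boy")))
s2 = (("the", "boy"), ("frightened", ("the", "sincerity")))
s3 = (("the", "boy"), "slept")

assert formally_identical(s1, s2)      # same form, different substance
assert not formally_identical(s1, s3)  # different form
```

The check ignores what the symbols are and looks only at how they are put together, which is the sense of “formal” at issue here.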

Substantive identity, though, is more complex. If substance picks out everything except form, then it would pick out everything to do with the pronunciation and meaning of an expression. So, from the pronunciation side, a structurally ambiguous expression is a set of (partially) substantively identical but formally distinct sentences, as are paraphrases on the meaning side.

Turning back to the topic at hand, the distinction between a singleton set and its member is purely formal, and therein lies the resolution of the apparent contradiction. Arithmetic is purely formal, so it traffics in formal identity/distinctness. Note that Chomsky doesn’t suggest that zero is a particular object—it could be any object. Linguistic expressions, on the other hand, have form and substance. So a singleton set {LI} and its member LI are formally distinct but, since they would mean and be pronounced the same, are substantively identical.

It follows from this, I believe, that the narrow faculty of language, if it is also responsible for our faculty of arithmetic, must be purely formal, constructing expressions with no regard for their content. So, the application of merge cannot be contingent on the contents of its input, nor could an operation like Agree, which is sensitive to the substance of an expression, be part of that same faculty. These conclusions, incidentally, can also be drawn from the Strong Minimalist Thesis.

Internal unity in science again

Or, how to criticize a scientific theory

Recently, I discovered a book called The Primacy of Grammar by philosopher Nirmalangshu Mukherji. The book is basically an extended, and in my opinion quite good, apologia for biolinguistics as a science. The book is very readable and covers a decent amount of ground, including an entire chapter discussing the viability of incorporating a faculty of music into biolinguistic theory. I highly recommend it.

At one point, while defending biolinguistics from the charge of incompleteness levied by semanticists and philosophers, Mukherji makes the following point.

[D]uring the development of a science, a point comes when our pretheoretical expectations that led to the science in the first place have changed enough, and have been accommodated enough in the science for the science to define its objects in a theory-internal fashion. At this point, the science—viewed as a body of doctrines—becomes complete in carving out some specific aspect of nature. From that point on, only radical changes in the body of theory itself—not pressures from common sense—force further shifting of domains (Mukherji 2001). In the case of grammatical theory, either that point has not been reached or … the point has been reached but not yet recognized.

Mukherji (2010, 122-3)

There are two interesting claims that Mukherji is making about linguistic theory and scientific theory in general. One is that theoretical objects are solely governed by theory-internal considerations. The other is that the theory itself determines what in the external world it applies to.

The first claim reminded me of a meeting I had with my doctoral supervisor while I was writing my thesis. My theoretical explanation rested on the hypothesis that even the simplest of non-function words, like coffee, were decomposable into root objects (√COFFEE) and categorizing heads (n0). I had a dilemma though. It was crucial to my argument that, while categorizing heads had discrete features, roots were treated as featureless blobs by the grammar, but I couldn’t figure out how to justify such a claim. When I expressed my concern to my supervisor, she immediately put my worries to rest. I didn’t need to justify that claim, she pointed out, because roots by their definition have no features.

I had fallen into a very common trap in syntax: I had treated a theory-internal object as an empirical object. Empirical objects can be observed and sensibly argued about. Take, for instance, English specificational clauses (e.g., The winner is Mary). Linguists can and do argue about the nature of these, such as whether or not they are truly the inverse of predicational clauses (e.g., Mary is the winner), and cite facts to do so. This is because empirical objects and phenomena are out there in the real world, regardless of whether we study them. Theory-internal objects, on the other hand, are not subject to fact-based argument, because, unless the Platonists are right, they have no objective reality. As long as my theory is internally consistent, I can define its objects however I damn please. The true test of any theory is how well it can be mapped onto some aspect of reality.

This brings me to Mukherji’s second assertion, that the empirical domain of a theory is determined by the theory itself. In the context of his book, this assertion is about linguistic meaning. The pretheoretic notion of meaning is what he calls a “thick” notion, a multifaceted concept that is very difficult to pin down. The development of a biolinguistic theory of grammar, though, has led to a thinner notion of meaning, namely, the LF of a given expression. Now obviously, this notion of meaning doesn’t include notions of reference, truth, or felicity, but why should we expect it to? Yes, those notions belong to our common-sense ideas of meaning, but surely at this stage of human history, we should expect that scientific inquiry will reveal our common-sense notions to be flawed.

As an analogy, Aristotle and his contemporaries didn’t distinguish between physics, biology, chemistry, geology, and so on; they were all part of physics. One of the innovations of the scientific revolutions, then, was to narrow the scope of investigation, to develop theories of a sliver of nature. If Aristotle saw our modern physics departments, he might look past all of their fantastic theoretical advances and wonder instead why no one in the department was studying plants and animals. Most critiques of internalist/biolinguistic notions of semantics by modern philosophers and formal semanticists echo this hypothetical time-travelling Aristotle: they brush off any advances and wonder where the theory of truth is.

Taken together, these assertions imply a general principle: Scientific theories should be assessed on their own terms. Criticizing grammatical theory for its lack of a theory of reference makes as much sense as criticizing Special Relativity for its lack of a theory of genetic inheritance. While this may seem to render any theory beyond criticism, the history of science demonstrates that this isn’t the case. Consider, for instance, quantum mechanics, which has been subject to a number of criticisms in its own terms—see: Einstein’s criticisms of QM, Schrödinger’s cat, and the measurement problem. In some cases these criticisms are insurmountable, but in others addressing them head-on and modifying or clarifying the theory is what leads to advances in the theory. Chomsky’s Label Theory, I think, is one of the latter sorts of cases—a theory-internal problem was identified and addressed and as a result two unexplained phenomena (the EPP and the ECP) were given a theoretical explanation. We can debate how well that explanation generalizes and whether it leans too heavily on some auxiliary hypotheses, but what’s important is that a theory-internal addressing of a theory-internal problem opened up the possibility of such an explanation. This may seem wildly counter-intuitive, but as I argued in a previous post, this is the only practical way to do science.

The principle that a theory should be criticized in its own terms is, I think, what irks the majority of linguists about biolinguistic grammatical theory the most. It bothers them because it means that very few of their objections to the theory ever really stick. Ergativity, for instance, is often touted as a serious problem for Abstract Case Theory, but since grammatical theory has nothing to say about particular case alignments, theorists can just say “Yeah, that’s interesting” and move on. Or, to take a more extreme case, recent years have seen all-out assaults on grammatical theory from people who bizarrely call themselves “cognitive linguists”, people like Vyvyan Evans and Daniel Everett, who claim to have evidence that roundly refutes the very notion of a language faculty. The response of biolinguists to this assault: mostly a resounding shrug as we turn back to our work.

So, critics of biolinguistic grammatical theory dismiss it in a number of ways. They say it’s too vague or slippery to be any good as a theory, which usually means they refuse to seriously engage with it. They complain that the theory keeps changing, which is a peculiar complaint to lodge against a scientific theory. Or they accuse theorists of arrogance, a charge that, despite being occasionally true, is not a criticism of the theory. This kind of hostility can be bewildering, especially because a corollary of the idea that a theory defines its own domain is that everything outside that domain is a free-for-all. It’s hard to imagine a geneticist being upset that their data is irrelevant to Special Relativity. I have some ideas about where the hostility comes from but they’ll take me pretty far afield, so I’ll save them for a later post and leave it here.

Self-Promotion: I posted a manuscript to LingBuzz.

Hi all,

I’ve been working on a paper for a few months and it’s finally reached the point where I need to show it to some people who can tell me whether or not I’m crazy. To that end, I posted it on LingBuzz.

It’s called “A workspace-based theory of adjuncts,” and be forewarned it’s pretty technical. So if you’re just here for my hot takes on why for-profit rent is bad, or what kind of science generative syntax is, or the like, it might not be for you.

If it is for you, and you have any comments on it, please let me know.

Happy reading!