Theory first, then description

Picking up on an unfollowed line of reasoning from my last post, I’d like to argue for what might seem like a bold claim: The only practical way to make any discoveries or advances in any science, including syntax, is by starting with the theory.

First let me head off a possible objection, namely the Kuhnian objection that sciences don’t advance, they simply move from paradigm to paradigm. If this were true, there would still be a defensible version of my claim: If discoveries or advances are possible in syntax or any science, then they are possible only by starting with the theory. So, if the Kuhnian is right, then my claim might be strictly academic, but it’s not necessarily wrong. Of course, if the Kuhnian objection is correct, then the fact that you are reading this now is tantamount to a literal miracle.

Since I take discoveries and advances to be two separate things, I’ll take them one at a time, starting with the latter.

A science advances, in my estimation, when one of two things happens: either an additional general truth is derived, or two general truths are subsumed under a single general truth. Theoretical sciences traffic in general statements, and they do so using deductive reasoning. Since deduction generates true statements from true statements, it stands to reason that theoretical sciences are capable, in principle, of advancing. Descriptive sciences, on the other hand, traffic in particular truths (i.e., facts) and do so by a combination of observation and induction. Since observation only yields facts and induction cannot reliably derive general truths from facts, descriptive sciences cannot advance.

Turning to discoveries, I take these to be additions of novel existential truths to our knowledge. Since existential statements (i.e., “There are X” or “X exists”) are derivable from particular statements, they can in principle be derived in a descriptive science. Hence my use of the qualifier “practical” in my claim. My claim regarding discovery, then, is that descriptive scientific work can reliably make discoveries only insofar as it is guided by theoretical scientific work. To argue for my claim, I will first employ a thought experiment:

A densely freckled patient wishes to know whether they have any signs of melanoma. They first visit a lab tech who is trained in taking biopsies but whose knowledge of melanoma is purely instrumental. He can take a biopsy, run a test on it, and interpret the results of that test. His diagnostic plan is to biopsy every single freckle on the patient’s body.

The patient’s second visit is to a trained dermatologist, whose understanding of the general nature of melanoma allows her to recognize likely tumors by sight alone. Her diagnostic plan is to carefully examine the patient’s skin and order biopsies on the likely tumors.

Which diagnostic plan would you choose if you were this patient?

Clearly the dermatologist’s plan is more practical, and it is only possible because of her theoretical understanding of melanoma. Of course, both methods would have a good shot at discovering a tumor if it existed, assuming the lab tech is exceptionally thorough and the dermatologist is exceptionally knowledgeable about melanoma. However, suppose we increased the number of patients that wished to be screened. Clearly the rate of discovery of tumors would go down in either case, but likely the decline would be greater for the lab tech, whose initial burden of work is greater. Or suppose we vary the degree of thoroughness of the lab tech and the degree of tumor-knowledge of the dermatologist. An increase in thoroughness would likely slow down the lab tech, while an increase in tumor-knowledge would speed up the dermatologist. In both cases, theoretical knowledge trumps descriptive skill because theoretical knowledge allows us to distinguish relevant data from irrelevant data.

The second part of my argument comes from the history of science, specifically the discovery of the planet Neptune. The story, at least according to Wikipedia, goes as follows: multiple astronomers noticed that the orbit of Uranus (Neptune’s neighbour closer to the sun) did not line up with what Newton’s theory of gravitation predicted. This led them to hypothesize a yet-to-be-discovered planet whose gravitational pull disturbed the orbit of Uranus. And beyond just hypothesizing its existence, they, and in particular Urbain Le Verrier, were able to calculate its position in the sky at a particular time, and an astronomer with access to a suitable observatory, following Le Verrier’s instructions, was able to observe Neptune in 1846.

This on its own is a powerful demonstration of the usefulness of theoretical science, but it’s made more powerful when you take into account the fact that Galileo observed Neptune as early as 1613, but he did not discover it. While this may seem like a contradiction, it makes more sense when you add that Galileo failed to recognize that Neptune was a planet, mistaking it for a star, because he had no reason to think he would find a planet there.

To sum up, descriptive science is incapable of making advances, because it traffics in particulars. And while it is indeed capable of making discoveries, it is horribly inefficient. Theoretical science, on the other hand, is perfectly suited to advances, and can guide descriptive science to discoveries like a rider guides a horse. This leads to an inversion of the prevailing wisdom that says only once we have a broad description of the relevant phenomena are we in a position to build a theory. On the contrary! Only once we have a clear theory are we even able to know what the relevant phenomena are.

What kind of a science is Generative Syntax?

Recently, I found myself reading Edmund Husserl’s Logical Investigations. I didn’t make it that far into it—the language is rather abstruse—but included in the fragments of what I did read was a section in which Husserl clarified something that I’ve been thinking about recently, which is the place of theory in a science. In the section in question, Husserl defines a science as a set of truths that belong together. So, the truths of physics belong together, and the truths of economics belong together, but the former and the latter don’t belong together. But what does it mean, Husserl asks, for truths to belong together?

Husserl’s answer is that it can mean one of two things. Either truths belong together because they share an internal unity or because they share an external unity. Truths—that is, true propositions—are linked by an internal unity if they are logically related. So, a theorem and the axioms that it is derived from share an internal unity, as would two theorems derived from a set of internally consistent axioms, and so on. The type of science characterized by internal unity, Husserl calls abstract, explanatory, or theoretical science. This class would include arithmetic, geometry, most modern physics, and perhaps other fields.

A set of truths has external unity if the members of the set are all about the same sort of thing. So, geography, political science, history, pre-modern physics, and so on would be the class of sciences characterized by external unity. Husserl calls these descriptive sciences.

When I read the description of this dichotomy, I was struck both by how simple and intuitive it was, and by how meaningful it was, especially compared to the common ways we tend to attempt to divide up the sciences (hard sciences vs soft sciences, science vs social science, etc). The distinction also happens to neatly divide fields of inquiry into those that generate predictions (theoretical sciences) and those that do not (descriptive sciences). Why does a theoretical science generate predictions while a descriptive one does not? Well, consider the starting point of each of the two. A theoretical science, requiring internal unity, would start with axioms, which can be any kind of propositions, including universal propositions (e.g., “Every number has a successor”, “No mass can be created or destroyed”). On the other hand, a descriptive science, which requires external unity, would start with observable facts, which must be particular propositions (e.g., “The GDP of the Marshall Islands rose by 3% last year”, “That ball fell for 5 seconds”). This matters because deductive reasoning is only possible if a system has at least some universal premises. So, a theoretical science generates theorems, which constitute the predictions of that science. A descriptive science, on the other hand, is limited to inductive reasoning, which at best generates expectations. The difference is that if a theorem/prediction is false, then at least one of the axioms that it is derived from must be false, while if an expectation is false, it doesn’t mean that the facts that “generated” that expectation are false.
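The asymmetry between predictions and expectations can be written out schematically. This is only a sketch in ordinary classical logic; the symbols are my own labels, with A_1, ..., A_n standing for axioms, T for a theorem, F_1, ..., F_n for observed facts, and E for an expectation.

```latex
% Deduction: if the axioms entail T, a false T refutes at least one axiom.
A_1, \dots, A_n \vdash T
\quad\Longrightarrow\quad
\neg T \vdash \neg A_1 \lor \dots \lor \neg A_n

% Induction: the facts do not entail the expectation E,
% so a false E is consistent with every fact being true.
F_1, \dots, F_n \nvdash E
\quad\Longrightarrow\quad
\{F_1, \dots, F_n, \neg E\} \text{ is consistent}
```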

Turning to the question I asked in my title, what kind of science is Generative Syntax (GS)? My answer is that there are actually two sciences, one theoretical and one descriptive, that answer to the name Generative Syntax, and that most of the current work is of the latter type. Note, I don’t mean to distinguish between experimental/corpus/field syntax and what’s commonly called “theoretical syntax”. Rather, I mean to say that, even if we restrict ourselves to “theoretical syntax,” most of the work being done today is part of a descriptive science in Husserl’s terminology. To be more concrete, let me consider two currently open fields of inquiry within GS: one which is quite active, ergativity, and one which is less popular, adjuncts.

Ergativity, for the uninitiated, is a phenomenon having to do with grammatical case. In English, a non-ergative language, pronouns come in two cases: nominative (I, he, she, they, etc), which is associated with subjects, and accusative (me, him, her, them, etc), which is associated with objects. An ergative language also has two cases: ergative, which is associated with subjects of transitive verbs, and absolutive, which is associated with objects of transitives and subjects of intransitives. To be sure, this is an oversimplification, and ergativity has been found to be associated with many other phenomena that don’t occur in non-ergative languages. Details aside, suppose we wanted to define a science of ergativity or, more broadly, a science of case alignment in Husserl’s terminology. What sort of unity would it have? I contend that it has only external unity. That is, it is a descriptive science. It begins with the fact that the case systems of some languages are different from the case systems that most linguistics students are used to. Put another way, if English were an ergative language, linguists would be puzzling over all these strange languages where the subjects always had the same case.

Adjuncts, a fancy term for modifiers, are the “extra” parts of sentences: adjectives and adverbs, the things newspaper editors hate. Adjuncts contrast with arguments (subjects, objects, etc) and predicates, which each sentence needs and needs in a particular arrangement. So, the sentences “She sang the song with gusto after dinner” and “She sang the song after dinner with gusto” are essentially identical, but “She sang the song” and “The song sang her” are wildly different. On its face, this is not particularly interesting (adjuncts are commonplace), but every unified theory of GS predicts that adjuncts should not exist. Take the current one, commonly called minimalism. According to this theory, sentences are constructed by iterated application of an operation called Merge, which simply takes two words or phrases and creates a new phrase (Merge(X, Y) → {X, Y} ≠ X ≠ Y). It follows from this that “She sang the song” and “The song sang her” are meaningfully distinct, but it also follows (falsely) that “She sang the song with gusto after dinner” and “She sang the song after dinner with gusto” are also meaningfully different. From this perspective, the study of adjuncts doesn’t constitute a science in itself, but rather it is part of a science with internal unity, a theoretical science.
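The prediction can be made concrete with a toy model. What follows is a minimal sketch, assuming Merge is nothing more than unordered set formation and that words are bare strings. The function name `merge` and the particular bracketing of each sentence are my own illustrative assumptions, not a claim about any specific minimalist analysis.

```python
def merge(x, y):
    """Toy Merge: combine two syntactic objects into the unordered set {x, y}."""
    return frozenset({x, y})


# Arguments: swapping subject and object changes the resulting structure.
the_song = merge("the", "song")
s1 = merge("she", merge("sang", the_song))   # "She sang the song"
s2 = merge(the_song, merge("sang", "her"))   # "The song sang her"

# Adjuncts: the same clause with two modifiers attached in either order.
clause = merge("she", merge("sang", the_song))
with_gusto = merge("with", "gusto")
after_dinner = merge("after", "dinner")
a1 = merge(merge(clause, with_gusto), after_dinner)   # "... with gusto after dinner"
a2 = merge(merge(clause, after_dinner), with_gusto)   # "... after dinner with gusto"

print(s1 != s2)  # are the two argument arrangements distinct objects?
print(a1 != a2)  # are the two adjunct orders distinct objects?
```

Under these assumptions both comparisons come out the same way: the model treats the two adjunct orders as distinct objects just as it does the two argument arrangements, which is exactly the mismatch with intuition described above.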

So, despite the fact that research on ergativity and research on adjuncts both tend to be described as theoretical syntax in GS, the two are completely different sorts of sciences. Inquiry into the nature of adjuncts forms part of the theoretical science of syntax, while work on ergativity and, I would conjecture, the majority of current work that is called “theoretical syntax”, its use of formalisms and hypotheses notwithstanding, forms a descriptive science, which would be a part of a larger descriptive science.

Both sorts of science are valuable and, in fact, often complement each other. Accurate descriptions of the heavens were vital for early modern physicists to develop their theoretical models of mechanics, and novel theories often furnish descriptivists with better technology to aid their work. Where we get into trouble is when we confuse the two sorts of sciences. There’s an argument to be made, and it has been made by John Ralston Saul in his book Voltaire’s Bastards, that many of the problems in our society stem from insisting that descriptive social sciences, such as international relations, economics, and law, and even much of the humanities, be treated like theoretical sciences.

Turning back to syntax and taking a micro view, why am I grinding this axe? Well, I have two main reasons: one selfish, the other more altruistic. The selfish reason is that I am a theoretician in a descriptivist’s world. This manifests itself in a number of ways, but I’ll just highlight the immediate one for me: the job market. The academic job market is insanely competitive, and PhD students are expected at least to present at conferences in order to make a name for themselves. This is a problem because (a) there are no theoretical syntax conferences and (b) a standard 20-minute talk, while often sufficient to present a new descriptive analysis of a phenomenon, is not ideal for presenting theoretical work.

Beyond that, I think the confusion of the two sorts of sciences can exacerbate imposter syndrome, especially in graduate students. It took me a while to figure out why I had such a hard time understanding some of my colleagues’ work, and why some papers on “theoretical syntax” had such wildly different characters, arguments, and styles from others. I eventually figured it out, but every so often I see a grad student struggling to make sense of the field and I just want to tell them that they’re not wrong, the field doesn’t really make sense, because it’s actually two fields.