On the notion of an intellectual coup

In chapter nine of his book Goliath: The 100-Year War Between Monopoly Power and Democracy, Matt Stoller recounts the story of the genesis of the Chicago School of law & economics, the school of thought which has come to dominate virtually every aspect of the Western power structure since the 1970s. In Stoller’s telling, it truly could be considered an epochal moment in economics, law, political science, and related disciplines, much as Copernican heliocentrism was for physics, or Mendel’s laws were for biology, or Generative Grammar was for psychology. The shift in thinking brought on by the Chicago School was perhaps as drastic and far-reaching as those brought on by these intellectual revolutions. Yet, in reading it, it struck me that it would be wrong to describe the founding of the Chicago School as a revolution because it wasn’t one. It was an intellectual coup.

But what makes something an intellectual revolution? What makes it an intellectual coup? To stick with the analogy to political processes, the difference is legitimacy: revolutions are legitimate changes, while coups are illegitimate. Legitimacy, of course, is hard to judge objectively, but still, to call something a revolution is to judge it to be legitimate. The violent 1973 overthrow of the democratically elected Allende government in Chile is commonly called a “coup” rather than a revolution. Similarly, historian Michael J. Klarman refers to the US Constitutional Convention as a coup to indicate that he judges it to have been illegitimate. And importantly, the revolution-coup distinction doesn’t boil down to the simple subjective value judgement that revolutions are good and coups are bad. So, while conservatives the world round likely agree that the American Revolution was good, many argue that the French and Russian revolutions were bad. Interestingly, though, I don’t know that many people would think that a coup could be good. So, while most Americans would probably say the Constitutional Convention was good, they probably wouldn’t describe it as a coup, perhaps because illegitimacy is per se bad.

So what makes a shift of ideas illegitimate? What makes it an intellectual coup? To see this we should look at what a legitimate shift looks like. The stories we’re used to hearing involve a disinterested person (or possibly a group) proposing a new idea in an open forum, while making an honest critical argument that it is superior to a contemporaneously widely-accepted idea. The proposal must be open, so that fair criticisms can be aired. The proposer should be disinterested in the sense that the proposed idea is not a means to some other material end (e.g., money or political influence), but rather an end in itself. The discourse around the idea should acknowledge and address the idea’s antecedents and rivals, because doing so allows the larger community to accurately assess the merits of the new idea.

We can see all of these criteria in the great shifts in the history of ideas. Even Galileo and Copernicus, whose work predated any of the modern intellectual institutions that we all take for granted (like peer-reviewed journals, conferences, or universal primary education), opened their work to criticism, not by their peers primarily, but by the Inquisition. They did so not as a means to an end but for the sake of the ideas themselves; what self-interested person would open themselves to the punishment that a Renaissance inquisition could dole out? Finally, it would be hard to credibly suggest that the early heliocentrists could ignore or misrepresent their intellectual competitors, which had been taken as religious dogma, uncritically believed by their contemporaries. The very story of the Copernican revolution is one of competing ideas.

An illegitimate shift would go against one or more of these criteria. It would develop an idea in a less-than-open way; it would be put forth on behalf of some interest group, or as a means to an end for the proposer; or it would either ignore or caricature its competitor-ideas. And more often than not, the last of these infractions will be the most characteristic feature of an intellectual coup. Taking the rise of the Chicago School, and its views on monopoly and antitrust, as Stoller recounts it as our prototype, we can see all of these features in play.

The story starts with wealthy businessman and New Deal enemy Harold Luhnow using his foundation, the Volker Fund, to finance a right-wing research project at the University of Chicago, and continues with the project’s leading academic, Aaron Director, gathering a cadre of acolytes and eventually using private funds to start a journal that would be friendly to their ideas. What really allowed the Chicago School to change from a fringe endeavour to the dominant school of thought in the Western social sciences, in Stoller’s assessment, was a pair of rhetorical misappropriations: adopting “the language of Jeffersonian democracy” and “the apolitical language of science.”

Jeffersonian democracy was in favour of the rights of the individual in opposition to centralized power, a stance that comes from Classical Liberalism and that the Chicago School loudly endorsed. The rhetorical trick, though, is that the Chicago School (and modern right-libertarians) treated authoritarian institutions like corporations as individuals and democratic institutions like labour unions as centralized power. Yet, even a cursory glance at many of the paragons of classical liberalism shows a number of views that we would now associate with a radical left-wing position. Some of Marx’s economic ideas come almost directly from Adam Smith, ideas like the labour theory of value, or the essentially parasitic nature of landlords. Of course, these views of Smith that don’t jibe with the right-wing caricature of him are either ignored or treated as a source of embarrassment. This move, of course, was aided by the fact that, by the time the right-wing Chicago School was appropriating the classical liberal tradition, the American left seemed to be pushing that tradition away. In fact, a recurring theme in Stoller’s book is that the left has largely ceded populism to the right and embraced elitism.

Using the rhetoric of “science”, though, has probably been a much more powerful trick, because the attitude of the general public, including much of the elite, toward science is about as positive as its understanding of the term is murky. Nearly everyone (even flat-earthers, anti-vaxxers, and climate deniers) thinks science is good, but no one could define it. Sure, some would say something about experimental methods, or falsificationism, or spout some Kuhnian nonsense, and everyone would probably agree that quantum physics is a science, while film criticism is not, but few probably realize that philosophers of science have been consistently unable to pin down what constitutes a science. So, when an economist throws graphs and equations at us and declares scientific a statement that offends common sense, very few people are intellectually equipped to dispute them. In the case of the Chicago School, they were at an advantage because, until they adopted it, the claim that economics (along with politics, law, and history) could be a science like physics was probably only held by strict Marxists. The opposing position was one that worried about notions like power and democracy, hardly the kinds of ideas amenable to scientific analysis. If you think that Google doesn’t really compete in an open market, but uses its market power to crush all competition, then you probably also think the sun revolves around the earth.

While the moneyed interests backing the Chicago School and its insular nature in the early days certainly indicate that it was not likely to lead a legitimate intellectual shift, its rhetorical tricks, I believe, are what make its success a coup rather than a revolution, and what has made its ideas so stubborn. It fosters the oppressive slogan “There is no alternative.” By co-opting the great thinkers of the enlightenment, the Chicago School can paint any opponents as anti-rational romantics, and by misappropriating the language of science, they can group dissenters with conspiracy theorists and backwards peasants. This makes it seem like a difficult position to argue against, but as many have discovered recently, it’s a surprisingly brittle position.

Take, for instance, the Chicago School position on antitrust laws: that they were intended as a consumer protection. This has been the standard position of antitrust enforcers in the U.S. and it’s based on an article by Robert Bork. It’s how obvious monopolists like Google and Facebook have escaped enforcement thus far. But, as Stoller’s book documents, the actual legislative intent of U.S. antitrust laws had nothing to do with consumer welfare, and everything to do with power. Bork’s article, then, was a work of fiction, and once you understand that, the entire edifice of modern antitrust thinking begins to crumble.

So, the Chicago School carried out an intellectual coup—one that struck virtually every aspect of our society—but have there been intellectual coups in other fields? Two spring to mind for me—one in physics, and one in my own field of linguistics. Before I describe them, though, a brief word on motivations as an aspect of intellectual coups is in order.

One of the features of an intellectual coup that I described above is that of an ulterior motive driving it. In the case of the Chicago School it was driven by capitalists set on dismantling the New Deal for their own financial interests. Does that mean that everyone who subscribes to the Chicago School does so in order that billionaires can make more money? Not at all. There are definitely Chicago Schoolers who are true believers. Indeed, I would wager that most, if not all, of them are. Hell, even political coups have true believers in them. What about the particular ulterior motives? Are all intellectual coups done on behalf of capital? No. Motivations take all sorts of forms, and are often subconscious. Bold claims are often rewarded with minor celebrity or notoriety which might have material benefits like job offers or the like. They are also sometimes correct. So, if a researcher makes a bold claim, are they doing so to stand out among their peers or are they doing so because they truly believe the claim? It’s almost never possible to tell. Since intellectual coups are essentially based on intellectual dishonesty and it’s probably a safe choice to assume that those that enact an intellectual coup are capable and well-meaning people, discussions of motivations are useful to understand how a capable and well-meaning person could get caught up in a coup. As such, I will focus more on the means rather than the motive when diagnosing a coup.

The Copenhagen Quantum Coup

If you’re at all interested in the history of science, you may have heard of the Bohr-Einstein debate. The narrative that you likely heard was that in the early 20th century, the world community of physicists had accepted quantum mechanics with a single holdout, Albert Einstein, who engaged Niels Bohr in a debate at the 5th Solvay Conference in 1927. Einstein made a valiant argument, capping it with the declaration that “God does not play dice!” When it was Bohr’s turn, he wiped the floor with Einstein, showing that the old man was past his prime and out of step with the new physics. He even used Einstein’s own theory of relativity against him! And with that, quantum mechanics reigned supreme, relegating all critics to the dustbin of history.

It’s a good story and even has a good moral about the fallibility of even a genius like Einstein. The trouble, though, at least according to Adam Becker in his excellent book What is Real?, is that the debate didn’t go down like that. For starters, Einstein wasn’t skeptical about quantum mechanics, but rather had questions about how we are to interpret it. Bohr was advocating for what’s misleadingly called “the Copenhagen Interpretation” which basically says that there is no way to give quantum theory a realist interpretation, all we can do is solve the equations and compare the solutions to experimental results. Furthermore, as Becker recounts, Einstein’s arguments weren’t out of step with contemporary physics. In fact, they were brilliantly simple thought experiments that struck at the very core of quantum mechanics. Their simplicity, however, meant that they sailed over the heads of Bohr and his cadre. It was Bohr’s response that missed the point. And finally, that famous quote from Einstein was in a letter to his friend Max Born, not at the conference in question.

This certainly has the hallmarks of an intellectual coup: it depends on a rhetorical trick of manipulating a narrative to favour one outcome, it shuts down debate by lumping dissenters in with the anti-rationalists, and it’s rather brittle. But it’s not quite as bald-faced as the Chicago School coup. Even as Becker tells it, the scientists in Bohr’s camp probably believed that Einstein was losing it and that he’d missed the point entirely. What’s more, the Copenhagen perspective, which the popularized telling of the debate supports, is not a pack of falsehoods like the Chicago School, but rather an overly narrow conception of the nature of scientific inquiry, a conception called “instrumentalism” which tends to banish humanistic questions of truth, reality, and interpretation to the realm of philosophy and views “philosophy” as a term of abuse.

But where is the dishonesty that I said every coup was based on? It seems to have come in the form of laziness—Bohr and his compatriots should have made a better effort to understand Einstein’s critique. This laziness, I believe, rises to the level of dishonesty, because it ended up benefiting the Copenhagen perspective in a predictable way. As Becker describes, Bohr, for various reasons, wanted to show that Quantum Mechanics as formulated in the 1920s was complete and closed—a perfect theory. Paradoxes and interpretive issues, such as the ones that Einstein was raising, revealed imperfections, which had to be ignored. Whether Bohr had all of this in his mind at the Solvay Conference is beside the point. His, and his followers’, was a sin of omission.

The Formal Semantics Coup

The standard theoretical framework of contemporary semantics, at least within the generativist sphere, is known as formal semantics. Few semanticists would likely agree that there is such a thing as a standard theory, but those same semanticists probably agree on the following:

  1. The meaning of a word or a phrase is the thing or set of things that that word or phrase refers to.
  2. The meaning of a sentence is its truth conditions.
  3. Linguistic meanings can be expressed by translating expressions of a Natural Language into formulas of formal logic.
  4. Any aspect of language that doesn’t meet the requirements of 1-3 is outside the domain of semantics.
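To make the first three tenets concrete, here is a minimal sketch of the kind of toy model they describe. It is only an illustration under my own assumptions, not an implementation of any published semantic fragment: the individuals, the tiny lexicon, and the names `DENOTATION` and `is_a` are all hypothetical.

```python
# A toy "model" of the kind tenets 1-3 describe. The individuals, the
# lexicon, and every name below are hypothetical illustrations.

# Tenet 1: the meaning of a word is the thing or set of things it refers to.
DENOTATION = {
    "Fido": "fido",      # a name refers to a single individual
    "Felix": "felix",
    "dog": {"fido"},     # a common noun refers to a set of individuals
    "cat": {"felix"},
}


def is_a(name, noun):
    """Truth conditions for the sentence 'NAME is a NOUN'.

    Tenets 2 and 3: the sentence is translated into the formula noun(name),
    which is true just in case the referent of NAME is a member of the set
    that NOUN refers to.
    """
    return DENOTATION[name] in DENOTATION[noun]


assert is_a("Fido", "dog")
assert not is_a("Felix", "dog")
```

Tenet 4 then amounts to saying that anything which cannot be entered into a table like `DENOTATION` is not the semanticist’s problem.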

The origins of these standard tenets of formal semantics, though, are not some empirical discovery, or the results of some reasoned debate, but rather the declarations of a handful of influential logicians and philosophers. The ascendency of formal semantics, then, is due not to a revolution, but a coup. Since linguistic theory doesn’t get the same amount of press as economics and physics, the historical contours of the shift to formal semantics are at best murky. As such, I’ll explain my coup diagnosis through a series of personal anecdotes—not the ideal method, but the best I can do right now.

I was first exposed to formal semantics in my graduate coursework. The four numbered statements above were what I took for granted for a while. I was aware that there were other ways of looking at meaning, and that formal semantics was a relatively recent addition to the generative grammar family of theories, and I guess I assumed that the advent of formal semantics was an intellectual revolution and there must’ve been a great debate between the formalists and the non-formalists and the formalists came out on top. Of course, no one ever talked about that debate—I knew about the ongoing debates between behaviourists and generativists, and the “wars” between Generative Semantics and interpretive semantics, but no one told the tales of the Great Formal Semantics Debates. This should have been my first red flag—academics aren’t shy about their revolutionary arguments.

I first began to have qualms about formal semantics when I heard Noam Chomsky’s lucid critiques of referentialism (tenet #1 above) in the Michel Gondry documentary Is the Man Who Is Tall Happy? Here was the man who founded Generative Syntax, who’s often considered a genius, and whose publications are usually major events in the field, arguing that we’ve been doing semantics all wrong. As I better familiarized myself with his arguments, it became clear that he was holding a reasonable position. If I ever brought it up to a working semanticist, though, they would first brush it off, saying basically “Chomsky needs to stay in his lane,” but when I put the arguments to them, they would acknowledge that they might be sound arguments, but that formal semantics was the only game in town (i.e., there is no alternative). One even told me straight out that, sure, I could go against formal semantics, but if I did, I’d never get hired by any linguistics department. (Of course, given the prevailing political and economic environment surrounding academic institutions, the odds of me getting hired regardless of my stance on formal semantics are pretty long anyway.) This was when I first started to suspect something was amiss: the only defense that could be mustered for formal semantics was that everyone else was doing it and we can’t imagine an alternative.

I had to admit, though, that, despite my misgivings, I had no alternative to formal semantics and, being a syntactician, I didn’t really have the inclination to spend a lot of time coming up with one. As luck would have it, though, I happened upon exactly the sort of alternative that wasn’t supposed to exist: Jerrold Katz’s Semantic Theory. Published in 1972, the theory Katz proposed was explicitly non-referentialist, formal (in the sense of having a formalism), and opposed to what we now call formal semantics. It was quite a surprise because I had heard of Katz (I read a paper he co-authored with Jerry Fodor for a syntax course), but he was always associated with the Generative Semantics crew, which is strange because he explicitly argues against them in his book. So, contrary to what I’d been told, there was an alternative, but why was I just finding out about it now? Unfortunately, Jerrold Katz died a few years before I ever picked up his book, as had his occasional co-author Jerry Fodor, so I couldn’t get their accounts of why his work had fallen out of favour. I asked the semanticists I knew about him and they recognized the name but had no idea about his work. The best explanation I got was from Chomsky, who said that he did good work, but semanticists were no longer interested in the questions he was asking. There were no stories of an LSA where Katz squared off against the new upstarts and was soundly beaten, and no debates in the pages of Language or Linguistic Inquiry; Katz was just brushed aside and never spoken of again. Instead, the very fiats of philosophers and logicians (Carnap, Lewis, Quine, etc.) that Katz had argued against became the unexamined cornerstones of the field.

So, while the givenness of formal semantics was probably not the result of the schemes of a cabal of moneyed academics, like the Chicago School was, it doesn’t seem to have been the result of an open debate based on ideas and evidence, and it’s held in place, not by reason, but basically by sociopolitical forces. Thus I feel comfortable suggesting that it was the result of an intellectual coup.

Summing up: There’s always an alternative

I’ve offered a few potential features of an intellectual coup here, but nothing like an exhaustive diagnostic checklist. One important feature, though, is the “there is no alternative” attitude that they seem to foster. Any progress that we’ve made as a species, be it political, social, intellectual, or otherwise, stems from our ability to imagine a different way of doing things. So, for an intellectual community to be open to progress, it has to accept that there are other ways of thinking about the world. Some of those alternatives are worse, some are better, but the only sure-fire way not to make progress is to declare that there is no alternative.

The Poverty of Referentialist Semantics

(What follows is a bit of a rant. I hope it holds together a bit. If you make it past the inflammatory title, let me know what you think.)

When Gregor Mendel first discovered his Laws of Inheritance, it was a great revelation. To be sure, humanity has perhaps always known that many of a person’s (or plant’s, animal’s, bacterium’s, etc.) traits are inherited from their parents, but Mendel was able to give that knowledge a quantitative expression. Of course, this was just the beginning of the modern study of genetics, as scientists asked the next obvious question: How are traits inherited? This question persisted for the better part of a century until a team of scientists showed experimentally that inheritance proceeds via DNA. Again, this raised a question that has spurred research to this day: How does DNA encode physical traits? But why am I writing about genetics in a post about semantics? Well, to make a point of contrast with the theory that has dominated the field of linguistic semantics for the past few decades: Formal Semantics.

As in the case of inheritance, we’ve always known that words, phrases, and sentences have meanings, but we’ve had a tougher time understanding this fact. In the late 19th and early 20th century, philosophers, psychologists and linguists seemed to settle on a way of understanding linguistic meaning: linguistic expressions are meaningful by virtue of the fact that they refer to objects in the world. So, “dog” has a meaning for modern English speakers because it refers to dogs. This principle has led the modern field of semantics, although not in the same way as the discoveries of genetics led that field. If semanticists had proceeded as the geneticists had, they would have immediately asked the obvious question: How do linguistic expressions refer to objects in the world? Instead of pursuing this question, semanticists seem to have banished it and, in fact, virtually any questions about the reference relation, and have done so, I believe, to the detriment of the field.

At first blush, it might seem that semanticists should be forgiven for not centring this question in their inquiry. Curiosity about genetic inheritance, to continue my comparison, is quite natural, likely because we can observe its facts objectively. Certainly, it’s a cliché that no one likes to admit that they’re like their parents. There is very little resistance, on the other hand, to seeing such a similarity in other people. The facts of inheritance are unavoidable, but they are not coupled with anything approaching intuition about them. In fact, many of the facts are fundamentally unintuitive: How can a trait skip a generation? Why does male pattern baldness come from the mother’s side? How can a long line of brown-eyed people produce a blue-eyed child? This dearth of intuition about an abundance of evidence means that no one objects to follow-up questions to any scientific advance in the field. In fact, the right kind of follow-up questions are welcomed.

On the other hand, linguistics, especially generative linguistics, faces the opposite situation. In many ways, the object of generative inquiry is our intuitive knowledge about our own language. It should be obvious here that the average person’s intuitions about language vastly outweigh the objective facts about language.* Our intuitions about language are so close to our core, that it is very uncomfortable for us to entertain questions about it. We like to think that we know our own minds, but a question like what is language?—properly pursued—highlights just how little we understand that mind. This is not to say it’s an unanswerable or ill-formed question; it’s not a species of zen kōan. Language exists and we can distinguish it from other things, so, unlike the sound of one hand clapping, it has a nature that we can perhaps gain some understanding of. In fact, the field of generative syntax shows us that language is amenable to rational inquiry, provided researchers are open to follow-up questions: Chomsky’s initial answer to the question was that language is a computational procedure that generates an infinite array of meaningful expressions, which raised the obvious question: What sort of computational procedure? In many ways this is the driving question of generative syntactic theory, but it has also raised a number of additional questions, some of which are still open.

Just as what is language? is a difficult question, so are what is meaning? and how do words refer? So semanticists can be forgiven for balking at them initially. But, again, this is not to say that these are unanswerable questions in principle. What’s more, I don’t think semanticists even attempt to argue that the questions are too hard. On the contrary, they seem to hold that the answers to the questions are so obvious that they don’t warrant a response. Are they right? Is it a boring, obvious question? I don’t think so. I think it is an interesting question whose surface simplicity masks a universe of complexity. In fact, I can demonstrate that complexity with some seemingly simple examples.

Before I demonstrate the complexity of reference in language, let’s look at some simple cases of reference to get a sense of what sort of relation it is. Consider, for instance, longitude and latitude. The string 53° 20′ 57.6″ N, 6° 15′ 39.87″ W refers to a particular location on earth. Specifically, it refers to the location of the Dublin General Post Office. That sequence of symbols is not intrinsically linked to that spot on earth; it is linked by the convention of longitude and latitude, which is to say it is linked arbitrarily. Despite its arbitrary nature, though, the link is objective; it doesn’t matter who is reading it, it still refers to that particular location. Similar remarks apply to variable names in computer programs, which are arbitrarily linked to locations in a computer’s RAM, or numerals like 4 or IV, which are arbitrarily linked to a particular number (assuming numbers have objective reality). These seem to suggest the following definition of the reference relation.

(R) reference is the arbitrary and objective mapping between symbols and objects or sets of objects.
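As a rough illustration of (R), and not as a claim about how any semantic theory is actually implemented, the simple cases above can be sketched as a plain lookup table. The names `REFERENT_OF` and `refers_to` are hypothetical, introduced only for this sketch.

```python
# A toy model of (R): reference as an arbitrary but objective mapping
# from symbols to objects. All names here are hypothetical illustrations.

# The mapping is arbitrary: nothing about the shape of "4" or "IV" ties
# it to the number four; the link holds only by convention.
REFERENT_OF = {
    "4": 4,
    "IV": 4,
    "53° 20′ 57.6″ N, 6° 15′ 39.87″ W": "Dublin General Post Office",
}


def refers_to(symbol, reader=None):
    """Return the referent of `symbol`.

    The `reader` argument is deliberately ignored: under (R) the link is
    objective, so who is doing the reading makes no difference.
    """
    return REFERENT_OF[symbol]


# Two different symbols, one referent (arbitrariness).
assert refers_to("4") == refers_to("IV") == 4

# Two different readers, same referent (objectivity).
assert refers_to("IV", reader="Sadie") == refers_to("IV", reader="Violet")
```

A word would refer in the sense of (R) only if its entry in such a table could be filled in without ever consulting the `reader` argument.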

For a moment, let’s set aside two types of expressions: subjective expressions like my favourite book, or next door, and proper names like The Dublin General Post Office, or Edward Snowden. For the purposes of this post, I will grant that the question of how the latter refer is already solved, and the question of how the former refer is too difficult to answer at this point. Even if we restrict ourselves to common nouns that ostensibly refer to physical objects, we run into interesting problems.

Consider the word “chair”. English speakers are very good at correctly identifying certain masses of matter as chairs, and identifying others as not chairs. This seems like a textbook case of reference, but how are we able to do it?

[Two images: on the left, a chair; on the right, not a chair]

In order for reference to obtain here, there must be some intrinsic property (or constellation of properties) that marks the thing on the left as a chair and is lacking in the thing on the right. Let’s skip some pointless speculation and settle on shape as the determining factor. That is, chairs are chairs by virtue of their shape. And let’s grant that that chair-shape can be codified in such a way as to allow reference to obtain. That would be great, except that it still doesn’t fully capture the meaning of “chair”.

Suppose, for instance, a sculptor creates an object that looks exactly like a chair, and an art gallery buys it to display as part of its collection. Is that object a chair? No, it’s a sculpture. Why? Because it no longer serves the function of a chair. So the objective shape of an artifact is not sufficient to determine its chair-ness; we need to say something about its function, and function, I would argue, is subjective.

Or consider the following narrative:

Sadie has just moved into her first apartment in a major Western city and she needs to furnish it. Being less than wealthy she opts to buy furniture from Ikea. She goes online and orders her Ikea furniture. The next day three flat-pack boxes arrive at her door: One contains a bookshelf, one contains a bed, and the other contains a chair.

In what sense does that box contain a chair? It contains prefabricated parts which can be assembled to form a chair. Neither the box nor its contents are chair-shaped, yet we’re happy to call the contents a chair. What if Sadie were a skilled woodworker and wanted to build her own furniture from, say, several 2-by-4s? Would we call those uncrafted 2-by-4s a chair? I don’t think so. Let’s continue the narrative.

Sadie assembles her chair and other furniture and enjoys it for a year, at which point her landlord decides to evict her in order to double the rent. Sadie finds another apartment and packs up her belongings. In order to facilitate the move she disassembles her furniture and puts it in the trunks of the cars of her various helpful siblings. Her bookcase goes with Rose, her bed with Declan, and her chair with Violet.

Again, we refer to a bundle of chair parts as a chair. What if Sadie had taken out her anger at being evicted by her greedy landlord on the chair, hacking it to pieces with an axe? Would the resulting pile of rubble be a chair? Certainly not.

What does this tell us about how the word “chair” is linked to the object that I’m sitting on as I write this? That link cannot be reference as defined above in (R), because it’s not purely objective. The chair-ness of an object depends not only on its objective form, but also on its subjective function. And this problem will crop up with any artifact-word (e.g., “table”, “book”, “toque”). If we were to shift our domain away from artifact-words, no doubt we’d find more words that don’t refer in the sense of (R). Maybe we’d find real honest-to-goodness referring words, but we’d still be left with a language that contains a sizable chunk of non-referential expressions. Worse still, modern formal semanticists have expanded the universe of “real objects” to which expressions can refer to include situations, events, degrees, and so on. What’s the objective nature of an event? or a situation? or a degree? No idea, but I know them when I see them.

“So what?” you might say. “Formal semantics works. Just look at all of the papers published, problems raised and solved, linguistic phenomena described. Who cares if we don’t know how reference works?” Well, if semantics is the study of meaning, and meaning is reference, then how can there be any measure of the success of semantics that isn’t a measure of its understanding of what reference is?

Again, consider a comparison with genetics. What if, instead of asking follow-ups to Mendel’s laws, geneticists had merely developed the laws to greater precision? Our current understanding of genetics would be wildly impoverished. We certainly would not have all of the advances that currently characterize genetic science. Quite obviously genetics is much the richer for asking those follow-up questions.

No doubt semantics would be much richer if it allowed follow-up questions.


* It is precisely this situation that makes it so difficult to communicate the aims of generative linguistics, and why the main type of linguistics that gains any sort of traction in the mainstream press is the type that looks at other people’s language. Consider the moral panic about the speech patterns of young women that surfaces every so often, the NY Times bestseller Because Internet, by Gretchen McCulloch, which looks at linguistic innovation on the internet, or even the current discussion about the origins of Toronto slang. To paraphrase Mark Twain, nothing so needs research and discussion as other people’s language.

I’m being generous here. In fact, most paradoxes of reference are about proper names (see Katz, J. J. (1986). Why intensionalists ought not be Fregeans. Truth and interpretation, 59-91).

On the general character of semantic theory (Part b)

(AKA Katz’s Semantic Theory (Part IIIb). This post discusses the second half of chapter 2 of Jerrold Katz’s 1972 opus. For my discussion of the first half of the chapter, go here.)

(Note: This post was written in fits and starts, which is likely reflected in its style (or lack thereof). My apologies in advance.)

The first half of chapter 2 was concerned with the broader theory of language, rather than a semantic theory. In the second half of the chapter, Katz begins his sketch of the theory of semantics. It’s at this point that I pick up my review.

4. The structure of the theory of language

In this section, Katz discusses universals, which he frames, following Chomsky, as constraints on grammars. Katz differs from Chomsky, though, in how he divvies up the universals—whereas Chomsky, in Aspects, distinguishes between formal and substantive universals, Katz adds a third type: organizational universals. These classifications are defined as follows:

Formal universals constrain the form of the rules in a grammar; substantive universals provide a theoretical vocabulary from which the constructs used to formulate the rules of particular grammars are drawn; organizational universals, of which there are two subtypes, componential organizational universals and systematic organizational universals, specify the interrelations among the rules and among the systems of rules within a grammar.

p30-31

Furthermore, formal, substantive, and componential universals cross-classify with phonological, syntactic, and semantic universals. This means that we can talk about substantive phonological universals, or componential semantic universals, and so on. So, for example, a phonological theory consists in a specification of the formal, substantive, and componential universals at the phonological level, and such a specification amounts to a definition of the phonological component of the language faculty. Systematic universals, then, specify how the components of the grammar are related to each other. With this discussion, Katz sets up his goals: to specify the formal, substantive, and componential universals at the semantic level. More precisely, he aims to develop the following:

(2.7) A scheme for semantic representation consisting of a theoretical vocabulary from which semantic constructs required in the formulation of particular semantic interpretations can be drawn

p33

(2.8) A specification for the form of the dictionary and a specification of the form of the rules that project semantic representations for complex syntactic constituents from the dictionary’s representations of the senses of their minimal syntactic parts.

p33

(2.9) A specification of the form of the semantic component, of the relation between the dictionary and the projection rules, and of the manner in which these rules apply in assigning semantic representations

p3

These three aspects of semantic theory, according to Katz, represent the substantive, formal, and componential universals, respectively. A theory that contains (2.7)-(2.9), and answers questions 1-15 (as listed here) would count as an adequate semantic theory.

5. Semantic theory’s model of a semantic component

So, Katz asks rhetorically, how could semantic relations, such as analyticity, synonymy, or semantic similarity, be captured in the purely formal terms required by (2.7)-(2.9)? The answer is simple: semantic relations and properties are merely formal aspects of compositional meanings of expressions. This is a bold and controversial claim: Semantic properties/relations are formal properties/relations or, to put it more strongly, semantic properties/relations are, in fact, syntactic properties/relations (where “syntactic” is used in a very broad sense). Of course, this claim is theoretical and rather coarse. Katz aims to make it empirical and fine.

So, what does Katz’s semantic theory consist of? At the broadest level, it consists of a dictionary and a set of projection rules. No surprise yet; it’s a computational theory, and any computational system consists of symbols and rules. The dictionary contains entries for every morpheme in a given language, where each entry is a collection of the senses of that morpheme. Finally, he defines two “technical terms.” The first is a reading, which refers to “a semantic representation of a sense of a morpheme, word, phrase, clause, or sentence” and which is further divided into lexical readings and derived readings. The second term is semantic marker, which refers to “the semantic representation of one or another of the concepts that appear as parts of senses.” Katz then continues, identifying the limiting case of semantic marker: primitive semantic markers.

Here it’s worth making a careful analogy to syntactic theory. Semantic markers, as their name suggests, are analogous to phrase markers. Each is a representation of constituency: a phrase marker represents the syntactic constituents of an expression while a semantic marker represents the conceptual constituents of a concept. In each theory there are base cases of the markers: morphemes in syntactic theory and the aptly named primitive semantic markers in semantic theory. I must stress, of course, that this is only an analogy, not an isomorphism. Morphemes are not mapped to primitive semantic markers, and vice versa. Just as a simple morpheme can be phonologically complex, it can also be semantically complex. Furthermore, as we’ll see shortly, while complex semantic markers are structured, there is no reason to expect them to be structured according to the principles of syntactic theory.

Before Katz gets to the actual nitty-gritty of formalizing these notions, he pauses to discuss ontology. He’s a philosopher, after all. Semantic markers are representations of concepts and propositions, but what are concepts and propositions? Well, we can be sure of some things that they are not: images, mental ideas, and particular thoughts, which Katz groups together as what he calls cognitions. Cognitions, for Katz, are concrete, meaning they can be individuated by who has them, when and where they occur, and so on. If you and I have the same thought (e.g., “Toronto is the capital of Ontario”) then we have had different cognitions. Concepts and propositions, for Katz, are abstract objects and, therefore, independent of space and time, meaning they can’t be individuated by their nonexistent spatiotemporal properties. They can, however, be individuated by natural languages, which Katz also takes to be abstract objects, and, in fact, are individuated easily by speakers of natural languages. Since, in a formulation echoed recently by Paul Pietroski (at around 5:45), “senses are concepts and propositions connected with phonetic (or orthographic) objects in natural languages” and the goal of linguistic theory is to construct grammars that model that connection, the question of concept- and proposition-individuation is best answered by linguistic theory.1

But, Katz’s critics might argue, individuation of concepts and propositions is not definition of “concept” or “proposition”. True, Katz responds, but so what? If we needed to explicitly define the object of our study before we started studying it, we wouldn’t have any science. He uses the example of Maxwell’s theory of electromagnetism which accurately models the behaviour and structural properties of electromagnetic waves but does not furnish any definition of electromagnetism. So if we can come up with a theory that accurately models the behaviour and structural properties of concepts and propositions, why should we demand a definition?

We also can’t expect a definition of “semantic marker” or “reading” right out of the gate. In fact, Katz argues, one of the goals of semantic theory (2.7) is to come up with those definitions, and we can’t expect to have a complete theory in order to develop that theory. Nevertheless, we can use some basic intuitions to come up with a preliminary sketch of what a reading and a semantic marker might look like. For instance, the everyday word/concept “chair” has a common sense, which is composed of subconcepts and can be represented as the set of semantic markers in (2.15).

(2.15) (Object), (Physical), (Non-living), (Artifact),
       (Furniture), (Portable), (Something with legs),
       (Something with a back), (Something with a seat),
       (Seat for one)

Of course, this is just preliminary. Katz identifies a number of places for improvement. Each of the semantic markers is likely decomposable into simpler markers. Even the concept represented by “(Object)” is likely decomposable.

Or, Katz continues, we can propose that semantic markers are ways of making semantic generalizations. Katz asks us to consider how “chair” relates to words such as “hat,” “planet,” “car,” and “molecule” compared to words such as “truth,” “thought,” “togetherness,” and “feeling.” Obviously, these words all denote distinct concepts, but just as obviously, the two groupings contrast with each other. We can think of the semantic marker “(Object)” as the distinguishing factor in these groupings: the former is a group of objects, the latter a group of non-objects. So, semantic markers, like phonological features and grammatical categories, are expressions of natural classes.

Finally, Katz proposes a third way of thinking of semantic markers: “as symbols that mark the components of senses of expressions on which inferences from sentences containing the expressions depend.” (p41) For instance, we can infer (2.19) from (2.18), but we can’t infer (2.27).

(2.18) There is a chair in the room.

(2.19) There is a physical object in the room.

(2.27) There is a woman in the room.

We can express this inference pattern by saying that every semantic marker that comprises the sense of “physical object” in (2.19) is contained in the sense of “chair” in (2.18), but that is not the case for “woman” in (2.27). The sense of “woman” in (2.27) contains semantic markers like “(Female)” which are not contained in the sense of “chair” in (2.18). Here Katz notes that his proposal that concepts like “chair” consist of markers is merely an extension of an observation by Frege that (2.28a,b,c) are together equivalent to (2.29).

(2.28)
(a) 2 is a positive number
(b) 2 is a whole number
(c) 2 is less than 10

(2.29) 2 is a positive whole number less than 10

For Frege, “positive number”, “whole number”, and “less than 10” are all properties of “2” and marks of “positive whole number less than 10”. Katz’s extension is to say that the concepts associated with simple expressions can have their own marks.
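The containment account of the inference from (2.18) to (2.19) can be sketched in a few lines of Python. This is only an illustration, not Katz’s formalism: readings are modelled as plain sets of marker names, the reading for “chair” is taken from (2.15), and the readings for “physical object” and “woman” are assumptions made up for the example.

```python
# Readings modelled as sets of semantic markers (a simplifying assumption).
CHAIR = frozenset({
    "Object", "Physical", "Non-living", "Artifact", "Furniture", "Portable",
    "Something with legs", "Something with a back", "Something with a seat",
    "Seat for one",
})  # from (2.15)
PHYSICAL_OBJECT = frozenset({"Object", "Physical"})  # assumed reading
WOMAN = frozenset({"Object", "Physical", "Human", "Adult", "Female"})  # assumed reading


def licenses_inference(premise_sense, conclusion_sense):
    """True if every marker of the conclusion's sense is in the premise's sense."""
    return conclusion_sense <= premise_sense


print(licenses_inference(CHAIR, PHYSICAL_OBJECT))  # (2.18) to (2.19)
print(licenses_inference(CHAIR, WOMAN))            # (2.18) to (2.27)
```

Under these assumed readings the first check succeeds and the second fails, because “(Female)” and the other person-related markers are not among the markers of “chair”.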

Next, Katz discusses the notions of lexical and derived readings which are, in a sense, the inputs and outputs, respectively, of the process of semantic composition. As the name suggests, lexical readings are what is stored in the dictionary. When a syntactic object hits the semantic component of the grammar, the first step is to replace the terminal nodes with their lexical readings. Derived readings are generated by applying projection rules to the first level of non-terminal nodes, and then the next level, and so on until the syntactic object is exhausted.

The process of deriving readings, Katz asserts, must be restrictive in the sense that the interpretation of a sentence is never every possible combination of the lexical readings of its component parts. For instance, suppose the adjective “light” and the noun “book” have N and M senses in their respective lexical readings. If our process for deriving readings were unrestrictive, we would expect “light book” to have N×M senses while, in fact, fewer are available. We can see this even when we restrict ourselves to 2 senses for “light”—“low in physical weight” and “inconsequential”—and 2 senses for “book”—“a bound collection of papers” and “a work of literature”. Restricting ourselves this much, we can see that “light book” is 2-ways ambiguous, describing a bound collection of papers with a low weight, or a work of literature whose content is inconsequential, and not a work of literature with a low weight or an inconsequential bound collection of papers. Our semantic theory, then, must be such that the compositional process it proposes can appropriately restrict the class of derived readings for a given syntactic object.

To ensure this restrictiveness, Katz proposes that the senses that make up a dictionary entry are each paired with a selectional restriction. To illustrate this, he considers the adjective “handsome” which has three senses: when applied to a person or artifact, it has the sense “beautiful with dignity”; when applied to an amount, it has the sense “moderately large”; when applied to conduct, it has the sense “gracious or generous”. So, for Katz, the dictionary entry for “handsome” is as in (2.30).

(2.30) "handsome";[+Adj,…];(Physical),(Object),(Beautiful),
                           (Dignified in appearance),
                           <(Human),(Artifact)>
                           (Gracious),(Generous),<(Conduct)>
                           (Moderately large),<(Amount)>

Here the semantic markers in angle brackets represent the markers that must be present in the senses that “handsome” is applied to.
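To make the filtering effect of selectional restrictions concrete, here is a rough Python sketch. It rests on several assumptions that are mine rather than Katz’s: a restriction is treated as satisfied when the head’s reading carries at least one of the bracketed markers, a derived reading is formed by simply pooling the markers of the modifier and the head, and the noun readings are invented for the example. Only the entry for “handsome” follows (2.30).

```python
from typing import NamedTuple


class Sense(NamedTuple):
    markers: frozenset      # markers making up the sense
    restriction: frozenset  # head must carry at least one of these (assumption)


# Dictionary entry for "handsome", following (2.30).
HANDSOME = [
    Sense(frozenset({"Physical", "Object", "Beautiful", "Dignified in appearance"}),
          frozenset({"Human", "Artifact"})),
    Sense(frozenset({"Gracious", "Generous"}), frozenset({"Conduct"})),
    Sense(frozenset({"Moderately large"}), frozenset({"Amount"})),
]

# Hypothetical lexical readings for a few nouns (one reading each).
NOUNS = {
    "chair": [frozenset({"Object", "Physical", "Artifact", "Furniture"})],
    "sum": [frozenset({"Amount"})],
    "gesture": [frozenset({"Conduct"})],
    "truth": [frozenset({"Abstract"})],
}


def derived_readings(modifier_senses, head_readings):
    """Pair each modifier sense with each head reading that meets its restriction."""
    return [
        sense.markers | head
        for sense in modifier_senses
        for head in head_readings
        if sense.restriction & head
    ]


for noun, readings in NOUNS.items():
    print(noun, len(derived_readings(HANDSOME, readings)))
```

With these toy entries, each of “handsome chair”, “handsome sum”, and “handsome gesture” gets a single derived reading rather than three, and “handsome truth” gets none, which is one way of modelling semantic anomaly.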

This solution to the problem of selection may seem stipulative and ad hoc—I know it seems that way to me—but recall that this is an early chapter in a book published in 1972. If we compared it to the theories of syntax and phonology of the time, they might appear similarly unsatisfying. The difference between Katz’s theory and syntactic and phonological theories contemporary to Katz’s theory is that syntactic and phonological theories have since developed into more formalized and hopefully explanatory theories through the collaborative effort of many researchers, while Katz’s theory never gained the traction required to spur that level of collaboration.

Katz closes out this section with a discussion of “semantic redundancy rules” and projection rules. Rather than discuss these, I move on to the final section of the chapter.

6. Preliminary definitions of some semantic properties and relations

Here Katz shows the utility of the theory that he has thus far sketched. That is, he looks at how the semantic properties and relations identified in chapter 1 can be defined in the terms introduced in this chapter. These theoretical definitions are guided by our common sense definitions, but Katz is careful to stress that they are not determined by them. So, for instance, two things are similar when they share some feature(s). Translating this into his theory, Katz gives the definition in (2.33) for semantic similarity.

(2.33) A constituent Ci is semantically similar to a constituent Cj on a sense just in case there is a reading of Ci and a reading of Cj which have a semantic marker in common. (They can be said to be semantically similar with respect to the concept φ in case the shared semantic marker represents φ.)

Note that we can convert this definition into a scalar notion, so we can talk about degrees of similarity in terms of the number of shared markers. Katz does this implicitly by defining semantic distinctness as sharing no markers and synonymy as sharing all markers.
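The definition in (2.33) and its scalar extension are easy to sketch in Python. As before, this is a toy model under my own assumptions: a constituent is a list of readings, a reading is a set of marker names, and the particular readings below are invented for illustration.

```python
def shared_markers(readings_i, readings_j):
    """Largest set of markers shared by some reading of Ci and some reading of Cj."""
    overlaps = (ri & rj for ri in readings_i for rj in readings_j)
    return max(overlaps, key=len, default=frozenset())


def semantically_similar(readings_i, readings_j):
    """(2.33): some pair of readings has at least one marker in common."""
    return bool(shared_markers(readings_i, readings_j))


def synonymous_on_a_sense(readings_i, readings_j):
    """Some pair of readings shares all of its markers."""
    return any(ri == rj for ri in readings_i for rj in readings_j)


# Hypothetical readings, one per word.
CHAIR = [frozenset({"Object", "Physical", "Artifact", "Furniture", "Something with a back"})]
STOOL = [frozenset({"Object", "Physical", "Artifact", "Furniture"})]
SOFA = [frozenset({"Object", "Physical", "Artifact", "Furniture", "Seat for several"})]
COUCH = [frozenset({"Object", "Physical", "Artifact", "Furniture", "Seat for several"})]
TRUTH = [frozenset({"Abstract"})]

print(len(shared_markers(CHAIR, STOOL)))   # degree of similarity
print(semantically_similar(CHAIR, TRUTH))  # no shared markers: distinct
print(synonymous_on_a_sense(SOFA, COUCH))  # identical readings: synonymous
```

Counting shared markers is only one possible way to make similarity scalar; a proportion of shared markers would serve equally well for the point being made here.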

Similarity is a rather simple notion, and therefore has a simple definition; others require some complexity. For instance, analytic statements like “Liars lie” are vacuous assertions due to the fact that the meaning of the predicate is contained in the meaning of the subject. Here, Katz gives the definition one might expect, but it is clear that more needs to be said, as the notions of subject and predicate are more difficult to define. More on this in later chapters.

A more puzzling and less often remarked upon semantic relation is antonymy—the relation that holds of the word pairs in (2.46) and of the set of words in (2.47).

(2.46) bride/groom, aunt/uncle, cow/bull, girl/boy, doe/buck

(2.47) child/cub/puppy/kitten/cygnet

Katz notes that although antonymy is generally taken to be merely lexical, it actually projects to larger expressions (e.g., “our beloved old cow”/“our beloved old bull”), and is targeted by words like “either” as demonstrated by the fact that (2.49a) is meaningful while (2.49c) is anomalous.

(2.49)
a. John is well and Mary’s not sick either.
c. John is well and Mary’s not {well/foolish/poor/dead} either.

In order for antonymy to be given an adequate theoretical definition, then, it must be expressed formally. Katz does this by marking semantic markers that represent antonymy sets with a superscript. For instance, “brother” and “sister” would be represented as (Sibling)(M) and (Sibling)(F), respectively. Again, this is clearly stipulative and ad hoc but that is to be expected at this stage of a theory. In fact, Katz seems to have been revising his theory up to his death, with the colour incompatibility problem—the question of why the sentence “The dot is green and red” is contradictory—occupying the focus of a 1998 paper of his and a section of his posthumous book. Even Katz’s ad hoc solution to the problem, though, is miles ahead of any solution that could possibly be given in current formal semantics—which bases its definition of meaning on reference—because, to my knowledge, there is no way to account for antonymy in formal semantics. Indeed, the mere fact that Katz is able to give any theoretical definition of antonymy puts his theory well ahead of formal semantics.
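The superscript device for antonymy can also be sketched in Python. This is my own guess at how such a check might work, not a definition taken from Katz: each marker is modelled as a (base, superscript) pair, with None for markers that carry no superscript, and two readings count as antonymous when they are identical except for one marker whose base is shared and whose superscripts differ. The readings themselves are hypothetical.

```python
def antonymous(reading_a, reading_b):
    """True if the readings differ only in the superscript on one shared base marker."""
    only_a, only_b = reading_a - reading_b, reading_b - reading_a
    if len(only_a) != 1 or len(only_b) != 1:
        return False
    base_a, sup_a = next(iter(only_a))
    base_b, sup_b = next(iter(only_b))
    return base_a == base_b and sup_a is not None and sup_b is not None


# Hypothetical readings; markers are (base, superscript) pairs.
BROTHER = frozenset({("Human", None), ("Sibling", "M")})
SISTER = frozenset({("Human", None), ("Sibling", "F")})
CHAIR = frozenset({("Object", None), ("Furniture", None)})

print(antonymous(BROTHER, SISTER))
print(antonymous(BROTHER, CHAIR))
```

Because the check is stated over whole readings, it extends without change to derived readings, which is in the spirit of the observation that antonymy projects to larger expressions.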

Conclusion

Katz’s rough sketch of a semantic theory is already fairly successful in that it’s able to provide concrete definitions of many of the semantic notions that he identifies in the first chapter.2 I don’t believe this success is due to Katz’s ingenuity, but rather to the fact that he approached theory-building as the central activity in semantic inquiry, rather than an arcane peripheral curiosity. Since the theory building is central, it can occur in tandem with analysis of linguistic intuition.

In the next chapter, Katz responds to criticisms from his contemporaries. I’m not sure how enlightening this is for modern audiences, so I might skip it. We’ll see…


  1. ^ This argument, of course, leads pretty quickly to a classic problem inherent in the notion of abstract objects: the problem of how abstract objects can interact with the physical world. We could, of course, get around this by denying that concepts and propositions are abstract but then we need to explain how two different people could have the same thought at different times, in different places. I’m not sure which is the best choice and I’m not sure that linguistics (or any science) is up to the task of deciding between the two, so I’ll just proceed by going along with Katz’s realistic attitude about abstract objects, with the caveat that it might be wrong—a kind of methodological Platonism.
  2. ^ Katz does not give definitions for presupposition or question-answer pairs here; more on that in later chapters.

On the general character of semantic theory (Part a)

(AKA Katz’s Semantic Theory (Part IIIa). This post discusses chapter 2 of Jerrold Katz’s 1972 opus. For my discussion of chapter 1, go here.)

Having delineated in chapter 1 which questions a semantic theory ought to answer, Katz goes on in chapter 2 to sketch the sort of answer that such a theory would give. He starts at a very high level, discussing the very notion of natural language and ends up with some of the formal details of the theory that he aims to develop.

Katz begins by reminding the reader that the questions of meaning—questions 1–15 below—are absolute questions. That is, they aren’t meant to be relativized to any particular language.

  1. What are synonymy and paraphrase?
  2. What are semantic similarity and semantic difference?
  3. What is antonymy?
  4. What is superordination?
  5. What are meaningfulness and semantic anomaly?
  6. What is semantic ambiguity?
  7. What is semantic redundancy?
  8. What is semantic truth (analyticity, metalinguistic truth, etc.)?
  9. What is semantic falsehood (contradiction, metalinguistic falsehood, etc.)?
  10. What is semantically undetermined truth or falsehood (e.g., syntheticity)?
  11. What is inconsistency?
  12. What is entailment?
  13. What is presupposition?
  14. What is a possible answer to a question?
  15. What is a self-answered question?

So, asking What is semantic truth in English? is kind of like asking What is a hiccup to a Canadian? This, Katz acknowledges, makes a strong empirical claim, namely, that every natural language should exhibit the properties whose definitions are requested by questions 1–15.

As a syntactician, this claim made me think about what notions I would include in the field of syntax as universal in this sense. Notions like sentence or phrase would certainly be there, and category would likely be there. Would subject, predicate, object, and the like be there? Would modification, or transformation? How about interrogative, declarative, imperative, etc.? Notions like word/morpheme, or linear precedence, certainly were included in early versions of syntax, but more recently they tend to either be banished from the theory or dissolved into other notions.

I know of very few syntacticians who ask these questions. Perhaps this is because syntax has decidedly moved beyond the early stage in which Katz found semantics in 1972, but it still behooves us to keep those questions in mind, if only for the purposes of introducing syntax to students. Furthermore, perhaps if we keep these questions in mind, they can serve as a guide for research. Before embarking to answer a research question, the researcher would try to trace that question back to one of the basic questions to judge its likely fruitfulness. I would be curious to see how the papers in, say, LI would fare under such an analysis. But I digress.

Katz continues, asserting that a theory of linguistic meaning must be embedded in a larger theory of natural language, and in order to develop such a theory we must have some sense of what sort of thing a natural language might be. It is this question that occupies the first part of this chapter.

1. Theories about the objective reality of language


The first thing Katz does here is distinguish between the two main competing conceptions of language (at least the main conceptions of his day): the traditional rationalist conception of language as “the internalized rules of grammar that constitute the fluency of its native speakers”, and the empiricist conception of language as “a vast stock of sound chunks classifiable into various phonological and syntactic categories” (p12). He opts for rationalism, citing the now familiar arguments against the empiricist stance. First off, we can’t identify a language L with the set S of all actual utterances of L because any competent speaker of L can easily construct an expression that lies outside of S. This is because although practical factors force every expression of a language to be of finite length, there is no theoretical limit to the length of an expression; no matter the length of an expression, there is always a grammatical way of lengthening it.

One could, Katz continues, expand S to be the set of all expressions that a speaker of L could utter without eliciting an odd response from a hearer. However, this amounts to defining L in terms of dispositions of a speech community, namely the dispositions to accept or reject strings of L. In practical reality, though, these dispositions can be wildly inconsistent depending on a variety of psychological and external factors, so if we want a consistent definition we need to clean up our notion of dispositions. Katz does so by “incorporating recursive mechanisms of sentence generation” (p15), or, as they’re more commonly referred to, generative grammars. And once we incorporate generative grammars, we have a rationalist conception of natural language.

Thus far, there’s nothing too surprising. Katz gives us a fairly standard argument in favour of the rationalist conception of language. But this is where Katz’s discussion gets a little strange; this is where he reveals his realist (in the philosophical sense) view of language. It is a mistake, he argues, to identify, say, English with the actual internalized rules in English-speakers’ brains. This would be like “identifying arithmetic with the concrete realizations of the mathematical rules in the heads of those who can compute using positive real numbers” (p16). As evidence for this claim, Katz cites “dead languages” like Sanskrit, which seems to exist (we can make true or false assertions of it) even though its rules are not actualized in any human’s brain the way that Hindi-Urdu’s rules are. Although he doesn’t say it explicitly here, Katz is arguing that languages are abstract entities, like platonic forms. In his own words: “A language is not itself subject to the fate of the mortals who speak it. It is some sort of abstract entity, whatever it is that this means.” (p16)

Katz further defends this view by identifying it with the standard scientific practice of idealization. So a natural language like, say, Punjabi and a biological species like Homo sapiens are idealizations in that they can’t be defined in terms of concrete examples. Similarly, the notions of ideal gases, perfect vacuums, and massless strings are the idealizations of physics. He also cites Chomsky’s discussion in Aspects of the “ideal speaker-listener” and Rudolf Carnap, who makes a similar observation: that one cannot directly investigate language but must do so by comparison to a constructed language.

Katz’s proposal and argument that languages are abstract entities strikes me as interesting but a bit confused. Katz’s argument from dead languages is compelling, and could perhaps be made even stronger. Consider for instance, reconstructed languages such as Proto Indo-European or Proto Algonquian. At best we know a scant few details about these languages, but we can say with some certainty that they were each spoken by some speech community. Do they exist in the same sense as Sanskrit does? I think the answer has to be yes, as the only difference between a reconstructed language and a dead language seems to be a written record of that language, and that is clearly not the difference between a language and a non-language.

The argument based on idealization, though, seems to be slightly confused. The comparison of a language with a species does seem to be apt, and might point towards his conclusion, but the comparison to ideal gases etc., I think, suggests a different notion of idealization, the one that I’ve always taken Chomsky to be using. Under this sense, the idealized objects that scientists employ are not hypothesized to be real, but rather to be useful. I don’t believe even the most realist of scientists believes in the existence of frictionless planes. Scientists use these idealizations to reveal real, but non-apparent aspects of the world. In discussing the ideal speaker-listener, Chomsky was not suggesting that such a person exists, just that we ought to use this idealized person to help reveal a real aspect of the world, namely, the human language faculty.

2. Effability

In the next section Katz espouses what he calls the principle of effability, which he attributes to a number of earlier philosophers (Frege, Searle, and Tarski). The essence of the principle is roughly that if a proposition or thought is expressible in any language, it is expressible in every language. He spends a good chunk of text defending and sharpening his principle, but I’ll set that discussion aside here, and focus on why he proposes this principle. According to Katz, “effability alone offers a satisfactory basis for drawing the distinction between natural languages, on the one hand, and systems of animal communication and artificial languages, on the other” (p22). Despite this bold-seeming claim, Katz is rather hesitant regarding his principle. He admits that it is rather inchoate and probably not yet up to any empirical task. But only part of his claim is about the viability of effability; the other part is that no other property of natural language can distinguish it from other similar systems.

In particular, Katz takes aim at the properties that Chomsky tends to highlight as distinguishing factors for natural language: creativity, stimulus freedom, and appropriateness. Taking these one-by-one, he argues that none of them is unique to natural language. First, he considers creativity, which he takes to be the ability of a speaker-listener to produce and understand indefinitely many sentences. This, Katz argues, is a property of (a) any artificial language with recursive rules, and (b) certain animal communication systems, specifically bee communication. Next, Katz takes on stimulus freedom, which he argues means freedom from external stimuli, asserting that “[i]t cannot mean freedom from the control of internal stimuli as well.”1 This being the case, says Katz, stimulus freedom doesn’t make sense as a distinction. Also, he asserts that some animal behaviour displays such stimulus freedom. Finally, Katz argues that appropriateness is not part of linguistic competence—that it is extragrammatical, and also that some animal behaviour displays this property.

I take some issue with Katz’s critiques of each of the distinguishing properties individually, but I’ll set that aside for now to highlight a broader issue. Even if we take Katz’s critiques at face value, they still don’t refute Chomsky’s claim, because Chomsky’s claim isn’t that each of the three properties distinguishes natural language, but that the conjunction of the three is what distinguishes natural language. That is, natural language is distinct from animal communication and artificial language in that it is creative, stimulus-free, and appropriate. So, for instance, even if a bee can produce novel dances, it does so in response to a stimulus. Artificial languages might be creative, but it makes little sense to talk about stimulus freedom or appropriateness with respect to them. So Katz’s critiques don’t really have that much force.

At any rate, the principle of effability, while an interesting notion, doesn’t seem to be too crucial for Katz’s theory. The index of the book lists only one reference to effability outside this section. So, on to the next.

3. Competence and Performance

In the final table-setting section of this chapter, Katz takes up and defends Chomsky’s competence/performance distinction. His discussion, though, differs from most that I’ve encountered in that he uses a debate between Chomsky and Gilbert Harman, one of Chomsky and Katz’s empiricist contemporaries. Katz first clears a significant portion of underbrush in this debate in order to get to what he takes to be the crux of the issue: the proposal that linguistic competence consists in the unconscious knowledge of general principles. He summarizes Harman’s issue, which seems to revolve around the notion of grammatical transformations, as follows.

[G]iven that we can say that speakers of a language know that certain sentences are ungrammatical, certain ones ambiguous, certain ones related in certain ways to others, and so on, what licenses us to go further and say that speakers know (tacitly) the linguistic principles whose formalization in the grammar explain the noted ungrammaticality, ambiguity, sentential relations and the like?

(p28)

This challenge, Katz seems to argue, is not based on the empiricist/rationalist debate in epistemology, but rather on the realist/fictionalist argument in the philosophy of science.2 Harman is saying that a transformational grammar is maybe a good model of a speaker-listener of a given language, but it’s just that, a model. Katz responds, with the help of a quote from his erstwhile co-author, Jerry Fodor, that the only sensible conclusion to be drawn from the empirical accuracy of a scientific theory is that the theory is a true description of reality, at least insofar as it is empirically accurate. There is, of course, much more to say about this, but I’ll leave it there.

Thus, Katz sets up his conception of language in order to be able to sketch a theory of semantics within a theory of language. In my next post I will take up the details of that sketch.


  1. ^ Katz cites Cartesian Linguistics for Chomsky’s distinguishing factors, and it’s likely that CL doesn’t discuss stimulus-freedom too extensively. In more recent discussion, though, Chomsky does include internal stimuli in the property of stimulus freedom, so, it’s not clear that Katz’s critique here still holds.
  2. ^ I suspect that there is no strong demarcation between epistemology and philosophy of science, but I can’t say with any confidence one way or the other.

The Scope of Semantics

(AKA Katz’s Semantic Theory (Part II). This post discusses chapter 1 of Jerrold Katz’s 1972 opus. For my discussion of the preface, go here.)

If you’ve taken a semantics course in the past decade or two, or read an introductory textbook on the topic published in that time span, you probably encountered, likely at the outset, the question What is meaning? followed almost immediately by a fairly pat answer. In my experience, the answer given to that question was reference1: the meaning of an expression, say dog, is the set of things in the world that that expression refers to, the set of all dogs in this case. Now, I can’t exactly recall my reaction the first time a teacher presented that as an answer to the question of meaning. I might have been wholly unimpressed, or I might have had my mind blown, the way that an impressionable young mind can be blown by someone giving a pat, confident answer to a deep question. Either way, I know that every time I’ve heard that answer2 to the question of meaning since, it’s become less impressive, to the point of being slightly offensive. At best, a pat answer is incomplete; at worst, it’s flat wrong.

Of course, I never really had a better answer to the question of meaning, and most of the other answers on offer seemed much worse. I couldn’t shake the unease I had with reference as an answer, but I couldn’t fully articulate that unease. Which is why I was very quickly drawn into Semantic Theory—Katz pinpoints and articulates the source of that unease on page 3 of the book:

The misconception, it seems to me, lies in the supposition that the question “What is meaning” can be answered in a direct and straightforward way. The question is generally treated as if it were on par with questions like “What is the capital of France?” to which the direct and straightforward answer “Paris” can be given. It is supposed that an answer can be given of the form “Meaning is this or that.” But the question “What is meaning?” does not admit of a direct “this or that” answer; its answer is instead a whole theory [emphasis added]. It is not a question like “What is the capital of France?” “When did Einstein retire?” “Where is Tasmania?” because it is not merely a request for an isolated fact, a request which can be answered simply and directly. Rather it is a theoretical question, like “What is matter?” “What is electricity?” “What is light?”

(Katz 1972, p3)

Imagine if, instead of developing theories of matter, electricity, and light, the early physicists had been satisfied with giving a simple answer like Matter is anything you can touch and feel. We wouldn’t have a science of physics, or chemistry. We likely wouldn’t have any science as we know it.

Katz goes on to acknowledge that, if one were to ask a physicist what electricity is, they might give a simple answer, but notes that such an answer would be a highly condensed version of the theory of electromagnetism that has been developed over centuries of inquiry. Similarly, if you were to ask a phonologist what a syllable is, or what pronunciation is, or if you asked a syntactician what a sentence is, or what grammar is, you might get a similar condensed answer with several big caveats. You certainly wouldn’t get a simple straightforward answer. In fact, one of the first tasks in any introduction to linguistics is to disabuse students of any simple answers that they may have internalized, and even to disabuse them of the notion that simple answers to such questions even exist.

This seems to leave us in a bit of a bind. If we don’t know what meaning is, how can we study it? Katz’s response: the same way we did with chemistry, biology, phonology, etc.—We identify a set of phenomena that are definitely under the umbrella of meaning, and go from there. Not to disappoint, Katz identifies 15 such phenomena which he frames as subquestions to the meaning question:

  1. What are synonymy and paraphrase?
  2. What are semantic similarity and semantic difference?
  3. What is antonymy?
  4. What is superordination?
  5. What are meaningfulness and semantic anomaly?
  6. What is semantic ambiguity?
  7. What is semantic redundancy?
  8. What is semantic truth (analyticity, metalinguistic truth, etc.)?
  9. What is semantic falsehood (contradiction, metalinguistic falsehood, etc.)?
  10. What is semantically undetermined truth or falsehood (e.g., syntheticity)?
  11. What is inconsistency?
  12. What is entailment?
  13. What is presupposition?
  14. What is a possible answer to a question?
  15. What is a self-answered question?

A formidable list to be sure, but, as far as I can tell, modern formal semantics only cares about 11–14 (see footnote 3). Katz expands on each of these with representative examples. I won’t go into those examples, but they all are based on intuitions that a person would have about linguistic meaning. If one takes these as the leading questions of semantic theory, Katz argues, then the simple answers to the meaning question lose their appeal, as they do not answer the subquestions 1–15, or at least cannot do so without a complex semantic theory to supplement them.

Furthermore, Katz points out that the debates between the competing simple answers all use arguments based on the phenomena that 1–15 are about. Take, for instance, the best known critique of the referentialist answer. If we assume that meaning=reference, then any two expressions that have the same referent must be synonymous. Gottlob Frege, the godfather of formal semantics, argued that there were expressions which had different meanings but had the same referent, the classic example of which is the morning star and the evening star. The two expressions have different meanings (they differ as to when the star appears in the sky); however, they refer to the same object (the planet Venus). And once you start to think about it you can come up with a seeming infinity of such examples.
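To make the shape of Frege's argument concrete, here is a minimal Python sketch. Everything in it is my own illustration rather than anything from Katz or Frege: the Expression class, its field names, and the informal glosses in the sense field are all hypothetical. It only shows that a synonymy test built on "meaning is reference" lumps together two expressions that we intuitively take to differ in meaning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    form: str      # the words themselves
    sense: str     # an informal gloss of the meaning (hypothetical stand-in)
    referent: str  # the object in the world that the expression picks out


def synonymous_if_meaning_is_reference(a: Expression, b: Expression) -> bool:
    """The referentialist test: same referent, therefore same meaning."""
    return a.referent == b.referent


# The glosses below are rough assumptions, not a worked-out analysis.
morning_star = Expression("the morning star", "the star seen at dawn", "Venus")
evening_star = Expression("the evening star", "the star seen at dusk", "Venus")

# The referentialist test calls the pair synonymous...
referential_verdict = synonymous_if_meaning_is_reference(morning_star, evening_star)
# ...even though the two glosses differ.
senses_differ = morning_star.sense != evening_star.sense
```

Both referential_verdict and senses_differ come out true, which is exactly the conflict between the simple answer and the pretheoretical intuition about synonymy (question 1 on Katz's list).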

Katz goes on to show that critiques of other simple answers to the meaning question are based on what he calls “strong pretheoretical intuitions,” all of which raise at least one of questions 1–15. His point here seems to be that we can’t divorce our semantic theory from pretheoretical intuitions such as the ones that form the basis of 1–15, so why not just embrace it? Why not throw away the “leading principles” and just try to build a theory that answers 1–15?

Katz closes the chapter by discussing skepticism with regard to meaning. It’s hard to honestly maintain skepticism, he argues, when we can marshal an extensive body of evidence that meaning exists. That body of evidence starts with an explication of 1–15, but likely extends beyond that. It is even harder to honestly maintain skepticism if we can build a theory that shows the regular and law-like behaviour of the evidence marshaled. Taking a suggestion from Quine (who played a major role in the preface), Katz compares the situation that he finds himself in to that which ancient astronomers found themselves in:

Astronomy found its answer to “What are planets?” by constructing a theory that explained planetary motion on the assumption that planets are physical objects that obey standard mechanical laws. In the same spirit, once we construct a theory that can successfully explain a reasonably large portion of semantic phenomena, we can base our answer to “What is meaning?” on what the theory had to assume meaning was in order to provide its explanations.

(Katz 1972, p10)

Semantics, as it is taught and studied today, is commonly considered by non-semanticists to be the most arcane and opaque subfield of linguistics. It’s not clear what is more obscure, the questions that semanticists ask or the formalism that they use to answer those questions. I often wonder if there is something endemic to questions of meaning that makes them seem arcane to many, or if it is a failing in the standard answer that leads to this feeling. This chapter of Katz’s book, for me, rules out the former. The questions in 1–15 are far from arcane, or, at least, they’re no more arcane than the questions that occupy the other subfields of linguistics. Maybe if we took Katz’s view of semantics, fewer students would run screaming (or grumbling, or yawning) from semantics classes.

In the next chapter, entitled “On the general character of semantic theory,” Katz begins constructing his theory.


Footnotes (the links might not work, sorry)

  1. ^ I learned my semantics in a generative department where reference was the answer. Other departments might have had another answer.
  2. ^ and sometimes I’ve even given that answer as a teacher.
  3. ^ Entailment and inconsistency are the key phenomena. Presuppositions are useful as diagnostics. Questions, it seems, have only recently gained currency.

Katz’s Semantic Theory (Part I)

(This is intended to be the first in a series of posts in which I work my way through Semantic Theory by Jerrold Katz)

Through a somewhat meandering intellectual journey that I undertook when I probably should have been writing, I found myself reading the late Jerrold J. Katz’s 1972 book entitled Semantic Theory. While I began that book with a certain amount of cynicism (I think I’ve been disappointed by virtually every book that tries to develop a theory of semantics), that cynicism evaporated very quickly. It evaporated as soon as it became obvious that the theory that Katz intended to develop was radically different from the theory of semantics that contemporary linguists assume and that the source of that radical difference was that Katz shared the core assumptions of generative grammar.

That last sentence, or rather its implication, may be a bit inflammatory, but I think it’s justified, for reasons that Katz elucidates.

In his preface, Katz gives something of a historical narrative of linguistics and logic in the first half of the 20th century. He picks this time frame because of what he views as an unfortunate schism that occurred in those years. His basic story is as follows. Throughout most of their history, logic and linguistics were united by their interest in what Katz calls “the classical problem of logical form,” which is clear when you consider, for instance, that the notion of subject and predicate comes from Aristotle’s logical treatise On Interpretation, or that one of the leading logical works from the Renaissance to the 20th century, The Port Royal Logic, was written and published along with the Port Royal Grammar. In the 20th century, though, something happened and the two fields went their separate ways, away from the classical problem.

By Katz’s estimation, there are three factors that led to the schism: (i) the professionalization of the fields, (ii) the difficulty of the classical problem, and (iii) the dominance of empiricism in the fields. Since the story of linguistics in this period has been covered quite a bit, Katz doesn’t waste much time on it, and neither will I. The story of logic, however, interests Katz (more of a philosopher than a linguist) a great deal, and I think is useful in understanding current theories of semantics. Logicians in the early 20th century, influenced by Katz’s three factors, abandoned the problem of logical form and sought out “manageable problems.” The problem, or perhaps program is the better word for it, that they landed on was the development of artificial languages with which to represent thought. These artificial languages, unlike natural language, wore their logical form on their sleeves, to borrow Katz’s formulation.

In order to formulate an artificial logical language, Quine, one of Katz’s chief villains, sought to identify and highlight the “logical particles” of natural language as distinct from the extra-logical vocabulary. The logical particles (e.g., and, or, not, if—then) are those that have inferential powers, while the extra-logical words (e.g., dog, bachelor, Moira, lamp) are those that have only referential powers. This seems fairly intuitive, but Katz argues that there is no non-arbitrary way of dividing logical vocabulary from extra-logical vocabulary. This is certainly an odd assertion. I mean, it’s pretty obvious that and is a logical word and dog isn’t, right? While it might be a valid intuition that these are different sorts of words, what Katz argues is that the set of words that have inferential powers is much larger than what we might call the logical particles.

To show this, Katz walks us through a possible method for identifying logical particles and demonstrates that this method cannot actually rule out any word as a logical particle. The method starts by examining a valid inference such as (1)–(3).

  (1) All terriers are dogs.
  (2) All dogs are animals.
  (3) Hence, all terriers are animals.

We can see that (1)–(3) remains valid regardless of the meaning of dogs, animals, and terriers; that is, we could replace the tokens of those words with tokens of virtually any other nouns and we’d still have a valid inference. By the same token, though, the validity of (1)–(3) depends on the meaning of all, are, and hence. So, we remove from our list of candidates for logical particles the words that can be factored out of such valid inferences. Katz argues that, while this method gives the expected results for classical syllogisms and perhaps some other logical inferences, things get messy when we look at the full range of valid inferences.

Katz presents (4) and (5) as a valid inference, but argues that the method of factoring we applied to (1)–(3) gives different results here.

  (4) Socrates is a man.
  (5) Hence, Socrates is male.

We can factor out Socrates here, but not man or male. The inference from (4) to (5) seems to depend on the meaning of the latter two words. If we follow our methodology, then we have to add male and man to our logical particles, because they seem to have inferential powers. With a few moments of thought, we can see that this leads to a situation where there is no logical/extra-logical distinction, because every word is a logical particle. Thus Quine’s program is doomed to failure.
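The factoring test can be sketched as a brute-force check over a toy universe. This is only my illustration under stated assumptions, not Katz's formalism: the three-individual domain, the function names, and the stipulation that the set of men is a subset of the set of males are all hypothetical choices. A word "factors out" if the inference stays valid however that word is reinterpreted.

```python
from itertools import combinations, product

DOMAIN = range(3)  # a toy universe of three individuals (an assumption)


def subsets(domain):
    """Every subset of the domain: the candidate meanings for a noun or adjective."""
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def syllogism_survives_reinterpretation():
    """Check (1)-(3) with terriers, dogs, animals replaced by arbitrary sets."""
    return all(
        t <= a                    # conclusion: all T are A
        for t, d, a in product(subsets(DOMAIN), repeat=3)
        if t <= d and d <= a      # premises: all T are D, all D are A
    )


def socrates_inference_survives(reinterpret_predicates):
    """Check (4)-(5) for every choice of individual in place of Socrates.

    If reinterpret_predicates is True, man and male are also replaced by
    arbitrary sets; otherwise they keep a fixed, assumed meaning.
    """
    man = frozenset({0})
    male = frozenset({0, 1})      # assumed meaning relation: every man is male
    if reinterpret_predicates:
        pairs = product(subsets(DOMAIN), repeat=2)
    else:
        pairs = [(man, male)]
    return all(s in q for p, q in pairs for s in DOMAIN if s in p)
```

In this toy setting, syllogism_survives_reinterpretation() holds, so the nouns factor out of (1)–(3). socrates_inference_survives(False) also holds, so Socrates factors out of (4)–(5), but socrates_inference_survives(True) fails, because nothing guarantees that an arbitrary replacement for man is included in an arbitrary replacement for male. The sketch covers only that first step; the further claim that every word ends up counting as a logical particle is Katz's argument, not something the code establishes.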

As anyone who has learned any formal logic knows, though, Quine’s program became the orthodoxy. And, in fact, his conception of logic is, in many ways, the basis for semantics as practiced by contemporary generative grammarians. Katz identifies the work of George Lakoff and that of Donald Davidson as early attempts to apply Quinean logic to language, and it continues today.

As something of an aside, formal semanticists seem to take as given the assertion that there is a privileged class of logical particles, and try to analyze a portion of the vocabulary that lies outside of that class so that it can be expressed using the logical particles and some simple atomic extra-logical “words.” What belongs to that analyzable portion of vocabulary is not well defined; I know that know and should are in that portion and I know that dog and wallet are outside of that portion, but I can’t really get much more specific than that.

What’s stranger is that even some of those words that correspond to logical particles are up for analysis. And, triggers some implicatures which are often analyzed using the Quinean tools. The meaning of if—then, is also up for debate. I almost wrote a paper as part of my PhD on conditionals and the one thing that the semantic literature seems to agree on is that the meaning of if—then is not the material conditional (→). Being a naive syntactician, with no understanding of the history of logic, I basically took formal logic as gospel. It never occurred to me that the logician’s conception of conditional statements could be flawed.

Of course, if Katz is correct, then logics built on Quine’s logical/extra-logical distinction are the proverbial houses built on sand. And if I’m correct that formal semantics is built on Quinean logic, then formal semantics is a proverbial house built on a house built on sand. End of aside.

Having argued that the empiricist theories of logic such as those of Quine, Frege, and Carnap are unsuited for inclusion in a rationalist theory of language such as generative grammar, Katz moves on to the next task, the one that occupies the remainder of his book: the task of constructing a rationalist and scientific theory of semantics. According to Katz, this task was viewed by the philosophers of his day as an impossibility, and I don’t know if much has changed.

In fact, it seems to me that among semanticists and a number of generative syntacticians, there is a strong hostility towards rationalist conceptions of semantics as put forth by Katz (and also Chomsky). As an illustrative anecdote, I recall once I was talking with an established linguist, and I expressed some skepticism towards modern formal semantics. When I suggested that a more rationalist, intensionalist theory of semantics might be fruitful, they responded that, while I might be right, if I decided to pursue that line of research, I would never be hired as a semanticist. Luckily for me, of course, I’m a syntactician, but that’s still a rather chilling thing to hear. End of second aside.

Katz concludes his preface by putting his program in context, and outlining the structure of the book. I won’t bore you with the details, but will only preview chapter 1, “The scope of semantics,” wherein Katz considers the question what is meaning?, and gives a shockingly sensible answer: That’s a complex question, we’ll need to answer it scientifically.