The Poverty of Referentialist Semantics

(What follows is a bit of a rant. I hope it holds together. If you make it past the inflammatory title, let me know what you think.)

When Gregor Mendel first discovered his Laws of Inheritance, it was a great revelation. To be sure, humanity has perhaps always known that many of a person’s (or plant’s, animal’s, bacterium’s, etc.) traits are inherited from their parents, but Mendel was able to give that knowledge a quantitative expression. Of course, this was just the beginning of the modern study of genetics, as scientists asked the next obvious question: How are traits inherited? This question persisted for the better part of a century until a team of scientists showed experimentally that inheritance proceeds via DNA. Again, this raised a question that has spurred research to this day: How does DNA encode physical traits? But why am I writing about genetics in a post about semantics? Well, to make a point of contrast with the theory that has dominated the field of linguistic semantics for the past few decades: Formal Semantics.

As in the case of inheritance, we’ve always known that words, phrases, and sentences have meanings, but we’ve had a tougher time understanding this fact. In the late 19th and early 20th centuries, philosophers, psychologists, and linguists seemed to settle on a way of understanding linguistic meaning: linguistic expressions are meaningful by virtue of the fact that they refer to objects in the world. So, “dog” has a meaning for modern English speakers because it refers to dogs. This principle has led the modern field of semantics, although not in the same way as the discoveries of genetics led that field. If semanticists had proceeded as the geneticists had, they would have immediately asked the obvious question: How do linguistic expressions refer to objects in the world? Instead of pursuing this question, semanticists seem to have banished it and, in fact, virtually any questions about the reference relation, and have done so, I believe, to the detriment of the field.

At first blush, it might seem that semanticists should be forgiven for not centring this question in their inquiry. Curiosity about genetic inheritance, to continue my comparison, is quite natural, likely because we can observe its facts objectively. Certainly, it’s a cliché that no one likes to admit that they’re like their parents. There is very little resistance, on the other hand, to seeing such a similarity in other people. The facts of inheritance are unavoidable, but they are not coupled with anything approaching intuition about them. In fact, many of the facts are fundamentally unintuitive: How can a trait skip a generation? Why does male pattern baldness come from the mother’s side? How can a long line of brown-eyed people produce a blue-eyed child? This dearth of intuition about an abundance of evidence means that no one objects to follow-up questions to any scientific advance in the field. In fact, the right kind of follow-up questions are welcomed.

On the other hand, linguistics, especially generative linguistics, faces the opposite situation. In many ways, the object of generative inquiry is our intuitive knowledge about our own language. It should be obvious here that the average person’s intuitions about language vastly outweigh the objective facts about language.* Our intuitions about language are so close to our core that it is very uncomfortable for us to entertain questions about it. We like to think that we know our own minds, but a question like what is language?—properly pursued—highlights just how little we understand that mind. This is not to say it’s an unanswerable or ill-formed question; it’s not a species of zen kōan. Language exists and we can distinguish it from other things, so, unlike the sound of one hand clapping, it has a nature that we can perhaps gain some understanding of. In fact, the field of generative syntax shows us that language is amenable to rational inquiry, provided researchers are open to follow-up questions: Chomsky’s initial answer to the question was that language is a computational procedure that generates an infinite array of meaningful expressions, which raised the obvious question: What sort of computational procedure? In many ways this is the driving question of generative syntactic theory, but it has also raised a number of additional questions, some of which are still open.

Just as what is language? is a difficult question, so are what is meaning? and how do words refer? So semanticists can be forgiven for balking at them initially. But, again, this is not to say that these are unanswerable questions in principle. What’s more, I don’t think semanticists even attempt to argue that the questions are too hard. On the contrary, their attitude seems to be that the answers to the questions are so obvious that they don’t warrant a response. Are they right? Are these boring, obvious questions? I don’t think so. I think they are interesting questions whose surface simplicity masks a universe of complexity. In fact, I can demonstrate that complexity with some seemingly simple examples.

Before I demonstrate the complexity of reference in language, let’s look at some simple cases of reference to get a sense of what sort of relation it is. Consider, for instance, longitude and latitude. The string 53° 20′ 57.6″ N, 6° 15′ 39.87″ W refers to a particular location on earth. Specifically, it refers to the location of the Dublin General Post Office. That sequence of symbols is not intrinsically linked to that spot on earth; it is linked by the convention of longitude and latitude, which is to say it is linked arbitrarily. Despite its arbitrary nature, though, the link is objective; it doesn’t matter who is reading it, it still refers to that particular location. Similar remarks apply to variables in computer programs, which are arbitrarily linked to locations in a computer’s RAM, or to numerals like 4 or IV, which are arbitrarily linked to a particular number (assuming numbers have objective reality). These examples seem to suggest the following definition of the reference relation.

(R) Reference is the arbitrary and objective mapping between symbols and objects or sets of objects.
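
To see what (R) amounts to in practice, here is a minimal sketch in Python (with invented pairings) of an arbitrary but objective symbol-to-object mapping: nothing about the symbols determines what they map to, but once the convention is fixed, every lookup yields the same object regardless of who performs it.

```python
# A toy model of (R): an arbitrary but objective symbol-to-object mapping.
# The pairings are pure convention (and invented for illustration), but once
# fixed, looking up a symbol yields the same object no matter who asks.
REFERENCE = {
    "53° 20′ 57.6″ N, 6° 15′ 39.87″ W": "location of the Dublin General Post Office",
    "4": 4,
    "IV": 4,
}

def refer(symbol):
    """Return the object a symbol refers to, independently of the speaker."""
    return REFERENCE[symbol]

assert refer("4") == refer("IV")  # two arbitrary symbols, one object
```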

For a moment, let’s set aside two types of expressions: subjective expressions like my favourite book, or next door, and proper names like The Dublin General Post Office, or Edward Snowden. For the purposes of this post, I will grant that the question of how the latter refer is already solved,† and the question of how the former refer is too difficult to answer at this point. Even if we restrict ourselves to common nouns that ostensibly refer to physical objects, we run into interesting problems.

Consider the word “chair”. English speakers are very good at correctly identifying certain masses of matter as chairs, and identifying others as not chairs. This seems like a textbook case of reference, but how are we able to do it?

[Image: A chair]

[Image: Not a chair]

In order for reference to obtain here, there must be some intrinsic property (or constellation of properties) that marks the thing on the left as a chair and is lacking in the thing on the right. Let’s skip some pointless speculation and settle on shape as the determining factor. That is, chairs are chairs by virtue of their shape. And let’s grant that that chair-shape can be codified in such a way as to allow reference to obtain. That would be great, except that it still doesn’t fully capture the meaning of “chair”.

Suppose, for instance, a sculptor creates an object that looks exactly like a chair, and an art gallery buys it to display as part of its collection. Is that object a chair? No, it’s a sculpture. Why? Because it no longer serves the function of a chair. So the objective shape of an artifact is not sufficient to determine its chair-ness; we need to say something about its function, and function, I would argue, is subjective.

Or consider the following narrative:

Sadie has just moved into her first apartment in a major Western city and she needs to furnish it. Being less than wealthy, she opts to buy furniture from Ikea. She goes online and orders her Ikea furniture. The next day three flat-pack boxes arrive at her door: One contains a bookshelf, one contains a bed, and the other contains a chair.

In what sense does that box contain a chair? It contains prefabricated parts which can be assembled to form a chair. Neither the box nor its contents are chair-shaped, yet we’re happy to call the contents a chair. What if Sadie were a skilled woodworker and wanted to build her own furniture from, say, several 2-by-4s? Would we call those uncrafted 2-by-4s a chair? I don’t think so. Let’s continue the narrative.

Sadie assembles her furniture and enjoys it for a year, at which point her landlord decides to evict her in order to double the rent. Sadie finds another apartment and packs up her belongings. In order to facilitate the move, she disassembles her furniture and puts the pieces in the trunks of the cars of her various helpful siblings. Her bookcase goes with Rose, her bed with Declan, and her chair with Violet.

Again, we refer to a bundle of chair parts as a chair. What if Sadie had taken out her anger at being evicted by her greedy landlord on the chair, hacking it to pieces with an axe? Would the resulting pile of rubble be a chair? Certainly not.

What does this tell us about how the word “chair” is linked to the object that I’m sitting on as I write this? That link cannot be reference as defined above in (R), because it’s not purely objective. The chair-ness of an object depends not only on its objective form, but also on its subjective function. And this problem will crop up with any artifact-word (e.g., “table”, “book”, “toque”). If we were to shift our domain away from artifact-words, no doubt we’d find more words that don’t refer in the sense of (R). Maybe we’d find real honest-to-goodness referring words, but we’d still be left with a language that contains a sizable chunk of non-referential expressions. Worse still, modern formal semanticists have expanded the universe of “real objects” to which expressions can refer to include situations, events, degrees, and so on. What’s the objective nature of an event? or a situation? or a degree? No idea, but I know them when I see them.

“So what?” you might say. “Formal semantics works. Just look at all of the papers published, problems raised and solved, linguistic phenomena described. Who cares if we don’t know how reference works?” Well, if semantics is the study of meaning, and meaning is reference, then how can there be any measure of the success of semantics that isn’t a measure of its understanding of what reference is?

Again, consider a comparison with genetics. What if, instead of asking follow-ups to Mendel’s laws, geneticists had merely developed the laws to greater precision? Our current understanding of genetics would be wildly impoverished. We certainly would not have all of the advances that currently characterize genetic science. Quite obviously genetics is much the richer for asking those follow-up questions.

No doubt semantics would be much richer if it allowed follow-up questions.


* It is precisely this situation that makes it so difficult to communicate the aims of generative linguistics, and why the main type of linguistics that gains any sort of traction in the mainstream press is the type that looks at other people’s language. Consider the moral panic about the speech patterns of young women that surfaces every so often, the NY Times bestseller Because Internet, by Gretchen McCulloch, which looks at the linguistic innovation on the internet, or even the current discussion about the origins of Toronto slang. To paraphrase Mark Twain, nothing so needs research and discussion as other people’s language.

† I’m being generous here. In fact, most paradoxes of reference are about proper names (see Katz, J. J. (1986). Why intensionalists ought not be Fregeans. In Truth and Interpretation, 59–91.)

On the general character of semantic theory (Part b)

(AKA Katz’s Semantic Theory, Part IIIb.) This post discusses the second half of chapter 2 of Jerrold Katz’s 1972 opus. For my discussion of the first half of the chapter, go here.

(Note: This post was written in fits and starts, which is likely reflected in its style (or lack thereof). My apologies in advance.)

The first half of chapter 2 was concerned with the broader theory of language, rather than a semantic theory. In the second half of the chapter, Katz begins his sketch of the theory of semantics. It’s at this point that I pick up my review.

4. The structure of the theory of language

In this section, Katz discusses universals, which he frames, following Chomsky, as constraints on grammars. Katz differs from Chomsky, though, in how he divvies up the universals—whereas Chomsky, in Aspects, distinguishes between formal and substantive universals, Katz adds a third type: organizational universals. These classifications are defined as follows:

Formal universals constrain the form of the rules in a grammar; substantive universals provide a theoretical vocabulary from which the constructs used to formulate the rules of particular grammars are drawn; organizational universals, of which there are two subtypes, componential organizational universals and systematic organizational universals, specify the interrelations among the rules and among the systems of rules within a grammar.

p30-31

Furthermore, formal, substantive, and componential universals cross-classify with phonological, syntactic, and semantic universals. This means that we can talk about substantive phonological universals, or componential semantic universals, and so on. So, for example, a phonological theory consists in a specification of the formal, substantive, and componential universals at the phonological level, and such a specification amounts to a definition of the phonological component of the language faculty. Systematic universals, then, specify how the components of the grammar are related to each other. With this discussion, Katz sets up his goals: to specify the formal, substantive, and componential universals at the semantic level. More precisely, he aims to develop the following:

(2.7) A scheme for semantic representation consisting of a theoretical vocabulary from which semantic constructs required in the formulation of particular semantic interpretations can be drawn

p33

(2.8) A specification for the form of the dictionary and a specification of the form of the rules that project semantic representations for complex syntactic constituents from the dictionary’s representations of the senses of their minimal syntactic parts.

p33

(2.9) A specification of the form of the semantic component, of the relation between the dictionary and the projection rules, and of the manner in which these rules apply in assigning semantic representations

p3

These three aspects of semantic theory, according to Katz, represent the substantive, formal, and componential universals, respectively. A theory that contains (2.7)-(2.9) and answers questions 1-15 (as listed here) would count as an adequate semantic theory.

5. Semantic theory’s model of a semantic component

So, Katz asks rhetorically, how could semantic relations, such as analyticity, synonymy, or semantic similarity, be captured in the purely formal terms required by (2.7)-(2.9)? The answer is simple: semantic relations and properties are merely formal aspects of the compositional meanings of expressions. This is a bold and controversial claim: semantic properties/relations are formal properties/relations or, to put it more strongly, semantic properties/relations are, in fact, syntactic properties/relations (where “syntactic” is used in a very broad sense). Of course, this claim is theoretical and rather coarse. Katz aims to make it empirical and fine.

So, what does Katz’s semantic theory consist of? At the broadest level, it consists of a dictionary and a set of projection rules. No surprise yet; it’s a computational theory, and any computational system consists of symbols and rules. The dictionary contains entries for every morpheme in a given language, where each entry is a collection of the senses of that morpheme. Finally, he defines two “technical terms.” The first is a reading, which refers to “a semantic representation of a sense of a morpheme, word, phrase, clause, or sentence” and which is further divided into lexical readings and derived readings. The second term is semantic marker, which refers to “the semantic representation of one or another of the concepts that appear as parts of senses.” Katz then continues, identifying the limiting case of the semantic marker: primitive semantic markers.
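
As a rough sketch of the architecture (my gloss, not Katz’s own formalism), the dictionary can be modelled as a mapping from morphemes to collections of senses, with each sense a set of semantic markers:

```python
# A minimal model of Katz's dictionary (entries are hypothetical stand-ins).
# Each morpheme maps to its lexical reading: a collection of senses,
# where each sense is modelled as a frozenset of semantic markers.
Sense = frozenset  # a sense = a set of semantic markers

DICTIONARY = {
    "chair": [Sense({"(Object)", "(Physical)", "(Artifact)", "(Furniture)"})],
    "light": [
        Sense({"(Low in weight)"}),    # the physical-weight sense
        Sense({"(Inconsequential)"}),  # the figurative sense
    ],
}

def lexical_readings(morpheme):
    """Fetch a morpheme's stored senses; derived readings are built from these."""
    return DICTIONARY[morpheme]
```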

Here it’s worth making a careful analogy to syntactic theory. Semantic markers, as their name suggests, are analogous to phrase markers. Each is a representation of constituency: a phrase marker represents the syntactic constituents of an expression, while a semantic marker represents the conceptual constituents of a concept. In each theory there are base cases of the markers: morphemes in syntactic theory, and the aptly named primitive semantic markers in semantic theory. I must stress, of course, that this is only an analogy, not an isomorphism. Morphemes are not mapped to primitive semantic markers, and vice versa. Just as a simple morpheme can be phonologically complex, it can also be semantically complex. Furthermore, as we’ll see shortly, while complex semantic markers are structured, there is no reason to expect them to be structured according to the principles of syntactic theory.

Before Katz gets to the actual nitty-gritty of formalizing these notions, he pauses to discuss ontology. He’s a philosopher, after all. Semantic markers are representations of concepts and propositions, but what are concepts and propositions? Well, we can be sure of some things that they are not: images, mental ideas, and particular thoughts, which Katz groups together as what he calls cognitions. Cognitions, for Katz, are concrete, meaning they can be individuated by who has them, when and where they occur, and so on. If you and I have the same thought (e.g., “Toronto is the capital of Ontario”), then we had different cognitions. Concepts and propositions, for Katz, are abstract objects and, therefore, independent of space and time, meaning they can’t be individuated by their nonexistent spatiotemporal properties. They can, however, be individuated by natural languages, which Katz also takes to be abstract objects, and, in fact, are individuated easily by speakers of natural languages. Since, in a formulation echoed recently by Paul Pietroski (at around 5:45), “senses are concepts and propositions connected with phonetic (or orthographic) objects in natural languages” and the goal of linguistic theory is to construct grammars that model that connection, the question of concept- and proposition-individuation is best answered by linguistic theory.1

But, Katz’s critics might argue, individuation of concepts and propositions is not a definition of “concept” or “proposition”. True, Katz responds, but so what? If we needed to explicitly define the object of our study before we started studying it, we wouldn’t have any science. He uses the example of Maxwell’s theory of electromagnetism, which accurately models the behaviour and structural properties of electromagnetic waves but does not furnish any definition of electromagnetism. So if we can come up with a theory that accurately models the behaviour and structural properties of concepts and propositions, why should we demand a definition?

We also can’t expect a definition of “semantic marker” or “reading” right out of the gate. In fact, Katz argues, one of the goals of semantic theory (2.7) is to come up with those definitions, and we can’t expect to have a complete theory in hand before we develop it. Nevertheless, we can use some basic intuitions to come up with a preliminary sketch of what a reading and a semantic marker might look like. For instance, the everyday word/concept “chair” has a common sense, which is composed of subconcepts and can be represented as the set of semantic markers in (2.15).

(2.15) (Object), (Physical), (Non-living), (Artifact),
       (Furniture), (Portable), (Something with legs),
       (Something with a back), (Something with a seat),
       (Seat for one)

Of course, this is just preliminary. Katz identifies a number of places for improvement. Each of the semantic markers is likely decomposable into simpler markers. Even the concept represented by “(Object)” is likely decomposable.

Or, Katz continues, we can treat semantic markers as ways of making semantic generalizations. Consider how “chair” relates to words such as “hat,” “planet,” “car,” and “molecule,” as compared to words such as “truth,” “thought,” “togetherness,” and “feeling.” Obviously, these words all denote distinct concepts, but just as obviously, the two groupings contrast with each other. We can think of the semantic marker “(Object)” as the distinguishing factor in these groupings: the former is a group of objects, the latter a group of non-objects. So, semantic markers, like phonological features and grammatical categories, are expressions of natural classes.
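
A minimal sketch of this idea (the marker assignments are my illustrative guesses, not Katz’s entries): a marker like “(Object)” picks out a natural class of words, just as a phonological feature picks out a natural class of segments.

```python
# Semantic markers as natural classes: "(Object)" partitions the vocabulary
# much as a phonological feature partitions segments.
# Marker assignments are illustrative guesses, not Katz's actual analyses.
SENSES = {
    "chair":        {"(Object)", "(Artifact)"},
    "hat":          {"(Object)", "(Artifact)"},
    "planet":       {"(Object)"},
    "car":          {"(Object)", "(Artifact)"},
    "molecule":     {"(Object)"},
    "truth":        {"(Abstract)"},
    "thought":      {"(Abstract)"},
    "togetherness": {"(Abstract)"},
    "feeling":      {"(Abstract)"},
}

def natural_class(marker):
    """All words whose sense contains the given marker."""
    return {word for word, markers in SENSES.items() if marker in markers}

print(natural_class("(Object)"))  # chair, hat, planet, car, molecule
```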

Finally, Katz proposes a third way of thinking of semantic markers: “as symbols that mark the components of senses of expressions on which inferences from sentences containing the expressions depend.” (p41) For instance, we can infer (2.19) from (2.18), but we cannot infer (2.27).

(2.18) There is a chair in the room.

(2.19) There is a physical object in the room.

(2.27) There is a woman in the room.

We can express this inference pattern by saying that every semantic marker that comprises the sense of “physical object” in (2.19) is contained in the sense of “chair” in (2.18), but that is not the case for “woman” in (2.27). The sense of “woman” in (2.27) contains semantic markers like “(Female)” which are not contained in the sense of “chair” in (2.18) (I give a toy version of this containment check after the examples below). Here Katz notes that his proposal that concepts like “chair” consist of markers is merely an extension of an observation by Frege that (2.28a,b,c) are together equivalent to (2.29).

(2.28)
(a) 2 is a positive number
(b) 2 is a whole number
(c) 2 is less than 10

(2.29) 2 is a positive whole number less than 10

For Frege, “positive number”, “whole number”, and “less than 10” are all properties of “2” and marks of “positive whole number less than 10”. Katz’s extension is to say that the concepts associated with simple expressions can have their own marks.
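
The inference pattern in (2.18)/(2.19) reduces to a containment check: (2.18) entails (2.19) just in case every marker in the sense of “physical object” is contained in the sense of “chair”. Here is a toy version, with made-up marker sets:

```python
# Inference by marker containment: (2.18) entails (2.19) because the sense of
# "physical object" is a subset of the sense of "chair"; it fails to entail
# (2.27) because "woman" carries markers, like "(Female)", that "chair" lacks.
# Marker sets are simplified for illustration.
CHAIR = {"(Object)", "(Physical)", "(Non-living)", "(Artifact)", "(Furniture)"}
PHYSICAL_OBJECT = {"(Object)", "(Physical)"}
WOMAN = {"(Object)", "(Physical)", "(Human)", "(Female)"}

def entails(premise_sense, conclusion_sense):
    """The inference goes through iff the conclusion's markers are contained in the premise's."""
    return conclusion_sense <= premise_sense

assert entails(CHAIR, PHYSICAL_OBJECT)  # (2.18) -> (2.19)
assert not entails(CHAIR, WOMAN)        # (2.18) -/-> (2.27)
```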

Next, Katz discusses the notions of lexical and derived readings, which are, in a sense, the inputs and outputs, respectively, of the process of semantic composition. As the name suggests, lexical readings are what is stored in the dictionary. When a syntactic object hits the semantic component of the grammar, the first step is to replace the terminal nodes with their lexical readings. Derived readings are generated by applying projection rules to the first level of non-terminal nodes, and then the next level, and so on until the syntactic object is exhausted.
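
Very loosely, the process Katz describes is a bottom-up tree walk: terminals are swapped for their lexical readings, and projection rules derive readings at each non-terminal. The sketch below uses a deliberately trivial projection rule (unioning markers) just to show the control flow; Katz’s actual rules are far richer, and I’ve ignored lexical ambiguity entirely.

```python
# A loose sketch of deriving readings: terminals get their lexical readings,
# and a projection rule combines daughter readings at each non-terminal.
# The projection rule here (marker union) is a placeholder, not Katz's.
def derive_reading(node, dictionary):
    if isinstance(node, str):            # terminal: substitute its lexical reading
        return dictionary[node]
    daughters = [derive_reading(d, dictionary) for d in node]
    return project(daughters)            # non-terminal: apply the projection rule

def project(readings):
    """Placeholder projection rule: amalgamate daughters by unioning their markers."""
    result = set()
    for reading in readings:
        result |= reading
    return result

# One sense per word here, ignoring ambiguity for simplicity.
toy_dictionary = {"light": {"(Low in weight)"}, "book": {"(Object)", "(Artifact)"}}
print(derive_reading(("light", "book"), toy_dictionary))
```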

The process of deriving readings, Katz asserts, must be restrictive in the sense that the interpretation of a sentence never comprises every permutation of the lexical readings of its component parts. For instance, suppose the adjective “light” and the noun “book” have N and M senses in their respective lexical readings. If our process for deriving readings were unrestrictive, we would expect “light book” to have N×M senses while, in fact, fewer are available. We can see this even when we restrict ourselves to 2 senses for “light”—“low in physical weight” and “inconsequential”—and 2 senses for “book”—“a bound collection of paper” and “a work of literature”. Restricting ourselves this much, we can see that “light book” is 2-ways ambiguous, describing a bound collection of paper with a low weight, or a work of literature whose content is inconsequential, but not a work of literature with a low weight or an inconsequential bound collection of paper. Our semantic theory, then, must be such that the compositional process it proposes can appropriately restrict the class of derived readings for a given syntactic object.

To ensure this restrictiveness, Katz proposes that the senses that make up a dictionary entry are each paired with a selectional restriction. To illustrate this, he considers the adjective “handsome”, which has three senses: when applied to a person or artifact, it has the sense “beautiful with dignity”; when applied to an amount, it has the sense “moderately large”; when applied to conduct, it has the sense “gracious or generous”. So, for Katz, the dictionary entry for “handsome” is as in (2.30).

(2.30) "handsome";[+Adj,…];(Physical),(Object),(Beautiful),
                           (Dignified in appearance),
                           <(Human),(Artifact)>
                           (Gracious),(Generous),<(Conduct)>
                           (Moderately large),<(Amount)>

Here the semantic markers in angle brackets represent the markers that must be present in the senses that “handsome” is applied to.
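
One way to picture how the angle-bracketed restrictions do their work (a sketch under my own simplifying assumptions, not Katz’s notation): each sense of a modifier carries a set of markers that the sense it applies to must contain, and composition discards incompatible pairings. This is exactly what prunes “light book” from N×M potential senses down to two.

```python
# Senses paired with selectional restrictions, in the spirit of (2.30):
# an adjective sense applies only if the noun's sense contains every marker
# in the restriction. Entries are simplified stand-ins, not Katz's.
LIGHT = [
    ({"(Low in weight)"},   {"(Physical)"}),  # requires a physical head
    ({"(Inconsequential)"}, {"(Content)"}),   # requires a contentful head
]
BOOK = [
    {"(Physical)", "(Object)", "(Artifact)"},  # a bound collection of paper
    {"(Content)", "(Work of literature)"},     # a work of literature
]

def compose(adjective_senses, noun_senses):
    """Keep only the pairings whose selectional restriction is satisfied."""
    return [
        adj_markers | noun_sense
        for adj_markers, restriction in adjective_senses
        for noun_sense in noun_senses
        if restriction <= noun_sense
    ]

print(len(compose(LIGHT, BOOK)))  # 2 derived readings, not 2 x 2 = 4
```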

This solution to the problem of selection may seem stipulative and ad hoc—I know it seems that way to me—but recall that this is an early chapter in a book published in 1972. If we compared it to the theories of syntax and phonology of the time, they might appear similarly unsatisfying. The difference is that syntactic and phonological theories have since developed into more formalized and, hopefully, more explanatory theories through the collaborative effort of many researchers, while Katz’s theory never gained the traction required to spur that level of collaboration.

Katz closes out this section with a discussion of “semantic redundancy rules” and projection rules. Rather than discuss these, I move on to the final section of the chapter.

6. Preliminary definitions of some semantic properties and relations

Here Katz shows the utility of the theory that he has thus far sketched. That is, he looks at how the semantic properties and relations identified in chapter 1 can be defined in the terms introduced in this chapter. These theoretical definitions are guided by our common sense definitions, but Katz is careful to stress that they are not determined by them. So, for instance, two things are similar when they share some feature(s). Translating this into his theory, Katz gives the definition in (2.33) for semantic similarity.

(2.33) A constituent Ci is semantically similar to a constituent Cj on a sense just in case there is a reading of Ci and a reading of Cj which have a semantic marker in common. (They can be said to be semantically similar with respect to the concept φ in case the shared semantic marker represents φ.)

Note that we can convert this definition into a scalar notion, so we can talk about degrees of similarity in terms of the number of shared markers. Katz does this implicitly by defining semantic distinctness as sharing no markers and synonymy as sharing all markers.
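
Translated into the same toy terms as the earlier sketches (my gloss, not Katz’s formal statement), similarity, distinctness, and synonymy all fall out of a single counting function:

```python
# Degrees of semantic similarity as the number of shared markers;
# distinctness and synonymy are the endpoints of the scale.
def shared_markers(sense_a, sense_b):
    return len(sense_a & sense_b)

def semantically_similar(a, b):
    return shared_markers(a, b) >= 1   # (2.33): at least one common marker

def semantically_distinct(a, b):
    return shared_markers(a, b) == 0   # no markers in common

def synonymous(a, b):
    return a == b                      # all markers shared
```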

Similarity is a rather simple notion, and therefore has a simple definition; others require some complexity. For instance, analytic statements like “Liars lie” are vacuous assertions due to the fact that the meaning of the subject is contained in the meaning of the predicate. Here, Katz gives the definition one might expect, but it is clear that more needs to be said, as the notions of subject and predicate are more difficult to define. More on this in later chapters.

A more puzzling and less often remarked upon semantic relation is antonymy—the relation that holds of the word pairs in (2.46) and of the set of words in (2.47).

(2.46) bride/groom, aunt/uncle, cow/bull, girl/boy, doe/buck

(2.47) child/cub/puppy/kitten/cygnet

Katz notes that although antonymy is generally taken to be merely lexical, it actually projects to larger expressions (e.g., “our beloved old cow”/”our beloved old bull”), and is targeted by words like “either” as demonstrated by the fact that (2.49a) is meaningful while (2.49c) is anomalous.

(2.49)
a. John is well and Mary’s not sick either.
c. John is well and Mary’s not {well/foolish/poor/dead} either.

In order for antonymy to be given an adequate theoretical definition, then, it must be expressed formally. Katz does this by marking semantic markers that represent antonymy sets with a superscript. For instance, “brother” and “sister” would be represented as (Sibling)(M) and (Sibling)(F), respectively. Again, this is clearly stipulative and ad hoc, but that is to be expected at this stage of a theory. In fact, Katz seems to have been revising his theory up to his death, with the colour incompatibility problem—the question of why the sentence “The dot is green and red” is contradictory—occupying the focus of a 1998 paper of his and a section of his posthumous book. Even Katz’s ad hoc solution to the problem, though, is miles ahead of any solution that could possibly be given in current formal semantics—which bases its definition of meaning on reference—because, to my knowledge, there is no way to account for antonymy in formal semantics. Indeed, the mere fact that Katz is able to give any theoretical definition of antonymy puts his theory well ahead of formal semantics.
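
Katz’s superscript device can be mimicked by treating a marker as a base paired with a distinguisher drawn from an antonymy set; two senses are antonymous when they share a base but differ on the distinguisher. A toy rendering (the encoding is mine, purely illustrative):

```python
# Antonymy via "superscripted" markers, e.g. (Sibling)(M) vs (Sibling)(F),
# modelled as (base, distinguisher) pairs. The encoding is illustrative only.
BROTHER = {("Sibling", "M")}
SISTER = {("Sibling", "F")}

def antonymous(sense_a, sense_b):
    """True if some pair of markers shares a base but differs in its distinguisher."""
    return any(
        base_a == base_b and sup_a != sup_b
        for base_a, sup_a in sense_a
        for base_b, sup_b in sense_b
    )

assert antonymous(BROTHER, SISTER)
```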

Conclusion

Katz’s rough sketch of a semantic theory is already fairly successful in that it’s able to provide concrete definitions of many of the semantic notions that he identifies in the first chapter.2 I don’t believe this success is due to Katz’s ingenuity, but rather to the fact that he approached theory-building as the central activity in semantic inquiry, rather than as an arcane peripheral curiosity. Since theory-building is central, it can occur in tandem with the analysis of linguistic intuitions.

In the next chapter, Katz responds to criticisms from his contemporaries. I’m not sure how enlightening this is for modern audiences, so I might skip it. We’ll see…


  1. ^ This argument, of course, leads pretty quickly to a classic problem inherent in the notion of abstract objects: the problem of how abstract objects can interact with the physical world. We could, of course, get around this by denying that concepts and propositions are abstract, but then we need to explain how two different people could have the same thought at different times, in different places. I’m not sure which is the best choice and I’m not sure that linguistics (or any science) is up to the task of deciding between the two, so I’ll just proceed by going along with Katz’s realist attitude about abstract objects, with the caveat that it might be wrong—a kind of methodological Platonism.
  2. ^ Katz does not give definitions for presupposition or question-answer pairs here; more on that in later chapters.