On the general character of semantic theory (Part b)

(AKA Katz’s Semantic Theory (Part IIIb). This post discusses the second half of chapter 2 of Jerrold Katz’s 1972 opus. For my discussion of the first half of the chapter, go here.

(Note: This post was written in fits and starts, which is likely reflected in its style (or lack thereof). My apologies in advance)

The first half of chapter 2 was concerned with the broader theory of language, rather than a semantic theory. In the second half of the chapter, Katz begins his sketch of the theory of semantics. It’s at this point that I pick up my review.

4. The structure of the theory of language

In this section, Katz discusses universals, which he frames, following Chomsky, as constraints on grammars. Katz differs from Chomsky, though, in how he divvies up the universals—whereas Chomsky, in Aspects, distinguishes between formal and substantive universals, Katz adds a third type: organizational universals. These classifications are defined as follows:

Formal universals constrain the form of the rules in a grammar; substantive universals provide a theoretical vocabulary from which the constructs used to formulate the rules of particular grammars are drawn; organizational universals, of which there are two subtypes, componential organizational universals and systematic organizational universals, specify the interrelations among the rules and among the systems of rules within a grammar.

p30-31

Furthermore, formal, substantive, and componential universals cross-classify with phonological, syntactic, and semantic universals. This means that we can talk about substantive phonological universals, or componential semantic universals, and so on. So, for example, a phonological theory consists in a specification of the formal, substantive, and componential universals at the phonological level, and such a specification amounts to a definition of the phonological component of the language faculty. Systematic universals, then, specify how the components of the grammar are related to each other. With this discussion, Katz sets up his goals: to specify the formal, substantive, and componential universals at the semantic level. More precisely, he aims to develop the following:

(2.7) A scheme for semantic representation consisting of a theoretical vocabulary from which semantic constructs required in the formulation of particular semantic interpretations can be drawn

p33

(2.8) A specification for the form of the dictionary and a specification of the form of the rules that project semantic representations for complex syntactic constituents from the dictionary’s representations of the senses of their minimal syntactic parts.

p33

(2.9) A specification of the form of the semantic component, of the relation between the dictionary and the projection rules, and of the manner in which these rules apply in assigning semantic representations

p3

These three aspects of semantic theory, according to Katz, represent the substantive, formal, and componential universals, respectively. A theory that contains (2.7)-(2.9), and answers questions 1-15 (as listed here) would count as an adequate semantic theory.

5. Semantic theory’s model of a semantic component

So, Katz asks rhetorically, how could it be that semantic relations, such as analyticity, synonymy, or semantic similarity, be captured in the purely formal terms required by (2.7)-(2.9)? The answer is simple: semantic relations and properties are merely formal aspects of compositional meanings of expressions. This is a bold and controversial claim: Semantic properties/relations are formal properties/relations or, to put it more strongly semantic properties/relations are, in fact, syntactic properties/relations (where “syntactic” is used is a very broad sense). Of course, this claim is theoretical and rather coarse. Katz aims to make it empirical and fine.

So, what does Katz’s semantic theory consist of? At the broadest level, it consists of a dictionary and a set of projection rules. No surprise yet; it’s a computational theory, and any computational system consists of symbols and rules. The dictionary contains entries for every morpheme in a given language, where each entry is a collection of the senses of that morpheme. Finally he defines two “technical terms.” The first is a reading which refers “a semantic representation of a sense of a morpheme, word, phrase, clause, or sentence and which is further divided into lexical readings and derived readings. The second term is semantic marker which refers to “the semantic representation of one or another of the concepts that appear as parts of senses.” Katz then continues, identifying the limiting case of semantic marker: primitive semantic markers.

Here it’s worth making a careful analogy to syntactic theory. Semantic markers, as their name suggests, are analogous to phrase markers. Each are representations of constituency: a phrase marker represents the syntactic constituents of an expression while a semantic marker represents the conceptual constituents of a concept. In each theory there are base cases of the markers: morphemes in syntactic theory and aptly named primitive semantic markers. I must stress, of course that this is only an analogy, not an isomorphism. Morphemes are not mapped to primitive semantic markers, and vice versa. Just as a simple morpheme can be phonologically complex, it can also be semantically complex. Furthermore, as we’ll see shortly, while complex semantic markers are structured, there is no reason to expect them to be structured according to the principles of syntactic theory.

Before Katz gets to the actual nitty-gritty of formalizing these notions, he pauses to discuss ontology. He’s a philosopher, after all. Semantic markers are representations of concepts and propositions, but what are concepts and propositions? Well, we can be sure of some things that they are not: images, mental ideas, and particular thoughts which Katz groups together as what calls cognitions. Cognitions, for Katz, are concrete, meaning they can be individuated by who has them, when and where they occur, and so on. If you and I have the same thought (e.g., “Toronto is the capital of Ontario”) then we had different cognitions. Concepts and propositions, for Katz, are abstract objects and, therefore, independent of space and time, meaning they can’t be individuated by their nonexistent spatiotemporal properties. They can, however, be individuated by natural languages, which Katz also takes to be abstract objects, and, in fact, are individuated easily by speakers of natural languages. Since, in a formulation echoed recently by Paul Pietroski (at around 5:45), “senses are concepts and propositions connected with phonetic (or orthographic) objects in natural languages” and the goal of linguistic theory is to construct grammars that model that connection, the question of concept- and proposition-individuation is best answered by linguistic theory.1

But, Katz’s critics might argue, individuation of concepts and propositions is not definition of “concept” or “proposition”. True, Katz responds, but so what? If we needed to explicitly define the object of our study before we started studying it, we wouldn’t have any science. He uses the example of Maxwell’s theory of electromagnetism which accurately models the behaviour and structural properties of electromagnetic waves but does not furnish any definition of electromagnetism. So if we can come up with a theory that accurately models the behaviour and structural properties of concepts and propositions, why should we demand a definition?

We also can’t expect a definition of “semantic marker” or “reading” right out of the gate. In fact, Katz argues, one of the goals of semantic theory (2.7) is to come up with those definitions and we can’t expect to have a complete theory in order to develop that theory. Nevertheless, we can use some basic intuitions to come up with a preliminary sketch of what a reading and a semantic marker might look like. For instance, the everyday word/concept “chair”, has a common sense, which is composed of subconcepts and can be represented as the set of semantic markers in (2.15).

(2.15) (Object), (Physical), (Non-living), (Artifact),
       (Furniture), (Portable), (Something with legs),
       (Something with a back), (Something with a seat),
       (Seat for one)

Of course, this is just preliminary. Katz identifies a number of places for improvement. Each of the semantic markers is likely decomposable into simple markers. Even the concept represented by “(Object)” is likely decomposable.

Or, Katz continues, we can propose that semantic markers are ways of making semantic generalizations. Katz notes that when we consider how “chair” relates to words such as “hat,” “planet,” “car,” and “molecule” compared to words such as “truth,” “thought,” “togetherness,” and “feeling.” Obviously, these words all denote distinct concepts, but just as obviously, the two groupings contrast with each other. We can think of the semantic marker “(Object)” as the distinguishing factor in these groupings: the former is a group of objects, the latter a group of non-objects. So, semantic markers, like phonological features and grammatical categories, are expressions of natural classes.

Finally, Katz proposes a third way of thinking of semantic markers: “as symbols that mark the components of senses of expressions on which inferences from sentences containing the expressions depend.” (p41) For instance we can infer (2.19) from (2.18), but we can’t infer (2.27).

(2.18) There is a chair in the room.

(2.19) There is a physical object in the room.

(2.27) There is a woman in the room.

We can express this inference pattern by saying that every semantic marker that comprises the sense of “physical object” in (2.19) is contained in the sense of “chair” in (2.18), but that is not the case for “woman” in (2.27). The sense of “woman” in (2.27) contains semantic markers like “(Female)” which are not contained in the sense of chair in (2.18). Here Katz notes that his proposal that concepts like “chair” consist of markers is merely an extension of an observation by Frege that (2.28a,b,c) are together equivalent to (2.29)

(2.28)
(a) 2 is a positive number
(b) 2 is a whole number
(c) 2 is less than 10

(2.29) 2 is a positive whole number less than 10

For Frege, “positive number”, “whole number”, and “less than 10” are all properties of “2” and marks of “positive whole number less than 10”. Katz’s extension is to say that the concepts associated with simple expressions can have their own marks.

Next, Katz discusses the notions of derived and lexical readings which are, in a sense, the inputs and outputs, respectively, of the process of semantic composition. As the name suggests, lexical readings are what is stored in the dictionary. When a syntactic object hits the semantic component of the grammar, the first step is to replace the terminal nodes with their lexical readings. Derived readings are generated by applying projection rules to the first level of non-terminal nodes, and then the next level, and so on until the syntactic object is exhausted.

The process of deriving readings, Katz asserts, must be restrictive in the sense that the interpretation of a sentence is never the every permutation of the lexical readings of its component parts. For instance, suppose the adjective “light” and the noun “book” have N and M senses in their respective lexical readings. If our process for deriving readings were unrestrictive, we would expect “light book” to have N×M senses while, in fact, fewer are available. We can see this even when we restrict ourselves to 2 senses for “light”—“low in physical weight”, and “inconsequential”—and 2 senses for “book”—“a bound collection of paper” and “a work of literature”. Restricting ourselves this much we can see that the “light book” is 2-ways ambiguous, describing a bound collection of papers with a low weight, or a work of literature whose content is inconsequential, and not a work of literature with a low weight or an inconsequential bound collection of papers. Our semantic theory, then, must be such that the compositional process it proposes can appropriately restrict the class of derived readings for a given syntactic object.

To ensure this restrictiveness, Katz proposes that the senses that make up a dictionary entry are each paired with a selectional restriction. To illustrate this, he considers the adjective “handsome” which has three senses: when applied to a person or artifact it has the sense “beautiful with dignity”; when applied applied to an amount, it has the sense “moderately large”; when applied to conduct, it has the sense “gracious or generous”. So, for Katz, the dictionary entry for “handsome” is as in (2.30).

(2.30) "handsome";[+Adj,…];(Physical),(Object),(Beautiful),
                           (Dignified in appearance),
                           <(Human),(Artifact)>
                           (Gracious),(Generous),<(Conduct)>
                           (Moderately large),<(Amount)>

Here the semantic markers in angle brackets represent the markers that must be present in the senses that “handsome” is applied to.

This solution to the problem of selection may seem stipulative and ad hoc—I know it seems that way to me—but recall that this is an early chapter in a book published in 1972. If we compared it to the theories of syntax and phonology of the time, they might appear similarly unsatisfying. The difference between Katz’s theory and syntactic and phonological theories contemporary to Katz’s theory is that syntactic and phonological theories have since developed into more formalized and hopefully explanatory theories through the collaborative effort of many researchers, while Katz’s theory never gained the traction required to spur that level of collaboration.

Katz closes out this section, with a discussion of “semantic redundancy rules” and projection rules. Rather than discuss these, I move on to the final section of the chapter.

6. Preliminary definitions of some semantic properties and relations

Here Katz shows the utility of the theory that he has thus far sketched. That is, he looks at how the semantic properties and relations identified in chapter 1 can be defined in the terms introduced in this chapter. These theoretical definitions are guided by our common sense definitions, but Katz is careful to stress that they are not determined by them. So, for instance, two things are similar when they share some feature(s). Translating this into his theory, Katz gives the definition in (2.33) for semantic similarity.

(2.33) A constituent Ci is semantically similar to a constituent Cj on a sense just in case there is a reading of Ci and a reading of Cj which have a semantic marker in common. (they can be said to semantic similar with respect to the concept φ in case the shared semantic marker represents φ)

Note that we can convert this definition into a scalar notion, so we can talk about degrees of similarity in terms of the number of shared markers. Katz does this implicitly by defining semantic distinctness as sharing no markers and synonymy as sharing all features.

Similarity is a rather simple notion, and therefore has a simple definition; others requires some complexity. For instance, analytic statements like “Liars lie” are vacuous assertions due to the fact that the the meaning of the subject is contained in the meaning of the predicate. Here, Katz gives the definition one might expect, but it is clear that more needs to be said, as the notions of subject and predicate are more difficult to define. More on this in later chapters.

A more puzzling and less often remarked upon semantic relation is antonymy—the relation that holds of the word pairs in (2.46) and of the set of words in (2.47)

(2.46) bride/groom, aunt/uncle, cow/bull, girl/boy, doe/buck

(2.47) child/cub/puppy/kitten/cygnet

Katz notes that although antonymy is generally taken to be merely lexical, it actually projects to larger expressions (e.g., “our beloved old cow”/”our beloved old bull”), and is targeted by words like “either” as demonstrated by the fact that (2.49a) is meaningful while (2.49c) is anomalous.

(2.49)
a. John is well and Mary’s not sick either.
c. John is well and Mary’s not {well/foolish/poor/dead}

In order for antonymy to be given an adequate theoretical definition, then, it must be expressed formally. Katz does this by marking semantic markers that represent antonymy sets with a superscript. For instance, “brother” and “sister” would be represented as (Sibling)(M) and (Sibling)(F), respectively. Again, this is clearly stipulative and ad hoc but that is to be expected at this stage of a theory. In fact, Katz seems to have been revising his theory up to his death, with the colour incompatibility problem—the question of why the sentence “The dot is green and red” is contradictory—occupying a the focus of a 1998 paper of his and a section of his posthumous book. Even Katz’s ad hoc solution to the problem, though, is miles ahead of any solution that could possibly be given in current formal semantics—which is bases its definition of meaning on reference—because, to my knowledge, there is no way to account for antonymy in formal semantics. Indeed, the mere fact, that Katz is able to give any theoretical definition of antonymy, puts his theory well ahead of formal semantics.

Conclusion

Katz’s rough sketch of a semantic theory is already fairly successful in that its able to provide concrete definitions of many of the semantic notions that he identifies in the first chapter.2 I don’t believe this success is due to Katz’s ingenuity, but rather to the fact that he approached theory-building as the central activity in semantic inquiry, rather than an arcane peripheral curiosity. Since the theory building is central, it can occur in tandem with analysis of linguistic intuition.

In the next chapter, Katz responds to criticisms from his contemporaries. I’m not sure how enlightening this is for modern audiences, so I might skip it. We’ll see…


  1. ^ This argument, of course, leads pretty quickly to a classic problem inherent in the notion of abstract objects: the problem of how abstract objects can interact with the physical world. We could, of course, get around this by denying that concepts and propositions are abstract but then we need to explain how two different people could have the same thought at different times, in different places. I’m not sure which is the best choice and I’m not sure that linguistics (or any science) is up to the task of deciding between the two, so I’ll just proceed by going along with Katz’s realistic attitude about abstract objects, with the caveat that it might be wrong—a kind of methodological Platonism.
  2. ^ Katz does not give definitions for presupposition or question-answer pairs here, more on that in later chapters.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s