A note on an equivocation in the UCLA Lectures

In his recent UCLA Lectures, Chomsky makes the following two suggestive remarks which seem to be contradictory:

. . . [I]magine the simplest case where you have a lexicon of one element and we have the operation internal Merge. [. . . ] You have one element: let’s just give it the name zero (0). We internally merge zero with itself. That gives us the set {0, 0}, which is just the set zero. Okay, we’ve now constructed a new element, the set zero, which we call one.


We want to say that [X], the workspace which is a set containing X is distinct from X.
[X] ≠ X
We don’t want to identify a singleton set with its member. If we did, the workspace itself would be accessible to MERGE. However, in the case of the elements produced by MERGE, we want to say the opposite.
{X} = X
We want to identify singleton sets with their members.


So in the case of arithmetic, a singleton set ({0}, one) is distinct from its member (0), but in the case of language the two are identical. This is either a contradiction, in which case we need to eliminate one of the statements, or it’s an equivocation, in which case we need to find and understand the source of the error. The former option would be expedient, but the latter is more interesting, so I’ll go with the latter.
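The arithmetic half of this can be checked mechanically. The following is only a sketch: it assumes Merge is plain set formation and uses Python's frozenset as a stand-in for the sets Merge builds. The names `zero`, `merge`, and `one` are mine, not Chomsky's.

```python
# Assumption: Merge(x, y) is modelled as forming the set {x, y}.
# The string "0" is an arbitrary stand-in for the single lexical item.
zero = "0"

def merge(x, y):
    """Set-formation Merge: return the (immutable) set {x, y}."""
    return frozenset({x, y})

# Internal Merge of zero with itself: {0, 0} collapses to {0}.
one = merge(zero, zero)

print(one == frozenset({zero}))  # the result is the singleton {0}
print(len(one))                  # it has exactly one member
print(one == zero)               # formally, {0} is not the same object as 0
```

In this model the singleton and its member never compare equal, which is the arithmetic-style, purely formal notion of distinctness.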

The source of the equivocation, in my estimation, is the notion of identity—Chomsky’s remarks become consistent when we take him to be using different measures of identity and, in order to understand these distinctions, we need to dust off a rarely used dichotomy—form vs substance.

This dichotomy is perhaps best known to syntacticians due to Chomsky’s distinction between “formal universals” and “substantive universals” in Aspects, where formal universals were constraints on the types of grammatical rules in the grammar and substantive universals were constraints on the types of grammatical objects in the grammar. Now, depending on what aspect of grammar or cognition we are concerned with, the terms “form” and “substance” will pick out different notions and relations, but since we’re dealing with syntax here we can say that “form” picks out purely structural notions and relations, such as are derived by merge, while “substance” picks out everything else.

By extension, then, two expressions are formally identical if they are derived by the same sequences of applications of merge. This is a rather expansive notion. Suppose we derive a structure from an arbitrary array A of symbols. Any structure whose derivation can be expressed by swapping the symbols in A for distinct symbols will be formally identical to the original structure. So, “The sincerity frightened the boy” and “*The boy frightened the sincerity” would be formally identical but, obviously, substantively distinct.
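One way to make formal identity concrete is to check for identity up to renaming of symbols. The sketch below rests on assumptions that are not part of the argument above: derived structures are written as nested tuples in a fixed order so they can be compared position by position (Merge itself yields unordered sets), "swapping for distinct symbols" is read as a one-to-one renaming, and the bracketing of the example sentences is just one plausible choice. The function name `formally_identical` is hypothetical.

```python
def formally_identical(a, b, fwd=None, bwd=None):
    """True if b can be obtained from a by a one-to-one renaming of symbols.

    Leaves are strings (symbols); non-leaves are tuples of sub-structures.
    """
    fwd = {} if fwd is None else fwd  # renaming from symbols of a to symbols of b
    bwd = {} if bwd is None else bwd  # inverse renaming, to keep it one-to-one
    a_leaf, b_leaf = isinstance(a, str), isinstance(b, str)
    if a_leaf or b_leaf:
        if not (a_leaf and b_leaf):
            return False  # a symbol cannot match a complex structure
        # Record the pairing, or fail if it clashes with an earlier one.
        return fwd.setdefault(a, b) == b and bwd.setdefault(b, a) == a
    return len(a) == len(b) and all(
        formally_identical(x, y, fwd, bwd) for x, y in zip(a, b)
    )

s1 = (("the", "sincerity"), ("frightened", ("the", "boy")))
s2 = (("the", "boy"), ("frightened", ("the", "sincerity")))
s3 = (("the", "boy"), "ran")

print(formally_identical(s1, s2))  # same shape, symbols swapped
print(formally_identical(s1, s3))  # different shape
```

The check looks only at shape and at the pattern of symbol repetition, never at what the symbols mean or how they are pronounced.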

Substantive identity, though, is more complex. If substance picks out everything except form, then it would pick out everything to do with the pronunciation and meaning of an expression. So, from the pronunciation side, a structurally ambiguous expression is a set of (partially) substantively identical but formally distinct sentences, as are paraphrases on the meaning side.

Turning back to the topic at hand, the distinction between a singleton set and its member is purely formal, and therein lies the resolution of the apparent contradiction. Arithmetic is purely formal, so it traffics in formal identity/distinctness. Note that Chomsky doesn’t suggest that zero is a particular object—it could be any object. Linguistic expressions, on the other hand, have form and substance. So a singleton set {LI} and its member LI are formally distinct but, since they would mean and be pronounced the same, are substantively identical.

It follows from this, I believe, that the narrow faculty of language, if it is also responsible for our faculty of arithmetic, must be purely formal, constructing expressions with no regard for their content. So, the application of merge cannot be contingent on the contents of its input, nor could an operation like Agree, which is sensitive to the substance of an expression, be part of that same faculty. These conclusions, incidentally, can also be drawn from the Strong Minimalist Thesis.


  • 💬 blogpost: Noam Chomsky and Benjamin Lee Whorf walk into a bar… – Omer Preminger

[…] his blog, Omer Preminger posted some comments on my comments on Chomsky’s UCLA Lectures, in which he argues that “committing oneself to the brand of […]

There is no proof that our natural ability of arithmetics (as opposed to formal mathematics) can actually distinguish singleton sets and their elements. Actually, there’s quite a strong feeling that the opposite is true. So this whole discussion can be reversed into “Merge-based faculty does not distinguish those (whether it operates on language or arithmetics)”.

Likewise, our natural arithmetics does seem to care what kinds of objects it conjoins (in particular, it requires them to share a property: we can join apple and banana to be [2 fruits] but not apple and jogging to be a 2 of anything (unless we speak of utterances “apple” and “jogging”, which are not the same) – basically a version of Agree-is-condition-on-Merge).

I think I mostly agree with you here, but I’m fairly sure what Chomsky is talking about in the relevant quotes is formal mathematics rather than what you refer to as “our natural ability of arithmetic.” Take the following quote from the UCLA Lectures:

“You might have small numbers, you might have what’s called numerosity (knowing that 80 things are more than 60 things), but that’s different from arithmetic (knowing that numbers can go on forever, knowing what the computations are of say addition, multiplication).” (p24)

So, the term “arithmetic,” as Chomsky uses it in his lectures and as I use it in this post, is about numbers in themselves, while what you call “our natural ability of arithmetic” seems to be more about our ability to use numbers to refer to quantities of things. I’d argue that, while the latter likely includes the former, it also includes other cognitive abilities.

Yeah, but we’re trying to discern the properties of Merge, judging by your last two paragraphs. And using our species’ potential capability to do formal mathematics (which is not a natural ability provided by Merge automatically, unlike “numerosity”) as argument against Agree-is-condition-on-Merge is invalid.

It’s definitely a very weak argument against Agree-is-condition-on-Merge, largely because it rests on a lot of conjecture. But I wouldn’t say it’s invalid, at least not for the reason that you highlight in your original comment.

The fact that “our natural arithmetics does seem to care what kinds of objects it conjoins” doesn’t require Agree at all. In Chomsky’s number theory, the derivation of a natural number n starts with a workspace WS containing a single object O and performs internal merge (MERGE(WS, O)) n times. Supposing we use the same procedure for counting objects, we could define the domain of that counting by what we choose for our base object O: for bananas, we use the root BANANA, for fruits in general we use FRUIT, and so on. So we can group objects together only if we have or can construct a concept that covers those objects. No need for Agree here.
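A rough sketch of that counting procedure, under simplifying assumptions of my own: Merge is modelled as set formation with Python's frozenset, the workspace is left implicit, and each step merges the object built so far with the base object. The names `count_up` and `depth` are hypothetical.

```python
def merge(x, y):
    """Set-formation Merge: return the set {x, y}."""
    return frozenset({x, y})

def count_up(base, n):
    """Merge the current object with `base` n times, starting from `base`."""
    obj = base
    for _ in range(n):
        obj = merge(obj, base)
    return obj

def depth(obj):
    """Recover n by counting the layers of sets around the base object."""
    n = 0
    while isinstance(obj, frozenset):
        # Peel one layer: follow the set-valued member, if there is one.
        inner = [m for m in obj if isinstance(m, frozenset)]
        obj = inner[0] if inner else None
        n += 1
    return n

three_bananas = count_up("BANANA", 3)
print(depth(three_bananas))                 # 3
print(three_bananas == count_up("FRUIT", 3))  # False: a different base object
```

The choice of base object fixes what is being counted, and nothing in the procedure inspects features of the objects, which is the sense in which no Agree-like operation is needed here.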

What’s more, this isn’t the sort of behaviour that we would even expect from Agree. In arithmetic, we want to ensure that we combine things that are, on some level, identical to each other. As it’s usually construed though, Agree ensures that we combine objects that are, on some level, complementary: Noun(-like) objects tend to merge with Verb(-like) objects, not other Noun(-like) objects, unvalued features Agree with valued features, and so on.

(I first thought it would be look-ahead to know what kinds of objects are conjoined but it isn’t.) But what limits us to using internal Merge there? If we have two dogs and an instance of jogging, why can’t we MERGE(MERGE(DOG, MERGE(DOG, WS(DOG))), JOGGING)? Or MERGE(MERGE(DOG, MERGE(JOGGING, WS(DOG))), DOG)?

A good counterargument, but… “unvalued features Agree with valued features” of the same type. You can’t Agree your [T:_] to [φ:3sg]. And that’s even if you don’t use the (unpleasant) machinery of feature checking, which does in some instantiations directly check for identity.


We can, but the results wouldn’t be natural numbers (or natural number analogs).
To say that Merge gives us arithmetic is to say that we can make a merge-based model of the Peano axioms. Chomsky sketched out a merge-based model of the first two axioms:

  1. ZERO is a natural number. (where ZERO is an arbitrary object)
  2. If x is a natural number, then S(x) is a natural number. (where S is MERGE(_, ZERO))

So, you can do those derivations with DOG interspersed with JOGGING, but they won’t count as natural numbers.
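To illustrate, here is a small recognizer for the two axioms above. It is a sketch under my own assumptions: Merge is set formation (Python's frozenset), S(x) is read as the set {x, ZERO}, and the workspace is ignored. The names `successor` and `is_natural` are hypothetical.

```python
def successor(x, zero):
    """S(x) = MERGE(x, ZERO), modelled as the set {x, ZERO}."""
    return frozenset({x, zero})

def is_natural(obj, zero):
    """True if obj is ZERO or is built from ZERO by successor steps only."""
    while obj != zero:
        # Every successor is a set that contains ZERO.
        if not isinstance(obj, frozenset) or zero not in obj:
            return False
        rest = obj - {zero}
        if len(rest) > 1:
            return False
        # {ZERO} is S(ZERO); otherwise step down to the predecessor.
        obj = next(iter(rest)) if rest else zero
    return True

one = successor("DOG", "DOG")
two = successor(one, "DOG")
mixed = frozenset({two, "JOGGING"})  # a derivation with JOGGING merged in

print(is_natural(two, "DOG"))      # True
print(is_natural(mixed, "DOG"))    # False
print(is_natural(two, "JOGGING"))  # False: not a number relative to this ZERO
```

The mixed object is a perfectly good output of Merge; it just falls outside the set picked out by the two axioms for a given choice of ZERO.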

An interesting corollary of this is that Merge actually would furnish us with multiple models of arithmetic (one for each possible ZERO), maybe even an infinity of such models (if complex objects can stand in for ZERO).

“that’s even if you don’t use the (unpleasant) machinery of feature checking, which does in some instantiations directly check for identity.”

Even feature-checking versions of Agree would require that interpretable features match with uninterpretable features (however you want to formalize those notions).