Logic, from Classical Greek λόγος (logos), originally meaning ‘the word’ or ‘what is spoken’ (but coming to mean ‘thought’ or ‘reason’ or ‘an explanation’ or ‘a justification’ or ‘key’), is most often said to be the study of criteria for the evaluation of arguments, although the exact definition of logic is a matter of controversy among philosophers. A valid argument is one where there is a specific relation of logical support between the assumptions of the argument and its conclusion. The subject is grounded on valid and fallacious inference and the task of the logician is to distinguish good from bad arguments.

Traditionally, logic is studied as a branch of philosophy. Since the mid-1800s logic has also been commonly studied in mathematics, and, more recently, in set theory and computer science. As a science, logic investigates and classifies the structure of statements and arguments, both through the study of formal systems of inference, often expressed in symbolic or formal language, and through the study of arguments in natural language (a spoken language such as English, Italian, or Japanese). The scope of logic can therefore be very large, ranging from core topics such as the study of fallacies and paradoxes, to specialist analyses of reasoning such as probability, correct reasoning, and arguments involving causality.

The concept of form is central to discussions of the nature of logic. The validity of an argument is determined by its logical form, not by its content.

Informal logic is the study of natural language arguments. The study of fallacies is an important branch of informal logic. Since much informal argument is not strictly speaking deductive, on many conceptions of logic, informal logic is not logic at all.

Formal logic is the study of inference with purely formal content. It is the field of study in which we are concerned with the form or structure of the inferences rather than the content.

An inference possesses a purely formal content if it can be expressed as a particular application of a wholly abstract rule, that is a rule that is not about any particular thing or property. The works of Aristotle contain the earliest known formal study of logic.
The argument "If John was strangled he died. John was strangled. Therefore John died." is an example of the argument form or rule, "If P then Q. P is true. Therefore Q is true." (This is a valid argument form, known since the Middle Ages as modus ponens.) On many definitions of logic, logical inference and inference with purely formal content are the same thing.
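
The claim that modus ponens is a valid form can be checked mechanically. The following is a minimal sketch, assuming the usual truth-functional reading of "if ... then"; the helper names `implies` and `is_valid` are illustrative and do not come from the sources quoted here. A form counts as valid when no assignment of truth-values makes every premise true and the conclusion false.

```python
from itertools import product

def implies(p, q):
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q

def is_valid(premises, conclusion, n_vars):
    """Valid means: no assignment makes all premises true and the conclusion false."""
    for values in product([True, False], repeat=n_vars):
        if all(prem(*values) for prem in premises) and not conclusion(*values):
            return False
    return True

# If P then Q. P. Therefore Q.
modus_ponens = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: p],
    lambda p, q: q,
    2,
)

# If P then Q. Q. Therefore P. (a fallacious form, for contrast)
affirming_consequent = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: q],
    lambda p, q: p,
    2,
)

print(modus_ponens, affirming_consequent)  # True False
```

The check looks only at the pattern of P and Q, never at what the sentences are about, which is the sense in which validity depends on form rather than content.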

Symbolic logic is the study of abstractions, expressed in symbols, that capture the formal features of logical inference.
(Wikipedia and New World Encyclopedia)

Mathematical Logic and Philosophical Logic
Russell argues that logic has two branches: mathematical and philosophical (Our Knowledge, pp. 49-52; 67). Mathematical logic contains completely general and a priori axioms and theorems as well as definitions such as the definition of number and the techniques of construction used, for example, in his theory of descriptions. Philosophical logic, which Russell sometimes simply calls logic, consists of the study of forms of propositions and the facts corresponding to them. The term 'philosophical logic' does not mean merely a study of grammar or a meta-level study of a logical language; rather, Russell has in mind the metaphysical and ontological examination of what there is. (Internet Encyclopedia of Philosophy: Russell's Metaphysics)

*Foundation of Mathematics → Logicism (Frege naive set theory → Russell's Paradox → Russell's Theory of Types, Zermelo–Fraenkel set theory) → Philosophy of Mathematics
*Symbolic Logic → Philosophy of Logic (Frege's Philosophy of Logic, Russell's Theory of Description, ......)
*Symbolic Logic → Philosophy of Language (Frege's Notions of Sense and Reference, Russell's Logical Atomism, Ideal Language, Structuralism, Critical Philosophy ......) → Analytic Philosophy
*Analytic Philosophy (Philosophy of Logic and Language) → Metaphysics (Frege's Notions of Objects and Concepts, Russell's Realism, Logical Positivism, Scientism, Quine's Naturalized Epistemology and Holism ......)
*Metaphysical Philosophy of Language + Critical Philosophy → Post-Structuralism, Deconstructionism ...... → Postmodernism



From the time of classical Greece, logic has been recognized as a fundamental and important element of philosophy because nearly all of us, ordinary persons and scholars alike, engage in the process of reasoning about all sorts of topics.
In legal trials, such reasoning is key in determining the guilt or innocence of a person charged with a crime. The matter is ultimately determined by logical reasoning: by finding supporting or disconfirming grounds for accepting or rejecting the charge.
There is clearly a difference between good and bad reasoning, as bad reasoning can lead to invalid results, and in a legal contest this may be a matter of life and death.

The question of what constitutes good reasoning is thus a fundamental issue in daily life as well as in more sophisticated and technical fields such as science and mathematics. But logic is particularly important for philosophy, since as a discipline it depends entirely upon reasoning. Philosophical activity is essentially the application of reasoning to a wide variety of topics: the moral life, our knowledge of others, reflections about the nature of the mind, and so on.

Logic, then, can be defined as the philosophical study of what counts as sound reasoning. This should not be construed as describing human psychology – that is, how persons in fact reason – but how they ought to reason in order to avoid mistakes. Logic is thus a normative rather than a descriptive science. Sound thought, according to logicians, is determined by certain rules that ensure that correct reasoning will never lead from true premises to a false conclusion. The entire, sophisticated corpus of modern logic rests upon this principle, and it is this maxim that ties ancient logic to its current, more sophisticated forms.
In the present century, logic is enormously important in all technical fields, from the operations of computers to analyzing complex weather patterns.

Scholastic logic, which began with Aristotle, was greatly refined in the Middle Ages, and shortly thereafter reached the point of completion described by Kant. It was an inferential system, designed to draw valid conclusions from premises. It was not an axiomatic system of the sort that Whitehead and Russell developed but instead consisted of a large number of ad hoc rules that allowed its users to distinguish valid from invalid reasoning. These rules apply to the only type of argumentation that the system recognized: the syllogism. Here are three examples of syllogisms:

1. No Americans are Italians.
    All Californians are Americans.
    Therefore, no Californians are Italians.
2. Some catalogues are not interesting.
    All catalogues are informative.
    Therefore, some informative things are not interesting.
3. All women are polite.
    No wrestlers are polite.
    Therefore, no wrestlers are women.

As the examples bring out:
*A syllogism is a line of reasoning (argumentation) that consists of three sentences (two premises and a conclusion).
*It contains exactly three “terms”. In example 1 above, the terms are “Americans”, “Californians”, and “Italians”.
*Each sentence begins with a general word, a so-called quantifier (all, some, no).
*The rules of the syllogism generate only four different types of sentences that the system can deal with.
These are two “universal” sentences and two “particular” sentences:
    Two “universal” sentences (for example, “All people are mortal” and “No people are mortal”)
    Two “particular” sentences (for example, “Some people are mortal” and “Some people are not mortal”)
*One of the ad hoc features of the system is that it treats a sentence containing a proper name, such as “Socrates is mortal”, as a universal sentence. The contention is that mortality is being predicated of all of Socrates and therefore “Socrates is mortal” can be treated in the same way as “All men are mortal”, which predicates mortality of the whole class of men.
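
The three example syllogisms can be tested by brute force. The sketch below is illustrative only: it assumes the modern reading in which "All S are P" does not imply that any S exists, and the names `KINDS`, `all_are`, `no_are`, `some_are`, `some_not`, and `valid` are hypothetical. Since a syllogism has only three terms, an individual can be of at most eight kinds (inside or outside each term), so it is enough to try every combination of kinds that might be instantiated.

```python
from itertools import product, combinations

# Each kind of individual is a triple: does it fall under term 0, term 1, term 2?
KINDS = list(product([False, True], repeat=3))

# The four sentence types of the syllogistic, as tests on a model
# (a model is the collection of kinds that actually have instances).
def all_are(s, p):
    return lambda model: all(k[p] for k in model if k[s])

def no_are(s, p):
    return lambda model: not any(k[s] and k[p] for k in model)

def some_are(s, p):
    return lambda model: any(k[s] and k[p] for k in model)

def some_not(s, p):
    return lambda model: any(k[s] and not k[p] for k in model)

def models():
    """Every subset of the eight kinds (256 models in total)."""
    for r in range(len(KINDS) + 1):
        yield from combinations(KINDS, r)

def valid(premise1, premise2, conclusion):
    """Valid if no model makes both premises true and the conclusion false."""
    return not any(
        premise1(m) and premise2(m) and not conclusion(m) for m in models()
    )

# Example 1: 0 = Americans, 1 = Californians, 2 = Italians
ex1 = valid(no_are(0, 2), all_are(1, 0), no_are(1, 2))
# Example 2: 0 = catalogues, 1 = interesting things, 2 = informative things
ex2 = valid(some_not(0, 1), all_are(0, 2), some_not(2, 1))
# Example 3: 0 = women, 1 = polite people, 2 = wrestlers
ex3 = valid(all_are(0, 1), no_are(2, 1), no_are(2, 0))
# A fallacious pattern for contrast: All A are B. All C are B. So all A are C.
bad = valid(all_are(0, 1), all_are(2, 1), all_are(0, 2))

print(ex1, ex2, ex3, bad)  # True True True False
```

This treats the syllogism as a decision problem over a small finite space, which is not how the Scholastic rules worked; it is offered only as a way to confirm the three examples.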

Scholastic logic identified a number of valid argument forms, but its scope was very limited. From a modern standpoint, the system suffered from serious defects: an incapacity to deal with lengthy arguments that contain more than two premises, a lack of sensitivity to the vast range of sentences found in ordinary discourse, and an inability to distinguish and classify the logical elements in language, such as subjects, predicates, quantifiers, sentential connectives, anaphoric relationships, and variables. To illustrate: Scholastic logic recognizes only four types of sentences, each of them preceded by a quantifier. Now, natural languages, such as English and French, are composed of a large variety of different sorts of sentences, such as “If it is raining, then the streets are wet”, “Smith and Jones were acquainted”, “The head of a horse is the head of an animal”, “Each member of the platoon is a member of the company”, and so forth. Scholastic logic cannot deal with these obvious linguistic differences. Take “The head of a horse is the head of an animal”, for instance. There is no straightforward way of rendering this as a standard, Scholastic universal sentence. As a result, Scholastic logic either ignores sentences of this form or abandons any pretense of formality in trying to interpret such a sentence as (say) a universal sentence. The result is that valid arguments using such sentences cannot be accommodated by the Scholastic system. Such an argument as “The horse is an animal; therefore, the head of a horse is the head of an animal” cannot be rendered as a canonical syllogism. Similar comments apply to “If A is heavier than B, and B is heavier than C, then A is heavier than C”. Scholastic logic is thus enormously restricted in its power to reproduce the kind of reasoning one finds in everyday life.


At the beginning of the nineteenth century, Immanuel Kant announced that logic was a complete and finished subject and that nothing could be added to it. Less than fifty years later, ideas put forth by Augustus De Morgan and George Boole anticipated the development of a new, non-scholastic logic, closely connected with mathematics. It was in this period that logic as a species of normative reasoning was differentiated sharply from reasoning as studied by psychologists.

In his Begriffsschrift (1879), the German mathematician Frege carried these ideas even further and invented what is now regarded as mathematical or symbolic logic. His achievement has led some scholars to describe him as the greatest logician since Aristotle. Unfortunately, because of its difficult notation, his system was not understood by the broader philosophical community.
Working independently of Frege, Alfred North Whitehead and Bertrand Russell created another version of this kind of logic. In Principia Mathematica (1910-1913), they utilized an easily readable notation invented by Giuseppe Peano that led to the widespread dissemination of the new logic. Their system became the main symbolic tradition until Frege’s neglected writings were rediscovered after the Second World War. (Carnap’s Meaning and Necessity of 1947 introduced Frege to younger logicians and philosophers of language.) As general systems, both have been superseded, but certain parts of each – especially their respective versions of the theory of descriptions – are still widely accepted today. Because of its earlier canonical status, we shall focus on Whitehead and Russell’s system.

Whereas Frege’s symbolization consisted mainly of specially created ideographs, the Whitehead/Russell scheme – though it used some symbolic tokens, such as the horseshoe for implication – mostly employed common symbols of punctuation such as brackets, periods, colons, exclamation marks, and letters of the English and Greek alphabets. It thus had two advantages: It could be learned quickly, and it allowed for the perspicuous discrimination of key logical units.

Four of these units were especially important. They are called connectives, quantifiers, predicates, and constants. They allow for the construction of various types of propositions, from the most simple to the most complicated. Each is given a different symbolic representation. Let us consider them seriatim.

1) Connectives: Their English equivalents are “or”, “and”, “not”, “implies” (i.e., “if … then”), “equals”, and “is equivalent to”. The symbolic representation for “or” is v; for “and” is . ; for “not” is ~; and for “implies” is ⊃. Connectives are used to form complex sentences or to modify sentences in various ways. The sentence “John will go or Jane will go” is represented symbolically by (p v q), and the sentence “Jane will not go” is represented by (~q).

2) Quantifiers: These are symbols for generality. Their English equivalents are “all”, “no”, “none”, “at least one”, “some”, “there exists”, and “there is”. “All” is represented by (x) and “some” by a special symbol, ∃. The sentence “All dogs are white” is represented by (x) (Dx ⊃ Wx). The sentence “Some dogs are white” is represented by (∃x) (Dx . Wx). It should be noted that universal affirmative sentences are treated as hypotheticals. Thus, “All giants are tall”, is construed to mean “If anything is a giant, then it is tall”. This interpretation differs from that of Scholastic logic, since it does not imply that the subject term has an existing referent.
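
The hypothetical reading of universal sentences can be illustrated over a small finite domain. The following is a sketch under stated assumptions: the two-member `domain`, its property names, and the helpers `for_all` and `implies` are all made up for illustration. It shows that “All giants are tall” comes out true when the domain contains no giants, which is exactly the sense in which the subject term need not have an existing referent.

```python
# A hypothetical two-member domain; each individual is a record of properties.
domain = [
    {"name": "Rex", "dog": True, "white": True, "giant": False, "tall": False},
    {"name": "Tom", "dog": False, "white": False, "giant": False, "tall": True},
]

def implies(p, q):
    """The horseshoe: false only when p is true and q is false."""
    return (not p) or q

def for_all(domain, condition):
    """(x)(...): the condition holds of every individual in the domain."""
    return all(condition(x) for x in domain)

# (x)(Dx ⊃ Wx): "All dogs are white"
all_dogs_white = for_all(domain, lambda x: implies(x["dog"], x["white"]))

# (x)(Gx ⊃ Tx): "All giants are tall", true here although nothing is a giant
all_giants_tall = for_all(domain, lambda x: implies(x["giant"], x["tall"]))
giants_exist = any(x["giant"] for x in domain)

print(all_dogs_white, all_giants_tall, giants_exist)  # True True False
```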

3) Predicates: Their English equivalents are common nouns and adjectives. These are designated by Greek letters, such as φ and ψ. Thus, the word “tall” would be represented in a sentence by φ. The sentence “Some priests are tall” is represented by (∃x) (Px . Tx).

4) Constants: Their English equivalents are proper names such as Clinton and Jones. They are designated by lowercase English letters such as “a” and “b”. Thus, the sentence “Clinton is tall”, which predicates tallness of Clinton, is represented by (φc).

Various types of propositions (such as axioms, theorems, and so on) are constructed from these basic units. The particular symbolic expression for a proposition will depend on what kind of proposition it is. For designating sentences belonging to the propositional calculus, the letters “p”, “q”, “r”, “s”, and so forth are used. Thus, “If John is tall, he will qualify for the army”, is expressed as (p ⊃ q) and “Clinton and Dole are acquainted” is represented by (cAd).

The authors of Principia Mathematica had two important aims. The first was to show that mathematics was a branch of logic – that is, that all mathematical propositions could be reduced to propositions containing only logical concepts such as constants, quantifiers, variables, and predicates. This was called the logistic thesis [Note: Logicism]. Their second goal was to show that mathematical logic could capture, in a purely formal notation, the large variety of idioms, including different types of sentences, that are found in ordinary discourse. In doing this, they also wished to show how vague expressions could be made more precise and how ambiguous sentences could be clarified in such a way as to expose clearly the basis for their ambiguity. The latter purpose was brilliantly realized in their theory of descriptions, which diagnosed important ambiguities in sentences whose subject terms lacked a referent. Their achievements here led directly to the notion that formal logic was an ideal language.
According to Russell and Whitehead, formal logic is at least as powerful as ordinary language and lacks the disadvantages found in natural languages. Frege, in fact, had a similar aim and spoke explicitly about developing an ideal language. But unlike Russell and Whitehead, who saw formal logic as an extension and perfection of ordinary speech, Frege believed that, despite certain overlaps, there was a basic incompatibility between the two and that for scientific purposes ordinary language should be avoided. For Russell and Whitehead, the development of an ideal language and the attempt to prove the logistic thesis were compatible; in pursuing the former they believed they were at the same time pursuing the latter.


The most important event in the history of philosophy in the nineteenth century was the invention of mathematical logic. This was not only a refoundation of the science of logic itself, but had important consequences for the philosophy of mathematics, the philosophy of language, and ultimately for philosophers’ understanding of the nature of philosophy itself.

The principal founder of mathematical logic was Gottlob Frege (1848-1925) ….

Frege’s productive career began in 1879 with the publication of a pamphlet with the title Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens [“Concept-Script: A Formal Language for Pure Thought Modeled on that of Arithmetic”].

The Begriffsschrift [“Concept-Script”] was a new symbolism designed to bring out with clarity logical relationships which were concealed in ordinary language.
Frege’s own script, which was logically elegant but typographically cumbersome, is no longer used in symbolic logic; but the calculus which it formulated has ever since formed the basis of modern logic.

Instead of the Aristotelian syllogistic, Frege placed at the front of logic the propositional calculus first explored by the Stoics: that is to say, the branch of logic that deals with those inferences which depend on the force of negation, conjunction, disjunction etc., when applied to sentences as wholes. Its fundamental principle – which again goes back to the Stoics – is to treat the truth-value (i.e. the truth or falsehood) of sentences which contain connectives such as ‘and’, ‘if’, ‘or’, as being determined solely by the truth-values of the component sentences which are linked by the connectives – in the way in which the truth-value of ‘John is fat and Mary is slim’ depends on the truth-values of ‘John is fat’ and ‘Mary is slim’.
Composite sentences, in the logicians’ technical term, are treated as truth-functions of the simple sentences of which they are put together.
Frege’s Begriffsschrift contains the first systematic formulation of the propositional calculus; it is presented in an axiomatic manner in which all laws of logic are derived, by specified rules of inference, from a number of primitive principles.
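
The idea of a truth-function can be made concrete with a small table. This is an illustrative sketch only; the dictionary `CONNECTIVES` and the function `truth_table` are hypothetical names, and ‘if’ is given its standard truth-functional reading. Each connective is modelled as a function from the truth-values of the two component sentences to the truth-value of the composite sentence.

```python
from itertools import product

# Each connective maps the truth-values of two component sentences
# to the truth-value of the composite sentence.
CONNECTIVES = {
    "and": lambda p, q: p and q,
    "or": lambda p, q: p or q,
    "if": lambda p, q: (not p) or q,  # 'if p then q'
}

def truth_table(name):
    """Return {(p, q): value} for the named connective."""
    connective = CONNECTIVES[name]
    return {(p, q): connective(p, q) for p, q in product([True, False], repeat=2)}

# 'John is fat and Mary is slim' is true only when both components are true.
for row, value in truth_table("and").items():
    print(row, value)
```

Nothing about John or Mary enters the table: the value of the whole is fixed by the values of the parts.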

Frege’s greatest contribution to logic was the invention of quantification theory: a method of symbolizing and rigorously displaying those inferences that depend for their validity on expressions such as ‘all’ or ‘some’, ‘any’ or ‘every’, ‘no’ or ‘none’. This new method enabled him, among other things, to reformulate traditional syllogistic.

There is an analogy between the inference
          All men are mortal.
          Socrates is a man.
          So, Socrates is mortal.
and the inference
          If Socrates is a man, then Socrates is mortal.
          Socrates is a man.
          So, Socrates is mortal.

The second inference is a valid inference in the propositional calculus
          if p then q; but p, therefore q.
But it cannot be regarded as a translation of the first, since its first premiss seems to state something about Socrates in particular, whereas if ‘All men are mortal’ is true, then
          If x is a man, then x is mortal.
will be true no matter whose name is substituted for the variable ‘x’. Indeed, it will remain true even if we substitute the name of a non-man for x, since in that case the antecedent will be false, and the whole sentence, in accordance with the truth-functional rules for ‘if’-sentences, will turn out true.
So we can express the traditional proposition:
          All men are mortal.
in this way:
          For all x, if x is a man, x is mortal.

This reformulation forms the basis of Frege’s quantification theory: to see how, we have to explain how he conceived each of the items which go together to make up the complex sentence.

Frege introduced into logic the terminology of algebra.
An algebraic expression such as ‘x/2+1’ may be said to represent a function of x: the value of the number represented by the whole expression will depend on what we substitute for the variable ‘x’, or, in the technical term, what we take as the argument of the function.
Thus 3 will be the value of the function for the argument 4; and 4 will be the value of the function for the argument 6.
Frege applied the terminology of argument, function, and value to expressions of ordinary language as well as to expressions in mathematical notation.
He replaced the grammatical notions of subject and predicate with the mathematical notions of argument and function, and he introduced truth-values as well as numbers as possible values for expressions.
Thus ‘x is a man’ represents a function which for the argument Socrates takes the value true, and for the argument Venus takes the value false.
The expression ‘for all x’, which introduces the sentence above, says that what follows (‘if x is a man, x is mortal’) is a function which is true for every argument. Such an expression is called a quantifier.
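
The parallel between an algebraic function and a concept such as ‘x is a man’ can be sketched in code. The names `half_plus_one`, `MEN`, and `is_a_man` are illustrative assumptions, and the small set `MEN` merely stands in for the facts about who is a man.

```python
def half_plus_one(x):
    """The function represented by 'x/2+1'."""
    return x / 2 + 1

# Hypothetical stand-in for the facts about who is a man.
MEN = {"Socrates", "Plato"}

def is_a_man(x):
    """The function represented by 'x is a man': its values are truth-values."""
    return x in MEN

print(half_plus_one(4), half_plus_one(6))        # 3.0 4.0
print(is_a_man("Socrates"), is_a_man("Venus"))   # True False
```

Both are functions in the same sense: each takes an argument and yields a value, a number in one case and a truth-value in the other.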

Besides ‘for all x’, the universal quantifier, there is also the particular quantifier ‘for some x’ which says that what follows is true for at least one argument.
Thus ‘some swans are black’ can be represented in a Fregean dialect as ‘For some x, x is a swan and x is black’.
This sentence can be taken as equivalent to ‘there are such things as black swans’; and indeed Frege makes general use of the particular quantifier in order to represent existence. Thus, ‘God exists’ or ‘there is a God’ is represented in his system as ‘For some x, x is God’.

Using his novel notation for quantification, Frege was able to represent a calculus which formalized the theory of inference in a way more rigorous and more general than the traditional Aristotelian syllogistic logic. After Frege, for the first time, formal logic could handle arguments which involved sentences with multiple quantification, sentences which are as it were quantified at both ends, such as ‘Nobody knows everybody’ and ‘any schoolchild can master any language’.
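
Sentences quantified at both ends can be evaluated over a finite domain by nesting the two quantifiers. The sketch below is illustrative: the three people and the `knows` relation are invented, and the function names are hypothetical. It reads ‘Nobody knows everybody’ as ‘it is not the case that for some x, for all y, x knows y’, and contrasts it with a sentence in which the quantifiers come in the opposite order.

```python
people = ["Ann", "Bob", "Cy"]

# A made-up relation: (x, y) in knows means x knows y.
knows = {
    ("Ann", "Ann"), ("Ann", "Bob"),
    ("Bob", "Bob"), ("Bob", "Cy"),
    ("Cy", "Cy"), ("Cy", "Ann"),
}

def somebody_knows_everybody(people, knows):
    """For some x, for all y: x knows y."""
    return any(all((x, y) in knows for y in people) for x in people)

def nobody_knows_everybody(people, knows):
    """Not (for some x, for all y: x knows y)."""
    return not somebody_knows_everybody(people, knows)

def everybody_is_known_by_somebody(people, knows):
    """For all y, for some x: x knows y (the quantifiers in the other order)."""
    return all(any((x, y) in knows for x in people) for y in people)

print(nobody_knows_everybody(people, knows))          # True
print(everybody_is_known_by_somebody(people, knows))  # True
```

In this domain everybody is known by somebody, yet nobody knows everybody, so the order of the quantifiers makes a difference that the four sentence types of the syllogistic cannot express.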


In the Begriffsschrift ["Concept-Script"] and its sequels Frege was not interested in logic for its own sake. His motive in constructing the new concept script was to assist him in the philosophy of mathematics. The question which above all he wanted to answer was this: Do proofs in arithmetic rest on pure logic (general laws operative in every sphere of knowledge), or do they need support from empirical facts? The answer he gave was that arithmetic itself could be shown to be a branch of logic in the sense that it could be formalized without the use of any non-logical notions or axioms. It was in Die Grundlagen der Arithmetik: eine logisch mathematische Untersuchung über den Begriff der Zahl (1884) that Frege first set out to establish this thesis, which is known by the name “logicism”.

The Grundlagen begins with an attack on the ideas of Frege’s predecessors (including Kant) and contemporaries (including Mill) on the nature of numbers and mathematical truth. Kant had maintained that the truths of mathematics were synthetic a priori, and that our knowledge of them depended on intuition. Mill, on the contrary, saw mathematical truths as a posteriori, empirical generalizations widely applicable and widely confirmed. Frege maintained that the truths of arithmetic were not synthetic at all, neither a priori nor a posteriori. Unlike geometry – which, he agreed with Kant, rests on a priori intuition – arithmetic was analytic, that is to say, it could be defined in purely logical terms and proved from purely logical principles.

The arithmetical notion of number in Frege’s system is replaced by the logical notion of ‘class’: the cardinal numbers can be defined as classes of classes with the same number of members; thus the number two is the class of pairs, and the number three the class of trios.
          [Despite appearances, this definition is not circular, because we can say what is meant by two classes having the same number of members without making use of the notion of number: thus, for instance, a waiter may know that there are as many knives as there are plates on a table without knowing how many of each there are, if he observes that there is just one knife to the right of each plate.]
Two classes have the same number of members if they can be mapped one-to-one on to each other; such classes are known as equivalent classes. A number, then, will be a class of equivalent classes.
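
The waiter's procedure can be written out as a pairing routine that never counts anything. This is a minimal sketch, assuming finite collections; the function name `equinumerous` is hypothetical. It removes one item from each collection at a time and only asks, at the end, whether either collection has something left over.

```python
def equinumerous(xs, ys):
    """Pair items off one-to-one; no number is ever computed or compared."""
    xs, ys = list(xs), list(ys)
    while xs and ys:
        xs.pop()  # take one knife ...
        ys.pop()  # ... and the plate it sits beside
    # Equivalent classes leave nothing unpaired on either side.
    return not xs and not ys

knives = ["k1", "k2", "k3"]
plates = ["p1", "p2", "p3"]
print(equinumerous(knives, plates))           # True
print(equinumerous(knives, plates + ["p4"]))  # False
```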

Thus, we could define four as the class of all classes equivalent to the class of gospel-makers. But such a definition would be useless for purposes of reducing arithmetic to logic, since the fact that there were four gospel-makers is no part of logic. If Frege’s programme is to succeed, he has to find, for each number, not only a class of the right size, but a class whose size is guaranteed by logic.

What he did was to begin with zero. Zero can be defined in purely logical terms as the class of all classes equivalent to the class of objects which are not identical with themselves. Since there are no objects which are not identical with themselves, that class has no members; and since classes which have the same members are the same classes, there is only one class which has no members, the null-class, as it is called. The fact that there is only one null-class is used in proceeding to the definition of the number one, which is defined as the class of classes equivalent to the class of null-classes. Two can then be defined as the class of classes equivalent to the class whose members are zero and one, three as the class of classes equivalent to the class whose members are zero and one and two, and so on ad infinitum.
Thus the series of natural numbers is to be built up out of the purely logical notions of identity, class, class-membership, and class-equivalence.
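
The step-by-step construction can be imitated with nested sets, with one important simplification that is an assumption of this sketch and not part of Frege's account: each number is represented here by a single class of the right size (the class of all the numbers built so far), not by the class of all classes equivalent to it, which no program could enumerate. The function name `build_representatives` is hypothetical.

```python
def build_representatives(n):
    """Return representative classes for 0..n, each built only from earlier ones."""
    reps = []
    for _ in range(n + 1):
        # The next representative is the class whose members are all
        # the representatives constructed so far.
        reps.append(frozenset(reps))
    return reps

reps = build_representatives(3)
null_class = reps[0]

print(null_class == frozenset())     # True: the class with no members
print(reps[1] == frozenset({null_class}))  # True: its one member is the null-class
print([len(r) for r in reps])        # [0, 1, 2, 3]
```

Each representative has exactly as many members as the number it stands for, and that size is settled by the construction itself, not by any empirical fact such as how many gospel-makers there were.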

In the Grundlagen there are two theses to which Frege attaches great importance. One is that each individual number is a self-subsistent object; the other is that the content of a statement assigning a number is an assertion about a concept. …
In saying that a number is an object, Frege is not suggesting that a number is something tangible like a tree or a table; rather, he is denying that number is a property belonging to anything. In saying that a number is a self-subsistent object he is denying that it is anything subjective, any mental item, or any property of a mental item.
Concepts are, for Frege, Platonic, mind-independent entities.
So there is no contradiction between the thesis that numbers are objective, and the thesis that number-statements are statements about concepts. Frege illustrates this latter thesis with two examples.
          If I say ‘Venus has 0 moons’, there simply does not exist any moon or agglomeration of moons for anything to be asserted of; but what happens is that a property is assigned to the concept ‘moon of Venus’, namely that of including nothing under it. If I say ‘the king’s carriage is drawn by four horses’, then I assign the number four to the concept ‘horse that draws the king’s carriage’.

Statements of existence, Frege says, are a particular case of number statements [Note: a concept].
‘Affirmation of existence’, he says, ‘is in fact nothing but denial of the number nought’. What he means is that a sentence such as ‘Angels exist’ is an assertion that the concept angel has something falling under it; that is to say that the number which belongs to that concept is something other than zero.

It is because existence is a property of concepts (Note: a number statement), Frege says, that the ontological argument for the existence of God breaks down. That-there-is-a-God cannot be a property of God; if there is in fact a God, that is a property of the concept God.

According to Frege, a number is the extension of a concept. The number which belongs to the concept F is the extension of the concept ‘like-numbered to the concept F’. This is equivalent to saying that it is the class of all classes which have the same number of members as the class of Fs, as was explained above. So Frege’s theory that numbers are objects depends on the possibility of taking classes as objects.



In the Begriffsschrift and the Grundlagen Frege not only founded modern logic, but also founded the modern philosophical discipline of philosophy of logic. He did so by making a clear distinction 1) between logic and psychology, and 2) between logic and epistemology. On the other hand, there is less distinction in his work between logic and metaphysics: indeed the two are closely related.

Corresponding to the distinction between functions and arguments, Frege maintained, a systematic distinction must be made between concepts and objects, which are their ontological counterparts.
Objects are what proper names stand for: there are objects of many kinds, ranging from human beings to numbers.
Concepts are items which have a fundamental incompleteness, corresponding to the gappiness in a function which is marked by its variable.
Where other philosophers talk ambiguously of the meaning of an expression, Frege introduced a distinction between the reference of an expression (the object to which it refers, as the planet Venus is the reference of ‘The Morning Star’) and the sense of an expression. (‘The Evening Star’ differs in sense from ‘The Morning Star’, though both expressions, as astronomers discovered, refer to Venus.) Frege maintained that the reference of a sentence was its truth-value, and held that in a scientifically respectable language every term must have a reference and every sentence must be either true or false. Many philosophers since have adopted his distinction between sense and reference, but most have rejected the notion that complete sentences have a reference of any kind.

The climax of Frege’s career as a philosopher should have been the publication of the two volumes of Die Grundgesetze der Arithmetik (1893 and 1903), in which he set out to present in a rigorous formal manner the logicist construction of arithmetic on the basis of pure logic and set theory. This work was to execute the task which had been sketched in the earlier books on the philosophy of mathematics: it was to enunciate a set of axioms which would be recognizably truths of logic, propound a set of undoubtedly sound rules of inference, and then present, one by one, derivations by these rules from these axioms of the standard truths of arithmetic.

The magnificent project aborted before it was ever completed. The first volume was published in 1893. By the time that the second volume appeared in 1903, it had been discovered that Frege’s ingenious method of building up the series of natural numbers out of merely logical notions contains a fatal flaw. The discovery was due to the English philosopher Bertrand Russell.

Russell wrote to Frege with news of his paradox on June 16, 1902. The paradox was of significance to Frege’s logical work since, in effect, it showed that the axioms Frege was using to formalize his logic were inconsistent. Specifically, Frege’s Axiom V requires that an expression such as φ(x) be considered both a function of the argument x and a function of the argument φ.
[More precisely, Frege’s Basic Law V states that the course-of-values of a concept f is identical to the course-of-values of a concept g if and only if f and g agree on the value of every argument, i.e., if and only if for every object x, f(x) = g(x). -- See The Stanford Encyclopedia of Philosophy: Gottlob Frege: section 2.4.1 for more discussion.]
In effect, it was this ambiguity that allowed Russell to construct a class R (the class of all classes that are not members of themselves) in such a way that it could both be and not be a member of itself.


Earlier, Russell accepted a British version of Hegelian idealism. Later, in conjunction with G.E. Moore, he abandoned idealism for the extreme realist philosophy which included a Platonist view of mathematics. It was in the course of writing a book to expound this philosophy that Russell encountered Frege’s ideas, and when the book was published in 1903 as The Principles of Mathematics it included an account of them. Much as Russell admired Frege’s writings, he detected a radical defect in his system, which he pointed out to him just as the second volume of the Grundgesetze was in press.

If we are to proceed from number to number in the way Frege proposes we must be able to form without restriction classes of classes, and classes of classes of classes, and so on. Now can a class be a member of itself? Most classes are not (e.g. the class of dogs is not a dog) but some apparently are (e.g. the class of classes is surely a class). It seems therefore that classes can be divided into two kinds: there is the class of classes that are members of themselves, and the class of classes that are not members of themselves. [Note: Here is an example: the set of all squares versus the set of all non-squares. The set of all squares is not itself a square, and therefore is not a member of the set of all squares. On the other hand, the set of all non-squares is itself not a square and so should be one of its own members.]

[Note: Here is another example: the class of classes that are members of themselves versus the class of classes that are not members of themselves.]
Consider now this second class: is it a member of itself or not? If it is a member of itself, then since it is precisely the class of classes that are not members of themselves, it must be not a member of itself. But, if it is not a member of itself, then it qualifies for membership of the class of classes that are not members of themselves, and therefore it is a member of itself. It seems that it must either be a member of itself or not; but whichever alternative we choose we are forced to contradict ourselves.
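The case analysis above can be checked mechanically. The following is a minimal Python sketch, not part of the original argument: it treats the statement "the class is a member of itself" as a truth value v and looks for a value that agrees with the class's own membership condition.

```python
# Try both possible truth values for the statement
# "the class of non-self-membered classes is a member of itself".
# Membership in that class means NOT being a member of oneself,
# so a consistent answer v would have to satisfy v == (not v).
consistent_answers = [v for v in (True, False) if v == (not v)]
print(consistent_answers)  # prints []
```

Neither truth value survives, which is the contradiction described above.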

This discovery is called Russell’s paradox; it shows that there is something vicious in the procedure of forming classes of classes ad lib, and it calls into question Frege’s whole logicist programme.

Russell himself was committed to logicism no less than Frege was, and he proceeded, in co-operation with A.N. Whitehead, to develop a logical system, using a notation different from Frege’s, in which he set out to derive the whole of arithmetic from a purely logical basis. This work was published in the three monumental volumes of Principia Mathematica between 1910 and 1913.

In order to avoid the paradox which he discovered, Russell formulated a Theory of Types. It was wrong to treat classes as randomly classifiable objects. Classes and individuals were of different logical types, and what can be true or false of one cannot be significantly asserted of the other. ‘The class of dogs is a dog’ should be regarded not as false but as meaningless. Similarly, what can meaningfully be said of classes cannot meaningfully be said of classes of classes, and so on through the hierarchy of logical types. If the difference of type between the different levels of the hierarchy is observed, then the paradox will not arise.

But another difficulty arises in place of the paradox. Once we prohibit the formation of classes of classes, how can we define the series of natural numbers?
Russell retained the definition of zero as the class whose only member is the null-class, but he now treated the number one as the class of all classes equivalent to the class whose members are (a) the members of the null-class, plus (b) any object not a member of that class. The number two was treated in turn as the class of all classes equivalent to the class whose members are (a) the members of the class used to define one, plus (b) any object not a member of that defining class. In this way the numbers can be defined one after the other, and each number is a class of classes of individuals. But the natural-number series can be continued thus ad infinitum only if there is an infinite number of objects in the universe; for if there are only n individuals, then there will be no classes with n+1 members, and so no cardinal number n+1. Russell accepted this and therefore added to his axioms an axiom of infinity, i.e. the hypothesis that the number of objects in the universe is not finite. This hypothesis may be, as Russell thought it was, highly probable; but on the face of it it is far from being a logical truth; and the need to postulate it is therefore a sullying of the purity of the original programme of deriving arithmetic from logic alone.
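The dependence on the number of individuals can be seen in a toy universe. The sketch below is only an illustration under stated assumptions: the three individuals "a", "b", "c" and the helper classes_of_size are hypothetical, and each number k is modelled as the class of all k-membered classes of individuals.

```python
from itertools import combinations

def classes_of_size(individuals, k):
    """All classes (as frozensets) with exactly k members drawn from `individuals`."""
    return [frozenset(c) for c in combinations(individuals, k)]

individuals = ["a", "b", "c"]   # hypothetical universe with n = 3 objects
n = len(individuals)

# Number k is modelled as the class of all k-membered classes of individuals.
numbers = {k: classes_of_size(individuals, k) for k in range(n + 2)}
sizes = {k: len(v) for k, v in numbers.items()}
print(sizes)  # prints {0: 1, 1: 3, 2: 3, 3: 1, 4: 0}
```

With three individuals the class that should serve as the number 4 is empty, so the series cannot be continued without assuming more objects.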

When he learned of Russell’s paradox, Frege was utterly downcast. He made more than one attempt to patch up his system, but these were no more successful in salvaging logicism than was Russell’s theory of types. We now know that the logicist programme cannot ever be successfully carried out. The path from the axioms of logic via the axioms of arithmetic to the theorems of arithmetic is barred at two points. First, as Russell’s paradox showed, the naïve set theory which was part of Frege’s logical basis was inconsistent in itself, and the remedies which Frege proposed for this proved ineffective. Thus, the axioms of arithmetic cannot be derived from purely logical axioms in the way Frege hoped. Secondly, the notion of ‘axioms of arithmetic’ was itself later called in question when the Austrian mathematician Kurt Gödel showed that it was impossible to give arithmetic a complete and consistent axiomatization in the style of Principia Mathematica.
None the less, the concepts and insights developed by Frege and Russell in the course of expounding the logicist thesis have a permanent interest which is unimpaired by the defeat of that programme.


Set Theories and Russell’s Paradox

The origin of set theories can be traced back to Georg Cantor’s Über eine Eigenschaft des Inbegriffes aller reellen algebraischen Zahlen (“On a Property of the System of all the Real Algebraic Numbers”), published in 1874. He defined a set as follows:
"By a set we are to understand any collection into a whole M of definite and distinguishable objects of our intuition or our thought. These objects are called the elements of M" (Reference: David M. Burton’s “The History of Mathematics” (1997), p. 591.)

A simple explanation of set theory and Russell’s paradox

A set and a set of sets
The early set theorists, operating in a world of what we now call "naive set theory", loosely defined a set (a class) as a collection of things.
A finite collection of variables, like {x, y, z}, should be a set. An infinite collection of numbers, like the natural numbers N = {1, 2, 3, 4, 5, ...} should also be a set. In geometry, the collection of all points that form a line between two given points is also a set.

A set can also be a collection of sets. The question is: can a set contain itself as a member?

If we have a set of all natural numbers, we could certainly have a set of everything that is not a natural number. This set would include quite a few things — the numbers -3, 1/2, and π are all not naturals, and so they would be members. The word "pizza" is not a natural number, so that would be a member. The state of California is not a natural number as well, so we would throw that in there too.

Since this set is itself pretty clearly not a natural number, but instead an enormous collection of everything ever that is not a natural number, it must be a member of itself.

Indeed, with our naive definition of a set, it is tempting to consider a set of everything – “a set of all sets”. Naturally, being itself a set, the set of all sets would also have to contain itself as an element.

Russell's Paradox
Let's first look at the set of all sets that contain themselves as members, and let's call this set A.
We've seen a couple of examples of members of A: the set of everything that is not a natural number, and the set of all sets.
Does A contain itself?
According to "naive set theory", if A satisfies the condition we set up for being a member of A, we can say that A contains itself as a member. If A doesn't satisfy the condition for being in A, then A is not a member. There is no paradox here.

The paradox comes in when we build the set of all sets that do not contain themselves as members. Let's call this set B.
Does B contain itself?
Suppose B contains itself as a member. We defined B, however, as the set of all sets that do not contain themselves. So if B does contain itself, it goes against the condition we used to define B, and thus B does not contain itself.
But then if B does not contain itself, it does satisfy the condition to be a member of itself, and so it would have to contain itself.

This gives us a contradiction — the set of all sets that are not members of themselves simultaneously must and cannot be a member of itself. This contradiction makes "naive set theory" inconsistent — we have a statement that has to be simultaneously true and false.
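One loose way to feel the circularity is to model a "set" as a membership test. This modelling choice is an assumption made purely for illustration; it is not how set theory defines sets, and a recursion error is an analogy for the contradiction, not a proof of it.

```python
# Assumption for illustration: a "set" is a predicate, i.e. a function that
# answers whether its argument is a member.
def B(s):
    """Membership test for 'the set of all sets that do not contain themselves'."""
    return not s(s)

def ask_whether_B_contains_itself():
    try:
        return B(B)   # computing B(B) requires the value of B(B) itself
    except RecursionError:
        return "no consistent answer"

print(ask_whether_B_contains_itself())  # prints no consistent answer
```

The question "does B contain B?" never settles on True or False, mirroring the two-way argument above.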

In 1902, Bertrand Russell identified this problem, which is known as Russell's Paradox.

Modern Set Theory Axioms
The modern set theory axioms are very specific about how to build sets out of other sets. In particular, the axioms very quickly forbid a set from being a member of itself. We are also much more careful with constructions like "the set of everything that is not a natural number". Rather than using a broad universe of "everything", sets like this must be constructed as subsets of a larger set that we have already defined. So, I can define the set of all real numbers that are not natural numbers, but I cannot make a set of "everything" that is not a natural number.
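Building a set as a subset of an already defined set has a direct analogue in a Python set comprehension. The sketch below is a rough illustration only: the finite sample stands in for the real numbers, and is_natural is a hypothetical helper using the convention N = {1, 2, 3, ...}.

```python
from fractions import Fraction

# A previously defined "larger set": a small finite sample standing in for the reals.
sample = {-3, 0, 1, 2, Fraction(1, 2), 3.14159}

def is_natural(x):
    """True for 1, 2, 3, ... (the convention N = {1, 2, 3, ...})."""
    return x == int(x) and x >= 1

# Allowed: carve the non-naturals out of a set we already have.
non_naturals_in_sample = {x for x in sample if not is_natural(x)}
print(len(non_naturals_in_sample))  # prints 4
```

There is no corresponding way to write "everything that is not a natural number": the comprehension always needs a source collection to draw from.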

The axiomatization of set theory

In order to avoid the paradoxes and put it on a firm footing, set theory had to be axiomatized.
The first axiomatization was due to Zermelo (1908). Zermelo's axiomatization avoids Russell's Paradox by means of the Separation axiom, which is formulated as quantifying over properties of sets, and thus it is a second-order statement.
Further work by Skolem and Fraenkel led to the formalization of the Separation axiom in terms of first-order formulas.
A further addition, by von Neumann, of the axiom of Foundation, led to the standard axiom system of set theory, known as the Zermelo-Fraenkel axioms plus the Axiom of Choice, or ZFC.
Other axiomatizations of set theory, such as those of von Neumann-Bernays-Gödel (NBG), or Morse-Kelley (MK), allow also for a formal treatment of proper classes.

The axioms of set theory
ZFC is an axiom system formulated in first-order logic with equality and with only one binary relation symbol ∈ for membership. Thus, we write A∈B to express that A is a member of the set B. We state below the axioms of ZFC informally.

The axioms of ZFC
Extensionality: If two sets A and B have the same elements, then they are equal.
Null Set: There exists a set, denoted by ∅ and called the empty set, which has no elements.
Pair: Given any sets A and B, there exists a set, denoted by {A,B}, which contains A and B as its only elements. In particular, there exists the set {A} which has A as its only element.
Power Set: For every set A there exists a set, denoted by P(A) and called the power set of A, whose elements are all the subsets of A.
Union: For every set A, there exists a set, denoted by ⋃A and called the union of A, whose elements are all the elements of the elements of A.
Infinity: There exists an infinite set. In particular, there exists a set Z that contains ∅ and such that if A∈Z, then ⋃{A,{A}}∈Z.
Separation: For every set A and every given property, there is a set containing exactly the elements of A that have that property. A property is given by a formula φ of the first-order language of set theory.

Thus, Separation is not a single axiom but an axiom schema, that is, an infinite list of axioms, one for each formula φ.

Replacement: For every given definable function with domain a set A, there is a set whose elements are all the values of the function.
Replacement is also an axiom schema, as definable functions are given by formulas.
Foundation: Every non-empty set A contains an ∈-minimal element, that is, an element such that no element of A belongs to it.

These are the axioms of Zermelo-Fraenkel set theory, or ZF. The axioms of Null Set and Pair follow from the other ZF axioms, so they may be omitted. Also, Replacement implies Separation.
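Several of these axioms describe operations that can be tried out on finite sets. The following sketch is an informal model, not a formalization of ZF: it represents sets as Python frozensets, and the function names are chosen here for illustration. Infinity, Replacement and Choice are not modelled.

```python
from itertools import chain, combinations

EMPTY = frozenset()                      # Null Set

def pair(a, b):                          # Pair: {a, b}
    return frozenset({a, b})

def union(a):                            # Union: the elements of the elements of a
    return frozenset(chain.from_iterable(a))

def power_set(a):                        # Power Set: all subsets of a
    items = list(a)
    return frozenset(frozenset(c) for r in range(len(items) + 1)
                     for c in combinations(items, r))

def separation(a, prop):                 # Separation: {x in a : prop(x)}
    return frozenset(x for x in a if prop(x))

def successor(a):                        # the step used in Infinity: U{a, {a}}
    return union(pair(a, pair(a, a)))

one = successor(EMPTY)                   # {EMPTY}
two = successor(one)                     # {EMPTY, {EMPTY}}
print(len(power_set(two)))               # prints 4
```

Extensionality corresponds to frozenset equality, which compares elements only. A frozenset can only be built from values that already exist, so in this model no set contains itself.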

Finally, there is the Axiom of Choice (AC):
Choice: For every set A of pairwise-disjoint non-empty sets, there exists a set that contains exactly one element from each set in A.

The AC was, for a long time, a controversial axiom. On the one hand, it is very useful and of wide use in mathematics. On the other hand, it has rather unintuitive consequences, such as the Banach-Tarski Paradox, which says that the unit ball can be partitioned into finitely-many pieces, which can then be rearranged to form two unit balls. The objections to the axiom arise from the fact that it asserts the existence of sets that cannot be explicitly defined. But Gödel's 1938 proof of its consistency, relative to the consistency of ZF, dispelled any suspicions left about it.

The Axiom of Choice is equivalent, modulo ZF, to the Well-ordering Principle, which asserts that every set can be well-ordered, i.e., it can be linearly ordered so that every non-empty subset has a minimal element.

Although not formally necessary, besides the symbol ∈ one normally uses for convenience other auxiliary defined symbols. For example,
A⊆B expresses that A is a subset of B, i.e., every member of A is a member of B.
Other symbols are used to denote sets obtained by performing basic operations, such as
A∪B, which denotes the union of A and B, i.e., the set whose elements are those of A and B; or
A∩B, which denotes the intersection of A and B, i.e., the set whose elements are those common to A and B.
The ordered pair (A,B) is defined as the set {{A},{A,B}}.
Thus, two ordered pairs (A,B) and (C,D) are equal if and only if A=C and B=D.
And the Cartesian product A×B is defined as the set of all ordered pairs (C,D) such that C∈A and D∈B.
Given any formula φ(x,y1,…,yn), and sets A,B1,…,Bn, one can form the set of all those elements of A that satisfy the formula φ(x,B1,…,Bn). This set is denoted by {a∈A : φ(a,B1,…,Bn)}.
In ZF one can easily prove that all these sets exist.
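The characteristic property of the ordered pair can be checked exhaustively on a small example. This is a sketch under assumptions: the components are the integers 0, 1, 2 rather than sets, and ordered_pair and cartesian_product are illustrative names.

```python
from itertools import product

def ordered_pair(a, b):
    """Kuratowski pair (a, b) = {{a}, {a, b}}, built from frozensets."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def cartesian_product(A, B):
    """The set of all ordered pairs (c, d) with c in A and d in B."""
    return frozenset(ordered_pair(c, d) for c in A for d in B)

universe = [0, 1, 2]
# Two pairs are equal exactly when their components are equal.
characteristic_property_holds = all(
    (ordered_pair(a, b) == ordered_pair(c, d)) == (a == c and b == d)
    for a, b, c, d in product(universe, repeat=4)
)
print(characteristic_property_holds)  # prints True
```

Note that ordered_pair(a, a) collapses to {{a}}, which is still enough to recover both components.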


Russell's Paradox

Naïve set theory assumes the so-called naïve or unrestricted Comprehension Axiom, the axiom that for any formula φ(x) containing x as a free variable, there will exist the set {x : φ(x)} whose members are exactly those objects that satisfy φ(x). Thus, if the formula φ(x) stands for “x is prime”, then {x : φ(x)} will be the set of prime numbers.

But from the assumption of this axiom, Russell’s contradiction follows. For example, if we let φ(x) stand for x∈x and let R = {x : ~φ(x)}, then R is the set whose members are exactly those objects that are not members of themselves.

Is R a member of itself? If it is, then it must satisfy the condition of not being a member of itself and so it is not. If it is not, then it must not satisfy the condition of not being a member of itself, and so it must be a member of itself. Since by classical logic one case or the other must hold – either R is a member of itself or it is not – it follows that the theory implies a contradiction.
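Written out step by step, the argument is a single instantiation of the Comprehension Axiom. The following LaTeX sketch uses the same symbols as the text, with φ(x) standing for x ∈ x and R = {x : ¬φ(x)}.

```latex
\begin{align*}
&\forall x\,\bigl(x \in R \leftrightarrow x \notin x\bigr)
  && \text{naive Comprehension applied to } \neg\varphi(x) \\
&R \in R \leftrightarrow R \notin R
  && \text{instantiating } x := R \\
&\bot
  && \text{no statement is equivalent to its own negation}
\end{align*}
```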

Russell's Paradox

Russell's paradox (also known as Russell's antinomy), discovered by Bertrand Russell in 1901, showed that the naive set theory of Frege leads to a contradiction.

Consider the set R of all sets that do not contain themselves as members. In set-theoretic notation:
          R = {A | A∉A}
Assume, as in Frege's Grundgesetze der Arithmetik, that sets can be freely defined by any condition. Then R is a well-defined set. The problem arises when it is considered whether R is an element of itself. If R is an element of R, then according to the definition, R is not an element of R; if R is not an element of R, then R has to be an element of R, again by its very definition: Hence a contradiction.

Russell's paradox was a primary motivation for the development of set theories with a more elaborate axiomatic basis than simple extensionality and unlimited set abstraction. The paradox drove Russell to develop type theory and Ernst Zermelo to develop an axiomatic set theory, which evolved into the now-canonical Zermelo–Fraenkel set theory.

Russell's Paradox

According to naive set theory, any definable collection is a set.
Let R be the set of all sets that are not members of themselves. If R is not a member of itself, then its definition dictates that it must contain itself, and if it contains itself, then it contradicts its own definition as the set of all sets that are not members of themselves. This contradiction is Russell's paradox. Symbolically:
Let R = {x | x∉x}; then R∈R ⟺ R∉R.

Set-Theoretic Paradoxes

Russell's paradox arises from considering the Russell set R of all sets that are not members of themselves, that is, the set defined by R = {x | x∉x}.
The contradiction is then derived by asking whether R is a member of itself, that is, whether R∈R holds. If R∈R then R is a member of itself, and thus R∉R, by definition of R. If, on the other hand, R∉R then R is not a member of itself, and thus R∈R, again by definition of R.

Russell's Paradox

Russell’s paradox is a statement in set theory, devised by the English mathematician-philosopher Bertrand Russell, that demonstrated a flaw in earlier efforts to axiomatize the subject.

Russell found the paradox in 1901 and communicated it in a letter to the German mathematician-logician Gottlob Frege in 1902. Russell’s letter demonstrated an inconsistency in Frege’s axiomatic system of set theory by deriving a paradox within it.

Frege had constructed a logical system employing an unrestricted comprehension principle. The comprehension principle is the statement that, given any condition expressible by a formula ϕ(x), it is possible to form the set of all sets x meeting that condition, denoted {x | ϕ(x)}. For example, the set of all sets—the universal set—would be {x | x = x}.

It was noticed in the early days of set theory, however, that a completely unrestricted comprehension principle led to serious difficulties. In particular, Russell observed that it allowed the formation of {x | x∉x}, the set of all non-self-membered sets, by taking ϕ(x) to be the formula x∉x. Is this set, call it R, a member of itself? If it is a member of itself, then it must meet the condition of its not being a member of itself. But if it is not a member of itself, then it precisely meets the condition of being a member of itself. This impossible situation is called Russell’s paradox.

The significance of Russell’s paradox is that it demonstrates in a simple and convincing way that one cannot both hold that there is a meaningful totality of all sets and also allow an unfettered comprehension principle to construct sets that must then belong to that totality. (Russell spoke of this situation as a “vicious circle.”)

Set theory avoids this paradox by imposing restrictions on the comprehension principle. The standard Zermelo-Fraenkel axiomatization (ZF) does not allow comprehension to form a set larger than previously constructed sets. (The role of constructing larger sets is given to the power-set operation.) This leads to a situation where there is no universal set—an acceptable set must not be as large as the universe of all sets.

A very different way of avoiding Russell’s paradox was proposed in 1937 by the American logician Willard Van Orman Quine. In his paper “New Foundations for Mathematical Logic,” the comprehension principle allows formation of {x | ϕ(x)} only for formulas ϕ(x) that can be written in a certain form that excludes the “vicious circle” leading to the paradox. In this approach, there is a universal set.

Fixing Russell's Paradox

Naive set theory allows mathematicians to construct the set of all sets that satisfy any property. The axiom that says that you're allowed to do this was formally introduced by Frege and is called unrestricted comprehension.

Russell found out that the unrestricted comprehension principle makes it possible for mathematicians to construct the set of all sets that don't contain themselves, and this leads to a contradiction.

The now-standard fix is to replace unrestricted comprehension with a weaker axiom called restricted comprehension. With restricted comprehension, you're no longer allowed to construct the set of all sets that satisfy some property. Now you're only allowed to construct the set of all elements of some other set that satisfy some property. (In the now-standard approach to set theory, the elements of sets are other sets. It forbids a set from being a member of itself.)

To reproduce Russell's paradox under restricted comprehension, you would first need to construct the set of all sets. And another axiom, foundation, prohibits such a set from existing.


Fixing Russell's Paradox

When set theory was invented it was assumed that given any predicate, there was a set containing all the things that satisfied the predicate. This assumption is called naive comprehension. Unfortunately this allowed paradoxes like the set of all sets not containing themselves.

So people invented restricted axioms of comprehension. These are rules that say that only certain predicates give rise to sets. There are different kinds of set theory with different comprehension restrictions.

One idea for limiting comprehension is to say that if the class of things that satisfy the predicate is too big, the class is not a set. All the commonly used set theories use this idea. The set of all sets is very large indeed, so it is not a set in these theories.

There are other ways of restricting comprehension that get rid of the paradoxical sets but do allow a universal set. One such set theory was invented by Quine and called New Foundations.


Russell’s paradox arises if you consider the set R = {x : x∉x}. Ask yourself if R∈R. If you suppose so, then by the definition of unrestricted set comprehension R∉R. You have a contradiction, so it must be the opposite of what you supposed, that is, R∉R. But this is the same as saying that R satisfies the condition for belonging to itself, that is, R∈R. You now have another contradiction, but this is far worse, since you have no hypotheses. The whole theory is logically inconsistent.

In set theory there are two ways of getting rid of Russell’s paradox: either you disallow the set of all sets and other similar sets (see for example the Zermelo-Fraenkel set theory), or you allow them, but you also restrict the way they are used (see for example the Morse-Kelley set theory).

In the first case, set comprehension says if you have a set A you can have {x∈A : ϕ(x)} (notice: writing {x : ϕ(x)} is just wrong in this case, because you have to have an initial set). If you now define R = {x∈A : x∉x} and you repeat the same steps as before, it only follows that R∉A. There's no contradiction, and the paradox no longer arises.
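The conclusion that the separated set R = {x∈A : x∉x} cannot be a member of A does not depend on what the sets are. As a rough check, the sketch below enumerates every possible membership relation on a hypothetical three-element universe, self-membership included, and searches for a case where such an R sits inside its own A. The names U, all_relations and codes_separation are illustrative.

```python
from itertools import product

U = range(3)                                   # hypothetical 3-element universe
pairs = [(x, y) for x in U for y in U]

def all_relations():
    """Every membership relation E on U, as a set of (member, set) pairs."""
    for bits in product([False, True], repeat=len(pairs)):
        yield {p for p, bit in zip(pairs, bits) if bit}

def codes_separation(E, r, a):
    """True if the members of r under E are exactly {x in a : x not in x}."""
    return all(((x, r) in E) == ((x, a) in E and (x, x) not in E) for x in U)

# Search for a counterexample: an r = {x in a : x not in x} that is itself in a.
counterexamples = [
    (E, r, a)
    for E in all_relations()
    for a in U
    for r in U
    if codes_separation(E, r, a) and (r, a) in E
]
print(len(counterexamples))  # prints 0
```

The search covers all 512 relations and finds nothing, which matches the argument in the text: the only thing that follows is that R is not a member of A.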

In the second case, you consider classes, not just sets. Sets are classes that belong to some other class, while proper classes are classes that belong to no class. Set comprehension, in this case, says you can have {x : ϕ(x)}, but all its members are sets by definition. If you try to reproduce Russell’s paradox, you get that R∉R. If you then suppose that R is a set, then you have a contradiction, so R must be a proper class. This is all you get. No contradiction follows, and the paradox is blocked.



Bertrand Russell devised what he called the theory of types to prevent the paradox. In this theory, a set would be defined as being of a distinct type, like type 1. The elements of type 1 sets can then only be included in a set of type 2 because sets of type 2 are defined as containing only sets of type 1. Thus, we do not need to worry about whether or not a set of type 2 can contain itself because it’s defined as only containing sets of type 1. This theory creates a sort of hierarchy of sets.
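The hierarchy can be imitated with a small class that tags each set with a level. This is a toy model under assumptions, not Russell's formal system: TypedSet is a hypothetical name, plain Python objects count as individuals of level 0, and a set of level n may only contain objects of level n-1.

```python
class TypedSet:
    """A set tagged with a type level; members must sit exactly one level below."""

    def __init__(self, level, members=()):
        self.level = level
        self.members = []
        for m in members:
            self.add(m)

    def add(self, member):
        # Objects without a level attribute are treated as individuals (level 0).
        member_level = getattr(member, "level", 0)
        if member_level != self.level - 1:
            raise TypeError(
                f"a level-{self.level} set cannot contain a level-{member_level} object"
            )
        self.members.append(member)

dogs = TypedSet(1, ["Rex", "Fido"])   # type 1: a set of individuals
kennels = TypedSet(2, [dogs])         # type 2: a set of type-1 sets

try:
    kennels.add(kennels)              # self-membership is a type violation
except TypeError as err:
    print(err)  # prints a level-2 set cannot contain a level-2 object
```

In this model the question of whether a set contains itself is rejected before it can be asked, which is the point of the type restriction.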

The most accepted solution today is that of Zermelo and Fraenkel. Zermelo’s axiom of specification is, "to every set A and every definite property P(x) there corresponds a set whose elements are exactly those elements x in A for which the property P(x) holds" (Burton, 616). What this axiom does is require a preexisting set A and some property P(x) to make a new set. Previously, only the property P(x) was required. This changes the set S from {x : x∉x} to S = {x∈A : x∉x}. Now S∈S is impossible because S would have to satisfy the two conditions that S∈A and S∉S, which it clearly cannot. That S∉S is required is clear because we have just stated that in order for S∈S to hold, S must satisfy the property that S∉S. If we consider the other possibility, that S∉S, we see that S does satisfy the property P(x), but it cannot meet our second requirement that S∈A. This is because if S∈A then it follows, by our definition, that S∈S, which we already reasoned is not true. Therefore we conclude, by the law of excluded middle, which says that every proposition is either true or false (Burton, 612), that S∉A. Therefore, S fails the second of the two requirements to be in the set S, so we conclude that S∉S and have avoided the paradox.

Russell’s Early Responses to the Paradox

Russell’s own response to the paradox came with his aptly named theory of types. Believing that self-application lay at the heart of the paradox, Russell’s basic idea was that we can avoid commitment to R (the set of all sets that are not members of themselves) by arranging all sentences (or, more precisely, all propositional functions, functions which give propositions as their values) into a hierarchy. It is then possible to refer to all objects for which a given condition (or predicate) holds only if they are all at the same level or of the same “type.”

This solution to Russell’s paradox is motivated in large part by adoption of the so-called vicious circle principle. The principle in effect states that no propositional function can be defined prior to specifying the function’s scope of application. In other words, before a function can be defined, one must first specify exactly those objects to which the function will apply (the function’s domain). For example, before defining the predicate “is a prime number,” one first needs to define the collection of objects that might possibly satisfy this predicate, namely the set, N, of natural numbers.

As Whitehead and Russell explain,
          An analysis of the paradoxes to be avoided shows that they all result from a kind of vicious circle. The vicious circles in question arise from supposing that a collection of objects may contain members which can only be defined by means of the collection as a whole. Thus, for example, the collection of propositions will be supposed to contain a proposition stating that “all propositions are either true or false.” It would seem, however, that such a statement could not be legitimate unless “all propositions” referred to some already definite collection, which it cannot do if new propositions are created by statements about “all propositions.” We shall, therefore, have to say that statements about “all propositions” are meaningless. … The principle which enables us to avoid illegitimate totalities may be stated as follows: “Whatever involves all of a collection must not be one of the collection”; or, conversely: “If, provided a certain collection had a total, it would have members only definable in terms of that total, then the said collection has no total.” We shall call this the “vicious-circle principle,” because it enables us to avoid the vicious circles involved in the assumption of illegitimate totalities. (1910, 2nd edn 37)

If Whitehead and Russell are right, it follows that no function’s scope of application will ever be able to include any object presupposed by the function itself. As a result, propositional functions (along with their corresponding propositions) will end up being arranged in a hierarchy of the kind Russell proposes.

Although Russell first introduced his theory of types in his 1903 Principles of Mathematics, he recognized immediately that more work needed to be done since his initial account seemed to resolve some but not all of the paradoxes.

Possible Solutions to the Paradox of Classes or Sets

It was mentioned above that late in his life, Frege gave up entirely on the feasibility of the logic of classes or sets. This is of course one ready solution to the antinomy in the class or set form: simply deny the existence of such entities altogether. Short of this, however, the following solutions have enjoyed the greatest popularity:

The Theory of Types for Classes: It was mentioned earlier that Russell advocated a more comprehensive theory of types than Frege's distinction of levels, one that divided not only properties or concepts into various types, but classes as well. Russell divided classes into classes of individuals, classes of classes of individuals, and so on. Classes were not taken to be individuals, and classes of classes of individuals were not taken to be classes of individuals. A class is never of the right type to have itself as member. Therefore, there is no such thing as the class of all classes that are not members of themselves, because for any class, the question of whether it is in itself is a violation of type. Once again, here the challenge is to explain the metaphysics of classes or sets in order to explain the philosophical grounds of the type-division.

Stratification: In 1937, W. V. Quine suggested an alternative solution in some ways similar to type theory. His suggestion was that, rather than actually dividing entities into individuals, classes of individuals, etc., such that the proposition that some class is in itself is always ill-formed or nonsensical, we can instead put certain restrictions on what classes are supposed to exist. Classes are only supposed to exist if their defining conditions do not involve what would, in type theory, be a violation of types. Thus, for Quine, while "x is not a member of x" is a meaningful assertion, we do not suppose there to exist a class of all entities x that satisfy this statement. In Quine's system, a class is supposed to exist for an open formula A if and only if the formula A is stratified, that is, if there is some assignment of natural numbers to the variables in A such that for each occurrence of the class membership sign, the variable preceding the membership sign is given an assignment one lower than the variable following it. This blocks Russell's paradox, because the formula used to define the problematic class has the same variable both before and after the membership sign, obviously making it unstratified. However, it has yet to be determined whether the resulting system, which Quine called "New Foundations for Mathematical Logic" or NF for short, is consistent.
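
The stratification test described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a formula is reduced to the list of its membership atoms, each written as a pair (left, right) standing for "left ∈ right", and everything else in the formula (connectives, quantifiers, identity) is ignored. The function name stratify is introduced here for the example and is not Quine's.

```python
from collections import defaultdict, deque


def stratify(atoms):
    """Return a level assignment for the variables, or None if none exists.

    `atoms` is a list of (left, right) pairs, each standing for one
    occurrence of `left ∈ right` in the formula.
    """
    graph = defaultdict(list)
    for left, right in atoms:
        graph[left].append((right, 1))    # right must sit one level above left
        graph[right].append((left, -1))   # the same constraint read backwards

    levels = {}
    for start in list(graph):
        if start in levels:
            continue
        levels[start] = 0
        component = [start]
        queue = deque([start])
        while queue:
            var = queue.popleft()
            for other, offset in graph[var]:
                wanted = levels[var] + offset
                if other not in levels:
                    levels[other] = wanted
                    component.append(other)
                    queue.append(other)
                elif levels[other] != wanted:
                    return None           # conflicting constraints: unstratified
        # Shift the component so its lowest level is 0 (natural numbers only).
        lowest = min(levels[v] for v in component)
        for v in component:
            levels[v] -= lowest
    return levels


print(stratify([("x", "y"), ("y", "z")]))  # a stratified formula
print(stratify([("x", "x")]))              # the Russell condition
```

On this sketch the Russell condition is rejected because the single variable x would need a level one lower than its own.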

Aussonderung: A quite different approach is taken in Zermelo-Fraenkel (ZF) set theory. Here too, a restriction is placed on what sets are supposed to exist. Rather than taking the "top-down" approach of Russell and Frege, who originally believed that for any concept, property or condition, one can suppose there to exist a class of all those things in existence with that property or satisfying that condition, in ZF set theory, one begins from the "bottom up". One begins with individual entities, and the empty set, and puts such entities together to form sets. Thus, unlike the early systems of Russell and Frege, ZF is not committed to a universal set, a set including all entities or even all sets. ZF puts tight restrictions on what sets exist. Only those sets that are explicitly postulated to exist, or which can be put together from such sets by means of iterative processes, etc., can be concluded to exist. Then, rather than having a naive class abstraction principle that states that an entity is in a certain class if and only if it meets its defining condition, ZF has a principle of separation, selection, or as in the original German, "Aussonderung". Rather than supposing there to exist a set of all entities that meet some condition simpliciter, for each set already known to exist, Aussonderung tells us that there is a subset of that set of all those entities in the original set that satisfy the condition. The class abstraction principle then becomes: if set A exists, then for all entities x in A, x is in the subset of A that satisfies condition C if and only if x satisfies condition C. This approach solves Russell's paradox, because we cannot simply assume that there is a set of all sets that are not members of themselves. Given a set of sets, we can separate or divide it into those sets within it that are in themselves and those that are not, but since there is no universal set, we are not committed to the set of all such sets. 
Without the supposition of Russell's problematic class, the contradiction cannot be proven.
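
In modern notation, the reasoning of this paragraph can be set out roughly as follows. The label R_A for the "Russell subset" of a set A is introduced here for illustration only, and the condition C is taken to be "x is not a member of x".

```latex
% Separation (Aussonderung), one instance for each condition C:
\forall A \, \exists B \, \forall x \, \bigl( x \in B \leftrightarrow ( x \in A \land C(x) ) \bigr)

% Take C(x) to be x \notin x, and call the resulting subset R_A:
\forall x \, \bigl( x \in R_A \leftrightarrow ( x \in A \land x \notin x ) \bigr)

% Instantiate x with R_A itself:
R_A \in R_A \leftrightarrow ( R_A \in A \land R_A \notin R_A )

% If R_A \in A held, this would reduce to
% R_A \in R_A \leftrightarrow R_A \notin R_A, which is contradictory.
% So the only consistent conclusion is:
R_A \notin A
```

Read this way, the argument that produced a contradiction in the naive theory yields only the harmless result that no set contains its own Russell subset, and hence that no set contains every set.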

There have been subsequent expansions or modifications made on all these solutions, such as the ramified type-theory of Principia Mathematica, Quine's later expanded system of his Mathematical Logic, and the later developments in set-theory made by Bernays, Gödel and von Neumann. The question of what is the correct solution to Russell's paradox is still a matter of debate.


Russell's Theory of Types

The theory of types is a theory in logic, introduced by the British philosopher Bertrand Russell in his Principia Mathematica, that deals with logical paradoxes arising from the unrestricted use of predicate functions as variables.
Arguments of three kinds can be incorporated as variables: (1) In the pure functional calculus of the first order, only individual variables exist. (2) In the second-order calculus, propositional variables are introduced. (3) Higher orders are achieved by allowing predicate functions as variables. The type of a predicate function is determined by the number and type of its arguments. By not allowing predicate functions with arguments of equal or higher type to be used together, contradictions within the system are avoided.

In the system of Russell and Whitehead, the formal objects of the theory are divided into types. The lowest type consists of all individuals, the next type is composed of all predicates, the succeeding type is composed of all predicates of predicates, and so on.

In order to avoid the paradox which he discovered, Russell formulated a Theory of Types. It was wrong to treat classes as randomly classifiable objects. Classes and individuals were of different logical types, and what can be true or false of one cannot be significantly asserted of the other. ‘The class of dogs is a dog’ should be regarded not as false but as meaningless. Similarly, what can meaningfully be said of classes cannot meaningfully be said of classes of classes, and so on through the hierarchy of logical types. If the difference of type between the different levels of the hierarchy is observed, then the paradox will not arise. (Anthony Kenny, A Brief History of Western Philosophy (Blackwell Publishers Ltd, 1998), p. 325.)

Simple Type Theory

The types can be defined as follows:
1. i is the type of individuals
2. ( ) is the type of propositions
3. if A1,…,An are types, then (A1,…,An) is the type of n-ary relations over objects of respective types A1,…,An

For instance, the type of binary relations over individuals is (i, i), the type of binary connectives is (( ),( )), the type of quantifiers over individuals is ((i)).

For forming propositions we use this type structure: thus R(a1,…,an) is a proposition if R is of type (A1,…,An) and ai is of type Ai for i = 1,…,n. This restriction makes it impossible to form a proposition of the form P(P): the type of P should be of the form (A), and P can only be applied to arguments of type A, and thus cannot be applied to itself since A is not the same as (A).
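
The formation rule can be tried out in a small Python sketch. The encoding is an assumption made for the example: the type i is the string "i", the type ( ) is the empty tuple, and a relation type (A1,…,An) is a tuple of types. The names relation and is_proposition are hypothetical helpers, not part of any standard presentation.

```python
INDIVIDUAL = "i"      # the type i of individuals
PROPOSITION = ()      # the type ( ) of propositions


def relation(*argument_types):
    """Build the type (A1,...,An) of n-ary relations over the given types."""
    return tuple(argument_types)


def is_proposition(relation_type, argument_types):
    """R(a1,...,an) is a proposition iff R has type (A1,...,An) and each ai has type Ai."""
    return isinstance(relation_type, tuple) and relation_type == tuple(argument_types)


predicate = relation(INDIVIDUAL)                     # (i)
binary_relation = relation(INDIVIDUAL, INDIVIDUAL)   # (i, i)
connective = relation(PROPOSITION, PROPOSITION)      # (( ), ( ))
quantifier = relation(predicate)                     # ((i))

print(is_proposition(binary_relation, [INDIVIDUAL, INDIVIDUAL]))  # well-typed
print(is_proposition(quantifier, [predicate]))                    # well-typed
print(is_proposition(predicate, [predicate]))                     # P(P): rejected
```

Because a finite nested tuple can never equal the one-element tuple that contains it, no choice of type for P makes P(P) pass the check, which mirrors the observation that A is not the same as (A).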

Russell observes that it becomes necessary “to distinguish (1) terms, (2) classes, (3) classes of classes, and so on ad infinitum”. Moreover, these collections will be disjoint, and to be able to assert x ∈ u requires that the collection to which x belongs should be of a degree one lower than that to which u belongs. This expedient leads to a resolution of the paradox, since x ∈ x has now been rendered a meaningless proposition. The hierarchy of collections (1), (2), (3), ... is the germ of the doctrine of types.



In the context of language, self-reference is used to denote a statement that refers to itself or its own referent. The most famous example of a self-referential sentence is the liar sentence: “This sentence is not true.” Self-reference is often used in a broader context as well. For instance, a picture could be considered self-referential if it contains a copy of itself (see the animated image); and a piece of literature could be considered self-referential if it includes a reference to the work itself. In philosophy, self-reference is primarily studied in the context of language. Self-reference within language is not only a subject of philosophy, but also a field of individual interest in mathematics and computer science, in particular in relation to the foundations of these sciences.

The philosophical interest in self-reference is to a large extent centered around the paradoxes. A paradox is a seemingly sound piece of reasoning based on apparently true assumptions that leads to a contradiction. The liar sentence considered above leads to a contradiction when we try to determine whether it is true or not. If we assume the sentence to be true, then what it states must be the case, that is, it cannot be true. If, on the other hand, we assume it not to be true, then what it states is actually the case, and thus it must be true. In either case we are led to a contradiction. Since the contradiction was obtained by a seemingly sound piece of reasoning based on apparently true assumptions, it qualifies as a paradox. It is known as the liar paradox.

Most paradoxes of self-reference may be categorised as either semantic, set-theoretic or epistemic. The semantic paradoxes, like the liar paradox, are primarily relevant to theories of truth. The set-theoretic paradoxes are relevant to the foundations of mathematics, and the epistemic paradoxes are relevant to epistemology. Even though these paradoxes are different in the subject matter they relate to, they share the same underlying structure, and may often be tackled using the same mathematical means.

In the present entry, we will first introduce a number of the most well-known paradoxes of self-reference, and discuss their common underlying structure. Subsequently, we will discuss the profound consequences that these paradoxes have on a number of different areas: theories of truth, set theory, epistemology, foundations of mathematics, computability. Finally, we will present the most prominent approaches to solving the paradoxes.

A picture that contains a copy of itself.
Image source: Stanford Encyclopedia of Philosophy  

•1. Paradoxes of Self-Reference
◦1.1 Semantic Paradoxes
◦1.2 Set-Theoretic Paradoxes
◦1.3 Epistemic paradoxes
◦1.4 Common Structures in the Paradoxes
◦1.5 A Paradox without Self-Reference

•2. Why the Paradoxes Matter
◦2.1 Consequences of the Semantic Paradoxes
◦2.2 Consequences of the Set-Theoretic Paradoxes
◦2.3 Consequences of the Epistemic Paradoxes
◦2.4 Consequences Concerning Provability and Computability

•3. Solving the paradoxes
◦3.1 Building Explicit Hierarchies
◦3.2 Building Implicit Hierarchies
  ■3.2.1 Kripke's Theory of Truth
  ■3.2.2 Extensions and Alternatives to Kripke's Theory of Truth
  ■3.2.3 Implicit Hierarchies in Set Theories

◦3.3 General Fixed Point Approaches


Barber’s Paradox

Russell's Paradox is sometimes explained by means of a more easily understood example known as the Barber’s Paradox: "If a barber shaves all and only those men who do not shave themselves, does he shave himself?"
If he does, then he mustn't, because he doesn't shave men who shave themselves, and if he doesn't, then he must, because he shaves every man who doesn't shave himself. Both possibilities lead to a contradiction.
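
The impossibility can also be checked by brute force on a small model. The sketch below assumes the barber is himself one of the men of a town with a handful of inhabitants, enumerates every possible "shaves" relation among them, and looks for anyone who shaves all and only those who do not shave themselves. The function name and the town sizes are illustrative choices.

```python
from itertools import product


def paradoxical_barber_exists(town_size):
    """Search every 'shaves' relation on a town for a barber of the required kind."""
    men = range(town_size)
    pairs = [(a, b) for a in men for b in men]
    for bits in product([False, True], repeat=len(pairs)):
        shaves = dict(zip(pairs, bits))   # shaves[a, b]: does a shave b?
        for barber in men:
            # The barber must shave a man exactly when that man does not shave himself.
            if all(shaves[barber, man] == (not shaves[man, man]) for man in men):
                return True
    return False


for size in (1, 2, 3):
    print(size, paradoxical_barber_exists(size))
```

The search fails for every relation because the condition, applied to the barber himself, demands that he shave himself exactly when he does not.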



Russell’s theory of descriptions

On Denoting

By a “denoting phrase” I mean a phrase such as any one of the following: a man, some man, any man, every man, all men, the present King of England, the present King of France, the centre of mass of the Solar System at the first instant of the twentieth century, the revolution of the earth round the sun, the revolution of the sun round the earth. Thus a phrase is denoting solely in virtue of its form. (Note: Denoting phrases include both definite descriptions and proper names.)
We may distinguish three cases: (1) A phrase may be denoting, and yet not denote anything; e.g., “the present King of France.” (2) A phrase may denote one definite object; e.g., “the present King of England” denotes a certain man. (3) A phrase may denote ambiguously; e.g., “a man” denotes not many men, but an ambiguous man.
The interpretation of such phrases is a matter of considerable difficulty; indeed, it is very hard to frame any theory not susceptible of formal refutation. All the difficulties with which I am acquainted are met, so far as I can discover, by the theory which I am about to explain.

The subject of denoting is of very great importance, not only in logic and mathematics, but also in theory of knowledge. For example, we know that the centre of mass of the Solar System at a definite instant is some definite point, and we can affirm a number of propositions about it; but we have no immediate acquaintance with this point, which is only known to us by description.
The distinction between acquaintance and knowledge about is the distinction between the things we have presentations of, and the things we only reach by means of denoting phrases.
It often happens that we know that a certain phrase denotes unambiguously, although we have no acquaintance with what it denotes; this occurs in the above case of the centre of mass. In perception we have acquaintance with objects of perception, and in thought we have acquaintance with objects of a more abstract logical character; but we do not necessarily have acquaintance with the objects denoted by phrases composed of words with whose meanings we are acquainted. To take a very important instance: There seems no reason to believe that we are ever acquainted with other people’s minds, seeing that these are not directly perceived; hence what we know about them is obtained through denoting. All thinking has to start from acquaintance; but it succeeds in thinking about many things with which we have no acquaintance.

My theory, briefly, is as follows.
I take the notion of the variable as fundamental; I use “C (x)” to mean a proposition in which x is a constituent, where x, the variable, is essentially and wholly undetermined.
Then we can consider the two notions “C (x) is always true” and “C (x) is sometimes true.” Then everything and nothing and something (which are the most primitive of denoting phrases) are to be interpreted as follows:
C (everything) means “C (x) is always true”;
C (nothing) means “ ‘C (x) is false’ is always true”;
C (something) means “It is false that ‘C (x) is false’ is always true.” 

Here the notion “C (x) is always true” is taken as ultimate and indefinable, and the others are defined by means of it. Everything, nothing, and something are not assumed to have any meaning in isolation, but a meaning is assigned to every proposition in which they occur. This is the principle of the theory of denoting I wish to advocate: that denoting phrases never have any meaning in themselves, but that every proposition in whose verbal expression they occur has a meaning.

Consider the proposition “all men are mortal.” This proposition is really hypothetical and states that if anything is a man, it is mortal. That is, it states that if x is a man, x is mortal, whatever x may be. Hence, substituting “x is human” for “x is a man,” we find:
        “All men are mortal” means “ ‘If x is human, x is mortal’ is always true.”

This is what is expressed in symbolic logic by saying that “all men are mortal” means “ ‘x is human’ implies ‘x is mortal’ for all values of x.” More generally, we say:
        “C (all men)” means “ ‘If x is human, then C (x) is true’ is always true.”

“C (no men)” means “ ‘If x is human, then C (x) is false’ is always true.”
“C (some men)” will mean the same as “C (a man),” and
“C (a man)” means “It is false that ‘C (x) and x is human’ is always false.”
“C (every man)” will mean the same as “C (all men).”
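
As an editorial gloss, the readings above can be transcribed into present-day quantifier notation. This is not Russell's own symbolism: H(x) is used here as an assumed abbreviation for "x is human", and "is always true" is rendered by the universal quantifier.

```latex
\begin{align*}
C(\text{everything}) &\;:\; \forall x \, C(x) \\
C(\text{nothing})    &\;:\; \forall x \, \neg C(x) \\
C(\text{something})  &\;:\; \neg \forall x \, \neg C(x) \\
C(\text{all men})    &\;:\; \forall x \, \bigl( H(x) \rightarrow C(x) \bigr) \\
C(\text{no men})     &\;:\; \forall x \, \bigl( H(x) \rightarrow \neg C(x) \bigr) \\
C(\text{a man})      &\;:\; \neg \forall x \, \neg \bigl( C(x) \land H(x) \bigr)
\end{align*}
```

In classical logic the pattern "not always false" is equivalent to the existential quantifier, so the third and sixth lines may also be read as "for some x, C(x)" and "for some x, C(x) and H(x)".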

It remains to interpret phrases containing “the”. These are by far the most interesting and difficult of denoting phrases.
Take as an instance “the father of Charles II was executed.”
This asserts that there was an x who was the father of Charles II and was executed.
Now “the”, when it is strictly used, involves uniqueness; we do, it is true, speak of “the son of So-and-so” even when So-and-so has several sons, but it would be more correct to say “a son of So-and-so.” Thus for our purposes we take “the” as involving uniqueness.
Thus when we say “x was the father of Charles II” we not only assert that x had a certain relation to Charles II, but also that nothing else had this relation.
The relation in question, without the assumption of uniqueness, and without any denoting phrases, is expressed by “x begat Charles II.”
To get an equivalent of “x was the father of Charles II,” we must add “If y is other than x, y did not beget Charles II,” or, what is equivalent, “If y begat Charles II, y is identical with x.”
Hence “x is the father of Charles II” becomes “x begat Charles II; and ‘if y begat Charles II, y is identical with x’ is always true of y.”

Thus “the father of Charles II was executed” becomes: “It is not always false of x that x begat Charles II and that x was executed and that ‘if y begat Charles II, y is identical with x’ is always true of y.”
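
In present-day notation (again an editorial gloss rather than Russell's symbolism), with B(x) assumed to abbreviate "x begat Charles II" and E(x) "x was executed", the analysis reads:

```latex
% "It is not always false of x that ...":
\neg \forall x \, \neg \Bigl( B(x) \land E(x) \land \forall y \, \bigl( B(y) \rightarrow y = x \bigr) \Bigr)

% Equivalently, using the existential quantifier:
\exists x \, \Bigl( B(x) \land E(x) \land \forall y \, \bigl( B(y) \rightarrow y = x \bigr) \Bigr)
```

The first conjunct asserts existence, the last asserts uniqueness, and the middle one carries the predication. The phrase "the father of Charles II" corresponds to no single part of the formula.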

This may seem a somewhat incredible interpretation; but I am not at present giving reasons, I am merely stating the theory.

To interpret “C (the father of Charles II),” where C stands for any statement about him, we have only to substitute C (x) for “x was executed” in the above. Observe that, according to the above interpretation, whatever statement C may be, “C (the father of Charles II)” implies:
        “It is not always false of x that ‘if y begat Charles II, y is identical with x’ is always true of y,” which is what is expressed in common language by “Charles II had one father and no more.” Consequently if this condition fails, every proposition of the form “C (the father of Charles II)” is false. Thus e.g. every proposition of the form “C (the present King of France)” is false. This is a great advantage to the present theory. I shall show later that it is not contrary to the law of contradiction, as might be at first supposed.

The above gives a reduction of all propositions in which denoting phrases occur to forms in which no such phrases occur. Why it is imperative to effect such a reduction, the subsequent discussion will endeavor to show.

The evidence for the above theory is derived from the difficulties which seem unavoidable if we regard denoting phrases as standing for genuine constituents of the propositions in whose verbal expressions they occur. Of the possible theories which admit such constituents the simplest is that of Meinong. This theory regards any grammatically correct denoting phrase as standing for an object. Thus “the present King of France,” “the round square,” etc., are supposed to be genuine objects. It is admitted that such objects do not subsist, but nevertheless they are supposed to be objects. This is in itself a difficult view; but the chief objection is that such objects, admittedly, are apt to infringe the law of contradiction. It is contended, for example, that the existent present King of France exists, and also does not exist; that the round square is round, and also not round, etc. But this is intolerable; and if any theory can be found to avoid this result, it is surely to be preferred.

The above breach of the law of contradiction is avoided by Frege’s theory. He distinguishes, in a denoting phrase, two elements, which we may call the meaning and the denotation. Thus “the centre of mass of the Solar System at the beginning of the twentieth century” is highly complex in meaning, but its denotation is a certain point, which is simple. The Solar System, the twentieth century, etc., are constituents of the meaning; but the denotation has no constituents at all. One advantage of this distinction is that it shows why it is often worth while to assert identity. If we say “Scott is the author of Waverley,” we assert an identity of denotation with a difference of meaning.

One of the first difficulties that confront us, when we adopt the view that denoting phrases express a meaning and denote a denotation, concerns the cases in which the denotation appears to be absent.
If we say “the King of England is bald,” that is, it would seem, not a statement about the complex meaning “the King of England,” but about the actual man denoted by the meaning.
But now consider “the King of France is bald.” By parity of form, this also ought to be about the denotation of the phrase “the King of France.” But this phrase, though it has a meaning provided “the King of England” has a meaning, has certainly no denotation, at least in any obvious sense. Hence one would suppose that “the King of France is bald” ought to be nonsense; but it is not nonsense, since it is plainly false.

Thus we must either provide a denotation in cases in which it is at first sight absent, or we must abandon the view that denotation is what is concerned in propositions which contain denoting phrases. The latter is the course that I advocate. The former course may be taken, as by Meinong, by admitting objects which do not subsist, and denying that they obey the law of contradiction; this, however, is to be avoided if possible. Another way of taking the same course (so far as our present alternative is concerned) is adopted by Frege, who provides by definition some purely conventional denotation for the cases in which otherwise there would be none. Thus “the King of France” is to denote the null-class. But this procedure, though it may not lead to actual logical error, is plainly artificial, and does not give an exact analysis of the matter. Thus if we allow that denoting phrases, in general, have the two sides of meaning and denotation, the cases where there seems to be no denotation cause difficulties both on the assumption that there really is a denotation and on the assumption that there really is none.

That the meaning is relevant when a denoting phrase occurs in a proposition is formally proved by the puzzle about the author of Waverley. The proposition “Scott was the author of Waverley” has a property not possessed by “Scott was Scott,” namely the property that George IV wished to know whether it was true. Thus the two are not identical propositions; hence the meaning of “the author of Waverley” must be relevant as well as the denotation, if we adhere to the point of view to which this distinction belongs. Yet, as we have just seen, so long as we adhere to this point of view, we are compelled to hold that only the denotation is relevant.

According to the view which I advocate, a denoting phrase is essentially part of a sentence, and does not, like most single words, have any significance on its own account. If I say “Scott was a man,” that is a statement of the form “x was a man,” and it has “Scott” for its subject. But if I say “the author of Waverley was a man,” that is not a statement of the form “x was a man,” and does not have “the author of Waverley” for its subject. Abbreviating the statement made at the beginning of this article, we may put, in place of “the author of Waverley was a man,” the following: “One and only one entity wrote Waverley, and that one was a man.” (This is not so strictly what is meant as what was said earlier; but it is easier to follow.) And speaking generally, suppose we wish to say that the author of Waverley had the property φ, what we wish to say is equivalent to “One and only one entity wrote Waverley, and that one had the property φ.”

The explanation of denotation is now as follows. Every proposition in which “the author of Waverley” occurs being explained as above, the proposition “Scott was the author of Waverley” (i.e. “Scott was identical with the author of Waverley”) becomes “One and only one entity wrote Waverley, and Scott was identical with that one”; or, reverting to the wholly explicit form: “It is not always false of x that x wrote Waverley, that it is always true of y that if y wrote Waverley y is identical with x, and that Scott is identical with x.” Thus if “C” is a denoting phrase, it may happen that there is one entity x (there cannot be more than one) for which the proposition “x is identical with C” is true, this proposition being interpreted as above. We may then say that the entity x is the denotation of the phrase “C.” Thus Scott is the denotation of “the author of Waverley.” The “C” in inverted commas will be merely the phrase, not anything that can be called the meaning. The phrase per se has no meaning, because in any proposition in which it occurs the proposition, fully expressed, does not contain the phrase, which has been broken up.

The puzzle about George IV’s curiosity is now seen to have a very simple solution. The proposition “Scott was the author of Waverley,” which was written out in its unabbreviated form in the preceding paragraph, does not contain any constituent “the author of Waverley” for which we could substitute “Scott.” This does not interfere with the truth of inferences resulting from making what is verbally the substitution of “Scott” for “the author of Waverley,” so long as “the author of Waverley” has what I call a primary occurrence in the proposition considered. The difference of primary and secondary occurrences of denoting phrases is as follows:

When we say: “George IV wished to know whether so-and-so,” or when we say “So-and-so is surprising” or “So-and-so is true,” etc., the “so-and-so” must be a proposition. Suppose now that “so-and-so” contains a denoting phrase. We may either eliminate this denoting phrase from the subordinate proposition “so-and-so,” or from the whole proposition in which “so-and-so” is a mere constituent. Different propositions result according to which we do. When we say “George IV wished to know whether Scott was the author of Waverley,” we normally mean “George IV wished to know whether one and only one man wrote Waverley and Scott was that man”; but we may also mean: “One and only one man wrote Waverley, and George IV wished to know whether Scott was that man.” In the latter, “the author of Waverley” has a primary occurrence; in the former, a secondary. The latter might be expressed by “George IV wished to know, concerning the man who in fact wrote Waverley, whether he was Scott.” This would be true, for example, if George IV had seen Scott at a distance, and had asked “Is that Scott?” A secondary occurrence of a denoting phrase may be defined as one in which the phrase occurs in a proposition p which is a mere constituent of the proposition we are considering, and the substitution for the denoting phrase is to be effected in p, and not in the whole proposition concerned. The ambiguity as between primary and secondary occurrences is hard to avoid in language; but it does no harm if we are on our guard against it. In symbolic logic it is of course easily avoided.

The distinction of primary and secondary occurrences also enables us to deal with the question whether the present King of France is bald or not bald, and generally with the logical status of denoting phrases that denote nothing. If “C” is a denoting phrase, say “the term having the property F,” then
“C has the property φ” means “one and only one term has the property F, and that one has the property φ.”

If now the property F belongs to no terms, or to several, it follows that “C has the property φ” is false for all values of φ.
Thus “the present King of France is not bald” is false if it means “There is an entity which is now King of France and is not bald,”
but is true if it means “It is false that there is an entity which is now King of France and is bald.”

That is, “the King of France is not bald” is false if the occurrence of “the King of France” is primary, and true if it is secondary. Thus all propositions in which “the King of France” has a primary occurrence are false; the denials of such propositions are true, but in them “the King of France” has a secondary occurrence. Thus we escape the conclusion that the King of France has a wig.
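
The two readings can be compared on a toy model. The following Python sketch is an illustration only: the three-element domain, the empty set of Kings of France, and the helper name the are all assumptions made for the example.

```python
def the(domain, F, phi):
    """'The term having the property F has the property phi'.

    True iff one and only one member of `domain` has F, and that one has phi.
    """
    satisfiers = [x for x in domain if F(x)]
    return len(satisfiers) == 1 and phi(satisfiers[0])


# Hypothetical toy domain: nobody in it is at present King of France.
domain = ["a", "b", "c"]
kings_of_france = set()
bald_things = {"a"}   # arbitrary illustrative choice


def is_king_of_france(x):
    return x in kings_of_france


def is_bald(x):
    return x in bald_things


# Primary occurrence: "there is an entity which is now King of France and is not bald".
primary = the(domain, is_king_of_france, lambda x: not is_bald(x))

# Secondary occurrence: "it is false that there is an entity which is now
# King of France and is bald".
secondary = not the(domain, is_king_of_france, is_bald)

print(primary, secondary)
```

With an empty set of kings the primary reading comes out false and the secondary reading true, whatever is chosen for bald_things.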

The whole realm of non-entities, such as “the round square,” “the even prime other than 2,” “Apollo,” “Hamlet,” etc., can now be satisfactorily dealt with.
All these are denoting phrases which do not denote anything.
A proposition about Apollo means what we get by substituting what the classical dictionary tells us is meant by Apollo, say “the sun-god.” All propositions in which Apollo occurs are to be interpreted by the above rules for denoting phrases. If “Apollo” has a primary occurrence, the proposition containing the occurrence is false; if the occurrence is secondary, the proposition may be true. So again “the round square is round” means “there is one and only one entity x which is round and square, and that entity is round,” which is a false proposition, not, as Meinong maintains, a true one.

“The most perfect Being has all perfections; existence is a perfection; therefore the most perfect Being exists” becomes:
“There is one and only one entity x which is most perfect; that one has all perfections; existence is a perfection; therefore that one exists.”
As a proof, this fails for want of a proof of the premiss “there is one and only one entity x which is most perfect.”

Mr. MacColl (Mind, N.S., No. 54, and again No. 55, page 401) regards individuals as of two sorts, real and unreal; hence he defines the null-class as the class consisting of all unreal individuals. This assumes that such phrases as “the present King of France,” which do not denote a real individual, do, nevertheless, denote an individual, but an unreal one. This is essentially Meinong’s theory, which we have seen reason to reject because it conflicts with the law of contradiction. With our theory of denoting, we are able to hold that there are no unreal individuals; so that the null-class is the class containing no members, not the class containing as members all unreal individuals.

It is important to observe the effect of our theory on the interpretation of definitions which proceed by means of denoting phrases. Most mathematical definitions are of this sort; for example “m − n means the number which, added to n, gives m.” Thus m − n is defined as meaning the same as a certain denoting phrase; but we agreed that denoting phrases have no meaning in isolation. Thus what the definition really ought to be is: “Any proposition containing m − n is to mean the proposition which results from substituting for ‘m − n’ ‘the number which, added to n, gives m.’ ” The resulting proposition is interpreted according to the rules already given for interpreting propositions whose verbal expression contains a denoting phrase. In the case where m and n are such that there is one and only one number x which, added to n, gives m, there is a number x which can be substituted for m − n in any proposition containing m − n without altering the truth or falsehood of the proposition. But in other cases, all propositions in which “m − n” has a primary occurrence are false.

The usefulness of identity is explained by the above theory. No one outside of a logic-book ever wishes to say “x is x,” and yet assertions of identity are often made in such forms as “Scott was the author of Waverley” or “thou art the man.” The meaning of such propositions cannot be stated without the notion of identity, although they are not simply statements that Scott is identical with another term, the author of Waverley, or that thou art identical with another term, the man. The shortest statement of “Scott is the author of Waverley” seems to be “Scott wrote Waverley; and it is always true of y that if y wrote Waverley, y is identical with Scott.” It is in this way that identity enters into “Scott is the author of Waverley”; and it is owing to such uses that identity is worth affirming.

One interesting result of the above theory of denoting is this: when there is anything with which we do not have immediate acquaintance, but only definition by denoting phrases, then the propositions in which this thing is introduced by means of a denoting phrase do not really contain this thing as a constituent, but contain instead the constituents expressed by the several words of the denoting phrase. Thus in every proposition that we can apprehend (i.e. not only in those whose truth or falsehood we can judge of, but in all that we can think about), all the constituents are really entities with which we have immediate acquaintance. Now such things as matter (in the sense in which matter occurs in physics) and the minds of other people are known to us only by denoting phrases, i.e. we are not acquainted with them, but we know them as what has such and such properties. Hence, although we can form propositional functions C (x) which must hold of such and such a material particle, or of So-and-so’s mind, yet we are not acquainted with the propositions which affirm these things that we know must be true, because we cannot apprehend the actual entities concerned. What we know is “So-and-so has a mind which has such and such properties” but we do not know “A has such and such properties,” where A is the mind in question. In such a case, we know the properties of a thing without having acquaintance with the thing itself, and without, consequently, knowing any single proposition of which the thing itself is a constituent.


Russell’s theory of descriptions

In his article On Denoting of 1905, Russell worked out a general theory of the meaning of definite descriptions, which would take care both of the cases where there was some object answering to the description (as in ‘the man who discovered oxygen’) and of the cases where the description was vacuous (as in ‘the present King of France’).

Russell saw that it was necessary to give an account of the meaning of vacuous expressions such as ‘the present King of France’ when they occurred, for instance, in the sentence ‘the present King of France is bald’. Russell called expressions like ‘the man who discovered oxygen’ and ‘the present King of France’ by the name ‘definite description’.

Frege had treated definite descriptions simply as complex names, so that ‘The author of Hamlet was a genius’ had the same logical structure as ‘Shakespeare was a genius’. This meant that he had to provide for arbitrary rules to be laid down in order to ensure that a sentence containing an empty name or vacuous definite description did not lack a truth-value. Russell thought this was unsatisfactory, and proposed to analyse sentences containing definite descriptions quite differently from those containing names. It is a mistake, he believed, to look for the meaning of definite descriptions in themselves; only the propositions in whose verbal expression they occur have a meaning.

For Russell there is a big difference between a sentence such as ‘James II was deposed’ (containing the name ‘James II’) and a sentence such as ‘The brother of Charles II was deposed’. An expression such as ‘The brother of Charles II’ has no meaning in isolation; but the sentence ‘The brother of Charles II was deposed’ has a meaning none the less. It asserts three things:
a) that some individual was brother to Charles II
b) that only this individual was brother to Charles II
c) that this individual was deposed.

Or, more formally:
For some x, (a) x was brother to Charles II
and (b) for all y, if y was brother to Charles II, y=x
and (c) x was deposed.

The first element of this formulation says that at least one individual was a brother of Charles II, the second that at most one individual was a brother of Charles, so that between them they say that exactly one individual was brother to Charles II. The third element goes on to say that that unique individual was deposed.
In the analyzed sentence nothing appears which looks like a name of James II; instead, we have a combination of predicates and quantifiers.
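
The three-part analysis can be checked mechanically over a small finite domain. The following Python sketch is only an illustration: the function name the_F_is_G, the toy domain, and the extensions given to the predicates are assumptions introduced here, not part of Russell's text.

```python
def the_F_is_G(domain, F, G):
    """Evaluate 'The F is G' on Russell's analysis over a finite domain.

    True iff some x in the domain satisfies all of:
      (a) F(x),
      (b) every y with F(y) is identical with x,
      (c) G(x).
    """
    return any(
        F(x) and all((not F(y)) or y == x for y in domain) and G(x)
        for x in domain
    )


# Hypothetical toy model (an assumption for illustration only).
domain = ["James II", "Charles II", "William III"]
brothers_of_charles = {"James II"}
deposed = {"James II"}

is_brother = lambda x: x in brothers_of_charles
was_deposed = lambda x: x in deposed

# Exactly one individual satisfies the description, and he was deposed.
unique_case = the_F_is_G(domain, is_brother, was_deposed)

# Nothing satisfies the description: clause (a) fails, so the sentence is false.
vacuous_case = the_F_is_G(domain, lambda x: False, was_deposed)

# Two individuals satisfy the description: clause (b) fails.
non_unique_case = the_F_is_G(domain, lambda x: True, was_deposed)

print(unique_case, vacuous_case, non_unique_case)
```

Note that the evaluation never looks up a name for James II; it only tests predicates under quantifiers, which is the point of the analysis.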

What is the point of this complicated analysis? To see this we have to consider a sentence which, unlike ‘The brother of Charles II was deposed’, is not true.

Consider the following two sentences:
(1) The sovereign of the United Kingdom is male.
(2) The sovereign of the United States is male.

Neither of these sentences is true, but the reason differs in the two cases.
Everyone would agree that the first sentence is not true, but plain false, because the sovereign of the United Kingdom (i.e. Queen Elizabeth II) is female.
The second fails to be true because the US has no sovereign, and on Russell’s view this second sentence is not just untrue but positively false; consequently its negation, ‘It is not the case that the sovereign of the US is male’, is true.

(Editorial Note: Although sentence (1) and sentence (2) are both false, sentence (1) obeys the law of excluded middle while sentence (2) appears to violate it. The law states that for any proposition, either that proposition is true or its negation is true. If the negation of (1) is read as ‘The sovereign of the United Kingdom is female’, it comes out true, so (1) obeys the law. If the negation of (2) is read in the same way, as ‘The sovereign of the United States is female’, it remains false, so (2) appears to violate the law.)

(Bertrand Russell's On Denoting: By the law of the excluded middle, either “A is B” or “A is not B” must be true. Hence either “the present King of France is bald” or “the present King of France is not bald” must be true. Yet if we enumerated the things that are bald, and then the things that are not bald, we should not find the present King of France in either list. Hegelians, who love a synthesis, will probably conclude that he wears a wig.)

According to Russell, a sentence containing an empty name, i.e. an apparent name which names no object, such as ‘Slawkenburgius was a genius,’ is not really a sentence at all, and is therefore neither true nor false, since there was never anyone of whom ‘Slawkenburgius’ was the proper name.

Why did Russell want to ensure that sentences containing vacuous definite descriptions should count as false? He was, like Frege, interested in constructing a precise and scientific language for purposes of logic and mathematics. Both Frege and Russell regarded it as essential that such a language should contain only expressions which had a definite sense, by which they meant that all sentences in which the expressions could occur should have a truth-value. For if we allow into our system sentences lacking truth-value, then inference and deduction become impossible. It is easy enough to recognize that ‘the round square’ denotes nothing, because it is obviously self-contradictory. But prior to investigation it may not be clear whether some complicated mathematical formula contains a hidden contradiction. And if it does so, we will not be able to discover this by logical investigation unless sentences containing it are assured of a truth-value.

The Theory of Descriptions

According to the theory of descriptions, we must draw a distinction between proper names and what Russell called “definite descriptions”. A definite description is a phrase containing the word “the” in the singular, and it can be used to mention, refer to, or pick out exactly one person, thing, or place. A proper name seems to have the same function as a definite description; it always picks out or denotes a particular individual, and the individual it picks out is its meaning.
Thus, in the sentence, “Clinton is tall”, the term “Clinton” means the actual person, Clinton. Though definite descriptions and proper names may sometimes pick out the same individual or place, Russell argued that their logical functions are entirely disparate. Thus, a speaker who in 1996 asserted, “The President of the United States is tall”, might be using the definite description “the President of the United States” to refer to Bill Clinton, but that phrase is not Clinton’s name; it could be used on different occasions to refer to different individuals. If Clinton had been replaced as president in 1996 by another tall person, that phrase would refer to someone other than Clinton. Indeed, descriptive phrases can be used meaningfully without picking out anything. “The greatest natural number” does not – indeed cannot – pick out anything, since there is a strict proof that no such number can exist. “The present king of France”, if intended to refer to a twentieth-century monarch, also lacks a referent.

According to Russell, certain apparent names are not real names but abbreviated descriptions. “Hamlet”, “Medusa”, “Santa Claus”, and so on fall into this category; they are not the names of persons but appear in history via stories or literary accounts. In his play, Shakespeare gives us a description of a certain character; thus, in that play, the apparent name “Hamlet” is an abbreviation for a longer phrase such as “the main character in a play called Hamlet by William Shakespeare”. Once the distinction between proper names and descriptions is made, it can be demonstrated that sentences containing proper names and sentences containing descriptions (including apparent names) mean different things. This can be shown by translating the respective sentences into an ideal language, such as that of Principia mathematica, where the difference becomes perspicuous.

In the Principia, the rendering of sentences containing proper names and those containing definite descriptions takes a purely symbolic form. But the difference can also be expressed in English (which again shows how logic can capture the subtleties of ordinary discourse).
Thus, “Bill Clinton is tall”, is of the logical form “Fa”. This is a singular sentence, containing a logical constant “a” that stands for a proper name and a predicate term “F” that stands for a property. When the constant and the predicate are given descriptive meanings, as in the sentence “Clinton is tall”, both sentences ascribe a certain property to a particular individual. Both are thus logically singular sentences.
They can be contrasted with “The present king of France is tall”, which is grammatically a singular sentence but when translated into the notation of Principia is not of the form “Fa”. Rather, in English, it has the same meaning as “At least one person is a male monarch of France, at most one person is a male monarch of France, and whoever is a male monarch of France is tall”. It is thus not logically a singular sentence but a complex general one. In symbolic notation, it is expressed as a conjunction of three sentences, one of them asserting the existence of a French monarch:

1. (∃x)(MFx) (At least one thing is a male monarch of France).
2. (x)(y)((MFx . MFy) ⊃ (x = y)) (At most one thing is a male monarch of France).
3. (x)(MFx ⊃ Fx) (Whoever is a male monarch of France is tall).

In the English sentence, “The present king of France is tall”, the word “the” expresses singularity, referring to one object as monarch of France. Singularity (the concept of “the”) is captured by sentences (1) and (2). To say that one and only one object is king of France is to say that at least one such object exists and also that not more than one does. If there is such an object, then (1) and (2) are true; if the object has the property ascribed to it, then the whole sentence, “The present king of France is tall”, is true. If there is no such object, then (1) is false, and then “The present king of France is tall” is false. But if either true or false, it is necessarily meaningful. The combination of “at least one” and “not more than one” is equivalent to the notion of exactly one. This shows both how powerful and subtle an ideal logical language can be.

The use of a formal language to distinguish sentences containing names from those containing descriptions has a number of important implications for philosophy.
First, it shows that an ideal language not only can articulate the ordinary sentences of natural languages but also reveal distinctions that such languages conceal.
Second, this fact implies that one must distinguish surface grammar from a deeper logical grammar that expresses the real meaning of such sentences. According to this deeper grammar, definite descriptions are not names, and sentences containing definite descriptions are not singular but general.
This finding has direct philosophical import. For example, it clears up the puzzle of how an individual can consistently deny the existence of something. Suppose an atheist says, “God does not exist”. It would seem that the atheist is presupposing in these very words that there exists something, a God, that does not exist; the atheist seems to be contradicting him- or herself. Russell shows that in this sentence, “God” is not a name but an abbreviated description for (on a Judeo-Christian conception) “the x that is all powerful, all wise, and wholly benevolent”. The atheist’s sentence can now be read as saying: “There is nothing that is all powerful, all wise, and wholly benevolent”. The sentence thus allows a philosophical position to be expressed without falling into inconsistency. This result has similar implications for skepticism, as it allows a radical skeptic to deny that knowledge is attainable without presupposing that there is such a thing as knowledge.

Third, in the preceding analysis of the sentence, “The present king of France is tall”, the phrase “The present king of France” no longer appears as a single unit in any of the three sentences that taken together give its meaning. This means that the phrase “The present king of France” has been eliminated and replaced by a complex of quantifiers, variables, and predicates. If it were a proper name, it would not be eliminable. Because they are eliminable, definite descriptions are called “incomplete symbols” by Russell. His theory of descriptions is thus a theory about the nature and function of incomplete symbols. Finally, each of the analyzing sentences is a general sentence and each is meaningful. This fact is key to understanding how a sentence whose subject term lacks a referent can be meaningful.

In the light of the preceding account, we can summarize Russell’s objection to Meinong’s argument. Meinong essentially confused definite descriptions and names. Once “The present king of France” is seen to be a description, then the phrase need not refer to anything. Therefore, from the fact that a sentence containing the phrase is meaningful, it does not follow that its grammatical subject denotes anything. There is thus no need to posit the existence or subsistence of such entities as the present king of France, Hamlet, Medusa, or Santa Claus.


Russell’s theory of descriptions

In a simple subject-predicate statement such as “Socrates is wise,” Russell observed, there seems to be something referred to (Socrates) and something said about it (that he is wise). If the proper name in such a sentence is replaced by a “definite description”—as in the statement “The president of the United States is wise”—there is apparently still something referred to and something said about it. A problem arises, however, when nothing fits the description, as in the statement “The present king of France is bald.” Although there is apparently nothing for the statement to be about, one nevertheless understands what it says. Prior to Russell’s work on definite descriptions, some philosophers—most notably Alexius Meinong (1853–1920)—felt forced by such examples to conclude that, in addition to things that have real existence, there are things that have some other sort of existence, for such statements could not be understood unless there was something for them to be about.

In Russell’s view, philosophers like Meinong had been misled by the surface grammatical form of sentences containing definite descriptions. Although they treated them as if they were simple subject-predicate statements, in reality they were much more complex. Upon analysis, the statement “The present king of France is bald” is shown to be a complex conjunction of other statements. Rendered in symbolic logic, these statements are:
(i) (∃x)(Fx), or “There is a present king of France”;
(ii) (∀y)(Fy → y=x), or “There is at most one present king of France”; and
(iii) (∀x)(Fx → Bx), or “If anyone is a present king of France, he is bald.”
More important, each of the three component statements is general, in the sense that it does not refer to anything or anyone in particular. Thus, there is no phrase in the complete analysis equivalent to “the present king of France,” which shows that the phrase is not an expression, like a proper name, that refers to something as the thing that the whole statement is about. There is no need, therefore, to make Meinong’s distinction between things that have real existence and things that have some other kind of existence.

Because descriptions do not refer directly to things in the world, however, there must be some other way in which such a direct connection between language and the world is made. In search of this connection, Russell turned his attention to proper names. The name Aristotle, for example, does not seem to carry any descriptive content. But Russell argues, on the contrary, that ordinary names are really concealed definite descriptions (Aristotle may simply mean “The student of Plato who taught Alexander, wrote the Metaphysics, etc.”). If a name had no descriptive content, one could not sensibly ask about the existence of its bearer, for one could then not understand what is expressed by a statement involving it. If Russell were a name in this sense (without any descriptive content), then merely to understand the statement “Russell exists” or the statement “Russell does not exist” presupposes that one already knows what Russell refers to. But then there cannot be any genuine question about Russell’s existence, for just to understand the question one must know the thing to which the name refers. Ordinary proper names, however—Russell, Homer, Aristotle, and Santa Claus—as Russell pointed out, are such that it makes sense to question the existence of their bearers. Thus, ordinary names must be concealed descriptions and cannot be the means of directly referring to the particular things in the world.

Russell eventually concluded that things in the world can be talked about only through the medium of a special kind of name—in particular, one about which no question can arise whether it names something or not—and he suggested that in English the only possible candidates are the demonstrative pronouns this and that.

At this point in his thinking, Russell shifted from questions about the nature of language to questions about the nature of the world. He asked what sort of thing it is that can be named in the strict logical sense, that can be known and talked about, and from which one can learn about the world. The important restriction was that no question about whether it exists or not can arise. Ordinary physical objects and other people seemed not to fit this requirement.

In his search for something whose existence cannot be questioned, Russell hit upon present experience and, in particular, upon sense data: one can question whether one is really seeing some physical object—whether, for example, there is a desk before one—but one cannot question that one is having visual impressions or sense data. Thus, what a person can name in the strict logical sense and what things in the world he can refer to directly turn out to be elements of his present experience. Russell therefore made a distinction between what can be known by acquaintance and what can be known only by description—i.e., between things whose existence cannot be doubted and things about whose existence, at least theoretically, doubt can be raised. What is novel about Russell’s conclusion is that it was arrived at from a fairly technical analysis of language. To be directly acquainted with something is to be in a position to give it a name in the strict logical sense, and to know something only by description is to know only that there is something that the description uniquely fits.

Russell was not constant in his view about physical objects. At one point he thought that the observer must infer their existence as the best hypothesis to explain the observer’s experience. Later he held that they were “logical constructions” out of sense data.

Russell’s treatment of definite descriptions

Russell represented sentences of the form ‘The F is G’, e.g. ‘The present King of France is bald’, as conjoining three claims:
(1) There is at least one F (e.g. there is at least one present King of France).
(2) There is at most one F (e.g. there is at most one present King of France).
(3) Whatever is F is G (e.g. whatever is present King of France is bald).

In modern logical notation, the analysis becomes:
(4) (∃x) [Fx & (∀y) (Fy → x = y) & Gx].

By systematising a statement’s inferential (or, more broadly, logical) behaviour, the representation of the statement in a favoured logical system shows the (or, perhaps, a) logical form of the statement. Russell’s treatment provided a model on which a definite description that fails to apply to exactly one individual may be meaningful and so provided a potential solution to old problems about the functioning of talk that purports to make reference to particular non-existents. On Russell’s account, the fact that there is no present King of France makes the sentence ‘The King of France is bald’ false, rather than meaningless, due to its making false the first clause in his analysis. His treatment also made especially evident that the logical form of a statement might not be obvious from its superficial form. (However, the space between logical and superficial form involved in Russell’s treatment of ‘The F is G’ via (4) is an artefact of Russell’s favoured logic. An alternative, though slightly less perspicuous, treatment is given in (5):
(5) [The x: Fx] (Gx).)

Russell’s treatment of definite descriptions showed that philosophical progress could be made by discerning the (or a) logical form of a philosophically problematic range of statements and that some philosophical disputes are usefully viewed as (at least in part) concerning how best to represent the logical forms of statements involved in those disputes. Together with the new treatment of quantification more generally, it became a model for a variety of approaches to philosophical problems that involved attention to the forms of language used in the statement of those problems. For it supported the view that philosophical problems can arise due to the misleading superficial forms of the language we use and provided a model for how problems that arise in that way might be solved through uncovering the true logical forms of the statements involved.

Russell’s Theory of Definite Descriptions

Russell’s philosophical method has at its core the making and testing of hypotheses through the weighing of evidence. Hence Russell’s comment that he wished to emphasize the “scientific method” in philosophy. His method also requires the rigorous analysis of problematic propositions using the machinery of first-order logic. It was Russell’s belief that by using the new logic of his day, philosophers would be able to exhibit the underlying “logical form” of natural-language statements. A statement’s logical form, in turn, would help resolve various problems of reference associated with the ambiguity and vagueness of natural language.

Since the introduction of the modern predicate calculus, it has been common to use three separate logical notations (“Px”, “x = y”, and “∃x”) to represent three separate senses of the natural-language word “is”: the is of predication, e.g. “Cicero is wise”; the is of identity, e.g. “Cicero is Tully”; and the is of existence, e.g. “Cicero is”. It was Russell’s suggestion that, just as we use logic to make clear these distinctions, we can also use logic to discover other ontologically significant distinctions, distinctions that should be reflected in the analysis we give of each sentence’s correct logical form.

On Russell’s view, the subject matter of philosophy is then distinguished from that of the sciences only by the generality and a prioricity of philosophical statements, not by the underlying methodology of the discipline. In philosophy, just as in mathematics, Russell believed that it was by applying logical machinery and insights that advances in analysis would be made.

Russell’s most famous example of his new “analytic method” concerns so-called denoting phrases, phrases that include both definite descriptions and proper names. Like Alexius Meinong, Russell had initially adopted the view that every denoting phrase (for example, “Scott,” “the author of Waverley,” “the number two,” “the golden mountain”) denoted, or referred to, an existing entity. On this view, even fictional and imaginary entities had to be real in order to serve as truth-makers for true sentences such as “Unicorns have exactly one horn.” By the time his landmark article, “On Denoting,” appeared in 1905, Russell had modified his extreme realism, substituting in its place the view that denoting phrases need not possess a theoretical unity. As Russell puts it, the assumption that every denoting phrase must refer to an existing entity was the type of assumption that exhibited “a failure of that feeling for reality which ought to be preserved even in the most abstract studies” (Introduction to Mathematical Philosophy, 165).

While logically proper names (words such as “this” or “that” which refer to sensations of which an agent is immediately aware) do have referents associated with them, descriptive phrases (such as “the smallest number less than pi”) should be viewed merely as collections of quantifiers (such as “all” and “some”) and propositional functions (such as “x is a number”). As such, they are not to be viewed as referring terms but, rather, as “incomplete symbols.” In other words, they are to be viewed as symbols that take on meaning within appropriate contexts, but that remain meaningless in isolation.

Put another way, it was Russell’s insight that some phrases may contribute to the meaning (or reference) of a sentence without themselves being meaningful. For example:
The descriptive phrase “The author of Waverley” means nothing.
As he explains:
If “the author of Waverley” meant anything other than “Scott”, “Scott is the author of Waverley” would be false, which it is not.
If “the author of Waverley” meant “Scott”, “Scott is the author of Waverley” would be a tautology, which it is not.
Therefore, “the author of Waverley” means neither “Scott” nor anything else – i.e. “the author of Waverley” means nothing.

If Russell is correct, it follows that
the definite description “The present King of France”, in the sentence (1) The present King of France is bald, plays a role quite different from the role a proper name such as “Scott” plays in the sentence (2) Scott is bald.

Letting K abbreviate the predicate “is a present King of France” and B abbreviate the predicate “is bald,” Russell assigns sentence (1) the logical form
(1′) There is an x such that
i. Kx,
ii. for any y, if Ky then y=x, and
iii. Bx.

Alternatively, in the notation of the predicate calculus, we write
(1″) ∃x [(Kx & ∀y (Ky → y=x)) & Bx].

In contrast, by allowing s to abbreviate the name “Scott,” Russell assigns sentence (2) the very different logical form
(2′) Bs.

This distinction between logical forms allows Russell to explain three important puzzles.

The first concerns the operation of the Law of Excluded Middle and how this law relates to denoting terms. According to one reading of the Law of Excluded Middle, it must be the case that either “The present King of France is bald” is true or “The present King of France is not bald” is true. But if so, both sentences appear to entail the existence of a present King of France, clearly an undesirable result, given that France is a republic and so has no king. Russell’s analysis shows how this conclusion can be avoided. By appealing to analysis (1′′), it follows that there is a way to deny (1) without being committed to the existence of a present King of France, namely by changing the scope of the negation operator and thereby accepting that “It is not the case that there exists a present King of France who is bald” is true.
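
The difference between the two scopes of negation can be made concrete with a small model check. The Python sketch below is an illustration under stated assumptions: the helper the_F_is_G, the three-element domain, and the extensions assigned to K and B are all hypothetical and chosen only so that nothing is a present King of France.

```python
def the_F_is_G(domain, F, G):
    # Russell's analysis: some x is F, nothing other than x is F, and x is G.
    return any(
        F(x) and all((not F(y)) or y == x for y in domain) and G(x)
        for x in domain
    )


# Hypothetical model in which nothing is a present King of France.
domain = ["a", "b", "c"]
K = lambda x: False        # "is a present King of France"
B = lambda x: x == "a"     # "is bald" (arbitrary toy extension)
not_B = lambda x: not B(x)

# (1) "The present King of France is bald."
affirmative = the_F_is_G(domain, K, B)

# Negation with narrow scope: "there is exactly one King, and he is not bald."
narrow_scope = the_F_is_G(domain, K, not_B)

# Negation with wide scope: "it is not the case that (1)."
wide_scope = not the_F_is_G(domain, K, B)

print(affirmative, narrow_scope, wide_scope)
```

Under these assumptions the sketch prints False False True: (1) and its narrow-scope "negation" are both false, while the wide-scope negation is true, so exactly one of (1) and its genuine negation holds and the Law of Excluded Middle is respected without any commitment to a King of France.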

The second puzzle concerns the Law of Identity as it operates in (so-called) opaque contexts. Even though “Scott is the author of Waverley” is true, it does not follow that the two referring terms “Scott” and “the author of Waverley” need be interchangeable in every situation. Thus, although “George IV wanted to know whether Scott was the author of Waverley” is true, “George IV wanted to know whether Scott was Scott” is, presumably, false.

Russell’s distinction between the logical forms associated with the use of proper names and definite descriptions again shows why this is so. To see this, we once again let s abbreviate the name “Scott.” We also let w abbreviate “Waverley” and A abbreviate the two-place predicate “is the author of.” It then follows that the sentence
(3) s=s

is not at all equivalent to the sentence
(4) ∃x [(Axw & ∀y (Ayw → y=x)) & x=s].

Sentence (3), for example, is a necessary truth, while sentence (4) is not.
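
One rough way to see the contrast is to evaluate both sentences in more than one model. The Python sketch below treats "necessary" crudely as "true in every listed model", which is a simplifying assumption and not Russell's own account; the function name, the two-element domain, and the second individual are likewise hypothetical.

```python
def scott_is_the_author(domain, wrote_waverley, s):
    # Sentence (4): some x wrote Waverley, nobody else did, and x is identical with s.
    return any(
        wrote_waverley(x)
        and all((not wrote_waverley(y)) or y == x for y in domain)
        and x == s
        for x in domain
    )


domain = ["scott", "someone_else"]
s = "scott"

# Two hypothetical models that differ only in who wrote Waverley.
models = {
    "actual": {"scott"},
    "counterfactual": {"someone_else"},
}

results = {}
for name, authors in models.items():
    identity = (s == s)  # sentence (3)
    description = scott_is_the_author(domain, lambda x: x in authors, s)  # sentence (4)
    results[name] = (identity, description)

print(results)
```

Sentence (3) comes out true in both models, whereas sentence (4) is true in one and false in the other, which is why the two cannot be substituted for each other in a context such as what George IV wanted to know.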

The third puzzle relates to true negative existential claims, such as the claim “The golden mountain does not exist.” Here, once again, by treating definite descriptions as having a logical form distinct from that of proper names, Russell is able to give an account of how a speaker may be committed to the truth of a negative existential without also being committed to the belief that the subject term has reference. That is, the claim that Scott does not exist is false since
(5) ~∃x (x=s)

is self-contradictory. (After all, there must exist at least one thing that is identical to s since it is a logical truth that s is identical to itself!) In contrast, the claim that a golden mountain does not exist may be true since, assuming that G abbreviates the predicate “is golden” and M abbreviates the predicate “is a mountain,” there is nothing contradictory about
(6) ~∃x (Gx & Mx).

Russell’s most important writings relating to his theory of descriptions include not only “On Denoting” (1905), but also The Principles of Mathematics (1903), Principia Mathematica (1910) and Introduction to Mathematical Philosophy (1919). (See too “What is Russell’s Theory of Descriptions?” (Kaplan 1970), “Existence in the Theory of Definite Descriptions” (Kroon 2009), and “The Theory of Descriptions” (Stevens 2011).)



According to Russell, the system of the Principia was an extension of ordinary language in the sense that it could capture its welter of different types of sentences and expose them to an endless set of logical transformations, thus generating new theorems. It also represented a perfection of ordinary language by eliminating ambiguity and vagueness. But above all, because it was an instrument of razor sharpness, it could solve certain enduring philosophical problems.

Through its so-called theory of descriptions, it could explain the invalidity of the ontological argument, which presupposed that existence was a property (or in Russell’s terms, a logical predicate). Thus, in the statements “Tigers growl” and “Tigers exist”, the words “growl” and “exist” have different logical functions. The first means, “Something is a tiger and growls”, and the second means “Something is a tiger”. “Exists” is thus not a real predicate in the way that “growls” is. As the theory of descriptions demonstrates, however, existence functions as part of the apparatus of quantification. Thus, the basic move in the ontological argument – that God is not perfect unless he possesses the property of existing – is fallacious because existing is not a property.

The theory of descriptions was able to resolve two other, deeper issues about existence and identity as well.


From the time of the Greeks on, philosophers had puzzled about the nature of nonbeing without coming to any successful resolution of the issue. The problem can be stated thus: We are able to make significant and indeed sometimes even true statements about “entities” such as Santa Claus, Medusa, Hamlet, Atlantis, and so forth. It is surely true to say “Santa Claus does not exist”. Or, again, when we say, “Hamlet murdered Polonius”, that sentence seems to be true. But according to the standard correspondence theory of truth, a sentence p is true if and only if it corresponds to a particular fact in the world. Yet the world does not contain the fact that Hamlet murdered Polonius, since in reality that putative event never occurred. Moreover, on the most simple and intuitive theory of language, it seems plausible to hold that words obtain their meanings because they correspond to certain sorts of objects. Thus, the word “dog” in the sentence “Some dogs are white” is meaningful because there are objects in the world – namely, dogs – that it picks out or denotes. Yet “Santa Claus”, “Hamlet”, and “Atlantis” all seem to be meaningful, even though they denote no existing things.

In the twentieth century, the problem of nonbeing surfaced in the work of the Austrian logician Alexius Meinong (1853-1920), who advanced the thesis that “there are objects that do not exist”. In 1904, Russell accepted this theory, but by 1907 he had rejected it. Meinong argued that such things as the Fountain of Youth, the present king of France, Santa Claus, and Hamlet – which ordinary people regard as nonexistent – must exist in some sense or another. The special sense he called Bestand (subsistence).
Meinong was led into this position by an argument that can be rephrased as follows: (1) The phrase “the present king of France” is the subject of the sentence, “The present king of France is wise”. (2) Since the sentence “The present king of France is wise” is meaningful, it must be about something – namely, the present king of France. (3) But unless the king of France existed, the sentence would not be about anything and hence would not be meaningful at all, since one of its essential constituents, “The present king of France”, would not be meaningful. (4) Since “The present king of France is wise” is meaningful, it therefore must be about some entity – namely, the present king of France – hence, such an entity must exist or subsist.

For Russell, this argument was not only fallacious but also lacked – as he put it – the “robust sense of reality” that one should expect in good philosophy. Santa Claus is not a creature of flesh and blood, and no object is now or was ever king of France in the twentieth century. The fallacy in the argument was exposed via the theory of descriptions.


The new logic was also able to solve long-standing problems about the nature of sameness or identity. This issue is central to a number of major problems, among them the ancient problem of change that puzzled the Greeks and the problem of personhood that bothered seventeenth- and eighteenth-century thinkers. Frege and Russell independently invented different ingenious solutions to the problem. There is a serious debate within the philosophy of language over which solution is preferable, and each is widely accepted today. Among the important contemporary writers who have contributed to the debate are Quine, Searle, Ruth Marcus, Keith Donnellan, Saul Kripke, Hilary Putnam, and David Kaplan.

Frege presented his solution in a paper, “Über Sinn und Bedeutung” (On Sense and Reference), that was originally published in 1892 and received little recognition in its own time but was rediscovered after World War II and has been influential ever since. Frege begins by stating that the idea of sameness challenges reflection. He formulates the problem thus: Consider two true identity sentences, “Venus = Venus” and “Venus = the morning star”. The first is trivial, a tautology that communicates no new information. The second, however, seems to represent an extension of our knowledge. But if both sentences refer to the same object – namely, a specific planet – how can the second sentence be significant while the first is not? Are we not referring to the same object twice over and thus merely repeating ourselves?

Frege solved this problem by drawing a distinction between two senses of “meaning”. Linguistic expressions, he stated, have meaning in a referential or extensional sense (Bedeutung) in which they refer to a particular object, in this case the planet Venus. But they also have a connotative or intensional meaning (Sinn), in which they may allude to the object indirectly, via a description of it. With this distinction in hand, the two identity sentences clearly differ in significance. In stating that Venus is the morning star, we add new information; namely that this is the planet that appears in the morning sky. Everyone knows a priori that Venus is Venus, but it was an important astronomical discovery that Venus is the planet that appears in the morning sky. The knowledge that one is referring to the same planet under a special description makes the sentence significant and not trivial. Frege’s solution was that the terms “Venus” and “the morning star” are identical in meaning in the extensional but not in the intensional sense, and it is the latter difference that makes the second sentence significant. Frege generalized this brilliant insight into an entire philosophy of language that applied not only to words but to larger units of language as well, such as descriptions and sentences.

Russell, however, denied that genuine proper names such as “Venus” possess intensional meaning. According to him, they mean only the object they denote. His solution to the problem is that because the phrase “the morning star” is a definite description, the sentence “Venus = the morning star” is not an identity sentence at all but a complex general sentence that should be analyzed according to the theory of definite descriptions.
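A minimal sketch may help show what it means to analyze a description sentence as a general sentence. It assumes the usual textbook statement of Russell’s analysis, which the text above does not spell out: “The F is G” is true just in case exactly one thing is F and that thing is G. The function `the`, the toy domain, and the predicates are hypothetical names chosen for illustration.

```python
def the(F, G, domain):
    """Russellian reading of 'The F is G': exactly one thing is F, and it is G."""
    fs = [x for x in domain if F(x)]
    return len(fs) == 1 and G(fs[0])

# Hypothetical toy domain of celestial bodies.
domain = ["Venus", "Mars", "Moon"]

def appears_in_morning_sky(x):
    # Assumption for illustration only: just Venus fits the description.
    return x == "Venus"

def is_venus(x):
    return x == "Venus"

# "Venus = the morning star" becomes a general claim about whatever
# uniquely satisfies the description, not a bare identity of two names.
venus_is_morning_star = the(appears_in_morning_sky, is_venus, domain)

def is_present_king_of_france(x):
    return False  # nothing in the domain satisfies this description

def is_wise(x):
    return True

# "The present king of France is wise" comes out false, not meaningless:
# no subsisting king is needed for the sentence to be evaluated.
king_is_wise = the(is_present_king_of_france, is_wise, domain)
```

On this reading the description never has to pick out an object for the sentence to have a truth value, which is how the analysis sidesteps Meinong’s argument.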

Frege’s account is generally supported on the ground that it captures the grammatical form of the English sentences, allowing both to be identity sentences, but it has the disadvantage – as Quine, Kripke, and Putnam have emphasized – that intensions or senses are not well-defined entities.
Russell’s account treats names as kinds of tags that directly pick out an object without the mediation of a description or intension. This treatment of names has received widespread acceptance, but Russell’s account has the disadvantage that it analyzes what seems prima facie to be an identity sentence into a set of sentences of a completely different logical form.
These differing approaches have generated a vast contemporary literature in which merits and disadvantages of each theory have been extensively probed.



An overview of Bertrand Russell’s technical work in logic and logicism

A sentence is a group of words whose meaning is a complete thought.
A declarative sentence has a meaning that is either true or false.
A proposition is said to be the meaning expressed by a declarative sentence, such as the true proposition “The earth is round” or the false proposition “The earth is flat.” So propositions are either true or false. The declarative sentences that express them are also said to be true or false.

The subject of a proposition is who or what the proposition is about. “The earth is flat” is about the earth. So the earth is the subject of that proposition.
The predicate is what is said about, or attributed to, the subject. Here, the proposition attributes flatness to the earth, so “___ is flat” is the predicate. Logicians write predicates using variables like x, y, or z, instead of blank spaces, to indicate where the subject goes in relation to the predicate. Bertrand Russell called predicates propositional functions. In this book, we use the terms interchangeably.

The predicate “x is flat” is a one-place predicate, because it only has one place where a subject can go – it attributes a property to one thing.
Two-place predicates are relations like that in “Indiana is flatter than Ohio.” Here, the subjects are “Indiana” and “Ohio” and the predicate is “x is flatter than y.” (In grammar, the first is the subject and the second is the object; in logic, they are both subjects.)
Common two-place relations in mathematics are x = y, x > y, and x < y. There are also three-place relations like that in “Ohio is between Indiana and Pennsylvania,” where the predicate is “x is between y and z,” which is often used in geometry. There are also four-place relations, and so on.
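One way to picture predicates with different numbers of places is as functions taking different numbers of arguments. The sketch below is illustrative only: the flatness scores and west-to-east positions are invented values, and the function names are hypothetical.

```python
# Invented flatness scores and west-to-east positions, for illustration only.
flatness = {"Indiana": 0.9, "Ohio": 0.7}
position = {"Indiana": 1, "Ohio": 2, "Pennsylvania": 3}

def is_flat(x):
    """One-place predicate: 'x is flat'."""
    return flatness[x] > 0.8

def is_flatter_than(x, y):
    """Two-place predicate: 'x is flatter than y'."""
    return flatness[x] > flatness[y]

def is_between(x, y, z):
    """Three-place predicate: 'x is between y and z'."""
    low, high = sorted((position[y], position[z]))
    return low < position[x] < high
```

For example, `is_flatter_than("Indiana", "Ohio")` takes two subjects, while `is_between("Ohio", "Indiana", "Pennsylvania")` takes three.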

Before Russell’s logic of relations, logic consisted principally of the Aristotelian logic of one-place predicates. This simple logic can analyze sentences that use one-place predicates to attribute properties to objects like “Tom is tall” or “The sky is blue.” It can also analyze slightly more complex sentences like “All humans are animals” (if someone is human, that person is an animal) and “Some humans are thoughtful” (at least one person is both human and thoughtful) and from these two sentences infer that “Some animals are thoughtful.” You can’t get too far with such a simple logic and you certainly can’t analyze many mathematical or scientific statements with it.
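The inference just mentioned can be checked mechanically on small cases. The sketch below searches every assignment of three one-place predicates over domains of up to three individuals for a counterexample, meaning a case where both premises are true and the conclusion false. The function name and the size bound are assumptions for illustration, and a bounded search of this kind is a sanity check rather than a general proof of validity.

```python
from itertools import product

def valid_over_small_domains(max_size=3):
    """Look for a counterexample to:
    All humans are animals; some humans are thoughtful;
    therefore some animals are thoughtful."""
    for n in range(1, max_size + 1):
        # Each individual is a triple of truth values (human, animal, thoughtful).
        for model in product(product([False, True], repeat=3), repeat=n):
            all_humans_are_animals = all(a for h, a, t in model if h)
            some_humans_are_thoughtful = any(h and t for h, a, t in model)
            some_animals_are_thoughtful = any(a and t for h, a, t in model)
            if (all_humans_are_animals and some_humans_are_thoughtful
                    and not some_animals_are_thoughtful):
                return False  # premises true, conclusion false
    return True

no_counterexample = valid_over_small_domains()
```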

It was Russell’s first great achievement to develop the more powerful logic of relations to describe concepts expressed by two-place predicates, such as “x is taller than y” used in propositions like “Tom is taller than Bob,” which you can’t say with a one-place predicate like “x is tall.” This allowed Russell to describe propositions containing two-place mathematical relations like x = y or x > y (needed for arithmetic and algebra), three-place relations like “x is between y and z” (needed for geometry), and the like. With it, all of the concepts of pure mathematics can be expressed, which can’t be done with the logic that came before it.

Russell’s logic includes set theory. This is because his logic contains predicates and every predicate defines a set. For example, the predicate “x is human” defines the set of all things that can replace the x to make “x is human” true, i.e., it defines the class of humans. The comprehension axiom is the assumption that every predicate defines a class. It is an assumption of Russell’s logic. So Russell’s logic contains sets and a theory of sets, as well as one-place predicates and two-place relations. Russell refers to sets as “classes” and set theory as “the theory of classes.” We will use both ways of speaking interchangeably.
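The idea that a predicate defines a class has a familiar counterpart in programming: filtering a collection by a condition. The sketch below is only an analogy. It works over a fixed, finite universe, which is more restrictive than the unrestricted comprehension axiom described above, and the names `kind`, `is_human`, and `class_of` are hypothetical.

```python
# Hypothetical finite universe, with the kind of each thing.
kind = {"Socrates": "human", "Plato": "human", "Fido": "dog", "Venus": "planet"}

def is_human(x):
    """The predicate 'x is human'."""
    return kind[x] == "human"

def class_of(predicate, universe):
    """The class of everything in the universe that makes the predicate true."""
    return frozenset(x for x in universe if predicate(x))

class_of_humans = class_of(is_human, kind)
```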

After the logic of relations, Russell’s greatest achievement is his theory of logicism – the view that mathematics is just logic, so that all mathematical concepts can be defined in terms of logical concepts and all mathematical truths can be derived from logical truths. Russell’s logic and his logicist philosophy were first fully described in his 1903 Principles of Mathematics. The actual derivation of mathematics from logic, to prove that mathematics is just logic, occurs in the three-volume 1910-13 Principia Mathematica that Russell wrote with Alfred North Whitehead. Russell also presents logicism simply and informally in his 1919 Introduction to Mathematical Philosophy. Finally, there is a 1925-27 revised second edition of Principia Mathematica.

Logicism comes down to this: In the nineteenth century, mathematicians had shown that all of classical mathematics can be defined in terms of, and derived from, arithmetic. Most importantly, Richard Dedekind had shown in 1872 that the real numbers can be defined in terms of rational numbers. Then rational numbers were defined in terms of natural numbers, thus demonstrating that the real numbers can be derived from natural numbers. This is called the arithmetization of mathematics. The next step was taken by Giuseppe Peano, based on work by Dedekind, who showed in 1890 that arithmetic can be reduced to five axioms and three undefined concepts. This is the axiomatization of arithmetic.

After this, all one has to do to reduce mathematics to logic – since mathematics has already been reduced to arithmetic and arithmetic has been reduced to three concepts and five axioms – is to define Peano’s three concepts in terms of logical concepts, thus expressing Peano’s axioms logically, and then derive Peano’s five axioms from logical truths, showing that Peano’s axioms, and thus all the mathematics based on them, are logical truths. Peano’s three undefined concepts are: 0, natural number, and successor. Russell starts by defining natural numbers logically as classes of classes. Specifically, a natural number is the class of all classes containing the same number of things, so that the number 1 is the class of all singletons (classes with one member), 2 is the class of all couples, and so on. With this definition, Russell then defines Peano’s other two basic concepts logically and derives Peano’s axioms from logic.
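The definition of a number as a class of classes can be illustrated over a small finite universe. The sketch below is a toy model, not Russell’s construction: it assumes a three-membered universe, and it tests “containing the same number of things” by pairing members off one-to-one rather than by counting, so the definition is not circular. All function names are hypothetical.

```python
from itertools import combinations

def all_classes(universe):
    """Every class (subset) that can be formed from a finite universe."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def similar(a, b):
    """True if a and b can be paired off one-to-one, without counting either."""
    a, b = set(a), set(b)
    while a and b:
        a.pop()
        b.pop()
    return not a and not b

def number_of(cls, universe):
    """The number of cls: the class of all classes similar to cls."""
    return frozenset(c for c in all_classes(universe) if similar(c, cls))

universe = {"a", "b", "c"}              # hypothetical universe
one = number_of({"a"}, universe)        # the class of all singletons
two = number_of({"a", "b"}, universe)   # the class of all couples
```

Here `two` is the same object whichever couple is used to define it, since `number_of({"b", "c"}, universe)` yields the same class of classes.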

Put this way, demonstrating logicism is a seemingly simple task. But Russell and Whitehead soon ran into difficulties, namely, contradictions Russell found in the new logic and set theory.

........... ...........

Russell’s original form of logicism, in his 1903 Principles of Mathematics, did not attempt to avoid the paradoxes of the new logic, and so did not contain the complexities Russell later added to his logic to avoid them. It is a straightforward theory that contains all the basic elements of logicism without the complexities. We present this basic logicism, which we call naïve logicism, in Chapter 2. The complex version meant to avoid paradoxes, which occurs in the 1910-13 Principia Mathematica, we call restricted logicism. We describe that in Chapter 3.

Russell’s Work in Logic

Russell’s main contributions to logic and the foundations of mathematics include his discovery of Russell’s paradox (also known as the Russell-Zermelo paradox), his development of the theory of types, his championing of logicism (the view that mathematics is, in some significant sense, reducible to formal logic), his impressively general theory of logical relations, his formalization of the mathematics of quantity and of the real numbers, and his refining of the first-order predicate calculus.

Russell discovered the paradox that bears his name in 1901, while working on his Principles of Mathematics (1903). The paradox arises in connection with the set of all sets that are not members of themselves. Such a set, if it exists, will be a member of itself if and only if it is not a member of itself. In his 1901 draft of the Principles of Mathematics, Russell summarizes the problem as follows:
          The axiom that all referents with respect to a given relation form a class seems, however, to require some limitation, and that for the following reason. We saw that some predicates can be predicated of themselves. Consider now those … of which this is not the case. … [T]here is no predicate which attaches to all of them and to no other terms. For this predicate will either be predicable or not predicable of itself. If it is predicable of itself, it is one of those referents by relation to which it was defined, and therefore, in virtue of their definition, it is not predicable of itself. Conversely, if it is not predicable of itself, then again it is one of the said referents, of all of which (by hypothesis) it is predicable, and therefore again it is predicable of itself. This is a contradiction. (CP, Vol. 3, 195)

The paradox is significant since, using classical logic, all sentences are entailed by a contradiction. Russell’s discovery thus prompted a large amount of work in logic, set theory, and the philosophy and foundations of mathematics.
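The contradiction can be seen on a small scale by brute force. The sketch below tries every possible membership relation on a three-element domain and looks for an element that behaves like the Russell set, that is, one whose members are exactly the things that are not members of themselves. The function names and the choice of domain size are assumptions for illustration; the search only illustrates the general argument and does not replace it.

```python
from itertools import product

def has_russell_set(domain, member):
    """True if some y has as members exactly those x that are not members of themselves."""
    return any(
        all(((x, y) in member) == ((x, x) not in member) for x in domain)
        for y in domain
    )

def search(size=3):
    """Return a membership relation that contains a Russell set, or None if there is none."""
    domain = range(size)
    pairs = [(x, y) for x in domain for y in domain]
    # Try every possible membership relation on the domain.
    for bits in product([False, True], repeat=len(pairs)):
        member = {p for p, b in zip(pairs, bits) if b}
        if has_russell_set(domain, member):
            return member
    return None

result = search(3)
```

The search comes back empty because, for any candidate y, the case x = y would require y to be a member of itself exactly when it is not.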

Russell’s response to the paradox came between 1903 and 1908 with the development of his theory of types. It was clear to Russell that some form of restriction needed to be placed on the original comprehension (or abstraction) axiom of naïve set theory, the axiom that formalizes the intuition that any coherent condition (or property) may be used to determine a set. Russell’s basic idea was that reference to sets such as the so-called Russell set (the set of all sets that are not members of themselves) could be avoided by arranging all sentences into a hierarchy, beginning with sentences about individuals at the lowest level, sentences about sets of individuals at the next lowest level, sentences about sets of sets of individuals at the next lowest level, and so on. Using a vicious circle principle similar to that adopted by the mathematician Henri Poincaré, together with his so-called “no class” theory of classes (in which class terms gain meaning only when placed in the appropriate context), Russell was able to explain why the unrestricted comprehension axiom fails: propositional functions, such as the function “x is a set,” may not be applied to themselves since self-application would involve a vicious circle. As a result, all objects for which a given condition (or predicate) holds must be at the same level or of the same “type.” Sentences about these objects will then always be higher in the hierarchy than the objects themselves.
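The level discipline described above can be sketched in a few lines. This is a loose illustration of the hierarchy only, closer to a simple theory of types than to the ramified theory, and it makes assumptions the text does not state: individuals sit at level 0, the empty set is placed at level 1, and an ill-typed membership claim is rejected with an error rather than evaluated. The class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Individual:
    name: str
    level: int = 0  # individuals occupy the lowest level

@dataclass(frozen=True)
class TypedSet:
    members: frozenset
    level: int

def make_set(members):
    """Build a set one level above its members, which must share a level."""
    members = frozenset(members)
    levels = {m.level for m in members}
    if len(levels) > 1:
        raise TypeError("members must all be of the same type")
    level = levels.pop() + 1 if levels else 1  # assumption: empty set at level 1
    return TypedSet(members, level)

def is_member(x, s):
    """Membership is only well-formed between adjacent levels."""
    if s.level != x.level + 1:
        raise TypeError("ill-typed membership claim")
    return x in s.members

a = Individual("a")
s = make_set([a])    # a set of individuals, level 1
ss = make_set([s])   # a set of sets of individuals, level 2
```

Under these rules `is_member(s, s)` raises a `TypeError`, so the question whether a set belongs to itself cannot even be posed.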

Although first introduced in 1903, the theory of types was further developed by Russell in his 1908 article “Mathematical Logic as Based on the Theory of Types” and in the three-volume work he co-authored with Alfred North Whitehead, Principia Mathematica (1910, 1912, 1913). The theory thus admits of two versions, the “simple theory” of 1903 and the “ramified theory” of 1908. Both versions of the theory came under attack: the simple theory for being too weak, the ramified theory for being too strong. For some, it was important that any proposed solution be comprehensive enough to resolve all known paradoxes at once. For others, it was important that any proposed solution not disallow those parts of classical mathematics that remained consistent, even though they appeared to violate the vicious circle principle. For discussion of related paradoxes, see Chapter 2 of the Introduction to Whitehead and Russell (1910), as well as the entry on paradoxes and contemporary logic in this encyclopedia.

Russell himself had recognized several of these same concerns as early as 1903, noting that it was unlikely that any single solution would resolve all of the known paradoxes. Together with Whitehead, he was also able to introduce a new axiom, the axiom of reducibility, which lessened the vicious circle principle’s scope of application and so resolved many of the most worrisome aspects of type theory. Even so, critics claimed that the axiom was simply too ad hoc to be justified philosophically. For additional discussion see Linsky (1990), Linsky (2002) and Wahl (2011).

Of equal significance during this period was Russell’s defense of logicism, the theory that mathematics is in some important sense reducible to logic. First defended in his 1901 article “Recent Work on the Principles of Mathematics,” and later in greater detail in his Principles of Mathematics and in Principia Mathematica, Russell’s logicism consisted of two main theses. The first was that all mathematical truths can be translated into logical truths or, in other words, that the vocabulary of mathematics constitutes a proper subset of the vocabulary of logic. The second was that all mathematical proofs can be recast as logical proofs or, in other words, that the theorems of mathematics constitute a proper subset of the theorems of logic. As Russell summarizes, “The fact that all Mathematics is Symbolic Logic is one of the greatest discoveries of our age; and when this fact has been established, the remainder of the principles of mathematics consists in the analysis of Symbolic Logic itself” (1903, 5).

Like Gottlob Frege’s, Russell’s basic idea for defending logicism was that numbers may be identified with classes of classes and that number-theoretic statements may be explained in terms of quantifiers and identity. Thus the number 1 is to be identified with the class of all unit classes, the number 2 with the class of all two-membered classes, and so on. Statements such as “There are at least two books” would be recast as statements such as “There is a book, x, and there is a book, y, and x is not identical to y.” Statements such as “There are exactly two books” would be recast as “There is a book, x, and there is a book, y, and x is not identical to y, and if there is a book, z, then z is identical to either x or y.” It follows that number-theoretic operations may then be explained in terms of set-theoretic operations such as intersection, union, and difference. In Principia Mathematica, Whitehead and Russell were able to provide many detailed derivations of major theorems in set theory, finite and transfinite arithmetic, and elementary measure theory. They were also able to develop a sophisticated theory of logical relations and a unique method of founding the real numbers. Even so, the issue of whether set theory itself can be said to have been successfully reduced to logic remained controversial. A fourth volume on geometry was planned but never completed.
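The two recastings quoted above translate almost word for word into code over a finite domain, with `any` standing in for “there is” and `all` for the universal condition on z. The sketch is illustrative only; the domains and the predicate `is_book` are hypothetical.

```python
def at_least_two(is_book, domain):
    """There is a book x, and there is a book y, and x is not identical to y."""
    return any(is_book(x) and is_book(y) and x != y
               for x in domain for y in domain)

def exactly_two(is_book, domain):
    """As above, and any book z is identical to either x or y."""
    return any(
        is_book(x) and is_book(y) and x != y
        and all((not is_book(z)) or z == x or z == y for z in domain)
        for x in domain for y in domain
    )

# Hypothetical domains for illustration.
books = {"Principles", "Principia", "Introduction"}

def is_book(x):
    return x in books

shelf_a = ["Principles", "Principia", "lamp"]
shelf_b = ["Principles", "Principia", "Introduction"]
```

No numeral appears in either definition: the counting is done entirely by quantifiers and identity.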

Russell’s most important writings relating to these topics include not only his Principles of Mathematics (1903), “Mathematical Logic as Based on the Theory of Types” (1908), and Principia Mathematica (1910, 1912, 1913), but also his earlier Essay on the Foundations of Geometry (1897) and his Introduction to Mathematical Philosophy (1919a), the last of which was written while Russell was serving time in Brixton Prison as a result of his anti-war activities. Coincidentally, it was at roughly this same time that Ludwig Wittgenstein, Russell’s most famous pupil, was completing his Tractatus Logico-Philosophicus (1921) while being detained as a prisoner of war at Monte Cassino in Italy during World War I.

Anyone needing assistance in deciphering the symbolism found in the more technical of Russell’s writings is encouraged to consult the Notation in Principia Mathematica entry in this encyclopedia.


Russell's Logic and Philosophy of Mathematics

Russell had great influence on modern mathematical logic. The American philosopher and logician Willard Van Orman Quine said Russell's work represented the greatest influence on his own work.

Russell's first mathematical book, An Essay on the Foundations of Geometry, was published in 1897. This work was heavily influenced by Immanuel Kant. Russell soon realized that the conception it laid out would have ruled out Albert Einstein's schema of space-time, which he understood to be superior to his own system. Thenceforth, he rejected the entire Kantian program as it related to mathematics and geometry, and he maintained that his own earliest work on the subject was nearly without value.

Russell was interested in the definition of number and he studied the work of George Boole, Georg Cantor, and Augustus De Morgan, while materials in the Bertrand Russell Archives at McMaster University include notes of his reading in algebraic logic by Charles S. Peirce and Ernst Schröder. He became convinced that the foundations of mathematics were tied to logic, and following Gottlob Frege took an extensionalist approach in which logic was in turn based upon set theory. This led Russell to accept and defend the view known as logicism, the view that mathematics is in some important way reducible to formal logic.
In 1900 he attended the first International Congress of Philosophy in Paris where he became familiar with the work of the Italian mathematician, Giuseppe Peano. He mastered Peano's new symbolism and his set of axioms for arithmetic. Peano was able to define logically all of the terms of these axioms with the exception of 0, number, successor, and the singular term, the. Russell took it upon himself to find logical definitions for each of these. Between 1897 and 1903 he published several articles applying Peano's notation to the classical Boole-Schröder algebra of relations, among them On the Notion of Order, Sur la logique des relations avec les applications à la théorie des séries, and On Cardinal Numbers.

Russell eventually discovered that Frege had independently arrived at equivalent definitions for 0, successor, and number, and the definition of number is now usually referred to as the Frege-Russell definition. It was largely Russell who brought Frege to the attention of the English-speaking world. He did this in 1903, when he published The Principles of Mathematics, in which the concept of class is inextricably tied to the definition of number. The appendix to this work detailed a paradox arising in Frege's set theory, now frequently called naive set theory. That theory had formalized the intuition that any specifiable condition could be used to formulate a set or class. But Russell raised the question of the set of all sets that are not members of themselves: that set is a member of itself if and only if it is not a member of itself. This came to be known as the Russell Paradox.

In writing Principles, Russell came across Cantor's proof that there was no greatest cardinal number, which Russell believed was mistaken. The Cantor Paradox in turn was shown (for example by Crossley) to be a special case of the Russell Paradox. This caused Russell to analyze classes, for it was known that given any number of elements, the number of classes they result in is greater than their number. In turn, this led to the discovery of a very interesting class, namely, the class of all classes, which consists of two kinds of classes: classes that are members of themselves, and classes that are not members of themselves. This led him to find that the so-called principle of comprehension (the notion that any specifiable condition would determine a set or class), taken for granted by logicians of the time, was fatally flawed, and that it resulted in a contradiction, whereby Y is a member of Y if and only if Y is not a member of Y.

Russell's solution to the Russell Paradox was outlined in an appendix to Principles, which he later developed into a complete theory, the Theory of Types. Aside from exposing a major inconsistency in any Frege-type set theory, usually called naive set theory, Russell's work led directly to the creation of modern axiomatic set theory. It also crippled Frege's project of reducing arithmetic to logic. The Theory of Types and much of Russell's subsequent work have also found practical applications in computer science and information technology.

Russell continued to defend logicism, and along with his former teacher, Alfred North Whitehead, wrote the monumental Principia Mathematica, an axiomatic system on which they claimed all of mathematics can be built. The first volume of the Principia was published in 1910, and is largely ascribed to Russell. More than any other single work, it established the specialty of mathematical or symbolic logic. Two more volumes were published, but their original plan to incorporate geometry in a fourth volume was never realized, and Russell never felt up to improving the original works, though he referenced new developments and problems in his preface to the second edition. Upon completing the Principia, three volumes of extraordinarily abstract and complex reasoning, Russell was exhausted, and he never felt his intellectual faculties fully recovered from the effort. Although the Principia did not fall prey to the paradoxes in Frege's approach, Kurt Gödel later proved that neither Principia Mathematica nor any other consistent axiomatic system containing primitive recursive arithmetic can decide every proposition that can be formulated within it: there will always be some proposition such that neither it nor its negation is provable within the system (this is known as Gödel's incompleteness theorem).

Russell's last significant work in mathematics and logic, Introduction to Mathematical Philosophy, was written by hand while he was in prison for his anti-war activities during World War I. This was largely an explication of his previous work and its philosophical significance.