Produced by Charles Wells. Revised 2016-03-04.
Definitions in math and in other subjects
Properties of mathematical definitions
Images and metaphors for definitions
"When I use a word, it means just what I choose it to mean--neither more nor less." -- Humpty Dumpty
A mathematical definition is fundamentally different in two ways from other sorts of definitions, a fact that is not widely appreciated by students or even by mathematicians. The differences cause students a lot of trouble.
The definition of a math object is given by accumulation of attributes, that is, by listing properties that the object is required to have.
Dictionary definitions and even definitions in some of the sciences may name some properties of the object they are defining but the properties are not usually definitive, and they also give examples or prototypes of the object.
A definition of (the name of) a math object is imposed on the reader by decree, rather than being determined by studying the way the word is used, as a lexicographer would do.
This chapter starts by summarizing the properties of mathematical definitions; the rest of the chapter goes into more detail. The Handbook of mathematical discourse and the paper by Edwards and Ward go into aspects of mathematical definitions not discussed here.
The four rules below are absolute requirements that all mathematical definitions obey:
Math definitions have other properties as well. The list below describes aspects of definitions that people new to abstract math don't always understand:
A mathematical definition prescribes the meaning of a word or phrase in a very specific way. The word or phrase is defined in terms of a list of required properties (LP), although the list may be disguised by the wording.
In this website, the word or phrase being defined is called the definiendum. The phrase that gives the definition is called the defining phrase. (A special case is the defining formula of a function.)
The definiendum can refer to either of these:
Here is a nonsense example. It uses words that (supposedly) have no meaning, to emphasize that when you see a mathematical definition the form of the definition gives information even though it may use words you don't know.
"A quilgo is a torca that is wabic and frumious".
The definiendum is "quilgo" and the defining phrase is: "is a torca that is wabic and frumious". The word "quilgo" is a noun that names the type of object. The required properties of a quilgo are: (1) It must be a torca. (2) It must be wabic. (3) It must be frumious.
We could have decided to use an adjective, say "quilgic", for our definition. Then the definition would be: "A torca is quilgic if it is wabic and frumious."
Mathematical definitions are crisp:
In a proof, you can use any of the facts in the definition by just saying "by definition".
A definition is a totalitarian dictator.
For any integer $n$:
We know $-(-3)=3$ and $3>0$, so by definition of "positive", $-(-3)$ is positive. That is an example of proving something by directly using a definition. When you get further along in math, you usually wind up quoting previously proved theorems to construct a proof.
The fact that $-(-3)$ has a minus sign in front of it is irrelevant. This argument depends on the fact that "$3$" and "$-(-3)$" are two different names for the same object.
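The point about names can be sketched in a few lines of Python. This is only an illustration: the function name `is_positive` is my own, and it assumes the definition of "positive" used above, namely $n>0$.

```python
def is_positive(n):
    """Transcribes the definition: an integer n is positive if n > 0."""
    return n > 0

a = 3
b = -(-3)  # a different expression that names the same integer

# The definition looks at the object, not at how its name is written.
print(a == b)          # True
print(is_positive(b))  # True
```

The test never inspects the minus signs in the expression `-(-3)`; it only sees the integer the expression denotes.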
An integer $n$ is prime if $n>1$ and the only positive divisors of $n$ are $1$ and $n$.
Some who have just learned this may say, "Then $1$ is a prime because then $n=1$ and the only positive divisors of $n$ are $1$ and $n$!" But a definition is a dictator: this definition says $n$ must be greater than $1$, so it follows from EAP that $1$ is not a prime, never mind that it fits the other part of the definition.
The paragraph above is a bit harsh, but it illustrates the point. Still, it is perfectly reasonable to ask, "Why is $1$ excluded from being a prime?" It is because including it would make it more complicated to state the fundamental theorem of arithmetic.
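One way to see the "dictator" at work is to transcribe the definition of prime clause by clause. The sketch below is a direct and deliberately inefficient transcription; the helper names `positive_divisors` and `only_divisors_are_1_and_n` are hypothetical names chosen for this illustration.

```python
def positive_divisors(n):
    """All positive integers d that divide n (n is assumed to be positive)."""
    return [d for d in range(1, n + 1) if n % d == 0]

def only_divisors_are_1_and_n(n):
    """Second clause of the definition, taken by itself."""
    return set(positive_divisors(n)) == {1, n}

def is_prime(n):
    """Both clauses: n > 1 AND the only positive divisors of n are 1 and n."""
    return n > 1 and only_divisors_are_1_and_n(n)

print(only_divisors_are_1_and_n(1))  # True: 1 fits the second clause
print(is_prime(1))                   # False: 1 fails the clause n > 1
print([n for n in range(1, 20) if is_prime(n)])
# [2, 3, 5, 7, 11, 13, 17, 19]
```

An object must satisfy every clause of the definition. Fitting one clause, as $1$ does here, counts for nothing.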
Definitions of math objects are made by mathematicians, who can define an object in any way they want. Naturally they make a definition that is useful, and they can change it any time they want. (They did, too; for a while in the nineteenth century $1$ was a prime.)
Mathematicians make definitions to suit their convenience
The symbol $\sqrt{2}$ denotes the unique positive real number whose square is $2$.
Everything that is true about $\sqrt{2}$ follows from this definition (COMP). That includes the fact that the decimal expansion of $\sqrt{2}$ begins $1.414\ldots$ and that may have been what you really wanted to know (NI). (More about this below).
The facts about an object given in the definition
A domain is a connected open set. (This also has other meanings: See the Glossary entry). The definiendum is "domain". The list of properties: "is a set", "connected" and "open".
You may not be familiar with words such as "connected" and "open", but in this chapter I am writing about the form of the wording of the definition and what that form tells you about the meaning. See the appendix for an intuitive example that may help you think about "connected" and "open".
The definition can be translated as saying that a subset of a topological space is a domain if it is connected and open, whatever "connected" and "open" mean! You will notice that the definition does not say that the set is a subset of a topological space. The point of this is that a definition can depend on context. A person with only a little knowledge of topology knows that if the set is open it must be a subset of a topological space.
A definition must be read in context.
There are many different ways to word a definition, and this long section describes a great many of them. You may think that only a grammarian or a dictionary editor would appreciate such infinite attention to detail, but I recommend that you glance through the possibilities listed. You may discover
It is common to word definitions using "if", in a conditional assertion. The conditional assertion, like any such, may be worded with hypothesis first or with conclusion first. Part of the hypothesis may be stated first in a separate sentence, called the precondition of the definition. (See more about preconditions here.) All this is illustrated in the list of examples following, which you may think is long enough. Even so, the list is not exhaustive.
The definition of "even" can be done in most of these ways as well:
Sometimes a constraint is put on the variable in the definition after the definition is stated, commonly in parentheses. For example: "$n$ is even if it is divisible by $2$ ($n\in \mathbb{Z}$)". This is called a postcondition. See also where.
There is more about "if" below.
A statement in which one phrase
Sometimes the author commands you to define something, for example:
This is not in fact telling you to do something; it is just telling you what it means for an integer to be even.
Symbolic expressions may be defined using the same terminology and styles as in definitions of words and phrases.
When defining a word or phrase the scope of the definition is usually the entire document (the definition will stay in effect to the end). Occasionally the author will say something like, "Just for the rest of this proof, say that a number is frumious if…"
However, symbolic expressions are commonly defined for quite narrow scopes, such as a paragraph or a section. Besides the ways I have already mentioned, there are many other ways to say it in the case of narrow scope:
The standard definition of even says:
Definition: If an integer is divisible by $2$, then it is even.
You can then use the definition to prove a theorem:
Theorem: If an integer is divisible by $4$, then it is even.
Note that this wording means that the theorem is true for any integer. See indefinite article.
Proof: If $n$ is divisible by $4$, then by the definition of "divides", there is an integer $k$ for which $n=4k$. But $4k=2(2k)$, and $2k$ is an integer, so $n$ is $2$ times an integer. So by the definition of "divides", $n$ is divisible by $2$, and hence by the definition of "even", $n$ is even.
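The steps of this proof can be replayed mechanically on a range of integers. The sketch below is a sanity check, not a proof; the names `divides` and `is_even` are my own, and `divides` uses the remainder test as a stand-in for "there is an integer $k$ with $n=dk$".

```python
def divides(d, n):
    """d divides n: there is an integer k with n == d * k (d is nonzero here)."""
    return n % d == 0

def is_even(n):
    """The definition: an integer is even if it is divisible by 2."""
    return divides(2, n)

for n in range(-100, 101):
    if divides(4, n):
        k = n // 4                # the witness k with n == 4 * k
        assert n == 2 * (2 * k)   # the rewriting step 4k = 2(2k)
        assert is_even(n)         # the conclusion of the theorem

# The theorem runs in one direction only: 6 is even but not divisible by 4.
print(is_even(6), divides(4, 6))  # True False
```

The last line shows why the "if" of the theorem cannot be read backwards, while the "if" of the definition can: being even means exactly being divisible by $2$.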
Because of the definition, it is correct to say both of these things:
But the theorem only justifies this one statement:
The theorem does not justify saying
The word "if"
Because of this, some authors have begun using "if and only if" in definitions instead of "if", as in: "Definition: An integer is even if and only if it is divisible by $2$." More about this in the Glossary entry for "if".
The definition of a math concept
The special logical status of a definition (everything follows from it) is the reason that rewriting according to the definitions is a reasonable first step in coming up with a proof.
Here are some seemingly contradictory points about the purple prose above:
The proof of any except the most elementary theorem about a concept will use other theorems about the concept. For example, perhaps the definition of the concept implies Theorem A and Theorem A implies Theorem B. Since implication is transitive, this means that the definition implies Theorem B.
Many major theorems about a concept can help in giving you an intuition about it, because the info that is in the definition may not include the most important aspects of the concept. This is about understanding and is separate from the fact that the theorems can be used in the proof, as mentioned in the preceding point.
Metaphors and intuition that you have about the concept are also vital in coming up with a proof, although you cannot use them in the proof. See Images and Metaphors below.
The notation and terminology used may suggest properties the definition does not actually require.
The standard definition of "subset of a set" allows the whole set to be a subset of itself, but the "sub" prefix in ordinary English may make you think a subset has to be a part of the set but not the whole thing. The point is that your feelings about the meaning of "subset" are irrelevant. All that matters is the definition. See semantic contamination.
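Python's built-in sets follow the standard definition, so they give a quick way to check your feelings about "subset" against it. In this sketch `<=` is the subset test and `<` is the proper subset test; the set `A` is an arbitrary example.

```python
A = {1, 2, 3}

print(A <= A)        # True: every set is a subset of itself
print(A < A)         # False: but it is not a proper subset of itself
print({1, 2} <= A)   # True
print({1, 2} < A)    # True: a subset that is not the whole set
```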
The definitions may have nothing at all in common with each other, and it may not be easy to prove they give the same concept.
You can define $\sqrt{2}$ as the unique positive real number $r$ for which $r^2=2$, or by saying $\sqrt{2}=\frac{1}{\sin (\pi/4)}$. The second definition is totally lame, but it is in fact a correct definition and could be used to estimate the decimal places of $\sqrt{2}$ by drawing the appropriate right triangle, measuring the sine, and dividing the result into $1$.
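A numerical check that the two definitions pick out the same number, up to floating-point error, might look like this. It is evidence, not a proof, and the variable name `r` is simply the letter used in the first definition.

```python
import math

# Second definition: the reciprocal of sin(pi/4).
r = 1 / math.sin(math.pi / 4)

# First definition: r is positive and its square is 2.
print(r > 0)                     # True
print(math.isclose(r * r, 2.0))  # True, within floating-point tolerance
```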
A much more important example of two different-looking definitions is given in equivalence relations and partitions in Wikipedia. The usual definitions of an equivalence relation (a reflexive, symmetric and transitive relation) and of a partition (a set of nonempty subsets of a set such that every element of the set is contained in exactly one of them) determine exactly the same structure on the set, even though the definitions look utterly different.
The point of this equivalence of definitions is that an equivalence relation determines a unique partition, and a partition determines a unique equivalence relation, and (take a deep breath) a partition determines an equivalence relation that always determines the partition you started with, and an equivalence relation determines a partition that always determines the equivalence relation you started with. It is worth working out a couple of small finite examples to understand this!
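Here is one small finite example worked out in Python. It is a sketch under my own assumptions: the set is $\{0,1,\ldots,5\}$, the equivalence relation is "same remainder when divided by $3$", and the function names are hypothetical.

```python
def partition_from_relation(elements, related):
    """Group elements into blocks; `related` is assumed to be an equivalence relation."""
    blocks = []
    for x in elements:
        for block in blocks:
            # Comparing with one representative suffices for an equivalence relation.
            if related(x, next(iter(block))):
                block.add(x)
                break
        else:
            blocks.append({x})
    return {frozenset(block) for block in blocks}

def relation_from_partition(blocks):
    """The set of ordered pairs (a, b) with a and b in the same block."""
    return {(a, b) for block in blocks for a in block for b in block}

elements = range(6)

def same_remainder(a, b):
    return a % 3 == b % 3

# Relation -> partition
P = partition_from_relation(elements, same_remainder)
print(sorted(sorted(block) for block in P))  # [[0, 3], [1, 4], [2, 5]]

# Partition -> relation: we get back exactly the pairs we started with.
R = relation_from_partition(P)
R_pairs = {(a, b) for a in elements for b in elements if same_remainder(a, b)}
print(R == R_pairs)  # True

# Relation -> partition again: we get back the partition we started with.
P2 = partition_from_relation(elements, lambda a, b: (a, b) in R)
print(P2 == P)  # True
```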
There is no Central Academy
When we gain a new understanding of a type of math object, we often realize that the names we have chosen don’t work well and that we need to change them. Because of this common phenomenon, there are authors who deliberately set out to reform the terminology in a subject and redefine many of the terms in the subject or substitute others. (Sometimes they do this for other, mostly bad, reasons.) Such attempts rarely work. Bourbaki made the biggest effort of this sort and partly succeeded (but they failed with positive).
It is … quite hard to come up with good technical choices for formal definitions that will be valid in the variety of ways that mathematicians want to use them and that will anticipate future extensions of mathematics. If we were to continue to cooperate, much of our time would be spent with international standards commissions to establish uniform definitions and resolve huge controversies. --William Thurston
Images and metaphors associated with the concept of definition, and the motivation behind the concept, contribute greatly to understanding definitions, but they cannot (directly) be used in proofs.
In order to make it easy to show that some object is an example of the concept, the definition is minimal (or nearly so). It includes just enough information to determine the concept, but not much more.
Definitions are not always absolutely as small as they can be. For example, the usual definition of group given in undergraduate abstract algebra requires more than it needs to. See soojishin's explanation of one minimal definition of group.
The minimality of a mathematical definition hides the richness and complexity of the concept and as such may not be of much use if you want to understand it. The definition can also give you an exaggerated idea of the importance of the items that the definition does include, particularly in the case of the many devious definitions in math discussed below.
For example, a group is defined as a set with a binary operation satisfying certain properties. But groups are important primarily because their elements are symmetries. The definition of group says nothing about the elements being symmetries, although it follows from the definition that the elements of any group are symmetries of, in general, several different structures.
I need to clarify what "determines everything" means. One definition of "triangle" is that a triangle consists of three points connected by line segments. More precisely, this definition determines every statement that is true about every triangle. For example, the angles at the corners of a triangle always add up to $\pi$. It does not tell you that a triangle is isosceles, since that is true of some triangles but not of all.
Note that I am ignoring fine points such as degenerate triangles.
Suppose you want to know the length $d$ of the diagonal of a square whose sides have length 1. You apply the Pythagorean Theorem and conclude that $d=\sqrt{2}$.
Now at this point I will make the (unrealistic) assumption that you know the basics of algebra but nothing at all about square roots and you don’t have a calculator. You look up the definition of the radical sign:
Definition: $\sqrt{r}$ is the unique positive real number $s$ such that $s^2=r$.
So ${{d}^{2}}=2.$ Well big whoop. You want to know how long the diagonal is. That definition says nothing about length. This is an example of the "just enough" nature of definitions. The thing you are most interested in is approximately how long the diagonal is, and the definition of $\sqrt{2}$ says nothing about that.
However, you can get an estimate of how big $\sqrt{2}$ is by using simple algebra facts, including the one that says: for positive $x$ and $y$, if ${{x}^{2}}\lt {{y}^{2}}$ then $x\lt y$. Now you start calculating:
By doing this over and over you can get many decimal places of $\sqrt{2}$. This shows that information about the magnitude of $\sqrt{2}$ is implied by the definition.
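The "over and over" procedure can be sketched as a short program. It uses only the definition (the square of $\sqrt{2}$ is $2$) and the algebra fact quoted above: if $x^2\lt 2\lt y^2$ for positive $x$ and $y$, then $x\lt\sqrt{2}\lt y$. The function name `sqrt2_lower_bound` is my own, and exact fractions are used so that no rounding error creeps in.

```python
from fractions import Fraction

def sqrt2_lower_bound(places):
    """Largest decimal with `places` digits after the point whose square is less than 2."""
    x = Fraction(1)     # 1^2 = 1 < 2 < 4 = 2^2, so 1 < sqrt(2) < 2
    step = Fraction(1)
    for _ in range(places):
        step /= 10
        # Raise the current digit for as long as the square stays below 2.
        while (x + step) ** 2 < 2:
            x += step
    return x

x = sqrt2_lower_bound(4)
print(float(x))                              # 1.4142
print(x ** 2 < 2 < (x + Fraction(1, 10**4)) ** 2)  # True
```

The second printed line says that $1.4142^2\lt 2\lt 1.4143^2$, so $\sqrt{2}$ lies between $1.4142$ and $1.4143$. No calculator's square root key was used, only squaring and comparing.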
Some apparently simple math concepts have really off-the-wall definitions.
This modest deviousness is in both cases a ploy for allowing functions or relations to be completely arbitrary. For example, "$=$" and "$\lt$" and "divides" (as in "$4$ divides $96$") are all relations, but so is the completely arbitrary relation that says "$2$ is related to $3$ and $3$ is related to both $2$ and $5$ and that's all". This is definitely a relation, although useless, and is represented as the set $\{(2,3),(3,2),(3,5)\}$.
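To make the set-of-ordered-pairs idea concrete, here is a sketch that treats relations as Python sets of pairs. The arbitrary relation is the one from the text; restricting "divides" to the integers $1$ through $6$ is my own assumption, made only to keep the set finite, and the names `related` and `divides_rel` are hypothetical.

```python
# The arbitrary relation from the text, as a set of ordered pairs.
arbitrary = {(2, 3), (3, 2), (3, 5)}

# "Divides", restricted to a small finite set so it can be listed in full.
universe = range(1, 7)
divides_rel = {(a, b) for a in universe for b in universe if b % a == 0}

def related(relation, a, b):
    """a is related to b exactly when the pair (a, b) is an element of the relation."""
    return (a, b) in relation

print(related(arbitrary, 3, 5))    # True
print(related(arbitrary, 5, 3))    # False: the order of the pair matters
print(related(divides_rel, 2, 6))  # True
print(related(divides_rel, 4, 6))  # False
```

Both relations are the same kind of object, a set of pairs. Nothing in the representation requires a rule or formula behind the pairs.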
In many situations outside math, definitions are fuzzy. For example, "warm weather" is a fuzzy concept. Perhaps everyone will agree that if the temperature is 30 degrees C. then we have warm weather, and if it is 10 degrees C. we do not have warm weather. But 20 degrees is sort of borderline. Some will say it is warm and some will not. (I say it is not. But then, I grew up in Savannah.)
Mathematical concepts are crisp. Either something fits the definition of a mathematical concept or it does not.
There is a sense in which a robin is a typical bird and a penguin is not a typical bird. A mathematical definition is simply a list of properties. If an object has all the properties, it is an example of the definition. If it doesn’t, it is not an example. So in some basic sense no example is any more typical than any other. This is in the sense of rigorous thinking described in the chapter on images and metaphors.
In fact, mathematicians talk about typical examples, trivial examples, monstrous examples and so on all the time. The point is that these attitudes are based on intuition and do not mean that one example fits the definition "better" than some other example.
It may sound really peculiar to a non-mathematician when you say that the monster group is a simple group.
If you want to learn math, listen carefully to what a mathematician says is rigorously true, and listen carefully when they talk intuitively about some mathematical structure. Both are vitally important to understanding math.
Because the definition of a math concept can be devious, it may be hard to see how you can use it in a proof. A specification of a mathematical concept is a set of statements that are all true of the concept and that suffice for many common uses, but which do not characterize the concept. These are the main points about specifications:
The name "specification" is my own but many texts use what amounts to a specification for certain concepts without using the word "specification". The meaning I use for specification is similar in spirit to the way computing scientists use the word.
I give a specification for sets in the chapter on sets and a specification for functions in the chapter on functions. The list of properties of real numbers given in the chapter on real numbers amounts to a specification.
See literalism.
I am not going to define "connected" and "open" but I will give an example that suggests what they mean.
Let $S$ be the set of points inside (not on the boundary) the red circle and $T$ be the set of points inside (not on either boundary) either the red or the blue circle:
For one of these sets to be connected means that if you have two points in the set you can draw a path connecting them that is entirely inside the set. So $S$ is connected, but $T$ is not.
For one of these sets to be open means that for any point in the set, there is a tiny circle that (1) contains the point and (2) is entirely inside the set. So $S$ and $T$ are both open. But if you let $\bar{S}$ be the set in the red circle including the boundary, then $\bar{S}$ is not open because there is no tiny circle containing a point on the boundary that is entirely inside the set.
Note that although what I have said about this example is correct, it is misleading; there are more complicated situations where you cannot depend on how I described "connected" and "open" to give the correct answer.
Thanks to Dr. Hugh Porteous for corrections and suggestions.
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.