Walter Schwager: Variables and levels of measurement
In doing social research it is current practice to store the data gathered in a computer, so the data can be analyzed using various software programs such as SPSS (Statistical Package for the Social Sciences). To prepare data for computer storage the various values on a variable are usually given numerical codes. Thus we may assign the category “male” the code 1 on the variable sex, and “female” the code 2. In developing coded information, then, we assign a numerical code to each of the values or categories of our variables. Allocating numerical codes to data allows us to store information in a computer more efficiently. But another advantage of doing so is actually more important: the use of numbers enables us to use the powerful and elegant language of mathematics in dealing with these data. This advantage is associated with a drawback, however. A major peril is that the use of numbers frequently causes an unjustified feeling of exactness and reliability in dealing with the research data. As Moroney put it: "It is an easy and fatal step to think that the accuracy of our arithmetic is equivalent to the accuracy of our knowledge about the problem at hand." So the use of numbers brings major advantages, but also potential dangers. The question we shall address in this section is: what do these numbers mean? What arithmetical or mathematical characteristics are associated with our use of numbers for the values or categories of different variables? What interpretations of these numbers are warranted?
A brief example may help to clarify the issue. You are undoubtedly familiar with the notion of an "average", or more accurately, the arithmetic mean. If our sample consists of 5 individuals who, on the variable age, have the following values (in years): 20, 26, 30, 34 and 40, what is their mean age? We find the mean by adding up the numbers in the series, and dividing their total by the number of elements. In our example the mean age would be:
(20 + 26 + 30 + 34 + 40)/5 = 150/5 = 30
But now take the variable of religion. Let us assume that the five individuals referred to have the following religions, with in brackets the numerical code for that category:
one Protestant (1);
one Roman Catholic (2);
one Jew (3);
one with No Religion (4);
and one with an "Other" religion (5).
What is the "average" religion for this sample of 5 ?
Well, we can proceed with our computations in the same way: add up all the values, and divide the total by the number of cases. In this instance:
(1 + 2 + 3 + 4 + 5)/5 = 15/5 = 3
As we know, 3 refers to the Jewish category. Does this mean that the "average" religion is Jewish? And how would we have interpreted a result that would have given us a "mean" religion of, say, 2.1? What does an "average" religion refer to, anyway?
Let's scrutinize what we just did: we gave numerical codes to religious categories, but those numerical codes were no more than labels. We could with equal justification have assigned entirely different numbers in an entirely different sequence: Protestant (8), Roman Catholic (1), Jewish (6), No Religion (0), Other (7). The only restriction is that each category should be assigned a unique code number, so that we cannot confuse it with another category. If we had given these other numbers to the categories, the "average" religion would have been quite different:
(8 + 1 + 6 + 0 + 7)/5 = 22/5 = 4.4
The result that we obtained depended totally on our arbitrary assignment of given codes, and therefore cannot be interpreted in any meaningful fashion.
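The dependence of the result on the coding scheme can be checked with a short Python sketch. The two dictionaries reproduce the two codings used above; the names coding_a, coding_b and mean_code are chosen here for illustration only.

```python
from statistics import mean

# The five respondents in the example, one per religious category.
sample = ["Protestant", "Roman Catholic", "Jewish", "No Religion", "Other"]

# Two equally arbitrary coding schemes for the same categories.
coding_a = {"Protestant": 1, "Roman Catholic": 2, "Jewish": 3,
            "No Religion": 4, "Other": 5}
coding_b = {"Protestant": 8, "Roman Catholic": 1, "Jewish": 6,
            "No Religion": 0, "Other": 7}


def mean_code(respondents, coding):
    """Average the numerical labels; the result has no substantive meaning."""
    return mean(coding[r] for r in respondents)


mean_a = mean_code(sample, coding_a)  # 15/5
mean_b = mean_code(sample, coding_b)  # 22/5
print(mean_a, mean_b)
```

The respondents are identical in both runs; only the labels differ, and yet the "average" moves from 3 to 4.4.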
In the example where we computed the mean for the variable “age”, the mean referred to a value that could be interpreted: a mean of 30 means that the average age is 30 years. But a "mean" religion makes as much sense as an average telephone number for a sample, or a mean car license number. In other words: the notion of a mean age makes mathematical sense, whereas the notion of a mean religion does not. The numbers attached as codes to religious categories are no more than labels: we only know that cases having the same code are the same on religion, and those having different codes have different religions.
You can add and subtract years, and say that someone who is 14 is half as old as someone else who is 28, and twice as old as someone who is 7 years old. But can you say that a Protestant (1) is half as religious as a Roman Catholic (with code 2) and one-fourth as religious as someone with No Religion (with code 4)? That statement would not make sense, as the results are, once again, purely caused by our arbitrary assignment of numerical codes to religious categories. Differences between these numbers (as indicated by subtracting them) do not refer to differences in "religiosity". In other words: you cannot add, subtract, divide or multiply the numerical codes attached to the categories of the variable "religion".
This clarifies the question raised in the first paragraph of this section: what arithmetical or mathematical characteristics are associated with our use of numbers for the values or categories of different variables? The various ways in which we can use numbers are called levels of measurement, and each level is called a scale. The four levels of measurement that we shall discuss here are the following:
1. nominal scales;
2. ordinal scales;
3. interval scales;
4. ratio scales.
These four levels form a kind of ladder. The bottom level, nominal scales, is the most rudimentary; each subsequent level becomes more refined, but includes all the characteristics of the preceding one. You may be glad to hear that you already know all there is to know about the most sophisticated level of measurement, that of ratio scales.
The fact that for nominal scales we cannot apply what we generally consider "standard" mathematical operations points to the following problem. Whether we can apply a certain mathematical operation to some aspect of reality is a question which can only be answered by checking whether the assumptions of that operation fit the characteristics of that situation. Many students have found this hard to grasp. When asked: is 1 plus 2 always 3? nearly everybody answers affirmatively. But what about one cup of coffee, to which we add two spoonfuls of sugar? This is not a trick question: it demonstrates that the addition of units, according to arithmetical rules, is only possible if the units remain the same, and are not decreased or increased in number due to the physical aspects of the actual addition operation. This requirement is generally satisfied when we deal with cookies or apples, or even dollars or years; it is not when we deal with coffee and sugar, or even a male and a female rabbit, given a couple of months. Therefore a mathematical operation can only be applied if the assumptions of that operation are satisfied by the subject matter that the operation is applied to! This fit between the requirements of a mathematical operation and the characteristics of some subject matter is called isomorphism: similarity of form.
We shall now proceed to a more systematic discussion of these four levels of measurement, and the basic questions we shall be addressing for each one are the following:
a. what are the implications of the way in which numbers are used for each level of measurement?
b. associated with this is the following problem: what mathematical operations are permitted for each level of measurement?
c. this in turn leads to the final problem: what statistical measures are appropriate for each level of measurement? We shall deal with this final question in an introductory manner only.
Before starting on this discussion, first a word about terminology. We are, in this topic, always discussing what we can do with the numbers that represent various values on a given variable, as in the examples above, and what these numbers represent or mean. Such a variable is said to be at a certain measurement level, or to be a certain scale. The variable "religion" is at the nominal level of measurement, and can be said to be a nominal scale. A nominal scale may be any variable at that level. A ratio scale is a variable at the ratio level of measurement; as we shall see later, that might be "age", or "years married." (Unfortunately the term "scale" also refers to instruments to measure attitudes, so some confusion may arise; beware.)
Nominal Scales, or: when is 1 plus 2 not 3?
In the example of religion, as we discussed a moment ago, the allocation of numbers was merely a labelling exercise: we assign a (numerical) name to a given category. This is why we call this use of numbers: measurement at the level of categorical or nominal scales (from the Latin "nomen": name).
What characteristics are associated with this way of using numbers? Only those of similarities and differences: a unit of analysis with a given code is similar (on that variable!) to all other units with the same code, and it differs on that variable from all units of analysis with a different code. (In algebraic terms, a=a, and b=b; and a is not equal to b, and vice versa.) Put technically, the numerical codes identify equivalence classes, as all the elements within a certain category are equivalent: equal in value.
The allocation of numbers is purely arbitrary, however, as we already discussed. As one author put it, "any two numbers may be interchanged without affecting anything but the notation." As long as we keep the numbers distinct for different categories, and we assign the same number to all cases within the same category, we may allocate any numbers we wish.
What mathematical operations are permitted for nominal scales? Apart from operations dealing with similarities and dissimilarities, none. The only operations allowed for equivalence classes are frequency counts: e.g., how many Protestants do we have in our sample? Let's review this systematically.
1. We can count the number of cases with a given code, e.g. the number of Protestants, or the number of Catholics;
2. Can we compare these numerical codes in terms of more or less? In other words, can we rank them? No, as we have assigned them in an arbitrary fashion. (In our example No religion -code 4- would be "more" on some fictional variable than the preceding three categories: Protestant, Catholic, and Jewish, with codes 1, 2, and 3!)
3. Can we add or subtract these code numbers? No, as we have assigned them totally arbitrarily, and what would additions and subtractions mean here? Although we all tend to believe that 1 + 2 = 3, at the level of nominal scales we cannot add one Protestant to one Roman Catholic to make one member of the Jewish faith; that would be an odd kind of interreligious procreation. Thus at this level of measurement 1 + 2 ≠ 3!
4. Can we multiply or divide these numbers? Again, the answer is no. Our assignment of numbers has been arbitrary, and what would it mean to do the following division: 2/2 = 1? Something like the following: RC/RC = Protestant?
In summary it can be stated that at the level of nominal scales we can only count (heads); we cannot rank, add, subtract, multiply or divide.
The statistical measures that are appropriate at the level of nominal scales are those that are based on head counts only: percentages, proportions, and modes or modal categories, as well as frequencies.
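The head-count measures just listed can be sketched in a few lines of Python. The ten codes below are a hypothetical sample, coded as in the religion example (1 = Protestant, 2 = Roman Catholic, 3 = Jewish, 4 = No Religion, 5 = Other); the variable names are illustrative.

```python
from collections import Counter

# Hypothetical sample of ten respondents, using the coding from the text.
codes = [1, 2, 2, 4, 1, 2, 5, 3, 2, 4]

counts = Counter(codes)                       # frequency of each category
n = len(codes)
proportions = {c: k / n for c, k in counts.items()}
percentages = {c: 100 * p for c, p in proportions.items()}
modal_code, modal_count = counts.most_common(1)[0]   # the modal category

# Relabelling the categories changes the notation, not the counts.
relabel = {1: 8, 2: 1, 3: 6, 4: 0, 5: 7}
recoded_counts = Counter(relabel[c] for c in codes)

print(counts[2], proportions[2], modal_code)
print(recoded_counts[relabel[2]])
```

Whatever labels are used, four of the ten respondents fall in the Roman Catholic category, and that category remains the mode.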
The notion of nominal scales is puzzling at first sight, so you may want to have a look at some other nominal or categorical level variables. Some of the most important categorical variables in sociology are: sex, ethnicity, race, religion, occupation, party affiliation and marital status.
Nominal level variables are also known as categorical variables, as the values on them are distinct categories. They are also known as qualitative variables, in contrast to the next three types, which are lumped together as quantitative variables. (The Baker text considers ordinal variables as qualitative as well.)
The ranking of numbers: ordinal scales
In many situations we use variables with values that can be ranked in terms of more or less, or of greater or smaller. The educational achievement of a respondent’s mother or father can be fitted into one of the following categories:
What is the highest level of formal education completed by your parents?
EDUCATION                             MOTHER   FATHER
No schooling ........................... 1        1
Some Elementary schooling .............. 2        2
Completed Elementary school ............ 3        3
Some Secondary school .................. 4        4
Completed Secondary school ............. 5        5
Some University or College ............. 6        6
University degree or degrees ........... 7        7
Other (write in)
   Mother ______________________________ 8
   Father ______________________________          8
Don't know ............................. 9        9
The first observation we can make is that the numerical codes can be interpreted in terms of similarity and dissimilarity, as in nominal scales. (After we have completed our discussion of levels of measurement you will note that each higher level of measurement has all the characteristics of preceding levels of measurement, plus some new ones.) But the codes also make sense in terms of more and less: "no schooling" (1) is clearly less than, say, "some elementary" (2); (5), "completed secondary school", is clearly more than (4), "some secondary school."
So in what way do ordinal scales differ from nominal scales? The codes of an ordinal scale can be ranked in terms of more or less on a given variable. (With the exception of the 8 -other- and 9 -don't know- categories this applies to the example above, as shown.) This ranking possibility results in a rank order, hence the term ordinal scales.
What can we say about the size of the differences between two values on an ordinal scale? In general, little or nothing. How would you compare the difference between 2 and 1, or 7 and 6, in the example just given? Because the differences are unequal or unknown, we cannot compare them in mathematical terms: we cannot add or subtract, so we cannot claim that (7-6) represents the same amount of schooling as (2-1).
For the same reason we cannot multiply or divide these numbers either, as is discussed below.
What statistical measures can we apply to ordinal scales? Basically the same as for nominal scales, plus the ones based on ranking. These include percentiles (and quartiles) and such measures as the median. If you have ranked a class of 15 students with scores on a music test, the score of the middle student -the 8th in this case- is the median value.
Many of the variables in social science research are of an ordinal kind: job prestige, educational level, a country's level of industrialization, and so on. The largest collection probably consists of individual attitudes and aptitudes: the strength of your opinion in favour of or against capital punishment, economic nationalism, sexual equality, or your scores on IQ tests, classroom tests, academic subjects and so on. If a student gets a score of 70 on an academic test, we can presumably say that she has a higher score than someone with a score of 35, but can we say that the difference between a score of 70 and 35 is the same as that between 0 and 35? We cannot.
This also implies that we cannot multiply or divide numbers at the level of an ordinal scale. Return to our example for a moment: would you be able to say that 6/3 = 2? In other words, would you be able to say that someone with some university or college education has twice as much schooling as someone who finished elementary school? That would not be a very meaningful statement to make.
To summarize our discussion more systematically, we can state that:
A. the mathematical connotations of numbers used at the level of ordinal scales are:
1. those of similarity or dissimilarity, as for nominal scales;
2. those of ranking, resulting in rank orders;
B. the mathematical operations permitted for ordinal scales are:
1. those of counting: how many elements are in the 1- category, for example;
2. those of ranking, or comparisons in terms of more or less;
C. the mathematical operations that are not permitted are those of:
1. subtraction and addition;
2. multiplication;
3. division.
D. the statistical measures appropriate at the level of ordinal scales are:
1. those associated with head counts: percentages, proportions, and frequencies; modes and modal categories;
2. those associated with rank orders: the median, and percentiles, to give only two examples.
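As an illustration of point D, the following Python sketch computes a median on hypothetical education codes and then recodes the categories with different, unequally spaced numbers that keep the same order. The data and the recoding (called stretch here) are invented for the example; median_low is used because it always returns a code that actually occurs in the data.

```python
from statistics import mean, median_low

# Hypothetical education codes for eleven respondents
# (1 = no schooling ... 7 = university degree).
# Codes 8 ("other") and 9 ("don't know") fall outside the rank order
# and are dropped before ranking.
raw = [3, 5, 9, 2, 5, 7, 4, 8, 6, 5, 1]
ranked = [c for c in raw if c <= 7]

# The median is the code of the middle case once the cases are ranked.
median_code = median_low(ranked)

# An order-preserving recoding: same ranking, different spacing.
stretch = {1: 0, 2: 1, 3: 2, 4: 10, 5: 20, 6: 50, 7: 100}
stretched = [stretch[c] for c in ranked]

# The median still points to the same category after recoding ...
same_category = median_low(stretched) == stretch[median_code]
print(median_code, same_category)

# ... whereas the mean depends on the spacing we happened to choose.
print(mean(ranked), mean(stretched))
```

The median category ("completed secondary school", code 5) survives any recoding that preserves the order, which is why it is an appropriate measure at this level, while a mean of the codes is not.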
The mathematical characteristics of an ordinal system include the requirement of transitivity: if a is larger than b, and b is larger than c, then a must be larger than c; i.e., under those conditions c cannot be larger than a. In reality this transitivity requirement may be violated. The simplest example concerns sports teams: Team A may beat Team B (i.e. be better); Team B may beat Team C; but Team C may beat Team A! This is an example of intransitivity. (In such a situation the criteria for an ordinal scale are not fulfilled. But in sports the rules of competition ensure that the final ranking does not show such intransitivity.)
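Whether a set of pairwise results satisfies the transitivity requirement can be checked mechanically. The sketch below is one possible way of doing so in Python; the function name is_transitive and the representation of results as (winner, loser) pairs are assumptions made for the example.

```python
from itertools import permutations


def is_transitive(beats):
    """Return True if, whenever a beats b and b beats c, a also beats c.

    `beats` is a set of (winner, loser) pairs.
    """
    items = {x for pair in beats for x in pair}
    return all(
        (a, c) in beats
        for a, b, c in permutations(items, 3)
        if (a, b) in beats and (b, c) in beats
    )


# A beats B, B beats C, and A beats C: a consistent rank order A > B > C.
ordered = {("A", "B"), ("B", "C"), ("A", "C")}

# A beats B, B beats C, but C beats A: a cycle, so no rank order exists.
cycle = {("A", "B"), ("B", "C"), ("C", "A")}

print(is_transitive(ordered), is_transitive(cycle))
```

Only the first set of results can be turned into a rank order; the second cannot be placed on an ordinal scale.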
In the natural sciences a few examples of ordinal scales still exist, including the Beaufort scale of wind velocity (1: leaves move slightly; 10: buildings blow over), the Mercalli scale for earthquake intensity, and the Mohs scale for the hardness of minerals.
In the social sciences many variables are at the level of ordinal scales, but because much more powerful and useful statistics are available for the next level of measurement, most of these ordinal variables are treated as interval level variables. The consequences of this are debated in the profession, but these debates are of no great concern to us for the moment.
Interval Scales, or: When is 10 not twice 5?
The clearest example to illustrate interval scales comes from the measurement of temperature. In comparing Fahrenheit and Celsius scales, for instance, we can state that, roughly speaking,
34 degrees F = 1 degree C; and
68 degrees F = 20 degrees C.
How do these temperatures compare to each other? Well, in terms of Fahrenheit, 68/34=2, so it is tempting to say that one temperature is twice as warm as the other. But how do they compare in terms of the Celsius or centigrade scale? Now, 20/1=20, so here we might want to say that one temperature is twenty times as warm as the other. Why do the two measurement systems give us two different results? After all, our mathematical computations have been correct. Why are our conclusions contradictory?
Could it be that we get these peculiar results because we employ different measurement systems? No, because in measuring lengths in imperial and metric measures, two different systems, we still end up with the same results:
2 yards = 1.83 metres; and
4 yards = 3.66 metres.
How do these two lengths compare? Well, in yards, clearly, 4/2 = 2, and in metres, 3.66/1.83 = 2 as well. So changing measures of length did not influence the results here. Nor does it for measures of surface and volume. You can check that yourselves by doing the conversion on a simple example, if you want.
The explanation for our conflicting results is that the two temperature scales we compared have different zero points: they start counting at different points. These two scales, plus a third one, the Kelvin scale, can be illustrated as in Figure 5.2. The line represents the variable "temperature".
Degrees Celsius            -273    -18      0     38    100
TEMPERATURE POINTS            *      *      *      *      *
___________________________________________________________
Degrees Fahrenheit         -460      0     32    100    212
Degrees Kelvin                0    255    273    311    373
FIGURE 5.2: A GRAPHIC COMPARISON OF THREE TEMPERATURE SCALES
The Kelvin scale starts at the "natural" zero of absolute zero, but the Celsius and Fahrenheit scales start at relatively arbitrary points along the temperature line. That is why you cannot divide temperatures on these two scales. (You would also run into problems with negative temperatures on these scales.)
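The effect of the zero point on ratios can be reproduced with a brief Python sketch. The conversion formulas are the standard ones (the offset 273.15 is the exact figure that the diagram rounds to 273); the function and variable names are illustrative.

```python
def fahrenheit_to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9


def celsius_to_kelvin(c):
    """Convert degrees Celsius to the Kelvin scale."""
    return c + 273.15


warm_f, cool_f = 68.0, 34.0
warm_c, cool_c = fahrenheit_to_celsius(warm_f), fahrenheit_to_celsius(cool_f)
warm_k, cool_k = celsius_to_kelvin(warm_c), celsius_to_kelvin(cool_c)

# The same two temperatures give three different "ratios".
ratio_f = warm_f / cool_f   # 2
ratio_c = warm_c / cool_c   # 18 (20 with the rounded figures used in the text)
ratio_k = warm_k / cool_k   # about 1.07

# Lengths share a natural zero, so the ratio survives a change of unit.
METRES_PER_YARD = 0.9144
ratio_yards = 4 / 2
ratio_metres = (4 * METRES_PER_YARD) / (2 * METRES_PER_YARD)

print(ratio_f, ratio_c, ratio_k)
print(ratio_yards, ratio_metres)
```

Only the Kelvin ratio, which is measured from the natural zero, can be read as "so many times as warm"; here the warmer temperature is only about 7 per cent higher than the cooler one.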
These scales do have the quality that each unit difference on the scale is equivalent to each other unit; in other words, one degree Celsius difference is always the same, no matter where it is located. Thus the difference between 10 and 5 degrees Celsius is the same as that between 15 and 10 degrees Celsius, for example. The intervals between successive numbers are the same, or as it is sometimes put, they are equidistant. That is different from the situation in the preceding level of measurement, ordinal scaling: there a point difference might mean, in one case, the difference between "some elementary schooling" and "no schooling" (codes 2 and 1 on question 40 in the preceding section), or between "some university or college", and "completed secondary school" (codes 6 and 5). So in ordinal scaling the implications or meaning of a difference of a point depend on where on the scale you are. At this new level of measurement, however, each point difference can be seen as an interval of equal size, hence the name "interval scale." (The reasons why these intervals are the same are rather complex, so we'll bypass that discussion.)
Because these intervals are the same, it is meaningful to compare differences by adding and subtracting numbers on the same scale: the difference between 10 and 5 degrees is the same as that between 15 and 10, or between 33 and 28. After all, (10-5)=5, (15-10)=5, and (33-28)=5.
What are the permitted mathematical operations for an interval scale? First of all, we have to mention those that apply also to nominal and ordinal scales, and introduce the new feature specific to interval scales:
1. similarity and dissimilarity;
2. ranking;
3. addition and subtraction.
We cannot, however, multiply or divide, as our comparison of Fahrenheit and Celsius scales illustrated.
The statistical measures applicable to interval measures are first the ones we have encountered already:
1. those based on counts: frequencies, percentages, proportions; modes and modal categories;
2. those based on ranks: percentiles, medians, etc.
But for interval scales there are important new additions:
3. statistical measures based on equal intervals: the (arithmetic) mean, and its associated measures of dispersion: the variance and the standard deviation.
4. we can now also use standard correlational techniques.
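The measures under point 3 can be illustrated with a small Python sketch on hypothetical temperature readings. The population formulas (pvariance, pstdev) are used here as an assumption; a textbook that divides by n-1 would use variance and stdev instead.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical noon temperatures for five days, in degrees Celsius.
celsius = [10.0, 15.0, 20.0, 25.0, 30.0]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

mean_c = mean(celsius)        # 20 degrees C
var_c = pvariance(celsius)    # 50
sd_c = pstdev(celsius)        # about 7.07

mean_f = mean(fahrenheit)     # 68 degrees F, i.e. the same temperature as 20 C
sd_f = pstdev(fahrenheit)     # 9/5 times the Celsius standard deviation

print(mean_c, var_c, sd_c)
print(mean_f, sd_f)
```

Unlike the "mean religion", the mean is meaningful here: converting the readings to Fahrenheit and averaging gives 68 degrees F, which is exactly the Fahrenheit equivalent of the Celsius mean of 20.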
Because the mean, the standard deviation and correlational analyses are very useful statistical tools, social scientists like to use ordinal scales as if they were interval scales, as was stated in the previous section. Most social scientists now seem to accept this practice. ("Pure" examples of interval scales in the social sciences are actually rather rare.)
Finally, do not confuse "interval scales" with a specific type of attitude scale, the Thurstone "equal appearing interval scales." Apart from the similarities in name these two scales are very different.
Ratio scales, or: back to basics
Ratio scales bring us back to familiar arithmetical examples: cookies, apples, or whatever else was used to teach you basic math.
Ratio scales have the characteristics of interval scales, plus the advantages of a natural zero point. Take income, for example: dollars can be added and subtracted, multiplied, divided, and so on. And you can start from a natural zero!
The mathematical operations applicable to ratio scales are all the ones that you are familiar with:
1. counting;
2. ranking;
3. addition and subtraction;
as well as the most important new operations:
4. multiplication and division.
As divisions establish ratios between numbers, this level of measurement is called a "ratio scale".
The statistical measures appropriate to ratio scales are the same ones as we applied to interval scaling, plus a rather obscure one hardly ever used in social science: the geometric mean, which you can safely ignore.
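For the curious, the following Python sketch shows the geometric mean and the unit-independence of ratios on three hypothetical incomes. It relies on geometric_mean from the standard statistics module (available from Python 3.8); the figures are invented.

```python
from statistics import geometric_mean

# Hypothetical annual incomes in dollars.
incomes = [20_000, 40_000, 80_000]
in_thousands = [x / 1000 for x in incomes]

gm_dollars = geometric_mean(incomes)          # about 40,000
gm_thousands = geometric_mean(in_thousands)   # about 40

# Ratios between values do not depend on the unit chosen.
ratio_dollars = incomes[1] / incomes[0]
ratio_thousands = in_thousands[1] / in_thousands[0]

print(gm_dollars, gm_thousands)
print(ratio_dollars, ratio_thousands)
```

Because income has a natural zero, the statement "twice as much income" holds whether we count in dollars or in thousands of dollars.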
Examples of ratio scales, or variables at the ratio level of measurement in the social sciences, mainly deal with persons, objects or physical characteristics, such as space and time. They include: number of residents in a community; annual income in dollars; number of children per family; number of years married; number of years of education; number of working days; square feet per house; and so on. Sometimes ratio scales deal with the frequency of events, e.g., the frequency of moves over the last ten years.
This tidiness of ratio scales is often lost to some degree when we combine values into groupings, or grouped data. Take this question, for example:
41. To the best of your knowledge, what was your total income in the past year?
Up to $15,000 ................ 1
$15,001-$25,000 .............. 2
$25,001-$35,000 .............. 3
$35,001-$45,000 .............. 4
$45,001-$55,000 .............. 5
$55,001-$65,000 .............. 6
$65,001 and over ............. 7
Don't know ................... 9
Given the unequal size of the groupings, especially the ones at both ends, it may be dangerous to treat this variable as a ratio scale. The treatment of grouped data requires some statistical caution, but we cannot go into that now.
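The loss of information through grouping can be made concrete with a short Python sketch that assigns the codes of question 41 to individual incomes. The function name income_code and the example incomes are hypothetical.

```python
# Upper bounds (in dollars) of brackets 1 to 6 in question 41;
# bracket 7 is open-ended.
UPPER_BOUNDS = [15_000, 25_000, 35_000, 45_000, 55_000, 65_000]


def income_code(income):
    """Return the bracket code for an income; None (unknown) maps to code 9."""
    if income is None:
        return 9
    for code, upper in enumerate(UPPER_BOUNDS, start=1):
        if income <= upper:
            return code
    return 7


# Two respondents one dollar apart end up one code apart ...
close_pair = (income_code(25_000), income_code(25_001))

# ... while two respondents $434,999 apart receive the same code.
far_pair = (income_code(65_001), income_code(500_000))

print(close_pair, far_pair)
```

A difference of one code point therefore does not stand for a fixed number of dollars, which is why the grouped variable cannot simply be treated as a ratio scale.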
A Summary of Levels of Measurement Scales
What are the characteristics and implications of systems of numbers in social measurement? The answer depends on the kind of variable that these numbers are used with: nominal variables, ordinal variables, interval variables, and ratio variables. These variables are at different levels of measurement, and they form different measurement scales.
Nominal level variables are often called qualitative, as we are dealing here with qualitative differences between various categories, which are otherwise incomparable. In ordinary language you might call these "chalk and cheese" or "apples and oranges" variables. The other three levels of measurement are called quantitative variables, as they measure quantities of some characteristic. (As stated earlier, Baker classifies these scales somewhat differently.)