Social Measurement: What Stands in Its Way?

Martin Bulmer

When you cannot measure * your knowledge is * meager * and * unsatisfactory.

-Lord Kelvin (inscription carved in 1929 below the bay window of the Social Science Research Building of the University of Chicago)

who cares if some one-eyed son of a bitch invents an instrument to measure spring with.

-e. e. cummings

Measurement

MEASUREMENT is any process by which a value is assigned to the level or state of some quality of an object of study. This value is given numerical form, and measurement therefore involves the expression of information in quantities rather than by verbal statement. It provides a powerful means of reducing qualitative data to a more condensed form for summarization, manipulation, and analysis. Classical measurement theory argues that numbers may serve at least three purposes in representing values: (1) as tags, identification marks, or labels; (2) as signs to indicate the position of a degree of a quality in a series of degrees; and (3) as signs indicating the quantitative relations between qualities. On some occasions, numbers may fulfill all three functions at once (Cohen and Nagel, 1934).

One of the most influential twentieth-century statements of the classical approach was that of psycho-physicist S. S. Stevens, who proposed four scales or degrees of measurement: nominal, ordinal, interval, and ratio measurement (1946, 1975). Nominal and ordinal measurement is nonmetric; interval and ratio measurement is metric. These theoretical standards are translated into measurement standards in the physical world through organizations such as the United States National Bureau of Standards (NBS). The NBS provides state, county, and local officials with technical and operational guides that set out measurement specifications, standard tolerances, and model laws designed to support the physical measurement system (Hunter, 1980: 869). The primary standards are those of the International System of Units (SI units), of which there are seven: length (meter, m); mass (kilogram, kg); time (second, s); electric current (ampere, A); temperature (kelvin, K); luminous intensity (candela, cd); and amount of substance (mole, mol) (Zebrowski, 1979).
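Stevens’s hierarchy of scales can be sketched in code. The Python fragment below is an illustration under stated assumptions, not something drawn from the works cited: the names SCALES, NEW_STATISTIC, permitted_statistics, and is_metric are hypothetical, and the statistics attached to each scale follow a simplified, conventional reading of Stevens’s (1946) table. Each scale admits the statistics of the weaker scales plus one more.

```python
# Illustrative sketch only: all names here are hypothetical.
# Scales are listed from weakest to strongest, after Stevens (1946).
SCALES = ["nominal", "ordinal", "interval", "ratio"]

# The statistic that first becomes meaningful at each scale
# (a simplified, conventional reading of Stevens's table).
NEW_STATISTIC = {
    "nominal": "mode",
    "ordinal": "median",
    "interval": "mean",
    "ratio": "coefficient_of_variation",
}


def permitted_statistics(scale: str) -> set[str]:
    """Return every statistic meaningful at `scale`, including inherited ones."""
    level = SCALES.index(scale)
    return {NEW_STATISTIC[s] for s in SCALES[: level + 1]}


def is_metric(scale: str) -> bool:
    """Interval and ratio measurement are metric; nominal and ordinal are not."""
    return scale in ("interval", "ratio")


if __name__ == "__main__":
    for scale in SCALES:
        kind = "metric" if is_metric(scale) else "nonmetric"
        print(scale, kind, sorted(permitted_statistics(scale)))
```

On this reading, a mean computed over the numeric codes of a nominal variable has no interpretation, because the numbers serve only as labels.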

This scientific paradigm of physical measurement provides a model the social sciences, or some social scientists, seek to emulate. The quotation from scientist Lord Kelvin carved on Chicago’s Social Science Research Building reflects that aspiration. The poet e. e. cummings’s skepticism reflects doubts as to whether the aspiration is worthwhile in the first place. The place of measurement in social science research is a contentious issue; this tension runs through social science disciplines such as sociology and political science. It is reflected in the ambivalence with which many social scientists look upon research methods such as social survey research. The aim of this article is to consider some of the hindrances to improved measurement of the social. There has been a notable failure to agree on standards for social measurement (as distinct from psychological or economic measurement), whether in terms of social indicators and conceptual unification, or at the practical level of operationalizing variables.

The Scope of Social Measurement

Social scientists take up differing positions on the value of social measurement. Part of this may be due to resistance to, or ambivalence about, the place of numbers in the realm of knowledge, coupled with an inability to appreciate the role that number may play (cf. Paulos, 1988). But the issue cannot be reduced to a matter of preference that is, in T. D. Weldon’s phrase, “like a taste for ice cream.” The merits of measurement in social science, and the obstacles to measurement, need to be set out and debated. In this way, some of the passions the subject inflames may be restrained and cooled.

Relatively few of those who have approached thoughtfully the issue of social measurement subscribe to classical measurement theory as outlined at the beginning of this article. A great deal of social measurement is nonmetric, and uses the assignment of numbers to qualities of an object of study as a way to label characteristics or make statements of more or less. A common definition of the properties of social measurement is the following:

Whenever we classify a number of units we shall talk of measurement. This is a rather broad use of the term, but it leads to no difficulty; if we classify a set of units by a quantitative variate (variable) we have the special case of conventional measurement (Lazarsfeld, 1970: 66).

In terms of Stevens’s four levels of measurement (1946), much social measurement is of a nominal or ordinal kind, lacking the properties of interval and ratio measurement. But this creates difficulties. How one characterizes the state of the health of a population, or the level of crime in a particular area, is by no means a straightforward matter, given the wide variety of measures of each that are available.

Recognition of the complexity and provisional character of much social measurement comes from a variety of positions that in other respects may not share much in common. For a time, definitional operationism was in vogue, particularly as put forward by the physicist Bridgman (1927). Donald Campbell, however, criticized this as failing to do justice to the complexity of social constructs, and argued instead for a multiple operationism:

One of the great weaknesses in definitional operationism as a description of best scientific practice was that it allowed no formal way of expressing the scientist’s prepotent awareness of the imperfection of his measuring instruments and his prototypic activity for improving them…. In the social sciences, a great many of the laws which impinge upon any given measurement situation are as yet undiscovered. In addition, we use a given instrument (such as the door-to-door interview, or peer-ratings, or newspaper content analysis, or multiple-item attitude tests) to measure a large number of theoretically independent variables. In this situation, where two measures are drawn from the same instrument, it is probable that part of their observed relationship is a function of the shared vehicle, of the shared irrelevancies. Co-symptoms of interview rapport, acquiescence, social desirability response sets, halo-effects in ratings, censorship and attention biases in content analysis, correlates of data quality in ethnographies, are examples (Campbell, 1988: 33-4).

A. V. Cicourel, in a generally critical analysis of social measurement, drew attention to the problems created by what he termed “measurement by fiat,” which failed to do justice to the complexity and theoretical importance of sociological concepts.

Measurement by fiat is not a substitute for examining and re-examining the structure of our theories so that our observations, descriptions and measures of the properties of social objects and events have a literal correspondence with what we believe to be the structure of social reality (1964: 33).

Otis Dudley Duncan, in a series of historical and critical notes on social measurement, is quite clear about the limitations of much social measurement:

With the possible and, in any event, limited exception of economics, we have in social science no system of measurements that can be coherently described in terms of a small number of dimensions. Like physical scientists, we have thousands of “instruments,” but these instruments purport to yield measurements of thousands of variables. That is, we have no system of units (much less standards for them) that, at least in principle, relates all of the variables to a common set of logically primitive qualities. There are no counterparts of mass, length and time in social science…. To the physical dimensions, economics adds money…. The fact that social science (beyond economics) does not have such a system of measurements is, perhaps, another way of saying that theory in our field is fragmentary and undeveloped, and that our knowledge is largely correlational rather than theoretical (1984: 162).

One of the endemic difficulties of social measurement is the lack of agreed standards against which to measure social phenomena. Social measurement presents problems that are not encountered in quite the same form in relation to physical, biological, or economic measurement. In the social world, we lack many of the precise measuring instruments of the physical, biological, or economic worlds. Although this difference is perhaps one of degree rather than kind, the absence of formal agreed tools of measurement such as length, weight, distance, or monetary value is a serious problem for many areas of social life.

This is evident in the history of social research (which is reviewed in other articles in this issue). Paul Lazarsfeld traced the origins of the quantification of the social in the tradition of political arithmetic, and through the rise of Quetelet and Le Play, and the French followers of these researchers. “Some time at the end of the nineteenth century,” he observed, “quantification in sociology takes on its modern function: to translate ideas into empirical operations and to look for regular relations between variables so created” (Lazarsfeld, 1961: 202-3). The early practitioners of quantitative social research at the University of Chicago between the wars, to take another example, were drawn from political science and psychology as well as from sociology. Sociologists busied themselves assembling census tract data and producing Local Community Fact Books. Political scientists sought to survey phenomena such as nonvoting by using primitive survey methods. Psychologist L. L. Thurstone devoted several years to attitude measurement, publishing articles with titles such as “Attitudes Can Be Measured” (1928). Jean Converse traced the subsequent early history of attitude measurement and its close association with social survey research and polling (1987: 54-86).

These developments were part of a move in several social science disciplines to make those disciplines more scientific. In economics, political science, and sociology, scholars such as Wesley C. Mitchell, Charles E. Merriam, and William Fielding Ogburn sought greater precision through social measurement, and encouraged their students and younger colleagues to undertake more quantitative studies. Dorothy Ross (1991: 390-470) terms this development “scientism,” and argues that it represented in part a turn away from politics toward the understanding and certainty that rigorous knowledge provided. Objectivity, too, was more certain if evidence and propositions were more rigorously grounded (cf. Bannister, 1987; Porter, 1995).

Definitions and Conceptual Foundations

A common distinction drawn in social and economic measurement is in terms of the “hardness” or “softness” of data. Economists pride themselves on their access to significant quantities of “hard” data, and have devoted considerable efforts to creating such data to serve their purposes. Demographers limit themselves to a restricted number of variables in the analysis of population, but pride themselves that many of the variables with which they are concerned–such as age and sex–are relatively “hard” and derived from a limited range of sources, particularly the Census of Population and registration data. Sociologists deal much more frequently with data that is “softer” in character, is drawn from a wider range of official sources, and poses more intractable measurement problems. The poet e. e. cummings hints at this in his reference to measuring the phenomenon of “spring.” Even straightforward variables like marital status can give rise to difficult problems of classifying persons whose union is not legally sanctioned, or where a marriage has been dissolved but not succeeded by a second marriage. More complex constructs, such as the “dark figure” of crime–referring to offenses not recorded in official statistics–require a series of presuppositions: about a definition of crime, of its formal measurement, of the occurrence of events that for one reason or another are not recorded as crime, and of the measurement of these events (cf. Sparks et al., 1977, chap. 1). “Before counting persons with characteristics associated with soft data, one must set certain conventions to define each such attribute, which is thus moved partway from the population to statisticians’ concept of it” (Petersen, 1987: 187-8).

Social measurement differs from economic measurement not only in relative “hardness” or “softness” of data, but in terms of conceptual underpinnings. A sustained effort has been made in the field of economics to tie economic measurement and economic theory together in a much more direct way than is characteristic of other social sciences in relation to social and public policy. This dates back to the creation in the United States after World War I of the National Bureau of Economic Research, and to the influence of Keynes and the formation of the Statistical Section in the British Cabinet Office under Lord Cherwell during World War II, and much elaboration since. In a work such as Robin Marris’s Economic Arithmetic (1958), one can follow how the economist builds up by statistical means what Marris calls “a coherent anatomical description” of the economic system. The aim, moreover, has been to construct a picture of the economic system as a whole:

[W]e may use the same analogy–that of a system of connected pipes and tanks–as a basis for a general statistical picture of the economy. Further, the specific pattern of the imaginary system will be basically similar whether we are thinking exclusively of quantities or exclusively of money or of prices. The actual painting of the picture is done by collecting statistical data which measure the rates of flow past selected points in the imaginary pipes, the selection of the points to be made in such a way that the totality of measurements effectively describes all the important economic characteristics of the system. The measurements will be of three types–rates of flow of quantities of goods and services, rates of flow of money payments and average level of prices (Marris, 1958: 4).

Nothing comparable exists or has existed for society as a whole or for social processes in the way that economists have sought systemic data about the working of the economy. Richard Stone’s system of social and demographic accounts aspired to that goal, but it has not been adopted as a model for the compilation of official statistics. Much social data compiled by government is produced according to what might not unfairly be described as pragmatic criteria, definitions, and measures, reflecting a sustained failure to develop the integrating theoretical model underpinning the data collection process characteristic of economics.

The social sciences themselves must bear a good deal of responsibility for the present state of affairs. Although some disciplines such as psychology have devoted considerable attention to questions of systematic definition of concepts, in several disciplines the area has been relatively neglected. Unlike psychology, sociology and political science are not disciplines with large numbers of abstract concepts embedded in formal theories. In survey research, many items have at best an ad hoc rationale, and have no underlying concepts. Moreover, there are few opportunities for rigorous validation, and hence “anything goes.” Most measures can satisfy the rather weak criteria of face and construct validity, which is usually all that is expected (Heath and Martin, 1997: 82).

Concepts

Although concepts are one of the central features that differentiate the social sciences from idiographic intellectual pursuits such as the study of history or literature, the process of concept formation and verification tends to have received relatively little attention in sociology and social research, whether quantitative or qualitative (cf. Bulmer, 1979; Hox, 1997). How is it then, political scientist Giovanni Sartori has asked, that the route of concept analysis has been pursued as lightly as it has? One answer is the dead-end offered by logicians and linguistic analysts, particularly in philosophy, who have offered microanalysis of the meaning of words without dealing with their usefulness in substantive analysis (cf. Gellner, 1959). Sartori himself pursued within the International Political Science Association an ambitious program to achieve conceptual synthesis and agreement across the discipline that seems to have ended in failure. But he recognized that the work of conceptual foundation laying was systematically neglected.

At the other end or extreme, much of what is currently labelled social science “methodology” actually deals with research techniques and statistical processing. In moving from the qualitative to the quantitative science, concepts have been hastily resolved and dissolved into variables…. [C]oncept formation is one thing and the construction of variables is another; and the better the concepts, the better the variables that can be derived from them. Conversely, the more the variable swallows the concept, the poorer our conceiving (Sartori, 1984: 9-10).

A few leading social science researchers, such as Paul Lazarsfeld, paid systematic attention to these issues of improving social measurement, but in general the dialogue has been somewhat fitful (for three British sociological exceptions, see Stacey, 1969; Gittus, 1972; Burgess, 1986). The development and justification of concepts belongs to the context of discovery rather than the context of justification. Consequently, it tends to have received much less systematic attention from methodologists (Hox, 1997).

There remains the practical problem in the social domain of achieving the degree of theoretical integration between concepts and their measurement in empirical data that has been characteristic of economics. In that discipline,

[w]herever an economic argument is being made, or an inference is derived from comparing statistics of different economic activities, the matter of the accuracy of these data arises. This may involve questions of comparability on grounds of definitions and concepts used. It may concern numerical operations of various kinds…. A study of the entire complex of the accuracy of existing statistics or observation is not only helpful but indispensable in designing programs for the collection of new improved data…. In many offices in which statistics are being gathered, special efforts are being made to improve the quality of the data; nevertheless, a systematic approach is frequently lacking (Morgenstern, 1963: 6).

If a systematic approach has been frequently lacking in economics, how much more is this true of other disciplines, such as sociology and social policy? The problems of achieving the degree of integration between conceptualization and measurement characteristic of economics continue to defy solution in the other social sciences. Until these problems are seriously tackled–and there are some signs that they are being tackled piecemeal if not systemically–the prospects for theoretically informed harmonization are slight.

Operationalization

Resolving the definitional and conceptual problems is not enough. The social researcher has to operationalize the concept or classification scheme being used, and embody it in actual research practice. From the point of view of the history of economic and social measurement, it would be instructive to contrast the different courses of economic and social measurement in central government since 1940. Government is important in this context because of the massive resources it commands, and the lead role it takes in basic social statistical data collection through the census and large continuous surveys, such as the Current Population Survey in the United States and the Labour Force Survey in the United Kingdom. One of the arguments of this paper is that the two paths have been divergent, and that it has proved much more difficult to integrate theoretical ideas with practical measurement in the social as compared to the economic spheres.

One explanation of this difference may be sought in the respective roles of expertise in the two areas. The expertise of the economist is both theoretical and empirical, seeking to offer conceptual organization and theoretical propositions, together with the presentation of empirical evidence. An interesting counterpoint from social measurement is provided in recent years in the UK by a systematic effort to harmonize questions in major UK government social surveys in order to foster comparability between different surveys (Government Statistical Service, 1995; Roberts, 1997). The mode of knowledge presented in the harmonization document is different, based as it is on technical expertise in the conduct of social surveys, without any presumption that the variables being characterized are defined in theoretical terms. Indeed, many of the variables are justified purely in pragmatic terms, on the basis of common sense or common identification, and the definitions offered, to the extent that they are, are what Bridgman termed “operational definitions”: the variables are defined in terms of the actual wording employed in their use. This no doubt is how professional survey researchers in government operate, and historically there are distinguished examples of how official categorizations have developed on the basis of pragmatic operational procedures rather than any element of a priori theorizing.

The classic example in the UK is the Registrar General’s Social Class classification, evolved for the analysis of fertility differentials in the 1911 census out of the occupational classification that the Registrar General’s office used for the analysis of mortality differentials in the latter part of the nineteenth century. Though imbued with sociological significance, there are good arguments for the view that the official classification developed by the Registrar General, at least until the Second World War, developed independently from any theoretical input from social scientists and was essentially a pragmatic grouping of occupations intended to reflect the grouping of occupations into social strata as officials of the General Register Office (GRO) perceived them (cf. Leete and Fox, 1977; Szreter, 1984, 1996).

The question then arises of how to validate the classification derived in this way on the pragmatic grounds that it works or appears to work. Usually the route is through cross-checking the results of analyses using this classification with those derived from other sources using alternative classifications. This strategy was employed in the recent review by the Royal Statistical Society of the measurement of the extent and incidence of unemployment (Royal Statistical Society, 1995). The report (by four leading figures in UK statistics) undertook a careful comparison of the results of the “Claimant Count” derived from administrative records (and which had been subject to a large number of changes in the previous 15 years due to adjustments in the rules of inclusion and exclusion), and figures derived from the Labour Force Survey (LFS), which provided estimates of the numbers and proportions in the labor force who are unemployed. The report was generally welcomed as an exemplary study of a difficult issue of social statistical measurement, but what its consequences are in practice remains to be seen. For the time being, the Claimant Count remains the headline figure that is released to the press to reflect changes in unemployment levels from month to month. In this case, the harmonization issue lies in the relationship between data from an administrative source, and data from a survey source that has been developed as an alternative, and one many think provides superior data to the administrative source.

The issue is perhaps different from that faced in much social statistical measurement in that the measurement of unemployment has very high political salience, which has contributed both to political interference in the measurement standards used for the Claimant Count, and public concern about what has been produced as a result, leading in turn to the high-level Royal Statistical Society (RSS) review. Much social measurement has, of course, a small “p” political aspect, but even in contested areas such as the measurement of poverty, it is usually secondary to conceptual, technical, and operational issues about the best way to proceed.

The Challenge of Measurement: Three Areas

If progress is to be made in relation to social measurement, it is likely that this will come by attention to particular topics and variables rather than by means of general programs. The first case discussed here was a general program or movement, which in some respects may be said not to have succeeded. The second and third cases–UK classifications of social class and race and ethnicity–are more specific examples of social measurement.

Social Indicators

An instructive case of the attempt to apply measurement to the assessment of social progress, on a large scale and at a high level, is provided by the social indicators movement, which attracted a great deal of interest in North America and Western Europe in the late 1960s and 1970s. Originally stimulated by what was seen as America’s lagging in the space race with the Soviet Union, the social indicators movement was an ambitious attempt to produce precise, concise, and evaluatively neutral measures of the state of society, and of change in society, using a variety of data, much of it originating with government. Underlying the movement was an appealing idea:

It is important to monitor changes over time in a wide range of quality of life, both for a population as a whole and for its significant sub-groups, because such information, when combined with other data, can generate new knowledge about how to increase the quality of life through more effective social policies. The idea called for two key changes in earlier practices. One was an expansion in the range of phenomena monitored beyond the traditional economic indicators, and an explicit recognition that “life quality,” however it might be defined, involved more than just economic considerations. The second change involved an attempt to focus directly on “output” indicators–i.e., indicators that show how well off people actually are–in addition to the more traditional “input” indicators that reflect budget allocations, procedures and processes that are presumed to enhance well-being (Andrews, 1989: 401).

Thus, the attempt was made to construct standard measures of the state of health, crime, well-being, education, and many other social characteristics of a population. This objective, however, was far from easily realized. Writing in 1989 in a special issue of the Journal of Public Policy, a number of commentators agreed broadly that the social indicators movement had failed in its ambitious aims. Some of the reasons for this were political–skepticism on the part of right-wing governments in Britain and the United States during the 1980s about the value of social indicators programs. But more serious reasons were intellectual, including the problems of developing a system of social indicators, and the absence of a common unit of measurement in relation to social phenomena such as education, housing, health, or crime. In the view of one commentator, a further reason was the failure to design not only the indicators themselves but also the institutional arrangements for their production and application, and for public scrutiny and assessment of methods (Innes, 1989).

The difficulties in making precise social measurement have been the most important obstacle to indicator construction. A basic condition has been lacking for the creation of a system of social indicators: the existence of a common unit of measurement. Economic indicators, which have been developed with considerable success by both the private market and governments in industrial society, have a common measure of value–money–that provides a unifying thread. No such common unit of measurement can be found in fields like education, health, crime, or housing. In all these areas there are multiple alternative measures, few of which are reducible to any common scale. The only area in which measurement is relatively unproblematic, the study of population, works with a few variables such as birth, migration, marriage, and death, the first and last of which are biological facts that can be relatively easily established. Richard Stone’s demographic indicator model rested on such relatively unproblematic data (cf. Stone, 1973). Even migration and marriage give rise to considerable problems of definition. (What are the boundaries across which migration is deemed to have taken place? What constitutes marriage for couples living together outside of wedlock?) How much greater are the problems in other areas, such as health, where the problems of measuring the extent of illness, discomfort, and pain, and disruption of normal activities are more severe? It is instructive that a promising theoretical argument for the construction of a general health indicator put forward by Culyer, Lavers, and Williams (1972) has not been followed through, despite further work on the subject (Culyer, 1983).

Government is also not the ideal setting in which to implement the application of social science to public affairs. The ambitions of the social indicator movement were altogether more grandiose than those of question harmonization, but they were premised upon the same kind of harmonized system, and that failed to develop, except to a limited extent within national statistical offices. The point can be made in a different way by looking at the experience of the annual UK Central Statistical Office publication, Social Trends (comparable to the United States government publication Social Indicators, which appeared in 1973, 1976, and 1980), whose history Muriel Nissel, a former editor, has reviewed (1995). Her account reveals that in addition to the vicissitudes that the UK government statistical service as a whole experienced, subject to various reviews and cuts in establishment, progress in the harmonization of social variables went into reverse after some progress during the 1970s.

In the late 1960s Sir Claus Moser had set up the Standards and Classification Unit in the Central Statistical Office to facilitate the linking together of different sources. Although always more active in the area of economic classifications rather than social, it included responsibility for household classifications. In the 1980s, as a result of the Rayner review, its work on social classifications fell into abeyance and was taken up by the Office of Population Censuses and Surveys, alongside their responsibility for occupational classification. There were limited attempts to unify concepts, such as income, between different surveys but differences still remained between other major classifications. Households, heads of household and children are variously defined; thus the Family Expenditure Survey continues to define children as those under 18 years of age and unmarried and the General Household Survey to define them as those under 16 years of age, or 16-18 in full-time education (Nissel, 1995: 500).
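The divergence described in the passage above can be made concrete. The Python sketch below encodes the two survey definitions of a child as predicates and applies them to the same person. The Person record and the function names are hypothetical, and treating “16-18” as inclusive of both ages is an assumption, since the quotation does not settle the boundary.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """Hypothetical minimal record; real survey records carry far more detail."""
    age: int
    married: bool
    in_full_time_education: bool


def is_child_fes(p: Person) -> bool:
    """Family Expenditure Survey definition as quoted: under 18 and unmarried."""
    return p.age < 18 and not p.married


def is_child_ghs(p: Person) -> bool:
    """General Household Survey definition as quoted: under 16, or 16-18 in
    full-time education (assumed here to include both 16 and 18)."""
    return p.age < 16 or (16 <= p.age <= 18 and p.in_full_time_education)


if __name__ == "__main__":
    # An unmarried 17-year-old who has left full-time education.
    school_leaver = Person(age=17, married=False, in_full_time_education=False)
    print("FES child:", is_child_fes(school_leaver))
    print("GHS child:", is_child_ghs(school_leaver))
```

The same individual counts as a child in one survey and as an adult in the other, so counts of “children” from the two sources are not directly comparable without recoding.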

The progress made in the 1970s was in response to the Joint Approach to Social Policy, instigated by the prime ministerial think tank, the Central Policy Review Staff (CPRS), during the later 1970s, in an attempt to promote more coordinated thinking about government strategy in the social field. This episode has been extensively analyzed, not least by the participants themselves on the CPRS staff, and it reveals some of the obstacles that lie in the way of attempts at harmonization. At the policy level, the CPRS’s Joint Approach to Social Policy (JASP) program encountered many of the problems of central coordination endemic in a decentralized system like British central government in Whitehall. At the statistical level, the initiative on social statistics produced a number of unpublished “Social Briefs” for ministers, and a published report on the relationship between population change and social provision. But after a couple of years the statistical program was cut back, and after the 1979 election and a change of government it disappeared completely (Blackstone and Plowden, 1988; Challis, 1988).

Another reason for the failure of the social indicator movement, it has been claimed, has been poor cross-national harmonization. There has been a bifurcation between objective measurement of how the population actually lives (the Scandinavian school of indicator construction) and subjective measurement of experiences and evaluations of quality of life, more characteristic elsewhere in Western Europe and in North America (Vogel, 1989). The social indicator movement failed signally in its goal of initiating an internationally harmonized system of social accounts. Comprehensive social reports are published in a number of countries, but they are often primarily descriptive, weak on analysis, and inattentive to trends and the establishment of time series, reflecting the general failure to achieve measurement standardization. They usually also lack an international comparative perspective.

Social Class

A good example showing that the formidable conceptual and technical problems can be overcome is provided by the use of social classifications in the UK Population Census. As indicated earlier, these were first introduced in 1911, reflecting the salience of social class as a form of social division in British society. In the past, the standard operational procedure has relied on a published Classification of Occupations, whereby occupational descriptions are used to assign individuals and household members first to occupational groups and then from these to the Registrar General’s social classes (RGSC) or socioeconomic groups (SEG). In the mid-1990s the Office for National Statistics (ONS) instigated a thorough review of social classifications under the auspices of the Economic and Social Research Council, carried out by an independent review committee whose convener was Professor David Rose, a sociologist at the University of Essex. The committee’s interim report (Rose, 1995) strongly recommended the continued use of social classifications by ONS, but proposed a program of research to improve them and to address some of the problems with the previous classifications, including the lack of a clear conceptual rationale, restricted population coverage, criticism of the categories used, problems of gender and social classification, and individual versus household measurement.

Rose’s report is also a fascinating exploration of the fit or lack of fit between pragmatic measures that have been developed for “off-the-peg” users (for example, health professionals and policymakers who use RGSC in the analysis of mortality differentials) and possible conceptual rationales developed by academic specialists in social stratification, which point up the limitations of the official classification system and have led to the development of alternatives such as the Goldthorpe class scheme and the Cambridge stratification scale devised by Blackburn, Prandy, and Stewart (see Rose and O’Reilly, 1997).

The measurement of social position is a matter of exceptional complexity, and the steps toward the revision of the present RGSC and SEG classifications suggested in the report (Rose, 1995: 1017) show that developing a consistent and widely agreed system for the twenty-first century requires more research before an alternative can be promulgated with confidence. The overall aim is to provide the Office for National Statistics with a single, simplified, and improved occupationally based social classification that is conceptually clear, valid and reliable for a range of purposes, easily maintained, and governed by clear operational and maintenance rules. This exercise, which has resulted in a new classification being introduced for the UK 2001 census, shows that harmonization can be a long-drawn-out and negotiated process, informed by the findings of the methodological research needed to improve measurement. It also aroused considerable public interest; Professor Rose’s appearance on the BBC Radio morning news magazine Today produced more than 100,000 hits on the Today website by people trying to determine their social class.

Measuring Race and Ethnicity

A different example comes from an innovative question, introduced into the UK 1991 Census of Population for the first time, on the ethnic group of all persons counted in the census. Hitherto, the United Kingdom, unlike the United States, had never had a direct question in the census about ethnicity. Ethnic group might be inferred, on the basis of various assumptions, from census questions on country of birth and parents’ country of birth or nationality, but the inadequacies of information on the latter, and the growth of second- and third-generation black populations no longer identifiable in terms of country of birth, pointed to the need for a direct question based on self-identification. After considerable methodological development work that began in 1975 and lasted 15 years, and a good deal of sporadic public debate about the advantages of directly asking for the information, a question on the ethnic group of each person enumerated was successfully introduced in the 1991 census. Results from this have been published, and in addition to the raw data, ONS has sponsored a series of volumes, with mainly academic contributors, using these data to analyze a variety of demographic, social, geographical, and economic aspects of ethnic diversity in the United Kingdom (see Office for National Statistics, 1996).

The introduction of the ethnic group question constitutes an impressive example of innovation in official statistics, and of the possibility of pragmatic question design, including some interventions by members of Parliament and government ministers at various points. The precision of measurement achieved may, however, be questioned, the impressive overall picture provided in the four ONS volumes notwithstanding. As I have argued elsewhere (Bulmer, 1996), there is an “ineluctable fuzziness” at the edges in the results from such ethnic questions due to imprecision of the classification.

For example, two categories in the 1991 census output, “black-other” and “other-other,” have attracted much comment by analysts of these data. This is a necessary part of the fuzziness of categories, a feature evident in the harmonization exercise in the mid-1990s in dealing with the subject of ethnic group. The question recommended in the harmonization booklet closely follows the 1991 census question, except that the “other” category is excluded. This is justified on the grounds that the inclusion of such categories “would not be acceptable to several surveys for which this is not a topic of central interest and/or which lack samples of size which would justify such detail” (Government Statistical Service, 1995: 22).

What does this mean, and is this a satisfactory basis on which to construct a harmonized question? How can it be justified conceptually given the need to identify persons of mixed ethnic origin who do not fit easily into the classification of exclusive ethnic groups? What room has to be left for those who do not fit easily into the main categories used in a question? The ethnic group question has been included again in the 2001 population census, but the classification used is somewhat modified, and for the first time a “mixed” category is included, enabling people to classify themselves as being of mixed ethnic origin. In contrast to social class, measurement of ethnicity relies entirely on the person completing the census form or responding to the survey question to assign themselves (or in the case of the census, all members of their household) to a particular ethnic category, developed by the Office for National Statistics on the basis of previous research. Ethnic group membership is thus self-assigned, within a set of pre-coded categories with a write-in option for those who believe they fall outside the pre-coded alternatives.
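The self-assignment procedure just described (a fixed set of pre-coded categories plus a write-in option) can be sketched in a few lines to show what is at stake in dropping the residual category. The category names below are deliberately neutral placeholders rather than the census wording, and the function name, the `allow_write_in` switch, and the output labels are assumptions introduced for illustration only.

```python
from typing import Optional

# Placeholder pre-coded categories; NOT the actual census question wording.
PRECODED = ("Category A", "Category B", "Mixed")


def record_response(ticked: Optional[str],
                    write_in: Optional[str] = None,
                    allow_write_in: bool = True) -> dict:
    """Record one self-assigned answer to a pre-coded question."""
    if ticked in PRECODED:
        return {"category": ticked, "write_in": None}
    if allow_write_in and write_in and write_in.strip():
        # The respondent falls outside the pre-coded boxes and says so.
        return {"category": "Other (write-in)", "write_in": write_in.strip()}
    # With no residual category, the self-description is simply lost.
    return {"category": "Not classified", "write_in": None}


answers = [("Category A", None), (None, "a self-description"), (None, None)]
with_other = [record_response(t, w) for t, w in answers]
without_other = [record_response(t, w, allow_write_in=False) for t, w in answers]
print([r["category"] for r in with_other])
print([r["category"] for r in without_other])
```

In this sketch, removing the write-in route does not make the respondents who fall outside the main categories disappear; it only moves them from a category that preserves their self-description to one that discards it.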

Conclusion: Improving Classification and Measurement in Social Research

The foregoing has shown the complexities of conceptualization, measurement, operationalization, and execution in quantitative social research. What is needed is more systematic attention to the processes involved in social classification in order to tackle some of the inconsistencies and inadequacies that result from the plethora of social measures in use, and to reduce the gap between the theoretical and empirical planes in empirical social inquiry. Classification involves (a) the definition of a domain of classification, (b) the grouping of elements into sets within that domain, (c) labeling of the groups within that domain, and (d) where appropriate, the articulation or arrangement of groups in the classificatory order, such as ranking in a hierarchy. Domains vary in their complexity, and hence in the difficulty of carrying through the classification process. They have been most formalized in the area of occupation, economic activity more generally, and the resultant social class classification, which requires detailed information about a large number of (several thousand) occupations, and rules for the allocation of individuals to particular occupational, employment status, and social class groups. At the other extreme, recording a person’s sex is straightforward and poses virtually no problems from the point of view of question harmonization. The variable of ethnicity or ethnic group falls between the two extremes: the history of attempts to measure the concept is more recent, users’ requirements vary, and the treatment of people of “mixed” racial origin has changed over time.
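The four processes (a) to (d) can be laid out as a toy sketch. Everything in it is hypothetical: the five occupation titles, the group keys, the labels, and the rank order are invented for illustration and do not reproduce the Registrar General’s classification or any other official scheme.

```python
# (a) Domain of classification: the elements to be classified.
domain = ["surgeon", "teacher", "clerk", "electrician", "cleaner"]

# (b) Grouping: an allocation rule placing each element in exactly one set.
group_of = {
    "surgeon": "g1",
    "teacher": "g2",
    "clerk": "g3",
    "electrician": "g3",
    "cleaner": "g4",
}

# (c) Labeling: names attached to the groups.
labels = {"g1": "professional", "g2": "intermediate",
          "g3": "skilled", "g4": "unskilled"}

# (d) Articulation: an explicit ordering of the groups (first = highest).
hierarchy = ["g1", "g2", "g3", "g4"]


def classify(title: str) -> tuple:
    """Return (rank, label) for a title, or raise if it lies outside the domain."""
    if title not in group_of:
        raise KeyError(f"{title!r} is not covered by the classification")
    group = group_of[title]
    return hierarchy.index(group) + 1, labels[group]


# The grouping should be exhaustive over the domain.
unallocated = [t for t in domain if t not in group_of]
ranked = sorted(domain, key=lambda t: classify(t)[0])
print(unallocated)
print([(t, classify(t)) for t in ranked])
```

Separating the four steps in this way shows where judgment enters: the choice of domain in (a) and the ordering in (d) are substantive decisions that no amount of care over the allocation table in (b) and (c) can settle.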

Practical attempts to improve social measurement like the recent UK harmonization exercise may be criticized for not paying enough attention to the first and fourth of the above processes in classification, and for suggesting that the second and third are simply matters of operational practice of a largely pragmatic kind. There is good reason to think that the process is not as straightforward as the Office for National Statistics claims it is. Moreover, the classifications achieved have an ineluctable fuzziness about them.

[T]here is no ultimate truth about most–perhaps all–classifications. It would be lovely if there were a truth one might hope to approach asymptotically and treat deviations from as simple measurement errors. Alas, no; there is essential ambiguity that needs to be understood as well as possible if society is sensibly to use statistical results based on ineluctable fuzziness…. Classification problems arise in all fields of science and beyond (Kruskal, 1981: 511).

There is a great need to increase the two-way traffic across the divide between theory and data. What Oskar Morgenstern wrote a generation ago remains true today:

The process of improving data is an unending one. To be successful it will require a far closer cooperation between those who make and use theories and those who collect and prepare the data. The urgently needed greater cooperative interaction cannot be planned and organized. It has to come about gradually, by itself, from a better understanding of the mutual interests these two groups have in common. The theorists in particular will realize more clearly that efforts spent in improving measurements and designing new measurements where they now seem impossible, will reduce the difficulties of dealing with the data theoretically. Such closer contact will also have a great educational value: every theorist ought to be in intimate touch with the “facts,” “get his hands dirty,” in order to appreciate the very great difficulties encountered even with routine measurements (1963: 304).

References

Alonso, William, and Paul Starr, eds. The Politics of Numbers. New York: Russell Sage Foundation [For the National Committee for Research on the 1980 Census], 1987.

Andrews, Frank M. “The Evolution of a Movement.” Andrews et al. (1989): 401-5.

Andrews, F. M., et al. “Whatever Happened to Social Indicators? A Symposium.” Journal of Public Policy 9:4 (1989).

Bannister, Robert C. Sociology and Scientism: The American Quest for Objectivity, 1880-1940. Chapel Hill: University of North Carolina Press, 1987.

Blackstone, Tessa, and William Plowden. Inside the Think Tank: Advising the Cabinet, 1971-1983. London: William Heinemann, 1988.

Bridgman, P. W. The Logic of Modern Physics. New York: Macmillan, 1927.

Bulmer, Martin. “Concepts in the Analysis of Qualitative Data.” The Sociological Review 27:4 (November 1979): 653-77.

–. The Chicago School of Sociology: Institutionalization, Diversity, and the Rise of Sociological Research. Chicago: University of Chicago Press, 1984.

–. “Problems of Theory and Measurement.” Andrews et al. (1989): 407-12.

–. “The Ethnic Group Question in the 1991 Census of Population.” Ethnicity in the 1991 Census. Eds. David Coleman and John Salt. Vol. 1: Demographic Characteristics of Ethnic Minority Populations. London: HMSO, 1996: 33-62.

Bulmer, Martin, and Robert G. Burgess. “Do Concepts, Variables and Indicators Interrelate?” Burgess (1986): 246-65.

Burgess, Robert G., ed. Key Variables in Social Investigation. London: Routledge, 1986.

Campbell, Donald T. “Definitional Versus Multiple Operationism.” Donald T. Campbell. Methodology and Epistemology for the Social Sciences: Selected Papers. Ed. E. S. Overman. Chicago: University of Chicago Press, 1988: 31-36.

Carley, Michael. Social Measurement and Social Indicators: Issues of Policy and Theory. London: Allen and Unwin, 1981.

Carlisle, Elaine. “The Conceptual Structure of Social Indicators.” Social Indicators and Social Policy. Eds. Andrew Shonfield and Stella Shaw. London: Heinemann Educational Books, 1972: 23-32.

Challis, Linda, et al. Joint Approaches to Social Policy: Rationality and Practice. Cambridge: Cambridge University Press, 1988.

Cicourel, A. V. Method and Measurement in Sociology. New York: The Free Press, 1964.

Cohen, Morris, and Ernest Nagel. An Introduction to Logic and Scientific Method. New York: Harcourt Brace, 1934.

Converse, Jean M. Survey Research in the United States: Roots and Emergence, 1890-1960. Berkeley: University of California Press, 1987.

Culyer, A. J., ed. Health Indicators. Oxford: Martin Robertson, 1983.

Culyer, A. J., R. J. Lavers, and A. Williams. “Health Indicators.” Social Indicators and Social Policy. Eds. A. Shonfield and S. Shaw. London: Heinemann Educational, 1972: 94-118.

Duncan, Otis Dudley. Notes on Social Measurement: Historical and Critical. New York: Russell Sage Foundation, 1984.

Gellner, Ernest. Words and Things. London: Penguin, 1959.

Gittus, Elizabeth, ed. Key Variables in Social Research. Vol. 1: Religion, Housing, Locality. London: Heinemann Educational Books, 1972.

Government Statistical Service. Harmonised Questions for Government Social Surveys. London: HMSO, 1995.

Heath, A., and J. Martin. “Why Are There So Few Formal Measuring Instruments in Social and Political Research?” Lyberg et al., eds. (1997): 71-86.

Hox, Joop J. “From Theoretical Concept to Survey Question.” Lyberg et al., eds. (1997): 47-69.

Hunter, J. S. “The National System of Scientific Measurement.” Science 210 (21 November 1980): 869-74.

Innes, J. E. “Disappointments and Legacies of Social Indicators.” Andrews et al. (1989): 429-32.

Kruskal, William. “Statistics in Society: Problems Unsolved and Unformulated.” Journal of the American Statistical Association 76 (1981): 505-15.

Lazarsfeld, Paul F. “Notes on the History of Quantification in Sociology–Trends, Sources, and Problems.” Woolf, ed. (1961): 147-203.

–. “Sociology.” Main Trends of Research in the Social and Human Sciences. The Hague: Mouton/UNESCO, 1970: 61-65.

Leete, Richard, and John Fox. “Registrar General’s Social Classes: Origins and Uses.” Population Trends 8 (1977): 1-7.

Lieberson, S. Making It Count: The Improvement of Social Research and Theory. Berkeley: University of California Press, 1985.

Lyberg, L., et al., eds. Survey Measurement and Process Quality. New York: Wiley, 1997.

Marris, Robin. Economic Arithmetic. London: Macmillan, 1958.

Moore, Peter G. “Editorial.” Journal of the Royal Statistical Society. Series A. 158:3 (1995): 359-61.

Morgenstern, Oskar. On the Accuracy of Economic Observations. 2d ed. Princeton: Princeton University Press, 1963.

Nissel, Muriel. “Social Trends and Social Change.” Journal of the Royal Statistical Society. Series A. 158:3 (1995): 491-504.

Office for National Statistics. Ethnicity in the 1991 Census. Vol. 1: Demographic Characteristics. Eds. D. C. Coleman and J. Salt. Vol. 2: Profiles of the Main Ethnic Groups. Ed. C. Peach. Vol. 3: Social Geography. Ed. P. Ratcliff. Vol. 4: Education, Employment and Housing. Ed. V. Karn. London: HMSO, 1996.

Paulos, John Allen. Innumeracy: Mathematical Illiteracy and Its Consequences. New York: Hill and Wang, 1988.

Petersen, William. “Politics and the Measurement of Ethnicity.” Alonso and Starr (1987): 187-233.

Porter, Theodore M. Trust in Numbers: The Pursuit of Objectivity in Science and Public Life. Princeton: Princeton University Press, 1995.

Ragin, C. C. Constructing Social Research: The Unity and Diversity of Method. Thousand Oaks, Calif.: Pine Forge Press, 1994.

Roberts, Dennis. “Editorial: Harmonization of Statistical Definitions.” Journal of the Royal Statistical Society. Series A. 160:1 (1997): 1-4.

Rose, David. A Report on Phase I of the ESRC Review of OPCS Social Classifications. Swindon: Economic and Social Research Council, 1995. Available at <http://www.iser.essex.ac.uk/staff/phase-1/frame.htm>.

Rose, David, and Karen O’Reilly, eds. Constructing Classes: Towards a New Social Classification for the UK. Swindon: Economic and Social Research Council, 1997. [With the Office for National Statistics.]

Ross, Dorothy. The Origins of American Social Science. Cambridge: Cambridge University Press, 1991.

Royal Statistical Society. “The Measurement of Unemployment in the UK (With Discussion).” (Report of the Working Party on the Measurement of Unemployment in the UK). Journal of the Royal Statistical Society. Series A. 158:3 (1995): 363-417.

Sartori, Giovanni. “Foreword.” Social Science Concepts: A Systematic Analysis. Ed. G. Sartori. Beverly Hills: Sage, 1984: 9-12.

Sparks, Richard F., Hazel G. Genn, and David J. Dodd. Surveying Victims: A Study of the Measurement of Criminal Victimization. Chichester, Sussex: Wiley, 1977.

Stacey, Margaret, ed. Comparability in Social Research. London: Heinemann Educational Books, 1969.

Starr, Paul. “The Sociology of Official Statistics.” Alonso and Starr (1987).

Stevens, S. S. “On the Theory of Scales of Measurement.” Science 103 (June 1946): 677-80.

–. Psychophysics. New York: Wiley, 1975.

Stone, R. “A System of Social Matrices.” Review of Income and Wealth. Series 19. (1973): 143-66.

Szreter, Simon. “The Genesis of the Registrar General’s Social Classification of Occupations.” British Journal of Sociology 35 (1984): 522-46.

–. Fertility, Class and Gender in Britain, 1860-1940. Cambridge: Cambridge University Press, 1996.

Thurstone, L. L. “Attitudes Can Be Measured.” American Journal of Sociology 33 (1928): 529-54. Reprinted with seven other papers from 1928-1931 in L. L. Thurstone, The Measurement of Values. Chicago: University of Chicago Press, 1959.

Vogel, J. “Social Indicators: A Swedish Perspective.” Andrews et al. (1989): 439-44.

Woolf, H., ed. Quantification: A History of the Meaning of Measurement in the Natural and Social Sciences. Indianapolis: Bobbs-Merrill, 1961.

Zebrowski, Jr., E. Fundamentals of Physical Measurement. North Scituate, Mass.: Duxbury Press, 1979.

COPYRIGHT 2001 New School for Social Research
