Examining the validity structure of qualitative research
R Burke Johnson
Three types of validity in qualitative research are discussed. First, descriptive validity refers to the factual accuracy of the account as reported by the qualitative researcher. Second, interpretive validity is obtained to the degree that the participants’ viewpoints, thoughts, intentions, and experiences are accurately understood and reported by the qualitative researcher. Third, theoretical validity is obtained to the degree that a theory or theoretical explanation developed from a research study fits the data and is, therefore, credible and defensible. The two types of validity that are typical of quantitative research, internal and external validity, are also discussed for qualitative research. Twelve strategies used to promote research validity in qualitative research are discussed.
Discussions of the term “validity” have traditionally been attached to the quantitative research tradition. Not surprisingly, reactions by qualitative researchers have been mixed regarding whether or not this concept should be applied to qualitative research. At the extreme, some qualitative researchers have suggested that the traditional quantitative criteria of reliability and validity are not relevant to qualitative research (e.g., Smith, 1984). Smith contends that the basic epistemological and ontological assumptions of quantitative and qualitative research are incompatible, and, therefore, the concepts of reliability and validity should be abandoned. Most qualitative researchers, however, probably hold a more moderate viewpoint. They argue that some qualitative research studies are better than others, and they frequently use the term validity to refer to this difference. When qualitative researchers speak of research validity, they are usually referring to qualitative research that is plausible, credible, trustworthy, and, therefore, defensible. We believe it is important to think about the issue of validity in qualitative research and to examine some strategies that have been developed to maximize validity (Kirk & Miller, 1986; LeCompte & Preissle, 1993; Lincoln & Guba, 1985; Maxwell, 1996). A list of these strategies is provided in Table 1.
One potential threat to validity that researchers must be careful to watch out for is called researcher bias. This problem is summed up in a statement a colleague once made to me: “The problem with qualitative research is that the researchers find what they want to find, and then they write up their results.” It is true that the problem of researcher bias is frequently an issue because qualitative research is open-ended and less structured than quantitative research. This is because qualitative research tends to be exploratory. (One would be remiss, however, to think that researcher bias is never a problem in quantitative research!) Researcher bias tends to result from selective observation and selective recording of information, and also from allowing one’s personal views and perspectives to affect how data are interpreted and how the research is conducted.
The key strategy used to understand researcher bias is called reflexivity, which means that the researcher actively engages in critical self-reflection about his or her potential biases and predispositions (Table 1). Through reflexivity, researchers become more self-aware, and they monitor and attempt to control their biases. Many qualitative researchers include a distinct section in their research proposals titled “Researcher Bias.” In this section, they discuss their personal background, how it may affect their research, and what strategies they will use to address the potential problem. Another strategy that qualitative researchers use to reduce the effect of researcher bias is called negative case sampling (Table 1). This means that they attempt carefully and purposively to search for examples that disconfirm their expectations and explanations about what they are studying. If you use this approach, you will find it more difficult to ignore important information, and you will come up with more credible and defensible results.
We will now examine some types of validity that are important in qualitative research. We will start with three types of validity that are especially relevant to qualitative research (Maxwell, 1992, 1996). These types are called descriptive validity, interpretive validity, and theoretical validity. They are important to qualitative research because description of what is observed and interpretation of participants’ thoughts are two primary qualitative research activities. For example, ethnography produces descriptions and accounts of the lives and experiences of groups of people with a focus on cultural characteristics (Fetterman, 1998; LeCompte & Preissle, 1993). Ethnographers also attempt to understand groups of people from the insider’s perspective (i.e., from the viewpoints of the people in the group; called the emic perspective). Developing a theoretical explanation of the behavior of group members is also of interest to qualitative researchers, especially qualitative researchers using the grounded theory perspective (Glaser & Strauss, 1967; Strauss & Corbin, 1990). After discussing these three forms of validity, the traditional types of validity used in quantitative research, internal and external validity, are discussed. Internal validity is relevant when qualitative researchers explore cause and effect relationships. External validity is relevant when qualitative researchers generalize beyond their research studies.
The first type of validity in qualitative research is called descriptive validity. Descriptive validity refers to the factual accuracy of the account as reported by the researchers. The key questions addressed in descriptive validity are: Did what was reported as taking place in the group being studied actually happen? And did the researchers accurately report what they saw and heard? In other words, descriptive validity refers to accuracy in reporting descriptive information (e.g., description of events, objects, behaviors, people, settings, times, and places). This form of validity is important because description is a major objective in nearly all qualitative research.
One effective strategy used to obtain descriptive validity is called investigator triangulation. In the case of descriptive validity, investigator triangulation involves the use of multiple observers to record and describe the research participants’ behavior and the context in which they were located. The use of multiple observers allows cross-checking of observations to make sure the investigators agree about what took place. When corroboration (i.e., agreement) of observations across multiple investigators is obtained, it is less likely that outside reviewers of the research will question whether something occurred. As a result, the research will be more credible and defensible.
While descriptive validity refers to accuracy in reporting the facts, interpretive validity requires developing a window into the minds of the people being studied. Interpretive validity refers to accurately portraying the meaning that participants attach to what is being studied. More specifically, it refers to the degree to which the research participants’ viewpoints, thoughts, feelings, intentions, and experiences are accurately understood by the qualitative researcher and portrayed in the research report. An important part of qualitative research is understanding research participants’ inner worlds (i.e., their phenomenological worlds), and interpretive validity refers to the degree of accuracy in presenting these inner worlds. Achieving interpretive validity requires that the researcher get inside the heads of the participants, look through the participants’ eyes, and see and feel what they see and feel. In this way, the qualitative researcher can understand things from the participants’ perspectives and provide a valid account of these perspectives.
Some strategies for achieving interpretive validity are provided in Table 1. Participant feedback is perhaps the most important strategy (Table 1). This strategy has also been called “member checking” (Lincoln & Guba, 1985). By sharing your interpretations of participants’ viewpoints with the participants and other members of the group, you may clear up areas of miscommunication. Do the people being studied agree with what you have said about them? While this strategy is not perfect, because some participants may attempt to put on a good face, useful information is frequently obtained and inaccuracies are often identified.
When writing the research report, using many low inference descriptors is also helpful so that the reader can experience the participants’ actual language, dialect, and personal meanings (Table 1). A verbatim is the lowest inference descriptor of all because the participants’ exact words are provided in direct quotations. Here is an example of a verbatim from a high school dropout who was part of an ethnographic study of high school dropouts:
I wouldn’t do the work. I didn’t like the teacher and I didn’t like my mom and dad. So, even if I did my work, I wouldn’t turn it in. I completed it. I just didn’t want to turn it in. I was angry with my mom and dad because they were talking about moving out of state at the time. (Okey & Cusick, 1995, p. 257)

This verbatim provides some description (i.e., what the participant did), but it also provides some information about the participant’s interpretations and personal meanings (which is the topic of interpretive validity). The participant expresses his frustration and anger toward his parents and teacher, and shares with us what homework meant to him at the time and why he acted as he did. By reading verbatims like this one, readers of a report can experience for themselves the participants’ perspectives. Again, getting into the minds of research participants is a common goal in qualitative research, and Maxwell calls our accuracy in portraying this inner content interpretive validity.
The third type of validity in qualitative research is called theoretical validity. You have theoretical validity to the degree that a theoretical explanation developed from a research study fits the data and, therefore, is credible and defensible. Theory usually refers to discussions of how a phenomenon operates and why it operates as it does. Theory is usually more abstract and less concrete than description and interpretation. Theory development moves beyond just the facts and provides an explanation of the phenomenon. In the words of Joseph Maxwell (1992):
…one could label the student’s throwing of the eraser as an act of resistance, and connect this act to the repressive behavior or values of the teacher, the social structure of the school, and class relationships in U.S. society. The identification of the throwing as resistance constitutes the application of a theoretical construct…. The connection of this to other aspects of the participants, the school, or the community constitutes the postulation of theoretical relationships among these constructs (p. 291).
In the above example, the theoretical construct called “resistance” is used to explain the student’s behavior. Maxwell points out that the construct of resistance may also be related to other theoretical constructs or variables. In fact, theories are often developed by relating theoretical constructs.
A strategy for promoting theoretical validity is extended fieldwork (Table 1). This means that you should spend a sufficient amount of time studying your research participants and their setting so that you can have confidence that the patterns of relationships you believe are operating are stable and so that you can understand why these relationships occur. As you spend more time in the field collecting data and generating and testing your inductive hypotheses, your theoretical explanation may become more detailed and intricate. You may also decide to use the strategy called theory triangulation (Table 1; Denzin, 1989). This means that you would examine how the phenomenon being studied would be explained by different theories. The various theories might provide you with insights and help you develop a more cogent explanation. In a related way, you might also use investigator triangulation and consider the ideas and explanations generated by additional researchers studying the research participants.
As you develop your theoretical explanation, you should make some predictions based on the theory and test the accuracy of those predictions. When doing this you can use the pattern matching strategy (Table 1). In pattern matching, the strategy is to make several predictions at once; then, if all of the predictions occur as predicted (i.e., if the pattern is found), you have evidence supporting your explanation. As you develop your theoretical explanation you should also use the negative case sampling strategy mentioned earlier (Table 1). That is, you must always search for cases or examples that do not fit your explanation so that you do not simply find the data that support your developing theory. As a general rule, your final explanation should accurately reflect the majority of the people in your research study. Another useful strategy for promoting theoretical validity is called peer review (Table 1). This means that you should try to spend some time discussing your explanation with your colleagues so that they can search for problems with it. Each problem must then be resolved. In some cases you will find that you will need to go back to the field and collect additional data. Finally, when developing a theoretical explanation, you must also think about the issues of internal validity and external validity to which we now turn.
Internal validity is the fourth type of validity in qualitative research of interest to us. Internal validity refers to the degree to which a researcher is justified in concluding that an observed relationship is causal (Cook & Campbell, 1979). Often qualitative researchers are not interested in cause and effect relationships. Sometimes, however, qualitative researchers are interested in identifying potential causes and effects. In fact, qualitative research can be very helpful in describing how phenomena operate (i.e., studying process) and in developing and testing preliminary causal hypotheses and theories (Campbell, 1979; Johnson, 1994; LeCompte & Preissle, 1993; Strauss, 1995).
When qualitative researchers identify potential cause and effect relationships, they must think about many of the same issues that quantitative researchers must consider. They should also think about the strategies used for obtaining theoretical validity discussed earlier. The qualitative researcher takes on the role of the detective searching for the true cause(s) of a phenomenon, examining each possible clue, and attempting to rule out each rival explanation generated (see researcher as detective in Table 1). When trying to identify a causal relationship, the researcher makes mental comparisons. The comparison might be to a hypothetical control group. Although a control group is rarely used in qualitative research, the researcher can think about what would have happened if the causal factor had not occurred. The researcher can sometimes rely on his or her expert opinion, as well as published research studies when available, in deciding what would have happened. Furthermore, if the event is something that occurs again the researcher can determine if the causal factor precedes the outcome. In other words, when the causal factor occurs again, does the effect follow?
When a researcher believes that an observed relationship is causal, he or she must also attempt to make sure that the observed change in the dependent variable is due to the independent variable and not to something else (e.g., a confounding extraneous variable). The successful researcher will always make a list of rival explanations or rival hypotheses, which are possible or plausible reasons for the relationship other than the originally suspected cause. Be creative and think of as many rival explanations as you can. One way to get started is to be a skeptic and think of reasons why the relationship should not be causal. Each rival explanation must be examined after the list has been developed. Sometimes you will be able to check a rival explanation with the data you have already collected through additional data analysis. At other times you will need to collect additional data. One strategy would be to observe the relationship you believe to be causal under conditions where the confounding variable is not present and compare this outcome with the original outcome. For example, if you concluded that a teacher effectively maintained classroom discipline on a given day but a critic maintained that it was the result of a parent visiting the classroom on that day, then you should try to observe the teacher again when the parent is not present. If the teacher is still successful, you have some evidence that the original finding was not because of the presence of the parent in the classroom.
All of the strategies shown in Table 1 are used to improve the internal validity of qualitative research. Now we will explain the only two strategies not yet discussed (i.e., methods triangulation and data triangulation). When using methods triangulation the researcher uses more than one method of research in a single research study. The word methods should be used broadly here: it refers to different methods of research (e.g., ethnography, survey, experimental, etc.) as well as to different types of data collection procedures (e.g., interviews, questionnaires, and observations). You can intermix any of these (e.g., ethnography and survey research methods, or interviews and observations, or experimental research and interviews). The logic is to combine different methods that have “nonoverlapping weaknesses and strengths” (Brewer & Hunter, 1989). The weaknesses (and strengths) of one method will tend to be different from those of a different method, which means that when you combine two or more methods you will have better evidence! In other words, the “whole” is better than its “parts.” Here is an example of methods triangulation. Perhaps you are interested in why students in an elementary classroom stigmatize a certain student named Brian. A stigmatized student is an individual who is not well liked, has a lower status, and is seen as different from the normal students. Perhaps Brian has a different haircut from the other students, is dressed differently, or doesn’t act like the other students. In this case, you might decide to observe how students treat Brian in various situations. In addition to observing the students, you will probably decide to interview Brian and the other students to understand their beliefs and feelings about Brian. A strength of observational data is that you can actually see the students’ behaviors. A weakness of interviews is that what the students say and what they actually do may be different.
However, using interviews you can delve into the students’ thinking and reasoning, whereas you cannot do this using observational data. Therefore, the whole will likely be better than the parts.
When using data triangulation the researcher uses multiple data sources in a single research study. “Data sources” does not mean using different methods. Data triangulation refers to the use of multiple data sources using a single method. For example, the use of multiple interviews would provide multiple data sources while using a single method (i.e., the interview method). Likewise, the use of multiple observations would be another example of data triangulation; multiple data sources would be provided while using a single method (i.e., the observational method). Another important part of data triangulation involves collecting data at different times, at different places, and with different people.
Here is an example of data triangulation. Perhaps a researcher is interested in studying why certain students are apathetic. It would make sense to get the perspectives of several different kinds of people. The researcher might interview teachers, interview students identified by the teachers as being apathetic, and interview peers of apathetic students. Then the researcher could check to see if the information obtained from these different data sources was in agreement. Each data source may provide additional reasons as well as a different perspective on the question of student apathy, resulting in a more complete understanding of the phenomenon. The researcher should also interview apathetic students at different class periods during the day and in different types of classes (e.g., math and social studies). Through the rich information gathered (e.g., from different people, at different times, and at different places) the researcher can develop a better understanding of why students are apathetic than if only one data source is used.
External validity is important when you want to generalize from a set of research findings to other people, settings, and times (Cook & Campbell, 1979). Typically, generalizability is not the major purpose of qualitative research. There are at least two reasons for this. First, the people and settings examined in qualitative research are rarely randomly selected, and, as you know, random selection is the best way to generalize from a sample to a population. As a result, qualitative research is virtually always weak on the form of population validity concerned with “generalizing to populations” (i.e., generalizing from a sample to a population).
Second, some qualitative researchers are more interested in documenting particularistic findings than universalistic findings. In other words, in certain forms of qualitative research the goal is to show what is unique about a certain group of people, or a certain event, rather than generate findings that are broadly applicable. At a fundamental level, many qualitative researchers do not believe in the presence of general laws or universal laws. General laws are things that apply to many people, and universal laws are things that apply to everyone. As a result, qualitative research is frequently considered weak on the “generalizing across populations” form of population validity (i.e., generalizing to different kinds of people), and on ecological validity (i.e., generalizing across settings) and temporal validity (i.e., generalizing across times).
Other experts argue that rough generalizations can be made from qualitative research. Perhaps the most reasonable stance toward the issue of generalizing is that we can generalize to other people, settings, and times to the degree that they are similar to the people, settings, and times in the original study. Stake (1990) uses the term “naturalistic generalization” to refer to this process of generalizing based on similarity. The bottom line is this: The more similar the people and circumstances in a particular research study are to the ones that you want to generalize to, the more defensible your generalization will be and the more readily you should make such a generalization.1
To help readers of a research report know when they can generalize, qualitative researchers should provide the following kinds of information: the number and kinds of people in the study, how they were selected to be in the study, contextual information, the nature of the researcher’s relationship with the participants, information about any informants who provided information, the methods of data collection used, and the data analysis techniques used. This information is usually reported in the Methodology section of the final research report. Using the information included in a well-written methodology section, readers will be able to make informed decisions about to whom the results may be generalized. They will also have the information they will need if they decide to replicate the research study with new participants.
Some experts point to another way to generalize from qualitative research (e.g., Yin, 1994). Qualitative researchers can sometimes use replication logic, just like the replication logic that is commonly used by experimental researchers when they generalize beyond the people in their studies, even when they do not have random samples. According to replication logic, the more times a research finding is shown to be true with different sets of people, the more confidence we can place in the finding and in the conclusion that the finding generalizes beyond the people in the original research study (Cook & Campbell, 1979). In other words, if the finding is replicated with different kinds of people and in different places, then the evidence may suggest that the finding applies very broadly. Yin’s key point is that there is no reason why replication logic cannot be applied to certain kinds of qualitative research.2
Here is an example. Over the years you may observe a certain pattern of relations between boys and girls in your third grade classroom. Now assume that you decided to conduct a qualitative research study and you find that this pattern of relations occurred in your classroom and in two other third grade classrooms you studied. Because your research is interesting, you decide to publish it. Then other researchers replicate your study with other people and they find that the same relationship holds in the third grade classrooms they studied. According to replication logic, the more times a theory or a research finding is replicated with other people, the greater the support for the theory or research finding. Now assume further that other researchers find that the relationship holds in classrooms at several other grade levels (e.g., first grade, second grade, fourth grade, and fifth grade). If this happens, the evidence suggests that the finding generalizes to students in other grade levels, lending additional generality to the finding.
We want to make one more comment before concluding. If generalizing through replication and theoretical validity (discussed above) sound similar, that is because they are. Basically, generalizing (i.e., external validity) is frequently part of theoretical validity. In other words, when researchers develop theoretical explanations, they often want to generalize beyond their original research study. Likewise, internal validity is also important for theoretical validity if cause and effect statements are made.
1. Donald Campbell (1986) makes a similar point, and he uses the term proximal similarity to refer to the degree of similarity between the people and circumstances in the original research study and the people and circumstances to which you wish to apply the findings. Using Campbell’s term, your goal is to check for proximal similarity.

2. The late Donald Campbell, perhaps the most important quantitative research methodologist of the past 50 years, approved of Yin’s (1994) book. See, for example, his introduction to that book.
Brewer, J., & Hunter, A. (1989). Multimethod research: A synthesis of styles. Newbury Park, CA: Sage.

Campbell, D.T. (1979). Degrees of freedom and the case study. In T.D. Cook & C.S. Reichardt (Eds.), Qualitative and quantitative methods in evaluation research (pp. 49-67). Beverly Hills, CA: Sage.

Campbell, D.T. (1986). Relabeling internal and external validity for applied social scientists. In W. Trochim (Ed.), Advances in quasi-experimental design and analysis (New Directions for Program Evaluation, No. 31). San Francisco: Jossey-Bass.

Cook, T.D., & Campbell, D.T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Chicago: Rand McNally.

Denzin, N.K. (1989). The research act: A theoretical introduction to sociological methods. Englewood Cliffs, NJ: Prentice Hall.

Fetterman, D.M. (1998). Ethnography. In L. Bickman & D.J. Rog (Eds.), Handbook of applied social research methods. Thousand Oaks, CA: Sage.

Glaser, B.G., & Strauss, A.L. (1967). The discovery of grounded theory: Strategies for qualitative research. New York: Aldine de Gruyter.

Johnson, R.B. (1994). Qualitative research in education. SRATE Journal, 4(1), 3-7.

Kirk, J., & Miller, M.L. (1986). Reliability and validity in qualitative research. Newbury Park, CA: Sage.

LeCompte, M.D., & Preissle, J. (1993). Ethnography and qualitative design in educational research. San Diego, CA: Academic Press.

Lincoln, Y.S., & Guba, E.G. (1985). Naturalistic inquiry. Beverly Hills, CA: Sage.

Maxwell, J.A. (1992). Understanding and validity in qualitative research. Harvard Educational Review, 62(3), 279-299.

Maxwell, J.A. (1996). Qualitative research design. Newbury Park, CA: Sage.

Okey, T.N., & Cusick, P.A. (1995). Dropping out: Another side of the story. Educational Administration Quarterly, 31(2), 244-267.

Smith, J.K. (1984). The problem of criteria for judging interpretive inquiry. Educational Evaluation and Policy Analysis, 6, 379-391.

Smith, J.K. (1986). Closing down the conversation: The end of the quantitative-qualitative debate among educational inquirers. Educational Researcher, 15(1), 4-12.

Stake, R.E. (1990). Situational context as influence on evaluation design and use. Studies in Educational Evaluation, 16, 231-246.

Strauss, A. (1995). Notes on the nature and development of general theories. Qualitative Inquiry, 1(1), 7-18.

Strauss, A., & Corbin, J. (1990). Basics of qualitative research: Grounded theory procedures and techniques. Newbury Park, CA: Sage.

Yin, R.K. (1994). Case study research: Design and methods. Newbury Park, CA: Sage.
R. BURKE JOHNSON, PH.D., Education
University of South Alabama Mobile, Alabama 36688
Copyright Project Innovation Winter 1997