A scale in search of a construct: Comments on Gavin and Wamboldt

Schouten, Peter G W

Family science has been characterized by a rapid evolution of theory (Lavee & Dollahite, 1991). Direct tests of the new theoretical formulations are likely to prove worthwhile given adequate measures of key constructs. In addition, sound family-focused assessments may permit more precise definitions of clinical problems and, hence, more effective interventions. Thus, the development of such assessments is ultimately in the interest of the scientific legitimization of family science concepts and family therapy practice.

The Family-of-Origin Scale (FOS; Hovestadt, Anderson, Piercy, Cochran, & Fine, 1985) is an assessment device that has generated considerable interest in relation to the study of family health (Gavin & Wamboldt, 1992; Lee, Gordon, & O’Dell, 1989; Manley, Searight, Skitka, Russo, & Schudy, 1991; Mazer, Mangrum, Hovestadt, & Brashear, 1990). The FOS is a 40-item, standardized self-report measure that was designed to tap two dimensions of the family of origin: autonomy and intimacy. In their recent FOS study, Gavin and Wamboldt (1992) reported factor analytic, correlational, and reliability data for a sample of 63 premarital couples. The authors concluded that the scale is useful as a measure of satisfaction with the family of origin. At first glance, the Gavin and Wamboldt article is encouraging. A closer look reveals a number of problems that deserve further attention.

The focus of a validation effort is not the measurement itself but rather the construct to be inferred on the basis of the measurement. The importance of a construct is determined by its place within a larger network of other mutually compatible constructs. The usefulness of an assessment device, on the other hand, depends on its ability to operationalize a construct of interest (Landy, 1986). The purpose of the present article is to examine these issues in relation to the FOS and the Gavin and Wamboldt study.

As Gavin and Wamboldt (1992) observed, the FOS has a limited research history and ambiguities exist as to what the scale is measuring and how it should be used. Lee et al. (1989) previously suggested that the FOS appears to be “largely concerned with ‘encouragement of communication’ among family members” but were quick to add that “there are other possibilities” (p. 26), prime candidates including response style and response bias. Lee et al. (1989) observed that within- and between-group variability on the FOS is difficult to interpret: “One doesn’t know why expressed perceptions of families differ between groups, or in what ways expressed perceptions of families differ one from another within groups” (p. 26). In short, the FOS has a construct identity problem.

Gavin and Wamboldt (1992) were more positive in their appraisal of the FOS than were Lee et al. (1989) and specifically endorsed the use of the scale as a measure of “the individual’s satisfaction with his/her family of origin” (p. 187). Gavin and Wamboldt did not, however, explain the importance of the hypothesized satisfaction construct. That is, they did not describe how this construct relates to themes that are important in family assessment and family science research. As will be shown in the analysis that follows, based on a consideration of clinical and theoretical importance, a less optimistic conclusion about the FOS would seem justified.

Lee et al. (1989) observed that respondents were unable to complete some FOS items because “there was no such thing as ‘the’ family” (p. 27). In its current form, the FOS describes the family of origin as a whole, without distinguishing among family subsystems. It is impossible to determine who is included in a respondent’s concept of his or her family of origin. Thus, the individual’s FOS ratings could describe not only his or her own family relationships but also intrafamilial relationships that do not include the respondent (e.g., sibling-parent or parent-parent dyads). Further, just as there is no such thing as “the family,” there is also no such thing as a single point of view on the family (Richie, 1991; Richie & Fitzpatrick, 1990). In other words, a description of the family based on the FOS would represent a “spurious average phenomenology of the family” (Ransom, Fisher, Philips, Kokes, & Weiss, 1990) that obscures with whom and how family members actually interact.

Assuming for a moment that the FOS is a measure of satisfaction, the individual’s ratings could be influenced by any number of cognitive, emotional, and behavioral factors (Noller & Fitzpatrick, 1990). Responses on a scale such as the FOS may, to some degree, be state dependent or mood congruent (Brewin, Andrews, & Gotlib, 1993). For example, transient affective states that do not directly reflect family-of-origin factors may influence subjects’ responses. On the other hand, subjects’ ratings of the family may reflect relatively stable differences with respect to the ability to cope with family stressors or deprivations and with respect to the internal standards, needs, and expectations by which subjects evaluate their family environments. Individuals may be comparable in self-reported levels of satisfaction, but for very different reasons (Christensen & Shenk, 1991; Moffit, Spence, & Goldney, 1986).

It is unclear whether the hypothesized family-of-origin satisfaction construct can help us understand the nature of family health. The current theoretical emphasis of marital communication research is on the identification of the types of problems and adaptive qualities that account for differential adjustment (Noller & Fitzpatrick, 1990). This focus is also relevant to the study of the family of origin and family dynamics. Reflecting an interest in specific mechanisms, recent research has focused on factors such as interpersonal competence (Christensen & Shenk, 1991; Notarius & Vanzetti, 1983), conflict and affect regulation (Lindahl & Markman, 1990), and the acceptance of nonresolution as a form of constructive change (Jacobson, 1991). These processes modulate emotional effects which, in turn, can be expected to predict variations along a satisfaction-dissatisfaction continuum. For example, conflict regulation deficits are likely to be associated with aversive emotions and, hence, lower levels of satisfaction. The emotional effects of an experienced conflict can be understood as epiphenomena. As determinants of satisfaction, the conflict behaviors and coping efforts themselves are of primary theoretical and clinical interest because they have more explanatory power than their concomitant emotional states. These behavioral processes should be assessed directly rather than inferred from their emotional epiphenomena.

The importance of precision with regard to the nature of interpersonal difficulties and the limitations of global satisfaction ratings have been discussed elsewhere (e.g., Weiss, 1990). Suffice it to say that global measures of family functioning, and satisfaction measures in particular, are unlikely to add to clinicians’ and researchers’ armamentarium because they are too general to permit specific and meaningful inferences.


In the Gavin and Wamboldt (1992) study, the construct identity issues identified in the foregoing discussion are complicated by several methodological problems. In particular, Gavin and Wamboldt used a new instructional set for the FOS. The original instructions ask a subject to describe “the family with which you spent most or all of your childhood years” (Hovestadt et al., 1985, p. 289). Previous studies on the FOS (Lee et al., 1989; Mazer et al., 1990) used the original instructions. In contrast, Gavin and Wamboldt asked subjects “to rate their family as they would have during the last time they were regularly interacting with them” (p. 181). This change in format introduces an element of unknown and uncontrolled method variance, if not an altogether new construct. Surprisingly, Gavin and Wamboldt offered no rationale for their format change. Why would the most recent contact be of interest? Can it be assumed to be representative of an individual’s total family-of-origin experience or characteristic interactions?

Convergent and divergent validity are at the core of the validation process (Campbell & Fiske, 1959). The construct definition adopted as the starting point of a validity study ordinarily guides the choice of validity criteria and should also shape the predictions against which the results are compared. How was the hypothesized family-of-origin satisfaction construct reflected in Gavin and Wamboldt’s choice of criterion measures? These researchers compared the FOS to eight other family indices, none of which was a recognized index of satisfaction with the family of origin. Gavin and Wamboldt did not say why they selected their set of validity criteria.

One of Gavin and Wamboldt’s (1992) criterion measures was the Family Environment Scale (FES; Moos & Moos, 1981). Recent evidence has cast doubts on this scale’s psychometric adequacy (Loveland-Cherry, Youngblut, & Leidy, 1989; Roosa & Beals, 1990), thus raising questions about its appropriateness as a validity criterion. In addition, in the Gavin and Wamboldt study, the response format of the FES was modified to match the modification of the FOS: research participants were asked to describe the last time they regularly interacted with their parents. As Moos (1990) observed, the response format used with the FES can substantially influence the results. It is unclear why Gavin and Wamboldt used a new response format with the FOS and the FES but not with their other validity criteria.

Gavin and Wamboldt (1992) began their empirical study with a factor analysis. Factor analysis can be approached in an exploratory manner or in a confirmatory manner. An exploratory approach can be used to determine communalities among items, that is, to identify groups of items that are more highly correlated with each other than with other items. The observed factors are typically labeled on a post hoc basis (Gorsuch, 1974). A confirmatory approach, on the other hand, can be used to determine the goodness of fit between the data and a scale’s hypothesized dimensions (Long, 1983). The latter approach is concerned with the verification of an a priori idea of factor structure and thus directly addresses the issue of construct identity.

A factor structure validation of the FOS called for a confirmatory factor analysis. Gavin and Wamboldt’s (1992) factor analysis, however, was intended to replicate Lee et al.’s (1989) exploratory procedure. At the same time, Gavin and Wamboldt sought to “approximate more closely the original conceptualization of Hovestadt and associates” (p. 182). These two aims (a replication of the Lee et al. findings and a factor structure validation) were incompatible. A factor structure validation for the FOS would have confirmed 2, not 10, dimensions, that is, autonomy and intimacy (Hovestadt et al., 1985). Thus, Gavin and Wamboldt’s analysis, which was preset to extract 10 factors, did not provide an appropriate factor analytic test of the scale’s construct validity.

Gavin and Wamboldt’s (1992) replication effort is difficult to evaluate in view of the variations in factor solutions found in previous research. Mazer et al. (1990) obtained 7-factor solutions for both the FOS standardization sample and for an undergraduate sample. Lee et al. (1989) obtained a 9-factor solution for their nonpatient sample and a 10-factor solution for their patient sample. It is unclear why Gavin and Wamboldt sought to replicate Lee et al.’s 10-factor solution. Unlike the sample for which Lee et al. found 10 factors, Gavin and Wamboldt’s subjects were not drawn from a client population. The observed cross-study differences in FOS factor solutions are not surprising given the impact of sampling effects on research findings. What is surprising is that Gavin and Wamboldt’s replication claim was not qualified with an explicit recognition of the sample dependence of statistical evidence.

Gavin and Wamboldt (1992) described their factor analytic findings and those reported by Lee et al. (1989) as being “virtually identical,” with “modest differences in content” (p. 183). Closer examination reveals that Gavin and Wamboldt’s first factor had only 2 items in common with Lee et al.’s first factor (items 23 and 27). The remaining 10 items that defined Gavin and Wamboldt’s first factor are scattered throughout the factor loading matrices shown in the Lee et al. study. Further, it is unclear whether a factor replication claim can be supported without assurance of cross-study consistency with respect to cutoff criteria. Lee et al. identified items that had weights of at least .40 and retained factors that met the eigenvalue rule of 1.00 or larger (see also Mazer et al., 1990). Gavin and Wamboldt did not describe their cutoff criteria.
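Why unreported cutoff criteria matter for a replication claim can be illustrated with a minimal sketch. The eigenvalues and loadings below are hypothetical values invented for illustration, not data from any FOS study, and the helper names are assumptions. Only the two default thresholds (.40 for item weights, 1.00 for eigenvalues) come from the criteria attributed to Lee et al. above.

```python
def retained_factors(eigenvalues, min_eigenvalue=1.00):
    """Indices of factors whose eigenvalue meets the retention rule."""
    return [i for i, ev in enumerate(eigenvalues) if ev >= min_eigenvalue]


def salient_items(loadings, factor, min_loading=0.40):
    """Indices of items whose absolute loading on `factor` meets the cutoff."""
    return [item for item, row in enumerate(loadings)
            if abs(row[factor]) >= min_loading]


# Hypothetical eigenvalues and item-by-factor loadings (illustration only).
eigenvalues = [3.2, 1.4, 0.9]
loadings = [
    [0.72, 0.10, 0.05],
    [0.45, 0.38, 0.02],
    [0.36, 0.55, 0.11],
    [0.38, 0.12, 0.60],
]

print(retained_factors(eigenvalues))                # factors kept under the 1.00 rule
print(salient_items(loadings, 0, min_loading=0.40))  # items defining factor 0 at .40
print(salient_items(loadings, 0, min_loading=0.35))  # same factor, looser cutoff
```

In this toy matrix, lowering the item cutoff from .40 to .35 doubles the number of items that "define" the first factor. Two studies that apply different, or unstated, cutoffs can therefore disagree about a factor's item content even when their loading matrices are similar.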

It is tempting to speculate on whether chance fluctuations or true sample-dependent differences were responsible for the observed discrepancies in dimensional structure. However, the Gavin and Wamboldt (1992) study was not comparable to previous studies on the FOS due to the change in response format noted previously. This format change should have been recognized as an uncontrolled method factor that placed serious constraints on the replication effort. Despite the discrepant findings and methodological ambiguities, Gavin and Wamboldt concluded that the results of their factor analysis converged on a global satisfaction construct.

As noted previously, Gavin and Wamboldt did not compare the FOS to a known measure of family-of-origin satisfaction. In the absence of a more appropriate independent external criterion, the authors’ conclusion about what the FOS is measuring appears to rest on the finding of a large first factor. Gavin and Wamboldt compared their results with those reported by Lee et al. (1989), who also “found that their first factors accounted for a large amount of the variance” (Gavin & Wamboldt, 1992, p. 182). In a principal components factor analysis, the first factor will always explain the largest proportion of the accountable variance among variables, and this is true even for random data. Did Gavin and Wamboldt mistake the mechanics of the principal components procedure for evidence of overall factorial complexity?
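The point about random data can be checked with a short simulation. The sketch below generates purely random, uncorrelated "item" scores and extracts principal components from their correlation matrix. The 40 items match the length of the FOS. The figure of 126 respondents assumes that each member of the 63 couples was analyzed individually, which is an assumption made here for illustration. The function name and the random seed are likewise illustrative.

```python
import numpy as np


def component_variance_shares(data):
    """Proportion of total variance explained by each principal component,
    ordered from largest to smallest, based on the correlation matrix."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # eigvalsh returns ascending order
    return eigenvalues / eigenvalues.sum()


rng = np.random.default_rng(0)
n_respondents, n_items = 126, 40  # assumed sample size; 40 FOS items

# Random, mutually independent scores: no underlying construct at all.
random_items = rng.normal(size=(n_respondents, n_items))
shares = component_variance_shares(random_items)

print(f"first component share:     {shares[0]:.3f}")
print(f"share if all equal (1/40): {1 / n_items:.3f}")
```

Because the components are ordered by eigenvalue, the first share is the largest by construction, and in a finite sample it exceeds the equal-share baseline of 1/40 even though the data contain no common factor. A large first component therefore needs to be judged against what chance alone would produce before it is read as evidence of a global construct.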

Although Gavin and Wamboldt (1992) did not report interfactor correlations, the authors did perform the analysis. This omission is problematic for at least two reasons. First, the interfactor correlations could have guided the choice of factor rotation method. If high intercorrelations had been found, this would have indicated the need for an oblique procedure (Warburton, 1963). The orthogonal method of rotation selected by Gavin and Wamboldt (varimax) would then have been inappropriate.
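The rotation decision described above can be sketched as a simple check on the interfactor correlations. The sketch assumes that these correlations are computed among factor-based subscale scores (one column per factor). The threshold of .30 is a hypothetical rule of thumb chosen for illustration and does not come from Gavin and Wamboldt or from Warburton (1963). The simulated scores are likewise invented.

```python
import numpy as np


def max_interfactor_correlation(subscale_scores):
    """Largest absolute correlation between any two subscale columns."""
    corr = np.corrcoef(subscale_scores, rowvar=False)
    off_diagonal = corr[~np.eye(corr.shape[0], dtype=bool)]
    return float(np.max(np.abs(off_diagonal)))


def suggested_rotation(subscale_scores, threshold=0.30):
    """'oblique' if any pair of subscales correlates at or above the
    (hypothetical) threshold, otherwise 'orthogonal'."""
    if max_interfactor_correlation(subscale_scores) >= threshold:
        return "oblique"
    return "orthogonal"


rng = np.random.default_rng(1)
n = 1000  # hypothetical number of respondents

# Three subscales that share a common source, so they are highly correlated.
shared = rng.normal(size=(n, 1))
correlated_scores = shared + 0.5 * rng.normal(size=(n, 3))

# Three subscales generated independently of one another.
independent_scores = rng.normal(size=(n, 3))

print(suggested_rotation(correlated_scores))
print(suggested_rotation(independent_scores))
```

The check requires the actual correlation values. Significance levels alone would not support it, because with a large enough sample even trivially small correlations reach statistical significance.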

Secondly, the actual values of the interfactor correlations were crucial to Gavin and Wamboldt’s (1992) conclusion that “any different subscales created from the larger measure are basically measuring one primary construct” (p. 183). Instead of reporting these values, the authors cited the associated levels of statistical significance (p

In retrospect, one of the more theoretically interesting findings reported in Gavin and Wamboldt (1992) had to do with the relationship between the modified FOS and the Dyadic Adjustment Scale (DAS). The DAS was designed to measure current interpersonal adjustment. This analysis was concerned with the ability of the family-of-origin factor to predict current adult attachment. The average FOS-DAS correlation was .17. One might reasonably question whether clinicians should be concerned with a family-of-origin variable which, being weakly predictive of current adjustment, apparently has minimal phenomenological impact and etiological significance.
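As a rough gauge of what a correlation of this size implies, and treating the reported average of .17 as if it were a single coefficient (an assumption made here only for illustration), the proportion of variance in DAS scores shared with FOS scores would be:

\[
r^2 = (0.17)^2 = 0.0289 \approx 2.9\%
\]

On that reading, roughly 97% of the variance in current dyadic adjustment would be unrelated to FOS scores.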

Gavin and Wamboldt (1992) set out to investigate “the overall utility” (p. 179) of the FOS and concluded that the scale is “useful” (p. 187). They did not, however, specify a possible context of use. That is, the authors did not describe the type of predictions or decisions that can be justified on the basis of FOS scores. In fact, the available data neither support the adequacy of the FOS as a measure of family-of-origin satisfaction nor rule out other possible construct interpretations.

In family assessment, pragmatic validity (i.e., the ability to predict clinically relevant criterion behaviors) is sometimes more important than construct validity (Miller & Goddard, 1989). In their endorsement of the FOS, however, Gavin and Wamboldt (1992) also did not describe a possible criterion-related application. In fact, there is no evidence to support any particular use or interpretation of either the original FOS or the new variant examined in the Gavin and Wamboldt study.

Despite the ambiguities that remain with respect to the construct validity and utility of the FOS, Gavin and Wamboldt (1992) concluded that the scale’s prospects are “virtually certain” (p. 186) and went on to make recommendations for future research. Specifically, the authors suggested that additional studies on the FOS and other such measures should focus on “evidence of the degree to which these measures tap family process” (p. 187; italics in original). Equating family process with laboratory observations of transactional behavior, the authors went on to stress the need to compare “observed interactional processes of families and self-report measures of family health” (p. 187). The implicit assumption that family process can be equated with laboratory observations of family interactions is a questionable one. The laboratory is an extrafamilial context that includes strangers (the investigator or research assistants). Further, laboratory observations are influenced by demand characteristics, the ambiguity of the situation, and subjects’ reactions to being observed (Haynes & Horn, 1982; Oliveri & Reiss, 1984; Sigafoos, Reiss, Rich, & Douglas, 1985; Vincent, Friedman, Nugent, & Messerly, 1979). At the very least, these problems raise questions about the value of laboratory observations as validity criteria for a family self-report measure.

Issues of ecological validity aside, how would correlations between the FOS and observational data attest to the scale’s validity or to the theoretical importance of the satisfaction construct? Gavin and Wamboldt (1992) did not say. Nor did the authors specify the criterion behaviors that might permit inferences about the construct underlying the FOS. Without a validity framework (i.e., appropriate criterion measures and testable hypotheses about predictor-criterion relationships), it is impossible to determine whether the findings confirm a referent construct or justify a particular application.

In connection with the idea that self-report measures such as the FOS should be compared to outsider ratings of family interactions, Gavin and Wamboldt (1992) stated: “We do not know the extent to which self-reported differences on such measures actually relate to differences in what families do” (p. 187). There have actually been numerous studies along these lines, including several that have been concerned with the Beavers systems model invoked by Gavin and Wamboldt (see Hampson, Beavers, & Hulgus, 1989, for a review). These studies have focused on the convergence between insider and outsider assessments of interaction patterns, attachment styles, and relationship qualities (i.e., competence, family structure, and range of affect). It is important to note, however, that the family-of-origin satisfaction construct is an individual differences variable (not a relational variable) with no identified place in family systems theory. It is unclear how a variable with no documented value in the assessment of family roles and practices could be validated vis-a-vis an observational coding system. Gavin and Wamboldt’s research agenda presumes insider-outsider correlations where none can be expected. But these concerns are secondary to the more basic problem of theoretical importance. From the standpoint of developing a useful set of descriptions, explanations, and predictions about how families function, there is no good reason to investigate a global satisfaction construct. Any further attempts to validate the FOS would be moot. The scale would have no use even if its adequacy as a measure of family-of-origin satisfaction were demonstrated.


Lee et al. (1989) questioned the construct validity and utility of the FOS. The recent Gavin and Wamboldt (1992) study, which was intended to address Lee et al.’s concerns, contains little in the way of theoretical rationale and evidence that can be accepted as logical and empirical support for the use of the scale. Gavin and Wamboldt’s enthusiasm for the scale is puzzling because their study did not help settle the question at hand: What would be a valid use of the FOS? Given the scale’s lack of specificity with respect to family subsystems and doubtful relevance to research and practice, this question is unlikely to be resolved by any validation effort.

The lack of empirical support for many family science concepts and family therapy practices has been described as a “professional scandal” (Bergin, 1982). This state of affairs is unlikely to be improved by hit-and-run studies in which investigators stretch the inferential limits of their data or pursue a research focus that has no intrinsic relationship to a validity framework that could give their findings meaning or practical value. Indeed, once researchers’ conclusions have become disconnected from theory and evidence, we are left with little more than investigator bias and rhetoric. Costly data collection and analysis activities then appear as mere formalities that reduce the scientific enterprise to posturing and hollow symbolism. Especially misleading are studies that acquire a semblance of scientific legitimacy by a superficial adherence to procedural and statistical conventions, including those that do not address the evidentiary requirements that follow from a given research question. The apparent merits of such studies may derive almost entirely from half-truths and specious arguments that undermine the aims and hopes of science while mimicking a commitment to methodological integrity and rigor.

Unsubstantiated claims may reflect indifference to epistemic criteria or confusion about fundamentally important relationships among design, analysis, and reasoning. These problems must be confronted directly to help ensure theoretically meaningful, empirically defensible, and clinically useful family science research.


The constructive comments of Mark Roosa and three anonymous reviewers on a previous version of this paper are gratefully acknowledged.

