Meta-analytic procedures for estimation of effect sizes in experiments using complex analyses of variance – includes appendix

Hossein Nouri

This paper presents techniques for use in meta-analytic research to estimate effect sizes when studies involve complex ANOVA, ANOVA with repeated measures, and complex ANOVA with repeated measures. Real examples are provided to show the application of the techniques.

Since the pioneering work of Glass, McGaw and Smith (1981) and the works of Hunter, Schmidt and Jackson (1982) and Rosenthal (1984), meta-analysis has been widely used by researchers to review the empirical literature. The technique requires that researchers, in dealing with experimental studies, compute the effect size for a particular study in order to cumulate the results across studies. Problems arise, however, when the independent variables used in the study have more than two levels or when the experimental study under review uses analysis of variance (ANOVA) with repeated measures. Currently available effect size formulae are limited to experimental studies in which the independent variables have only two levels. This is acknowledged by Hunter and Schmidt (1990) and Glass et al. (1981), among others. Hunter and Schmidt (1990) suggest that researchers should set up contrasts and test the generalizability of these contrasts when working with designs in which independent variables contain three or more levels. Rosenthal and Rubin (1986) provide guidelines for combining and comparing research results from studies yielding multiple effect sizes based on multiple dependent variables. The purpose of this article is to provide a general set of meta-analytic procedures for calculating and comparing research results from (1) studies employing independent variables with more than two levels and/or (2) studies using ANOVA with repeated measures where information regarding contrasts is either not available or not useful.

Complex ANOVA

A complex ANOVA is an ANOVA where the independent variables in the study have more than two levels. Translation of complex ANOVA results into effect size suitable for standard meta-analysis has not been examined. Standard meta-analysis cumulates effect size, d, across studies; d can be calculated from ANOVA results in two different ways. One method uses the following formula (Wolf, 1986, p. 35):

d = 2(F)^(1/2) / [df(error)]^(1/2)

where F is the F statistic when the numerator has only one degree of freedom (i.e., the independent variable has only two levels). Since by definition this is not the case in a complex ANOVA, a different method for calculating the average effect size, d, needs to be derived for complex ANOVA.
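This conversion is straightforward to automate. The sketch below (the function name is ours) applies it to the one-degree-of-freedom information effect reported later in Table 3 (F = 8.00 with 18 error degrees of freedom):

```python
import math

def d_from_f(f_stat, df_error):
    """Effect size d from an F statistic with one numerator df."""
    return 2.0 * math.sqrt(f_stat) / math.sqrt(df_error)

# Information effect of Table 3: F = 8.00 with 18 error df
print(round(d_from_f(8.00, 18), 2))  # 1.33
```

The formula applies only when the numerator has one degree of freedom; for a factor with more than two levels, the procedures developed below are needed.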

The second way of calculating effect size is using the following formula (Hunter & Schmidt, 1990):

(1) d_EC = (X̄_E - X̄_C)/s

where: X̄_E = sample mean for the experimental group,

X̄_C = sample mean for the control group, and

s = standard deviation(1) = {[(N_E - 1)s_E^2 + (N_C - 1)s_C^2]/(N_E + N_C - 2)}^(1/2).

This equation can be used to compare an experimental group to the control group. The standard deviation in the denominator of Equation 1 is the pooled within-group standard deviation. An alternative is the control group standard deviation as used by Glass et al. (1981). If there is homogeneity of variances, the pooled within group standard deviation is preferable because it has the least sampling error (Hunter & Schmidt, 1990, p. 271). Throughout this paper we assume homogeneity of variances.
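Equation 1 with the pooled within-group standard deviation can be sketched as follows (function names are ours; the illustrative values are the group statistics derived later in Table 1):

```python
import math

def pooled_sd(s_e, n_e, s_c, n_c):
    """Pooled within-group standard deviation (denominator of Equation 1)."""
    return math.sqrt(((n_e - 1) * s_e**2 + (n_c - 1) * s_c**2) / (n_e + n_c - 2))

def effect_size(mean_e, s_e, n_e, mean_c, s_c, n_c):
    """Equation 1: d = (mean_E - mean_C) / pooled SD."""
    return (mean_e - mean_c) / pooled_sd(s_e, n_e, s_c, n_c)

# Group statistics from the participative vs. assigned comparison of Table 1
print(round(effect_size(6.5, 1.60, 8, 4.75, 1.28, 8), 2))  # 1.21
```

Under homogeneity of variances this pooled denominator has the least sampling error, which is why it is preferred over the control-group standard deviation.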

For complex ANOVAs we propose that Equation 1 can be adapted in the following manner. First, the mean for each of the experimental and control groups should be calculated as follows:(2)

(2) X̄_i.. = (1/n_i.) Σ_{j=1}^{t} n_ij X̄_ij.

Where: i = 1, 2, ..., g,

j = 1, 2, ..., t,

X̄_i.. = mean of group i,

X̄_ij. = mean in the cell associated with the ith group and jth treatment,

n_ij = number of observations in the cell associated with the ith group and jth treatment, and

n_i. = Σ_{j=1}^{t} n_ij.

The special case when n_i1 = n_i2 = ... = n_it = n gives

X̄_i.. = (1/t) Σ_{j=1}^{t} X̄_ij.

Where t = number of treatments in each group.

Next it is necessary to calculate the standard deviation for both the experimental and control groups, using a formula that essentially adds the within-cell variances to the variance of the treatment means within the ith group (the Appendix has the derivation of this formula):(3)

(3) σ_i. = {[Σ_{j=1}^{t} (n_ij - 1)s_ij^2 + Σ_{j=1}^{t} n_ij(X̄_ij. - X̄_i..)^2]/(n_i. - 1)}^(1/2)

Where: i = 1, 2, ..., g,

j = 1, 2, ..., t,

k = 1, 2, ..., n_ij,

σ_i. = standard deviation of group i,

s_ij^2 = variance in the cell associated with the ith group and jth treatment, and X̄_i.., X̄_ij., n_ij and n_i. as defined for Equation 2.

The special case when n_i1 = n_i2 = ... = n_it = n gives:

σ_i. = {[(n - 1) Σ_{j=1}^{t} s_ij^2 + n Σ_{j=1}^{t} (X̄_ij. - X̄_i..)^2]/(tn - 1)}^(1/2)

The means and standard deviations calculated with Equations 2 and 3 can now be used in Equation 1 to estimate the effect size. Note that the mean is the same mean that would be obtained by simple averaging. The standard deviation, however, differs from the one that would be derived by averaging, and therefore the effect size will also differ from the one derived by averaging.
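Equations 2 and 3 can be sketched in Python as follows (our function names; the inputs are the cell means, sample standard deviations, and cell sizes for one group):

```python
import math

def group_mean(cell_means, cell_ns):
    """Equation 2: n-weighted mean of the cell means for one group."""
    return sum(n * m for m, n in zip(cell_means, cell_ns)) / sum(cell_ns)

def group_sd(cell_means, cell_sds, cell_ns):
    """Equation 3: pool within-cell variances with between-cell spread."""
    n_total = sum(cell_ns)
    gm = group_mean(cell_means, cell_ns)
    within = sum((n - 1) * s**2 for s, n in zip(cell_sds, cell_ns))
    between = sum(n * (m - gm)**2 for m, n in zip(cell_means, cell_ns))
    return math.sqrt((within + between) / (n_total - 1))

# Participative group of Table 1: cells (7.5, 1.29, 4) and (5.5, 1.29, 4)
print(round(group_mean([7.5, 5.5], [4, 4]), 2))              # 6.5
print(round(group_sd([7.5, 5.5], [1.29, 1.29], [4, 4]), 2))  # 1.6
```

The group standard deviation (1.60) exceeds the cell standard deviations (1.29) because it also carries the spread of the cell means around the group mean.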

Example

To illustrate how the effect size should be estimated for a study using complex ANOVA, we use the following example. Suppose we are interested in reviewing the studies that have examined the impact of participative goal setting (PGS) versus assigned goals on individuals' job performance. Further, assume that one of the studies under review is an experiment that examined the impact of PGS (three levels of goal setting: participative, assigned, and do your best) and information (two levels: high and low information) on individual performance. Table 1 depicts the situation. Now assume that the study under review provides descriptive statistics as well as an ANOVA table. These are shown in Tables 2 and 3, respectively.

Table 1. Experimental Results for Hypothetical Example (Complex ANOVA)

                              Goal Setting Condition
Information       (1) Participative    (2) Assigned       (3) Do-best
Condition
High              S_111    8           S_211    5         S_311    3
                  S_112    7           S_212    6         S_312    2
                  S_113    9           S_213    4         S_313    3
                  S_114    6           S_214    7         S_314    4
                  X̄_11    7.5         X̄_21    5.5       X̄_31    3.0
                  σ_11     1.29        σ_21     1.29      σ_31     .82
                  n_11     4           n_21     4         n_31     4
Low               S_121    6           S_221    3         S_321    3
                  S_122    5           S_222    4         S_322    1
                  S_123    7           S_223    4         S_323    2
                  S_124    4           S_224    5         S_324    4
                  X̄_12    5.5         X̄_22    4.0       X̄_32    2.5
                  σ_12     1.29        σ_22     .82       σ_32     1.29
                  n_12     4           n_22     4         n_32     4
Total                      52                   38                 22
                  X̄_1..   6.5         X̄_2..   4.75      X̄_3..   2.75
                  σ_1.     1.60        σ_2.     1.28      σ_3.     1.04
                  n_1.     8           n_2.     8         n_3.     8

d_12 = (6.5 - 4.75)/1.45 = 1.21,
where 1.45 = {[(8-1)(1.60)^2 + (8-1)(1.28)^2]/(8+8-2)}^(1/2)

Notes: X̄ = means; σ = standard deviations; d = effect size.

Table 2. Descriptive Statistics - Means and Standard Deviations

                       Goal Setting Condition
                  Participative    Assigned      Do-best
                  Mean     sd     Mean    sd    Mean    sd
High Information   7.5    1.29     5.5   1.29    3.0    .82
Low Information    5.5    1.29     4.0    .82    2.5   1.29

Table 3. ANOVA

Source             SS      df    MS        F
A: Goal setting    56.33    2    28.165   21.12
B: Information     10.67    1    10.67     8.00
A X B               2.33    2     1.165     .87
Error              24.00   18     1.33
Total              93.33   23

Estimation of the effect size (d_12) between the participative and assigned goal-setting groups is not possible from the ANOVA table, because the F statistic for the goal-setting condition has two numerator degrees of freedom. Therefore, we should use the descriptive statistics to compute the effect size (d_12), as follows.

For the data of Table 2, the effect size is:

d_12 = (6.5 - 4.75)/1.45 = 1.21,

where the group means (6.5, 4.75) and standard deviations (1.60, 1.28) are obtained from Equations 2 and 3 and 1.45 = {[(8-1)(1.60)^2 + (8-1)(1.28)^2]/(8+8-2)}^(1/2).

Note that if we instead average the cell variances of Table 2, we get the following estimate of the effect size:

d_12 = (6.5 - 4.75)/1.19 = 1.47,

where 1.19 = {[(1.29^2 + 1.29^2)/2 + (1.29^2 + 0.82^2)/2]/2}^(1/2) is the pooled value of the group standard deviations obtained by averaging cell variances.

As in this example, there can be a significant difference between the actual effect size (i.e., 1.21) and the averaged effect size (i.e., 1.47). Therefore, in estimating the effect size (d), researchers should avoid averaging variances.(4)
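The contrast can be checked numerically; the following sketch (our variable and function names) reproduces both figures from the Table 2 statistics:

```python
import math

n = 4  # observations per cell
# (mean, sd) per cell: high- and low-information cells of each group (Table 2)
part = [(7.5, 1.29), (5.5, 1.29)]   # participative
asgn = [(5.5, 1.29), (4.0, 0.82)]   # assigned

def correct_sd(cells, n):
    """Equation 3: within-cell variances plus between-cell spread."""
    gm = sum(m for m, _ in cells) / len(cells)   # equal cell sizes
    within = sum((n - 1) * s**2 for _, s in cells)
    between = sum(n * (m - gm)**2 for m, _ in cells)
    return math.sqrt((within + between) / (len(cells) * n - 1))

def averaged_sd(cells):
    """Naive alternative: average the cell variances."""
    return math.sqrt(sum(s**2 for _, s in cells) / len(cells))

diff = 6.5 - 4.75
s_eq3 = [correct_sd(c, n) for c in (part, asgn)]
s_avg = [averaged_sd(c) for c in (part, asgn)]
for label, (s1, s2) in [("Equation 3:", s_eq3), ("averaged:", s_avg)]:
    pooled = math.sqrt((s1**2 + s2**2) / 2)      # equal group sizes
    print(label, round(diff / pooled, 2))
```

Averaging the variances discards the between-cell spread, understates the group standard deviations, and so inflates d from 1.21 to 1.47.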

ANOVA with Repeated Measures

An ANOVA with repeated measures is performed when the same variable is measured at multiple points in time. A problem can occur when a study provides only the means, standard deviations, and intercorrelations for each group in each period. To handle this problem, effect sizes can be estimated in a manner similar to that of the Complex ANOVA section.

First we calculate the composite mean score for each of the experimental and control groups by summing the period means:

(4) X̄_i. = Σ_{j=1}^{t} X̄_ij.

Where: X̄_ij. = mean in the cell associated with the ith group and jth period, and t = number of periods.

Then we estimate the standard deviation for each of the experimental and control groups by combining the cell variances within each period and the between-period correlations, using the following formula (see Ghiselli, Campbell, and Zedeck, 1981, pp. 157-159 for the derivation):

(5) σ_i. = [Σ_{j=1}^{t} σ_ij^2 + 2 Σ_{j=1}^{t-1} Σ_{m=1}^{t-j} ρ_ij,ij+m σ_ij σ_ij+m]^(1/2)

Where: σ_ij = standard deviation in the cell associated with the ith group and jth period, and ρ_ij,ij+m = intercorrelation in group i between periods j and j+m.

In the special case when ρ_ij,ij+m = 1 among all periods, the formula reduces to:

σ_i. = Σ_{j=1}^{t} σ_ij

Lastly, we estimate the effect size by substituting the results of Equations 4 and 5 into Equation 1.(5)
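Equations 4 and 5 can be sketched as follows (our function names; `corr[(j, m)]` holds the correlation between periods j and m, 0-indexed):

```python
import math

def composite_mean(period_means):
    """Equation 4: sum of the period means for one group."""
    return sum(period_means)

def composite_sd(period_sds, corr):
    """Equation 5: composite SD from period SDs and intercorrelations."""
    t = len(period_sds)
    var = sum(s**2 for s in period_sds)
    for j in range(t):
        for m in range(j + 1, t):
            var += 2 * corr[(j, m)] * period_sds[j] * period_sds[m]
    return math.sqrt(var)

# Participative group of Table 4
rho = {(0, 1): 0.60, (0, 2): 0.80, (1, 2): 0.80}
print(composite_mean([5.5, 7.5, 8.5]))                    # 21.5
print(round(composite_sd([1.29, 1.29, 1.29], rho), 2))    # 3.51
```

Note that the composite standard deviation grows with the intercorrelations: positively correlated periods add covariance, not just variance.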

Example

Here we will use the same example as in Table 1, but instead of having two levels of information, we have three periods. The data are presented in Table 4.

Table 4. Experimental Results for Hypothetical Example (ANOVA with Repeated Measures)

                         Periods
Subjects           1       2       3      Sum

Participative
S_1j1              6       9      10       25
S_1j2              5       6       8       19
S_1j3              7       8       9       24
S_1j4              4       7       7       18
X̄_1j             5.5     7.5     8.5     21.5
σ_1j             1.29    1.29    1.29     3.51
ρ_11,12 = .60, ρ_11,13 = .80, ρ_12,13 = .80

Assigned
S_2j1              3       5       7       15
S_2j2              4       6       8       18
S_2j3              4       4       6       14
S_2j4              5       7       7       19
X̄_2j             4.0     5.5     7.0     16.5
σ_2j              .82    1.29     .82     2.38
ρ_21,22 = .63, ρ_21,23 = .00, ρ_22,23 = .63

Do best
S_3j1              3       3       4       10
S_3j2              1       2       4        7
S_3j3              2       3       3        8
S_3j4              4       4       5       13
X̄_3j             2.5     3.0     4.0      9.5
σ_3j             1.29     .82     .82     2.65
ρ_31,32 = .95, ρ_31,33 = .63, ρ_32,33 = .50

d = (21.5 - 16.5)/3 = 1.67,
where 3 = {[(4-1)(3.51)^2 + (4-1)(2.38)^2]/(4+4-2)}^(1/2)

Notes: X̄ = means; σ = standard deviations; d = effect size; ρ_ij,ij+m = the intercorrelation in group i between periods j and j+m.

For the data of Table 4, the effect size is computed as follows:

d = (21.5 - 16.5)/3.0 = 1.67,

where the composite means (21.5, 16.5) and standard deviations (3.51, 2.38) are obtained from Equations 4 and 5 and 3.0 = {[(4-1)(3.51)^2 + (4-1)(2.38)^2]/(4+4-2)}^(1/2).

Note that the estimation of d requires knowledge of the intercorrelations among the periods (ρ_ij,ij+m). If the intercorrelations among the periods for a study are not known, a reasonable estimate of ρ based on knowledge of the relevant literature may be sufficient. For instance, in our example, if estimates of ρ are not available and the typical intercorrelation reported in the literature is .8 for the participative group and .6 for the assigned group, we can use those estimates in computing σ_i., which produces an effect size of 1.61, close to the actual effect size of 1.67. If, instead, the meta-analyst simply assumed the intercorrelations among periods to be unity, the estimated effect size could be very different from the actual one. For example, assuming ρ_ij,ij+m = 1 in our example produces an effect size of 1.48, which is quite different from the actual effect size of 1.67.

If nothing is known about the intercorrelations among the periods, we suggest that the meta-analyst estimate the effect size (d) under several intercorrelations, say 1, .7, and .5, for the specific study under review. If the estimates under different intercorrelations are far apart, the meta-analyst should eliminate that study from the meta-analysis. If, however, the estimates are not very different, we suggest using ρ_ij,ij+m = 1, since it provides the more conservative estimate (see the next section for an example).
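This sensitivity check can be sketched as follows (our names; note that because Table 4's actual pairwise correlations are unequal, the uniform-ρ values below differ somewhat from the in-text estimates):

```python
import math

def composite_sd(period_sds, rho):
    """Equation 5 under a single assumed correlation rho between all periods."""
    t = len(period_sds)
    var = sum(s**2 for s in period_sds)
    var += 2 * rho * sum(period_sds[j] * period_sds[m]
                         for j in range(t) for m in range(j + 1, t))
    return math.sqrt(var)

# Table 4 period statistics (participative vs. assigned, n = 4 per group)
part_means, part_sds = [5.5, 7.5, 8.5], [1.29, 1.29, 1.29]
asgn_means, asgn_sds = [4.0, 5.5, 7.0], [0.82, 1.29, 0.82]
diff = sum(part_means) - sum(asgn_means)   # composite mean difference = 5.0

d_by_rho = {}
for rho in (1.0, 0.7, 0.5):
    s1, s2 = composite_sd(part_sds, rho), composite_sd(asgn_sds, rho)
    pooled = math.sqrt((s1**2 + s2**2) / 2)  # equal group sizes
    d_by_rho[rho] = diff / pooled
    print(rho, round(d_by_rho[rho], 2))
```

Since lower assumed correlations shrink the composite standard deviations, d grows as ρ falls; ρ = 1 yields the smallest, most conservative estimate.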

Complex ANOVA with Repeated Measures

A complex ANOVA with repeated measures is used when an independent variable in the study has more than two levels and observations on the same variable are made over time. In this case, the meta-analyst can first use Equations 2 and 3, from the Complex ANOVA section, to find the mean and standard deviation for each of the experimental and control groups in each period, and then combine these scores across periods through Equations 4 and 5, from the ANOVA with Repeated Measures section.

Example

An example from the study by Erez, Earley and Hulin (1985) will demonstrate how the effect size (d) can be estimated in a complex ANOVA with repeated measures. The data in Table 5 are extracted from the first experiment conducted by Erez et al. (1985).

Table 5. Laboratory Experiment - Means and Standard Deviations of Performance

                           Personal Goal Condition
                       Phase 1                  Phase 2
Goal-setting      No-Set        Set        No-Set        Set
Condition        Mean   sd    Mean   sd    Mean   sd    Mean   sd
Assigned         9.50  2.82   9.79  2.84  16.79  4.79  16.75  7.44
Representative  13.90  4.01  12.10  5.93  18.05  5.11  18.84  7.62
Participative   15.45  5.90   8.35  3.82  22.90  6.84  16.70  8.24

Notes:

1. Number of subjects in each goal-setting condition is 40.

2. The baseline data reported in the original table are not presented, since they are pretest data and are not used for the composite scores.

Source: Table 2 of the Erez, Earley, and Hulin (1985) study.

In the experiment, Erez et al. (1985) used 120 students to examine the impact of two levels of personal goals and three levels of goal-setting condition on job performance (a simulated scheduling task). The two personal goal conditions were "one in which the subjects were not asked to set their own personal goals (No-Set), and one in which they were asked to set their personal goals before the goal-setting manipulation (Set)" (p. 52). The three goal-setting conditions were participative, representative, and assigned. In the participative condition, subjects set the goals jointly with the experimenter. In the representative condition, a person selected by the group negotiated with the experimenter in setting a goal. And in the assigned condition, a goal was set by the experimenter. Data were gathered in two phases. In phase 1, goal difficulty was set at 10 schedules; in phase 2, at 25 schedules.

To estimate the effect size between the participative and assigned groups, we first use Equations 2 and 3 to estimate the mean and standard deviation of the participative and assigned groups in both phase 1 and phase 2. Since intercorrelations among periods are not reported and no reasonable estimate based on the available literature is known, we estimate the composite scores for ρ_ij,ij+m = 1, .7, and .5 through Equations 4 and 5. The results are:

                Phase 1        Phase 2               Composite Score
                Mean    sd     Mean    sd     Mean   sd(ρ=1)  sd(ρ=.7)  sd(ρ=.5)
Assigned         9.6   2.80    16.8   6.18    26.4     8.98      8.38      7.96
Participative   11.9   6.09    19.8   8.11    31.7    14.20     13.12     12.34

Next, we estimate the effect size. The estimated effect sizes are .46, .49, and .52 when ρ = 1, .7, and .5, respectively. Since the differences among the estimated effect sizes do not seem to be substantial, we should use the effect size of .46, the more conservative estimate, in our meta-analysis.
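The whole pipeline for this example can be sketched as follows (our function names; cell n = 20, since each goal-setting condition of 40 subjects splits evenly into Set and No-Set; small rounding differences from the in-text .46/.49/.52 are expected, because the text rounds intermediate values):

```python
import math

def phase_stats(cells, n):
    """Equations 2 and 3 for one group in one phase; cells = [(mean, sd), ...]."""
    gm = sum(m for m, _ in cells) / len(cells)   # equal cell sizes
    within = sum((n - 1) * s**2 for _, s in cells)
    between = sum(n * (m - gm)**2 for m, _ in cells)
    return gm, math.sqrt((within + between) / (len(cells) * n - 1))

def composite(phases, rho):
    """Equations 4 and 5 across two phases under a single assumed rho."""
    (m1, s1), (m2, s2) = phases
    return m1 + m2, math.sqrt(s1**2 + s2**2 + 2 * rho * s1 * s2)

n = 20
asgn = [phase_stats([(9.50, 2.82), (9.79, 2.84)], n),    # phase 1
        phase_stats([(16.79, 4.79), (16.75, 7.44)], n)]  # phase 2
part = [phase_stats([(15.45, 5.90), (8.35, 3.82)], n),
        phase_stats([(22.90, 6.84), (16.70, 8.24)], n)]

d_by_rho = {}
for rho in (1.0, 0.7, 0.5):
    (m_a, s_a), (m_p, s_p) = composite(asgn, rho), composite(part, rho)
    pooled = math.sqrt((s_a**2 + s_p**2) / 2)            # equal group sizes
    d_by_rho[rho] = (m_p - m_a) / pooled
    print(rho, round(d_by_rho[rho], 2))
```

As in the text, the three estimates cluster closely, so the ρ = 1 value would be carried into the meta-analysis.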

Conclusions

The existing meta-analysis literature gives little guidance on how to handle either complex ANOVA or ANOVA with repeated measures. Using meta-analysis to analyze the results of a body of literature would be easier if journal editors adopted expanded reporting standards and the authors of primary studies fully reported descriptive statistics for all cells in an ANOVA design, as well as the correlations in repeated-measures designs. When these statistics are not presented, it is difficult, if not impossible, to include such studies in a meta-analysis. If descriptive statistics are not available, it is inappropriate to simply convert t and F statistics from such designs to effect sizes.

When descriptive statistics are available, meta-analysts who simply average the data in a complex ANOVA, or simply sum the data in an ANOVA with repeated measures, can obtain distorted effect size estimates because of biased standard deviations, which in turn can affect the results of the meta-analysis. It is desirable to estimate the effect sizes in meta-analytic reviews as accurately as possible. This paper develops procedures that meta-analysts confronting complex ANOVA, ANOVA with repeated measures, or complex ANOVA with repeated measures can use to estimate effect sizes in their reviews of the literature. These procedures will allow meta-analysts to broaden the range of studies that can be included in their analyses as well as to improve their estimates and, therefore, increase the usefulness of their meta-analyses.

Acknowledgment: An earlier version of this paper was presented at the 1993 Academy of Management Distinctive Poster Session Papers. We are indebted to three anonymous reviewers for their comments and suggestions on earlier drafts of the paper.

Notes

(1.) s_E = standard deviation of the experimental group, s_C = standard deviation of the control group, N_E = sample size of the experimental group, and N_C = sample size of the control group.

(2.) The notations are based on Table A in the Appendix.

(3.) The standard deviation computed here is a pooled standard deviation which is consistent with the approach suggested by Glass, McGaw and Smith (1981). Kirk (1982) is a good source for researchers interested in a more thorough discussion of the advantages and disadvantages of pooling error terms.

(4.) Averaging standard deviations rather than variances would also produce a biased effect size and should be avoided.

(5.) Collapsing errors across all time periods may not always be desirable. The meta-analyst may prefer to compute two or more d's, coding these for time since intervention. See Hedges and Olkin (1985) for a discussion of alternative computations of effect sizes for such gain scores.

References

Erez, M., Earley, P.C. & Hulin, C.L. (1985), The impact of participation on goal acceptance and performance: A two-step model. Academy of Management Journal, 28: 50-66.

Ghiselli, E.E., Campbell, J.P. & Zedeck, S. (1981). Measurement theory for the behavioral sciences. San Francisco: Freeman.

Glass, G.V., McGaw, B. & Smith, M.L. (1981). Meta-analysis in social research. Beverly Hills, CA: Sage.

Hedges, L.V. & Olkin, I.O. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.

Hunter, J.E. & Schmidt, F.L. (1990). Methods of meta-analysis. Beverly Hills, CA: Sage.

Hunter, J.E., Schmidt, F.L. & Jackson, G.B. (1982). Meta-analysis: Cumulating research across studies. Beverly Hills, CA: Sage.

Kirk, R.E. (1982). Experimental design: procedures for the behavioral sciences, 2nd ed. Belmont, CA: Brooks/Cole.

Rosenthal, R. (1984). Meta-analytic procedures for social research. Beverly Hills, CA: Sage.

Rosenthal, R. & Rubin, D.B. (1986). Meta-analytic procedures for combining studies with multiple effect sizes. Psychological Bulletin, 99: 400-406.

Wolf, F.M. (1986). Meta-analysis: Quantitative methods for research synthesis. Beverly Hills, CA: Sage.

APPENDIX

An example of the general data configuration for the unequal-cell-number case in complex ANOVA is presented in Table A.

[Table A: rows are groups i = 1, 2, ..., g and columns are treatments j = 1, 2, ..., t; the cell for group i and treatment j contains the observations X_ij1, X_ij2, ..., X_ijn_ij, with cell mean X̄_ij., cell standard deviation s_ij, and cell size n_ij.]

To calculate the standard deviation for each group (i.e., σ_i., where i = 1, 2, ..., g), we can proceed as follows:

σ_i.^2 = [Σ_{j=1}^{t} Σ_{k=1}^{n_ij} (X_ijk - X̄_i..)^2]/(n_i. - 1)

Where: i = 1, 2, ..., g, j = 1, 2, ..., t, k = 1, 2, ..., n_ij, and X_ijk as defined above.

Writing X_ijk - X̄_i.. = (X_ijk - X̄_ij.) + (X̄_ij. - X̄_i..) and expanding the square,

or

(1) σ_i.^2 = {Σ_j Σ_k [(X_ijk - X̄_ij.)^2 + 2(X_ijk - X̄_ij.)(X̄_ij. - X̄_i..) + (X̄_ij. - X̄_i..)^2]}/(n_i. - 1)

The standard deviation for each cell can be written as:

(2) s_ij = {[Σ_k (X_ijk - X̄_ij.)^2]/(n_ij - 1)}^(1/2)

We have:

(3) Σ_k (X_ijk - X̄_ij.)^2 = (n_ij - 1)s_ij^2

and

(4) Σ_k (X_ijk - X̄_ij.) = 0,

so the cross-product term in Equation 1 vanishes. Substituting Equations 3 and 4 into Equation 1, we will have:

(5) σ_i.^2 = [Σ_j (n_ij - 1)s_ij^2 + Σ_j Σ_k (X̄_ij. - X̄_i..)^2]/(n_i. - 1)

We further have:

(6) Σ_k (X̄_ij. - X̄_i..)^2 = n_ij (X̄_ij. - X̄_i..)^2

and

(7) n_i. = Σ_{j=1}^{t} n_ij.

Substituting Equations 6 and 7 into Equation 5, we will have:

σ_i.^2 = [Σ_j (n_ij - 1)s_ij^2 + Σ_j n_ij (X̄_ij. - X̄_i..)^2]/(n_i. - 1)

Therefore, the standard deviation for each group is equal to:

σ_i. = {[Σ_j (n_ij - 1)s_ij^2 + Σ_j n_ij (X̄_ij. - X̄_i..)^2]/(n_i. - 1)}^(1/2)

which is Equation 3 in the text. In the special case where n_i1 = n_i2 = ... = n_it = n, we have n_i. = tn, and the expression can be rewritten as:

σ_i. = {[(n - 1) Σ_j s_ij^2 + n Σ_j (X̄_ij. - X̄_i..)^2]/(tn - 1)}^(1/2)
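As a numerical sanity check on this derivation (a sketch; using the raw participative-group scores of Table 1), the pooled formula reproduces the standard deviation computed directly from the raw data:

```python
import statistics

# Raw scores for the participative group (Table 1): high- and low-information cells
cells = [[8, 7, 9, 6], [6, 5, 7, 4]]
all_scores = [x for cell in cells for x in cell]

# Direct group standard deviation from the raw data
direct = statistics.stdev(all_scores)

# Equation 3: pool within-cell variances with between-cell spread
gm = statistics.mean(all_scores)
within = sum((len(c) - 1) * statistics.variance(c) for c in cells)
between = sum(len(c) * (statistics.mean(c) - gm) ** 2 for c in cells)
pooled = ((within + between) / (len(all_scores) - 1)) ** 0.5

print(round(direct, 2), round(pooled, 2))  # the two values coincide (about 1.60)
```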

Direct all correspondence to: Hossein Nouri, Trenton State College, School of Business, Hillwood Lakes CN4700, Trenton, NJ 08650.

COPYRIGHT 1995 JAI Press, Inc.
