Sylvain Chartier, Denis Cousineau

## Computing Mixed-Design (Split-Plot) ANOVA

The mixed, within-between subjects ANOVA (also called a split-plot ANOVA) is a statistical test of means commonly used in the behavioral sciences. One approach to computing this analysis is to use a corrected between-subjects ANOVA. A second approach uses the general linear model by partitioning the sum of squares and cross-product matrices. Both approaches are detailed in this article. Finally, a package called MixedDesignANOVA is introduced that runs mixed-design ANOVAs using the second approach and displays summary statistics as well as a mean plot.

### Introduction

The mixed, within-between subjects design (also called split-plot or randomized blocks factorial) ANOVA is a technique that compares the means obtained by manipulating two factors, one being a repeated-measure factor. Let $p$ be the number of independent groups, each representing one level of the between-subjects factor, let $q$ be the number of measures corresponding to the within-subjects factor, and let $n_i$ be the number of subjects in the $i$th group, so that the total number of subjects is $N = \sum_i n_i$.

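The data is contained in a matrix with one row per subject: the first column identifies the subject's group and the remaining columns contain the repeated measures.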

First load the package. It is available from

An example taken from Howell [1] (p. 481) concerns data collected in a study by King [2]. King investigated the effect of midazolam on the motor activity of rats. The rats were measured at six different times ($q = 6$ repeated measures) and were divided into $p = 3$ groups of equal size. Hence, the total number of rats measured was three times the group size. The data is listed in Table 1.

Table 1. Data from Howell (2003) of a 3×6 design (three groups, six repeated measures).

The plot of the means across conditions (Figure 1) shows evidence of a time effect, the results on the second time being generally smaller than those on the first time. In addition, there is an interaction effect caused by the “same” group that does not follow this pattern.

Figure 1. Illustration of the example given in Table 1.

Figure 2 shows how the variation is partitioned in the subjects within groups by conditions design. The between-subjects variation is decomposed into two parts: a source of variation due to the group effect and a source of variation due to measurement error (subjects within groups). The within-subjects variation is decomposed into three parts: a source of variation for the repeated-measure effect, a source of variation for the interaction between the repeated measures and the group effect, and a source of variation for error. Consequently, there are three $F$ ratios to compute: the group effect (group variation over between-subjects error), the repeated-measure effect (repeated-measure variation over within-subjects error), and the interaction effect (interaction variation over within-subjects error).

Figure 2. (top) Venn diagram for the subjects within groups by conditions design, showing the group effect, the repeated-measure effect, and the interaction effect between the two factors. (bottom) Tree diagram showing the partitioning of the different sources of variation.

The total variation is the sum of the five areas in Figure 2. Each area corresponds to a proportion of variation ($R^2$). The between-subjects proportions of variation are:

where represents the variation explained by the between-subjects source. By dividing and by , we get the mean effect, that is, the group effect:

Finally, the within-subjects proportions of variation are:

The various $F$ statistics are ratios of the following proportions of variation weighted by their corresponding degrees of freedom:

### Computing a Split-Plot ANOVA from the Computations Obtained by a Between-Subjects ANOVA

#### Test of the Between-Subjects Effect

The group (between-subjects) effect shown in Figure 2 can be obtained by averaging the repeated measures of each subject (so that information about the repeated measures is discarded) and submitting these averages to a one-way ANOVA.

Using Mathematica, we first need to set a few constants and, for convenience, the labels for the within- and between-subjects factors. The conditions are taken from the first column of the data matrix.

This computes the number of groups and repeated measures.

This computes the total number of participants and the number of participants per group.

Finally, we define the labels.
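As a rough illustration of these constants, the following Python sketch (not the article's Mathematica code) derives them from a small data matrix with the group label in the first column. The data values and the names `p`, `q`, `N`, `n_per_group`, and `within_labels` are assumptions made for this example only.

```python
from collections import Counter

# Hypothetical toy data: one row per subject, group label first,
# followed by the repeated measures of that subject.
data = [
    ["control", 10, 8, 6],
    ["control", 12, 9, 9],
    ["same", 7, 7, 8],
    ["same", 9, 8, 10],
    ["different", 11, 6, 4],
    ["different", 13, 8, 6],
]

# Group labels, in order of first appearance in the first column.
groups = []
for row in data:
    if row[0] not in groups:
        groups.append(row[0])

p = len(groups)           # number of groups
q = len(data[0]) - 1      # number of repeated measures
N = len(data)             # total number of subjects

# Number of subjects per group (may differ between groups).
counts = Counter(row[0] for row in data)
n_per_group = [counts[g] for g in groups]

# Default labels for the levels of the within-subjects factor.
within_labels = [f"time {j + 1}" for j in range(q)]
```

Counting subjects per group separately keeps the sketch usable when the groups are not of equal size.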

To get the between-subjects effect, we aggregate the data over the repeated measures.

Using this, a one-way ANOVA is computed.

In the following, we will need the between-subjects sum of squares, so we extract it from the table. To take into account the repeated measures, the sum of squares must be multiplied by that number of measures.
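The aggregation, the one-way ANOVA, and the rescaling by the number of measures can be sketched in Python as follows. This is a minimal illustration on hypothetical data, not the article's Mathematica code; the helper `one_way_anova` and all variable names are assumptions.

```python
# Hypothetical data: group label, then q = 2 repeated measures per subject.
data = [["a", 1, 3], ["a", 3, 5], ["b", 5, 7], ["b", 7, 13]]
q = len(data[0]) - 1


def mean(values):
    return sum(values) / len(values)


# Aggregate over the repeated measures: one mean per subject, kept by group.
by_group = {}
for label, *measures in data:
    by_group.setdefault(label, []).append(mean(measures))


def one_way_anova(samples):
    """One-way ANOVA on a list of samples; returns SS, df and the F ratio."""
    pooled = [x for sample in samples for x in sample]
    grand = mean(pooled)
    ss_effect = sum(len(s) * (mean(s) - grand) ** 2 for s in samples)
    ss_error = sum((x - mean(s)) ** 2 for s in samples for x in s)
    df_effect = len(samples) - 1
    df_error = len(pooled) - len(samples)
    f_ratio = (ss_effect / df_effect) / (ss_error / df_error)
    return ss_effect, ss_error, df_effect, df_error, f_ratio


ss_group, ss_subjects, df_group, df_subjects, f_group = one_way_anova(
    list(by_group.values())
)

# Rescale the sums of squares to the level of the individual observations
# by multiplying them by the number of repeated measures.
ss_group_scaled = q * ss_group
ss_subjects_scaled = q * ss_subjects
```

The $F$ ratio itself is unaffected by the rescaling, since numerator and denominator are multiplied by the same constant.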

#### Test of the Within-Subjects Effect

$F$ ratios for the repeated-measure effect and the interaction effect are computed by first recoding the data matrix such that the repeated measures look like a second between-subjects factor. Thus, the bulk of the analysis simplifies into a standard factorial ANOVA. The following transforms the data.
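A minimal Python sketch of this recoding, on hypothetical data, turns each subject row into one row per observation. The tuple layout (group, measure index, score) is an assumption chosen for the example.

```python
# Hypothetical data: group label, then the repeated measures per subject.
data = [["a", 1, 3], ["a", 3, 5], ["b", 5, 7], ["b", 7, 13]]

# One row per observation: (group, index of the repeated measure, score).
# The subject identity is dropped, so the measure index behaves like the
# level of a second between-subjects factor.
long_data = [
    (label, j + 1, score)
    for label, *measures in data
    for j, score in enumerate(measures)
]
```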

Applying a standard between-factors ANOVA, the following summary table is obtained.

First, the results regarding the group effect are discarded since it has been analyzed in the previous section. Next, the results regarding the repeated measure “time” and the interaction (group × time) must be modified to obtain the corrected $F$ ratios. Specifically, the error term is incorrect since it does not take into account the estimate of the between-subjects error obtained in the previous subsection. The corrected error sum of squares is the error sum of squares of the factorial ANOVA minus the between-subjects error sum of squares (the error sum of squares of the one-way ANOVA multiplied by the number of repeated measures).

The corrected error degrees of freedom ($df_{error}$) are obtained likewise, as the error degrees of freedom of the factorial ANOVA minus those of the one-way ANOVA.

Using the results of the previous ANOVA, the sum of squares terms can be extracted and the corrected error term computed.

From this, the error mean square (MS) can be computed.
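The correction of the error term can be sketched in Python as follows, on hypothetical data. The sketch assumes that the between-subjects error is the one-way error sum of squares (computed on the subject means) multiplied by the number of measures; the variable names are illustrative.

```python
# Hypothetical data: group label, then q = 2 repeated measures per subject.
data = [["a", 1, 3], ["a", 3, 5], ["b", 5, 7], ["b", 7, 13]]
q = len(data[0]) - 1
labels = sorted({row[0] for row in data})
p, N = len(labels), len(data)


def mean(values):
    return sum(values) / len(values)


# Error SS of the factorial ANOVA: squared deviations from the cell means,
# each cell being one group at one repeated measure.
ss_error_factorial = 0.0
for g in labels:
    rows = [r[1:] for r in data if r[0] == g]
    for j in range(q):
        cell = [r[j] for r in rows]
        ss_error_factorial += sum((x - mean(cell)) ** 2 for x in cell)
df_error_factorial = N * q - p * q

# Between-subjects error: one-way error SS on the subject means, times q.
ss_error_between = 0.0
for g in labels:
    subject_means = [mean(r[1:]) for r in data if r[0] == g]
    ss_error_between += sum((m - mean(subject_means)) ** 2 for m in subject_means)
ss_error_between *= q
df_error_between = N - p

# Corrected within-subjects error term and its mean square.
ss_error = ss_error_factorial - ss_error_between
df_error = df_error_factorial - df_error_between
ms_error = ss_error / df_error
```

The corrected degrees of freedom equal `(N - p) * (q - 1)`, which gives a quick sanity check on the subtraction.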

The within-subjects $F$ ratios are summarized in the following table.

### Computing a Split-Plot ANOVA Using the General Linear Model Approaches

A different technique for computing a split-plot ANOVA is to use the general linear model approaches [3, 4]. By using the general linear model, all ANOVAs (factorial, repeated measures, etc.) are treated as a special case of regression analyses; the dependent variables remain the same while the predictors are generated using binary codes. Then, the various variability terms (error, within-subjects, between-subjects) can be estimated as ratios of explained to unexplained variation.

#### Test of the Between-Subjects Effect

For this effect, a coding matrix that identifies each subject within each group could be used. However, as pointed out in [3] and detailed in [4], it is simpler to aggregate the repeated measures as in the previous section to consider only the group effect. Hence, the effect coding matrix for the groups has $N$ lines, one per subject. Because the last group is entirely determined by the other groups, it is not coded (otherwise the resulting matrix would be singular), resulting in $p - 1$ columns, where $p$ is the number of groups.
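A small Python sketch of such an effect coding follows. It assumes the usual convention in which the last, uncoded group receives $-1$ in every column; the function name `effect_codes` and the group assignment are hypothetical.

```python
def effect_codes(k):
    """Effect coding for k levels: k rows and k - 1 columns.

    Level i (i < k - 1) has a 1 in column i and 0 elsewhere;
    the last level has -1 in every column.
    """
    rows = [[1 if j == i else 0 for j in range(k - 1)] for i in range(k - 1)]
    rows.append([-1] * (k - 1))
    return rows


# Hypothetical group membership (0-based group index) of N = 6 subjects.
group_of_subject = [0, 0, 1, 1, 2, 2]

codes = effect_codes(3)
# One line of codes per subject: N lines and p - 1 columns.
coding_matrix = [codes[g] for g in group_of_subject]
```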

Using the group coding vector for each subject and joining to it the dependent variable, we get a matrix that contains the predictor variables in the first $p - 1$ columns (one fewer than the number of groups) and the dependent variable in the last column.

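With this matrix, denoted $M$, the sum of squares and cross-product (SSCP) matrix can be computed by

$$\mathrm{SSCP} = M^{\mathsf T} M - \frac{1}{N}\,(M^{\mathsf T}\mathbf{1})(\mathbf{1}^{\mathsf T} M) \qquad (1)$$

where $\mathbf{1}$ represents the $N$-dimensional vector composed of 1s. By partitioning the SSCP matrix, the coefficient of determination is obtained. The SSCP matrix must be partitioned into four submatrices named $S_{pp}$, $S_{pd}$, $S_{dp}$, and $S_{dd}$ (in which the subscript $p$ stands for predictors and $d$ stands for dependent variable). These matrices represent, respectively, the sum of squares of the predictors alone, the sum of cross-products between the predictors and the dependent variable, the sum of cross-products between the dependent variable and the predictors, and lastly the sum of squares of the dependent variable alone.

$$\mathrm{SSCP} = \begin{pmatrix} S_{pp} & S_{pd} \\ S_{dp} & S_{dd} \end{pmatrix}$$

where, for $p$ groups, the size of the $S_{pp}$ matrix is $(p - 1) \times (p - 1)$ and the size of the $S_{pd}$ matrix is $(p - 1) \times 1$. Finally, we verify that $S_{dp} = S_{pd}^{\mathsf T}$.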

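The coefficient of determination for the between-subjects effect ($R^2_B$) can be obtained by the matrix multiplication

$$R^2_B = S_{dp}\, S_{pp}^{-1}\, S_{pd}\, S_{dd}^{-1}$$

where $S_{pp}$, $S_{pd}$, $S_{dp}$, and $S_{dd}$ are the four submatrices of the partitioned SSCP matrix (predictors alone, predictors by dependent variable, dependent variable by predictors, and dependent variable alone).

Finally, the $F$ value is the ratio between the explained variation and the unexplained variation, weighted by the degrees of freedom:

$$F_B = \frac{R^2_B / (p - 1)}{(1 - R^2_B) / (N - p)}$$

where $p$ is the number of groups and $N$ is the total number of subjects.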
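The whole chain (SSCP matrix, partition, coefficient of determination, $F$ value) can be sketched in plain Python as follows. This is an illustration under assumptions, not the article's Mathematica code: the data are hypothetical subject means, the helper names are invented for the example, and a linear solve replaces the explicit inverse of the predictor block.

```python
def sscp(M):
    """Deviation SSCP matrix: M'M - (M'1)(1'M)/N."""
    n, k = len(M), len(M[0])
    col_sums = [sum(row[j] for row in M) for j in range(k)]
    return [
        [sum(row[i] * row[j] for row in M) - col_sums[i] * col_sums[j] / n
         for j in range(k)]
        for i in range(k)
    ]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(c + 1, n):
            factor = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= factor * aug[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def r_squared(M):
    """R^2 = S_dp S_pp^-1 S_pd / S_dd, dependent variable in the last column."""
    S = sscp(M)
    k = len(S) - 1
    S_pp = [row[:k] for row in S[:k]]   # predictors alone
    S_pd = [row[k] for row in S[:k]]    # predictors by dependent variable
    S_dd = S[k][k]                      # dependent variable alone
    weights = solve(S_pp, S_pd)         # S_pp^-1 S_pd without an explicit inverse
    return sum(w * s for w, s in zip(weights, S_pd)) / S_dd


# Hypothetical subject means for p = 2 groups of 2 subjects each.
codes = {"a": [1], "b": [-1]}           # effect coding with p - 1 = 1 column
subject_means = [("a", 2.0), ("a", 4.0), ("b", 6.0), ("b", 10.0)]
M = [codes[g] + [m] for g, m in subject_means]
p, N = 2, len(M)

r2_between = r_squared(M)
f_between = (r2_between / (p - 1)) / ((1 - r2_between) / (N - p))
```

On these numbers the $F$ value agrees with the one obtained from a one-way ANOVA on the same subject means, which is the expected equivalence between the two approaches.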

All those operations are performed with the following commands. First we have the coding matrix.

The means for each group are then computed.

Then the data matrix, with the predictors in the first columns and the dependent variable in the last column, can be constructed.

From this matrix, the SSCP matrix is easily obtained according to equation 1.

This matrix can then be partitioned into the various sum of squares submatrices needed to compute the coefficient of determination.

Finally, the $F$ ratio can be obtained.

And again, the total sum of squares of the between-subjects factor is memorized for later use (the between-subjects variation in Figure 2).

#### Test of the Within-Subjects Effects

As in the previous section, the within-subjects effects are computed by ignoring the repeated-measures structure of the data. The observations are therefore treated as if they came from independent subjects, and the computation is carried out like a standard between-subjects ANOVA. However, the error term differs from that of a standard ANOVA since the between-subjects variability has already been evaluated and thus can be removed from the total error (Figure 2).

The first step is to compute the proportion of variance accounted for by the repeated-measure effect. To this end, we must create an effect coding matrix for the repeated measures. This effect coding is created as in the previous section except that its number of columns is one fewer than the number of repeated measures.

The raw data is reorganized so that, for each observation, the number of the repeated measure is available. From it, the data matrix (predictors followed by the dependent variable) is computed in exactly the same way as before.

Next, the SSCP matrix is computed exactly as before.

Partitioning this matrix, the coefficient of determination for the repeated-measure effect can be obtained exactly as before.

The same steps are repeated one last time for the interaction effect. The effect coding matrix for the interaction is defined for all the combinations of the between-subjects factor and the repeated-measure factor. It is obtained with the outer product of the individual effect codings.
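As an illustration, the interaction codes for one cell of the design can be built in Python by multiplying every group code with every repeated-measure code. The sketch assumes effect coding with $-1$ for the last level; the function names and the design size are hypothetical.

```python
def effect_codes(k):
    """Effect coding for k levels: k rows, k - 1 columns, last level all -1."""
    rows = [[1 if j == i else 0 for j in range(k - 1)] for i in range(k - 1)]
    rows.append([-1] * (k - 1))
    return rows


p, q = 3, 3                      # hypothetical design: 3 groups, 3 measures
group_codes = effect_codes(p)
time_codes = effect_codes(q)


def interaction_row(group, time):
    """Flattened outer product of one group code row and one time code row."""
    return [g * t for g in group_codes[group] for t in time_codes[time]]


# One row of (p - 1) * (q - 1) interaction predictors per cell of the design.
interaction = {
    (g, t): interaction_row(g, t) for g in range(p) for t in range(q)
}
```

Each observation then receives the row that corresponds to its group and to the number of its repeated measure.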

The data matrix is reorganized one last time so that, for each observation, both the group and the number of the repeated measure are available. The data matrix and the SSCP matrix are then computed as usual.

Partitioning this last matrix, the coefficient of determination for the interaction effect can be obtained.

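Finally, the error proportion of variance can be estimated as the within-subjects variation left unexplained:

$$R^2_{e} = 1 - R^2_{BS} - R^2_{W} - R^2_{BW}$$

in which $R^2_{W}$ and $R^2_{BW}$ are the proportions of variance for the repeated-measure effect and the interaction effect, and $R^2_{BS}$ is given by

$$R^2_{BS} = \frac{SS_{BS}}{SS_{total}}$$

where $SS_{BS}$ is the between-subjects sum of squares memorized in the previous subsection (multiplied by the number of repeated measures when it was computed from the subject means) and $SS_{total}$ is the total sum of squares of the dependent variable.

The $F$ ratios for the repeated-measure effect and the interaction effect can be computed with the formulas:

$$F_{W} = \frac{R^2_{W} / (q - 1)}{R^2_{e} / \big((N - p)(q - 1)\big)}, \qquad F_{BW} = \frac{R^2_{BW} / \big((p - 1)(q - 1)\big)}{R^2_{e} / \big((N - p)(q - 1)\big)}$$

where $p$ is the number of groups, $q$ the number of repeated measures, and $N$ the total number of subjects, which we compute.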
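A short Python sketch of this last step is given below. It assumes that the between-subjects sum of squares is already expressed on the scale of the individual observations; the function name `within_f_ratios`, its arguments, and the numerical inputs are hypothetical.

```python
def within_f_ratios(r2_w, r2_bw, ss_between_subjects, ss_total, p, q, N):
    """Error proportion of variance and the two within-subjects F ratios.

    r2_w, r2_bw: proportions of variance of the repeated-measure effect
    and of the interaction; ss_between_subjects: between-subjects sum of
    squares on the scale of the individual observations.
    """
    r2_bs = ss_between_subjects / ss_total
    r2_error = 1 - r2_bs - r2_w - r2_bw
    df_w = q - 1
    df_bw = (p - 1) * (q - 1)
    df_error = (N - p) * (q - 1)
    f_w = (r2_w / df_w) / (r2_error / df_error)
    f_bw = (r2_bw / df_bw) / (r2_error / df_error)
    return r2_error, f_w, f_bw


# Hypothetical inputs for a design with 2 groups, 2 measures and 4 subjects.
r2_error, f_w, f_bw = within_f_ratios(
    r2_w=18 / 94, r2_bw=2 / 94,
    ss_between_subjects=70.0, ss_total=94.0,
    p=2, q=2, N=4,
)
```

Because every term is a proportion of the same total sum of squares, the total cancels out of both ratios.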

All the information required has been gathered; the ANOVA table can be produced just like in the section Computing a Split-Plot ANOVA from the Computations Obtained by a Between-Subjects ANOVA.

### The MixedDesignANOVA Package

The MixedDesignANOVA package performs the different analyses using the procedures outlined in the previous section. It works for equal as well as for unequal numbers of subjects per group. To use it, first load the package (adapt the path if necessary) and load some data. Optionally, you can define labels for the factors and the levels of the factors.

The command MixedDesignANOVA, with a data matrix respecting the input format defined in the first section, displays the ANOVA table only, with default names (B for the between-subjects factor and A for the within-subjects factor).

The command has seven options, as listed below. The option Epsilons is used to compute the Greenhouse-Geisser, Huynh-Feldt, and lower-bound epsilons [5, 6]. The options MeanTable and MeanPlot show the means across conditions and measures in the form of a table or a plot, respectively. The option VarCov returns the variance-covariance matrices for each group as well as the global variance-covariance matrix.

Finally, the option SummaryStatistics can be used to display summary statistics for each cell of the design. Default summary statistics are None; Automatic returns the mean, variance, standard deviation, length (i.e. the number of observations in the cell), unbiased skewness, and unbiased kurtosis.

The next command runs an analysis with all options turned on; the results are displayed one at a time afterward.

The two unbiased functions (skewness and kurtosis) are also available on their own; they follow the formulas given in [7].

The package MixedDesignANOVA works in Mathematica 4.0 and higher. It is available with this article or can be found at www.mapageweb.umontreal.ca/cousined/home/Others/MixedDesignAnova/Index.html.

### References

[1] D. C. Howell, Fundamental Statistics for the Behavioral Sciences, 5th ed., Belmont, CA: Thomson-Brooks/Cole, 2004.

[2] D. A. King, “Associative Control of Tolerance to the Sedative Effects of a Short-Acting Benzodiazepine,” unpublished doctoral dissertation, University of Vermont, 1986.

[3] J. Cohen and P. Cohen, Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences, 2nd ed., Hillsdale, NJ: Lawrence Erlbaum Associates, 1983.

[4] A. L. Edwards, Multiple Regression and the Analysis of Variance and Covariance, 2nd ed., New York: W. H. Freeman, 1985.

[5] S. W. Greenhouse and S. Geisser, “On Methods in the Analysis of Profile Data,” Psychometrika, 24(2), 1959 pp. 95-112. www.springerlink.com/content/220512x5l554733q.

[6] H. Huynh and L. S. Feldt, “Estimation of the Box Correction for Degrees of Freedom from Sample Data in the Randomized Block and Split Plot Designs,” Journal of Educational Statistics, 1(1), 1976 pp. 69-82. www.jstor.org/pss/1164736.

[7] K. Pearson, “Das Fehlergesetz und seine Verallgemeinerungen durch Fechner und Pearson. A Rejoinder,” Biometrika, 4(1/2), 1905 pp. 169-212. www.jstor.org/pss/2331536.

S. Chartier and D. Cousineau, “Computing Mixed-Design (Split-Plot) ANOVA,” The Mathematica Journal, 2011. dx.doi.org/doi:10.3888/tmj.13-17.

Sylvain Chartier is a professor of cognitive psychology and quantitative methods at the University of Ottawa. His areas of research include artificial neural networks and nonlinear dynamic systems applied to psychology.

Denis Cousineau is a professor of cognitive psychology at the University of Ottawa. He runs research in artificial intelligence as well as on human categorization processes.

Sylvain Chartier
School of Psychology
University of Ottawa
136 Jean Jacques Lussier
Ottawa, Ontario