We make the same assumptions for ANOVA that we did for the 2-sample t-test. That is, we assume that each population is normally distributed, that we have an independent sample from each population, and that the variances of the populations are all the same. As with the t-test, these are assumptions we should attempt to check before proceeding with ANOVA.

Open the heights dataset that was used in a previous assignment. Recall that 1 means female and 2 means male.

First, run a 2-sample t-test (in the Stat...Basic Statistics menu) to compare males to females. Be sure to assume equal variances and to use a two-sided alternative. Keep the output to refer to later.

Now, run a One-way ANOVA procedure. "One-way" refers to the fact that there is only a single grouping variable, namely sex. Select the Stat...ANOVA...One-way menu and you should obtain the dialog box shown in Figure 4.

Figure 4

In the parlance of ANOVA, a factor is a categorical (grouping) variable, whereas the response is the quantitative measure that forms the basis for our comparisons. In this case, therefore, Sex is the factor and Height is the response. Enter these on the appropriate lines of the dialog box shown in Figure 4 and then click OK.

The output from the ANOVA consists of two parts: a table of summary statistics accompanied by a graph of individual 95% confidence intervals, and the so-called ANOVA table. The ANOVA table has 6 columns (Source, DF, SS, MS, F, and P) and 3 rows (Sex, Error, and Total). Here is what those things stand for:

**Source** means "Source of variation";

**DF** stands for "Degrees of Freedom";

**SS** stands for "Sum of Squares";

**MS** stands for "Mean Square";

**F** is the column for the F statistic(s);

**P** is the column for the p-value(s).

**Sex** is the row containing information for the sex effect.
It is also sometimes referred to as the between-group variation.

**Error** is the row containing information on the random variation
not accounted for by the main (sex) effect. It is also sometimes
referred to as the within-group variation.

**Total** is the sum of the between- and within-group entries
for DF and SS.
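To see where these entries come from, here is a minimal sketch in Python that builds the DF, SS, MS, and F entries by hand. The heights below are made-up values for illustration, not the course dataset, and the names `heights_by_sex` and `one_way_anova_table` are hypothetical. The P column is left out because it requires the F distribution, which Python's standard library does not provide.

```python
from statistics import mean

# Hypothetical heights in inches, keyed by the sex codes used in this lab
# (1 = female, 2 = male). These are NOT the values in the course dataset.
heights_by_sex = {
    1: [62.0, 64.0, 65.0, 63.0, 66.0],
    2: [68.0, 70.0, 69.0, 72.0, 71.0],
}


def one_way_anova_table(groups):
    """Return the DF, SS, MS, and F entries of a one-way ANOVA table."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)

    # Between-group (Sex) sum of squares: group means around the grand mean,
    # weighted by group size.
    ss_sex = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Within-group (Error) sum of squares: observations around their own
    # group mean.
    ss_error = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

    df_sex = len(groups) - 1                  # number of groups minus 1
    df_error = len(all_values) - len(groups)  # observations minus groups
    ms_sex = ss_sex / df_sex                  # mean square = SS / DF
    ms_error = ss_error / df_error

    return {
        "Sex": {"DF": df_sex, "SS": ss_sex, "MS": ms_sex, "F": ms_sex / ms_error},
        "Error": {"DF": df_error, "SS": ss_error, "MS": ms_error},
        "Total": {"DF": df_sex + df_error, "SS": ss_sex + ss_error},
    }


table = one_way_anova_table(heights_by_sex)
for source, entries in table.items():
    print(source, entries)
```

Each mean square is the sum of squares divided by its degrees of freedom, and F is the ratio of the Sex mean square to the Error mean square.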

Many of the quantities calculated for the t-test are present in the ANOVA table. Notice first of all that the p-value for the t-test is identical to the p-value for the F test in the ANOVA table. Second, find the pooled standard deviation and the t-statistic in the t-test output and try to locate them in the ANOVA table. You'll have to square each one first: the square of the t-statistic is the F statistic, and the square of the pooled standard deviation is the mean square for Error. The lesson here is that the 2-sample t-test with equal variances is equivalent to the ANOVA F-test applied to the same problem.
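The equivalence can also be checked numerically outside Minitab. The sketch below uses made-up heights (not the course dataset) and hypothetical variable names. It computes the pooled 2-sample t-statistic and the one-way ANOVA F statistic from the same two groups, then compares the squared t-test quantities with their ANOVA counterparts.

```python
from math import isclose, sqrt
from statistics import mean, variance

# Hypothetical heights in inches; NOT the values in the course dataset.
females = [62.0, 64.0, 65.0, 63.0, 66.0]
males = [68.0, 70.0, 69.0, 72.0, 71.0]
n1, n2 = len(females), len(males)

# Pooled 2-sample t-test (equal variances assumed).
pooled_var = ((n1 - 1) * variance(females) + (n2 - 1) * variance(males)) / (n1 + n2 - 2)
pooled_sd = sqrt(pooled_var)
t_stat = (mean(females) - mean(males)) / (pooled_sd * sqrt(1 / n1 + 1 / n2))

# One-way ANOVA on the same two groups.
grand_mean = mean(females + males)
ss_between = (n1 * (mean(females) - grand_mean) ** 2
              + n2 * (mean(males) - grand_mean) ** 2)
ss_within = (sum((x - mean(females)) ** 2 for x in females)
             + sum((x - mean(males)) ** 2 for x in males))
ms_between = ss_between / (2 - 1)        # DF = number of groups - 1
ms_error = ss_within / (n1 + n2 - 2)     # DF = observations - groups
f_stat = ms_between / ms_error

print("t squared:", t_stat ** 2, "F:", f_stat)
print("pooled SD squared:", pooled_sd ** 2, "MS Error:", ms_error)
print("t^2 matches F:", isclose(t_stat ** 2, f_stat))
print("pooled variance matches MS Error:", isclose(pooled_sd ** 2, ms_error))
```

The comparison uses `isclose` rather than `==` because the two routes reach the same value through different floating-point arithmetic.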

dhunter@stat.psu.edu