Learning Outcomes
By the end of this section, students will be able to:
 Explore the data with correlations and scatterplots.
 Use an ANOVA to test for a difference in means across a categorical variable.
 Conduct univariable and multivariable linear regression.
 Check the regression diagnostics of a linear model.
You can download a copy of the slides here: B1.2 Differences Between Means (ANOVA I)
Video B1.2 – One-Way ANOVA (12 minutes)
B1.2 PRACTICAL: Stata
The ANOVA procedure allows us to establish whether there is evidence that the mean SBP values across the BMI groups are not all equal. You use an ANOVA to test for differences in means across levels of a categorical variable (not a continuous one).
The anova command in Stata is set up as follows:
anova varname [termlist] [, options]
where ‘termlist’ is a factor-variable list (i.e. a categorical variable). We will not be using any of the options here.
Now compare mean SBP in the four groups of BMI using ANOVA. The Stata code you will need is:
anova sbp bmi_grp4
Question B1.2a: Is there a significant difference in mean SBP across the BMI groups?
Answer
The result indicates there are differences in mean SBP across the four categories of BMI, as the p-value of 0.0001 is highly significant.
Note that an ANOVA is a global test and does not tell us which BMI groups are significantly different from each other. You will learn how to produce comparisons of all possible pairings of the BMI groups, as well as many other uses of the ANOVA test, in Module B2.
B1.2 PRACTICAL: SPSS
The ANOVA procedure allows us to establish whether there is evidence that the mean SBP values across the BMI groups are not all equal. You use an ANOVA to test for differences in means across levels of a categorical variable (not a continuous one).
We are going to use this to test for a significant difference in mean SBP across the BMI groups.
There are multiple ways to get to the same outcome in SPSS. The simplest way, if you know you will not need to consider any other variables, is as follows.
Select
Analyze >> Compare Means and Proportions >> One-Way ANOVA
Then move the continuous variable (in this case SBP) into the Dependent List and the categorical variable (BMI group) into the Factor box. Press ‘OK’ to run the test.
Answer
The result indicates there are differences in mean SBP across the four categories of BMI, as the p-value (< 0.001) is highly significant.
Note that an ANOVA is a global test and does not tell us which BMI groups are significantly different from each other. You will learn how to produce comparisons of all possible pairings of the BMI groups, as well as many other uses of the ANOVA test, in Module B2.
B1.2 PRACTICAL: R
ANOVA
We use aov() to perform ANOVA, and we get a summary of the ANOVA table using summary().
aov() can be used in two ways as follows:

 fit3 <- aov(y ~ x, data = my.data)
 fit3 <- aov(my.data$y ~ my.data$x)
To create a summary of the ANOVA table, we can use summary(fit3).
white.data$bmi_fact <- factor(white.data$bmi_grp4)
fit3 <- aov(sbp ~ bmi_fact, data = white.data)
summary(fit3)
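The same pattern can be tried without the course data. The sketch below is only an illustration: it uses PlantGrowth, a small dataset that ships with R, and the object names fit_demo, fit_demo2, anova_tab and p_value are made up for this example. It shows both ways of calling aov() and one way to pull the p-value out of the summary table.

```r
# Illustration only: PlantGrowth is a dataset built into R (not the course data).
# It holds 30 plant weights in three treatment groups (ctrl, trt1, trt2).
data(PlantGrowth)

# 'group' is already a factor; a numeric grouping code would first need factor(),
# as done for bmi_grp4 above.
stopifnot(is.factor(PlantGrowth$group))

# Formula interface with a data argument
fit_demo <- aov(weight ~ group, data = PlantGrowth)

# Equivalent call using $ to reference the columns directly
fit_demo2 <- aov(PlantGrowth$weight ~ PlantGrowth$group)

# Print the ANOVA table
summary(fit_demo)

# Extract the table as a data frame, then the p-value for the group term
anova_tab <- summary(fit_demo)[[1]]
p_value <- anova_tab[["Pr(>F)"]][1]
p_value
```

The data argument form is usually easier to read and keeps the term name in the output short (group rather than PlantGrowth$group).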
Question B1.2: Compare SBP in the four groups using ANOVA. Is there a significant relationship between SBP and BMI groups?
Answer
> fit3 <- aov(sbp ~ bmi_fact, data = white.data)
> summary(fit3)
              Df  Sum Sq Mean Sq F value   Pr(>F)    
bmi_fact       3    6451  2150.4   7.024 0.000104 ***
Residuals   4297 1315528   306.2                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
26 observations deleted due to missingness
The result indicates there are differences in mean SBP across the four categories of BMI, as the p-value of 0.0001 is highly significant.
However, an ANOVA is a global test and does not tell us which BMI groups are significantly different from each other. You will learn how to produce comparisons of all possible pairings of the BMI groups, as well as many other aspects of the ANOVA test, in Module B2.
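As a check on how to read the R ANOVA table above, the F value can be reproduced by hand from the printed columns. This is a sketch using the rounded figures shown in the output, so the last digit may differ slightly from what R reports. The subscripts "between" and "resid" are labels chosen here for the bmi_fact and Residuals rows.

```latex
% Mean Sq = Sum Sq / Df for each row (printed values are rounded)
\[
MS_{\text{between}} = \frac{SS_{\text{between}}}{df_{\text{between}}} = \frac{6451}{3} \approx 2150,
\qquad
MS_{\text{resid}} = \frac{SS_{\text{resid}}}{df_{\text{resid}}} = \frac{1315528}{4297} \approx 306.2
\]
% F value = ratio of the two mean squares, using the Mean Sq column as printed
\[
F = \frac{MS_{\text{between}}}{MS_{\text{resid}}} \approx \frac{2150.4}{306.2} \approx 7.02
\]
```

Under the null hypothesis that all four group means are equal, this statistic is compared with an F distribution on 3 and 4297 degrees of freedom, which is where the value in the Pr(>F) column comes from.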