FoSSA: Fundamentals of Statistical Software & Analysis

A2.2 Power calculations for a difference in means

Learning Outcomes

By the end of this section, students will be able to:

  • Explain the key concept of power and what impacts it
  • Estimate the power of a given study
  • Estimate the sample size needed to test hypotheses in different study designs

You can download a copy of the slides here: A2.2 Power calculations for a difference in means

Video A2.2 Power Calculation for a Difference in Means (8 minutes)

A2.2 PRACTICAL: R

Power calculations for two means

Here is an example:

Estimate the sample size needed to compare the mean systolic blood pressure (SBP) in two populations. From a pilot study, you think that the group with lower blood pressure will have a mean SBP of 120 mm Hg, and the standard deviation (SD) of both groups will be 15 mm Hg. You have decided that you are interested in a minimum difference of 5 mm Hg, and you want 90% power, and a 5% significance level.

In R, we need two quantities to estimate sample size: delta (the expected difference between groups) and sigma (the standard deviation, which in this case is the pooled standard deviation of the two groups). Once we have delta and sigma, we can calculate the effect size we expect to see, which is Cohen’s d. We can estimate delta and sigma from past studies or by running a pilot study. Cohen’s d is calculated by dividing delta by sigma.

d <- 5/15
d
[1] 0.3333333

Then we can use the ‘pwr.t.test’ function (from the pwr package) to estimate the sample size needed to detect this effect size.

library(pwr)  # run install.packages("pwr") first if the package is not installed

### d = Cohen’s d
### power = 0.9
### sig.level = 0.05 (alpha)

power1 <- pwr.t.test(d = d, power = 0.9, sig.level = 0.05)
power1

     Two-sample t test power calculation

              n = 190.0991
              d = 0.3333333
      sig.level = 0.05
          power = 0.9
    alternative = two.sided

NOTE: n is number in *each* group

The estimate is just over 190 participants per group. Rounding up to whole participants, you need 191 in each group, and 382 participants overall.
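As a rough cross-check on the output above, the sketch below applies the standard normal-approximation formula for two equal-sized groups: n per group ≈ 2 × (z for alpha/2 + z for power)² / d². It is written in Python using only the standard library so it can be run without R. The function name `approx_n_per_group` is hypothetical and is not part of the pwr package. Because the approximation ignores the extra uncertainty from estimating the SD, it gives a value slightly below the 190.1 reported by pwr.t.test.

```python
from math import ceil
from statistics import NormalDist


def approx_n_per_group(d, power=0.9, alpha=0.05):
    """Normal-approximation sample size per group for a two-sided,
    two-sample comparison of means with equal group sizes.

    d is Cohen's d (expected difference divided by the SD).
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return 2 * (z_alpha + z_power) ** 2 / d ** 2


d = 5 / 15  # minimum difference of 5 mm Hg, SD of 15 mm Hg
n_raw = approx_n_per_group(d)
n_per_group = ceil(n_raw)  # round up to a whole participant

print(f"approximate n per group: {n_raw:.1f}, rounded up to {n_per_group}")
```

Sample sizes are rounded up rather than to the nearest whole number, so that the achieved power does not fall below the target.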

If we want to estimate the power of a given sample size, we omit the ‘power’ option, and instead use the ‘n=’ option:

power2 <- pwr.t.test(n = 190, d = d, sig.level = 0.05)
power2

     Two-sample t test power calculation

              n = 190
              d = 0.3333333
      sig.level = 0.05
          power = 0.8998509
    alternative = two.sided

NOTE: n is number in *each* group

We can see here that recruiting 190 participants in each blood pressure group would give our study almost exactly 90% power (0.8999, fractionally below the 0.9 target).
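To see how power follows from n, d and the significance level, here is a minimal sketch using a normal approximation instead of the noncentral t distribution that pwr.t.test uses. It is in Python with the standard library only, and `approx_power` is a hypothetical helper name, not a function from any package. Expect the result to differ from the R output in the third decimal place or so.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group, d, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means
    with equal group sizes, given Cohen's d."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    shift = d * sqrt(n_per_group / 2)   # expected size of the test statistic
    # Probability of rejecting in either tail; the lower tail is negligible here.
    return z.cdf(shift - z_alpha) + z.cdf(-shift - z_alpha)


power_190 = approx_power(n_per_group=190, d=5 / 15)
print(f"approximate power with 190 per group: {power_190:.3f}")
```

Note that `n_per_group` follows the R convention of counting participants in each group, not the total.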

Question A2.2: Using the same study outlined above, how much power would we have ended up with if we only managed to recruit 150 participants in each group, but the variability of our study sample was smaller than we anticipated (SD=12)?

Answer

We first need to recalculate our Cohen’s d (effect size):

d2 <- 5/12
d2

[1] 0.4166667

power3 <- pwr.t.test(n = 150, d = d2, sig.level = 0.05)
power3

     Two-sample t test power calculation

              n = 150
              d = 0.4166667
      sig.level = 0.05
          power = 0.9491662
    alternative = two.sided

NOTE: n is number in *each* group

We recruited fewer participants, which would decrease our power, but since our variance was lower our power actually increased overall to 95%.
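The trade-off described above can be illustrated with a small comparison of scenarios. This is a sketch under a normal approximation, in Python with the standard library only; `approx_power` and the scenario labels are hypothetical, and the middle scenario (150 per group with the original SD of 15) is added purely for illustration and is not part of the exercise.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group, delta, sd, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    shift = (delta / sd) * sqrt(n_per_group / 2)  # Cohen's d scaled by sample size
    return z.cdf(shift - z_alpha) + z.cdf(-shift - z_alpha)


# (participants per group, SD in mm Hg); the difference of interest stays at 5 mm Hg
scenarios = {
    "planned: n=190, SD=15": (190, 15),
    "fewer participants: n=150, SD=15": (150, 15),
    "fewer participants, smaller SD: n=150, SD=12": (150, 12),
}

results = {
    label: approx_power(n, delta=5, sd=sd) for label, (n, sd) in scenarios.items()
}
for label, power in results.items():
    print(f"{label}: power is roughly {power:.2f}")
```

Losing 40 participants per group on its own would reduce power noticeably; the smaller SD raises Cohen’s d from 0.33 to 0.42, which more than compensates.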

A2.2 PRACTICAL: Stata

Power calculations for two means

Here is an example:

Estimate the sample size needed to compare the mean systolic blood pressure (SBP) in two populations. From a pilot study, you think that the group with lower blood pressure will have a mean SBP of 120 mm Hg, and the standard deviation (SD) of both groups will be 15 mm Hg. You have decided that you are interested in a minimum difference of 5 mm Hg, and you want 90% power, and a 5% significance level.

The command and output are as follows:

power twomeans 120, power(0.9) alpha(0.05) diff(5) sd(15)

Estimated sample sizes:

            N =       382
  N per group =       191

You need approximately 382 participants overall.

If we want to estimate the power of a given sample size, we omit the ‘power’ option, and instead use the ‘n( )’ option, which takes the total sample size across both groups:

power twomeans 120, alpha(0.05) diff(5) sd(15) n(382)

A2.2 Figure 1-1.png

Question A2.2: Using the same study outlined above, how much power would we have ended up with in our study if we only managed to recruit 300 participants in total, but the variance of our study sample was smaller than what we anticipated (so SD=12)?

Answer

power twomeans 120, alpha(0.05) diff(5) sd(12) n(300)

A2.2 Figure 2-1.png

We recruited fewer participants, which would decrease our power, but since our variance was lower our power actually increased overall to 95%.

A2.2 PRACTICAL: SPSS

Power calculations for two means

Here we want to estimate the sample size needed to compare the mean systolic blood pressure (SBP) in two populations. From a pilot study, you think that the group with lower blood pressure will have a mean SBP of 120 mm Hg, and the standard deviation (SD) of both groups will be 15 mm Hg. You have decided that you are interested in a minimum difference of 5 mm Hg, and you want 90% power, and a 5% significance level.

Select

Analyze >> Power Analysis >> Means >> Independent Samples T Test

In the Power Analysis window, you need to enter the following:

  • Estimate: Sample size (because this is what we want to calculate)
  • Single power value: 0.9 (we are looking for 90% power, and this is shown as a decimal)
  • Population mean difference: 5 (the difference we are looking for)
  • Population standard deviation: 15 (use the "equal for two groups" option, as we would not expect the groups to differ from each other)
  • Significance level (α): 0.05

spss1-5.png

If we want to estimate the power of a given sample size, we open the power analysis window in the same way, but select power from the drop-down menu at the top instead of sample size. Input the sample size for each of your two groups at the top of the box, then input all the rest of the values as before. Then press OK to run the test.

Question A2.2: Using the same study outlined above, how much power would we have ended up with in our study if we only managed to recruit 300 participants in total, but the variance of our study sample was smaller than what we anticipated (so SD=12)?

Answer

For the first part of the question the output table would look like this.

spss2-6.png

The estimated sample size is 382 (191 per group), so you need approximately 382 participants overall for the study to have your desired power of 90%.

For the second part of the question, the output table would look like this.

spss3-5.png

We recruited fewer participants, which would decrease our power, but since our variance was lower our power actually increased overall to 95%.
