FoSSA: Fundamentals of Statistical Software & Analysis

A2.6 Sample size calculations for case-control studies

Learning Outcomes

By the end of this section, students will be able to:

  • Explain the key concept of power and what impacts it
  • Estimate the power of a given study
  • Estimate the sample size needed to test hypotheses in different study designs

You can download a copy of the slides here: A2.6: Sample size calculations for case-control studies

Video A2.6 Sample Size Calculations for Case-Control Studies (7 minutes)

A2.6 PRACTICAL: R

Estimating sample size for case-control studies

You have now been asked to help another research group with their power calculations. They want to conduct a case-control study of sterilization (tubal ligation) and ovarian cancer. You could use parameters from a recent paper (reference below). Parameters of interest might include the proportion of cases or controls with tubal ligation, and the odds ratio.

Tubal ligation, hysterectomy and epithelial ovarian cancer in the New England Case-Control Study. Rice MS, Murphy MA, Vitonis AF, Cramer DW, Titus LJ, Tworoger SS, Terry KL. Int J Cancer. 2013 Nov 15;133(10):2415-21. doi: 10.1002/ijc.28249. Epub 2013 Jul 9.

Looking at the paper, you could extract the following information needed to estimate your sample size:

  • Proportion of controls with tubal ligation: 18.5%.
  • Proportion of cases with tubal ligation: 12.8%.
  • Odds ratio for the association between tubal ligation and ovarian cancer: 0.82.
  • This study used 2,265 cases and 2,333 controls.
    Note: you probably won’t need to use all of these parameters.

Below is a plot showing how the sample size would change, depending on the odds ratio (assuming 18% of controls have had tubal ligation, and 90% power, and allowing the odds ratio to vary between 0.6 and 0.9):

Sample size for case-control studies plot

If you want to use a test of proportions to work out how large your sample needs to be for an expected odds ratio between tubal ligation and ovarian cancer, the command needs two proportions. Above we assume that 18% of controls have had tubal ligation (p1), so we can use the following formula to work backwards from an expected odds ratio (OR) to the other proportion, the proportion of cases with tubal ligation (p2):

p2 = (OR * p1) / (1 + p1 * (OR - 1))
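The formula comes from converting the control proportion to odds, multiplying by the odds ratio, and converting back to a proportion. A minimal R sketch of those steps follows; the helper name `or_to_p2` is illustrative only and does not come from any package.

```r
# Convert the exposure proportion in controls (p1) and an odds ratio (or)
# into the implied exposure proportion in cases (p2).
# `or_to_p2` is an illustrative helper name, not a function from a package.
or_to_p2 <- function(p1, or) {
  odds1 <- p1 / (1 - p1)   # odds of exposure among controls
  odds2 <- or * odds1      # odds of exposure among cases
  odds2 / (1 + odds2)      # convert the odds back to a proportion
}

# The step-by-step route and the closed-form expression should agree
p1 <- 0.18
or <- 0.8
p2_steps   <- or_to_p2(p1, or)
p2_formula <- (or * p1) / (1 + p1 * (or - 1))
all.equal(p2_steps, p2_formula)   # TRUE
round(p2_steps, 4)                # 0.1494
```

Working through odds makes it easier to see why the odds ratio, rather than a simple ratio of proportions, links p1 and p2.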

  • Question A2.6a: Looking at the plot, how many people do you need to recruit into your study if your odds ratio is less than 0.8 and you want to have power of at least 80%?
  • Question A2.6b: Now use a test of two proportions to assess how large your sample needs to be if the odds ratio of tubal ligation and ovarian cancer is 0.85. You assume 18% of the controls have had tubal ligation and you want 90% power. You need to have two proportions to run the command, so use the formula presented above.
Answer

Answer A2.6a: If the odds ratio in this scenario is ≤ 0.8 (i.e. an effect size of 20% to 40%), then you can reach a power ≥ 0.8 with a sample size of around 5,000 people. If the true odds ratio is closer to 0.9 (i.e. a 10% effect size), then you would require a much larger sample size.
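As a rough numerical check on the reading of the plot, the sketch below approximates power for a fixed sample size using the arcsine effect size and a normal approximation. It rests on two assumptions: that "around 5000 people" means 2,500 cases plus 2,500 controls, and that 18% of controls are exposed. The helper name `approx_power` is illustrative.

```r
# Approximate power of a two-sided test of two proportions with equal
# group sizes, based on the arcsine effect size h and a normal approximation.
# `approx_power` is an illustrative helper, not a function from a package.
approx_power <- function(p1, or, n_per_group, sig.level = 0.05) {
  p2    <- (or * p1) / (1 + p1 * (or - 1))              # proportion among cases
  h     <- abs(2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))) # arcsine effect size
  z     <- qnorm(1 - sig.level / 2)                     # two-sided critical value
  shift <- h * sqrt(n_per_group / 2)
  pnorm(shift - z) + pnorm(-shift - z)
}

# Assumption: 5,000 people in total, split equally between cases and controls
power_or08 <- approx_power(0.18, or = 0.8, n_per_group = 2500)
power_or09 <- approx_power(0.18, or = 0.9, n_per_group = 2500)
power_or08   # a little above 0.8
power_or09   # well below 0.8
```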

Answer A2.6b:

We substitute a proportion of 18% and an OR of 0.85 into the formula:

p2 = (OR * p1) / (1 + p1 * (OR - 1))

p2 = (0.85 * 0.18) / (1 + 0.18 * (0.85 - 1))

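Using R like a calculator, we get:

> (0.85*0.18)/(1+0.18*(0.85-1))

[1] 0.1572456

We then pass both proportions to pwr.2p.test() from the pwr package, using ES.h() to convert them into an effect size:

> library(pwr)

> power11<-pwr.2p.test(h=ES.h(p1=0.18, p2=0.1572), power=0.9, sig.level=0.05)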

> power11

     Difference of proportion power calculation for binomial distribution (arcsine transformation)

              h = 0.06092931

              n = 5660.744

      sig.level = 0.05

          power = 0.9

    alternative = two.sided

NOTE: same sample sizes

You need 5661 cases and 5661 controls, so about 11,322 participants in total, to obtain 90% power if your OR of tubal ligation with ovarian cancer is 0.85.
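The result can be cross-checked in base R without the pwr package. The sketch below uses the closed-form normal approximation for the arcsine effect size; it ignores the negligible opposite-tail probability, so it should agree closely, though not necessarily to every decimal, with the solver-based output above.

```r
# Closed-form approximation to the per-group sample size for a two-sided
# test of two proportions, based on the arcsine effect size h.
p1 <- 0.18
p2 <- 0.1572                      # rounded value used in the pwr.2p.test call
h  <- 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))
z_alpha <- qnorm(1 - 0.05 / 2)    # two-sided 5% significance level
z_beta  <- qnorm(0.9)             # 90% power
n_per_group <- 2 * ((z_alpha + z_beta) / h)^2
ceiling(n_per_group)              # 5661 per group
2 * ceiling(n_per_group)          # 11322 in total
```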

A2.6 PRACTICAL: Stata

Estimating sample size for case-control studies

You have now been asked to help another research group with their power calculations. They want to conduct a case-control study of sterilization (tubal ligation) and ovarian cancer. You could use parameters from a recent paper (reference below). Parameters of interest might include the proportion of cases or controls with tubal ligation, and the odds ratio.

Tubal ligation, hysterectomy and epithelial ovarian cancer in the New England Case-Control Study. Rice MS, Murphy MA, Vitonis AF, Cramer DW, Titus LJ, Tworoger SS, Terry KL. Int J Cancer. 2013 Nov 15;133(10):2415-21. doi: 10.1002/ijc.28249. Epub 2013 Jul 9.

Looking at the paper, you could extract the following information needed to estimate your sample size:

    • Proportion of controls with tubal ligation: 18.5%.
    • Proportion of cases with tubal ligation: 12.8%.
    • Odds ratio for the association between tubal ligation and ovarian cancer: 0.82.
    • This study used 2,265 cases and 2,333 controls.

Note: you probably won’t need to use all of these parameters.

To produce a table showing how the sample size changes with the odds ratio (assuming 18% of controls have had tubal ligation and 90% power, with the odds ratio varying between 0.6 and 0.9):

power twoproportions 0.18, alpha(0.05) effect(oratio) power(0.9) ///
  oratio(0.6,0.7,0.8,0.9) table

Estimated sample sizes for a two-sample proportions test
Pearson’s chi-squared test 
Ho: p2 = p1  versus  Ha: p2 != p1

  +-------------------------------------------------------------------------+
  |   alpha   power       N      N1      N2   delta      p1      p2  oratio |
  |-------------------------------------------------------------------------|
  |     .05      .9    1308     654     654      .6     .18   .1164      .6 |
  |     .05      .9    2530    1265    1265      .7     .18   .1332      .7 |
  |     .05      .9    6162    3081    3081      .8     .18   .1494      .8 |
  |     .05      .9   26552   13276   13276      .9     .18    .165      .9 |
  +-------------------------------------------------------------------------+
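For readers who want to check the table outside Stata, here is a minimal base R sketch. It assumes the table is based on the usual pooled-variance normal approximation for a two-sided test of two proportions with equal group sizes; that is an assumption about the method, not something stated in the output, and the helper name `n_two_prop` is illustrative.

```r
# Per-group sample size for a two-sided test of two proportions with equal
# groups, using the pooled-variance normal approximation.
# Assumption: this is the formula behind the Stata table above.
# `n_two_prop` is an illustrative helper name, not a packaged function.
n_two_prop <- function(p1, or, power = 0.9, sig.level = 0.05) {
  p2   <- (or * p1) / (1 + p1 * (or - 1))   # proportion among cases
  pbar <- (p1 + p2) / 2                     # pooled proportion
  z_a  <- qnorm(1 - sig.level / 2)
  z_b  <- qnorm(power)
  num  <- z_a * sqrt(2 * pbar * (1 - pbar)) +
          z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
  ceiling((num / (p1 - p2))^2)              # round up to whole participants
}

ors <- c(0.6, 0.7, 0.8, 0.9)
n1  <- sapply(ors, function(o) n_two_prop(0.18, o))
p2  <- (ors * 0.18) / (1 + 0.18 * (ors - 1))
data.frame(oratio = ors, p2 = round(p2, 4), N1 = n1, N2 = n1, N = 2 * n1)
```

If the assumption holds, the N1, N2 and N columns should match the Stata table. The steep rise in N as the odds ratio approaches 1 shows how expensive it is to detect small effects.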

To draw a graph showing how power changes with sample size for different odds ratios (again assuming 18% of controls have had tubal ligation, with the odds ratio varying from 0.6 to 0.9 and the sample size from 1,000 to 15,000):

* Basic plot
power twoproportions 0.18, alpha(0.05) effect(oratio) ///
  oratio(0.6,0.7,0.8,0.9) n(1000(1000)15000) graph

* Changing the graph using plot options
power twoproportions 0.18, alpha(0.05) effect(oratio) ///
  oratio(0.6,0.7,0.8,0.9) n(1000(1000)15000) ///
  graph(plot1opts(msymbol(O) lpattern(solid)) ///
  plot2opts(msymbol(D) lpattern(dash)) ///
  plot3opts(msymbol(S) lpattern(dot)) ///
  plot4opts(msymbol(T) lpattern(longdash_dot)) ///
  graphregion(color(white)) xlabel(, nogrid) ylabel(, nogrid))

sample size for case-control studies plot

Question A2.6: Looking at the plot, how many people do you need to recruit into your study if your odds ratio is less than 0.8 and you want to have power of at least 80%?

Answer

If the odds ratio in this scenario is ≤ 0.8, then you can reach a power ≥ 0.8 with a sample size of around 5,000 people. If the true odds ratio is closer to 0.9, then you would require a much larger sample size.

A2.6 PRACTICAL: SPSS

Estimating sample size for case-control studies

You have now been asked to help another research group with their power calculations. They want to conduct a case-control study of sterilization (tubal ligation) and ovarian cancer. You could use parameters from a recent paper (reference below). Parameters of interest might include the proportion of cases or controls with tubal ligation, and the odds ratio.

Tubal ligation, hysterectomy and epithelial ovarian cancer in the New England Case-Control Study. Rice MS, Murphy MA, Vitonis AF, Cramer DW, Titus LJ, Tworoger SS, Terry KL. Int J Cancer. 2013 Nov 15;133(10):2415-21. doi: 10.1002/ijc.28249. Epub 2013 Jul 9.

Looking at the paper, you could extract the following information needed to estimate your sample size:

  • Proportion of controls with tubal ligation: 18.5%.
  • Proportion of cases with tubal ligation: 12.8%.
  • Odds ratio for the association between tubal ligation and ovarian cancer: 0.82.
  • This study used 2,265 cases and 2,333 controls.
    Note: you probably won’t need to use all of these parameters.

Below is a plot showing how the sample size would change, depending on the odds ratio (assuming 18% of controls have had tubal ligation, and 90% power, and allowing the odds ratio to vary between 0.6 and 0.9):

If you want to use a test of proportions to work out how large your sample needs to be for an expected odds ratio between tubal ligation and ovarian cancer, the command needs two proportions. Above we assume that 18% of controls have had tubal ligation (p1), so we can use the following formula to work backwards from an expected odds ratio (OR) to the other proportion, the proportion of cases with tubal ligation (p2):

p2 = (OR * p1) / (1 + p1 * (OR - 1))

i. Looking at the plot, how many people do you need to recruit into your study if your odds ratio is less than 0.8 and you want to have power of at least 80%?

ii. Now use the Independent Samples Binomial Test as before to assess how large your sample needs to be if the odds ratio of tubal ligation and ovarian cancer is 0.85. You assume 18% of the controls have had tubal ligation and you want 90% power. You need to have two proportions to run the command, so use the formula presented above.

Answer

i. If the odds ratio in this scenario is ≤ 0.8 (i.e. an effect size of 20% to 40%), then you can reach a power ≥ 0.8 with a sample size of around 5,000 people. If the true odds ratio is closer to 0.9 (i.e. a 10% effect size), then you would require a much larger sample size.

ii. First, we substitute a proportion of 18% and an OR of 0.85 into the formula to work out the second proportion:

p2 = (OR * p1) / (1 + p1 * (OR - 1))

= (0.85 * 0.18) / (1 + 0.18 * (0.85 - 1))

= 0.1572456

Then input the proportions into the Independent Samples Binomial Test with a power of 0.9 to calculate the required sample size.

You need 5665 cases and 5665 controls, so about 11,330 participants in total, to obtain 90% power if your OR of tubal ligation with ovarian cancer is 0.85.
