A2.5 Sample size calculations for cross-sectional studies (or surveys)
Learning Outcomes
By the end of this section, students will be able to:
- Explain the key concept of power and what impacts it
- Estimate the power of a given study
- Estimate the sample size needed to test hypotheses in different study designs
You can download a copy of the slides here: A2.5: Sample size calculations for cross-sectional studies (or surveys)
Video A2.5 Sample Size Calculation for Cross-Sectional Studies (5 minutes)
A2.5 PRACTICAL: R
Example of estimating sample size for a hypothesis in a cross-sectional study
You have been asked to help with a power calculation for a cross-sectional study, to estimate the point prevalence of obesity within a population. A study five years ago in this population found that 30% of people were obese, but the government thinks this has increased by 10 percentage points (to a point prevalence of 40%). Estimate the sample size needed for this study, assuming that the previous point prevalence of 30% is your ‘null hypothesis’. You want 80% power.
You are calculating a sample size for one proportion here.
The command is now ‘pwr.p.test’:
> power8<-pwr.p.test(h=ES.h(p1=0.3, p2=0.4), power=0.8, sig.level=0.05)
> power8
proportion power calculation for binomial distribution (arcsine transformation)
         h = 0.2101589
n = 177.7096
sig.level = 0.05
power = 0.8
alternative = two.sided
You need about 178 participants in your study to estimate this prevalence.
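As a rough cross-check on the output above, the same figure can be approximated in base R without the pwr package. This is a minimal sketch, assuming the usual normal approximation n ≈ ((z for alpha/2 + z for power) / h)^2 for a two-sided test, where h is the arcsine-transformed effect size. The variable names are illustrative only, and pwr.p.test itself solves the power equation numerically rather than using this closed form.

```r
# Arcsine-transformed effect size (Cohen's h) for p = 0.3 versus p = 0.4
p_null <- 0.3
p_alt  <- 0.4
h <- abs(2 * asin(sqrt(p_alt)) - 2 * asin(sqrt(p_null)))

# Standard normal quantiles for a two-sided 5% test and 80% power
z_alpha <- qnorm(1 - 0.05 / 2)
z_beta  <- qnorm(0.8)

# Approximate sample size, then rounded up to a whole participant
n_approx <- ((z_alpha + z_beta) / h)^2
n_needed <- ceiling(n_approx)

h
n_approx
n_needed
```

Under these assumptions the approximation should land very close to the n reported by pwr.p.test, and rounding up gives the same 178 participants.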
Question A2.5: One researcher has suggested that the proportion of the population who is obese may actually have decreased by 10% in the last five years (i.e. to 20%). How would this change your estimate for the sample size needed?
Answer
You are calculating a sample size for one proportion here.
> power9<-pwr.p.test(h=ES.h(p1=0.3, p2=0.2), power=0.8, sig.level=0.05)
> power9
proportion power calculation for binomial distribution (arcsine transformation)
         h = 0.2319843
n = 145.8443
sig.level = 0.05
power = 0.8
alternative = two.sided
The estimated sample size has now reduced slightly, to 146.
Based on the outputs above, we can conclude that more data are needed to detect a change in proportion from 0.3 to 0.4 than from 0.3 to 0.2. For a fixed absolute difference (here the absolute difference in proportions is 0.1), larger sample sizes are needed to obtain a given level of power as the proportions approach 0.5. This relationship is symmetrical around 0.5, as shown below:
> power9<-pwr.p.test(h=ES.h(p1=0.1, p2=0.2), power=0.8, sig.level=0.05)
> power9
proportion power calculation for binomial distribution (arcsine transformation)
         h = 0.2837941
n = 97.45404
sig.level = 0.05
power = 0.8
alternative = two.sided
> power10<-pwr.p.test(h=ES.h(p1=0.9, p2=0.8), power=0.8, sig.level=0.05)
> power10
proportion power calculation for binomial distribution (arcsine transformation)
h = 0.2837941
n = 97.45404
sig.level = 0.05
power = 0.8
alternative = two.sided
Recall that the standard error (se) of the sampling distribution of p is $\sqrt{p(1-p)/n}$.
As p gets closer to 0.5, the amount of variability increases (se is largest when p=0.5) and, therefore, more data are needed to detect a change in proportions of 0.1.
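The effect of p on the standard error can be seen numerically. The following is a small sketch, assuming the usual formula se = sqrt(p(1-p)/n) and an arbitrary illustrative sample size of n = 100 (not a value taken from the study above).

```r
# Standard error of a sample proportion, sqrt(p * (1 - p) / n),
# evaluated over a grid of p for a fixed, purely illustrative n
n  <- 100
p  <- seq(0.1, 0.9, by = 0.1)
se <- sqrt(p * (1 - p) / n)

# Tabulate p against its standard error
se_table <- data.frame(p = p, se = round(se, 4))
se_table

# Value of p at which the standard error is largest
p_at_max_se <- p[which.max(se)]
p_at_max_se
```

The table is symmetric around p = 0.5 (for example, p = 0.1 and p = 0.9 share the same se), which mirrors the symmetry in the sample sizes shown above.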
A2.5 PRACTICAL: Stata
Example of estimating sample size for a hypothesis in a cross-sectional study
You have been asked to help with a power calculation for a cross-sectional study, to estimate the point prevalence of obesity within a population. A study five years ago in this population found that 30% of people were obese, but the government thinks this has increased by 10 percentage points (to a point prevalence of 40%). Estimate the sample size needed for this study, assuming that the previous point prevalence of 30% is your ‘null hypothesis’. You want 80% power.
You are calculating a sample size for one proportion here.
The command is:
power oneproportion 0.3, diff(0.1)
This could also be calculated using:
power oneproportion 0.3 0.4, power(0.8)
*– Estimated sample size: N = 172
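The Stata figure (172) differs slightly from the one in the R practical (178). A likely explanation is that the two commands use different approximations: the R output states that it uses an arcsine transformation, whereas Stata's power oneproportion is assumed here to use its default score (normal-approximation) test with a two-sided 5% significance level. Under that assumption, and ignoring the negligible contribution of the opposite tail, the calculation can be sketched as follows, where p_0 is the null proportion and p_a the alternative:

```latex
\[
n \approx \left( \frac{ z_{1-\alpha/2}\sqrt{p_0(1-p_0)} + z_{1-\beta}\sqrt{p_a(1-p_a)} }{ p_a - p_0 } \right)^{2}
  = \left( \frac{ 1.960\sqrt{0.3 \times 0.7} + 0.8416\sqrt{0.4 \times 0.6} }{ 0.1 } \right)^{2}
  \approx 171.7
\]
```

Rounding up to a whole participant gives N = 172, in line with the Stata result.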
Question A2.5: One researcher has suggested that the proportion of the population who is obese may actually have decreased by 10% in the last five years (i.e. to 20%). How would this change your estimate for the sample size needed?
Answer
You are calculating a sample size for one proportion here.
 power oneproportion 0.3, diff(-0.1)
Or alternatively:
power oneproportion 0.3 0.2
*– Estimated sample size: N = 153
The estimated sample size has now reduced slightly, to 153.
Based on the outputs above, we can conclude that more data are needed to detect a change in proportion from 0.3 to 0.4 than from 0.3 to 0.2. For a fixed absolute difference (here the absolute difference in proportions is 0.1), larger sample sizes are needed to obtain a given level of power as the proportions approach 0.5. This relationship is symmetrical around 0.5, as shown below:
power oneproportion 0.1 0.2
*– Estimated sample size: N = 86
power oneproportion 0.9 0.8
*– Estimated sample size: N = 86
Recall that the standard error (se) of the sampling distribution of p is $\sqrt{p(1-p)/n}$. As p gets closer to 0.5, the amount of variability increases (se is largest when p=0.5) and, therefore, more data are needed to detect a change in proportions of 0.1.
A2.5 PRACTICAL: SPSS
Example of estimating sample size for a hypothesis in a cross-sectional study
You have been asked to help with a power calculation for a cross-sectional study, to estimate the point prevalence of obesity within a population. A study five years ago in this population found that 30% of people were obese, but the government thinks this has increased by 10 percentage points (to a point prevalence of 40%). Estimate the sample size needed for this study, assuming that the previous point prevalence of 30% is your ‘null hypothesis’. You want 80% power.
Select
Analyze >> Power Analysis >> Proportions >> One Sample Binomial Test
Input the values from the scenario into the Power Analysis window as before. In this example the Population proportion is the predicted value of 40% (0.4) and the null value is the previous prevalence of 30% (0.3).

Question A2.5: One researcher has suggested that the proportion of the population who is obese may actually have decreased by 10% in the last five years (i.e. to 20%). How would this change your estimate for the sample size needed?
Answer
For the first part of the question, when the hypothesis is that we will see a 40% prevalence, the output table will look like this.

The estimated sample size we need to have a power of 80% is 172.
In the second part of the question, when the hypothesis is that we will see a 20% prevalence, the output table will look like this.

The estimated sample size has now reduced slightly, to 153.
Based on the outputs above, we can conclude that more data are needed to detect a change in proportion from 0.3 to 0.4 than from 0.3 to 0.2. For a fixed absolute difference (here the absolute difference in proportions is 0.1), larger sample sizes are needed to obtain a given level of power as the proportions approach 0.5. This relationship is symmetrical around 0.5.
Recall that the standard error (se) of the sampling distribution of p is $\sqrt{p(1-p)/n}$. As p gets closer to 0.5, the amount of variability increases (se is largest when p=0.5) and, therefore, more data are needed to detect a change in proportions of 0.1.
Hello, when using this syntax to calculate the sample size: power8<-pwr.p.test(h=ES.h(p1=0.3, p2=0.4), power=0.8, sig.level=0.05)
What value of precision (d) is assumed?
Thank you