Sample Size Formula

Sample size calculation is a crucial step in designing an experiment or study: it determines how many observations are needed for an estimate to reach a chosen confidence level and margin of error. Below are the main components and formulas for two common scenarios: estimating a population mean and estimating a population proportion.

1. Estimating a Population Mean

To determine the sample size needed to estimate a population mean with a desired level of confidence and precision, the formula is:

$$n = \left(\frac{Z_{\alpha/2} \times \sigma}{E}\right)^2$$
  • $n$: Required sample size
  • $Z_{\alpha/2}$: Z-score corresponding to the desired confidence level (e.g., 1.96 for 95% confidence)
  • $\sigma$: Estimated standard deviation of the population
  • $E$: Margin of error (the maximum acceptable difference between the sample mean and the population mean)
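The formula follows from the margin of error of a normal-based confidence interval for a mean. A brief sketch of the algebra, assuming $\sigma$ is known (or well estimated) and the sample mean is approximately normally distributed:

```latex
\begin{align*}
E &= Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \\
\sqrt{n} &= \frac{Z_{\alpha/2} \times \sigma}{E} \\
n &= \left(\frac{Z_{\alpha/2} \times \sigma}{E}\right)^2
\end{align*}
```

Because $n$ grows with $1/E^2$, halving the margin of error roughly quadruples the required sample size.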

Example: Population Mean Calculation

To calculate a sample size for a study with a 95% confidence level, an estimated population standard deviation of 10, and a margin of error of 2:

import math
import scipy.stats as stats

# Parameters
confidence_level = 0.95
sigma = 10  # estimated standard deviation
E = 2  # margin of error

# Two-sided critical value Z_{alpha/2} for the given confidence level
Z = stats.norm.ppf((1 + confidence_level) / 2)

# Calculate sample size, rounding up so the margin of error is not exceeded
n = math.ceil((Z * sigma / E) ** 2)
n  # 97
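The same calculation can be wrapped in a reusable function. The following is a minimal sketch, not a definitive implementation: the name `sample_size_for_mean` and its parameters are hypothetical, and it uses the standard library's `statistics.NormalDist` in place of SciPy to obtain the Z-score. It rounds up, since rounding down would leave the margin of error slightly wider than requested.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n whose margin of error for a mean is at most `margin`.

    Hypothetical helper for illustration; assumes a normal-based interval
    with a known or well-estimated standard deviation `sigma`.
    """
    if sigma <= 0 or margin <= 0:
        raise ValueError("sigma and margin must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Two-sided critical value Z_{alpha/2}
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    # Round up so the achieved margin of error does not exceed `margin`
    return math.ceil((z * sigma / margin) ** 2)


# Show how the requirement grows as the margin of error shrinks
for margin in (4, 2, 1):
    print(margin, sample_size_for_mean(sigma=10, margin=margin))
```

Running the loop makes the inverse-square relationship between the margin of error and the sample size visible.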

2. Estimating a Population Proportion

For calculating the sample size required to estimate a population proportion within a given margin of error, the formula is:

$$n = \frac{Z_{\alpha/2}^2 \times p \times (1 - p)}{E^2}$$
  • $p$: Estimated proportion of the attribute present in the population
  • $E$: Margin of error
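As with the mean, this formula comes from solving the margin-of-error expression for $n$. A brief sketch, assuming the normal approximation to the sampling distribution of a proportion holds:

```latex
\begin{align*}
E &= Z_{\alpha/2} \sqrt{\frac{p \, (1 - p)}{n}} \\
E^2 &= \frac{Z_{\alpha/2}^2 \times p \times (1 - p)}{n} \\
n &= \frac{Z_{\alpha/2}^2 \times p \times (1 - p)}{E^2}
\end{align*}
```

The product $p(1 - p)$ is largest at $p = 0.5$, so using $p = 0.5$ gives the most conservative (largest) sample size when no prior estimate of the proportion is available.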

Example: Population Proportion Calculation

To estimate the sample size for a survey where you expect about 50% of the population to respond positively, with a 95% confidence level and a margin of error of 5%:

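import math
import scipy.stats as stats

# Parameters
confidence_level = 0.95
p = 0.5  # estimated proportion
E = 0.05  # margin of error

# Two-sided critical value Z_{alpha/2} for the given confidence level
Z = stats.norm.ppf((1 + confidence_level) / 2)

# Calculate sample size, rounding up so the margin of error is not exceeded
n = math.ceil((Z**2 * p * (1 - p)) / E**2)
n  # 385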
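To see how the estimated proportion affects the result, the calculation can be wrapped in a function and evaluated for several values of p. This is a minimal sketch under the same normal-approximation assumption; the name `sample_size_for_proportion` and its parameters are hypothetical, and `statistics.NormalDist` from the standard library stands in for SciPy.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p, margin, confidence=0.95):
    """Smallest n whose margin of error for a proportion is at most `margin`.

    Hypothetical helper for illustration; relies on the normal
    approximation to the sampling distribution of a proportion.
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if margin <= 0:
        raise ValueError("margin must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Two-sided critical value Z_{alpha/2}
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    # Round up so the achieved margin of error does not exceed `margin`
    return math.ceil(z**2 * p * (1 - p) / margin**2)


# Compare the requirement for several assumed proportions
for p in (0.1, 0.3, 0.5):
    print(p, sample_size_for_proportion(p, margin=0.05))
```

The required sample size is largest at p = 0.5 and falls as p moves toward 0 or 1, which is why 0.5 is a common default when the proportion is unknown.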

These examples provide a basic guide for sample size calculations in typical scenarios involving means and proportions. Adjustments might be necessary for more complex designs or specific statistical tests.