QCE Mathematical Methods - Unit 4 - Sampling and proportions

Sample Proportions | QCE Mathematical Methods

Learn QCE Mathematical Methods sample proportions, $\hat p$ as a random variable, approximate normality and simulation ideas.

Updated 2026-05-18

QCAA official coverage - Mathematical Methods 2025 v1.3

Exact syllabus points covered

  1. Understand the concept of the sample proportion $\hat{p}$ as a random variable whose value varies between samples, and the formulas for the mean $p$ and standard deviation $\sqrt{\frac{p(1-p)}{n}}$ of $\hat{p}$, where $n$ is the sample size.
  2. Recognise and use the approximate normality of the distribution of $\hat{p}$ for large samples.
  3. Use repeated random sampling data, for a variety of values of $p$ and a range of sample sizes, to examine the distribution of $\hat{p}$ and the approximate standard normality of $\frac{\hat{p}-p}{\sqrt{\hat{p}(1-\hat{p})/n}}$, where the closeness of the approximation depends on both $n$ and $p$.

A population proportion $p$ is the true proportion of the whole population with a characteristic. A sample proportion $\hat{p}$ is the proportion observed in one sample.

For example, if $37$ out of $100$ sampled students say they use flashcards, then:

$ \hat{p}=\frac{37}{100}=0.37 $

The population proportion $p$ is fixed but usually unknown. The sample proportion $\hat{p}$ changes from sample to sample.
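The difference between a fixed $p$ and a varying $\hat{p}$ can be sketched with a small simulation. The population below (10,000 students, 3,700 of them flashcard users) and the helper name `sample_proportion` are assumptions made up for illustration, not data from the syllabus.

```python
import random

# Hypothetical population: 10,000 students, 3,700 of whom use flashcards,
# so the population proportion is p = 0.37 (an assumed value for illustration).
population = [1] * 3700 + [0] * 6300
p = sum(population) / len(population)

rng = random.Random(1)  # fixed seed so the run is repeatable


def sample_proportion(n):
    """Draw a simple random sample of size n and return its sample proportion."""
    sample = rng.sample(population, n)
    return sum(sample) / n


# Five separate samples of 100 students each.
p_hats = [sample_proportion(100) for _ in range(5)]
print("population proportion p =", p)
print("five sample proportions:", p_hats)
```

The value of `p` never changes between runs of `sample_proportion`, while each call returns its own $\hat{p}$.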

Figure: Sample proportion distribution (original Sylligence diagram).

$\hat p$ as a random variable

Because different random samples give different sample proportions, $\hat{p}$ is treated as a random variable. Its mean is:

$ E(\hat{p})=p $

Its standard deviation is:

$ \sigma_{\hat{p}}=\sqrt{\frac{p(1-p)}{n}} $

where $n$ is the sample size.
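As a quick numerical check of the standard deviation formula, here is a minimal sketch. The function name `sd_sample_proportion` and the values $p=0.5$ and $n=100, 400, 1600$ are chosen only for illustration.

```python
import math


def sd_sample_proportion(p, n):
    """Standard deviation of p-hat for population proportion p and sample size n."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return math.sqrt(p * (1 - p) / n)


# Quadrupling the sample size halves the standard deviation.
for n in (100, 400, 1600):
    print(n, round(sd_sample_proportion(0.5, n), 4))
```

The printed standard deviations are $0.05$, $0.025$ and $0.0125$: because $n$ sits under a square root, halving the spread takes four times as many observations.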

Connection to binomial

If $X$ is the number of successes in $n$ independent trials, each with success probability $p$, then:

$ X\sim B(n,p) $

and:

$ \hat{p}=\frac{X}{n} $

That is why the formulas for $\hat{p}$ are the binomial formulas rescaled: the binomial mean is divided by $n$, and the binomial variance is divided by $n^2$.

The mean follows by scaling:

$ E(\hat{p})=E\left(\frac{X}{n}\right)=\frac{1}{n}E(X)=\frac{np}{n}=p $

The variance also scales. Dividing a random variable by $n$ divides the variance by $n^2$:

$ \operatorname{Var}(\hat{p})=\operatorname{Var}\left(\frac{X}{n}\right)=\frac{1}{n^2}\operatorname{Var}(X) $

Since $\operatorname{Var}(X)=np(1-p)$:

$ \operatorname{Var}(\hat{p})=\frac{p(1-p)}{n} $

and therefore:

$ \sigma_{\hat{p}}=\sqrt{\frac{p(1-p)}{n}} $
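The derivation above can be checked numerically by computing the mean and variance of $\hat{p}=X/n$ directly from the binomial probabilities. This is a sketch only; the helper names and the values $n=40$, $p=0.3$ are assumptions for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_mean_and_variance(n, p):
    """Mean and variance of p-hat = X/n, summed over every possible count k."""
    values = [k / n for k in range(n + 1)]
    probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
    mean = sum(v * q for v, q in zip(values, probs))
    variance = sum((v - mean) ** 2 * q for v, q in zip(values, probs))
    return mean, variance


n, p = 40, 0.3
mean, variance = exact_mean_and_variance(n, p)
print("mean from probabilities:", mean, "| formula p:", p)
print("variance from probabilities:", variance, "| formula p(1-p)/n:", p * (1 - p) / n)
```

Up to floating-point rounding, the two printed pairs agree, which is what the scaling argument predicts.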

Approximate normality

For large samples, the distribution of $\hat{p}$ can be approximated by a normal distribution with mean $p$ and variance $\frac{p(1-p)}{n}$:

$ \hat{p}\approx N\left(p,\frac{p(1-p)}{n}\right) $

The approximation depends on both $n$ and $p$. It generally behaves better when the sample is large and $p$ is not extremely close to $0$ or $1$.
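One way to see this dependence is to compare an exact binomial probability with its normal approximation. The sketch below uses no continuity correction, and the two scenarios (a large sample with $p=0.5$, a small sample with $p=0.05$) are assumed values picked to contrast a favourable case with an unfavourable one.

```python
import math
from statistics import NormalDist


def exact_prob_at_most(c, n, p):
    """P(p-hat <= c) computed exactly from X ~ B(n, p)."""
    k_max = math.floor(c * n + 1e-9)  # largest count k with k/n <= c
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_max + 1)
    )


def normal_prob_at_most(c, n, p):
    """P(p-hat <= c) using the normal approximation N(p, p(1-p)/n)."""
    sd = math.sqrt(p * (1 - p) / n)
    return NormalDist(mu=p, sigma=sd).cdf(c)


for n, p, c in [(200, 0.5, 0.45), (20, 0.05, 0.10)]:
    exact = exact_prob_at_most(c, n, p)
    approx = normal_prob_at_most(c, n, p)
    print(f"n={n}, p={p}: exact={exact:.4f}, normal={approx:.4f}, "
          f"gap={abs(exact - approx):.4f}")
```

The gap between the exact and approximate values is noticeably larger in the second scenario, where the sample is small and $p$ is close to $0$.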

The 2025 syllabus also emphasises the standardised sample proportion:

$ \frac{\hat{p}-p}{\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}} $

For repeated random samples, the distribution of this quantity is closer to the standard normal distribution $N(0,1)$ when $n$ is large and the sample proportion is not too close to $0$ or $1$. In teaching notes and textbook derivations you may also see $p$ used in the denominator when the true population proportion is known. In confidence-interval work, $p$ is unknown, so $\hat{p}$ is used as the estimate.
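A rough simulation check of this claim: if the standardised quantity were exactly standard normal, about $95\%$ of its values would fall between $-1.96$ and $1.96$. The sketch below is illustrative only; the function names, the seed and the two $(n, p)$ pairs are assumptions. Samples with $\hat{p}=0$ or $\hat{p}=1$ are skipped because the denominator is then zero.

```python
import math
import random

rng = random.Random(2025)  # fixed seed so the run is repeatable


def simulate_count(n, p):
    """Number of successes in n independent trials with success probability p."""
    return sum(1 for _ in range(n) if rng.random() < p)


def standardised_values(n, p, repeats):
    """Simulate (p_hat - p) / sqrt(p_hat (1 - p_hat) / n) over repeated samples."""
    values = []
    for _ in range(repeats):
        p_hat = simulate_count(n, p) / n
        if p_hat in (0.0, 1.0):
            continue  # estimated standard deviation is 0, so the ratio is undefined
        values.append((p_hat - p) / math.sqrt(p_hat * (1 - p_hat) / n))
    return values


def share_within(values, z=1.96):
    """Fraction of simulated values lying between -z and z."""
    return sum(1 for v in values if abs(v) <= z) / len(values)


for n, p in [(400, 0.5), (20, 0.1)]:
    z_values = standardised_values(n, p, repeats=2000)
    print(f"n={n}, p={p}: share within +/-1.96 = {share_within(z_values):.3f}")
```

Comparing the two printed shares with $0.95$ gives a feel for how the closeness of the approximation depends on both $n$ and $p$.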

What simulations show

Repeated random sampling helps you see three things:

| Change | Effect on $\hat{p}$ distribution |
| --- | --- |
| increase $n$ | spread decreases |
| keep $p$ the same | centre stays near $p$ |
| move $p$ closer to $0$ or $1$ | shape can become less symmetric for small samples |

This is why a sample of $1000$ tends to give a more stable sample proportion than a sample of $50$, but both are still random estimates.
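The comparison between samples of $50$ and $1000$ can be sketched as a simulation. The population proportion $p=0.35$, the number of repeats and the helper name `simulate_p_hats` are assumptions chosen for the demonstration.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable


def simulate_p_hats(n, p, repeats):
    """Return `repeats` simulated sample proportions, each from n independent trials."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(repeats)]


p = 0.35  # assumed population proportion for the demonstration
for n in (50, 1000):
    p_hats = simulate_p_hats(n, p, repeats=1000)
    print(
        f"n={n}: mean={statistics.mean(p_hats):.3f}, "
        f"sd={statistics.stdev(p_hats):.3f}, "
        f"min={min(p_hats):.3f}, max={max(p_hats):.3f}"
    )
```

Both sample sizes give values centred near $p$, but the range and standard deviation for $n=1000$ are much narrower than for $n=50$.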

Worked example

Worked probability example
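As an illustration of how the formulas combine, here is one possible worked calculation. The values $p=0.2$ and $n=400$ are assumed for the example and are not taken from the syllabus.

Suppose $20\%$ of a population has a characteristic and a random sample of $400$ is taken. The standard deviation of $\hat{p}$ is:

$ \sigma_{\hat{p}}=\sqrt{\frac{0.2(1-0.2)}{400}}=\sqrt{0.0004}=0.02 $

To estimate the probability that the sample proportion is at least $0.23$, standardise using the normal approximation:

$ z=\frac{0.23-0.2}{0.02}=1.5 $

so:

$ P(\hat{p}\geq 0.23)\approx P(Z\geq 1.5)\approx 0.0668 $

Under these assumptions, roughly $7\%$ of samples of size $400$ would give a sample proportion of $0.23$ or more.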

Simulation

Simulation is useful because it makes sampling variation visible. If you repeatedly simulate binomial counts and divide by $n$, the histogram of $\hat{p}$ values should centre near $p$. As $n$ increases, the spread becomes smaller.

When describing a simulated sampling distribution, comment on centre, spread and shape. A strong response sounds like: "The simulated $\hat{p}$ values are centred close to $0.35$, the spread is about $0.07$ either side for most samples, and the distribution is roughly symmetric."
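A minimal sketch of how such a description could be backed by numbers: simulate the $\hat{p}$ values, compute a centre and a spread, and print a rough text histogram to judge the shape. The settings $n=50$, $p=0.35$ and $2000$ repeats are assumptions chosen for illustration.

```python
import random
import statistics
from collections import Counter

rng = random.Random(11)  # fixed seed so the run is repeatable
n, p, repeats = 50, 0.35, 2000  # assumed values chosen for illustration

# Each simulated sample: count successes in n trials, then divide by n.
p_hats = [sum(rng.random() < p for _ in range(n)) / n for _ in range(repeats)]

centre = statistics.mean(p_hats)
spread = statistics.stdev(p_hats)
# Simple shape check: for a roughly symmetric distribution the mean and median are close.
median = statistics.median(p_hats)

print(f"centre (mean) = {centre:.3f}, median = {median:.3f}")
print(f"spread (standard deviation) = {spread:.3f}")

# Text histogram: one row per observed value of p-hat, one '#' per 20 samples.
counts = Counter(p_hats)
for value in sorted(counts):
    print(f"{value:.2f} {'#' * (counts[value] // 20)}")
```

The printed centre and spread can be compared with $p$ and $\sqrt{\frac{p(1-p)}{n}}$, and the histogram rows give a visual basis for a comment on symmetry.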
