Estimating the mean

Sample mean, t-distribution, CIs

X ~ N(0, 1)

Help

Experiment with drawing samples from a standard normal population. Each sample's mean is recorded in the histogram, building up the sampling distribution of X̄.

Choose the sample size, then generate samples using the buttons. Animate shows one sample appearing dot by dot, with the mean dropping into the histogram at the end.

Toggle the display of the population curve, the sample values, and the histogram using the Population, Sample, and Histogram buttons.

Normal overlays the exact sampling distribution N(0, 1/n) on the histogram.

CI shows a t-based confidence interval for the population mean, constructed from the most recent sample.

Norm win shows the central (1−α) interval of the theoretical sampling distribution — the range where sample means should fall most of the time.

Hist win shows the same interval estimated empirically from the accumulated sample means.
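
The two windows can be imitated outside the app. The sketch below is only an illustration: the function names are hypothetical, and taking the empirical interval as the α/2 and 1−α/2 sample quantiles of the accumulated means is an assumption about how such a window might be computed, not a description of the app's own code.

```python
import random
from statistics import NormalDist, fmean


def norm_window(n, alpha=0.05):
    """Central (1 - alpha) interval of N(0, 1/n), the exact sampling distribution."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z / n ** 0.5
    return -half_width, half_width


def hist_window(means, alpha=0.05):
    """Empirical central (1 - alpha) interval (assumed method: sample quantiles)."""
    ordered = sorted(means)
    k = int(len(ordered) * alpha / 2)  # number of points trimmed from each tail
    return ordered[k], ordered[len(ordered) - 1 - k]


def simulate_means(n, reps, seed=0):
    """Means of `reps` samples of size n drawn from N(0, 1)."""
    rng = random.Random(seed)
    return [fmean([rng.gauss(0, 1) for _ in range(n)]) for _ in range(reps)]


means = simulate_means(n=25, reps=4000)
print("Norm win:", norm_window(25))
print("Hist win:", hist_window(means))
```

With only a few accumulated means the empirical window is noisy. It should settle towards the theoretical one as more samples are generated.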

Zoom x rescales the histogram axis to zoom in on the sampling distribution.

Reset the display via the Reset button, or by changing the sample size.

This app is inspired by the outstanding sampling-distribution app developed by David Lane: https://onlinestatbook.com/stat_sim/sampling_dist/

Overview

Suppose we collect a sample X1, …, Xn from a normal or approximately normal population with unknown mean μ. Our best single estimate of μ is the sample mean

X̄ = (1/n) Σ Xi

But how reliable is this estimate? If we repeated the experiment many times, each repetition would give a different X̄. The distribution of these X̄ values is called the sampling distribution of the mean. For a population with variance σ², the sampling distribution has mean μ and variance σ²/n, so its standard deviation σ/√n shrinks as n grows and larger samples give more precise estimates.
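
The σ²/n claim can be checked by simulation. The following is a minimal sketch assuming a N(0, 1) population, as in the app. The helper name sample_means and the chosen sample sizes are illustrative only.

```python
import random
from statistics import fmean, pvariance


def sample_means(n, reps, mu=0.0, sigma=1.0, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma^2) and return their means."""
    rng = random.Random(seed)
    return [fmean([rng.gauss(mu, sigma) for _ in range(n)]) for _ in range(reps)]


for n in (4, 16, 64):
    means = sample_means(n, reps=5000)
    # The variance of the simulated means should be close to sigma^2 / n = 1 / n.
    print(f"n={n:2d}  mean of Xbar = {fmean(means):+.3f}  "
          f"var of Xbar = {pvariance(means):.4f}  sigma^2/n = {1 / n:.4f}")
```

Quadrupling n should cut the variance of the sample mean to roughly a quarter, which is what the histogram in the app shows as it narrows.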

In practice we have only one sample, and we don't know σ. We estimate it with the sample standard deviation S (computed with divisor n−1). The key result is that, for a normal population, the standardised statistic

(X̄ − μ) / (S/√n)  ~  t(n−1)

follows a t-distribution with n−1 degrees of freedom. The t-distribution has heavier tails than the standard normal, reflecting the extra uncertainty from estimating σ. For large n it is nearly identical to the normal.
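
The heavier tails can be seen empirically. The sketch below assumes samples of size n = 5 from N(0, 1) and counts how often the standardised statistic exceeds two cutoffs: 1.960, which a standard normal exceeds in absolute value 5% of the time, and 2.776, the tabulated upper 0.025 quantile of t(4). The function names are illustrative.

```python
import random
from statistics import fmean, stdev


def t_statistics(n, reps, seed=0):
    """Simulate (Xbar - mu) / (S / sqrt(n)) for samples from N(0, 1), so mu = 0."""
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        stats.append(fmean(x) / (stdev(x) / n ** 0.5))  # stdev uses divisor n - 1
    return stats


def tail_fraction(values, cutoff):
    """Fraction of values whose absolute value exceeds the cutoff."""
    return sum(abs(v) > cutoff for v in values) / len(values)


ts = t_statistics(n=5, reps=20000)
# A standard normal puts 5% of its mass beyond +/-1.960; the t statistic puts more there.
print("P(|T| > 1.960):", tail_fraction(ts, 1.960))
# 2.776 is the tabulated cutoff that leaves 5% in the tails of t(4).
print("P(|T| > 2.776):", tail_fraction(ts, 2.776))
```

The first fraction should come out well above 0.05 and the second close to it, which is why the t cutoff rather than the normal one belongs in small-sample intervals.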

A two-sided 100(1−α)% confidence interval for μ is

X̄ ± t_{α/2, n−1} · S/√n

where t_{α/2, n−1} is the upper α/2 quantile of t(n−1), that is, the value that a t(n−1) variable exceeds with probability α/2. The CI says: if we repeated the sampling procedure many times, about 100(1−α)% of the resulting intervals would contain the true μ.
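
The repeated-sampling interpretation can be checked directly. This is a rough sketch assuming n = 10, α = 0.05 and a N(0, 1) population. The cutoff 2.262 is the tabulated upper 0.025 quantile of t(9), hard-coded because the Python standard library has no t quantile function. The names t_interval and coverage are illustrative.

```python
import random
from statistics import fmean, stdev

T_CRIT_DF9 = 2.262  # tabulated upper 0.025 quantile of t(9): valid for n = 10, alpha = 0.05


def t_interval(sample, t_crit):
    """Two-sided interval Xbar +/- t_crit * S / sqrt(n)."""
    n = len(sample)
    centre = fmean(sample)
    half_width = t_crit * stdev(sample) / n ** 0.5
    return centre - half_width, centre + half_width


def coverage(n=10, reps=10000, mu=0.0, t_crit=T_CRIT_DF9, seed=0):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        lo, hi = t_interval([rng.gauss(mu, 1) for _ in range(n)], t_crit)
        hits += lo <= mu <= hi
    return hits / reps


print("coverage with t cutoff:", coverage())
# Using the normal cutoff 1.960 with S in place of sigma gives intervals that are too narrow.
print("coverage with z cutoff:", coverage(t_crit=1.960))
```

The t-based coverage should land near 0.95, while the z-based coverage should fall short of it. Note that the cutoff must be changed to match n if a different sample size is tried.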

The Norm win overlay shows the central (1−α) region of the theoretical sampling distribution, which is centred on the true mean and uses the known σ² = 1. The CI, by contrast, is centred on X̄ and is constructed using only what a real experimenter would have: the sample itself.