

Learning Objectives

  • To learn what the sampling distribution of \[\overline{X}\] is when the sample size is large.
  • To learn what the sampling distribution of \[\overline{X}\] is when the population is normal.

In Example 6.1.1, we constructed the probability distribution of the sample mean for samples of size two drawn from the population of four rowers. The probability distribution is:

\[\begin{array}{c|c c c c c c c} \bar{x} & 152 & 154 & 156 & 158 & 160 & 162 & 164\\ \hline P[\bar{x}] &\dfrac{1}{16} &\dfrac{2}{16} &\dfrac{3}{16} &\dfrac{4}{16} &\dfrac{3}{16} &\dfrac{2}{16} &\dfrac{1}{16}\\ \end{array}\]
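The table can be reproduced by brute-force enumeration. The following is a minimal sketch, assuming the four rowers weigh 152, 156, 160, and 164 pounds and that the two rowers are drawn with replacement; neither assumption is stated in this section, but both are consistent with the sixteen equally likely samples implied by the table. The function name `sampling_distribution` is illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumed population: the four rowers' weights (not listed in this section).
population = [152, 156, 160, 164]

def sampling_distribution(values, n):
    """Exact distribution of the sample mean for n draws with replacement."""
    # Enumerate every ordered sample and count how often each mean occurs.
    counts = Counter(Fraction(sum(s), n) for s in product(values, repeat=n))
    total = len(values) ** n
    return {mean: Fraction(c, total) for mean, c in sorted(counts.items())}

dist = sampling_distribution(population, 2)
for mean, prob in dist.items():
    print(mean, prob)  # fractions print in lowest terms, e.g. 2/16 as 1/8
```

Exact fractions are used instead of floating-point numbers so the probabilities can be compared directly with the sixteenths in the table.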

Figure \[\PageIndex{1}\] shows a side-by-side comparison of a histogram for the original population and a histogram for this distribution. Whereas the distribution of the population is uniform, the sampling distribution of the mean has a shape approaching the shape of the familiar bell curve. This phenomenon of the sampling distribution of the mean taking on a bell shape even though the population distribution is not bell-shaped happens in general. Here is a somewhat more realistic example.

Figure \[\PageIndex{1}\]: Distribution of a Population and a Sample Mean

Suppose we take samples of size \[1\], \[5\], \[10\], or \[20\] from a population that consists entirely of the numbers \[0\] and \[1\], half the population \[0\], half \[1\], so that the population mean is \[0.5\]. The sampling distributions are:

\[n = 1\]:

\[\begin{array}{c|c c } \bar{x} & 0 & 1 \\ \hline P[\bar{x}] &0.5 &0.5 \\ \end{array} \nonumber\]

\[n = 5\]:

\[\begin{array}{c|c c c c c c} \bar{x} & 0 & 0.2 & 0.4 & 0.6 & 0.8 & 1 \\ \hline P[\bar{x}] &0.03 &0.16 &0.31 &0.31 &0.16 &0.03 \\ \end{array} \nonumber\]

\[n = 10\]:

\[\begin{array}{c|c c c c c c c c c c c} \bar{x} & 0 & 0.1 & 0.2 & 0.3 & 0.4 & 0.5 & 0.6 & 0.7 & 0.8 & 0.9 & 1 \\ \hline P[\bar{x}] &0.00 &0.01 &0.04 &0.12 &0.21 &0.25 &0.21 &0.12 &0.04 &0.01 &0.00 \\ \end{array} \nonumber\]

\[n = 20\]:

\[\begin{array}{c|c c c c c c c c c c c} \bar{x} & 0 & 0.05 & 0.10 & 0.15 & 0.20 & 0.25 & 0.30 & 0.35 & 0.40 & 0.45 & 0.50 \\ \hline P[\bar{x}] &0.00 &0.00 &0.00 &0.00 &0.00 &0.01 &0.04 &0.07 &0.12 &0.16 &0.18 \\ \end{array} \nonumber\]

and

\[\begin{array}{c|c c c c c c c c c c } \bar{x} & 0.55 & 0.60 & 0.65 & 0.70 & 0.75 & 0.80 & 0.85 & 0.90 & 0.95 & 1 \\ \hline P[\bar{x}] &0.16 &0.12 &0.07 &0.04 &0.01 &0.00 &0.00 &0.00 &0.00 &0.00 \\ \end{array} \nonumber\]

Histograms illustrating these distributions are shown in Figure \[\PageIndex{2}\].

Figure \[\PageIndex{2}\]: Distributions of the Sample Mean

As \[n\] increases the sampling distribution of \[\overline{X}\] evolves in an interesting way: the probabilities on the lower and the upper ends shrink and the probabilities in the middle become larger in relation to them. If we were to continue to increase \[n\] then the shape of the sampling distribution would become smoother and more bell-shaped.
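The tabulated probabilities can be checked with a short calculation. This sketch assumes the draws are independent (sampling with replacement, or from a population large enough that the difference is negligible), so the number of ones in a sample of size \[n\] is binomial with success probability \[0.5\]. The function name `mean_distribution` is hypothetical.

```python
from math import comb

def mean_distribution(n):
    """Pairs (x_bar, probability) for the mean of n fair 0/1 draws."""
    # k ones among n draws gives x_bar = k/n with probability C(n, k) / 2^n.
    return [(k / n, comb(n, k) / 2**n) for k in range(n + 1)]

for n in (1, 5, 10, 20):
    rounded = [round(p, 2) for _, p in mean_distribution(n)]
    print(f"n = {n}: {rounded}")
```

Rounding to two decimal places matches the precision of the tables, which is why the extreme values for the larger sample sizes appear as 0.00 even though they are not exactly zero.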

What we are seeing in these examples does not depend on the particular population distributions involved. In general, one may start with any distribution and the sampling distribution of the sample mean will increasingly resemble the bell-shaped normal curve as the sample size increases. This is the content of the Central Limit Theorem.

The Central Limit Theorem

For samples of size \[30\] or more, the sample mean is approximately normally distributed, with mean \[\mu _{\overline{X}}=\mu\] and standard deviation \[\sigma _{\overline{X}}=\dfrac{\sigma }{\sqrt{n}}\], where \[n\] is the sample size. The larger the sample size, the better the approximation. The Central Limit Theorem is illustrated for several common population distributions in Figure \[\PageIndex{3}\].

Figure \[\PageIndex{3}\]: Distribution of Populations and Sample Means

The dashed vertical lines in the figures locate the population mean. Regardless of the distribution of the population, as the sample size is increased the shape of the sampling distribution of the sample mean becomes increasingly bell-shaped, centered on the population mean. Typically by the time the sample size is \[30\] the distribution of the sample mean is practically the same as a normal distribution.
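One way to see this convergence numerically is a small simulation. The sketch below assumes a strongly skewed population, an exponential distribution with mean 1 and standard deviation 1, chosen purely for illustration. It estimates how often the sample mean falls within one \[\sigma _{\overline{X}}\] of \[\mu\] and compares that with the corresponding probability for a normal distribution. The function name, trial count, and seed are arbitrary choices.

```python
import random
from statistics import NormalDist, fmean

def fraction_within_one_se(n, trials=20000, seed=0):
    """Estimate P(|x_bar - mu| < sigma / sqrt(n)) for an exponential(1) population."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0          # mean and standard deviation of exponential(1)
    se = sigma / n ** 0.5         # standard deviation of the sample mean
    hits = 0
    for _ in range(trials):
        xbar = fmean(rng.expovariate(1.0) for _ in range(n))
        if abs(xbar - mu) < se:
            hits += 1
    return hits / trials

# Probability that a normal variable lies within one standard deviation of its mean.
normal_benchmark = NormalDist().cdf(1) - NormalDist().cdf(-1)

for n in (1, 5, 30):
    print(n, round(fraction_within_one_se(n), 3), round(normal_benchmark, 3))
```

For \[n = 1\] the estimate sits well above the normal benchmark because the population is skewed; by \[n = 30\] the two should be close, although the exact figures depend on the seed.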

The importance of the Central Limit Theorem is that it allows us to make probability statements about the sample mean, specifically in relation to its value in comparison to the population mean, as we will see in the examples. But to use the result properly we must first realize that there are two separate random variables (and therefore two probability distributions) at play:

  1. \[X\], the measurement of a single element selected at random from the population; the distribution of \[X\] is the distribution of the population, with mean the population mean \[\mu\] and standard deviation the population standard deviation \[\sigma\];
  2. \[\overline{X}\], the mean of the measurements in a sample of size \[n\]; the distribution of \[\overline{X}\] is its sampling distribution, with mean \[\mu _{\overline{X}}=\mu\] and standard deviation \[\sigma _{\overline{X}}=\dfrac{\sigma }{\sqrt{n}}\].
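The relationship between the two distributions can be verified exactly on a small population. The sketch below is an illustration under assumptions: it reuses the assumed rower weights 152, 156, 160, and 164, draws samples of size 3 with replacement, and uses a hypothetical helper `mean_and_variance`. It compares variances rather than standard deviations so that the arithmetic stays exact.

```python
from fractions import Fraction
from itertools import product

def mean_and_variance(values):
    """Exact mean and variance of a list of equally likely values."""
    values = list(values)
    mu = Fraction(sum(values), len(values))
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, var

population = [152, 156, 160, 164]   # assumed values, for illustration only
n = 3

# Distribution of X: the population itself.
mu_x, var_x = mean_and_variance(population)

# Distribution of X-bar: the mean of every ordered sample of size n.
sample_means = [Fraction(sum(s), n) for s in product(population, repeat=n)]
mu_xbar, var_xbar = mean_and_variance(sample_means)

print(mu_x, var_x)        # mean and variance of X
print(mu_xbar, var_xbar)  # mean and variance of X-bar
```

Taking square roots of the variance identity gives \[\sigma _{\overline{X}}=\dfrac{\sigma }{\sqrt{n}}\]. Note that this identity holds for every sample size; the Central Limit Theorem is only needed for the shape of the distribution.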

Example \[\PageIndex{1}\]

Let \[\overline{X}\] be the mean of a random sample of size \[50\] drawn from a population with mean \[112\] and standard deviation \[40\].

  1. Find the mean and standard deviation of \[\overline{X}\].
  2. Find the probability that \[\overline{X}\] assumes a value between \[110\] and \[114\].
  3. Find the probability that \[\overline{X}\] assumes a value greater than \[113\].

Solution:

  1. By the formulas in the previous section, \[\mu _{\overline{X}}=\mu=112 \nonumber\] and \[ \sigma_{\overline{X}}=\dfrac{\sigma}{\sqrt{n}}=\dfrac{40} {\sqrt{50}}\approx 5.65685 \nonumber\]
  2. Since the sample size is at least \[30\], the Central Limit Theorem applies: \[\overline{X}\] is approximately normally distributed. We compute probabilities using Figure 5.3.1 in the usual way, just being careful to use \[\sigma _{\overline{X}}\] and not \[\sigma\] when we standardize:

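\[\begin{align*} P[110<\overline{X}<114] &= P\left [ \dfrac{110-\mu _{\overline{X}}}{\sigma _{\overline{X}}}<Z<\dfrac{114-\mu _{\overline{X}}}{\sigma _{\overline{X}}} \right ]\\[4pt] &= P\left [ \dfrac{110-112}{5.65685}<Z<\dfrac{114-112}{5.65685} \right ]\\[4pt] &= P[-0.35<Z<0.35]\\[4pt] &= 0.6368-0.3632\\[4pt] &= 0.2736 \end{align*}\]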
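The probabilities asked for in this example can also be computed in software instead of from a table. This is a minimal sketch using `NormalDist` from the Python standard library and the normal approximation justified by the Central Limit Theorem. The variable names are illustrative. The results can differ in the third decimal place from table-based answers, which round \[z\] to two decimal places before the lookup.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 112, 40, 50

# Approximate sampling distribution of the sample mean (Central Limit Theorem).
xbar = NormalDist(mu=mu, sigma=sigma / sqrt(n))

p_between = xbar.cdf(114) - xbar.cdf(110)   # P(110 < X-bar < 114)
p_above = 1 - xbar.cdf(113)                 # P(X-bar > 113)

print(round(xbar.stdev, 5), round(p_between, 4), round(p_above, 4))
```

The key point is the same as in the hand computation: the distribution is built with \[\sigma _{\overline{X}}\], not \[\sigma\].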
