
A Confidence Interval for a Population Standard Deviation Unknown, Small Sample Case
In practice, we rarely know the population standard deviation. In the past, when the sample size was large, this did not present a problem to statisticians. They used the sample standard deviation s as an estimate for σ and proceeded as before to calculate a confidence interval with close enough results.
The point estimate for the standard deviation, s, was substituted for the population standard deviation in the formula for the confidence interval. In this case there were 80 observations, well above the suggested 30 observations to eliminate any bias from a small sample.
However, statisticians ran into problems when the sample size was small. A small sample size caused inaccuracies in the confidence interval. William S. Gosset (1876–1937) of the Guinness brewery in Dublin, Ireland ran into this problem. His experiments with hops and barley produced very few samples. Just replacing σ with s did not produce accurate results when he tried to calculate a confidence interval. He realized that he could not use a normal distribution for the calculation; he found that the actual distribution depends on the sample size. This problem led him to “discover” what is called the Student’s t-distribution. The name comes from the fact that Gosset wrote under the pen name “A Student.” Up until the mid-1970s, some statisticians used the normal distribution approximation for large sample sizes and used the Student’s t-distribution only for sample sizes of at most 30 observations.
If you draw a simple random sample of size n from a population with mean μ and unknown population standard deviation σ and calculate the t-score $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$, then the t-scores follow a Student’s t-distribution with n – 1 degrees of freedom. The t-score has the same interpretation as the z-score. It measures how far, in standard deviation units, $\bar{x}$ is from its mean μ. For each sample size n, there is a different Student’s t-distribution.
The degrees of freedom, n – 1, come from the calculation of the sample standard deviation s. Remember that when we first calculated a sample standard deviation we divided the sum of the squared deviations by n – 1, but we used n deviations to calculate s. Because the sum of the deviations is zero, we can find the last deviation once we know the other n – 1 deviations. The other n – 1 deviations can change or vary freely. We call the number n – 1 the degrees of freedom (df) in recognition that one is lost in the calculations. The effect of losing a degree of freedom is that the t-value increases and the confidence interval increases in width.
Properties of the Student’s t-Distribution
The graph for the Student’s t-distribution is similar to the standard normal curve, and at infinite degrees of freedom it is the normal distribution. You can confirm this by reading the bottom line of the t-table, at infinite degrees of freedom, for a familiar level of confidence: for example, at column 0.05 (95% level of confidence) we find the t-value of 1.96 at infinite degrees of freedom. The mean for the Student’s t-distribution is zero and the distribution is symmetric about zero, again like the standard normal distribution. The Student’s t-distribution has more probability in its tails than the standard normal distribution because the spread of the t-distribution is greater than the spread of the standard normal.
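The effect of the degrees of freedom on the t-value, and on the width of a confidence interval built from s, can be illustrated numerically. The following is a minimal sketch in Python using only the standard library. The function names (t_pdf, t_cdf, t_critical, t_confidence_interval) and the sample data are hypothetical, introduced here for illustration, and the critical values are approximated by numerical integration and bisection rather than read from a t-table.

```python
import math
from statistics import NormalDist, mean, stdev


def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] and symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_critical(confidence, df):
    """Two-sided critical value t* with P(-t* <= T <= t*) = confidence."""
    target = 0.5 + confidence / 2
    lo, hi = 0.0, 100.0  # assumed search bracket, wide enough for df >= 1 at 95%
    for _ in range(50):  # bisection on the cumulative probability
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_confidence_interval(sample, confidence=0.95):
    """Interval x_bar +/- t* * s / sqrt(n), with n - 1 degrees of freedom."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation, divides by n - 1
    margin = t_critical(confidence, n - 1) * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


if __name__ == "__main__":
    z = NormalDist().inv_cdf(0.975)  # normal value for 95% confidence
    print(f"normal: {z:.3f}")
    for df in (4, 29, 1000):
        print(f"df={df}: t* = {t_critical(0.95, df):.3f}")

    data = [4.1, 3.8, 4.4, 4.0, 3.7, 4.3]  # hypothetical small sample
    deviations = [x - mean(data) for x in data]
    print(f"sum of deviations: {sum(deviations):.10f}")  # zero up to rounding
    print("95% interval:", t_confidence_interval(data))
```

Run as a script, the sketch prints critical values that shrink toward the normal value of about 1.96 as the degrees of freedom grow, which is why a small sample gives a wider interval than the normal approximation would. The sum of the deviations is zero up to floating-point rounding, which is the constraint that costs one degree of freedom.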