Which statistical analysis tests whether there is a mean difference between two groups?
Single Group Tests: The z-test and the t-test
One comment I do want to make is on when to use which test. On page 165 of the Daniel text, there is a complex flow chart that one can follow to make a decision about which to use. Actually, it is much simpler than this. If the population standard deviation is known, use the z-test. If it is not known, use the t-test. Because of the central limit theorem, one doesn't have to worry about the normality of the distribution of the population. In the flow chart, Daniel is pointing out, in part, that one has to be cautious about using the single group t-test when there is a very small sample size and the population is not normally distributed. Most of the time, in real research, we don't know about the shape of the distribution of the population, so people use the t-test regardless of whether the population distribution is normal.
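The decision rule above can be sketched in code. This is a minimal illustration with made-up numbers, not an example from the Daniel text: the sample, the hypothesized mean, and the "known" population standard deviation are all assumptions for demonstration.

```python
import math
from statistics import mean, stdev

# Hypothetical sample of n = 16 measurements (illustrative numbers only)
scores = [98, 102, 96, 104, 100, 99, 103, 97,
          101, 105, 95, 100, 102, 98, 99, 105]
mu_0 = 100          # hypothesized population mean
n = len(scores)

# If the population standard deviation is known, use the z-test:
sigma = 4.0         # assumed known population SD (for illustration)
z = (mean(scores) - mu_0) / (sigma / math.sqrt(n))

# If it is not known, estimate it from the sample and use the
# t-test instead (with n - 1 degrees of freedom):
t = (mean(scores) - mu_0) / (stdev(scores) / math.sqrt(n))

print(f"z = {z:.3f}, t = {t:.3f}")
```

The only difference between the two ratios is the denominator: a known sigma versus the sample standard deviation.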
Between Groups t-test
I've already given you the conceptual explanation of the two-groups t-test in the previous lectures. To reiterate a bit, it is a matter of checking to see whether the difference between two groups is one that is likely to occur by chance. If the difference between the means of the two groups is large relative to what we would expect to occur from sample to sample, we consider the difference to be significant. If the difference between the group means is small relative to the amount of sampling variability, the difference will not be significant.
We can think of the formula conceptually as a ratio like this:
t = (difference between the group means) / (sampling variability)
When the value on the top of the equation is large, or the value on the bottom of the equation is small, the overall ratio will be large. The larger the value of t, the farther out on the sampling curve it will be, and, thus, the more likely it will be significant.
The Standard Error of the Estimate
The value on the bottom of the equation is designed to gauge sampling variability. As in the previous lectures, we estimate sampling variability with the standard error. The standard error in this case can be more formally called the "standard error of the estimate," because we are gauging the sampling variability of an estimate of the difference between means. The standard error of the estimate represents the standard deviation of the sampling distribution when the sampling distribution is based on a large number of mean differences. You can imagine a sampling distribution created by taking repeated samples of two groups and computing the difference between the means in each.
Because we are not going to actually construct a sampling distribution, we have to estimate the standard error. To estimate the standard error, we use the standard deviation of the sample we have. The standard deviation represents the variability of scores in the sample. If we know something about how much scores in the sample tend to vary, we can make a guess about how much variation we would expect in the sampling distribution. Imagine drawing scores from a population with very low variability. All the scores would be similar to one another, so the samples drawn from that population would not vary much.
To make our guess at the standard error of mean differences, we need to use the variability of scores within the two groups. Because we have two groups, we need to find their variances and "pool" them together. The variances of the two groups will be symbolized by s1² and s2², and n1 and n2 are the sample sizes for each of the two groups. The pooled variance weights each group's variance by its degrees of freedom:

sp² = [(n1 - 1)s1² + (n2 - 1)s2²] / (n1 + n2 - 2)

With the pooled variance, we can then estimate the standard error. The standard error is symbolized by s(x̄1 - x̄2) and is computed as

s(x̄1 - x̄2) = sqrt(sp²/n1 + sp²/n2)
Now, all we need to do is to use that value in our ratio, and presto, we have a t-test.
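The whole computation can be laid out in a short sketch. The two groups of scores here are hypothetical numbers chosen for illustration; the pooling and the ratio follow the description above.

```python
import math
from statistics import mean, variance

# Hypothetical scores for two independent groups (illustrative only)
group1 = [12, 14, 11, 15, 13, 16, 12, 14, 15, 13]   # n1 = 10
group2 = [10, 11, 9, 12, 10, 13, 11, 9, 12, 10]     # n2 = 10

n1, n2 = len(group1), len(group2)
s1_sq, s2_sq = variance(group1), variance(group2)   # sample variances

# Pool the two variances, weighting each by its degrees of freedom
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# Standard error of the difference between means
se = math.sqrt(sp_sq / n1 + sp_sq / n2)

# The t ratio: mean difference over sampling variability
t = (mean(group1) - mean(group2)) / se
print(f"t = {t:.3f} with {n1 + n2 - 2} degrees of freedom")
```

Note that `statistics.variance` computes the sample variance (dividing by n - 1), which is what the pooling formula expects.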
The Statistical Hypotheses for the t-test
For the two-tailed test, the null hypothesis states that the means of the two populations are equal (H0: μ1 = μ2), and the alternative hypothesis states that they are not (HA: μ1 ≠ μ2).
How to Conduct the Between Groups t-test
As with all tests we conduct, we will compute a value and then compare it to a value in the table in the back of the book. Note that Daniel calls the tabled value the reliability coefficient. (Other books refer to it as the critical value or the tabled value.) When we are making this comparison, we are looking to see how far out on the sampling curve our sample is likely to be. Remember this picture?
If our calculated value exceeds the one in the table (i.e., the reliability coefficient), we have significance. The reliability coefficient is denoted by t.975 when alpha is equal to .05 and we are using a two-tailed test. More generally, the reliability coefficient will be referred to as t(1-α/2). We reject the null hypothesis that the two groups are equal and accept the alternative hypothesis that states that they are different. This means that we can be 95% sure that the difference between the two groups is not simply due to chance.
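The comparison itself is a one-liner. Here the calculated t and the degrees of freedom are hypothetical; the tabled value 2.101 is t.975 with 18 degrees of freedom, as found in a standard t-table.

```python
# Suppose a between-groups t-test on two samples of 10 gave
# t = 4.275 with n1 + n2 - 2 = 18 degrees of freedom
# (hypothetical numbers for illustration).
t_calculated = 4.275

# Reliability coefficient (tabled/critical value) from a t-table:
# t.975 with 18 df, i.e., alpha = .05, two-tailed
t_table = 2.101

# Significant if the calculated value exceeds the tabled value;
# abs() handles the lower tail of the two-tailed test
significant = abs(t_calculated) > t_table
print("reject H0" if significant else "fail to reject H0")
```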
The Confidence Interval
To calculate the confidence interval, we find two values: the lower confidence limit (LCL) and the upper confidence limit (UCL). These are obtained from values we already have: the difference between our sample means, the tabled t-value (reliability coefficient), and the standard error:

LCL = (mean1 - mean2) - t(1-α/2) × (standard error)
UCL = (mean1 - mean2) + t(1-α/2) × (standard error)
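A quick numerical sketch of the limits, using hypothetical values (a mean difference, a standard error, and the tabled t.975 for 18 degrees of freedom):

```python
# Hypothetical values for illustration: two groups of 10
mean_diff = 2.8     # difference between the sample means
se = 0.655          # estimated standard error of the difference
t_table = 2.101     # t.975 with n1 + n2 - 2 = 18 df, alpha = .05

# The limits are the mean difference plus or minus the tabled
# t-value times the standard error
lcl = mean_diff - t_table * se
ucl = mean_diff + t_table * se
print(f"95% CI for the mean difference: ({lcl:.2f}, {ucl:.2f})")
```

If the interval does not contain zero, as here, the difference is significant at the .05 level, which agrees with the two-tailed t-test.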
The next "lecture" is an example with numbers.