are two averages different?
John Kitchin 1/28/2012
Class A had 30 students who received an average test score of 78, with standard deviation of 10. Class B had 25 students an average test score of 85, with a standard deviation of 15. We want to know if the difference in these averages is statistically relevant. Note that we only have estimates of the true average and standard deviation for each class, and there is uncertainty in those estimates. As a result, we are unsure if the averages are really different. It could have just been luck that a few students in class B did better.
the true averages are the same. We need to perform a two-sample t-test of the hypothesis that (this is often called the null hypothesis). we use a two-tailed test because we do not care if the difference is positive or negative, either way means the averages are not the same.
n1 = 30; % students in class A x1 = 78; % average grade in class A s1 = 10; % std dev of exam grade in class A n2 = 25; % students in class B x2 = 85; % average grade in class B s2 = 15; % std dev of exam grade in class B
this is the standard error of the difference between the two averages. Note the standard error of the difference is less than the standard error of the individual averages.
SE = sqrt(s1^2/n1 + s2^2/n2)
SE = 3.5119
see the discussion at http://stattrek.com/Help/Glossary.aspx?Target=Two-sample%20t-test for a more complex definition of degrees of freedom. Here we simply subtract one from each sample size to account for the estimation of the average of each sample.
DF = (n1 - 1) + (n2 - 1)
DF = 53
The difference between two averages determined from small sample numbers follows the t-distribution. the t-score is the difference between the difference of the means and the hypothesized difference of the means, normalized by the standard error. we compute the absolute value of the t-score to make sure it is positive for convenience later.
tscore = abs(((x1 - x2)-0)/SE)
tscore = 1.9932
A way to approach determinining if the difference is significant or not is to ask, does our computed average fall within a confidence range of the hypothesized value (zero)? If it does, then we can attribute the difference to statistical variations at that confidence level. If it does not, we can say that statistical variations do not account for the difference at that confidence level, and hence the averages must be different.
Lets compute the t-value that corresponds to a 95% confidence level for a mean of zero with the degrees of freedom computed earlier. This means that 95% of the t-scores we expect to get will fall within t95.
ci = 0.95; alpha = 1 - ci; t95 = tinv(1-alpha/2, DF)
t95 = 2.0057
since tscore < t95, we conclude that at the 95% confidence level we cannot say these averages are statistically different because our computed t-score falls in the expected range of deviations. Note that our t-score is very close to the 95% limit. Let's consider a smaller confidence interval.
ci = 0.94; alpha = 1 - ci; t94 = tinv(1-alpha/2, DF)
t94 = 1.9219
at the 94% confidence level, however, tscore > t94, which means we can say with 94% confidence that the two averages are different; class B performed better than class A did. Alternatively, there is only about a 6% chance we are wrong about that statement.
An alternative way to get the confidence that the averages are different is to directly compute it from the cumulative t-distribution function. We compute the difference between all the t-values less than tscore and the t-values less than -tscore, which is the fraction of measurements that are between them. You can see here that we are practically 95% sure that the averages are different.
f = tcdf(tscore,DF)-tcdf(-tscore,DF)
f = 0.9486
'done' % categories: data analysis
ans = done