4 min read • June 18, 2024
Josh Argo
Jed Quiaoit
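A two-sample t-test is used to determine whether the means of two independent groups differ significantly. It is a parametric test: it assumes that each sample comes from an approximately normal population, or is large enough for the sampling distribution of the difference in means to be approximately normal. The pooled version of the test also assumes that the two groups have equal variances; the unpooled version does not.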
The test takes the difference between the two sample means and divides it by the standard error of that difference to produce a t-statistic. The t-statistic is then used to find the p-value: the probability of observing a difference at least as extreme as the one in our samples, assuming the null hypothesis is true.
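The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the unpooled standard error (no equal-variance assumption); the function name `two_sample_t` and the summary statistics are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled two-sample t-statistic from summary statistics (illustrative helper)."""
    # Standard error of the difference between the two sample means
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    # Number of standard errors the observed difference lies from 0
    return (mean1 - mean2) / se

# Hypothetical summary statistics, not taken from any real data set
t = two_sample_t(mean1=52.0, sd1=4.0, n1=16, mean2=50.0, sd2=3.0, n2=9)
print(round(t, 3))  # 1.414
```

Here each sample contributes a variance of 1 to the standard error, so the difference of 2 sits about 1.41 standard errors away from 0.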
As with any statistical test, the first step in performing the significance test is to write our hypotheses. We always have a null hypothesis, which states that the two population means are equal. Then we have our alternative hypothesis, which states that they differ in some way (one is less than, greater than, or simply not equal to the other).
When writing out your hypotheses, you should state them as follows: 📝
Ho: 𝞵1 = 𝞵2
Ha: 𝞵1 ≠ 𝞵2, 𝞵1 < 𝞵2, or 𝞵1 > 𝞵2
Another way of writing them using differences is as follows:
Ho: 𝞵1 - 𝞵2 = 0
Ha: 𝞵1 - 𝞵2 > 0, 𝞵1 - 𝞵2 < 0, or 𝞵1 - 𝞵2 ≠ 0
The first option is more in line with the technology used to compute the test statistic and p-value, which we will cover in Unit 7.9.
Also, as with any significance test, there are conditions for inference that we must check to make sure the test can support accurate conclusions about the populations.
When drawing our two samples for the t-test, it is imperative that they are random samples from their respective populations. If they are not random, we cannot generalize our results to the two populations, which renders the test useless; no calculation can fix sampling bias after the fact. If the test involves an experimental study, we should instead note that the treatments were randomly assigned, which allows us to draw a conclusion about causation. ☑️
Since we are normally sampling without replacement, we also need to make sure the observations in each sample can be treated as independent. This can be assumed under the 10% condition: as long as each population is at least 10 times as large as its sample, we can assume independence. In an experimental study, the 10% condition is not needed, since we are randomly assigning treatments rather than sampling from a population. ☑️
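The 10% condition is a simple arithmetic check. The sketch below assumes we know (or can reasonably bound) the population size; the function name and the sizes used are hypothetical.

```python
def meets_ten_percent_condition(sample_size, population_size):
    """Return True when the population is at least 10 times the sample size."""
    return sample_size * 10 <= population_size

# Hypothetical sample and population sizes, for illustration only
print(meets_ten_percent_condition(120, 2000))  # True
print(meets_ten_percent_condition(120, 1000))  # False
```

In a two-sample setting, the check would be run once for each sample against its own population.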
When calculating our p-value, we will use the t curve to find the probability of obtaining samples like ours. To ensure that we can use the t curve for our sampling distribution, we must check that, for each group, either the population is approximately normally distributed or the sample size is at least 30 (so that the Central Limit Theorem applies). ☑️
Mr. Fleck runs a green bean farm. He has two fields that he normally picks from. Every day, he goes out and picks green beans from both fields, and he has found that the two fields appear to be yielding different amounts of crops. To test his theory, he randomly selects 120 days to pick from both fields. Field A yields an average of 580 beans with a standard deviation of 25, while Field B yields an average of 550 with a standard deviation of 12. Do the data give convincing evidence that the two fields yield different amounts of beans? 🌽
Our null hypothesis is that the two fields yield equal amounts of beans, so Ho: 𝞵A = 𝞵B, where 𝞵A is the true mean number of beans that come from Field A every day and 𝞵B is the true mean number of beans that come from Field B every day.
Our alternative hypothesis is that these two means are different, so Ha: 𝞵A ≠ 𝞵B. 🫘
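To see where these hypotheses lead, here is a rough sketch of the test statistic for Mr. Fleck's data. It assumes a sample size of 120 for each field (one observation per field per selected day) and uses the unpooled standard error. The Welch-Satterthwaite degrees-of-freedom approximation is a standard formula but is not stated in the example itself, so treat it as an added assumption.

```python
import math

# Summary statistics from the example; n = 120 per field is an assumption
mean_a, sd_a, n_a = 580, 25, 120
mean_b, sd_b, n_b = 550, 12, 120

# Each field's contribution to the variance of the difference in sample means
var_a = sd_a**2 / n_a
var_b = sd_b**2 / n_b

# Standard error of the difference, then the t-statistic under Ho (difference of 0)
se = math.sqrt(var_a + var_b)
t_stat = (mean_a - mean_b) / se

# Welch-Satterthwaite approximation for the degrees of freedom (assumed, not from the text)
df = (var_a + var_b) ** 2 / (var_a**2 / (n_a - 1) + var_b**2 / (n_b - 1))

print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

Under these assumptions the t-statistic comes out to about 11.85 with roughly 171 degrees of freedom. A value that far from 0 corresponds to a very small two-sided p-value, which would be found from the t curve using technology.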
© 2024 Fiveable Inc. All rights reserved.