5.5 Sampling Distributions for Sample Proportions

4 min read · June 18, 2024

Josh Argo

Jed Quiaoit

Brianna Bukowski

Formulas

You can usually tell that a problem calls for sample proportions when it gives you a probability or a percentage rather than a mean. For a population proportion p and samples of size n, the sampling distribution of the sample proportion p̂ has a mean equal to p and a standard deviation equal to sqrt(p(1-p)/n). All formulas in this section can be found on page 2 of the given formula sheet. 🤓

Source: (NEW) AP Statistics Formula Sheet
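To see the formulas in action, here is a minimal Python sketch. It assumes the standard formula sheet entries (mean = p, standard deviation = sqrt(p(1-p)/n)); the function name `sample_proportion_distribution` and the example values p = 0.5 and n = 100 are made up for illustration.

```python
import math

def sample_proportion_distribution(p, n):
    """Return (mean, standard deviation) of the sampling distribution of p-hat.

    Hypothetical helper for illustration; p is the population proportion
    and n is the sample size.
    """
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    mean = p                              # center: equal to the population proportion
    std_dev = math.sqrt(p * (1 - p) / n)  # spread: shrinks as n grows
    return mean, std_dev

# Illustrative values only (not from the formula sheet)
mean, std_dev = sample_proportion_distribution(p=0.5, n=100)
print(f"mean = {mean}, standard deviation = {std_dev:.2f}")
```

Because n sits in the denominator under the square root, quadrupling the sample size cuts the standard deviation in half.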

Large Counts Condition

Before you can use a sampling distribution for sample proportions to make inferences about a population proportion, you need to check that the sample meets certain conditions. One of these conditions is the large counts condition, which states that the sample size should be large enough for the distribution of the sample proportion to be approximately normal.

The large counts condition can be expressed as np ≥ 10 and n(1-p) ≥ 10, where n is the sample size and p is the population proportion. This means that both the expected number of successes (np) and the expected number of failures (n(1-p)) in the sample should be at least 10. If both inequalities hold, you can treat the sampling distribution for the sample proportion as approximately normal and use statistical techniques that rely on normality, such as confidence intervals or hypothesis tests (we'll cover these in future sections!). ✔️

To justify a normal shape for a sampling distribution of sample means, you can appeal to the Central Limit Theorem, but for sample proportions you must always check the Large Counts Condition.
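The large counts check described above is just two inequalities, so it is easy to sketch in Python. The function name `large_counts_ok` and the example values are assumptions chosen for illustration, not part of the AP formula sheet.

```python
def large_counts_ok(p, n, threshold=10):
    """Check the large counts condition: np >= 10 and n(1 - p) >= 10.

    Hypothetical helper; p is the proportion, n is the sample size.
    """
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return expected_successes >= threshold and expected_failures >= threshold

# Expected counts of 600 and 400: both are at least 10
print(large_counts_ok(p=0.6, n=1000))

# Only 2 expected successes: the normal approximation is not justified
print(large_counts_ok(p=0.02, n=100))
```

Note that a large n alone is not enough: when p is very close to 0 or 1, one of the two expected counts can still fall below 10.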

Practice Problem

Suppose that you are conducting a survey to estimate the proportion of people in your town who support a new public transportation system. You decide to use a simple random sample of 1000 people, and you ask them whether or not they support the new system. After collecting the data, you find that 600 people out of the 1000 respondents support the system.

a) Calculate the sample proportion of respondents who support the new system. 🚂

b) Explain what the sampling distribution for the sample proportion represents and why it is useful in this situation.

c) Suppose that the true population proportion of people who support the new system is actually 0.6. Describe the shape, center, and spread of the sampling distribution for the sample proportion in this case.

d) Explain why the Central Limit Theorem applies to the sampling distribution for the sample proportion in this situation.

e) Calculate a 95% confidence interval for the population proportion of people who support the new system based on the sample data. (Optional for now, but feel free to answer if you already checked out the section on confidence intervals!)

f) Discuss one potential source of bias that could affect the results of this study, and explain how it could influence the estimate of the population proportion.

Answers

a) The sample proportion of respondents who support the new system is 600/1000 = 0.6.

b) The sampling distribution for the sample proportion represents the distribution of values the sample proportion would take if the study were repeated many times with new random samples of the same size. It is useful in this situation because it tells us how much sample proportions vary from sample to sample, which allows us to make inferences about the population proportion based on the sample data.

c) If the true population proportion of people who support the new system is 0.6, the sampling distribution for the sample proportion is approximately normal (shape), centered at 0.6 (center), with a standard deviation of sqrt(0.6(1-0.6)/1000) ≈ 0.0155 (spread).

d) The sample proportion is the mean of 1000 individual yes/no (1/0) responses, so the Central Limit Theorem applies: with a sample this large, the sampling distribution is approximately normal even though the individual responses are not normally distributed. For proportions, we confirm this with the Large Counts Condition: np = 1000(0.6) = 600 ≥ 10 and n(1-p) = 1000(0.4) = 400 ≥ 10.

e) A 95% confidence interval for the population proportion of people who support the new system can be calculated as 0.6 +/- (1.96 * sqrt((0.6(1-0.6))/1000)). This gives a confidence interval of (0.570, 0.630).

f) One potential source of bias in this study could be nonresponse bias, which occurs when certain groups of individuals are more or less likely to respond to the survey. For example, if people who support the new system are more likely to respond to the survey, the sample could be biased toward higher levels of support and produce an overestimate of the population proportion. On the other hand, if people who do not support the new system are more likely to respond, the sample could be biased toward lower levels of support and produce an underestimate of the population proportion.
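If you want to check answers c and e numerically, the Python sketch below may help. It assumes the true proportion is 0.6 (as in part c), simulates many random samples of 1000 respondents, and then recomputes the interval from part e. The function names, the number of simulated samples (2000), and the random seed are all choices made for illustration.

```python
import math
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n; return each sample proportion."""
    rng = random.Random(seed)  # fixed seed so reruns give the same draws
    proportions = []
    for _ in range(num_samples):
        # Each respondent supports the system with probability p
        successes = sum(rng.random() < p for _ in range(n))
        proportions.append(successes / n)
    return proportions

def proportion_confidence_interval(successes, n, z_star=1.96):
    """One-sample z interval for a proportion: p-hat +/- z* sqrt(p-hat(1 - p-hat)/n)."""
    p_hat = successes / n
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Part c: compare the simulation with the theoretical center and spread
p_hats = simulate_sample_proportions(p=0.6, n=1000, num_samples=2000)
theoretical_sd = math.sqrt(0.6 * (1 - 0.6) / 1000)
print(f"simulated mean: {statistics.mean(p_hats):.3f} (theory: 0.600)")
print(f"simulated sd:   {statistics.stdev(p_hats):.4f} (theory: {theoretical_sd:.4f})")

# Part e: 95% interval from 600 supporters out of 1000 respondents
low, high = proportion_confidence_interval(successes=600, n=1000)
print(f"95% interval: ({low:.3f}, {high:.3f})")
```

The simulated mean and standard deviation should land close to 0.6 and 0.0155, and a histogram of `p_hats` would look roughly bell-shaped, which is what the Large Counts Condition predicts.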