5 min read•june 18, 2024
Josh Argo
Jed Quiaoit
Statistics is all about data. We collect data sets, analyze them, and ultimately use them to make inferences about a larger population of individuals.
Have you ever wondered what the average AP score was? Or perhaps the average number of bananas per bunch purchased at the grocery store? Both of these are examples of quantitative data because each individual is assigned a quantity. Whether we are assigning each test taker an AP score or each purchased bunch a number of bananas, each individual being measured is assigned a number. One of the big giveaways for quantitative data is that we can take the mean, or the average, of the data set. In other words, quantitative data is average-able. 📲
EXAMPLE: You have taken 5 exams in your math class and you want to know your average score. The scores on the exams are as follows:
Exam 1: 80
Exam 2: 90
Exam 3: 70
Exam 4: 85
Exam 5: 75
To find the average, you need to add up all of the exam scores and then divide by the total number of exams. In this case, the total score is 80 + 90 + 70 + 85 + 75 = 400, and the total number of exams is 5.
Therefore, the average exam score is 400 / 5 = 80; in this example, your average exam score is 80.
This is a very simple example; in practice, you may encounter more complex problems that involve larger data sets and more variables. However, the basic principle of finding the average by summing the values and dividing by the count remains the same.
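If you would like to check this arithmetic with a computer, here is a minimal Python sketch of the same calculation. The variable names (`exam_scores`, `average`) are illustrative choices, not part of the original example.

```python
from statistics import mean

# The five exam scores from the example above
exam_scores = [80, 90, 70, 85, 75]

# Step 1: add up all of the scores
total = sum(exam_scores)

# Step 2: divide by the number of exams
count = len(exam_scores)
average = total / count

print(f"Total: {total}, number of exams: {count}, average: {average}")

# The standard library's mean() performs the same sum-and-divide calculation
library_average = mean(exam_scores)
```

Both approaches give the same result, which is a handy way to double-check hand calculations on larger data sets.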
💡 Quantitative data uses means, or averages, to make inference!
On the flip side, we have categorical data. Have you ever asked a group of people whether they like coffee? What about what their favorite vegetable is? How about whether they prefer 🍩 or 🍪 for dessert? Each of these surveys would produce categorical data, because each individual chooses a category: do you fall into the 🍩 or the 🍪 category? Because the data is separated into categories rather than numbers, it is impossible to calculate the average dessert preference. After all, it would not make sense to make a statement like "the average dessert preference is a cookie." Instead, we typically summarize categorical data sets using measures like proportions. It makes a lot more sense to make a statement like "the proportion of people who prefer cookies is 0.65."
Here are some examples of statements outlining categorical data using proportions:
💡 Categorical data uses percentages, or proportions, to make inference.
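To see where a proportion like 0.65 could come from, here is a small Python sketch. The survey responses are made up for illustration (20 hypothetical people, 13 of whom prefer cookies); they are not real data.

```python
from collections import Counter

# Hypothetical survey: each person names the dessert they prefer
responses = ["cookie"] * 13 + ["donut"] * 7

# Count how many individuals fall into each category
counts = Counter(responses)
n = len(responses)

# A proportion is the category count divided by the total number of responses
proportions = {category: count / n for category, count in counts.items()}

for category, proportion in proportions.items():
    print(f"The proportion of people who prefer {category}s is {proportion:.2f}")
```

Notice that the code never averages the responses themselves; it only counts how many fall into each category, which is exactly why proportions suit categorical data.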
In practice, statistics is used in a wide range of fields, including business, economics, biology, psychology, social sciences, and many others. It is a powerful tool for understanding and interpreting real-world phenomena, and is used to inform decision-making, policy-making, and research in a variety of contexts. 📈
Some common tasks in statistics include:
One of the major things that will feel very different in this course, compared with other mathematics courses you have taken, is the way in which you record your answers. In an Algebra or Calculus course, it is sufficient to say "x = 5" when that is your answer. In AP Statistics, it is a good idea to get in the habit of tying your answer to the specific context of the problem you are working on. Instead of simply saying "x = 5," make your answer more specific by saying something like "the average number of bananas per bunch is 5." 🧩
💡 Our goal in statistics is not just to find the correct answer, but to communicate our findings to our audience so that the answer is useful in making further predictions.
Perhaps the biggest concept and skill of this first unit is being able to describe data. In quantitative data, this consists of four main parts: center, outliers, spread, and shape. It is also important to include context in your answer. 💠
For example, if we had a set of data on the number of bananas per bunch purchased, a model response may look like the following: "The mean number of bananas purchased was 5 bananas. There was one outlier when a customer purchased a bunch of 12 bananas. The shape of our data distribution was fairly symmetric. The range of bananas per bunch was 10, with the largest bunch being 12 and the smallest bunch being 2."
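Here is a rough Python sketch of how the numbers in a response like this could be computed. The data set is hypothetical, chosen so that it matches the figures in the model response (mean 5, smallest 2, largest 12). The text does not say how the outlier was identified, so the sketch assumes the common 1.5 × IQR rule. Shape is best judged from a graph such as a dotplot or histogram, so it is not computed here.

```python
from statistics import mean, quantiles

# Hypothetical data: number of bananas in each purchased bunch
bananas_per_bunch = [2, 3, 3, 4, 4, 5, 5, 6, 6, 12]

# Center: the mean
center = mean(bananas_per_bunch)

# Spread: the range (largest value minus smallest value)
smallest = min(bananas_per_bunch)
largest = max(bananas_per_bunch)
data_range = largest - smallest

# Outliers: assumed 1.5 x IQR rule (one common convention, not the only one)
q1, _median, q3 = quantiles(bananas_per_bunch, n=4)
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in bananas_per_bunch if x < low_fence or x > high_fence]

# Always report the results in context
print(f"The mean number of bananas per bunch was {center}.")
print(f"The range was {data_range}, from {smallest} to {largest} bananas.")
print(f"Bunch sizes flagged as outliers: {outliers}")
```

Different textbooks and software compute quartiles in slightly different ways, so the fences may shift a little depending on the method used.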
For categorical data, this process may look different. It is usually more valuable with categorical data to discuss which category was most likely to happen and which was least likely to happen. For example, a description could look like this: "Our most likely outcome was people who prefer donuts, with a proportion of 0.45, and our least likely outcome was people who prefer cookies, with a proportion of 0.15." 👨‍🍳
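A description like this can also be sketched in Python. The counts below are hypothetical (20 people, with a made-up "cake" category filling in the remaining responses) and were chosen only to reproduce the proportions 0.45 and 0.15 from the example.

```python
# Hypothetical raw counts for each dessert category
counts = {"donut": 9, "cake": 8, "cookie": 3}
total = sum(counts.values())

# Convert raw counts into proportions
proportions = {category: count / total for category, count in counts.items()}

# Find the categories with the largest and smallest proportions
most_likely = max(proportions, key=proportions.get)
least_likely = min(proportions, key=proportions.get)

print(
    f"Our most likely outcome was people who prefer {most_likely}s "
    f"with a proportion of {proportions[most_likely]:.2f} "
    f"({counts[most_likely]} of {total} people)."
)
print(
    f"Our least likely outcome was people who prefer {least_likely}s "
    f"with a proportion of {proportions[least_likely]:.2f} "
    f"({counts[least_likely]} of {total} people)."
)
```

Printing the raw count next to each proportion gives the reader both views of the same category.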
Sometimes it is also beneficial with categorical data to discuss raw counts rather than proportions. However, it is more likely that the AP exam will ask you to describe a distribution of a quantitative data set rather than a categorical data set. For more information on content from Unit 1, check the link below! 🏃‍♂️
🎥 Watch: AP Stats - Unit 1 Streams
© 2024 Fiveable Inc. All rights reserved.