# Descriptive Statistics

Students analyze contextual situations, focusing on single variable data and bivariate data, and are introduced to the concept of using data to make predictions and judgments about a situation.

## Unit Summary

In Unit 2: Statistics, students continue to analyze contextual situations, but in this unit, they focus on single variable data and then bivariate data. This is the first unit where students are introduced to the concept of using data to make predictions and judgments about a situation. Univariate data is described through shape, center, and spread by using mathematical calculations to support reasoning. Students begin to make judgments about whether data is consistent (analysis of spread) and whether mean or median is a better representation of a situation (center). Bivariate data is analyzed for whether the variables are related (correlation) and whether a linear model is the best function to fit a set of data (analysis of residuals), and students develop a linear model that can be used to predict future events. In Unit 2, students are introduced to the modeling cycle and complete a project on univariate data analysis and another on bivariate data analysis.

Unit 2 begins with analyzing and describing univariate data. Students expand on their knowledge of shape, center, and spread from 6th and 7th grade to further interpret and calculate measures of spread—learning about variance and standard deviation. Students capitalize on previous understandings of measures of center and different graphical representations to formalize their knowledge of which measures of center, shape, and spread are used in conjunction with one another, and how these help to inform the “big picture” of the data set they represent. A three-day project culminates study of this topic.
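For teachers previewing the calculations, the measures of center and spread named above can be sketched in a few lines of Python. This is a minimal illustration and not part of the unit materials. The `scores` list is invented, quartiles are taken as the medians of the lower and upper halves (one common classroom convention, assumed here), and the population formulas for variance and standard deviation are used.

```python
import statistics

# Hypothetical data set (invented for illustration): ten quiz scores.
scores = [4, 6, 7, 7, 8, 8, 9, 10, 10, 21]

# Measures of center.
mean = statistics.mean(scores)
median = statistics.median(scores)

# Measures of spread, using the population formulas (divide by n).
pop_variance = statistics.pvariance(scores)
pop_std_dev = statistics.pstdev(scores)

# Quartiles as medians of the lower and upper halves (assumed convention).
data = sorted(scores)
half = len(data) // 2
q1 = statistics.median(data[:half])
q3 = statistics.median(data[len(data) - half:])
iqr = q3 - q1

# Flag possible outliers with the 1.5 * IQR rule.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print("mean:", mean, "median:", median)
print("variance:", pop_variance, "std dev:", round(pop_std_dev, 2))
print("IQR:", iqr, "outliers:", outliers)
```

Here the mean (9) sits above the median (8) because the single value 21 pulls it upward, which is the kind of evidence students can cite when arguing that the median better represents a skewed data set. If the class works with sample statistics instead, `statistics.variance` and `statistics.stdev` divide by n - 1 rather than n.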

In Unit 2, students also dive deeper into bivariate data: identifying categorical and numerical data and choosing representations that match the data presented. Two-way tables are used to represent categorical data. Students calculate relative and conditional frequencies in two-way tables, expanding on their understanding of two-way tables from 8th grade. Scatterplots are explored heavily in this unit, and students connect what they know about association from 8th grade to correlation in Algebra 1. Students base their understanding of regression on their previous learning about line of best fit. Students also learn to assess the validity of the model they have used (whether linear or another function) by using residuals. A three-day project, for which a loose framework is provided, culminates this topic.
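The link between line of best fit, correlation, and residuals can be sketched as follows. This is a hedged illustration only: the `hours` and `score` lists are invented, the helper names `fit_line` and `correlation` are hypothetical, and the line is fitted with the standard least-squares formulas rather than any particular classroom tool.

```python
import math

# Hypothetical data (invented for illustration): hours studied vs. quiz score.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def correlation(xs, ys):
    """Return the correlation coefficient r of the paired data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


slope, intercept = fit_line(hours, score)
r = correlation(hours, score)

# Residual = observed value minus the value predicted by the line.
residuals = [y - (slope * x + intercept) for x, y in zip(hours, score)]

print("slope:", round(slope, 2), "intercept:", round(intercept, 2))
print("r:", round(r, 3))
print("residuals:", [round(e, 2) for e in residuals])
```

For this made-up data the fitted line is roughly y = 0.6x + 2.2, so each additional hour corresponds to a predicted score about 0.6 points higher. Residuals scattered around zero with no visible pattern support a linear model, while a curved pattern in the residual plot suggests another function may fit better.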

As Algebra 1 progresses, students will describe the shapes of data sets in terms of the functions they study, and they will continue to model data with those functions. Students will explore S-ID.7 (HSS-ID.C.7, interpreting the slope and intercept of a linear model) more heavily as they progress through the units of Algebra 1.

## Assessment

This assessment accompanies Unit 2 and should be given on the suggested assessment day or after completing the unit.

## Unit Prep

### Intellectual Prep

Internalization of Standards via the Unit Assessment:

• Take unit assessment. Annotate for:
  • Standards that each question aligns to
  • Purpose of each question: spiral, foundational, mastery, developing
  • Strategies and representations used in daily lessons
  • Relationship to Essential Understandings of unit
  • Lesson(s) that assessment points to

Internalization of Trajectory of Unit:

• Read and annotate “Unit Summary.”
• Notice the progression of concepts through the unit using “Unit at a Glance.”
• Essential understandings
• Connection to assessment questions

Unit-Specific Intellectual Prep:

• As you teach the unit, keep a running “word wall” of statistical questions. Asking statistical questions is one of the most challenging topics to teach, and the word wall will help students ask good questions when they reach the first data project in two weeks. One example is “Who owns what?”, which can be narrowed to “How do the rental vs. owner-occupied units compare in different neighborhoods?” and can spark the additional question “What is the cost of rent across the city?” Part of the underlying goal of this unit is for students to start thinking statistically.
• Identify the most challenging parts of statistics for your students, whether it be reasoning, providing evidence, technological capability, or habits of thinking statistically, and weave these in as much as you can throughout the unit.
• If you are new to statistics, the book Naked Statistics, by Charles Wheelan, is an excellent introduction to the topic.
• Statistics is best taught with data and representations that are relevant to the students. There are many opportunities throughout the unit to swap in graphs or data sets that are more relevant to your students. Identify topics and data sets that will make this unit more successful.

### Essential Understandings

• Univariate data measures the frequency of a single variable and is described using shape, center, and spread. The shape, center, and spread are calculated using different measurements, depending on the features of the data set.
• Bivariate data compares two variables and, when numerical, is plotted on a scatterplot and described using shape. The shape can be formalized using a function of best fit, and how closely the data follow a linear model is measured by correlation. The fit of any function to the data, linear or not, is assessed using the residuals.
• Bivariate data that compares categories is usually represented in a two-way table. Two-way tables can be analyzed using relative and conditional frequencies.
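As a rough sketch of how relative and conditional frequencies come out of a two-way table, the example below uses a small table of housing counts. The counts, neighborhood names, and variable names are all invented for illustration and do not come from the unit materials.

```python
# Hypothetical counts (invented): housing units by neighborhood and occupancy type.
table = {
    "Northside": {"rented": 30, "owner-occupied": 20},
    "Southside": {"rented": 10, "owner-occupied": 40},
}

grand_total = sum(sum(row.values()) for row in table.values())

# Joint relative frequency: each cell divided by the grand total.
joint = {
    place: {kind: count / grand_total for kind, count in row.items()}
    for place, row in table.items()
}

# Marginal relative frequency: each row total divided by the grand total.
marginal = {place: sum(row.values()) / grand_total for place, row in table.items()}

# Conditional relative frequency: each cell divided by its own row total,
# i.e. the share of each occupancy type GIVEN the neighborhood.
conditional = {
    place: {kind: count / sum(row.values()) for kind, count in row.items()}
    for place, row in table.items()
}

for place in table:
    print(place, "joint:", joint[place])
    print(place, "marginal:", marginal[place])
    print(place, "conditional:", conditional[place])
```

In this made-up table, 60% of Northside units are rented compared with 20% of Southside units. Comparing conditional frequencies across rows in this way is how students can argue for a possible association between the two categorical variables.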

### Vocabulary

• descriptive statistics
• measures of center (mean, median)
• inferential statistics
• spread (standard deviation, interquartile range)
• univariate data
• shape (skew, symmetrical)
• bivariate data
• normal distribution
• numerical data
• outlier resistant
• categorical data
• distribution
• sample
• variance
• population
• scatterplots
• statistical question
• two-way frequency tables
• frequency graphs (histogram, bar graph, dot plot)
• relative frequency
• box plot (box-and-whisker plot)
• marginal frequency
• association
• correlation (correlation coefficient)
• causation
• residuals (residual plot)

### Unit Materials, Representations and Tools

• Here are three online tools you can use to create graphs.
• This set of applets will be used more in Algebra 2 for simulations, but the data analysis section will be helpful in this unit: Rossman/Chance Applet Collection
• Computers with access to spreadsheets (Excel or Google Sheets) for lessons in which students will be analyzing data. Determine which lessons you would like to use computers for ahead of teaching the unit.

## Common Core Standards

### Core Standards

##### High School — Number and Quantity
• N.Q.A.1 — Use units as a way to understand problems and to guide the solution of multi-step problems; choose and interpret units consistently in formulas; choose and interpret the scale and the origin in graphs and data displays.

##### Interpreting Categorical and Quantitative Data
• HSS-ID.A.1 — Represent data with plots on the real number line (dot plots, histograms, and box plots).

• HSS-ID.A.2 — Use statistics appropriate to the shape of the data distribution to compare center (median, mean) and spread (interquartile range, standard deviation) of two or more different data sets.

• HSS-ID.A.3 — Interpret differences in shape, center, and spread in the context of the data sets, accounting for possible effects of extreme data points (outliers).

• HSS-ID.A.4 — Use the mean and standard deviation of a data set to fit it to a normal distribution and to estimate population percentages. Recognize that there are data sets for which such a procedure is not appropriate. Use calculators, spreadsheets, and tables to estimate areas under the normal curve.

• HSS-ID.B.5 — Summarize categorical data for two categories in two-way frequency tables. Interpret relative frequencies in the context of the data (including joint, marginal, and conditional relative frequencies). Recognize possible associations and trends in the data.

• HSS-ID.B.6 — Represent data on two quantitative variables on a scatter plot, and describe how the variables are related.

• HSS-ID.B.6a — Fit a function to the data; use functions fitted to data to solve problems in the context of the data. Use given functions or choose a function suggested by the context. Emphasize linear, quadratic, and exponential models.

• HSS-ID.B.6b — Informally assess the fit of a function by plotting and analyzing residuals.

• HSS-ID.B.6c — Fit a linear function for a scatter plot that suggests a linear association.

• HSS-ID.C.7 — Interpret the slope (rate of change) and the intercept (constant term) of a linear model in the context of the data.

• HSS-ID.C.8 — Compute (using technology) and interpret the correlation coefficient of a linear fit.

• HSS-ID.C.9 — Distinguish between correlation and causation.

##### Making Inferences and Justifying Conclusions
• HSS-IC.A.1 — Understand statistics as a process for making inferences about population parameters based on a random sample from that population.

### Foundational Standards

• 8.F.B.4

• 6.SP.B.4

• 6.SP.B.5

• 7.SP.A.1

• 7.SP.A.2

• 7.SP.B.3

• 7.SP.B.4

• 8.SP.A.1

• 8.SP.A.2

• 8.SP.A.3

• 8.SP.A.4

### Future Standards

• HSS-CP.A.1

• HSS-CP.A.2

• HSS-CP.A.3

• HSS-CP.A.4

• HSS-CP.A.5

• HSS-CP.B.6

• HSS-CP.B.7

• HSS-CP.B.8

• HSS-CP.B.9

• HSS-IC.A.1

• HSS-IC.A.2

• HSS-IC.B.3

• HSS-IC.B.4

• HSS-IC.B.5

• HSS-IC.B.6

### Standards for Mathematical Practice

• CCSS.MATH.PRACTICE.MP1 — Make sense of problems and persevere in solving them.

• CCSS.MATH.PRACTICE.MP2 — Reason abstractly and quantitatively.

• CCSS.MATH.PRACTICE.MP3 — Construct viable arguments and critique the reasoning of others.

• CCSS.MATH.PRACTICE.MP4 — Model with mathematics.

• CCSS.MATH.PRACTICE.MP5 — Use appropriate tools strategically.

• CCSS.MATH.PRACTICE.MP6 — Attend to precision.

• CCSS.MATH.PRACTICE.MP7 — Look for and make use of structure.

• CCSS.MATH.PRACTICE.MP8 — Look for and express regularity in repeated reasoning.