1) One of the reasons that the Monitoring the Future (MTF) project was started was "to study changes in the beliefs, attitudes, and behavior of young people in the United States." Data are collected from 8th, 10th, and 12th graders each year. To get a representative nationwide sample, surveys are given to a randomly selected group of students. In Spring 2004, students were asked about alcohol, illegal drug, and cigarette use. Describe the W's, if the information is given. If the information is not given, state that it is not specified.
• Who:
• What:
• When:
• Where:
• How:
• Why:

2) Consider the following part of a data set:

List the variables in the data set. Indicate whether each variable is treated as categorical or quantitative in this data set. If the variable is quantitative, state the units.

Has the percentage of young girls drinking milk changed over time? The following table is consistent with the results from "Beverage Choices of Young Females: Changes and Impact on Nutrient Intakes" (Shanthy A. Bowman, Journal of the American Dietetic Association, 102(9), pp. 1234-1239):

3) Find the following:
a. What percent of the young girls reported that they drink milk?
b. What percent of the young girls were in the 1989-1991 survey?
c. What percent of the young girls who reported that they drink milk were in the 1989-1991 survey?
d. What percent of the young girls in 1989-1991 reported that they drink milk?

4) What is the marginal distribution of milk consumption?

5) Do you think that milk consumption by young girls is independent of the nationwide survey year? Use statistics to justify your reasoning.

6) Consider the following pie charts of a subset of the data above:

Do the pie charts above indicate that milk consumption by young girls is independent of the nationwide survey year? Explain.

7) A brake and muffler shop reported the repair bills, in dollars, for their customers yesterday. 88 154 203 56 283 400 118 192 312 381 143 292 290 346 252 213 172 181 227 422

a. Sketch a histogram for these data.
b. Find the mean and standard deviation of the repair costs.
c. Is it appropriate to use the mean and standard deviation to summarize these data? Explain.
d. Describe the distribution of repair costs.

8) On Monday, a class of students took a big test, and the highest score was 92. The next day, a student who had been absent made up the test, scoring 100. Indicate whether adding that student's score to the rest of the data made each of these summary statistics increase, decrease, or stay about the same:
a. mean
b. median
c. range
d. IQR
e. standard deviation
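For checking hand calculations on the repair bills in question 7, the mean and standard deviation can be computed with a short script. This is a minimal sketch, assuming the sample standard deviation (divisor n - 1) is the one intended. The variable names are illustrative only.

```python
from statistics import mean, stdev

# Repair bills in dollars, copied from question 7.
repair_bills = [
    88, 154, 203, 56, 283, 400, 118, 192, 312, 381,
    143, 292, 290, 346, 252, 213, 172, 181, 227, 422,
]

bill_mean = mean(repair_bills)   # arithmetic mean
bill_sd = stdev(repair_bills)    # sample standard deviation (divides by n - 1)

print(f"n = {len(repair_bills)}")
print(f"mean = ${bill_mean:.2f}")
print(f"standard deviation = ${bill_sd:.2f}")
```

If the population form (divisor n) is wanted instead, `statistics.pstdev` can be substituted for `stdev`.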

9) The body temperature of students is taken each time a student goes to the nurse's office. The five-number summary for the temperatures (in degrees Fahrenheit) of students on a particular day is:

a. Would you expect the mean temperature of all students who visited the nurse's office to be higher or lower than the median? Explain.
b. After the data were picked up in the afternoon, three more students visited the nurse's office with temperatures of 96.7°, 98.4°, and 99.2°. Were any of these students outliers? Explain.
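One common way to approach part b is the 1.5 × IQR rule used for boxplots. It flags a value as an outlier if it falls more than 1.5 IQRs below the first quartile or above the third quartile. The sketch below assumes that rule is the intended one. The quartile values in it are hypothetical placeholders, not the values from the five-number summary, and should be replaced with the actual Q1 and Q3.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier fences for the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(value, q1, q3, k=1.5):
    """True if value lies outside the k * IQR fences."""
    lower, upper = iqr_fences(q1, q3, k)
    return value < lower or value > upper


# HYPOTHETICAL quartiles for illustration only; substitute the real Q1 and Q3.
q1_assumed, q3_assumed = 98.0, 99.0

# The three afternoon temperatures from part b.
new_temps = [96.7, 98.4, 99.2]

lower, upper = iqr_fences(q1_assumed, q3_assumed)
flags = [is_outlier(t, q1_assumed, q3_assumed) for t in new_temps]

print(f"fences with assumed quartiles: {lower} to {upper}")
for temp, flag in zip(new_temps, flags):
    print(temp, "outlier" if flag else "not an outlier")
```

The conclusion depends entirely on the quartiles supplied, so the printed result for the placeholder values says nothing about the actual answer.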

10) The boxplots show the age of people involved in accidents according to their role in the accident.

a. Which role involved the youngest person, and what is the age?
b. Which role had the lowest median age, and what is the age?
c. Which role had the smallest range of ages, and what is it?
d. Which role had the largest IQR of ages, and what is it?
e. Which role generally involved the oldest people? Explain.

11) One thousand students from a local university were sampled to gather information such as gender, high school GPA, college GPA, and total SAT scores. The results were...