# Statistics: Model Question Paper

By Ejudstuen
Jan 04, 2015

Test 4A   AP Statistics   Name: ____________

Directions: Do all of your work on these sheets.

Part 1: Multiple Choice. Circle the letter corresponding to the best answer.

1. I measure a response variable Y at each of several times. A scatterplot of log Y versus time of measurement looks approximately like a positively sloping straight line. We may conclude that

(a) the correlation between time of measurement and Y is negative, since logarithms of positive fractions (such as correlations) are negative.

(b) the rate of growth of Y is positive but slowing down over time.

(c) an exponential curve would approximately describe the relationship between Y and time.

(d) a power function would approximately describe the relationship between Y and time.

(e) a mistake has been made. It would have been better to plot log Y versus the logarithm of time.

A survey was designed to study how the operations of a group of businesses vary with their size. Companies were classified as small, medium, and large. Questionnaires were sent to 200 randomly selected businesses of each size, for a total of 600 questionnaires. Since not all questionnaires in a survey of this type are returned, it was decided to examine whether or not the response rate varied with the size of the business. The data are given in the following two-way table:

| Size | Response | No Response | Total |
|--------|----------|-------------|-------|
| Small | 125 | 75 | 200 |
| Medium | 81 | 119 | 200 |
| Large | 40 | 160 | 200 |

2. What percent of all small companies receiving questionnaires responded?

(a) 50.8% (b) 20.8% (c) 62.5% (d) 33.3% (e) 12.5%

3. Which of the following conclusions seems to be supported by the data?

(a) There are more small companies than large companies in the survey.

(b) Small companies appear to have higher response rates than medium or large companies.

(c) Exactly the same number of companies responded as didn't respond.

(d) Small companies dislike larger companies.

(e) If we combined the medium and large companies, then their response rate would be equal to that of the small companies.
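For readers checking their arithmetic on questions 2 and 3, the response rate for each company size can be computed directly from the two-way table. The following is a minimal sketch; the names `counts` and `response_rate` are illustrative and not part of the original test.

```python
# Counts taken from the two-way table: (responded, did not respond).
counts = {
    "Small": (125, 75),
    "Medium": (81, 119),
    "Large": (40, 160),
}


def response_rate(responded, no_response):
    """Return the percent of questionnaires sent that were returned."""
    return 100 * responded / (responded + no_response)


# Response rate (in percent) for each company size.
rates = {size: response_rate(*pair) for size, pair in counts.items()}

for size, rate in rates.items():
    print(f"{size}: {rate:.1f}%")
```

Each rate is conditioned on company size, so the denominator is that row's total of 200 rather than the 600 questionnaires overall.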

4. A researcher observes that, on average, the number of divorces in cities with Major League Baseball teams is larger than in cities without Major League Baseball teams. The most plausible explanation for this observed association is that the

(a) presence of a Major League Baseball team causes the number of divorces to rise (perhaps husbands are spending too much time at the ballpark).

(b) high number of divorces is responsible for the presence of Major League Baseball teams (more single men means potentially more fans at the ballpark, making it attractive for an owner to relocate to such cities).

(c) association is due to the presence of a lurking variable (Major League teams tend to be in large cities with more people, hence a greater number of divorces).

(d) association makes no sense, since many married couples go to the ballpark together.

(e) observed association is purely coincidental. It is implausible to believe the observed association could be anything other than accidental.

5. Students in a statistics class drew circles of varying diameters and counted how many Cheerios® could be placed in the circle. The scatterplot shows the results.

The students wanted to determine an appropriate equation for the relationship between diameter and the number of Cheerios®. The students decided to transform the data to make it appear more linear before computing a least-squares regression line. Which of the following transformations would be reasonable for them to try?

I. Take the square root of the number of Cheerios®.

II. Cube the number of Cheerios®.

III. Take the log of the number of Cheerios®.

IV. Take the log of the diameter.

(a) I and II (b) I and III (c) II and III (d) II and IV (e) III and IV

Part 2: Free Response

Answer completely, but be concise. Show your thought process clearly.

6. A study among the Pima Indians of Arizona investigated the relationship between a mother’s diabetic status and the appearance of birth defects in her children. The results appear in the two-way table below.

Diabetic Status

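| Birth Defects | Nondiabetic | Prediabetic | Diabetic | Total |
|---------------|-------------|-------------|----------|-------|
| None | 754 | 362 | 38 | ____ |
| One or more | 31 | 13 | 9 | ____ |
| Total | ____ | ____ | ____ | ____ |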

(a) Fill in the row and column totals in the margins of the table.

(b) Compute (in percents) the conditional distributions of birth defects for each diabetic status.

(c) Use the grid provided to display the conditional distributions in a graph. Don’t forget to label your graph completely.

(d) Comment on any clear associations you see.
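One way to check the percentages in part (b) is to compute each column's conditional distribution programmatically. This is only a sketch: it assumes the flattened table rows read as 754, 362, 38 (no defects) and 31, 13, 9 (one or more defects), and the names `table` and `conditional_distribution` are illustrative.

```python
# Counts by diabetic status, as read from the two-way table (assumed parse).
table = {
    "Nondiabetic": {"None": 754, "One or more": 31},
    "Prediabetic": {"None": 362, "One or more": 13},
    "Diabetic": {"None": 38, "One or more": 9},
}


def conditional_distribution(column):
    """Convert one column of counts to percents of that column's total."""
    total = sum(column.values())
    return {level: 100 * n / total for level, n in column.items()}


# Distribution of birth defects within each diabetic status.
conditionals = {
    status: conditional_distribution(column) for status, column in table.items()
}

for status, dist in conditionals.items():
    summary = ", ".join(f"{level}: {pct:.1f}%" for level, pct in dist.items())
    print(f"{status} -> {summary}")
```

Because each distribution is conditioned on diabetic status, every column's percents sum to 100, which makes the three groups directly comparable despite their very different sizes.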

7. Here are data for 12 perch caught in a lake in Finland:

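| Weight (grams) | Length (cm) | Weight (grams) | Length (cm) |
|----------------|-------------|----------------|-------------|
| 5.9 | 8.8 | 300.0 | 28.7 |
| 100.0 | 19.2 | 300.0 | 30.1 |
| 110.0 | 22.5 | 685.0 | 39.0 |
| 120.0 | 23.5 | 650.0 | 41.4 |
| 150.0 | 24.0 | 820.0 | 42.5 |
| 145.0 | 25.5 | 1000.0 | 46.6 |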

(a) Suppose you want to use the length of a perch to predict its weight. Use your calculator to make an appropriate scatterplot. Describe what you see.

(b) How do you expect the weight of animals of the same species to change as their length increases? Make a transformation of weight without using logarithms that should straighten the plot if your expectation is correct. Plot the transformed weights against length. Then find the equation of the least-squares line for the transformed data. Record the equation below. Define any variables you use.

(c) How well does the linear model you calculated in (b) fit the transformed data? Justify your answer with graphical and numerical evidence.

(d) Use your model from (b) to predict the weight of a Finnish perch whose length is 35 cm. Show your method.
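The test expects this work to be done on a calculator, but the same steps can be sketched in code. The example below assumes one possible choice for part (b): if weight grows roughly with the cube of length, then the cube root of weight should be roughly linear in length. The data values are read from the flattened table, and the helper names `least_squares` and `predict_weight` are hypothetical.

```python
# The 12 perch, as (length in cm, weight in grams) read from the table.
lengths = [8.8, 19.2, 22.5, 23.5, 24.0, 25.5, 28.7, 30.1, 39.0, 41.4, 42.5, 46.6]
weights = [5.9, 100.0, 110.0, 120.0, 150.0, 145.0,
           300.0, 300.0, 685.0, 650.0, 820.0, 1000.0]


def least_squares(x, y):
    """Return (slope, intercept, r) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


# Assumed transformation: cube root of weight (no logarithms involved).
cube_roots = [w ** (1 / 3) for w in weights]
slope, intercept, r = least_squares(lengths, cube_roots)


def predict_weight(length_cm):
    """Predict weight in grams by undoing the cube-root transformation."""
    return (intercept + slope * length_cm) ** 3


print(f"cube_root(weight) = {intercept:.4f} + {slope:.4f} * length")
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
print(f"predicted weight at 35 cm: {predict_weight(35):.0f} g")
```

Note that the prediction in part (d) requires cubing the fitted value, since the line predicts the cube root of weight rather than weight itself. A residual plot of the transformed data would still be needed for the graphical evidence requested in part (c).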

8. According to the U.S. census, states with an above-average number of people who fail to complete high school tend to have an above-average number of infant deaths. Is the association between these two variables most likely due to causation, confounding, or common response? Justify your answer.

9. A curious thing happened to two baseball players this year during the first two weeks of the season. Some data related to their hitting success are displayed in the following table. Note that AB = at-bats; H = hits; and BA = batting average, which is defined by BA = H/AB.

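Weekly Results

| Week | Player 1 AB | Player 1 H | Player 1 BA | Player 2 AB | Player 2 H | Player 2 BA |
|------|-------------|------------|-------------|-------------|------------|-------------|
| 1 | 5 | 2 | | 25 | 9 | |
| 2 | 20 | 5 | | 5 | 1 | |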

(a) Show that for each week, Player 1 had a higher batting average (BA = hits/at bats) than Player 2.

(b) Show that at the end of the two weeks, the cumulative results for Player 2 were better than the cumulative results for Player 1.

(c) What is the name for this apparent contradiction?
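The comparisons in parts (a) and (b) can be verified with exact fractions. This sketch assumes the flattened table reads as follows: Player 1 had 5 at-bats and 2 hits in week 1 and 20 at-bats and 5 hits in week 2, while Player 2 had 25 at-bats and 9 hits in week 1 and 5 at-bats and 1 hit in week 2. The variable and function names are illustrative.

```python
from fractions import Fraction

# Weekly results as (at-bats, hits); an assumed reading of the table.
player1 = {1: (5, 2), 2: (20, 5)}
player2 = {1: (25, 9), 2: (5, 1)}


def batting_average(at_bats, hits):
    """BA = H / AB, kept as an exact fraction."""
    return Fraction(hits, at_bats)


def cumulative_average(weeks):
    """Batting average over all weeks: total hits over total at-bats."""
    total_ab = sum(ab for ab, _ in weeks.values())
    total_h = sum(h for _, h in weeks.values())
    return Fraction(total_h, total_ab)


# Part (a): Player 1 leads in every individual week.
weekly_p1_better = all(
    batting_average(*player1[week]) > batting_average(*player2[week])
    for week in player1
)

# Part (b): Player 2 leads once the two weeks are pooled.
overall_p2_better = cumulative_average(player2) > cumulative_average(player1)

print("Player 1 better each week:", weekly_p1_better)
print("Player 2 better overall:", overall_p2_better)
```

The reversal arises because the two players' at-bats are distributed very differently across the two weeks, so pooling weights each week's average unequally.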

I pledge that I have neither given nor received aid on this test. _____________________________
