Math 221 Week 2 Lab

Submitted by:

Part 1. Random Sampling

|Ordered Sample|

|AA |

|AE |

|AK |

|AN |

|AP |

|AU |

|AX |

|BA |

|BB |

|BG |

|BM |

|BQ |

|BT |

|CB |

|CE |

|CK |

|CQ |

|CT |

|CU |

|CV |

Files should be selected randomly for auditing to avoid bias. Auditing every file is not always possible because of time and other constraints. When the population is very large, a simple random sample is a sound choice because each element of the population has an equal chance of being selected.
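The random selection described above can be sketched in a few lines of Python. This is only an illustration: the population of two-letter labels from "AA" to "ZZ", the seed value, and the variable names are assumptions, not the file list or the tool actually used for the lab.

```python
import random
import string

# Hypothetical population: every two-letter label from "AA" to "ZZ".
# The real list of files for the lab is not given, so this is an assumption.
population = [a + b for a in string.ascii_uppercase for b in string.ascii_uppercase]

# Fixed seed (an arbitrary choice) so the same draw can be reproduced.
rng = random.Random(221)

# Simple random sample without replacement: every label is equally likely.
sample = rng.sample(population, k=20)

# Sort the labels to present them as an ordered sample.
ordered_sample = sorted(sample)
print(ordered_sample)
```

Because the draw is made without replacement, no file can be selected twice.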

Part 2. Cereal and Fiber Type

Fiber Type Bar Chart

[pic]

Fiber Type Pie Chart

[pic]

Part 3. Milk Production

1. Find the sample mean.

2270.54

2. Find the sample standard deviation.

653.1822

3. Make a frequency distribution for the data. Distribution is started at right.

|Class Limits (Lower) |Class Limits (Upper) |Frequency |Midpoint |Cumulative Frequency |
|1147 |1646 |7 |1396.5 |7 |
|1647 |2146 |15 |1896.5 |22 |
|2147 |2646 |13 |2396.5 |35 |
|2647 |3146 |11 |2896.5 |46 |
|3147 | |4 |3396.5 |50 |
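As a quick check of the table, the midpoints and cumulative frequencies can be recomputed from the class limits and frequencies. The sketch below uses only the numbers in the table; the variable names are illustrative, and the last class is left without a midpoint because its upper limit is blank in the table.

```python
from itertools import accumulate

# Class limits and frequencies copied from the frequency distribution.
# The upper limit of the last class is blank in the table, so it stays None.
classes = [(1147, 1646), (1647, 2146), (2147, 2646), (2647, 3146), (3147, None)]
frequencies = [7, 15, 13, 11, 4]

# Midpoint = (lower limit + upper limit) / 2, where both limits are known.
midpoints = [None if hi is None else (lo + hi) / 2 for lo, hi in classes]

# Cumulative frequency = running total of the class frequencies.
cumulative = list(accumulate(frequencies))

for (lo, hi), freq, mid, cum in zip(classes, frequencies, midpoints, cumulative):
    print(lo, hi, freq, mid, cum)
```

The final cumulative frequency should equal the sample size of 50.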

4. Create a histogram for the data. Does the data appear bell-shaped?

[pic]

Yes, the data appear to be bell-shaped.

5. What true percent of the data lies within one standard deviation of the mean? Within two standard deviations of the mean? How do these results agree with the Empirical Rule?

The interval within one standard deviation of the mean is (2270.54 - 653.1822, 2270.54 + 653.1822) = (1617.3578, 2923.7222).

Of the 50 values, 37 lie in this interval, so 74% of the data lie within one standard deviation of the mean.

The interval within two standard deviations of the mean is (2270.54 - 2*653.1822, 2270.54 + 2*653.1822) = (964.1756, 3576.9044).

Of the 50 values, 49 lie in this interval, so 98% of the data lie within two standard deviations of the mean.

The Empirical Rule states that, for bell-shaped data:

About 68% of the data lie within one standard deviation of the mean. About 95% of the data lie within two standard deviations of the mean.

The result for two standard deviations (98%) is reasonably close to the Empirical Rule's 95%, but the result for one standard deviation (74%) is noticeably higher than 68%.
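The arithmetic in this answer can be reproduced with a short Python sketch. The mean, standard deviation, and the counts 37 and 49 are taken from the answers above. The helper `percent_within` is a hypothetical function showing how the counts could be obtained from the raw values, which are not listed in this report.

```python
mean = 2270.54   # sample mean from question 1
sd = 653.1822    # sample standard deviation from question 2
n = 50           # number of observations

def interval(k):
    """Return the interval within k standard deviations of the mean."""
    return (round(mean - k * sd, 4), round(mean + k * sd, 4))

def percent_within(data, k):
    """Percent of values in data that fall within k standard deviations."""
    lo, hi = mean - k * sd, mean + k * sd
    return 100 * sum(lo <= x <= hi for x in data) / len(data)

one_sd = interval(1)
two_sd = interval(2)

# Counts reported above: 37 values within one SD, 49 within two SDs.
pct_one = 100 * 37 / n
pct_two = 100 * 49 / n

print(one_sd, pct_one)
print(two_sd, pct_two)
```

With the full data set loaded into a list, `percent_within(data, 1)` and `percent_within(data, 2)` would replace the hand counts.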

6. Find the median of the milk production.

Median = 2207

7. Find the maximum and the minimum value.

Minimum value = 1147

Maximum value = 4285

8. What is the range?

Range = 3138

9. What is the first Quartile?

First Quartile = 1798.25

10. What is the second Quartile? (What is this also called?)

Second Quartile = 2207

This is also called the median.

11. What is the third Quartile?

Third Quartile = 2727.5

12. What is the 80th percentile?

80th percentile = 2832.4
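Questions 6 through 12 can be answered together with Python's standard `statistics` module, as in the sketch below. The raw milk production values are not listed in this report, so the list `demo` is made-up data used only to show the calls. The "inclusive" interpolation method is an assumption: different tools compute quartiles and percentiles slightly differently, so results may not match a spreadsheet exactly.

```python
import statistics

def summary(data):
    """Return min, quartiles, 80th percentile, max, and range of data."""
    # n=4 gives the three quartile cut points Q1, Q2 (median), Q3.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    # n=5 gives the 20th, 40th, 60th, and 80th percentiles; take the last.
    p80 = statistics.quantiles(data, n=5, method="inclusive")[3]
    return {
        "min": min(data),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "p80": p80,
        "max": max(data),
        "range": max(data) - min(data),
    }

# Made-up values for illustration only, not the lab data.
demo = [1, 2, 3, 4, 5]
result = summary(demo)
print(result)

# Range check using the maximum and minimum reported above.
reported_range = 4285 - 1147
print(reported_range)
```

The last line confirms the reported range: 4285 - 1147 = 3138.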

Part 4. Linear Regression

Sugar and Calories Analysis

[pic]

From the line of best fit and the correlation coefficient, we can see that there is a linear relationship between sugar (x) and calories (y). As the grams of sugar increase, the number of calories also increases.

Sugar and Cost Analysis

[pic]

From the line of best fit and the correlation coefficient, we can see that there is a linear relationship between sugar (x) and cost (y). As the grams of sugar increase, the cost also increases.
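For reference, the line of best fit and the correlation coefficient used in the two analyses above can be computed with the standard least-squares formulas. The sketch below is a minimal illustration: the function name `fit_line` and the values in `x_demo` and `y_demo` are made up, since the cereal data are not listed in this report.

```python
from math import sqrt

def fit_line(x, y):
    """Return (slope, intercept, r) for the least-squares line y = slope*x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r

# Made-up values for illustration only, not the cereal data.
x_demo = [1, 2, 3, 4]
y_demo = [2, 4, 6, 8]
slope, intercept, r = fit_line(x_demo, y_demo)
print(slope, intercept, r)
```

A positive slope together with a value of r close to 1 indicates that y tends to increase as x increases, which is the pattern described in the analyses above.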

Weight and Cost Analysis

[pic]



