# Cheat Sheet Stats

Pages: 7 (2673 words) Published: August 7, 2012
Chapter 3
Standard units tell you how many standard deviations (SDs) above or below average a data value is.
standard units = (actual value – average) / SD
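As a quick illustration, the conversion to standard units and back can be sketched in Python. The function names and the numbers (average 70, SD 10) are made up for this example and are not from the course material.

```python
def to_standard_units(value, average, sd):
    """Return how many SDs `value` lies above (+) or below (-) `average`."""
    if sd <= 0:
        raise ValueError("SD must be positive")
    return (value - average) / sd


def from_standard_units(z, average, sd):
    """Convert a value in standard units (Z) back to the original scale."""
    return average + sd * z


# Hypothetical data: average 70, SD 10, observed value 85.
z = to_standard_units(85, average=70, sd=10)
print(z)                                          # 1.5
print(from_standard_units(z, average=70, sd=10))  # 85.0
```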
actual value = average + (SD x standard units)
Standard units are denoted by Z.
Chapter 8
Complement rule: P(A) = 1 – P(A doesn't happen)
Multiplication rule:
P(A and B both happen) = P(A) x P(B given A happened)
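The two rules above can be checked numerically. The sketch below is one possible way to do it in Python, using the box (5 defective, 20 working) and server (40% downtime, 99% target) figures from the problems in this chapter. The function names are hypothetical, and the server function assumes independent breakdowns.

```python
from fractions import Fraction


def chance_all_draws_of_kind(kind_count, total, draws):
    """Multiplication rule for draws without replacement: multiply the
    conditional chance that each successive draw is of the given kind."""
    if not 0 <= draws <= total:
        raise ValueError("draws must be between 0 and total")
    chance = Fraction(1)
    for i in range(draws):
        # After i successful draws, kind_count - i of that kind remain
        # among the total - i items still in the box.
        chance *= Fraction(kind_count - i, total - i)
    return chance


def servers_needed(p_down, target):
    """Complement rule: smallest n with 1 - p_down**n >= target,
    assuming the servers break down independently."""
    if not (0 <= p_down < 1 and 0 < target < 1):
        raise ValueError("p_down must be in [0, 1) and target in (0, 1)")
    n = 1
    while 1 - p_down ** n < target:
        n += 1
    return n


print(chance_all_draws_of_kind(5, 25, 5))    # 1/53130 (all five defective)
print(chance_all_draws_of_kind(20, 25, 5))   # 2584/8855 (no defective)
print(servers_needed(0.4, 0.99))             # 6
```

Exact fractions avoid rounding while the products are built up; convert with `float(...)` to read a result as a decimal.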
Q. Five components are removed at random, one at a time, from a box containing 5 defective and 20 working components. What is the chance of selecting all defective?
A. 5/25 x 4/24 x 3/23 x 2/22 x 1/21.
Q. What is the chance of selecting no defective components?
A. 20/25 x 19/24 x 18/23 x 17/22 x 16/21 = about .29, or 29%. Note that 1 minus the chance of selecting all defective gives the chance of at least one working component, not the chance of no defective.
Q. An important data server breaks down 40% of the time and is operational the other 60%, and servers break down independently. How many independent servers should be running so that there is a 99% chance at least one is operational?
A. The chance that none of X servers is operational is (.4)^X, and this must be no more than 1%. .4^5 = .01024, which is just over 1%, so five servers fall slightly short of .99 uptime. Adding another server gives .4^6 = .004096, so six servers are needed.
Q. Consider two bonds with BB- ratings, each with a 1.5% chance of default within a year. What is the probability that both default within a year?
A. .015 x .015 (treating the defaults as independent).
Q. What is the probability that neither defaults?
A. .985 x .985
Q. What is the probability that exactly one defaults?
A. P(exactly one) = 1 - P(neither) - P(both) = 1 - .985 x .985 - .015 x .015 = .02955.
Independent: P(firm B defaults given A defaults) = P(firm B defaults) = .015
Dependent: P(firm A and B default) = P(firm A defaults) x P(firm B defaults given firm A defaults), where P(firm B defaults given firm A defaults) is a number greater than .015 if defaults tend to occur together.
Q. Janice has noticed that on her drive to work there are several things that can slow her down. First, she may hit a red light at a particular large intersection, which happens 30% of the time. If she hits the red light, 40% of the time she will have to stop for the commuter train. If she does not hit the red light, she only has to stop for the commuter train 20% of the time.
We know P(red light) = .3, P(train | red light) = .4, P(train | no red light) = .2.
a) What percent of the time will Janice have to stop for both the red light and the train?
= P(red light) x P(train | red light) = .3 x .4 = .12
b) What percent of the time will she have to stop at least once (i.e., at either the red light or the train or both)?
= 1 - P(no red light and no train) = 1 - P(no red light) x P(no train | no red light) = 1 - (1 - .3) x (1 - .2) = .44
c) If she makes five trips from home to work, what is the chance she has to stop at least once at the light or the train on at least one of the five trips?
= 1 - P(no red light and no train)^5 = 1 - ((1 - .3) x (1 - .2))^5 = .94 (treating the five trips as independent)
Q. In a Youth Survey, 200 randomly selected teenagers from New England states were asked about how well they got along with their parents. 54% said they got along “very well” with their parents.
a) Calculate a 95% confidence interval for the percentage of New England teenagers who got along very well with their parents, and briefly explain what this interval tells you.
SE = square root of (54 x 46 / 200) = 3.5, ME = 2 x SE = 7. The confidence interval is 54% +/- 7%. We are 95% confident that the percentage in the population who would say they get along very well with their parents is in that range.
b) How much larger a sample would you need if you want the margin of error to be 3%?
Currently the margin of error is about 7. To reduce it to 3, we need to multiply the sample size by (7/3)^2 = 5.4, which means a sample of about 1,100.
Chapter 9

Law of averages: larger samples become more representative of the population.
Standard error (SE): estimates how far a sample percentage is likely to be from the true percentage for a given sample size.
If you have no idea about the true percentage, you can plug in 50% to get a conservative estimate of the SE.
To divide the margin of error by some number, you must multiply the sample size by that number squared. For example, to halve the margin of error you have to quadruple the sample size.
The margin of error only reflects error due to random sampling, not error due to other sources such as bias.
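These rules can be sketched in Python using the Youth Survey figures from the Chapter 8 problems (54% of a sample of 200). The function names are hypothetical. The multiplier of 2 SEs for a 95% margin of error is an assumption inferred from that worked example, not a rule stated on this sheet.

```python
import math


def standard_error_pct(pct, n):
    """SE of a sample percentage, in percentage points."""
    return math.sqrt(pct * (100 - pct) / n)


def margin_of_error_pct(pct, n, multiplier=2):
    """Margin of error; multiplier=2 (about 95% confidence) is an assumption."""
    return multiplier * standard_error_pct(pct, n)


def sample_size_for_margin(n, current_me, target_me):
    """Dividing the margin of error by k multiplies the sample size by k**2."""
    return math.ceil(n * (current_me / target_me) ** 2)


print(round(standard_error_pct(54, 200), 1))   # 3.5
print(round(standard_error_pct(50, 200), 1))   # 3.5 (conservative, plugging in 50%)
print(round(margin_of_error_pct(54, 200)))     # 7
print(sample_size_for_margin(200, 7, 3))       # 1089, i.e. about 1,100
print(sample_size_for_margin(200, 7, 3.5))     # 800: halving the ME quadruples n
```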

The accuracy of a sample:...