Stat 3022: Midterm Exam 1
March 5 (Tuesday), 2013
• ID number:
• This exam must be your own work entirely. You cannot talk to or share information with anybody. You are not allowed to share materials or calculators.
• Cell phones must be turned off.
• You have 50 minutes to complete the exam.
Problem 1 (21 points total, 3 points each)
Choose one of the listed choices for each question (no explanation is needed) and put your answers in the table on page 4.
1. Suppose you have a sample x1, x2, ..., xn from a population. Which of the following has a different unit from the sample mean?
(A). population mean
(B). sample variance
(C). standard error
(D). single observation xi for i = 1, 2, . . . , n
2. In a paired t-test with sample sizes n1 = n2 = 20, you test H0: µ = 0 vs. Ha: µ < 0, and the t-statistic is 3.4. What is the p-value?
1 - pt(3.4, 20)
1 - pt(3.4, 19)
3. Suppose y is the response, x1 is a numerical predictor, and x2 is a categorical predictor with 2 levels. Which of the following R code will generate a parallel lines model?
(A). lm(y ~ 1)
(B). lm(y ~ x1)
(C). lm(y ~ x1 + x2)
(D). lm(y ~ x1 * x2)
4. In a paired t-test with H0: µ = 2, a 95% confidence interval for the mean of the differences is (-0.035, 0.057). The corresponding t-test (two-sided Ha) would:
(A). reject H0 at the α = 0.05 significance level.
(B). fail to reject H0 at the α=0.05 significance level.
(C). can’t tell without more information.
5. For what type of experiment can you make causal inferences, according to Chapter 1 of the textbook?
(A). Randomized experiment
(B). Observational experiment
6. In simple linear regression yi = β0 + β1 xi + εi, where i = 1, 2, ..., n, define the residual ei = yi − ŷi, where ŷi are the fitted values. Which of the following is correct?
(A). E(εi) = 0
(D). (A), (B), (C) are all wrong
7. In the following table, which cell corresponds to the type I Error?
Do not reject H0
H0 is true
Ha is true
Put your answers for multiple choice problems here (use capital letters):
Problem 2 (36 points, 4 points each)
Based on the following output of linear regression, answer questions 1 - 2:
lm(formula = taste ~ lactic + h2s, data = data)
            Estimate Std. Error t value Pr(>|t|)
(Intercept)               8.982  -3.072  0.00481 **
lactic                            2.499  0.01885 *
h2s                         (1)          0.00174 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 9.942 on 27 degrees of freedom
Multiple R-squared: 0.6517, Adjusted R-squared: 0.6259
F-statistic: 25.26 on 2 and 27 DF, p-value: 6.551e-07
1. Calculate (1) in the output.
Answer: SE(β̂2) = 1.136
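The columns of the coefficient table are tied together by t value = Estimate / Std. Error, which is the relationship behind (1). The sketch below checks this on the intercept row only, using the values that are visible in the output (Std. Error 8.982, t value -3.072, 27 residual degrees of freedom). The variable names are illustrative and not part of the exam.

```r
# Values read from the intercept row of the regression output
se_intercept <- 8.982    # Std. Error
t_intercept  <- -3.072   # t value
df_resid     <- 27       # residual degrees of freedom

# t value = Estimate / Std. Error, so Estimate = t value * Std. Error
est_intercept <- t_intercept * se_intercept

# Two-sided p-value for the t statistic; compare with Pr(>|t|) in the output
p_intercept <- 2 * pt(-abs(t_intercept), df_resid)

# Rearranged, the same identity recovers a standard error:
# Std. Error = Estimate / t value
se_from_ratio <- est_intercept / t_intercept

print(c(estimate = est_intercept, p_value = p_intercept, se = se_from_ratio))
```

The same rearrangement, Std. Error = Estimate / t value, applies to any other row of the table.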
2. What is the proportion of variation in taste that is NOT explained by the variables h2s and lactic?
Answer: 1 − 0.6517 = 0.3483. So 34.83% of the variation is not explained.
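The unexplained proportion also links the other summary lines of the output. The sketch below recomputes the adjusted R-squared and the F-statistic from R-squared alone. It assumes a sample size of n = 30, which is implied by 27 residual degrees of freedom plus 3 estimated coefficients. The variable names are illustrative.

```r
r2 <- 0.6517        # Multiple R-squared from the output
p  <- 2             # number of predictors (lactic, h2s)
df <- 27            # residual degrees of freedom
n  <- df + p + 1    # implied sample size (assumption: 30 observations)

# Proportion of variation not explained by the model
unexplained <- 1 - r2

# Adjusted R-squared penalizes the unexplained share by (n - 1) / df
adj_r2 <- 1 - unexplained * (n - 1) / df

# Overall F-statistic and its p-value on (p, df) degrees of freedom
f_stat <- (r2 / p) / (unexplained / df)
f_pval <- pf(f_stat, p, df, lower.tail = FALSE)

print(c(unexplained = unexplained, adj_r2 = adj_r2, F = f_stat, p = f_pval))
```

Up to rounding, the results should agree with the Adjusted R-squared and F-statistic lines in the output.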
3. Let x and y be two random variables. Under what condition does var(x − y) = var(x) + var(y) hold, where var stands for variance?
Answer: x and y are independent (more generally, uncorrelated, so that cov(x, y) = 0).
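The condition follows from the general identity var(x − y) = var(x) + var(y) − 2 cov(x, y): the cross term vanishes when x and y are independent. The sketch below illustrates this on simulated data in which y is deliberately correlated with x, so the simplified formula fails. The seed, sample size, and coefficient 0.5 are arbitrary choices for illustration.

```r
set.seed(1)                    # arbitrary seed, for reproducibility only
x <- rnorm(1000)
y <- 0.5 * x + rnorm(1000)     # y is correlated with x by construction

lhs   <- var(x - y)
rhs   <- var(x) + var(y) - 2 * cov(x, y)   # general identity
naive <- var(x) + var(y)                   # valid only when cov(x, y) = 0

print(c(lhs = lhs, rhs = rhs, naive = naive))
```

The identity holds exactly for sample variances and covariances, so lhs and rhs agree up to floating-point error, while naive overstates the variance because the covariance here is positive.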
4. What are the four model assumptions in linear regression?
Answer: 1) Linearity, 2) Normality, 3) Constant variance, 4) Independence
5. Under what conditions would you prefer a non-parametric test over the standard t-test?
Answer: The samples do not come from a normal distribution; the data are heavily skewed; some outliers are present.
6. If x1, x2, ..., xn follow N(µ, sd = σ), then what distribution does (x̄ − µ)/SE(x̄) follow? And what distribution does (x̄ − µ)/SD(x̄) follow?
Answer: 1) t-distribution with d.f. = n − 1; 2) standard normal distribution.
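A small simulation can make the difference between the two answers visible. The sketch below assumes the question concerns the centered sample mean divided by its estimated standard error s/√n in one case and by its true standard deviation σ/√n in the other. The values of n, mu, sigma, the seed, and the number of replications are arbitrary choices for illustration.

```r
set.seed(2013)   # arbitrary seed, for reproducibility only
n <- 5; mu <- 10; sigma <- 2; reps <- 20000   # illustrative values

sims <- replicate(reps, {
  x <- rnorm(n, mean = mu, sd = sigma)
  c(t = (mean(x) - mu) / (sd(x) / sqrt(n)),   # estimated SE in the denominator
    z = (mean(x) - mu) / (sigma / sqrt(n)))   # true SD of the mean in the denominator
})

# Share of simulated values beyond +/- 1.96
tail_t <- mean(abs(sims["t", ]) > 1.96)
tail_z <- mean(abs(sims["z", ]) > 1.96)

# Theoretical tail probabilities for comparison
theory_t <- 2 * pt(-1.96, df = n - 1)
theory_z <- 2 * pnorm(-1.96)

print(c(tail_t = tail_t, theory_t = theory_t, tail_z = tail_z, theory_z = theory_z))
```

With a small n the statistic based on the estimated standard error has visibly heavier tails, which is what the t-distribution with n − 1 degrees of freedom describes.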