The simple regression model (SRM) is a model for the association in the population between an explanatory variable X and a response Y. The SRM states that the conditional means of Y given X align on a line with intercept β0 and slope β1: µy|x = E(Y|X = x) = β0 + β1x
Deviation from the Mean

The deviations of the observed responses from the conditional means µy|x are called errors (ε). The equation for an error is ε = y - µy|x. Errors can be positive or negative, depending on whether the data lie above (positive) or below (negative) the conditional means. Because the errors are not observed, the SRM makes three assumptions about them:
* Independent. The error for one observation is independent of the error for any other observation.
* Equal variance. All errors have the same variance, Var(ε) = σε².
* Normal. The errors are normally distributed.

If these assumptions hold, then the collection of all possible errors forms a normal population with mean 0 and variance σε², abbreviated ε ~ N(0, σε²).
Simple Regression Model (SRM): observed values of the response Y are linearly related to values of the explanatory variable X by the equation y = β0 + β1x + ε, ε ~ N(0, σε²). The observations:

1. are independent of one another,
2. have equal variance σε² around the regression line, and
3. are normally distributed around the regression line.
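The model statement above can be illustrated with a small simulation. This is only a sketch: the parameter values (β0 = 2, β1 = 0.5, σε = 1), the x values, and the function names `simulate_srm` and `least_squares` are assumptions chosen for illustration and do not come from the text.

```python
import random
import statistics

def simulate_srm(beta0, beta1, sigma, xs, rng):
    """Draw one response per x from y = beta0 + beta1*x + eps, eps ~ N(0, sigma^2)."""
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

def least_squares(xs, ys):
    """Return (b0, b1), the least squares intercept and slope."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

rng = random.Random(0)                      # fixed seed so the run is repeatable
xs = [i / 10 for i in range(1000)]          # hypothetical x values, 0.0 to 99.9
ys = simulate_srm(2.0, 0.5, 1.0, xs, rng)   # hypothetical beta0, beta1, sigma
b0, b1 = least_squares(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"residual SD = {statistics.stdev(residuals):.3f}")
```

With independent, equal-variance, normal errors, the fitted intercept and slope should land near the chosen β0 and β1, and the residual standard deviation near σε.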
21.2 Conditions for the SRM (Simple Regression Model)
Instead of checking for random residual variation, we have three specific conditions.
Checklist for the simple regression model
* Is the association between y and x linear?
* Have we ruled out obvious lurking variables?
* Do the errors appear to be a sample from a normal population?
  * Are the errors evidently independent?
  * Are the variances of the residuals similar?
  * Are the residuals nearly normal?
21.3 INFERENCE IN REGRESSION
Confidence intervals and hypothesis tests work as in inferences for the mean of a population: * The 95% confidence intervals for...

...Chapter 4 Simple regression model Practice problems
Use Chapter 4 Powerpoint question 4.1 to answer the following questions:
1. Report the EViews output for the regression model.
Please write down your fitted regression model.
2. Is the sign of the estimated coefficient consistent with your expectation? Explain.
3. Hypothesize the sign of the coefficient and test your hypothesis at the 5% significance level using the t-table.
4. What percentage of the variation in the 30-year fixed mortgage rate is explained by this model? Why?
Use Chapter 4 Powerpoint question 4.2 to answer the following questions:
5. Report the EViews output for the regression model, based on the estimation period 1986.01 – 1999.07. Please write down your fitted regression model.
6. Is Trend correlated with USPI? Set up the hypothesis test at the 5% significance level.
7. What percentage of the variation in USPI is explained by this model? Why?
8. Based on your EViews model, report your forecast of USPI for the period 1999.08 – 2000.07. Report the RMSE.
Use Chapter 4 Powerpoint question 4.3 to answer the following questions:
9. Report the EViews output for the regression model USPIt = β0 + β1(USTBR)t + εt based on the estimation period 1986.01 – 1999.07. Please write down your...

...
Simple Linear Regression Model
1. The following data represent the number of flash drives sold per day at a local computer shop and their prices.
| Price (x) | Units Sold (y) |
| $34 | 3 |
| 36 | 4 |
| 32 | 6 |
| 35 | 5 |
| 30 | 9 |
| 38 | 2 |
| 40 | 1 |
a. Develop a scatter diagram for these data.
b. What does the scatter diagram indicate about the relationship between the two variables?
c. Develop the estimated regression equation and explain what the slope of the line indicates.
d. Compute the coefficient of determination and comment on the strength of the relationship between x and y.
e. Compute the sample correlation coefficient between the price and the number of flash drives sold.
f. Perform a t test and determine if the price and the number of flash drives sold are related. Let α = 0.01.
g. Perform an F test and determine if the price and the number of flash drives sold are related. Let α = 0.01.
ANS:
b. Negative linear relationship.
c. ŷ = 29.7857 - 0.7286x. The slope indicates that as the price goes up by $1, the number of units sold goes down by 0.7286 units.
d. r² = .8556; 85.56% of the variability in y is explained by the linear relationship between x and y.
e. rxy = -0.92; strong negative relationship.
f. t = -5.44 < -4.032 (df = 5); reject H0, and conclude x and y are related.
g. F = 29.624 > 16.26; reject H0, x...
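The answers to parts c through g can be checked with a short calculation. The following is a minimal sketch using the standard least squares formulas on the flash drive data. The variable names are illustrative, and the critical values 4.032 and 16.26 are the ones quoted in the answers, not computed here.

```python
import math

price = [34, 36, 32, 35, 30, 38, 40]   # x, dollars
units = [3, 4, 6, 5, 9, 2, 1]          # y, units sold per day

n = len(price)
x_bar = sum(price) / n
y_bar = sum(units) / n
ss_xx = sum((x - x_bar) ** 2 for x in price)
ss_yy = sum((y - y_bar) ** 2 for y in units)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(price, units))

b1 = ss_xy / ss_xx                     # slope
b0 = y_bar - b1 * x_bar                # intercept
r_squared = ss_xy ** 2 / (ss_xx * ss_yy)
r = math.copysign(math.sqrt(r_squared), b1)   # r takes the sign of the slope

sse = ss_yy - b1 * ss_xy               # residual sum of squares
mse = sse / (n - 2)                    # df = n - 2 = 5
t_stat = b1 / math.sqrt(mse / ss_xx)
f_stat = (ss_yy - sse) / mse           # equals t_stat ** 2 in simple regression

print(f"y-hat = {b0:.4f} + ({b1:.4f})x")
print(f"r^2 = {r_squared:.4f}, r = {r:.3f}")
print(f"t = {t_stat:.2f}, F = {f_stat:.3f}")
```

Because there is a single explanatory variable, the F statistic is the square of the t statistic, so the two tests in parts f and g always reach the same conclusion.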

...2013)
Introduction to Business Statistics II
2 / 47
Review: Inference for Regression
Example: Real Estate, Tampa Palms, Florida Goal: Predict sale price of residential property based on the appraised value of the property Data: sale price and total appraised value of 92 residential properties in Tampa Palms, Florida
[Figure: scatterplot of Sale Price (in Thousands of Dollars) against Appraised Value (in Thousands of Dollars), both axes running from 0 to 1000]
Review: Inference for Regression
We can describe the relationship between x and y using a simple linear regression model of the form µy = β0 + β1x
[Figure: scatterplot of Sale Price (in Thousands of Dollars) against Appraised Value (in Thousands of Dollars), both axes running from 0 to 1000]
response variable y: sale price
explanatory variable x: appraised value
relationship between x and y: linear, strong, positive
We can estimate the simple linear regression model using Least Squares (LS), yielding the following LS regression line: ŷ = 20.94 + 1.069x
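As a small illustration of how the fitted LS line is used, the sketch below evaluates it at a few appraised values. The function name `predict_sale_price` and the appraised values are hypothetical. Only the coefficients 20.94 and 1.069 come from the slide.

```python
def predict_sale_price(appraised_value):
    """Predicted sale price (thousands of dollars) from the fitted LS line."""
    return 20.94 + 1.069 * appraised_value

# Hypothetical appraised values, in thousands of dollars.
for value in (200, 300, 400):
    print(value, round(predict_sale_price(value), 2))
```

Each additional thousand dollars of appraised value raises the predicted sale price by 1.069 thousand dollars, which is the slope of the line.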
Stat 326 (Spring 2013)
Introduction to Business Statistics II
3 / 47
Review: Inference for Regression
Interpretation of estimated intercept b0 : corresponds to the predicted value...

...Regression Analysis: A Complete Example
This section works out an example that includes all the topics we have discussed so far in this chapter.
A random sample of eight drivers insured with a company and having similar auto insurance policies was selected. The following table lists their driving experiences (in years) and monthly auto insurance premiums.
| Driving Experience (years) | Monthly Auto Insurance Premium |
| 5 | $64 |
| 2 | 87 |
| 12 | 50 |
| 9 | 71 |
| 15 | 44 |
| 6 | 56 |
| 25 | 42 |
| 16 | 60 |
a. Does the insurance premium depend on the driving experience or does the driving experience depend on the insurance premium? Do you expect a positive or a negative relationship between these two variables?
b. Compute SSxx, SSyy, and SSxy.
c. Find the least squares regression line by choosing appropriate dependent and independent variables based on your answer in part a.
d. Interpret the meaning of the values of a and b calculated in part c.
e. Plot the scatter diagram and the regression line.
f. Calculate r and r² and explain what they mean.
g. Predict the monthly auto insurance premium for a driver with 10 years of driving experience.
h. Compute the standard deviation of errors.
i. Construct a 90% confidence interval for B.
j. Test at the 5% significance level whether B is negative.
k. Using α = .05, test whether ρ is different from zero.
Solution a. Based on theory and intuition, we...
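The numerical parts of this example (b, c, f, g, h, i, and the test statistic for j) can be sketched as follows. This is an illustrative calculation, not the textbook's worked solution. It takes the premium as the dependent variable, uses the shortcut formulas for SSxx, SSyy, and SSxy, and assumes the t-table value 1.943 (df = 6, .05 in each tail) for the 90% interval.

```python
import math

experience = [5, 2, 12, 9, 15, 6, 25, 16]    # x, years
premium = [64, 87, 50, 71, 44, 56, 42, 60]   # y, dollars per month

n = len(experience)
sum_x, sum_y = sum(experience), sum(premium)
ss_xx = sum(x * x for x in experience) - sum_x ** 2 / n
ss_yy = sum(y * y for y in premium) - sum_y ** 2 / n
ss_xy = sum(x * y for x, y in zip(experience, premium)) - sum_x * sum_y / n

b = ss_xy / ss_xx                        # slope
a = sum_y / n - b * sum_x / n            # intercept
r = ss_xy / math.sqrt(ss_xx * ss_yy)     # linear correlation coefficient
predicted = a + b * 10                   # premium at 10 years of experience

s_e = math.sqrt((ss_yy - b * ss_xy) / (n - 2))   # standard deviation of errors
s_b = s_e / math.sqrt(ss_xx)                     # standard deviation of b
T_CRIT_90 = 1.943                        # assumed t-table value, df = n - 2 = 6
ci = (b - T_CRIT_90 * s_b, b + T_CRIT_90 * s_b)
t_stat = b / s_b                         # test statistic for H0: B = 0

print(f"SSxx = {ss_xx}, SSyy = {ss_yy}, SSxy = {ss_xy}")
print(f"y-hat = {a:.4f} + ({b:.4f})x, r = {r:.2f}, r^2 = {r * r:.2f}")
print(f"prediction at x = 10: {predicted:.2f}, s_e = {s_e:.4f}")
print(f"90% CI for B: ({ci[0]:.2f}, {ci[1]:.2f}), t = {t_stat:.3f}")
```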

...CHAPTER 16
SIMPLE LINEAR REGRESSION AND CORRELATION
SECTIONS 1 - 2
MULTIPLE CHOICE QUESTIONS
In the following multiple-choice questions, please circle the correct answer.
1. The regression line ŷ = 3 + 2x has been fitted to the data points (4, 8), (2, 5), and (1, 2). The sum of the squared residuals will be:
a. 7
b. 15
c. 8
d. 22
ANSWER: d
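The arithmetic behind question 1 can be checked directly. A minimal sketch, with illustrative names: each residual is the observed y minus the fitted value 3 + 2x, and the squared residuals are then summed.

```python
points = [(4, 8), (2, 5), (1, 2)]

def fitted(x):
    """Fitted value from the regression line y-hat = 3 + 2x."""
    return 3 + 2 * x

residuals = [y - fitted(x) for x, y in points]   # observed minus fitted
sum_sq = sum(e ** 2 for e in residuals)

print(residuals, sum_sq)
```

The residuals are -3, -2, and -3, so the sum of squares is 9 + 4 + 9 = 22, matching answer d.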
2. If an estimated regression line has a y-intercept of 10 and a slope of 4, then when x = 2 the actual value of y is:
a. 18
b. 15
c. 14
d. unknown
ANSWER: d
3. Given the least squares regression line ŷ = 5 - 2x:
a. the relationship between x and y is positive
b. the relationship between x and y is negative
c. as x increases, so does y
d. as x decreases, so does y
ANSWER: b
4. A regression analysis between weight (y in pounds) and height (x in inches) resulted in the following least squares line: ŷ = 120 + 5x. This implies that if the height is increased by 1 inch, the weight, on average, is expected to:
a. increase by 1 pound
b. decrease by 1 pound
c. increase by 5 pounds
d. increase by 24 pounds
ANSWER: c
5. A regression analysis between sales (in $1000) and advertising (in $100) resulted in the following least squares line: ŷ = 75 + 6x. This implies that if advertising is $800, then the predicted...

...-------------------------------------------------
Simple regression and correlation
Submitted by Sohaib Roomi
Submitted to: Miss Tahreem
Roll No M12BBA014
Simple Regression and Correlation
Introduction
The term regression was introduced by the English biometrician Sir Francis Galton (1822-1911) to describe a phenomenon he observed when analyzing the heights of children and their parents. He found a tendency toward the average height of all men. Today, the word “regression” is used in quite a different sense. Its investigation depends upon two variables: the dependent variable and the independent variable.
Definition
“Regression provides an equation to be used for estimating the average value of the dependent variable from the known values of independent variable.”
Deterministic and Probabilistic Relation or Model
The relation among variables may or may not be governed by an exact physical law. For convenience, let us consider a set of n pairs of observations (Xi, Yi). If the relation between the variables is exactly linear, then the mathematical equation describing the linear relation is generally written as
Yi = a + bXi
Where a is the value of Y when X equals zero and is called Y-intercept and b indicates the change in Y for a one-unit change in X and is called the slope of the line. Substituting a value for X in the equation, we can completely...

...linear regression model, a model is created. This model contains an independent variable (X) and a dependent variable (Y), and the Monte Carlo simulation model is applied to it (Dougherty, 2002, p. 72).
With Monte Carlo simulation as the application, the values of α and β for the linear regression model are first chosen arbitrarily. Next, using EViews 5.0, values are drawn for the independent variable X and for the disturbance term for a fixed number of observations, here 1000 observations per simulation (T = 1000). The values of the dependent variable Y are then determined from these. A regression is run on the observations produced by these different models, that is, on the variables X and Y. To draw conclusions about the effect, a Breusch-Godfrey Serial Correlation LM Test is finally used. This whole process is repeated a number of times.
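The simulation procedure described above can be sketched in code. The original study used EViews 5.0, so what follows is a Python illustration under stated assumptions, not the authors' implementation: the disturbance follows an AR(1) process with coefficient `rho`, X and the innovations are standard normal, α and β are drawn uniformly from [-5, 5], and the Breusch-Godfrey test uses one lag with the pre-sample residual set to zero (a common convention). All function names are hypothetical.

```python
import math
import random

def ols(rows, y):
    """Least squares via the normal equations. Returns (coefficients, residuals)."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda row: abs(a[row][c]))
        a[c], a[p] = a[p], a[c]
        for row in range(k):
            if row != c:
                f = a[row][c] / a[c][c]
                a[row] = [u - f * v for u, v in zip(a[row], a[c])]
    coef = [a[i][k] / a[i][i] for i in range(k)]
    resid = [yi - sum(b * v for b, v in zip(coef, r)) for r, yi in zip(rows, y)]
    return coef, resid

def simulate(alpha, beta, rho, t_obs, rng):
    """Generate X and Y = alpha + beta*X + u, where u is an AR(1) disturbance."""
    x = [rng.gauss(0.0, 1.0) for _ in range(t_obs)]
    u, y = 0.0, []
    for xt in x:
        u = rho * u + rng.gauss(0.0, 1.0)
        y.append(alpha + beta * xt + u)
    return x, y

def breusch_godfrey_lag1(x, y):
    """LM statistic (T * R^2 of the auxiliary regression) and its chi-square(1) p-value."""
    _, e = ols([[1.0, xt] for xt in x], y)
    lagged = [0.0] + e[:-1]                      # pre-sample residual set to zero
    _, v = ols([[1.0, xt, lt] for xt, lt in zip(x, lagged)], e)
    e_bar = sum(e) / len(e)
    r2 = 1.0 - sum(vi ** 2 for vi in v) / sum((ei - e_bar) ** 2 for ei in e)
    lm = max(len(e) * r2, 0.0)
    return lm, math.erfc(math.sqrt(lm / 2.0))    # survival function of chi-square(1)

def rejection_rate(rho, reps=50, t_obs=1000, level=0.05, seed=1):
    """Share of simulations in which H0 (no autocorrelation) is rejected."""
    rng = random.Random(seed)
    alpha, beta = rng.uniform(-5, 5), rng.uniform(-5, 5)   # arbitrary true values
    hits = 0
    for _ in range(reps):
        x, y = simulate(alpha, beta, rho, t_obs, rng)
        _, p_value = breusch_godfrey_lag1(x, y)
        hits += p_value < level
    return hits / reps

print("rho = 0.0:", rejection_rate(0.0))
print("rho = 0.5:", rejection_rate(0.5))
```

Without autocorrelation (`rho = 0`), the rejection rate should stay close to the chosen significance level. With first-order autocorrelation it should rise sharply.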
2.3 Hypothesis
To answer the central question, to what extent first-order autocorrelation affects the linear regression model, hypotheses must be formulated. Formulating hypotheses precedes the process of developing and applying the Monte Carlo simulation model that was created.
There are two hypotheses: H0 and H1. H0 assumes that there is no autocorrelation. H1 assumes the opposite. H1 states that there is autocorrelation and that it does indeed affect the...

...1.
Q_{brand,t} = β0 + β1·P_{Minute Maid,t} + β2·P_{Tropicana,t} + β3·P_{Private label,t} + u_{brand,t}
Q: quantity; P: price
By running the above regression model for each brand, we got the following elasticity matrix and the figures for “V” and “C.” Note that we used the average price and quantity for P and Q to calculate each brand’s elasticity.
| Price Elasticity | Tropicana | Minute Maid | Private Label |
| Tropicana | -3.4620441 | 0.40596537 | 0.392997566 |
| Minute Maid | 1.8023329 | -4.26820251 | 0.765331803 |
| Private Label | 1.3138871 | 1.41197064 | -4.130754362 |
V_Tropicana = 0.40596537 + 0.392997566 = 0.7989629
C_Tropicana = 1.8023329 + 1.3138871 = 3.11621998
V_Minute Maid = 1.8023329 + 0.765331803 = 2.5676647
C_Minute Maid = 0.40596537 + 1.41197064 = 1.81793601
V_Private Label = 1.3138871 + 1.41197064 = 2.7258577
C_Private Label = 0.392997566 + 0.765331803 = 1.15832937
“V” measures the vulnerability of each brand to the price changes of the other two brands. “C,” on the other hand, measures the clout of each brand over the other brands. For example, the brand with the highest vulnerability is private label (2.73), which means private label is the most vulnerable to the other two brands’ price changes. If Tropicana and Minute Maid each cut their price by 1%, then the sales of private label will decrease by 2.73%. In contrast, the brand that has the highest clout is Tropicana (3.12), which means Tropicana is the most influential...
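The vulnerability and clout figures above are row and column sums of the off-diagonal cross-price elasticities. A minimal sketch of that bookkeeping, assuming the matrix layout of the table (row = brand whose quantity responds, column = brand whose price changes). The function names are illustrative.

```python
brands = ["Tropicana", "Minute Maid", "Private Label"]

# elasticity[i][j]: response of brand i's quantity to brand j's price.
elasticity = [
    [-3.4620441, 0.40596537, 0.392997566],
    [1.8023329, -4.26820251, 0.765331803],
    [1.3138871, 1.41197064, -4.130754362],
]

def vulnerability(i):
    """Sum of brand i's cross-price elasticities (row i without the diagonal)."""
    return sum(e for j, e in enumerate(elasticity[i]) if j != i)

def clout(j):
    """Sum of the other brands' elasticities to brand j's price (column j without the diagonal)."""
    return sum(row[j] for i, row in enumerate(elasticity) if i != j)

for k, name in enumerate(brands):
    print(f"{name}: V = {vulnerability(k):.4f}, C = {clout(k):.4f}")

most_vulnerable = max(range(len(brands)), key=vulnerability)
most_clout = max(range(len(brands)), key=clout)
print(brands[most_vulnerable], brands[most_clout])
```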