Team D will examine the relationship between wages and multiple variables. The question is whether wages depend on gender, occupation, industry, years of education, race, years of work experience, marital status, and union membership. We will use the techniques of linear regression and correlation. Regression analysis in this case should predict the value of the dependent variable (annual wages) from the independent variables (gender, occupation, industry, years of education, race, years of work experience, marital status, and union membership).
Regression Analysis
Based on our initial findings from MegaStat, we built the following regression model (coefficients are rounded to the nearest hundredth):
Global Test:
Ho: All regression coefficients for the variables in the population are zero
H1: Not all regression coefficients are zero
Significance level = 0.05
Decision rule: Reject Ho if p-value < 0.05
The p-value generated by the regression analysis (4.42×10⁻⁷) is less than 0.05, therefore we reject Ho and conclude that the regression model is a good fit.
Individual tests:
Ho: Regression coefficient for each variable is zero
H1: Regression coefficient for each variable is not zero
Significance level = 0.05
Decision rule: Reject Ho if p-value < 0.05
Because these are all t-tests, we can read the p-values of these tests from the regression output. The variables with p-values less than 0.05 have a significant impact on wages earned, while variables with p-values greater than 0.05 do not. According to the MegaStat output, the variables that significantly affect wages are education (p = 2.17×10⁻⁶), gender (p = 0.0001), and experience (p =...
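The decision rule above is mechanical once the p-values are in hand. A minimal sketch in Python, using the two p-values reported above plus two hypothetical placeholders (the union and race p-values are assumptions, not the actual MegaStat output):

```python
# Apply the individual-test decision rule: reject Ho when p < alpha.
ALPHA = 0.05

p_values = {
    "education": 2.17e-06,  # from the MegaStat output above
    "gender": 0.0001,       # from the MegaStat output above
    "union": 0.23,          # hypothetical placeholder
    "race": 0.41,           # hypothetical placeholder
}

significant = {var: p for var, p in p_values.items() if p < ALPHA}
not_significant = {var: p for var, p in p_values.items() if p >= ALPHA}

print("Reject Ho (significant):", sorted(significant))
print("Fail to reject Ho:", sorted(not_significant))
```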
...referring to the recent boom in house prices in many developed countries following a sharp bust in 2008. Researchers and policy makers alike have realized that housing has a significant influence on the business cycle. This paper tries to identify the determinants of the selling price of houses in Oregon. The data set used in this paper was retrieved from the case study titled “Housing Price” (Case #27, Practical Data Analysis: Case Studies in Business Statistics, Marlene A. Smith & Peter G. Bryant).
The most important step in determining the selling prices of houses is to know the features that drive them. People tend to be more interested in houses with multiple bedrooms/bathrooms, a fireplace, a garage for multiple cars, and a good location. A house that meets these requirements tends to be priced higher, while a house lacking these features is priced lower. In the survey conducted by Marlene A. Smith & Peter G. Bryant for their case study titled “Housing Price” (Case #27, Practical Data Analysis: Case Studies in Business Statistics), 10 variables were selected to assess their impact on housing price. A sample of 108 houses was selected from East Ville, Oregon, along with their characteristics on the 10 selected variables. The variable set for the study is:
Selling Price of House
Area
No. of Bedrooms
No. of Bathrooms...
...1. If the correlation coefficient between the variables is 0, it means that the two variables aren’t linearly related. – TRUE
2. In a simple regression analysis, the error terms are assumed to be independent and normally distributed with zero mean and constant variance. – TRUE
3. The difference between the actual Y-value and the predicted Y-value found using a regression equation is called the residual (ε). – TRUE
4. In a multiple regression analysis with N observations and k independent variables, the degrees of freedom for the residual error is given by (N−k). – FALSE (correct answer: N−k−1)
5. From the following scatter plot, we can say that between y & x there is _______. – Negative correlation
6. According to the graph, X & Y have ________. – Virtually no correlation
7. A cost accountant is developing a regression model to predict the total cost of producing a batch of printed circuit boards as a function of batch size (the number of boards produced in one lot or batch). The explanatory variable is called the _______. – Independent variable
8. In the regression equation, y = 75.65 + 0.50x, the intercept is ______. – 75.65
9. The assumptions underlying simple regression analysis include ______. – The error terms are independent
10. The proportion of variability of the dependent...
...MULTIPLE REGRESSION
After completing this chapter, you should be able to:
understand model building using multiple regression analysis
apply multiple regression analysis to business decision-making situations
analyze and interpret the computer output for a multiple regression model
test the significance of the independent variables in a multiple regression model
use variable transformations to model nonlinear relationships
recognize potential problems in multiple regression analysis and take steps to correct them
incorporate qualitative variables into the regression model by using dummy variables
Multiple Regression Assumptions
The errors are normally distributed
The mean of the errors is zero
Errors have a constant variance
The model errors are independent
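The assumptions listed above can be checked informally once a model is fitted. A hedged sketch of two quick residual diagnostics (zero-mean errors, constant variance), using made-up residuals rather than output from any model in this document:

```python
import statistics

# Made-up residuals for illustration only.
residuals = [0.8, -1.2, 0.3, -0.4, 1.1, -0.6, 0.2, -0.2]

# Check 1: the mean of the residuals should be close to zero.
mean_resid = statistics.fmean(residuals)

# Check 2 (crude constant-variance check): compare the variance in
# the first and second halves of the ordered residuals.
half = len(residuals) // 2
var_first = statistics.variance(residuals[:half])
var_second = statistics.variance(residuals[half:])

print(f"mean of residuals: {mean_resid:.3f}")
print(f"variance ratio (2nd/1st half): {var_second / var_first:.2f}")
```

In practice a residual plot, or formal tests such as Breusch–Pagan, would replace the split-half comparison; this sketch only makes the assumptions concrete.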
Model Specification
Decide what you want to do and select the dependent variable
Determine the potential independent variables for your model
Gather sample data (observations) for all variables
The Correlation Matrix
Correlation between the dependent variable and selected independent variables can be found using Excel:
Tools / Data Analysis… /...
...Chapter3
Multiple Regression Analysis: Estimation
A key drawback of simple linear regression (SLR) is the assumption that all other factors affecting y are unrelated to x, which is unrealistic.
Multiple regression allows us to control for many other factors that explain the dependent variable, which is useful both for testing economic theories and for drawing ceteris paribus conclusions.
In addition, MR can incorporate fairly general functional forms and build better models for predicting the regressand.
Econometrics_buaa_Phd, Ma
3.1 Motivation for multiple regression
3.2* Mechanics and interpretation of multiple regression
3.3 The expected value of the OLS estimators
3.4** Variance of the OLS estimators
3.5 The Gauss–Markov theorem
3.1 Motivation for multiple regression
1. Taking exper out of u and putting it explicitly in the equation:
wage = β0 + β1 educ + β2 exper + u
Now we can hold exper fixed to evaluate the ceteris paribus effect of educ on wage, rather than assuming exper is unrelated to educ.
2. MR is useful for generalizing the functional relationship between variables, for example (a quadratic function):
cons = β0 + β1 inc + β2 inc² + u
Note: we cannot hold inc² fixed while inc changes.
The marginal propensity to consume depends on β1 as well as β2 and the level of income.
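This point can be made concrete: the marginal effect of income on consumption in the quadratic model is d(cons)/d(inc) = β1 + 2β2·inc, which varies with the income level. A quick sketch with made-up coefficients (the values of b0, b1, b2 are assumptions for illustration):

```python
# Hypothetical coefficients for cons = b0 + b1*inc + b2*inc^2.
b0, b1, b2 = 100.0, 0.80, -0.0005


def marginal_propensity(inc: float) -> float:
    """d(cons)/d(inc) = b1 + 2*b2*inc: depends on the income level."""
    return b1 + 2 * b2 * inc


print(marginal_propensity(100))  # MPC at a low income level
print(marginal_propensity(500))  # MPC at a higher income level
```

With a negative b2, the marginal propensity to consume declines as income rises, which is why inc and inc² cannot be varied independently.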
3. The definition of the independent variable is crucial...
...Multiple Regression Analysis of the exchange rate with its determinant factors
Regression Analysis: USD versus GDP Growth, FER, FDI Growth, Interest Rate, Money Supply, Terms of Trade
The regression equation is
USD = 41.5 − 1.95 GDP Growth + 0.000943 FER − 0.139 FDI Growth + 0.048 Differential Interest Rate + 0.000067 Money Supply + 0.166 Terms of Trade − 0.000097 External Debt
Predictor               T      P
Constant                2.32   0.039
GDP Growth              3.43   0.005
FER                     1.01   0.332
FDI Growth              1.55   0.146
Differential Int Rate   0.11   0.913
Money Supply            0.89   0.393
Terms of Trade          0.35   0.731
External Debt           0.73   0.479
Where,
T is the t-statistic, a measure of the relative strength of prediction (more reliable than the regression coefficient alone because it takes the standard error into account).
The p-value is a probability. It tells you how likely it is that the coefficient for that independent variable emerged by chance and does not describe a real relationship.
A p-value of .05 means that there is a 5% chance that the relationship emerged randomly and a 95% chance that...
...Multiple regression: the OLS method
(Mostly from Maddala)
The Ordinary Least Squares method of estimation can easily be extended to models involving two or more explanatory variables, though the algebra becomes progressively more complex. In fact, when dealing with the general regression problem with a large number of variables, we use matrix algebra, but that is beyond the scope of this course.
We illustrate the case of two explanatory variables, X1 and X2, with Y the dependent variable. We therefore have the model

Yi = α + β1 X1i + β2 X2i + ui

where ui ~ N(0, σ²).

We look for estimators that minimise the sum of squared errors,

S = Σ (Yi − α − β1 X1i − β2 X2i)²

Differentiating, and setting the partial derivatives to zero, we get

∂S/∂α = −2 Σ (Yi − α − β1 X1i − β2 X2i) = 0    (1)
∂S/∂β1 = −2 Σ X1i (Yi − α − β1 X1i − β2 X2i) = 0    (2)
∂S/∂β2 = −2 Σ X2i (Yi − α − β1 X1i − β2 X2i) = 0    (3)

These three equations are called the “normal equations”. They can be simplified as follows. Equation (1) can be written as

Σ Yi = nα + β1 Σ X1i + β2 Σ X2i

or

α̂ = Ȳ − β̂1 X̄1 − β̂2 X̄2    (4)

where the bar over Y, X1 and X2 indicates the sample mean. Equation (2) can be written as

Σ X1i Yi = α Σ X1i + β1 Σ X1i² + β2 Σ X1i X2i

Substituting in the value of α̂ from (4), we get

Σ (X1i − X̄1)(Yi − Ȳ) = β1 Σ (X1i − X̄1)² + β2 Σ (X1i − X̄1)(X2i − X̄2)    (5)

A similar equation results from (3) and (4). We can simplify these equations using the following notation. Let us define:

S11 = Σ (X1i − X̄1)²
S22 = Σ (X2i − X̄2)²
S12 = Σ (X1i − X̄1)(X2i − X̄2)
S1Y = Σ (X1i − X̄1)(Yi − Ȳ)
S2Y = Σ (X2i − X̄2)(Yi − Ȳ)

Equation (5) can then be written

S1Y = β1 S11 + β2 S12    (6)

Similarly, equation (3) becomes

S2Y = β1 S12 + β2 S22    (7)

We can solve these two equations to get

β̂1 = (S1Y S22 − S2Y S12) / Δ   and   β̂2 = (S2Y S11 − S1Y S12) / Δ

where Δ = S11 S22 − S12². We may therefore obtain α̂ from equation (4).

We can...
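A minimal numerical sketch of these closed-form solutions, using made-up data (any multiple-regression routine would give the same answer on the same data):

```python
# Two-regressor OLS via the centered sums S11, S22, S12, S1Y, S2Y
# defined in the text. The data are made-up for illustration.
y = [5.0, 7.0, 8.0, 11.0, 14.0]
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]

n = len(y)
ybar = sum(y) / n
x1bar = sum(x1) / n
x2bar = sum(x2) / n


def cross(a, abar, b, bbar):
    """Centered cross-product sum: sum of (a_i - abar)(b_i - bbar)."""
    return sum((ai - abar) * (bi - bbar) for ai, bi in zip(a, b))


S11 = cross(x1, x1bar, x1, x1bar)
S22 = cross(x2, x2bar, x2, x2bar)
S12 = cross(x1, x1bar, x2, x2bar)
S1Y = cross(x1, x1bar, y, ybar)
S2Y = cross(x2, x2bar, y, ybar)

delta = S11 * S22 - S12**2
beta1 = (S1Y * S22 - S2Y * S12) / delta  # beta1-hat
beta2 = (S2Y * S11 - S1Y * S12) / delta  # beta2-hat
alpha = ybar - beta1 * x1bar - beta2 * x2bar  # alpha-hat from (4)

print(alpha, beta1, beta2)
```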
...Regression Analysis (Tom’s Used Mustangs)
Irving Campus
GM 533: Applied Managerial Statistics
04/19/2012
Memo
To:
From:
Date: April 19, 2012
Re: Statistical analysis of price setting
Various hypothesis tests were compared, along with several multiple regressions, in order to identify the factors that influence the selling price of Ford Mustangs. The data being used contain observations on 35 used Mustangs and 10 different characteristics.
The hypothesis test that price depends on whether the car is a convertible outperformed the other hypothesis tests conducted: of the tests performed, the convertible variable produced the smallest p-value.
The data used in this regression analysis to find a suitable equation model for the relationship between price, age, and mileage come from the Bryant/Smith Case 7, Tom’s Used Mustangs. As described in the case, used-car asking prices are determined largely by Tom’s gut feeling.
The most effective hypothesis test exhibiting a relationship with the mean price is whether the car is a convertible. The regression analysis was conducted to see if there is any relationship between price and mileage, color, owner, age, and GT. After running several models with different...
...associated with a β1 change in Y.
(iii) The interpretation of the slope coefficient in the model ln(Yi ) = β0 + β1 ln(Xi ) + ui is as
follows:
(a) a 1% change in X is associated with a β1 % change in Y.
(b) a change in X by one unit is associated with a β1 change in Y.
(c) a change in X by one unit is associated with a 100β1 % change in Y.
(d) a 1% change in X is associated with a change in Y of 0.01β1 .
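The log-log interpretation in (iii)(a) can be checked numerically: in ln(Y) = β0 + β1 ln(X), a 1% increase in X changes Y by approximately β1 percent. A sketch with hypothetical coefficients:

```python
import math

# Hypothetical coefficients for ln(Y) = b0 + b1*ln(X).
b0, b1 = 0.5, 0.8


def y(x: float) -> float:
    """Y implied by the log-log model."""
    return math.exp(b0 + b1 * math.log(x))


# Percentage change in Y from a 1% increase in X.
x0 = 50.0
pct_change_y = (y(x0 * 1.01) - y(x0)) / y(x0) * 100
print(f"{pct_change_y:.3f}% change in Y for a 1% change in X")
```

The printed change is close to b1 = 0.8 percent, matching the elasticity interpretation; the approximation is exact only in the limit of infinitesimal changes.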
(iv) To decide whether Yi = β0 + β1 X + ui or ln(Yi ) = β0 + β1 X + ui fits the data better, you
cannot consult the regression R2 because
(a) ln(Y) may be negative for 0 < Y < 1.
(b) the TSS are not measured in the same units between the two models.
(c) the slope no longer indicates the effect of a unit change of X on Y in the log-linear
model.
(d) the regression R2 can be greater than one in the second model.
(v) The exponential function
(a) is the inverse of the natural logarithm function.
(b) does not play an important role in modeling nonlinear regression functions in econometrics.
(c) can be written as exp(e^x).
(d) is e^x, where e is 3.1415...
(vi) The following are properties of the logarithm function with the exception of
(a) ln(1/x) = −ln(x).
(b) ln(a + x) = ln(a) + ln(x).
(c) ln(ax) = ln(a) + ln(x).
(d) ln(x^a) = a·ln(x).
(vii) In the log-log model, the slope coefficient indicates
(a) the effect that a unit change in X has on Y.
(b) the elasticity of Y with respect to X.
(c) ∆Y/∆X.
(d) (∆Y/∆X) × (Y/X).
(viii) In the...