Seth Hill
Professor Gwinn
Econometrics
March 3, 2011
Unemployment Rate and Total New Houses Sold

For decades, owning a home has been touted as the very heart of "the American Dream", but today that dream is out of reach for an increasing number of Americans. Why? It is because there are not nearly enough jobs for everyone. Without a jobs recovery, there simply is not going to be a housing recovery. In this report, I will perform a regression analysis to determine the effect of the Unemployment Rate (UR) on Total New Houses Sold (TNHS). I expect that there will be a negative relationship between the two variables. In other words, as the unemployment rate increases, the total number of new houses sold will decrease.

The simple functional form of the model is TNHS = f(UR), where TNHS (measured in thousands) is the dependent variable and UR (the unemployment rate for persons 16 years and over) is the explanatory variable. To determine the relationship between the two variables, one must set up the Population Regression Function (PRF). The PRF represents the regression line of the population as a whole. The deterministic PRF for the model is E(TNHSt|URt) = B1 + B2URt, where B1 and B2 are population parameters. B1 is the intercept coefficient and represents the expected value of TNHS when UR is zero. B2 is the slope coefficient and represents the change in the expected value of TNHS for a one-point change in UR. In regression analysis, the population regression function is estimated on the basis of the sample regression function (SRF). That is, the SRF is an estimator of the PRF. The deterministic SRF in this case is Estimated TNHSt = b1 + b2URt. In this function, b1 and b2 are estimators of B1 and B2 in the PRF. The PRF and SRF in their stochastic forms are:
PRF: TNHSt = B1 + B2URt + Ut
SRF: TNHSt = b1 + b2URt + et
In the PRF, Ut is the population error term. The population error term is a random variable that cannot be explained by the PRF. This term represents the difference between the actual value of TNHS and the value predicted by the regression equation. In other words, the error term accounts for variables that affect TNHS...
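The distinction between the PRF and the SRF can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population parameters (B1 = 900, B2 = -50), the error standard deviation, and the range of UR are hypothetical values chosen for the example and are not estimated from the actual TNHS or UR data. Each simulated sample produces its own SRF, and the estimates b1 and b2 scatter around the assumed population values.

```python
import random

# Hypothetical population parameters (assumed for illustration only;
# they are not estimated from the real TNHS/UR data).
B1, B2 = 900.0, -50.0
SIGMA_U = 40.0  # assumed standard deviation of the population error U_t


def ols(x, y):
    """Return (b1, b2): OLS intercept and slope for y = b1 + b2*x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    return b1, b2


def draw_sample(rng, n=60):
    """Simulate n observations from the stochastic PRF."""
    ur = [rng.uniform(4.0, 10.0) for _ in range(n)]
    tnhs = [B1 + B2 * u + rng.gauss(0.0, SIGMA_U) for u in ur]
    return ur, tnhs


rng = random.Random(0)
# Each sample gives a different SRF, i.e. a different (b1, b2) pair.
estimates = [ols(*draw_sample(rng)) for _ in range(200)]
mean_b1 = sum(b1 for b1, _ in estimates) / len(estimates)
mean_b2 = sum(b2 for _, b2 in estimates) / len(estimates)
print(f"average b1 = {mean_b1:.1f}, average b2 = {mean_b2:.2f}")
```

No single sample recovers B1 and B2 exactly, which is why the residual et in the SRF differs from the population error Ut in the PRF.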

...new house or automobile is very much affected by the interest rates charged by banks.
Regression analysis is one such causal method. It is not limited to locating the straight line of best fit.
Types:-
1. Simple (or Bivariate) Regression Analysis:
Deals with a single independent variable that determines the value of a dependent variable.
Ft+1 = f(xt), where Ft+1 is the forecast for the next period.
This indicates that future demand is a function of the value of the economic indicator at the present time.
Demand Function: D=a+bP, where b is negative.
If we assume there is a linear relation between D and P, there may also be some random variation in this relation.
Sum of Squared Errors (SSE): This is a measure of predictive accuracy. The smaller the value of SSE, the more accurate the regression equation.
EXAMPLE:-
The following data on the demand for sewing machines manufactured by Taylor and Son Co. have been compiled for the past 10 years.
YEAR | 1971 | 1972 | 1973 | 1974 | 1975 | 1976 | 1977 | 1978 | 1979 | 1980 |
DEMAND (in 1000 Units) | 58 | 65 | 73 | 76 | 78 | 87 | 88 | 93 | 99 | 106 |
1. Single variable linear regression
Year = x where x = 1, 2, 3... 10
Demand = y
D = y + ε, where D is the actual demand and y is the demand given by the fitted line
ε = D − y
To confirm that this is the line of best fit, one must make sure that the sum of the squared errors is at its minimum.
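The single-variable fit described above can be sketched in a few lines of Python using the ten demand figures from the table. This is a minimal sketch that assumes the ordinary least-squares formulas for the slope and intercept; the names a, b, fitted, and sse are introduced here for illustration.

```python
# Demand for sewing machines (in 1000 units), 1971-1980, coded as x = 1..10.
x = list(range(1, 11))
demand = [58, 65, 73, 76, 78, 87, 88, 93, 99, 106]

n = len(x)
x_bar = sum(x) / n
d_bar = sum(demand) / n

# Least-squares slope b and intercept a for the fitted line y = a + b*x.
b = sum((xi - x_bar) * (di - d_bar) for xi, di in zip(x, demand)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
a = d_bar - b * x_bar

# Errors (actual minus fitted demand) and the sum of squared errors.
fitted = [a + b * xi for xi in x]
errors = [di - yi for di, yi in zip(demand, fitted)]
sse = sum(e ** 2 for e in errors)

print(f"y = {a:.2f} + {b:.2f}x, SSE = {sse:.2f}")
```

For these data the fitted line is approximately y = 55.13 + 4.94x with an SSE of about 31.3. Any other straight line through the same points gives a larger SSE.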
2. Nonlinear...

...Economics 141 (Intro to Econometrics) Professor Yang
Spring 2001
Answers to Midterm Test No. 1
1. Consider a regression model relating Y (the dependent variable) to X (the independent variable): Yi = β0 + β1Xi + εi, where εi is the stochastic or error term. Suppose that the estimated regression equation is stated as Ŷi = β̂0 + β̂1Xi and ei is the residual error term.
A. What is ei? Define it precisely. Explain how it is related to εi.
ei is the residual error term in the sample regression function and is defined as ei = Yi − Ŷi.
ei is the sample estimate of εi, the error term of the population regression function.
B. What is εi? Define it precisely. What are the four reasons for the inclusion of this error term in the population regression function (model)?
εi is the stochastic term in the population regression function. The four reasons for its existence are: 1. Omitted variables 2. Measurement error 3. A different functional form 4. Purely random variation in human behavior.
C. Draw a graph where you can clearly show E(Yi|Xi) = β0 + β1Xi and Ŷi = β̂0 + β̂1Xi. Show also in your graph ε6 and e6 for X6. This graph will show the true and estimated regression lines together with their respective error terms.
See Figure 2.1 on pages 18 (& 39) of the textbook for the graph.
D....

...associated with a β1 change in Y.
(iii) The interpretation of the slope coefficient in the model ln(Yi) = β0 + β1 ln(Xi) + ui is as follows:
(a) a 1% change in X is associated with a β1% change in Y.
(b) a change in X by one unit is associated with a β1 change in Y.
(c) a change in X by one unit is associated with a 100β1% change in Y.
(d) a 1% change in X is associated with a change in Y of 0.01β1.
(iv) To decide whether Yi = β0 + β1X + ui or ln(Yi) = β0 + β1X + ui fits the data better, you cannot consult the regression R² because
(a) ln(Y) may be negative for 0 < Y < 1.
(b) the TSS are not measured in the same units between the two models.
(c) the slope no longer indicates the effect of a unit change of X on Y in the log-linear model.
(d) the regression R² can be greater than one in the second model.
(v) The exponential function
(a) is the inverse of the natural logarithm function.
(b) does not play an important role in modeling nonlinear regression functions in econometrics.
(c) can be written as exp(e^x).
(d) is e^x, where e is 3.1415...
(vi) The following are properties of the logarithm function with the exception of
(a) ln(1/x) = −ln(x).
(b) ln(a + x) = ln(a) + ln(x).
(c) ln(ax) = ln(a) + ln(x).
(d) ln(x^a) = a ln(x).
(vii) In the log-log model, the slope coefficient indicates
(a) the effect that a unit change in X has on Y.
(b) the elasticity of Y with respect to X.
(c) ∆Y/∆X.
(d) (∆Y/∆X) × (Y/X)
(viii) In the...

...Regression Analysis Exercises
1- A farmer wanted to find the relationship between the amount of fertilizer used and the yield of corn. He selected seven acres of his land on which he used different amounts of fertilizer to grow corn. The following table gives the amount (in pounds) of fertilizer used and the yield (in bushels) of corn for each of the seven acres.
|Fertilizer Used |Yield of Corn |
|120 |138 |
|80 |112 |
|100 |129 |
|70 |96 |
|88 |119 |
|75 |104 |
|110 |134 |
a. With the amount of fertilizer used as an independent variable and yield of corn as a...
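The rest of the exercise is not shown here, so the following is only a sketch of one plausible approach: it assumes the task is to fit a least-squares line with fertilizer as the independent variable and yield as the dependent variable, and to measure the strength of the linear relationship. The variable names are introduced for illustration.

```python
import math

# Data from the exercise: fertilizer used (pounds) and corn yield (bushels).
fertilizer = [120, 80, 100, 70, 88, 75, 110]
corn_yield = [138, 112, 129, 96, 119, 104, 134]

n = len(fertilizer)
x_bar = sum(fertilizer) / n
y_bar = sum(corn_yield) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - x_bar) ** 2 for x in fertilizer)
syy = sum((y - y_bar) ** 2 for y in corn_yield)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(fertilizer, corn_yield))

# Least-squares line: yield = intercept + slope * fertilizer.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Pearson correlation coefficient between fertilizer and yield.
r = sxy / math.sqrt(sxx * syy)

print(f"yield = {intercept:.2f} + {slope:.4f} * fertilizer, r = {r:.3f}")
```

The slope is read as the estimated change in yield (bushels) per additional pound of fertilizer, within the range of fertilizer amounts observed.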

...
Mortality Rates
Regression Analysis of Multiple Variables
Neil Bhatt
993569302
Sta 108 P. Burman
11 total pages
The question posed in this experiment is whether pollution has an impact on the mortality rate. The data come from 60 cities (n = 60). The response variable is Y = mortality rate per 100,000 population, and the explanatory variables are education, the percentage of the population that is nonwhite, the percentage of the population that is deemed poor, precipitation, the amount of sulfur dioxide, and the amount of nitrogen dioxide.
Data:
60 Standard Metropolitan Statistical Areas (SMSAs) in the United States, with data obtained for the years 1959-1961. [Source: GC McDonald and JS Ayers, “Some applications of the ‘Chernoff Faces’: a technique for graphically representing multivariate data”, in Graphical Representation of Multivariate Data, Academic Press, 1978.]
Using the data, we can construct a scatter plot matrix to check visually whether a correlation seems to exist before doing any calculations.
Data Distribution:
Scatter Plot Matrix
As one can observe, the data cluster in a way that suggests a correlation between Y = mortality rate and the X variables that potentially influence Y.
From this we construct a correlation matrix in order to see a relationship in matrix form....

...Regression Analysis (Tom’s Used Mustangs)
Irving Campus
GM 533: Applied Managerial Statistics
04/19/2012
Memo
To:
From:
Date: April 19th, 2012
Re: Statistical Analysis of Price Setting
Several hypothesis tests and multiple regressions were compared in order to identify the factors that influence the selling price of Ford Mustangs. The data set contains observations on 35 used Mustangs and 10 different characteristics.
The hypothesis test of whether price depends on the car being a convertible outperformed the other hypothesis tests conducted. The test with the smallest p-value was preferred, and the convertible test had the smallest p-value.
The data used in this regression analysis to find the proper equation for the relationship between price, age, and mileage are from the Bryant/Smith Case 7, Tom’s Used Mustangs. As described in the case, Tom sets his asking prices for used cars largely by gut feeling.
The hypothesis test that most clearly exhibits a relationship with the mean price is whether the car is a convertible. The regression analysis is conducted to see whether there is any relationship between price and mileage, color, owner, age, and GT. After running several models with different independent...

...REGRESSION ANALYSIS
Correlation only indicates the degree and direction of the relationship between two variables. It does not necessarily imply a cause-and-effect relationship. Even when there are grounds to believe that a causal relationship exists, correlation does not tell us which variable is the cause and which is the effect. For example, the demand for a commodity and its price will generally be found to be correlated, but correlation will not answer the question of whether demand depends on price or vice versa.
The dictionary meaning of ‘regression’ is the act of returning or going back. The term ‘regression’ was first used by Francis Galton in 1877 while studying the relationship between the heights of fathers and sons.
“Regression is the measure of the average relationship between two or more variables in terms of the original units of data.”
The line of regression is the line that gives the best estimate of the values of one variable for any specific values of the other variable.
In regression analysis with two variables, there are two regression lines. One is the regression of x on y, and the other is the regression of y on x.
These two regression lines show the average relationship between the two variables. The regression line of y on x gives the most probable...
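The idea of two distinct regression lines can be made concrete with a short sketch. The five paired observations below are hypothetical and serve only as an illustration. The code fits the regression of y on x and the regression of x on y by least squares, then checks the standard result that the product of the two slopes equals the squared correlation coefficient.

```python
import math

# Hypothetical paired observations, used only to illustrate the idea.
x = [2.0, 4.0, 6.0, 8.0, 10.0]
y = [3.0, 7.0, 5.0, 10.0, 11.0]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Regression of y on x: y = a_yx + b_yx * x (minimizes vertical errors).
b_yx = sxy / sxx
a_yx = y_bar - b_yx * x_bar

# Regression of x on y: x = a_xy + b_xy * y (minimizes horizontal errors).
b_xy = sxy / syy
a_xy = x_bar - b_xy * y_bar

r = sxy / math.sqrt(sxx * syy)
print(f"y on x: y = {a_yx:.3f} + {b_yx:.3f}x")
print(f"x on y: x = {a_xy:.3f} + {b_xy:.3f}y")
print(f"b_yx * b_xy = {b_yx * b_xy:.4f}, r^2 = {r ** 2:.4f}")
```

Both lines pass through the point of means (x̄, ȳ), and they coincide only when the correlation is perfect (r = ±1).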

...The simple regression model (SRM) is a model for the association in the population between an explanatory variable X and a response Y. The SRM states that the conditional means of the response align on a line with intercept β0 and slope β1: µy|x = E(Y|X = x) = β0 + β1x
Deviation from the Mean
The deviations of observed responses around the conditional means µy|x are called errors (ε). The equation for the error is ε = y − µy|x
Errors can be positive or negative, depending on whether data lie above (positive) or below (negative) the conditional means. Because the errors are not observed, the SRM makes three assumptions about them:
* Independent. The error for one observation is independent of the error for any other observation.
* Equal variance. All errors have the same variance, Var(ε) = σε².
* Normal. The errors are normally distributed.
If these assumptions hold, then the collection of all possible errors forms a normal population with mean 0 and variance σε², abbreviated ε ~ N(0, σε²). In the Simple Regression Model (SRM), observed values of the response Y are linearly related to values of the explanatory variable X by the equation: y = β0 + β1x + ε, ε ~ N(0, σε²)
The observations:
1. are independent of one another,
2. have equal variance σε² around the regression line, and
3. are normally distributed around the regression line.
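The SRM can be explored by simulating data that satisfy its assumptions and then fitting a line. The sketch below assumes hypothetical "true" values β0 = 2, β1 = 0.5, and σε = 1, none of which come from the text. It generates independent, equal-variance, normal errors, fits the line by least squares, and uses the residuals to estimate σε.

```python
import math
import random

# Assumed "true" SRM parameters, chosen only for this illustration.
BETA0, BETA1, SIGMA = 2.0, 0.5, 1.0

rng = random.Random(42)
n = 500
x = [rng.uniform(0.0, 10.0) for _ in range(n)]
# Independent, equal-variance, normal errors, as the SRM assumes.
y = [BETA0 + BETA1 * xi + rng.gauss(0.0, SIGMA) for xi in x]

# Least-squares estimates of the intercept (b0) and slope (b1).
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residuals stand in for the unobserved errors.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
# Residual standard error: divide by n - 2 because two parameters were estimated.
s_e = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, s_e = {s_e:.3f}")
```

With real data the errors are never observed, so the residuals are the only available evidence for checking the three assumptions.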
21.2 Conditions for the SRM ( Simple...