                INCOME      PRPBLCK
Mean            47053.78    0.113486
Median          46272.00    0.041444
Maximum         136529.0    0.981658
Minimum         15919.00    0.000000
Std. Dev.       13179.29    0.182416
Skewness        0.962831    2.700012
Kurtosis        7.551386    10.56841
Jarque-Bera     416.2135    1473.100
Probability     0.000000    0.000000
Sum             19244998    46.41594
Sum Sq. Dev.    7.09E+10    13.57651
Observations    409         409
The average of prpblck is .113 with standard deviation .182; the average of income is 47,053.78 with standard deviation 13,179.29. It is evident that prpblck is a proportion and that income is measured in dollars.
(ii)
Dependent Variable: PSODA
Method: Least Squares
Sample: 1 410
Included observations: 401
Excluded observations: 9
Variable    Coefficient   Std. Error   t-Statistic   Prob.
PRPBLCK     0.114988      0.026001     4.422515      0.0000
INCOME      1.60E-06      3.62E-07     4.430130      0.0000
C           0.956320      0.018992     50.35379      0.0000
R-squared            0.064220    Mean dependent var      1.044863
Adjusted R-squared   0.059518    S.D. dependent var      0.088798
S.E. of regression   0.086115    Akaike info criterion   -2.058820
Sum squared resid    2.951465    Schwarz criterion       -2.028940
Log likelihood       415.7934    F-statistic             13.65691
Durbin-Watson stat   1.696180    Prob(F-statistic)       0.000002
If, say, prpblck increases by .10 (ten percentage points), the price of soda is estimated to increase by .0115 dollars, or about 1.2 cents, holding income constant. While this does not seem large, there are communities with no black population and others that are almost all black, in which case the difference in psoda is estimated to be almost 11.5 cents. I’d still say it’s a fairly weak effect, but it’s not totally negligible, given that the price of soda is itself low (about $1.04 on average).
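The arithmetic behind these figures can be checked directly from the reported PRPBLCK coefficient. This is only a minimal sketch; the variable names are illustrative and not part of the original output.

```python
# Coefficient on PRPBLCK taken from the regression output in part (ii).
beta_prpblck = 0.114988

# Estimated change in psoda (dollars) for a 0.10 increase in prpblck,
# holding income constant.
effect_ten_points = beta_prpblck * 0.10

# Estimated difference between a community with prpblck = 1 and one with prpblck = 0.
effect_full_range = beta_prpblck * 1.0

print(f"{effect_ten_points * 100:.2f} cents")  # ten-percentage-point change
print(f"{effect_full_range * 100:.2f} cents")  # zero versus all-black community
```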
(iii)
Dependent Variable: PSODA
Method: Least Squares
Sample: 1 410
Included observations: 401
Excluded observations: 9
Variable    Coefficient   Std. Error   t-Statistic   Prob.
PRPBLCK...
...
ECONOMETRICS



First of all, I would like to apologize for showing the results in Spanish; I could not find a way to change Gretl’s language. All the explanations are in English, however, so the results should still be easy to follow.
Secondly, the time-series data that I have used is “U.S. macro data, 1950-2000” from the Greene sample folder in Gretl.
Before building the model…
I will try to explain the variable “Real GDP” using the variables “Real consumption expenditures”, “Real Private-Sector Investment”, “Real government expenditure”, “Unemployment rate” and “Inflation rate”. To do so, the first thing to check is whether the independent variables are strongly correlated with one another (a multicollinearity risk). We will use the correlation matrix to check this:
Since the coefficients of correlation between Real Consumption, Real Private-Sector Investment and Real Government Expenditure are very close to 1, those variables provide almost the same information, so I will drop some of them and check the coefficients of correlation again.
Once Real Private-Sector Investment and Real Government Expenditure have been removed from the correlation matrix, the coefficients of correlation between the remaining variables are acceptable, and each variable contributes different information to the model.
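The screening step described above can be sketched in a few lines. The series below are hypothetical numbers invented for illustration, not the Greene data, and the 0.9 cut-off is an assumption of this sketch rather than a rule stated in the text.

```python
from math import sqrt
from statistics import mean

def corr(a, b):
    """Sample Pearson correlation between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cross = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cross / sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))

# Hypothetical series (NOT the Greene data): two trending regressors and one that does not trend.
series = {
    "consumption": [10.0, 11.0, 12.1, 13.0, 14.2, 15.1],
    "investment": [2.0, 2.3, 2.4, 2.7, 2.9, 3.1],
    "unemployment": [5.0, 6.5, 4.8, 6.1, 5.2, 5.9],
}

THRESHOLD = 0.9  # assumed cut-off for "almost the same information"
names = list(series)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        r = corr(series[first], series[second])
        flag = "  <- consider dropping one" if abs(r) > THRESHOLD else ""
        print(f"{first:>12} vs {second:<12} r = {r:+.3f}{flag}")
```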
However, I could have also used another statistic to check is...
...females. Moreover, if they work in social sciences then the wage would go up by 0.124. SEX and SOSCI are dummy variables.
The model is estimated on 723 observations.
Variable    T-ratio
Constant    8.803/0.127 = 69.31
SEX         0.077/0.029 = 2.66
SOSCI       0.124/0.039 = 3.18
TENURE      0.006/0.002 = 3.00
Significance Level   Critical (t*)   Constant      SEX           SOSCI         TENURE
10%                  1.64697569      Significant   Significant   Significant   Significant
5%                   1.96326884      Significant   Significant   Significant   Significant
1%                   2.58268445      Significant   Significant   Significant   Significant
The coefficients are all significant at 1%, 5% and 10% levels.
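The comparison in the two tables can be reproduced as a quick check. The sketch below uses only the coefficients, standard errors and critical values quoted above; the dictionary names are illustrative.

```python
# (coefficient, standard error) pairs as quoted in the T-ratio table.
estimates = {
    "Constant": (8.803, 0.127),
    "SEX": (0.077, 0.029),
    "SOSCI": (0.124, 0.039),
    "TENURE": (0.006, 0.002),
}

# Two-sided critical values quoted in the significance table.
critical = {"10%": 1.64697569, "5%": 1.96326884, "1%": 2.58268445}

# T-ratio = coefficient / standard error.
t_ratios = {name: coef / se for name, (coef, se) in estimates.items()}

for name, t in t_ratios.items():
    # A coefficient is significant at a level if |t| exceeds that level's critical value.
    verdicts = {level: abs(t) > t_star for level, t_star in critical.items()}
    print(f"{name:<9} t = {t:6.2f}  {verdicts}")
```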
Autocorrelation could be present. R-squared could be overestimated at 60%, which is quite high. Standard errors are quite low; econometric data are affected by many factors, so standard errors would not normally be that low. The Durbin-Watson d-test needs to be carried out to confirm the existence of autocorrelation in this example. GLS or the Newey-West method can be used to correct for autocorrelation if need be.
...
...Descriptive Statistics
Mean
Variance
Standard Deviation
Sample Covariance
If it is greater than zero, upward sloping. This is scale dependent.
Sample Correlation
This is scale independent: it lies between -1 and 1; close to 1 is upward sloping, close to 0 means no linear relationship, close to -1 is downward sloping.
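The difference between the scale-dependent covariance and the scale-independent correlation can be seen by rescaling one variable. This is a minimal sketch with made-up numbers; it assumes the usual n - 1 divisor for the sample covariance.

```python
from math import sqrt

def sample_cov(x, y):
    """Sample covariance with the n - 1 divisor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

def sample_corr(x, y):
    """Sample correlation: covariance divided by the product of standard deviations."""
    return sample_cov(x, y) / sqrt(sample_cov(x, x) * sample_cov(y, y))

# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
y_rescaled = [100 * v for v in y]  # e.g. switching units from dollars to cents

print(sample_cov(x, y), sample_cov(x, y_rescaled))    # covariance changes with the units
print(sample_corr(x, y), sample_corr(x, y_rescaled))  # correlation does not
```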
Finding the regression
Regression formula with one regressor
Slope
Intercept
Finding R2
TSS = ESS + SSR
The Coefficient of Determination: R2 = ESS/TSS = 1 - SSR/TSS
This gives the total fit of the regression, between 0 (chance) and 1 (perfect prediction)
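A small worked example may help tie the slope, intercept and R2 together. The data are invented for illustration, and the helper name ols_simple is hypothetical.

```python
def ols_simple(x, y):
    """OLS with one regressor: returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return intercept, slope

# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

b0, b1 = ols_simple(x, y)
y_bar = sum(y) / len(y)
fitted = [b0 + b1 * a for a in x]

tss = sum((b - y_bar) ** 2 for b in y)             # total sum of squares
ess = sum((f - y_bar) ** 2 for f in fitted)        # explained sum of squares
ssr = sum((b - f) ** 2 for b, f in zip(y, fitted)) # sum of squared residuals
r2 = ess / tss

print(f"slope = {b1:.4f}, intercept = {b0:.4f}")
print(f"TSS = {tss:.4f}, ESS + SSR = {ess + ssr:.4f}, R2 = {r2:.4f}")
```

The decomposition TSS = ESS + SSR holds here because the regression includes an intercept.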
Standard Errors
Standard Error of the Regression
Standard error of
Hypothesis Testing
1.
2. Define H0
3. Define H1
4. Define Tcrit/Pcrit
a. Note: for a two-sided test, Tcrit is found using half the significance level in each tail
5. Find Tact/Pact
Tact
,
Pact
For one sided, just
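The five steps can be sketched as a single function. This sketch assumes a large sample, so the standard normal distribution stands in for the t distribution; the function name, its arguments and the example numbers are all hypothetical.

```python
from statistics import NormalDist

def t_test(beta_hat, beta_null, se, alpha=0.05, two_sided=True):
    """Return (t_act, t_crit, p_act, reject) using a large-sample normal approximation."""
    z = NormalDist()
    t_act = (beta_hat - beta_null) / se
    if two_sided:
        # Two-sided test: half of alpha goes in each tail.
        t_crit = z.inv_cdf(1 - alpha / 2)
        p_act = 2 * (1 - z.cdf(abs(t_act)))
        reject = abs(t_act) > t_crit
    else:
        # One-sided test of H1: beta > beta_null, so all of alpha is in the upper tail.
        t_crit = z.inv_cdf(1 - alpha)
        p_act = 1 - z.cdf(t_act)
        reject = t_act > t_crit
    return t_act, t_crit, p_act, reject

# Hypothetical estimate: beta_hat = 0.5 with standard error 0.2, testing H0: beta = 0.
t_act, t_crit, p_act, reject = t_test(beta_hat=0.5, beta_null=0.0, se=0.2)
print(t_act, round(t_crit, 3), round(p_act, 4), reject)
```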
Multiple Regression
Omitted Variable Bias
Omitted variables may increase the apparent importance of another variable, damaging the ability to prove causality.
Effect of OVB on
1. Find the variable outside of the model
2. Find Corr(Z, Y)
3. Find Corr(Z, X)
4. Multiply the signs
5. If positive, there is an upwards bias ()
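The sign rule in these steps can be written down as a short function. The name ovb_direction and the example correlations are hypothetical; the "downward" branch is the standard mirror image of step 5 and is not stated in the notes.

```python
def ovb_direction(corr_zy, corr_zx):
    """Direction of omitted variable bias from the signs of Corr(Z, Y) and Corr(Z, X)."""
    product = corr_zy * corr_zx
    if product > 0:
        return "upward"    # same signs: the coefficient on X is overstated
    if product < 0:
        return "downward"  # opposite signs: the coefficient on X is understated
    return "none"          # Z unrelated to X or to Y: no bias from omitting Z

# Hypothetical example: the omitted Z raises Y and is positively correlated with X.
print(ovb_direction(0.4, 0.3))
# Hypothetical example: the omitted Z raises Y but is negatively correlated with X.
print(ovb_direction(0.4, -0.3))
```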
Adjusted R2
OLS Wonder Equation
A good model for proving causality has a low , a good model for predicting Y has a low R2
Multiple Variable Tests
Reparametrisation
1.
2. For showing
3. Let
4. Thus,
5.
6. Now, let
7. Thus,
8. Now, run a new regression and do the usual hypothesis...
...Econometrics is the application of mathematics and statistical methods to economic data and is described as the branch of economics that aims to give empirical content to economic relations.[1] More precisely, it is "the quantitative analysis of actual economic phenomena based on the concurrent development of theory and observation, related by appropriate methods of inference."[2] An influential introductory economics textbook describes econometrics as allowing economists "to sift through mountains of data to extract simple relationships."[3] The first known use of the term "econometrics" (in cognate form) was by Paweł Ciompa in 1910. Ragnar Frisch is credited with coining the term in the sense in which it is used today.[4]
Econometrics is the unification of economics, mathematics, and statistics. This unification produces more than the sum of its parts.[5] Econometrics adds empirical content to economic theory, allowing theories to be tested and used for forecasting and policy evaluation.
Basic econometric models: linear regression
The basic tool for econometrics is the linear regression model. In modern econometrics, other statistical tools are frequently used, but linear regression is still the most frequently used starting point for an analysis.[7] Estimating a linear regression on two variables can be visualized as fitting a line through data points representing...
...ECON 140
Section 13, November 28, 2013
1 The IV Estimator with a Single Regressor and a Single Instrument
1.1 The IV Model and Assumptions
Consider the univariate linear regression framework: Yi = β0 + β1 Xi + ui
Until now, it was assumed that E(ui | Xi) = 0, i.e. conditional mean independence.
Let's relax this assumption and allow the covariance between Xi and ui to be different from zero.
Our problem here is that ui is not observed.
Doing OLS yields inconsistent estimates (remember the OVB formula).
In this case we refer to Xi as an endogenous variable.
The way to get consistent estimates is to use an instrument Zi, which is a variable that satisfies the following two properties:
1. Relevance: Cov(Zi, Xi) ≠ 0.
2. Exogeneity: Cov(Zi, ui) = 0.
In words: since the variation of Xi is contaminated (it is correlated with the variation of ui), it follows that we need a variable that allows us to get variation in Xi that is clean, i.e. it holds ui fixed.
1.2 The Two Stage Least Squares Estimator
Since the OLS estimator doesn't yield consistent estimates, we need an estimator that uses the
instrument and yields consistent estimates.
This estimator is called Two Stage Least Squares (TSLS).
This is how it works:
1. In the first stage, regress Xi on a constant term and Zi: Xi = π0 + π1 Zi + vi.
2. In the second stage, regress Yi on a constant term and the predicted...
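The two stages can be illustrated on simulated data, where the true β1 is known and the bias of OLS is visible. Everything in this sketch is an assumption made for illustration: the data-generating process, the parameter values, the seed and the helper name slope_intercept. The second stage regresses Yi on the first-stage fitted values, which is the standard TSLS procedure.

```python
import random

def slope_intercept(x, y):
    """OLS of y on a constant and x: returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - b1 * mx, b1

random.seed(0)
n = 20000
beta0, beta1 = 1.0, 2.0  # true parameters, chosen for the simulation

z = [random.gauss(0, 1) for _ in range(n)]  # instrument: relevant and exogenous by construction
c = [random.gauss(0, 1) for _ in range(n)]  # unobserved factor that enters both X and u
x = [0.8 * zi + ci + random.gauss(0, 0.5) for zi, ci in zip(z, c)]
u = [2.0 * ci + random.gauss(0, 1) for ci in c]  # error term correlated with X through c
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]

# OLS of Y on X is inconsistent because Cov(X, u) != 0.
_, ols_b1 = slope_intercept(x, y)

# First stage: regress X on a constant and Z, then form fitted values.
pi0, pi1 = slope_intercept(z, x)
x_hat = [pi0 + pi1 * zi for zi in z]

# Second stage: regress Y on a constant and the fitted values.
_, tsls_b1 = slope_intercept(x_hat, y)

print(f"true beta1 = {beta1}, OLS = {ols_b1:.3f}, TSLS = {tsls_b1:.3f}")
```

Note that the standard errors from a manual second-stage regression are not valid; this sketch only illustrates the point estimate.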
...Introduction to Econometrics coursework
For this assignment I will examine whether a linear regression model is suitable for estimating the relationship between the Human Development Index (HDI) and its components. Linear regression is a statistical technique that relates changes in one variable to changes in one or more other variables; the resulting representation of the relationship is called the linear regression model.
Variables are measurements of occurrences of a recurring event taken at regular intervals or measurements of different instances of similar events that can take on different possible values. A dependent variable is a variable whose value depends on the value of other variables in a model. Hence, an independent variable is a variable whose value is not dependent on other variables in a model.
The dependent variable here is HDI, and it will be regressed on the independent variables: life expectancy at birth, mean years of schooling, expected years of schooling and Gross National Income per capita. Hence we can write the model as Yi = β0 + β1 X1i + β2 X2i + β3 X3i + β4 X4i + εi, where Y is HDI, β0 is a constant, β1, β2, β3 and β4 are the coefficients, and εi denotes the random error term.
R2 is the share of the variation in the response variable (y) that is explained by the explanatory variables (x). Its value ranges between 0 and 1, and it indicates how much of the variation in the dependent variable the independent variables account for. The R2 value will show how reliable the...
...β1 also; i.e. E(b1) = β1
Unbiasedness property hinges on the model being correctly specified i.e.
E(xi ui)=0, E(ui)=0
3. Efficiency
An estimator is efficient if:
it is unbiased
no other unbiased estimator has a smaller variance i.e. it has the minimum possible variance (See DG Sect 3.4 & Fig 3.8)
OLS estimators b1, b2 are the Best Linear Unbiased Estimators of β1, β2 when the first 5 assumptions of the linear model hold.
b1, b2 are:
linear
unbiased
efficient (have smaller variance than any other linear unbiased estimator)
i.e. BLUE
This result is known as the Gauss-Markov Theorem
Gauss-Markov Theorem:
First 5 assumptions above must hold
OLS estimators are the “best” among all linear and unbiased estimators because
they are efficient: i.e. they have smallest variance among all other linear and unbiased estimators
Normality is unnecessary to assume, GM result does not depend on normality of dependent variable
GM refers to the estimators b1, b2, not to actual values of b1, b2 calculated from a particular sample
GM applies only to linear and unbiased estimators, there are other types of estimators which we can use and these may be better in which case disregard GM
e.g. a biased estimator may be more efficient than an unbiased one which fulfills GM.
4. Consistency
Other properties hold for “small” samples: Consistency is a large sample property i.e. asymptotic property
As n → ∞, the sampling distribution of the...
...
SelfLab 2  Use Your Hand
University of Gothenburg
Department of Economics
Applied Econometrics (MSc.), Fall 2013
Alpaslan Akay
University of Gothenburg
This is your second homework. It is a lab that you will again do on your own. In the first lab you learned how to operate Stata and calculate descriptive statistics. You also read a paper with an interesting research question. SelfLab 2 covers some topics from Lectures 2 and 3. In this lab you will learn how to calculate the OLS estimator by hand. Afterwards, you will answer some conceptual questions. These concepts are used very often in practice, and you should not make mistakes when using them. Make sure you understand them clearly by reading about them on the internet or in any book. Please do not copy and paste from the internet. Submit your answers to my email by 13 December 2013 (23:59 at the latest).
A. It is good to learn how things work…
Imagine that you have the data given below. GPA is the grade point average and ACT is the ACT test score for 8 college students.
Student   GPA   ACT
1         2.8   21
2         3.4   24
3         3.0   26
4         3.5   27
5         3.6   29
6         3.0   25
7         2.7   25
8         3.7   30
Calculate the following by hand:
1) the coefficients of the regression below. You can use any approach (algebraic or matrix). I would like to see each step clearly described.
2) the error variance...
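Hand calculations of this kind can be cross-checked with a few lines of code. The sketch below assumes the regression in question is GPA = β0 + β1 ACT + u, since the equation itself is not reproduced here, and it assumes the error variance is estimated as SSR/(n - 2). It follows the algebraic approach step by step using the eight observations in the table.

```python
# Data from the table above.
gpa = [2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7]
act = [21, 24, 26, 27, 29, 25, 25, 30]
n = len(gpa)

# Step 1: sample means.
act_bar = sum(act) / n
gpa_bar = sum(gpa) / n

# Step 2: sum of cross-products and sum of squared deviations of the regressor.
s_xy = sum((a - act_bar) * (g - gpa_bar) for a, g in zip(act, gpa))
s_xx = sum((a - act_bar) ** 2 for a in act)

# Step 3: OLS slope and intercept (assumed model: GPA regressed on ACT).
slope = s_xy / s_xx
intercept = gpa_bar - slope * act_bar

# Step 4: residuals and the error variance estimate SSR / (n - 2).
residuals = [g - (intercept + slope * a) for a, g in zip(act, gpa)]
ssr = sum(e ** 2 for e in residuals)
error_variance = ssr / (n - 2)

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"SSR = {ssr:.4f}, error variance = {error_variance:.4f}")
```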