Residual analysis in SAS (continued)
In the previous two lectures, we learnt how to obtain the residuals of the estimation and to formally test for heteroscedasticity in the model. In this lecture, we will focus on testing the model errors for serial correlation.
Recall that, in order to make valid inferences using the linear regression estimates, we need the errors of the model (which we estimate using the residuals) to satisfy the following key assumption:
• The errors of the model are independent of each other (this is the assumption of no serial correlation): $\operatorname{Cov}(\varepsilon_t, \varepsilon_s) = 0$ for $t \neq s$.
How do we test for serial correlation?
We will estimate a simple model with two explanatory variables of the form: $y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \varepsilon_t$
First, we will obtain the residuals of the model and plot them in ways that can provide informal evidence of serial correlation. One possible graph plots the residuals against time. Another plots the residuals against the one-period lagged residuals. Any obvious pattern in these graphs, such as long runs of residuals with the same sign or a clear slope in the plot against the lagged residuals, is informal evidence of serial correlation in the underlying data. We will also perform the Durbin-Watson test, the Breusch-Godfrey serial correlation LM test, and some other techniques to look for formal evidence of serial correlation in the model.
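For reference, the Durbin-Watson statistic mentioned above is conventionally computed from the residuals as shown below. The notation is introduced here for illustration and is not taken from the lecture: $e_t$ denotes the residual for period $t$, $T$ the number of observations, and $\hat{\rho}$ the sample first-order autocorrelation of the residuals.

```latex
d = \frac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2} \approx 2\,(1 - \hat{\rho})
```

A value of $d$ close to 2 is consistent with no first-order serial correlation, values towards 0 point to positive serial correlation, and values towards 4 point to negative serial correlation.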
Let us start with the following program:
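*Run a linear regression model and obtain the residuals. Also get the Durbin-Watson test statistic;
PROC REG DATA = serial;
MODEL y = x2 x3 / DW;
OUTPUT OUT = oserial RESIDUAL = rserial;
TITLE 'Durbin-Watson test for first-order serial correlation';
RUN;
*Create the one-period lagged residuals (LAG1 must be used inside a DATA step);
DATA oserial2;
SET oserial;
rserialag = LAG1(rserial);
RUN;
*Graph 1: residuals plotted against time;
PROC GPLOT DATA = oserial2;
PLOT rserial * year;
TITLE 'Graphical test 1 of serial correlation';
RUN;
*Graph 2: residuals against lagged residuals (HREF=0 and VREF=0 add reference lines at zero on the horizontal and vertical axes);
PLOT rserial * rserialag / HREF = 0 VREF = 0;
TITLE 'Graphical test 2 of serial correlation';
RUN;
QUIT;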
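The Breusch-Godfrey LM test mentioned earlier can be requested in more than one way. The program below is a minimal sketch of one option, assuming SAS/ETS is installed so that PROC AUTOREG is available. It reuses the data set and variable names from the program above. The choice of one lag in GODFREY = 1 is an illustrative assumption, not something specified in the lecture.

```sas
*Sketch only: assumes SAS/ETS is licensed and that serial contains y, x2 and x3;
PROC AUTOREG DATA = serial;
*DW = 1 requests the first-order Durbin-Watson statistic and DWPROB adds its p-values;
*GODFREY = 1 requests the Godfrey LM test against first-order serial correlation;
MODEL y = x2 x3 / DW = 1 DWPROB GODFREY = 1;
TITLE 'Durbin-Watson p-values and Godfrey LM test';
RUN;
```

A small p-value for the Godfrey statistic would count as formal evidence against the null hypothesis of no serial correlation. Raising the value given to GODFREY = tests against higher-order serial correlation as well.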