STAT2008 Lecture Notes

STAT2008 – REGRESSION MODELLING
LECTURE NOTES - CHAPTER 1: SIMPLE LINEAR REGRESSION
I. Introduction
The basic aims of this chapter are:
• A review of the simple linear regression material covered in Statistical Techniques II;
• An introduction to some new notation, including matrices;
• A more detailed study of the properties of the regression estimates; and,
• An investigation of diagnostic procedures to check the credibility of the underlying assumptions of our regression model.
We will, as much as possible, demonstrate concepts through the use of example data. This will also give us the opportunity to see how to use S-Plus to perform our fitting and diagnostic procedures.
When formulating a suitable model for a set of data, we should always take into account:
1. Background scientific theory which may suggest a specific structure for our model;
2. Scatterplots of the data; and,
3. Statistical model output and diagnostic procedures.
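As a rough illustration of points 2 and 3, the sketch below fits a straight line to a small dataset and computes the residuals that diagnostic checks would examine. It is written in Python rather than S-Plus, the data values are made up for this example, the function names fit_line and residuals are hypothetical, and least squares is assumed as the fitting method. In practice the pairs would first be inspected in a scatterplot before any line is fitted.

```python
# Hypothetical illustration data (made up for this sketch, not from the notes).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_line(x, y):
    """Return the least-squares (intercept, slope) of a straight-line fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sums of squares and cross-products.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed response minus the fitted straight-line value, per data point."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


b0, b1 = fit_line(x, y)
res = residuals(x, y, b0, b1)

print(f"intercept = {b0:.4f}, slope = {b1:.4f}")
print("residuals:", [round(r, 4) for r in res])
```

Plotting the residuals against x is one simple diagnostic: a visible pattern, such as curvature or a fan shape, would cast doubt on the straight-line structure or on the assumed error behaviour.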
II. The Model and Assumptions
If our dataset consists of a sample of n pairs $(x_1, Y_1), \ldots, (x_n, Y_n)$, where the $Y_i$'s are considered to be the values of a “response” or “dependent” variable (i.e., the variable whose characteristics we are most interested in examining and explaining) and the $x_i$'s are the values of a “predictor” or “independent” variable (i.e., a variable whose value may potentially influence the value of the response or dependent variable), then the simplest possible regression structure has the linear form:

$$Y = \beta_0 + \beta_1 x + \epsilon,$$

where $\epsilon$ is a mean-zero random variable having variance $\sigma^2$. Specifically, this means that we believe that each data value $Y_i$ can be expressed as:

$$Y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad (i = 1, \ldots, n)$$

where the $\epsilon_i$'s are the “errors” or “noise” in the model; that is, they are the stochastic or random component of $Y_i$, and they measure the amount by which the observed value differs from what the “deterministic” part of the model would have predicted for the value of $Y_i$, namely $E(Y_i \mid x_i) = \beta_0 + \beta_1 x_i$. We use the
