Introduction: The Nature and Purpose of Econometrics

What is econometrics? Its literal meaning is “measurement in economics”. Financial econometrics is the application of statistical and mathematical techniques to problems in finance and accounting.

Continuous and Discrete Data

Continuous data can take on any value and are not confined to specific numbers. Discrete data can only take on certain values, which are usually integers.

Types of Data

There are 3 types of data which econometricians might use for analysis:
1. Time series data
2. Cross-sectional data
3. Panel data, a combination of 1. and 2.

Dummy variable

A dummy variable (also known as an indicator variable or just a dummy) is one that takes the value 0 or 1 to indicate the absence or presence of some categorical effect that may be expected to shift the outcome. The number of dummy variables needed is one fewer than the number of categories:
• 1 dummy variable for 2 categories
• 2 dummy variables for 3 categories
• 3 dummy variables for 4 categories, and so on.
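The dummy variable coding described above can be sketched in a few lines of Python. This is only an illustration: the function name `make_dummies`, the `baseline` argument and the sector labels are hypothetical and do not come from the slides. The sketch shows that 3 categories need only 2 dummy variables, because the baseline category is identified by all dummies being 0.

```python
def make_dummies(values, baseline):
    """Return one 0/1 column per category, excluding the baseline category."""
    categories = sorted(set(values))
    if baseline not in categories:
        raise ValueError("baseline must be one of the observed categories")
    return {
        category: [1 if v == category else 0 for v in values]
        for category in categories
        if category != baseline
    }


# Hypothetical data: 5 firms in 3 sectors (illustrative labels only).
sectors = ["bank", "tech", "energy", "tech", "bank"]

# 3 categories, so 2 dummy variables; "bank" is the omitted baseline.
dummies = make_dummies(sectors, baseline="bank")
for name, column in dummies.items():
    print(name, column)
```

A firm in the baseline sector has 0 in every dummy column, which is why one category can always be left out.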

Regression

Regression is probably the single most important tool at the econometrician's disposal. What is regression analysis? It is concerned with describing and evaluating the relationship between a given variable (usually called the dependent variable) and one or more other variables (usually known as the independent variable(s)).

Returns in Financial Modelling

It is preferable not to work directly with asset prices, so we usually convert the raw prices into a series of returns. There are two ways to do this:

Simple returns: $R_t = \frac{p_t - p_{t-1}}{p_{t-1}} \times 100\%$

Log returns: $R_t = \ln\left(\frac{p_t}{p_{t-1}}\right) \times 100\%$

where $R_t$ denotes the return at time t, $p_t$ denotes the asset price at time t, and ln denotes the natural logarithm.
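As a rough sketch of the two return definitions, the following Python snippet converts a short price series into simple and log returns, both in percent. The price values and the function names `simple_returns` and `log_returns` are assumptions made for illustration; they are not taken from the slides.

```python
import math


def simple_returns(prices):
    """Simple returns in percent: (p_t - p_{t-1}) / p_{t-1} * 100."""
    return [100.0 * (p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]


def log_returns(prices):
    """Log returns in percent: ln(p_t / p_{t-1}) * 100."""
    return [100.0 * math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


# Hypothetical price series (illustrative values only).
prices = [100.0, 102.0, 99.96]

simple = simple_returns(prices)
logs = log_returns(prices)

print("simple returns (%):", simple)
print("log returns (%):   ", logs)
```

One property worth noting, which the sketch lets you check, is that log returns add up over time: the sum of the per-period log returns equals the log return over the whole period. Simple returns do not have this property.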

2013/3/31

Some Notation

Denote the dependent variable by y and the independent variable(s) by x1, x2, ..., xk, where there are k independent variables. Some alternative names for the y and x variables:

y                     x
dependent variable    independent variables
regressand            regressors
effect variable       causal variables
explained variable    explanatory variables

Regression is different from Correlation

If we say y and x are correlated, it means that we are treating y and x in a completely symmetrical way. In regression, we treat the dependent variable (y) and the independent variable(s) (x’s) very differently.
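The asymmetry described above can be seen numerically. The sketch below uses a small hypothetical data set (the numbers are made up for illustration) and hand-written helper functions, so nothing here should be read as a particular library's API. The correlation is the same whichever variable comes first, while the regression slope changes when the roles of y and x are swapped.

```python
import math


def mean(values):
    return sum(values) / len(values)


def cross_dev(a, b):
    """Sum of cross-products of deviations from the means."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))


def correlation(a, b):
    """Sample correlation coefficient; symmetric in its arguments."""
    return cross_dev(a, b) / math.sqrt(cross_dev(a, a) * cross_dev(b, b))


def slope(dependent, independent):
    """Slope from regressing `dependent` on `independent`."""
    return cross_dev(dependent, independent) / cross_dev(independent, independent)


# Hypothetical data (illustrative values only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]

print("corr(x, y) =", correlation(x, y))
print("corr(y, x) =", correlation(y, x))
print("slope of y on x =", slope(y, x))
print("slope of x on y =", slope(x, y))
```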

Simple Regression

For simplicity, say k = 1. This is the situation where y depends on only one x variable.

Simple Regression: An Example

Suppose that we have the following data on the excess returns on a fund manager’s portfolio (“fund XXX”) together with the excess returns on a market index:

Year, t    Excess return on fund XXX    Excess return on market index
           = rXXX,t - rft               = rmt - rft
1          17.8                         13.7
2          39.0                         23.2
3          12.8                         6.9
4          24.2                         16.8
5          17.2                         12.3

Graph (Scatter Diagram)

[Figure: scatter diagram of excess return on fund XXX (vertical axis, 0 to 45) against excess return on market portfolio (horizontal axis, 0 to 25)]

Finding a Line of Best Fit

We can use the general equation for a straight line, y = a + bx, to get the line that best “fits” the data. However, this equation is completely deterministic. Is this realistic? No. So what we do is add a random disturbance term, u, into the equation:

$y_t = \alpha + \beta x_t + u_t$, where t = 1, 2, 3, 4, 5
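To see why a purely deterministic line is unrealistic, here is a small sketch using the five observations from the fund XXX example. It draws the straight line through the first two data points (an arbitrary choice made only for illustration) and measures how far the remaining points lie from it. The non-zero gaps are what the disturbance term u is there to absorb.

```python
# Excess returns from the fund XXX example (years 1 to 5).
market = [13.7, 23.2, 6.9, 16.8, 12.3]   # x: excess return on market index
fund = [17.8, 39.0, 12.8, 24.2, 17.2]    # y: excess return on fund XXX

# Straight line y = a + b*x forced through the first two observations.
# Picking these two points is an assumption for illustration only.
b = (fund[1] - fund[0]) / (market[1] - market[0])
a = fund[0] - b * market[0]

# Vertical gap between each actual y and the deterministic line.
deviations = [y - (a + b * x) for x, y in zip(market, fund)]

for year, gap in enumerate(deviations, start=1):
    print(f"year {year}: deviation from line = {gap:.3f}")
```

The first two deviations are zero by construction, but the others are not, so no single deterministic line reproduces all five observations exactly.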

Determining the Regression Coefficients

So how do we determine what $\alpha$ and $\beta$ are? Choose $\alpha$ and $\beta$ so that the (vertical) distances from the data points to the fitted line are minimised (so that the line fits the data as closely as possible).

[Figure: data points scattered around a fitted line, y against x]

Ordinary Least Squares

The most common method used to fit a line to the data is known as OLS (ordinary least squares). What we actually do is take each distance and square it (i.e. take the area of each of the squares in the diagram) and minimise the total sum of the squares (hence least squares). Tightening up the notation, let $y_t$ denote the actual data point t and $\hat{y}_t$ denote the fitted value from the regression line.
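As a minimal sketch of least squares in action, the snippet below fits a line to the fund XXX data. It relies on the standard closed-form OLS solution for one regressor (slope equal to the sum of cross-deviations divided by the sum of squared x-deviations, intercept equal to the mean of y minus slope times the mean of x). That formula is not stated in this section, so treat it as an assumption brought in from standard textbook results. The names `alpha_hat`, `beta_hat` and `rss` are chosen here for illustration.

```python
# Excess returns from the fund XXX example (years 1 to 5).
market = [13.7, 23.2, 6.9, 16.8, 12.3]   # x: excess return on market index
fund = [17.8, 39.0, 12.8, 24.2, 17.2]    # y: excess return on fund XXX

n = len(market)
x_bar = sum(market) / n
y_bar = sum(fund) / n

# Standard closed-form OLS estimates for a single regressor.
beta_hat = (
    sum((x - x_bar) * (y - y_bar) for x, y in zip(market, fund))
    / sum((x - x_bar) ** 2 for x in market)
)
alpha_hat = y_bar - beta_hat * x_bar


def rss(alpha, beta):
    """Sum of squared vertical distances between the data and the line."""
    return sum((y - (alpha + beta * x)) ** 2 for x, y in zip(market, fund))


fitted = [alpha_hat + beta_hat * x for x in market]       # fitted values
residuals = [y - f for y, f in zip(fund, fitted)]         # actual minus fitted

print(f"alpha_hat = {alpha_hat:.2f}, beta_hat = {beta_hat:.2f}")
print(f"sum of squared distances = {rss(alpha_hat, beta_hat):.3f}")
```

With these five observations the estimates come out at roughly -1.74 for the intercept and 1.64 for the slope. Moving either coefficient away from these values makes `rss` larger, which is the sense in which the line is the "best fit".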

Actual and Fitted Value

$ What do We...