Introduction: The Nature and Purpose of Econometrics
What is econometrics? Its literal meaning is “measurement in economics”. Financial econometrics is the application of statistical and mathematical techniques to problems in finance and accounting.
Continuous and Discrete Data
Continuous data can take on any value and are not confined to specific numbers. Discrete data can only take on certain values, which are usually integers.
Types of Data
There are three types of data that econometricians might use for analysis:
1. Time series data
2. Cross-sectional data
3. Panel data, a combination of 1 and 2
A dummy variable (also known as an indicator variable, or just a dummy) is one that takes the value 0 or 1 to indicate the absence or presence of some categorical effect that may be expected to shift the outcome. The number of dummy variables used is one fewer than the number of categories:
• 1 dummy variable for 2 categories
• 2 dummy variables for 3 categories
• 3 dummy variables for 4 categories
• and so on
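The dummy-variable counting rule above (one fewer dummy than categories) can be sketched in a few lines of Python. This is only an illustration: the `sectors` data, the choice of "bank" as the omitted baseline category, and the helper name `make_dummies` are all assumptions made for the example, not part of the original material.

```python
def make_dummies(values, baseline):
    """Build one 0/1 column per category, leaving out the baseline category."""
    categories = sorted(set(values) - {baseline})
    return {c: [1 if v == c else 0 for v in values] for c in categories}


# Hypothetical categorical variable with 3 categories
sectors = ["bank", "tech", "energy", "tech", "bank"]

# 3 categories -> 2 dummy variables; "bank" is the omitted baseline
dummies = make_dummies(sectors, baseline="bank")
for name, column in dummies.items():
    print(name, column)
```

An observation in the baseline category is identified by having a 0 in every dummy column, which is why no separate dummy is needed for it.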
Regression
Regression is probably the single most important tool in econometrics. What is regression analysis? It is concerned with describing and evaluating the relationship between a given variable (usually called the dependent variable) and one or more other variables (usually known as the independent variable(s)).
Returns in Financial Modelling
It is preferable not to work directly with asset prices, so we usually convert the raw prices into a series of returns. There are two ways to do this: simple returns or log returns.
Simple returns: $R_t = \frac{p_t - p_{t-1}}{p_{t-1}} \times 100\%$
Log returns: $R_t = \ln\left(\frac{p_t}{p_{t-1}}\right) \times 100\%$
where $R_t$ denotes the return at time $t$, $p_t$ denotes the asset price at time $t$, and $\ln$ denotes the natural logarithm.
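The two return definitions above can be sketched as follows. The price series is hypothetical and the function names are chosen only for this illustration.

```python
import math


def simple_returns(prices):
    """Percentage simple returns: (p_t - p_{t-1}) / p_{t-1} * 100."""
    return [(p1 - p0) / p0 * 100.0 for p0, p1 in zip(prices, prices[1:])]


def log_returns(prices):
    """Percentage log returns: ln(p_t / p_{t-1}) * 100."""
    return [math.log(p1 / p0) * 100.0 for p0, p1 in zip(prices, prices[1:])]


# Hypothetical asset prices at t = 0, 1, 2, 3
prices = [100.0, 102.0, 99.0, 105.0]

print(simple_returns(prices))
print(log_returns(prices))
```

A series of n prices yields n - 1 returns. For small price changes the two measures are close; one practical difference is that log returns add up over time, so their sum over the sample equals the log return over the whole period.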
Regression is different from Correlation
If we say y and x are correlated, it means that we are treating y and x in a completely symmetrical way. In regression, we treat the dependent variable (y) and the independent variable(s) (x's) very differently.
Denote the dependent variable by y and the independent variable(s) by x1, x2, ..., xk, where there are k independent variables. Some alternative names for the y and x variables:
y                     x
dependent variable    independent variables
regressand            regressors
effect variable       causal variables
explained variable    explanatory variables
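To illustrate the distinction between correlation and regression drawn above, the sketch below computes both on a small hypothetical data set. The data and helper names are assumptions for the example, and the slope uses the standard textbook formula for a one-variable regression with an intercept, which is not derived in this section.

```python
import math


def mean(values):
    return sum(values) / len(values)


def correlation(a, b):
    """Sample correlation coefficient; symmetric in its two arguments."""
    ma, mb = mean(a), mean(b)
    s_ab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    s_aa = sum((ai - ma) ** 2 for ai in a)
    s_bb = sum((bi - mb) ** 2 for bi in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def slope(dep, indep):
    """Slope from regressing `dep` on `indep` (with an intercept)."""
    md, mi = mean(dep), mean(indep)
    s_di = sum((d - md) * (i - mi) for d, i in zip(dep, indep))
    s_ii = sum((i - mi) ** 2 for i in indep)
    return s_di / s_ii


# Hypothetical data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

print(correlation(x, y), correlation(y, x))  # same value either way
print(slope(y, x), slope(x, y))              # different values
```

Swapping the arguments leaves the correlation unchanged, but regressing y on x gives a different slope from regressing x on y, because regression assigns the two variables different roles.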
Simple Regression: An Example
For simplicity, say k=1. This is the situation where y depends on only one x variable.
Suppose that we have the following data on the excess returns on a fund manager’s portfolio (“fund XXX”) together with the excess returns on a market index:
Year, t    Excess return on fund XXX    Excess return on market index
           = rXXX,t – rft               = rmt – rft
1          17.8                         13.7
2          39.0                         23.2
3          12.8                          6.9
4          24.2                         16.8
5          17.2                         12.3
Graph (Scatter Diagram)
[Figure: scatter diagram of the excess return on fund XXX (vertical axis) against the excess return on the market portfolio (horizontal axis)]
Finding a Line of Best Fit
We can use the general equation for a straight line, y = a + bx, to get the line that best “fits” the data. However, this equation is completely deterministic. Is this realistic? No. So what we do is add a random disturbance term, u, into the equation:
$y_t = \alpha + \beta x_t + u_t$, where $t = 1, 2, 3, 4, 5$
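As a rough illustration of what adding a disturbance term means, the sketch below generates y values from a straight line plus random noise. Everything specific here is an assumption for the example: the parameter values, the normally distributed disturbance with standard deviation `sigma`, and the function name `simulate`. The text itself says nothing yet about the distribution of u.

```python
import random


def simulate(alpha, beta, xs, sigma, seed=0):
    """Generate y_t = alpha + beta * x_t + u_t with u_t drawn from N(0, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    us = [rng.gauss(0.0, sigma) for _ in xs]
    ys = [alpha + beta * x + u for x, u in zip(xs, us)]
    return ys, us


# x values taken from the market index column; parameters are hypothetical
xs = [13.7, 23.2, 6.9, 16.8, 12.3]
ys, us = simulate(alpha=1.0, beta=1.5, xs=xs, sigma=2.0)

for x, y, u in zip(xs, ys, us):
    print(f"x={x:5.1f}  deterministic part={1.0 + 1.5 * x:6.2f}  u={u:6.2f}  y={y:6.2f}")
```

Each y differs from the point on the deterministic line by its own disturbance, so the observations scatter around the line rather than lying exactly on it.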
Determining the Regression Coefficients
So how do we determine what $\alpha$ and $\beta$ are? Choose estimates $\hat{\alpha}$ and $\hat{\beta}$ so that the (vertical) distances from the data points to the fitted line are minimised (so that the line fits the data as closely as possible).
Ordinary Least Squares
The most common method used to fit a line to the data is known as OLS (ordinary least squares). What we actually do is take each distance and square it (i.e. take the area of each of the squares in the diagram) and minimise the total sum of the squares (hence least squares). Tightening up the notation, let
$y_t$ denote the actual data point $t$
$\hat{y}_t$ denote the fitted value from the regression line
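The least-squares idea can be sketched on the fund XXX data given earlier. The closed-form expressions used below are the standard textbook solution for a one-variable regression with an intercept; they are not derived in this section, and the function names are chosen only for this illustration.

```python
def ols_fit(x, y):
    """Return (alpha_hat, beta_hat) minimising the sum of squared vertical distances."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


def sum_of_squares(alpha, beta, x, y):
    """Sum of squared vertical distances from the data points to the line."""
    return sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))


# Data from the example: excess returns on the market index (x) and on fund XXX (y)
x = [13.7, 23.2, 6.9, 16.8, 12.3]
y = [17.8, 39.0, 12.8, 24.2, 17.2]

alpha_hat, beta_hat = ols_fit(x, y)
fitted = [alpha_hat + beta_hat * xi for xi in x]  # the fitted values y-hat

print(round(alpha_hat, 2), round(beta_hat, 2))  # -1.74 1.64
print(sum_of_squares(alpha_hat, beta_hat, x, y))
```

Moving either coefficient away from these values makes the sum of squares larger, which is the sense in which this line "best fits" the five observations.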
Actual and Fitted Value
What do We...