After completing this chapter, you should be able to:

understand model building using multiple regression analysis

apply multiple regression analysis to business decision-making situations

analyze and interpret the computer output for a multiple regression model

test the significance of the independent variables in a multiple regression model

use variable transformations to model nonlinear relationships

recognize potential problems in multiple regression analysis and take steps to correct them

incorporate qualitative variables into the regression model by using dummy variables

Multiple Regression Assumptions

The errors are normally distributed

The mean of the errors is zero

Errors have a constant variance

The model errors are independent

Model Specification

Decide what you want to do and select the dependent variable

Determine the potential independent variables for your model

Gather sample data (observations) for all variables

The Correlation Matrix

Correlation between the dependent variable and selected independent variables can be found using Excel:

Tools / Data Analysis… / Correlation

Can check for statistical significance of correlation with a t test
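Outside Excel, the same correlation matrix can be computed directly. The following is a minimal sketch using only the Python standard library; the data values and the helper names `pearson` and `correlation_matrix` are hypothetical and are not the chapter's pie sales data.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Pairwise correlations for a dict mapping variable name -> list of values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical illustrative data (made up for this sketch)
data = {
    "sales": [400, 380, 420, 350, 390, 360],
    "price": [6.0, 7.0, 5.0, 8.0, 6.5, 7.5],
    "advertising": [3.0, 3.5, 4.0, 2.5, 3.0, 3.0],
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(r, 3) for other, r in row.items()})
```

The row for the dependent variable shows how strongly each candidate independent variable is linearly associated with it.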

Example

A distributor of frozen dessert pies wants to evaluate factors thought to influence demand

Dependent variable: Pie sales (units per week)

Independent variables: Price (in $)

Advertising ($100’s)

Data is collected for 15 weeks

Pie Sales Model

Sales = b0 + b1 (Price) + b2 (Advertising)

Interpretation of Estimated Coefficients

Slope (bi)

Estimates that the average value of y changes by bi units for each 1 unit increase in xi, holding all other variables constant

Example: if b1 = -20, then sales (y) is expected to decrease by an estimated 20 pies per week for each $1 increase in selling price (x1), net of the effects of changes due to advertising (x2)

y-intercept (b0)

The estimated average value of y when all xi = 0 (assuming all xi = 0 is within the range of observed values)
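The slope interpretation can be checked numerically. In this sketch, b1 = -20 follows the example above, while b0 = 300, b2 = 70, and the function name `predict_sales` are hypothetical values chosen only for illustration.

```python
def predict_sales(price, advertising, b0=300.0, b1=-20.0, b2=70.0):
    """Estimated weekly pie sales. b1 follows the example; b0 and b2 are hypothetical."""
    return b0 + b1 * price + b2 * advertising

# Raise price by $1 while holding advertising constant
base = predict_sales(price=6.00, advertising=3.0)
raised = predict_sales(price=7.00, advertising=3.0)

change = raised - base
print(change)  # equals b1, the estimated change in sales per $1 price increase
```

Because advertising is held fixed, the difference between the two predictions is exactly b1, whatever values b0 and b2 take.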

Pie Sales Correlation Matrix

Price vs. Sales : r = -0.44327

There is a negative association between price and sales

Advertising vs. Sales : r = 0.55632

There is a positive association between advertising and sales
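The t test for a correlation coefficient uses t = r sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. The sketch below applies it to the two correlations reported above with n = 15 weeks. The 0.05 two-tailed significance level is an assumption made here, and the critical value is taken from a standard t table for 13 degrees of freedom.

```python
import math

def correlation_t_stat(r, n):
    """t statistic for H0: population correlation = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

n = 15                 # weeks of data in the example
T_CRITICAL = 2.1604    # t table value: two-tailed, alpha = 0.05 (assumed), df = 13

t_price = correlation_t_stat(-0.44327, n)
t_advertising = correlation_t_stat(0.55632, n)

price_significant = abs(t_price) > T_CRITICAL
advertising_significant = abs(t_advertising) > T_CRITICAL

print(round(t_price, 3), price_significant)
print(round(t_advertising, 3), advertising_significant)
```

Under these assumptions, a correlation is judged statistically significant only when the absolute value of its t statistic exceeds the critical value.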

Scatter Diagrams

Computer software is generally used to generate the coefficients and measures of goodness of fit for multiple regression

Excel:

Tools / Data Analysis... / Regression
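For readers who want to see what the software computes, here is a minimal least squares sketch that solves the normal equations (X'X) b = X'y in plain Python. The data are hypothetical, generated exactly from sales = 300 - 20 (price) + 70 (advertising) so the fit can be checked. The function names are illustrative, and this is not a description of how Excel implements its regression tool.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_multiple_regression(xs, y):
    """Return [b0, b1, ..., bk] that minimize the sum of squared errors."""
    rows = [[1.0] + list(row) for row in xs]  # leading 1.0 gives the intercept b0
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Hypothetical (price, advertising) pairs; y is generated without noise
xs = [(5.0, 3.0), (6.0, 3.5), (7.0, 2.5), (8.0, 4.0), (6.5, 3.0), (7.5, 4.5)]
y = [300 - 20 * price + 70 * adv for price, adv in xs]

b0, b1, b2 = fit_multiple_regression(xs, y)
print(round(b0, 4), round(b1, 4), round(b2, 4))
```

With noise-free data the fit recovers the generating coefficients up to rounding error. With real data the same procedure returns the estimates that minimize the sum of squared residuals.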

Multiple Regression Output

The Multiple Regression Equation

Using The Model to Make Predictions

Input values

Multiple Coefficient of Determination

Reports the proportion of total variation in y explained by all x variables taken together
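As a minimal sketch, R2 can be computed from observed and fitted values as 1 - SSE / SST. For a least squares fit with an intercept this equals SSR / SST. The numbers below are hypothetical and chosen for easy arithmetic.

```python
def r_squared(y, y_hat):
    """Proportion of total variation in y explained by the fitted values."""
    mean_y = sum(y) / len(y)
    sst = sum((yi - mean_y) ** 2 for yi in y)              # total sum of squares
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # error sum of squares
    return 1 - sse / sst

# Hypothetical observed and fitted values
y_obs = [10.0, 12.0, 14.0, 16.0]
y_fit = [11.0, 11.0, 15.0, 15.0]

r2 = r_squared(y_obs, y_fit)
print(round(r2, 4))  # SST = 20 and SSE = 4, so R2 = 0.8
```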

Adjusted R2

R2 never decreases when a new x variable is added to the model

This can be a disadvantage when comparing models

What is the net effect of adding a new variable?

We lose a degree of freedom when a new x variable is added

Did the new x variable add enough explanatory power to offset the loss of one degree of freedom?

Shows the proportion of variation in y explained by all x variables adjusted for the number of x variables used

Adjusted R2 = 1 - (1 - R2) × (n - 1) / (n - k - 1)

(where n = sample size, k = number of independent variables)

Penalize excessive use of unimportant independent variables

Smaller than R2

Useful for comparing models
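The penalty can be seen in a small calculation using the standard formula, adjusted R2 = 1 - (1 - R2)(n - 1) / (n - k - 1). The R2 values below are hypothetical; n = 15 matches the pie sales example.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R2 for sample size n and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

n = 15

# Hypothetical comparison: adding a third variable raises R2 only slightly
adj_two_vars = adjusted_r_squared(0.50, n, k=2)
adj_three_vars = adjusted_r_squared(0.51, n, k=3)

print(round(adj_two_vars, 4), round(adj_three_vars, 4))
print(adj_three_vars < adj_two_vars)  # True: the small gain did not offset the lost degree of freedom
```

In this hypothetical case R2 rises from 0.50 to 0.51, yet adjusted R2 falls, which suggests the third variable is not worth keeping.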

Is the Model Significant?

F-Test for Overall Significance of the Model

Shows if there is a linear relationship between all of the x variables considered together and y...
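A minimal sketch of the overall F statistic, written in terms of R2: F = (R2 / k) / ((1 - R2) / (n - k - 1)), with k and n - k - 1 degrees of freedom. The R2 value is hypothetical, the 0.05 significance level is an assumption, and the critical value is taken from a standard F table.

```python
def overall_f_stat(r2, n, k):
    """F statistic for H0: all slope coefficients equal zero."""
    explained_per_df = r2 / k                      # same as MSR / SST
    unexplained_per_df = (1 - r2) / (n - k - 1)    # same as MSE / SST
    return explained_per_df / unexplained_per_df

n, k = 15, 2          # sample size and number of x variables in the pie sales example
r2 = 0.50             # hypothetical R2, for illustration only
F_CRITICAL = 3.885    # F table value: alpha = 0.05 (assumed), df = (2, 12)

f_stat = overall_f_stat(r2, n, k)
model_significant = f_stat > F_CRITICAL

print(round(f_stat, 3), model_significant)
```

If the F statistic exceeds the critical value, the null hypothesis that all slope coefficients are zero is rejected at the chosen significance level.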