April 11th, 2012
We are given a linear regression of Total Cost (TC) on Quantity (Q). As stated in the project, the regression results look very good, with a relatively high R² and significant F and t-values, but we can’t use this model to estimate plant size: a linear total cost function gives an average total cost that only rises or only falls as Q changes, so it has no cost-minimizing output. A simple eye test on the residual plot for Q shows a trend, with the residuals moving from positive to negative and back to positive. When we fit a linear trend line to the normal probability plot, the points also show a pattern around it. This may indicate that the errors are not normally distributed; the effect is an increased tendency to reject a true null hypothesis (a Type I error). But the easiest way to see that the linear model is not the best one for this data is to add trend lines to the scatter plot of the data.
The linear fit is OK, but you can see that the data are too spread out around the line.
The 3rd-order polynomial is a better fit, but we can do better.
The 2nd-order polynomial is the best fit; the data are pretty “snug” around the curve.
Attached you will find the regression analysis for all three models. The 2nd-order model has the best results: a high R², a very significant F, a better-looking normal probability plot, and Q and Q² residuals that are distributed more randomly. The model produced, with standard errors in parentheses, is:

TC  = 347.9484    - 12.4285 Q  + 0.356795 Q²
      (77.19129)    (5.50019)    (0.089795)
R² = 0.7970   SEE = 58.9922   F = 52.9924

Dividing through by Q gives average total cost:

ATC = 347.9484/Q  - 12.4285    + 0.356795 Q
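
As a rough illustration of how the ATC equation could be used to estimate plant size, the sketch below finds the output level where ATC is lowest. It assumes that "plant size" means the cost-minimizing output; that reading, and the function names, are not from the project itself. Setting the derivative of ATC to zero gives Q* = sqrt(intercept / Q² coefficient).

```python
import math

# Coefficients copied from the fitted 2nd-order model in the text.
A = 347.9484   # intercept
B = -12.4285   # coefficient on Q
C = 0.356795   # coefficient on Q^2


def total_cost(q):
    """Fitted total cost: TC = A + B*Q + C*Q^2."""
    return A + B * q + C * q ** 2


def average_total_cost(q):
    """Fitted average total cost: ATC = TC / Q = A/Q + B + C*Q."""
    return total_cost(q) / q


def min_atc_quantity():
    """Q that minimizes ATC.

    d(ATC)/dQ = -A/Q^2 + C = 0  =>  Q* = sqrt(A / C).
    This is a minimum because A and C are both positive.
    """
    return math.sqrt(A / C)


q_star = min_atc_quantity()
print(f"Q at minimum ATC: {q_star:.2f}")
print(f"Minimum ATC:      {average_total_cost(q_star):.2f}")
```

With the coefficients above this gives a Q of roughly 31 and a minimum ATC of roughly 9.9, in whatever units the project data uses. These are point estimates only and carry the uncertainty shown by the standard errors.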
Both Q and Q² are significant at an alpha of .05, and the model as a whole is highly significant: if you conclude that one or more of the coefficients is not equal to zero, the probability of being wrong is less than 0.000000000449. This function is statistically better and can actually be used to estimate plant size.
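
The significance claims can be checked by hand from the reported numbers. The sketch below is only a cross-check, not the attached regression output. It makes two assumptions: the residual degrees of freedom (27) are inferred from the reported R² and F rather than taken from the project, and the critical t value of 2.052 is the standard two-sided 5% table value for 27 degrees of freedom. The p-value formula used is the exact closed form for an F distribution with 2 numerator degrees of freedom.

```python
# Coefficients and standard errors copied from the model in the text.
COEFS = {
    "Intercept": (347.9484, 77.19129),
    "Q": (-12.4285, 5.50019),
    "Q^2": (0.356795, 0.089795),
}
R2 = 0.7970
F_STAT = 52.9924
K = 2  # number of regressors (Q and Q^2)

# Assumption: residual df is not stated in the text, so it is inferred
# from F = (R2 / K) / ((1 - R2) / df), which rearranges to the line below.
resid_df = round(F_STAT * K * (1 - R2) / R2)

# Assumption: two-sided 5% critical t for 27 df, from a standard t table.
T_CRIT_27 = 2.052


def t_ratio(coef, se):
    """t statistic for H0: coefficient = 0."""
    return coef / se


def f_pvalue_two_numerator_df(f, df2):
    """Upper-tail p-value of an F(2, df2) statistic.

    Valid only when the numerator df is 2: P(F > f) = (1 + 2f/df2)^(-df2/2).
    """
    return (1 + 2 * f / df2) ** (-df2 / 2)


t_ratios = {name: t_ratio(c, se) for name, (c, se) in COEFS.items()}
p_value = f_pvalue_two_numerator_df(F_STAT, resid_df)

print(f"Inferred residual df: {resid_df}")
for name, t in t_ratios.items():
    print(f"{name:9s} t = {t:7.3f}  significant at .05: {abs(t) > T_CRIT_27}")
print(f"Model F p-value: {p_value:.3e}")
```

Under these assumptions each t-ratio exceeds the critical value in absolute terms, and the F p-value comes out at about 4.49e-10, which agrees with the probability quoted above.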