Course Project Part C
The following report displays regression and correlation analysis for AJ Davis Department Stores data on credit balance and size. We will use the data collected from 50 credit customers to complete the following analysis:
* Generate a scatterplot for CREDIT BALANCE vs. SIZE, including the graph of the "best fit" line. Interpret.
* Determine the equation of the "best fit" line, which describes the relationship between CREDIT BALANCE and SIZE.
* Determine the coefficient of correlation. Interpret.
* Determine the coefficient of determination. Interpret.
* Test the utility of this regression model (use a two tail test with α = .05). Interpret your results, including the p-value.
* Based on your findings in 1-5, what is your opinion about using SIZE to predict CREDIT BALANCE? Explain.
* Compute the 95% confidence interval for. Interpret this interval.
* Using an interval, estimate the average credit balance for customers that have household size of 5. Interpret this interval.
* Using an interval, predict the credit balance for a customer that has a household size of 5. Interpret this interval.
* What can we say about the credit balance for a customer that has a household size of 10? Explain your answer.
* Using MINITAB run the multiple regression analysis using the variables INCOME, SIZE and YEARS to predict CREDIT BALANCE. State the equation for this multiple regression model.
* Perform the Global Test for Utility (F-Test). Explain your conclusion.
* Perform the t-test on each independent variable. Explain your conclusions and clearly state how you should proceed. In particular, which independent variables should we keep and which should be discarded.
* Is this multiple regression model better than the linear model that we generated in parts 1-10? Explain.
I will generate the data and provide analysis concerning the 50 credit customers.
1. Scatterplot for relationship between credit balance and size.
This scatterplot shows the relationship between credit balance and size, with the best fit line included. It shows a positive relationship between the two variables: credit balance tends to increase as household size increases.
2. Determine the equation of the "best fit" line, which describes the relationship between credit balance and size.
The equation of the "best fit" line, which describes the relationship between credit balance and size, has the form Y = B0 + B1X, where Y = credit balance and X = size. Equation:
Credit Balance ($) = 2591.4 + 403.22 Size
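As a rough illustration of how this equation is used, the sketch below plugs a household size into the fitted line. The function name predict_balance and the choice of size 5 are only for illustration; the coefficients are the ones reported above.

```python
# Coefficients copied from the fitted equation above.
INTERCEPT = 2591.4   # estimated balance ($) when Size = 0
SLOPE = 403.22       # estimated change in balance ($) per additional household member


def predict_balance(size):
    """Point estimate of credit balance ($) for a given household size."""
    return INTERCEPT + SLOPE * size


# Household size 5 is used purely as an illustration.
estimate = predict_balance(5)
print(f"Estimated balance for size 5: ${estimate:,.2f}")
```

By hand this is 2591.4 + 403.22 × 5 = 4607.50. This is only a point estimate; it says nothing yet about the uncertainty around that value.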
Taken from Minitab:
Regression Analysis: Credit Balance($) versus Size
The regression equation is
Credit Balance($) = 2591 + 403 Size
Predictor    Coef  SE Coef      T      P
Constant   2591.4    195.1  13.29  0.000
Size       403.22    50.95   7.91  0.000
S = 620.162 R-Sq = 56.6% R-Sq(adj) = 55.7%
Analysis of Variance
Source          DF        SS        MS      F      P
Regression       1  24092210  24092210  62.64  0.000
Residual Error  48  18460853    384601
Total           49  42553062
Obs  Size  Balance($)     Fit  SE Fit  Residual  St Resid
5    2.00      1864.0  3397.9   113.7   -1533.9    -2.52R
R denotes an observation with a large standardized residual.
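The figures in the Minitab output are tied together by simple ratios, and recomputing them is a useful sanity check. The sketch below is not part of the Minitab run; it only re-derives the reported statistics from the numbers printed above. The variable names are my own. Small differences from Minitab (for example in the constant's T value) are expected because the printed coefficients are rounded.

```python
import math

# Values copied from the Minitab output above.
b0, se_b0 = 2591.4, 195.1          # Constant: coefficient and standard error
b1, se_b1 = 403.22, 50.95          # Size: coefficient and standard error
ss_reg, ss_err, ss_tot = 24092210, 18460853, 42553062
df_reg, df_err, df_tot = 1, 48, 49

# T = Coef / SE Coef (Minitab reports 13.29 and 7.91).
t_b0 = b0 / se_b0
t_b1 = b1 / se_b1

# MS = SS / DF, and F = MS(Regression) / MS(Residual Error) (reported 62.64).
ms_reg = ss_reg / df_reg
ms_err = ss_err / df_err
f_stat = ms_reg / ms_err

# R-Sq = SS(Regression) / SS(Total) (reported 56.6%).
r_sq = ss_reg / ss_tot
# R-Sq(adj) = 1 - MS(Residual Error) / (SS(Total) / DF(Total)) (reported 55.7%).
r_sq_adj = 1 - ms_err / (ss_tot / df_tot)

# S is the square root of MS(Residual Error) (reported 620.162).
s = math.sqrt(ms_err)

# Observation 5: Size = 2, Balance = 1864.0, SE Fit = 113.7.
fit_5 = b0 + b1 * 2.0
residual_5 = 1864.0 - fit_5
# Standardized residual = residual / sqrt(MSE - SE Fit^2) (reported -2.52).
st_resid_5 = residual_5 / math.sqrt(ms_err - 113.7 ** 2)

print(f"T(Constant)={t_b0:.2f}  T(Size)={t_b1:.2f}  F={f_stat:.2f}")
print(f"R-Sq={r_sq:.1%}  R-Sq(adj)={r_sq_adj:.1%}  S={s:.3f}")
print(f"Obs 5: fit={fit_5:.1f}  residual={residual_5:.1f}  st resid={st_resid_5:.2f}")
```

Note that with a single predictor, F is the square of the T value for Size, which is why both tests report the same p-value.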
3. The main result of a correlation analysis is the correlation coefficient (or "r"). It ranges from -1.0 to +1.0. The closer r is to +1 or -1, the more closely the two variables are linearly related. For this sample the coefficient of correlation is r = 0.752, which indicates a fairly strong positive linear relationship between credit balance and size.
Taken from Minitab:
Correlations: Credit Balance($), Size
Pearson correlation of Credit Balance($) and Size = 0.752
P-Value = 0.000
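In simple linear regression the correlation coefficient can be recovered from the regression output: its magnitude is the square root of R-Sq and its sign is the sign of the slope. The sketch below checks this against the reported r = 0.752, using the sums of squares and slope from the regression output for Credit Balance($) versus Size. The variable names are my own, and n = 50 is the sample size stated in the introduction.

```python
import math

# Values copied from the regression output for Credit Balance($) versus Size.
ss_reg, ss_tot = 24092210, 42553062
slope = 403.22
n = 50  # number of credit customers in the sample

# |r| = sqrt(R-Sq); r takes the sign of the slope.
r_sq = ss_reg / ss_tot
r = math.copysign(math.sqrt(r_sq), slope)

# Test statistic for H0: rho = 0, with n - 2 degrees of freedom.
t_r = r * math.sqrt(n - 2) / math.sqrt(1 - r_sq)

print(f"r = {r:.3f}, t = {t_r:.2f} on {n - 2} degrees of freedom")
```

The t statistic for the correlation agrees with the T value Minitab reports for Size (7.91), which is expected: with one predictor, testing the correlation and testing the slope are the same test.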
4. The coefficient of determination is r² = 56.6%. This is the percentage of the variation in credit balance that is explained by size.
5. The F-value for this model is 62.64 with a p-value of 0.000. Since the p-value is less than the significance level...