I need help creating a multiple regression analysis for this problem. Please provide as much explanation as you can. Please see the attached files.
My research is based on the topic below. This is a multiple regression analysis. I have attached a PDF file that explains the case and a spreadsheet with all the data recorded from the PDF file. Please make sure you include all the graphs and plots, and please use the MegaStat software.
We want to determine the primary factors that affect property crime rates in the United States. The statistical analysis of the data involves multiple-regression analysis.
Questions to answer are:
1. What are the primary determinants of property crimes in the United States?
2. What would you like to know about property crime rates that cannot be answered by this data set?
3. How does population density affect property crime rates? Is this expected?
You will want to prepare a summary of your findings to present to a management team from a national crime department. You will find the regression model and explain it in a non-technical discussion of how the important factors affect the property crime rate.
Multiple regression analysis can be used to model property crime in the United States. The suggested regression model is of the form
Crimes = b0 + b1 Pincome + b2 Dropout + b3 Pubaid + b4 Density + b5 Kids + b6 Precip + b7 Unemploy + b8 Urban
Here the bi (i = 0, 1, ..., 8) are known as the regression coefficients. They are estimated by the method of least squares. Each bi (i = 1, ..., 8) measures the change in predicted crime for a one-unit change in the ith independent variable, with the other variables held fixed; b0 is the intercept. The estimated values of the regression coefficients are given below.
|Variables |coefficients |std. error |t (df=41) |p-value |
|Intercept |-642.5030 |1024.03 |-0.627 |.5339 |
|PINCOME |-0.0183 |0.0773 |-0.237 |.8136 |
|DROPOUT |81.2926 |21.9858 |3.698 |.0006 |
|PUBAID |-113.7144 |78.7000 |-1.445 |.1561 |
|DENSITY |-1.9841 |0.7299 |-2.718 |.0096 |
|KIDS |1.1038 |1.4485 |0.762 |.4504 |
|PRECIP |1.5821 |11.1636 |0.142 |.8880 |
|UNEMPLOY |-46.3830 |79.6479 |-0.582 |.5635 |
|URBAN |64.3915 |10.9303 |5.891 |6.18E-07 |
Thus the model can be written as
Crimes = -642.5030 - 0.0183 Pincome + 81.2926 Dropout - 113.7144 Pubaid - 1.9841 Density + 1.1038 Kids + 1.5821 Precip - 46.3830 Unemploy + 64.3915 Urban
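To make the reading of the coefficients concrete, here is a minimal Python sketch that plugs values into the fitted equation. The coefficients are copied from the table of estimates; the function name `predict_crimes` and the example input values are made up for illustration and do not come from the data set.

```python
# Estimated coefficients copied from the regression output.
COEFFICIENTS = {
    "INTERCEPT": -642.5030,
    "PINCOME": -0.0183,
    "DROPOUT": 81.2926,
    "PUBAID": -113.7144,
    "DENSITY": -1.9841,
    "KIDS": 1.1038,
    "PRECIP": 1.5821,
    "UNEMPLOY": -46.3830,
    "URBAN": 64.3915,
}


def predict_crimes(values):
    """Return the fitted crime value for a dict of predictor values."""
    total = COEFFICIENTS["INTERCEPT"]
    for name, coef in COEFFICIENTS.items():
        if name != "INTERCEPT":
            total += coef * values[name]
    return total


# Hypothetical inputs, for illustration only (not taken from the spreadsheet).
example = {
    "PINCOME": 20000, "DROPOUT": 25, "PUBAID": 6, "DENSITY": 100,
    "KIDS": 300, "PRECIP": 35, "UNEMPLOY": 6, "URBAN": 70,
}

# Raise URBAN by one unit and keep everything else fixed:
# the prediction changes by exactly the URBAN coefficient.
bumped = dict(example, URBAN=example["URBAN"] + 1)
change = predict_crimes(bumped) - predict_crimes(example)
print(f"Change in predicted crime per one-unit rise in URBAN: {change:.4f}")
```

The point of the comparison is that a coefficient is the effect of a one-unit change in its own variable only when the other variables are held at the same values.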
The adequacy of the model is assessed using the R² value (coefficient of determination). R² measures the percentage of variability in crime that can be explained by the regression model.
|Model Summary |
|R² |0.690 |
|Adjusted R² |0.630 |
|R |0.831 |
|Std. Error |749.394 |
Thus 69.0% of the variability in crime can be explained by the regression model.
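The adjusted R² in the summary can be checked from R² with the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1). The sketch below assumes the model includes an intercept, so that the 41 residual degrees of freedom and 8 predictors imply n = 41 + 8 + 1 = 50 observations; that sample size is inferred here, not stated in the output.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors (with intercept)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


R2 = 0.690          # from the model summary
K = 8               # number of predictors in the model
RESIDUAL_DF = 41    # from the t (df=41) column header
N = RESIDUAL_DF + K + 1  # inferred sample size, assuming an intercept

print(f"n = {N}, adjusted R-squared = {adjusted_r2(R2, N, K):.3f}")
```

Adjusted R² is lower than R² because it penalizes the model for the number of predictors, which matters here since several of the eight variables contribute little.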
The significance of each regression coefficient is tested using a Student t test. The hypotheses tested are H0: bi = 0 vs. H1: bi ≠ 0. The null hypothesis is rejected when the absolute value of the calculated t statistic is greater than the critical value of t with 41 d.f. The critical value for this problem is 2.02 (two-sided test at the 5% significance level). The same conclusion can be drawn from the p-value: when the p-value is less than the significance level, the null hypothesis is rejected. The variables that have significant regression coefficients are highlighted in the table below...
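As a rough check of the test described above, the following Python sketch recomputes each t statistic as coefficient divided by standard error and compares its absolute value with the critical value 2.02. The numbers are copied from the table of estimates; the variable names in the code are illustrative.

```python
# (coefficient, standard error) for each predictor, from the regression output.
ESTIMATES = {
    "PINCOME": (-0.0183, 0.0773),
    "DROPOUT": (81.2926, 21.9858),
    "PUBAID": (-113.7144, 78.7000),
    "DENSITY": (-1.9841, 0.7299),
    "KIDS": (1.1038, 1.4485),
    "PRECIP": (1.5821, 11.1636),
    "UNEMPLOY": (-46.3830, 79.6479),
    "URBAN": (64.3915, 10.9303),
}

T_CRITICAL = 2.02  # critical t value for 41 d.f., as stated in the text

# t statistic = estimated coefficient / its standard error
t_stats = {name: coef / se for name, (coef, se) in ESTIMATES.items()}

# Reject H0: bi = 0 when |t| exceeds the critical value.
significant = [name for name, t in t_stats.items() if abs(t) > T_CRITICAL]

for name, t in t_stats.items():
    verdict = "significant" if name in significant else "not significant"
    print(f"{name:<9} t = {t:7.3f}  {verdict}")
```

With these figures, only DROPOUT, DENSITY and URBAN exceed the critical value in absolute terms, which agrees with their p-values being below 0.05 in the table of estimates.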