Statistical analysis of the relation between Crime Rate, Education and Poverty: USA, 2009
In this research paper, we analyse whether the level of education and the level of poverty influence the total crime rate in the United States of America. Using descriptive statistics such as the mean, standard deviation and variance, together with histograms, scatter diagrams and a simple linear regression performed on each independent variable separately, we assess to what extent the two independent variables, education and poverty, explain variation in the dependent variable, whether each relationship is direct or inverse, and which of the two is the better predictor of the crime rate in the USA.
Data description
[The states selected for this study are highlighted in yellow in the map above]
The data used to define our dependent variable include both violent crime (murder and non-negligent manslaughter, forcible rape, robbery, and aggravated assault) and property crime (burglary, larceny-theft, motor vehicle theft, and arson). The crime statistics used in this study are published by the FBI (Federal Bureau of Investigation), an agency of the United States Department of Justice.
The independent variable representing education levels in the United States of America is the total number of public high school graduates per state. These data include students of all ethnicities for the 2008-2009 school year. The education universe in this study is taken to be the total population of the state. The data were collected by the National Center for Education Statistics (NCES), the primary federal entity that collects and analyses education-related data in the U.S. and other countries.
The poverty status of an individual is determined by comparing his or her income to a preset dollar amount known as the poverty threshold. The poverty universe excludes unrelated children below the age of 15 and people living in military barracks, institutional group quarters and college dormitories. These data are collected by the U.S. Census Bureau, a leading source of data about America’s people and economy.
All the data collected are cross-sectional, since they were taken during the same time period (the year 2009) across different states. The scale of measurement for these variables is the ratio scale, since the ratio between two values is meaningful and each variable has a true zero point.
Analysis
Mean: The mean is the average of a data set and serves as a measure of its central value.
The mean value of the crime variable suggests that in 2009, on average across the states studied, the reported crime rate was 3.26%. The mean value of the education variable suggests that, on average, public high school graduates made up 1% of a state's population in the same period. Similarly, the mean value of the poverty variable suggests that, on average, 13.54% of individuals in a state were living below the poverty line.
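The way such percentages and their mean can be obtained is sketched below. This is only an illustration, assuming each variable is a count divided by the relevant state universe; the state names and figures are hypothetical and are not the data used in this study, and `rate_percent` is a helper name introduced here.

```python
from statistics import fmean

# Hypothetical (state, reported crimes, population) rows; NOT the study's data.
rows = [
    ("State A", 30_000, 1_000_000),
    ("State B", 70_000, 2_000_000),
    ("State C", 16_000, 500_000),
]


def rate_percent(count, universe):
    """Express a count as a percentage of the universe it is drawn from."""
    return 100.0 * count / universe


# One percentage per state, then the unweighted mean across states.
rates = [rate_percent(count, population) for _, count, population in rows]
mean_rate = fmean(rates)
```

Note that this is an unweighted mean of state-level percentages, so a small state counts as much as a large one; it is not the same as the pooled national rate.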
Standard deviation & Variance: The higher the standard deviation, the greater the dispersion of the data set. Of the three variables, poverty has the highest standard deviation, 2.98. Therefore, the percentage of individuals below the poverty level is more widely dispersed across the states than the other two variables. Variance is the average of the squared deviations from the mean. It is used to compute the standard deviation, which is the more convenient measure of dispersion because it is expressed in the same units as the data. For any data set, the variance is the square of the standard deviation.
Skewness: This statistic measures the symmetry of a variable’s distribution. Crime rate has a skewness of 0.083, making it an approximately symmetrically distributed variable...
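The dispersion and skewness measures described above can be sketched as follows. The sample values are hypothetical and are not the study's data. The skewness formula is an assumption: the moment-based (population) form is used here, whereas spreadsheet and statistics packages often apply a small-sample adjustment, so their values can differ slightly.

```python
from statistics import fmean, pstdev, pvariance


def skewness(values):
    """Moment-based skewness: third central moment over variance**1.5."""
    mu = fmean(values)
    n = len(values)
    m2 = sum((x - mu) ** 2 for x in values) / n  # population variance
    m3 = sum((x - mu) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical poverty percentages for five states; NOT the study's data.
poverty = [10.0, 12.0, 14.0, 16.0, 18.0]

variance = pvariance(poverty)  # average of squared deviations from the mean
std_dev = pstdev(poverty)      # square root of the variance
skew = skewness(poverty)       # zero for a perfectly symmetric sample
```

A skewness close to zero, such as the 0.083 reported for crime rate, indicates a roughly symmetric distribution; positive values indicate a longer right tail and negative values a longer left tail.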