# Math533 Part a

Pages: 5 (1256 words) Published: June 16, 2013
PART A- Exploratory Data Analysis
Introduction & Overview
AJ Davis is a department store chain with many credit customers, and it wants to learn more about them. A sample of 50 credit customers was selected, and the following data were collected for the analysis:

1. Location:
   a. Urban
   b. Suburban
   c. Rural
2. Income
3. Household Size (number of people living in the household)
4. Years (the number of years that the customer has lived in the current location)
5. Credit Balance (the customer’s current credit card balance on the store's credit card)

Individual Variables
Five individual variables were provided for review: Location, Income, Household Size, Years, and Credit Balance. The statistical analysis below summarizes the key points for three of them: Location, Income, and Credit Balance.

Variable: Location
AJ Davis’ customers are distributed among three location classes: urban, suburban, and rural. Of the 50 customers in the sample, 13 live in rural, 15 in suburban, and 22 in urban locations. The pie chart shows that just under half of the customers live in urban areas (44%), while the remainder are split fairly evenly between rural (26%) and suburban (30%) areas. Together, rural and suburban areas comprise 56% of the credit customer base that AJ Davis is interested in. This should be useful to AJ Davis, since nearly half of its credit customers come from urban locations.
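As a quick check on the percentages quoted above, the shares can be recomputed from the three counts. This is only a sketch: the counts come from the text, while the dictionary and variable names are illustrative and not part of the original analysis.

```python
# Location counts as reported in the text (sample of 50 credit customers).
counts = {"Rural": 13, "Suburban": 15, "Urban": 22}

total = sum(counts.values())

# Share of each location as a percentage of the sample.
shares = {location: 100 * n / total for location, n in counts.items()}

# Combined share of the two non-urban classes.
non_urban_share = shares["Rural"] + shares["Suburban"]

for location, pct in shares.items():
    print(f"{location}: {pct:.0f}%")
print(f"Rural + Suburban: {non_urban_share:.0f}%")
```

The counts sum to the stated sample size of 50, which confirms that the three classes account for every customer in the sample.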

Variable: Income
Customer income has a range of \$49,000, from a low of \$25,000 to a high of \$74,000. The mean (average) income is \$46,020. The median income is \$44,500, the halfway point of the 50 incomes, with 25 incomes below it and 25 above. The data have two modes (most frequently occurring values): \$54,000 and \$57,000. This information gives AJ Davis a snapshot of customer income, which helps it assess customers’ ability to repay their debt and gauge whether credit limits could be increased.
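The income figures above can be cross-checked with simple arithmetic. The sketch below uses only the summary values reported in the text, and the variable names are illustrative. Reading a mean above the median as a hint of mild right skew is a common rule of thumb and an interpretation added here, not a claim made in the original analysis.

```python
# Summary statistics for Income as reported in the text (in dollars).
income_min = 25_000
income_max = 74_000
income_mean = 46_020
income_median = 44_500

# Range is the distance between the largest and smallest observation.
income_range = income_max - income_min

# A positive gap (mean above median) is often read as mild right skew;
# this is a rule of thumb, not a formal test.
mean_minus_median = income_mean - income_median

print(f"Range: ${income_range:,}")
print(f"Mean - median: ${mean_minus_median:,}")
```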

The boxplot displays this information graphically. The box spans from the lower quartile of \$33,000 (25% of incomes are less than this value) to the upper quartile of \$57,250 (25% of incomes are greater than this value), with the median of \$44,500 marked inside it.

Variable: Credit Balance

The mean (average) credit balance of AJ Davis’ customers is \$4,153. The largest balance is \$5,861 and the smallest is \$2,047, a range of \$3,814. The standard deviation is \$932, which indicates that credit balances in the sample vary relatively little around the mean of \$4,153. The median credit balance is \$4,273. The median often gives a clearer picture of a data set than the mean, but here the two are close (a difference of only \$120), so both are representative of an “average” credit balance.
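One way to put a number on "relatively small variation" is the coefficient of variation, the standard deviation expressed as a fraction of the mean. The original analysis does not compute it. The sketch below is an added illustration that uses only the reported summary values, with illustrative variable names.

```python
# Summary statistics for Credit Balance as reported in the text (in dollars).
balance_mean = 4_153
balance_median = 4_273
balance_std = 932
balance_min = 2_047
balance_max = 5_861

# Range and the gap between median and mean, as quoted in the text.
balance_range = balance_max - balance_min
median_minus_mean = balance_median - balance_mean

# Coefficient of variation: spread relative to the size of the mean.
cv = balance_std / balance_mean

print(f"Range: ${balance_range:,}")
print(f"Median - mean: ${median_minus_mean:,}")
print(f"Coefficient of variation: {cv:.1%}")
```

A coefficient of variation of roughly 22% means a typical balance deviates from the mean by about a fifth of the mean. Whether that counts as "small" depends on the context in which AJ Davis uses the figure.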

The boxplot displays this information graphically. The box spans from the lower quartile of \$3,292 (25% of balances are less than this value) to the upper quartile of \$4,931 (25% of balances are greater than this value), with the median of \$4,273 marked inside it. A boxplot is a valuable tool for analyzing this data because the box gives a clear picture of where the data are concentrated. It also shows how far the extreme values lie from most of the data, as indicated by the top and bottom of the vertical line.
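To make "how far the extreme values are" concrete, the reported quartiles can be combined with the common 1.5 × IQR rule (Tukey's fences) for flagging outliers. The source does not say which rule its boxplot software applied, so the rule is an assumption of this sketch, and the variable names are illustrative.

```python
# Quartiles and extremes for Credit Balance as reported in the text (in dollars).
q1 = 3_292
q3 = 4_931
balance_min = 2_047
balance_max = 5_861

# Interquartile range: the height of the box.
iqr = q3 - q1

# Tukey's fences (assumed rule): points beyond these are flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# If both extremes fall inside the fences, the whiskers reach the min and max.
has_outliers = balance_min < lower_fence or balance_max > upper_fence

print(f"IQR: ${iqr:,}")
print(f"Fences: ${lower_fence:,.2f} to ${upper_fence:,.2f}")
print(f"Outliers under the 1.5 x IQR rule: {has_outliers}")
```

Under this assumed rule, the smallest and largest balances both fall inside the fences, so neither would be drawn as a separate outlier point.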

Relationships
Below are three sets of paired variables from the given data.
* Location and Credit Balance
* Credit Balance and Size
* Income and Credit Balance

Location and Credit Balance

A look at the mean credit balances by location shows that the highest...