This topic covers:
• The concept and measures of central tendency for ungrouped and grouped data.
• The concept and measures of dispersion for ungrouped and grouped data.
When we look at a distribution of data, we should consider three characteristics:
• Shape (chapters 2 and 4)
• Center / Location (central tendency measurement)
• Spread (dispersion measurement)
With these characteristics, we can numerically describe the main features of a data set and describe the behaviour of the data in a much simpler form.
Central Tendency Measurement
A measure of central tendency gives the center of a histogram or a frequency distribution. It reports a typical value that is representative of the data. Three common measures of central tendency:
• Mean (arithmetic mean)
• Median
• Mode
Other measures of central tendency:
• Trimmed mean
• Harmonic mean
• Geometric mean
CENTRAL TENDENCY
Permissible measures of central tendency
Mean, Mode*, Median* All statistics are permitted including geometric mean, harmonic mean, trimmed mean, and other robust means.
Central tendency for Ungrouped Data
Mean (Arithmetic mean)
The mean is the most frequently used measure of central tendency. The mean of a data set is the sum of the observations divided by the number of observations.
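The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `arithmetic_mean` is hypothetical, and the sample values are the four exam grades that appear later in Class Activity 1.

```python
def arithmetic_mean(observations):
    """Return the sum of the observations divided by the number of observations."""
    if not observations:
        raise ValueError("the mean is undefined for an empty data set")
    return sum(observations) / len(observations)


# Exam grades taken from Class Activity 1 in this topic.
grades = [88, 75, 95, 100]
print(arithmetic_mean(grades))  # 89.5
```

Python's standard library offers `statistics.mean` for the same calculation; the explicit version is shown only to mirror the definition.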
Median
The median is the value of the middle term in a data set that has been ranked in increasing order. Steps:
1) Rank the data in increasing order.
2) Determine the depth (position) of the median.
3) Determine the value of the median.
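The three steps for the median can be sketched as follows. The slides do not state a formula for the depth, so this sketch assumes the common convention depth = (n + 1) / 2, where n is the number of observations; when the depth falls between two positions, the two neighbouring values are averaged. The function name `median_by_depth` and the sample data are illustrative only.

```python
def median_by_depth(observations):
    """Find the median by ranking the data and locating its depth."""
    # Step 1: rank the data in increasing order.
    ranked = sorted(observations)
    n = len(ranked)
    if n == 0:
        raise ValueError("the median is undefined for an empty data set")

    # Step 2: depth (1-based position) of the median.
    # Assumed convention: depth = (n + 1) / 2.
    depth = (n + 1) / 2

    # Step 3: read off the value at that depth.
    if depth.is_integer():
        return ranked[int(depth) - 1]      # odd n: a single middle value
    lower = ranked[int(depth) - 1]         # value just below the depth
    upper = ranked[int(depth)]             # value just above the depth
    return (lower + upper) / 2             # even n: average the two middle values


print(median_by_depth([7, 3, 9, 1, 5]))  # 5   (depth 3 of 1, 3, 5, 7, 9)
print(median_by_depth([4, 1, 3, 2]))     # 2.5 (depth 2.5 of 1, 2, 3, 4)
```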
Mode
The mode of a data set is its most frequently occurring value. The mode is not necessarily unique:
• No mode: a data set with each value occurring only once (e.g. 3,4,5,6,1,2,7,8).
• Unimodal: a data set with only one value occurring with the highest frequency (e.g. 3,4,5,5,1,2,7,8).
• Bimodal: a data set with two values that occur with the same (highest) frequency (e.g. 3,3,5,5,3,2,5,8).
• Multimodal: a data set in which more than two values occur with the same (highest) frequency.
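The four cases for the mode can be checked with a short Python sketch. The helper names `modes` and `classify` are hypothetical; the example data sets are the ones listed for each case.

```python
from collections import Counter


def modes(observations):
    """Return the value(s) with the highest frequency, or [] if there is no mode."""
    counts = Counter(observations)
    if not counts:
        raise ValueError("the mode is undefined for an empty data set")
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs only once: no mode
    return sorted(value for value, count in counts.items() if count == highest)


def classify(observations):
    """Label a data set as no mode, unimodal, bimodal, or multimodal."""
    number_of_modes = len(modes(observations))
    if number_of_modes == 0:
        return "no mode"
    if number_of_modes == 1:
        return "unimodal"
    if number_of_modes == 2:
        return "bimodal"
    return "multimodal"


print(modes([3, 4, 5, 6, 1, 2, 7, 8]), classify([3, 4, 5, 6, 1, 2, 7, 8]))  # [] no mode
print(modes([3, 4, 5, 5, 1, 2, 7, 8]), classify([3, 4, 5, 5, 1, 2, 7, 8]))  # [5] unimodal
print(modes([3, 3, 5, 5, 3, 2, 5, 8]), classify([3, 3, 5, 5, 3, 2, 5, 8]))  # [3, 5] bimodal
```

Because the counting works on any hashable value, the same sketch also applies to qualitative data such as category labels.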
Comparison of the mean, median, and mode

Mean
• Description: the center of gravity of the data.
• Advantages: unique; takes every observation into account in the calculation.
• Disadvantage: sensitive to outliers.
• When to use: when the data are quantitative and the frequency distribution is roughly symmetric.

Median
• Description: divides the bottom 50% of the data from the top 50%.
• Advantages: unique; resistant to outliers.
• Disadvantage: difficult to handle theoretically.
• When to use: when the frequency distribution is skewed left or right.

Mode
• Description: the most frequent observation.
• Advantage: can be used for both qualitative and quantitative data.
• Disadvantages: not unique; some data sets have no mode.
• When to use: when the most frequent observation is the desired measure of central tendency or the data are qualitative.
Class Activity 1
Select an appropriate measure of center (mean, median, or mode) for each of the following situations:
1) A student takes four exams in a biology class. His grades are 88, 75, 95, and 100. Answer: Mean
2) The National Association of REALTORS publishes data on resale prices of U.S. homes. Answer: Median
3) The marathon had two categories of official finishers: male and female, of which there were 10894 and 6655, respectively. Answer: Mode
The issue of Outliers
• The arithmetic mean is the most preferred measure, BUT it deteriorates easily when there are outliers in the data.
• OUTLIER: an observation (or a set of observations) that is numerically distant from the rest of the data.
• To minimise such deterioration, other statistics that are resistant to such errors are needed.
• Example: Given are the following 10 observations.
30 171 184 201 212 250 265 270 272 289
1. Compute the median.
2. Compute the mean.
3. Can you spot the difference? Why do such results occur?
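One way to check the first two questions is with Python's standard `statistics` module. The variable names below are illustrative; the data are the 10 observations given in the example.

```python
from statistics import mean, median

# The 10 observations from the example; 30 is far below the other values.
data = [30, 171, 184, 201, 212, 250, 265, 270, 272, 289]

data_median = median(data)  # average of the 5th and 6th ranked values: (212 + 250) / 2
data_mean = mean(data)      # sum of the observations (2144) divided by 10

print(data_median)  # 231.0
print(data_mean)    # 214.4
```

The mean falls below the median because the single low observation, 30, enters the sum directly, whereas the median depends only on the two middle ranked values.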
Sometimes, certain data values have a higher importance or...