Data Types, Data Display and Summary Statistics
Chapter 2
Introduction
• Descriptive Statistics vs. Inferential Statistics
  • Descriptive statistics: data summarization.
  • Inferential statistics: use of sample data to make inferences about a population parameter.
  • Population: the collection of objects upon which measurements could be taken.
  • Sample: a subset of the population.
• Variable: a measurable characteristic of an entity.
Types of Data
• Quantitative or Qualitative?
  • Quantitative: presented as numbers permitting arithmetic
    • Interest rate
    • Temperature
  • Qualitative (categorical): everything else
    • Country of birth
    • Supplier
Types of Data
• Univariate or Multivariate?
  • Univariate: one fact for each object in a dataset (“one column in a spreadsheet”)
  • Multivariate: two or more facts for each object in a dataset (“many columns in a spreadsheet”)
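
To make the spreadsheet analogy concrete, here is a minimal sketch in Python. All values and field names (`supplier`, `country`, `interest_rate`) are hypothetical and chosen only for illustration.

```python
# Univariate: one fact per object, i.e. a single "column".
temperatures = [24, 35, 17, 21]  # hypothetical daily highs

# Multivariate: several facts per object, i.e. several "columns".
# Each dict is one row; the keys play the role of column names.
suppliers = [
    {"supplier": "A", "country": "US", "interest_rate": 4.5},
    {"supplier": "B", "country": "DE", "interest_rate": 3.9},
    {"supplier": "C", "country": "JP", "interest_rate": 1.2},
]

n_objects = len(suppliers)        # number of rows
n_variables = len(suppliers[0])   # number of columns

print(n_objects, n_variables)
```

Pulling one key out of every row, for example `[row["interest_rate"] for row in suppliers]`, turns the multivariate dataset back into a univariate one.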
Types of Data
• Discrete or Continuous?
  • Discrete: counted
    • Cars sold
    • Number of children
  • Continuous: measured (always allow “in-between” values)
    • Gallons of oil sold
    • Temperature
  • What about age? Money?
Types of Data
• Ordinal Data
  • Definition: “Qualitative data that has an ordering”
  • Example (Likert scale): disagree strongly, disagree, neutral, agree, agree strongly
  • Often “measured” with numbers:
    1 = disagree strongly
    2 = disagree
    3 = neutral
    4 = agree
    5 = agree strongly
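
The numeric coding can be sketched in a few lines of Python. This is an illustration only: it assumes the scale runs 1 to 5 in the order listed (so neutral = 3 and agree = 4), and the survey responses are made up. Because the codes only express order, the sketch summarizes them with the median instead of the mean.

```python
from statistics import median

# Likert labels in their natural order; position + 1 is the numeric code.
LIKERT = ["disagree strongly", "disagree", "neutral", "agree", "agree strongly"]
CODES = {label: i + 1 for i, label in enumerate(LIKERT)}

# Hypothetical survey responses (not from the slides).
responses = ["agree", "neutral", "agree strongly", "disagree", "agree"]

coded = [CODES[r] for r in responses]

# With an odd number of integer codes the median is itself one of the codes,
# so it can be mapped back to a label.
median_code = median(coded)
median_label = LIKERT[median_code - 1]

print(coded, median_label)
```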
Types of Data
• Time Series or Cross-Sectional?
  • Time series: when time sequencing is important
    • US historical inflation rates
    • A baby’s weight
  • Cross-sectional: data are contemporaneous, all collected at about the same time
    • 2004 inflation rates for several countries
    • Weight at birth
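
A small Python sketch of the difference, using made-up weights in grams: in the time series the order of the rows carries information, while in the cross-section it does not.

```python
# Time series: one baby weighed repeatedly; (age in months, weight in grams).
# Hypothetical values for illustration.
baby_weight = [(0, 3400), (1, 4300), (2, 5200), (3, 6000)]

# Month-to-month gain is only meaningful because the rows are in time order.
gains = [later[1] - earlier[1]
         for earlier, later in zip(baby_weight, baby_weight[1:])]

# Cross-section: several babies measured once, at birth; order is arbitrary.
birth_weight = {"baby A": 3400, "baby B": 2900, "baby C": 3800}
heaviest = max(birth_weight, key=birth_weight.get)

print(gains, heaviest)
```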
The Distribution of Values of a Variable
(Graphical Procedures)
Frequency Distribution
What is a Frequency Distribution?
• A frequency distribution is a list or a table …
• containing the values of a variable (or a set of
ranges within which the data fall) ...
• and the corresponding frequencies with which
each value occurs (or frequencies with which
data fall within each range)
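
As a minimal sketch of this definition, Python's standard `collections.Counter` builds exactly such a table from raw values. The observations below are hypothetical.

```python
from collections import Counter

# Hypothetical raw observations: one count recorded for each of ten objects.
raw = [2, 0, 1, 2, 2, 0, 3, 1, 2, 0]

# Map each distinct value to the number of times it occurs.
freq = Counter(raw)

# Print the table with values in ascending order.
for value in sorted(freq):
    print(f"{value}\t{freq[value]}")
```

The frequencies always add up to the number of observations, which is a quick check that no value was lost.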
Why Use Frequency Distributions?
• A frequency distribution is a way to
summarize data
• The distribution condenses the raw data
into a more useful form...
• and allows for a quick visual interpretation
of the data
Frequency Distribution:
Discrete Data
• Discrete data: possible values are countable
• Example: An advertiser asks 200 customers how many days per week they read the daily newspaper.

Number of days read    Frequency
0                      44
1                      24
2                      18
3                      16
4                      20
5                      22
6                      26
7                      30
Total                  200
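
The newspaper table can be checked and extended with relative frequencies in a short Python sketch. The frequencies are copied from the example; the relative-frequency column is an addition for illustration and is not part of the original slide.

```python
# Days read per week -> number of customers (from the newspaper example).
freq = {0: 44, 1: 24, 2: 18, 3: 16, 4: 20, 5: 22, 6: 26, 7: 30}

# The frequencies should add up to the number of customers surveyed.
total = sum(freq.values())

# Relative frequency: share of customers in each category.
relative = {days: count / total for days, count in freq.items()}

for days, count in freq.items():
    print(f"{days}\t{count}\t{relative[days]:.1%}")
```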
Frequency Distribution
Continuous Data
Example: A manufacturer of insulation randomly
selects 20 winter days and records the daily high
temperature
24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27
(Temperature is a continuous variable because it
could be measured to any degree of precision desired)
Grouping Data by Classes
Sort raw data in ascending order:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
• Find range: 58 - 12 = 46
• Select number of classes: 5 (usually between 5 and 20; we can choose the smallest k with 2^k ≥ n, where k is the number of classes and n is the number of data values, or use k = 1 + 3.3 log10(n))
• Compute class width: (Largest value - Smallest value) / Number of classes (46/5 = 9.2, then round up to 10)
• Determine class boundaries: 10, 20, 30, 40, 50
• Count observations & assign to classes
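
The first steps above can be sketched in Python using the 20 temperatures from the example. Two assumptions are made here: the first rule is read as "smallest k with 2^k ≥ n", and the logarithm in the second rule is taken as base 10. Rounding the width up to the next whole number is likewise one reasonable reading of the slide's rounding to 10.

```python
import math

# Daily high temperatures from the insulation example.
data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

n = len(data)
data_range = max(data) - min(data)  # largest value minus smallest value

# Rule 1 (assumed reading): smallest k such that 2**k >= n.
k_power = 1
while 2 ** k_power < n:
    k_power += 1

# Rule 2: k = 1 + 3.3 * log10(n), rounded to the nearest whole number.
k_formula = round(1 + 3.3 * math.log10(n))

# Class width: range divided by number of classes, rounded up.
raw_width = data_range / k_power
width = math.ceil(raw_width)

print(n, data_range, k_power, k_formula, raw_width, width)
```

Both rules suggest 5 classes for n = 20, which matches the choice made on the slide.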
Frequency Distribution Example
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Frequency Distribution
Class
10 but under 20
20 but under 30
30 but under 40
40 but under 50
50 but under 60
Total...
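
The counting step can be sketched in Python. The frequency column is cut off in this copy of the slides, so the counts below are computed from the ordered array with the "lower bound but under upper bound" classes; they are not copied from the original table.

```python
# Ordered array from the example.
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

lower_bounds = [10, 20, 30, 40, 50]  # class lower boundaries
width = 10                           # class width

# Each class is "lower but under lower + width", so a value equal to a
# boundary (e.g. 30) falls in the class that starts there.
counts = {lo: 0 for lo in lower_bounds}
for x in data:
    lo = lower_bounds[0] + ((x - lower_bounds[0]) // width) * width
    counts[lo] += 1

for lo in lower_bounds:
    print(f"{lo} but under {lo + width}\t{counts[lo]}")
print(f"Total\t{sum(counts.values())}")
```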