Chapter 2

Data Types, Data Display and Summary Statistics


Introduction

• Descriptive Statistics vs. Inferential Statistics
  • Descriptive statistics: data summarization
  • Inferential statistics: use of sample data to make inferences about a population parameter
  • Population: the collection of objects upon which measurements could be taken
  • Sample: a subset of the population
• Variable: the measurable characteristic of an entity
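The distinction between a population parameter and a sample statistic can be sketched in a few lines of Python. The population values, the seed, and the names `population` and `sample` are all assumptions made for illustration; none of them come from the chapter.

```python
import random
import statistics

# Hypothetical population: one measurement per object (illustrative values only).
population = [52, 47, 61, 58, 49, 55, 63, 50, 46, 59, 54, 57]

# Population parameter: computed from every object in the population.
population_mean = statistics.mean(population)

# Sample: a subset of the population, drawn at random without replacement.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample = rng.sample(population, k=5)

# Sample statistic: the basis for an inference about the population parameter.
sample_mean = statistics.mean(sample)

print("population mean:", population_mean)
print("sample:", sample)
print("sample mean:", sample_mean)
```

In practice the population mean is usually unknown, which is why the sample mean is used to estimate it.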


Types of Data

• Quantitative or Qualitative?
  • Quantitative: presented as numbers permitting arithmetic
    • Interest rate
    • Temperature
  • Qualitative (categorical): everything else
    • Country of birth
    • Supplier


Types of Data

• Univariate or Multivariate?
  • Univariate: one fact for each object in a dataset (“one column in a spreadsheet”)
  • Multivariate: two or more facts for each object in a dataset (“many columns in a spreadsheet”)
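As a rough illustration of the spreadsheet analogy, the sketch below stores a univariate dataset as a single list and a multivariate dataset as a list of records. The field names and values are hypothetical and chosen only to echo the variable types discussed earlier.

```python
# Univariate: one fact per object ("one column in a spreadsheet").
ages = [34, 27, 45]

# Multivariate: two or more facts per object ("many columns in a spreadsheet").
# All field names and values here are hypothetical.
people = [
    {"age": 34, "country_of_birth": "Canada", "children": 2},
    {"age": 27, "country_of_birth": "Kenya", "children": 0},
    {"age": 45, "country_of_birth": "Japan", "children": 3},
]

# Any single column pulled out of a multivariate dataset is itself univariate.
ages_from_people = [person["age"] for person in people]
print(ages_from_people)
```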


Types of Data

• Discrete or Continuous?
  • Discrete: counted
    • Cars sold
    • Number of children
  • Continuous: measured (always allow “in-between” values)
    • Gallons of oil sold
    • Temperature
  • What about age? Money?


Types of Data

• Ordinal Data
  • Definition: “Qualitative data that has an ordering”
  • Example (Likert scale): disagree strongly, disagree, neutral, agree, agree strongly
  • Often “measured” with numbers:
    1 = disagree strongly
    2 = disagree
    ...
    5 = agree strongly
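One way to code Likert responses as numbers is sketched below. The category order follows the scale listed above; the `responses` list is hypothetical, and the names `LIKERT_LEVELS` and `LIKERT_CODES` are illustrative.

```python
import statistics

# Ordered categories of the Likert scale, from lowest to highest.
LIKERT_LEVELS = [
    "disagree strongly",
    "disagree",
    "neutral",
    "agree",
    "agree strongly",
]

# Map each category to its rank: 1 = disagree strongly, ..., 5 = agree strongly.
LIKERT_CODES = {label: rank for rank, label in enumerate(LIKERT_LEVELS, start=1)}

# Hypothetical survey responses (illustrative only).
responses = ["agree", "neutral", "disagree strongly", "agree strongly", "agree"]

codes = [LIKERT_CODES[response] for response in responses]
print("codes:", codes)

# The codes preserve the ordering, so order-based summaries such as the
# median are meaningful.
print("median code:", statistics.median(codes))
```

The numeric codes record order only. They do not guarantee that the gap between "neutral" and "agree" equals the gap between "agree" and "agree strongly", so arithmetic such as averaging should be interpreted with care.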


Types of Data

• Time Series or Cross-Sectional?
  • Time series: when time sequencing is important
    • US historical inflation rates
    • A baby’s weight
  • Cross-sectional: data are contemporaneous, all collected at about the same time
    • 2004 inflation rates for several countries
    • Weight at birth


The Distribution of Values of a Variable (Graphical Procedures)

Frequency Distribution

What is a Frequency Distribution?

• A frequency distribution is a list or a table ...
• containing the values of a variable (or a set of ranges within which the data fall) ...
• and the corresponding frequencies with which each value occurs (or frequencies with which data fall within each range)


Why Use Frequency Distributions?

• A frequency distribution is a way to summarize data
• The distribution condenses the raw data into a more useful form ...
• and allows for a quick visual interpretation of the data


Frequency Distribution: Discrete Data

• Discrete data: possible values are countable
• Example: An advertiser asks 200 customers how many days per week they read the daily newspaper

Number of days read    Frequency
0                      44
1                      24
2                      18
3                      16
4                      20
5                      22
6                      26
7                      30
Total                  200
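The newspaper table can be checked and extended with a short Python sketch. The frequencies are taken from the table above; the relative-frequency column is an addition for illustration and is not part of the original example.

```python
# Frequency distribution from the newspaper example:
# number of days read per week -> number of customers.
frequency = {0: 44, 1: 24, 2: 18, 3: 16, 4: 20, 5: 22, 6: 26, 7: 30}

total = sum(frequency.values())  # should match the 200 customers surveyed

# Relative frequency (an added column): share of customers reporting each value.
relative_frequency = {days: count / total for days, count in frequency.items()}

print(f"{'Days read':>9}  {'Frequency':>9}  {'Relative':>8}")
for days, count in frequency.items():
    print(f"{days:>9}  {count:>9}  {relative_frequency[days]:>8.2f}")
print(f"{'Total':>9}  {total:>9}  {sum(relative_frequency.values()):>8.2f}")
```

Summing the frequencies is a quick consistency check: the total should equal the number of customers surveyed.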


Frequency Distribution: Continuous Data

Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature:

24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27

(Temperature is a continuous variable because it could be measured to any degree of precision desired.)


Grouping Data by Classes

Sort raw data in ascending order:

12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

• Find range: 58 - 12 = 46

• Select number of classes: 5 (usually between 5 and 20; we can choose k so that 2^k ≥ n, where k is the number of classes and n is the number of data values, or use k = 1 + 3.3 log10(n))
• Compute class width: (Largest value - Smallest value) / Number of classes (46/5 = 9.2, then round up to 10)
• Determine class boundaries (lower limits): 10, 20, 30, 40, 50

• Count observations & assign to classes
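The grouping steps above can be sketched in Python using the 20 temperature readings. Two assumptions are made here: the power-of-two rule is read as "the smallest k with 2^k ≥ n", and the logarithm in the second rule is taken to be base 10. The variable names are illustrative.

```python
import math

# Daily high temperatures from the insulation example.
temperatures = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
                32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

data = sorted(temperatures)
n = len(data)
data_range = data[-1] - data[0]  # 58 - 12 = 46

# Rule 1 (assumed reading): smallest k such that 2**k >= n.
k_power_rule = 0
while 2 ** k_power_rule < n:
    k_power_rule += 1

# Rule 2 (assumed base-10 log): k = 1 + 3.3 * log10(n), rounded to a whole number.
k_log_rule = round(1 + 3.3 * math.log10(n))

num_classes = 5
raw_width = data_range / num_classes  # 46 / 5 = 9.2
class_width = 10                      # rounded up to a convenient value
first_lower = 10                      # first class boundary

# Count the observations in each class [lower, upper).
classes = []
for i in range(num_classes):
    lower = first_lower + i * class_width
    upper = lower + class_width
    count = sum(lower <= x < upper for x in data)
    classes.append((lower, upper, count))

print("k from 2^k rule:", k_power_rule, "| k from log rule:", k_log_rule)
for lower, upper, count in classes:
    print(f"{lower} but under {upper}: {count}")
print("Total:", sum(count for _, _, count in classes))
```

Using half-open classes ("10 but under 20") ensures that every observation falls into exactly one class, so the class counts add up to n.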


Frequency Distribution Example

Data in ordered array:

12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Frequency Distribution

Class

10 but under 20

20 but under 30

30 but under 40

40 but under 50

50 but under 60

Total...