# Big Data

Pages: 16 (4801 words) Published: December 22, 2012
Introduction to Basic Statistics

Pat Hammett, Ph.D.

2005

Instructor Comments: This document contains an overview of basic probability and statistics. It also includes a practice test at the end of the document. Note: answers to the practice test questions are included in an appendix.

Pat Hammett

University of Michigan

Table of Contents

1. VARIABLES- QUALITATIVE AND QUANTITATIVE
1.1 Qualitative Data (Categorical Variables or Attributes)
1.2 Quantitative Data

2. DESCRIPTIVE STATISTICS
2.1 Sample Data versus Population Data
2.2 Parameters and Statistics
2.3 Location Statistics (measures of central tendency)
2.4 Dispersion Statistics (measures of variability)

3. FREQUENCY DISTRIBUTIONS
3.1 Frequency Measures
3.2 Histogram
3.3 Discrete Histogram
3.4 Continuous Data Histogram

4. NORMAL DISTRIBUTION
4.1 Properties of the Normal Distribution
4.2 Estimating Probabilities Using Normal Distribution
4.3 Calculating Parts Per Million Defects Given Normal Distribution

5. LINEAR REGRESSION ANALYSIS
5.1 General Regression equation
5.2 Simple linear regression
5.3 Correlation
5.4 Using Scatter Plots to Show Linear Relationships
5.5 Multiple linear regression

Appendices:
A – Practice Test
B – Normal Distribution Tables
C – Useful Excel Functions

1. VARIABLES- QUALITATIVE AND QUANTITATIVE

A variable is any measured characteristic or attribute that differs from one subject to another. For example, if the lengths of 30 desks were measured, then length would be a variable.

Key Learning Skills –

• Understand the difference between a qualitative (categorical) variable and a quantitative variable.
• Understand the types of qualitative (categorical) variables: Nominal, Ordinal, and Binary.
• Understand the difference between a discrete and a continuous quantitative variable.

Terms and Definitions:

1.1 Qualitative Data (Categorical Variables or Attributes)

Qualitative data involves assigning non-numerical items to groups or categories. Qualitative data is also referred to as categorical data. The qualitative characteristic or classification group of an item is an attribute. Some examples of qualitative data are:

• The pizza was delivered on time.
  • Categorical Variable: Delivery Result
  • Attributes: On Time, Not On Time
• The survey responses include disagree, neutral, or agree.
  • Categorical Variable: Survey Response
  • Attributes: Disagree, Neutral, Agree
• This car...
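The relationship between a categorical variable and its attributes in the two complete examples above can be sketched in Python. This is only an illustration: the observation lists and the helper name `tally_attributes` are hypothetical and do not come from the original handout. Each list stands for one categorical variable, and each distinct value in it is an attribute.

```python
from collections import Counter

# Hypothetical observations (not from the handout): one attribute per delivery.
# Categorical variable: Delivery Result; attributes: On Time, Not On Time.
delivery_results = ["On Time", "On Time", "Not On Time", "On Time"]

# Hypothetical observations for the survey example.
# Categorical variable: Survey Response; attributes: Disagree, Neutral, Agree.
survey_responses = ["Agree", "Neutral", "Disagree", "Agree", "Agree"]


def tally_attributes(observations):
    """Count how many observations fall under each attribute (category)."""
    return dict(Counter(observations))


delivery_counts = tally_attributes(delivery_results)
survey_counts = tally_attributes(survey_responses)

print(delivery_counts)
print(survey_counts)
```

Because the values are labels rather than measurements, the only meaningful summary here is a count per attribute; taking a mean of these values would not make sense.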