– Distribution – describes what values the variable takes and how often
– Pie Charts/Bar Graphs – categorical
– Histograms/Stem plots – quantitative
– A data set has info on a number of individuals
– For each individual, the data give values for one or more variables
– When looking at a graph, describe:
  o Center – middle of the data
  o Shape – symmetric or skewed
  o Spread – range of the data
Chapter 5
– Regression line is used to predict y from x
– ŷ = a + bx, least-squares regression line
  o passes through (x̄, ȳ)
  o b = r(s_y/s_x), a = ȳ – b·x̄
– r² = fraction of the variation in one variable that is explained by the least-squares regression on the other variable
– Extrapolation – using the regression line to predict for x values outside the range of the data; such predictions are often unreliable
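A minimal sketch of the slope and intercept formulas above, using only Python's standard library. The function name `least_squares` and the five data points are made up for illustration; they are not from the notes.

```python
from statistics import mean, stdev

def least_squares(xs, ys):
    """Return (a, b, r) for the line y-hat = a + b*x (hypothetical helper)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 divisor)
    # r = average product of standardized x and y values, with n - 1 divisor
    r = sum((x - x_bar) / s_x * (y - y_bar) / s_y
            for x, y in zip(xs, ys)) / (n - 1)
    b = r * s_y / s_x        # slope
    a = y_bar - b * x_bar    # intercept, so the line passes through (x-bar, y-bar)
    return a, b, r

# Made-up example data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b, r = least_squares(xs, ys)
print(f"y-hat = {a:.2f} + {b:.2f}x, r = {r:.3f}, r^2 = {r * r:.3f}")
# prints: y-hat = 2.20 + 0.60x, r = 0.775, r^2 = 0.600
```

Here r² = 0.6, so about 60% of the variation in y is explained by the regression on x. Plugging in an x far outside 1 to 5 would be extrapolation.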
– Correlation ≠ Causation
Chapter 2
– Numerical summary provides center & spread
– Mean & median are different (mean = average, median = midpoint of the ordered data)
– Five-number summary – min, Q1, M, Q3, max
  o Outlier: below Q1 – 1.5 × (Q3 – Q1) or above Q3 + 1.5 × (Q3 – Q1)
– Variance (s²) and standard deviation (s) measure the spread of a distribution; both get larger as spread increases
– Median and quartiles are resistant, mean and standard deviation are not
– Mean and standard deviation suit reasonably symmetric distributions without outliers
– Five-number summary suits skewed distributions or those with outliers
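A short sketch of the five-number summary and the 1.5 × IQR outlier rule. It assumes one common textbook convention for quartiles (Q1 and Q3 are the medians of the observations below and above the median's position); other software may use slightly different quartile rules. The helper names and the data are invented for illustration.

```python
from statistics import mean, median

def five_number_summary(data):
    """Return (min, Q1, M, Q3, max); quartile convention is an assumption."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]           # observations below the median's position
    upper = xs[half + n % 2:]   # observations above the median's position
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

def outliers(data):
    """Flag values more than 1.5 * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]

# Made-up data with one large value
data = [2, 4, 5, 7, 8, 9, 10, 12, 30]
print(five_number_summary(data))  # (2, 4.5, 8, 11.0, 30)
print(outliers(data))             # [30]

# Resistance: dropping the outlier moves the mean more than the median
trimmed = [x for x in data if x not in outliers(data)]
print(mean(data), median(data))
print(mean(trimmed), median(trimmed))
```

With the value 30 removed, the mean drops from about 9.67 to 7.125 while the median only moves from 8 to 7.5.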
Chapter 3
– Density curve has total area of 1; area over an interval = relative frequency in that interval
  o Idealized description of the overall pattern, smooths out irregularities
  o Mean of a density curve is µ, standard deviation is σ
  o Mean is the balance point; range is (-∞, ∞)
  o µ = M for symmetric curves
  o N(µ, σ) standardizes to N(0, 1); 68-95-99.7 rule
  o z = (x – µ)/σ
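The 68-95-99.7 rule and the z-score formula above can be checked with `statistics.NormalDist` (Python 3.8+). This is only a sketch; the N(64.5, 2.5) distribution and the value 68 are hypothetical numbers chosen for illustration.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations it lies from the mean."""
    return (x - mu) / sigma

std_normal = NormalDist(0, 1)

# Area under N(0, 1) within k standard deviations of the mean
areas = {}
for k in (1, 2, 3):
    areas[k] = std_normal.cdf(k) - std_normal.cdf(-k)
    print(f"within {k} sd: {areas[k]:.4f}")

# Hypothetical example: x = 68 from a N(64.5, 2.5) distribution
z = z_score(68, 64.5, 2.5)
upper_tail = 1 - std_normal.cdf(z)  # proportion of values above x
print(f"z = {z:.2f}, proportion above = {upper_tail:.4f}")
```

The three areas come out near 0.68, 0.95, and 0.997, which is where the rule gets its name.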
Chapter 4
– x = explanatory variable
– y = response variable
– In a scatter plot, find:
  o Direction – positive or negative association
  o Form – linear, curved, clustered
  o Strength – how closely the points follow the form
– Correlation r measures the direction & strength of a linear relationship
  o r > 0 positive, r < 0 negative
  o -1 ≤ r ≤ 1
  o r = ±1, perfectly linear
  o r = 0, no linear relationship
  o r is not affected by a change in units of measurement,