Statistics 1 course notes
Variable types (S. S. Stevens, 1946)
Nominal - assign items to categories; are discrete / categorical
Ordinal - rank order items; are categorical, but often treated as continuous
Interval - rank order items, and the distance between adjacent values is equal; are continuous
Ratio - same as interval, plus a true zero; are continuous
Z score: Z = (X - M) / SD
Mean z-score is always 0. Negative is below average; positive is above.
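A minimal Python sketch of the z-score formula, to make the "mean is always 0" point concrete. The scores are made-up data, the function name z_scores is only illustrative, and the SD here is assumed to be the population SD (divide by N), matching the variance formula in these notes.

```python
import math

def z_scores(scores):
    """Convert raw scores to z-scores: Z = (X - M) / SD, with SD computed over N."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)
    return [(x - mean) / sd for x in scores]

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data: M = 5, SD = 2
z = z_scores(scores)
print(z)                 # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
print(sum(z) / len(z))   # 0.0
```

Scores below the mean (2, 4) come out negative and scores above it (7, 9) come out positive.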
Mean = average
Median = midpoint score in a distribution (half of the scores fall below, half fall above). It is the middle number in a sorted list. If the number of scores is even, take the average of the two middle scores.
Mode = most frequent score
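The three measures of central tendency can be sketched in Python as follows. The median function is written out by hand to show the odd and even cases; mean and mode come from Python's standard statistics module. The data are made up for illustration.

```python
import statistics

def median(scores):
    """Middle value of the sorted scores; average the two middle values if N is even."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_scores = [3, 1, 4, 1, 5]   # made-up data, N is odd
even_scores = [1, 2, 3, 4]     # made-up data, N is even

print(statistics.mean(odd_scores))  # 2.8
print(median(odd_scores))           # 3
print(median(even_scores))          # 2.5
print(statistics.mode(odd_scores))  # 1
```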
Deviation = (X - M)
Sum of Squares (SS) = ∑ (X - M)^2
Variance = ∑ (X - M)^2 / N (mean squares)
Standard Deviation = sqrt of variance
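The chain from deviations to SS to variance to SD can be followed step by step in a short Python sketch. The scores are made-up data. The notes divide by N, so the check at the end uses the population functions pvariance and pstdev from the standard statistics module (its variance and stdev functions divide by N - 1 instead).

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data
n = len(scores)
mean = sum(scores) / n

deviations = [x - mean for x in scores]   # (X - M) for each score
ss = sum(d ** 2 for d in deviations)      # sum of squares
variance = ss / n                         # mean squares
sd = math.sqrt(variance)                  # standard deviation

print(ss, variance, sd)  # 32.0 4.0 2.0

# Cross-check against the standard library's population versions.
print(statistics.pvariance(scores), statistics.pstdev(scores))
```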
Pearson product-moment correlation coefficient (r): degree to which X and Y vary together, relative to the degree to which they vary independently
Sum of cross products SPxy: Measure of the degree to which X and Y vary together.
SPxy = ∑ [(X - Mx) * (Y - My)]
r = SPxy / sqrt(SSx * SSy)
SSx and SSy are measures of the degree to which X and Y vary independently.
Z score formula: r = ∑ (Zx * Zy) / N
Covariance = SP / N
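As a rough check that the raw-score formula and the z-score formula for r agree, here is a small Python sketch. The data and all function names are made up for illustration, and the z-scores are assumed to use the population SD (divide by N), which is what makes the two formulas match exactly.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sum_of_squares(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def sum_of_cross_products(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))

def pearson_r(xs, ys):
    """Raw-score formula: r = SPxy / sqrt(SSx * SSy)."""
    return sum_of_cross_products(xs, ys) / math.sqrt(
        sum_of_squares(xs) * sum_of_squares(ys)
    )

def pearson_r_from_z(xs, ys):
    """Z-score formula: r = sum(Zx * Zy) / N, using population SDs."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sdx = math.sqrt(sum_of_squares(xs) / n)
    sdy = math.sqrt(sum_of_squares(ys) / n)
    return sum(((x - mx) / sdx) * ((y - my) / sdy) for x, y in zip(xs, ys)) / n

xs = [1, 2, 3, 4, 5]  # made-up data
ys = [2, 4, 5, 4, 5]

sp = sum_of_cross_products(xs, ys)
covariance = sp / len(xs)
r = pearson_r(xs, ys)
r_z = pearson_r_from_z(xs, ys)

print(sp, covariance)              # 6.0 1.2
print(round(r, 3), round(r_z, 3))  # 0.775 0.775
```

Here SSx = 10 and SSy = 6, so r = 6 / sqrt(60), and both routes give the same value up to floating-point rounding.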
Assumptions for r:
1) normal distribution of X and Y - check histograms
2) linear relationship between X and Y - check scatterplots
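3) homoscedasticity - the vertical distance between each scatterplot dot and the regression line is the prediction error (aka “residual”); the spread of these residuals should be roughly the same across all values of X - check scatterplots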
Reliability - the correlation between X1 and X2 (two measurements of the same variable X) is an estimate of reliability, and it limits how strongly X can correlate with anything else