* Columns correspond to variables
* Rows correspond to individuals
* Rows are often called observations or cases
* The number of rows is traditionally denoted by n
* In a bar chart, the height of each bar is proportional to the count (or percent) in each category
* In a pie chart, the area of each slice is proportional to the percent of individuals in each category
* A Pareto chart is a bar chart whose categories are sorted by frequency
* Pie charts are less useful than bar charts for comparing actual counts (bars are easier to compare than the angles of wedges)
* The area occupied by any part of a graph or chart that displays data should be proportional to the amount of data it represents
* Mode of a categorical variable: the category with the highest frequency
* Mode of a numerical variable: the location of a major peak of the distribution
* Mean = center of mass = the balance point
* The first quartile Q1 is the median of the observations below the median
* The third quartile Q3 is the median of the observations above the median
* The interquartile range is IQR = Q3 - Q1
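The quartile definitions above can be sketched in a few lines of Python. This is only an illustration: the function name `quartiles` and the sample data are made up, and it assumes one common convention, namely that when n is odd the median observation itself is left out of both halves. Other textbooks and software use slightly different rules, so results may differ a little.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves definition.

    Assumption: for odd n, the middle observation is excluded from
    both the lower and the upper half. Needs at least 2 observations.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]            # observations below the median position
    upper = xs[half + n % 2:]    # observations above the median position
    return median(lower), median(xs), median(upper)

sample = [2, 4, 4, 5, 7, 9, 10, 12, 15]   # made-up observations
q1, m, q3 = quartiles(sample)
iqr = q3 - q1                              # IQR = Q3 - Q1
print(q1, m, q3, iqr)
```

For this sample the lower half is 2, 4, 4, 5 and the upper half is 9, 10, 12, 15, so Q1 = 4, the median is 7, Q3 = 11, and IQR = 7.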
* Coefficient of variation: the ratio of the SD to the mean; it has no units, is usually expressed as a percentage, and indicates how large the SD is relative to the mean
* IQR: a robust measure (like the median); has the same units as the observations
* Standard deviation s: NOT a robust measure (like the mean); has the same units as the observations; s = 0 if and only if all the observations are equal
* In a boxplot, the central box spans the quartiles Q1 and Q3.
* The line in the box marks the median M.
* The whiskers extend out to the smallest and largest observations
* A density curve describes the overall pattern of a distribution. The area under the curve and above any range of values is the relative frequency of all observations that fall in that range.
* Uniform density = flat
* A mode of a density curve is a peak point of the curve
* The median of a density curve...
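The "area = relative frequency" idea for density curves can be checked on the flat (uniform) density, where areas are just rectangles. The sketch below is illustrative only: the helper name `uniform_area` and the interval from 0 to 10 are assumptions, not part of the notes.

```python
def uniform_area(a, b, lo, hi):
    """Area under the Uniform(a, b) density above the range [lo, hi].

    The density is flat with height 1 / (b - a) on [a, b] and 0 elsewhere,
    so the area is height times the overlapping width.
    """
    if b <= a:
        raise ValueError("need a < b")
    height = 1.0 / (b - a)
    left, right = max(a, lo), min(b, hi)   # clip the range to [a, b]
    width = max(0.0, right - left)         # no overlap gives zero area
    return height * width

total = uniform_area(0, 10, 0, 10)   # whole curve: total area is 1
part = uniform_area(0, 10, 2, 5)     # about 0.3 of observations fall in [2, 5]
outside = uniform_area(0, 10, 12, 15)  # range outside [0, 10]: area 0
print(total, part, outside)
```

Because the curve is flat, every range of the same width inside the interval gets the same relative frequency, which is also why a uniform density has no single peak.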