Monday 16th January 2012
Mr Aroui
Statistical experiment: A test adopted for collecting data to provide evidence for or against a hypothesis.
Event: A subset of the possible outcomes of an experiment.
Sample space: A list of all possible outcomes of an experiment.
Discrete data: Discrete data can only take certain values in any given range. The number of cars in a household is an example of discrete data. The values do not have to be whole numbers (e.g. shoe size is discrete).
Continuous data: Continuous data can take any value in a given range. A person’s height is continuous, since it could be any value within set limits.
Histograms: The key feature of a histogram is that the area of each block is proportional to the frequency. For the area to be equal (or proportional) to the frequency, we plot frequency density on the vertical axis, where frequency density = frequency ÷ class width. The class width is the width of the interval (i.e. it runs from the lower boundary to the upper boundary). When a question says “give a reason to justify the use of a histogram to represent these data”, the answer is “Data is continuous”.
Box-Whisker Diagrams:
Main features & uses of Box Plots:
Indicates max / median / min / upper quartile / lower quartile.
Indicates outliers.
Indicates range / interquartile range / spread.
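The spread measures a box plot shows can be worked out directly from its five key values. The sketch below is only an illustration: the function name, the sample numbers and the 1.5 × IQR outlier rule are assumptions, since exam questions normally state their own outlier rule.

```python
def box_plot_summary(minimum, q1, median, q3, maximum, k=1.5):
    """Return the spread measures a box plot displays.

    k is an ASSUMED outlier multiplier (1.5 x IQR is a common convention);
    use whatever rule the question gives instead.
    """
    iqr = q3 - q1
    return {
        "median": median,
        "range": maximum - minimum,
        "iqr": iqr,
        "lower_fence": q1 - k * iqr,  # values below this count as outliers
        "upper_fence": q3 + k * iqr,  # values above this count as outliers
    }


# Hypothetical five-number summary, chosen only for illustration.
summary = box_plot_summary(minimum=12, q1=20, median=25, q3=32, maximum=60)
max_is_outlier = 60 > summary["upper_fence"]
```

With these made-up values the interquartile range is 12, so under the assumed rule anything above 50 would be marked as an outlier.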
Mathematical Modelling in Statistics: A mathematical model is a simplification of a real world situation. It can be used to find solutions to problems without the need to construct a physical model.
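Interpolation: Finding the Median, Lower Quartile and Upper Quartile using Interpolation. Median = b + ((n/2 − f) ÷ fc) × w, where b is the lower class boundary of the class containing the median, f is the sum of all the frequencies below b, fc is the frequency of the class, w is the class width and n is the total frequency. For the lower and upper quartiles, replace n/2 with n/4 and 3n/4 respectively.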
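As a rough sketch of interpolation on grouped data, the function below finds the class containing the required position and then moves a proportional distance into it, using the lower class boundary, the cumulative frequency below it, the class frequency and the class width. The function name and the grouped data are made up for illustration, and the position is assumed to be the fraction times the total frequency.

```python
def interpolate_quantile(classes, fraction):
    """Estimate a quantile of grouped data by linear interpolation.

    classes: list of (lower_boundary, upper_boundary, frequency), ascending.
    fraction: 0.5 for the median, 0.25 and 0.75 for the quartiles.
    """
    n = sum(freq for _, _, freq in classes)
    target = fraction * n      # position of the required value
    below = 0                  # sum of frequencies below the current class
    for lower, upper, freq in classes:
        if freq > 0 and below + freq >= target:
            width = upper - lower
            return lower + (target - below) / freq * width
        below += freq
    raise ValueError("fraction must be between 0 and 1")


# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
data = [(0, 10, 5), (10, 20, 15), (20, 30, 20), (30, 40, 10)]
lower_quartile = interpolate_quantile(data, 0.25)
median = interpolate_quantile(data, 0.5)
upper_quartile = interpolate_quantile(data, 0.75)
```

For the median of this made-up table, the position is 25, which falls in the 20 to 30 class with 20 values below it, so the estimate is 20 + (5 ÷ 20) × 10 = 22.5.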
Finding the Mean, Variance and Standard Deviation:
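A minimal sketch of these calculations, assuming the usual formulas mean = Σfx ÷ Σf, variance = Σfx² ÷ Σf − mean², and standard deviation = √variance (dividing by n rather than n − 1). The function name and the data are illustrative assumptions.

```python
import math


def mean_variance_sd(values, frequencies=None):
    """Mean, variance and standard deviation, dividing by n (assumed form)."""
    if frequencies is None:
        frequencies = [1] * len(values)  # raw data: every value counts once
    n = sum(frequencies)
    sum_fx = sum(f * x for x, f in zip(values, frequencies))
    sum_fx2 = sum(f * x * x for x, f in zip(values, frequencies))
    mean = sum_fx / n
    variance = sum_fx2 / n - mean ** 2   # "mean of the squares minus square of the mean"
    return mean, variance, math.sqrt(variance)


# Hypothetical raw data.
mean, variance, sd = mean_variance_sd([2, 4, 4, 4, 5, 5, 7, 9])
```

Here Σx = 40 and Σx² = 232 with n = 8, so the mean is 5, the variance is 29 − 25 = 4 and the standard deviation is 2. For a frequency table, pass the values (or class mid-points) and their frequencies as two lists.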
Skewness: The shape (skewness) of a data set can be described using diagrams, measures of location and measures of spread. A distribution can be symmetrical, have positive skew or have negative skew.
Using the Quartiles:
Using the Measures of location (Mode, Median & Mean):
Using the coefficient of skewness: 3(mean − median) ÷ standard deviation
This gives you a value and tells you how skewed the data is. The closer to zero, the more symmetrical the data.
A negative number means the data has negative skew.
A positive number means the data has positive skew.
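The three approaches to describing skewness can be sketched as below. The rules coded here are the commonly taught ones and are stated as assumptions: Q2 − Q1 < Q3 − Q2 for positive skew, mode < median < mean for positive skew, and a coefficient taken to be 3(mean − median) ÷ standard deviation. All function names and numbers are illustrative.

```python
def skew_from_quartiles(q1, q2, q3):
    """Compare the two halves of the box: Q2 - Q1 against Q3 - Q2."""
    left, right = q2 - q1, q3 - q2
    if left < right:
        return "positive skew"
    if left > right:
        return "negative skew"
    return "symmetrical"


def skew_from_averages(mode, median, mean):
    """Compare the order of mode, median and mean."""
    if mode < median < mean:
        return "positive skew"
    if mean < median < mode:
        return "negative skew"
    if mode == median == mean:
        return "symmetrical"
    return "inconclusive"  # the three averages do not follow a clear order


def skew_coefficient(mean, median, sd):
    """Assumed coefficient: 3(mean - median) / standard deviation."""
    return 3 * (mean - median) / sd


# Hypothetical summary statistics.
by_quartiles = skew_from_quartiles(20, 25, 32)
by_averages = skew_from_averages(mode=4, median=5, mean=6)
coefficient = skew_coefficient(mean=6.0, median=5.0, sd=2.0)
```

In this made-up example all three methods agree: the upper half of the box is longer, the mean is the largest average, and the coefficient is 1.5, which is positive.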
Calculating Sxx, Syy and Sxy
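A short sketch of the usual summary statistics for paired data, assuming the standard definitions Sxx = Σx² − (Σx)² ÷ n, Syy = Σy² − (Σy)² ÷ n and Sxy = Σxy − (Σx)(Σy) ÷ n. The function name and the data set are made up for illustration.

```python
def summary_sums(xs, ys):
    """Return (Sxx, Syy, Sxy) for paired observations."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    return s_xx, s_yy, s_xy


# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
s_xx, s_yy, s_xy = summary_sums(xs, ys)
```

For this data Σx = 15, Σy = 20, Σx² = 55, Σy² = 86 and Σxy = 66, giving Sxx = 10, Syy = 6 and Sxy = 6.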
A regression line can be used to estimate the value of the dependent variable for any value of the independent variable. Interpolation is when you estimate the value of a dependent variable within the range of the data. Extrapolation is when you estimate a value outside the range of the data. Values estimated by extrapolation can be unreliable. You should not, in general, extrapolate and you must view any extrapolated values with caution. To turn a coded regression line into an actual regression line you substitute the codes into the answer.
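The difference between interpolation and extrapolation can be illustrated with a small sketch. It assumes the least squares line y = a + bx with b = Sxy ÷ Sxx and a = ȳ − b x̄. The function names and the data are hypothetical.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    return a, b


def estimate(a, b, x, xs):
    """Estimate y at x and say whether x lies inside the range of the data."""
    kind = "interpolation" if min(xs) <= x <= max(xs) else "extrapolation"
    return a + b * x, kind


# Hypothetical paired data: x is the independent variable.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = regression_line(xs, ys)
inside = estimate(a, b, 2.5, xs)   # x within the data range
outside = estimate(a, b, 10, xs)   # x outside the data range: treat with caution
```

The line for this data is approximately y = 2.2 + 0.6x. The estimate at x = 2.5 is an interpolation, while the estimate at x = 10 is an extrapolation and may be unreliable.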
Discrete Random Variables:
The Mean (Expected Value) of a discrete random variable:
The Expected Value of :
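As a small illustration, the expected value of a discrete random variable X is usually computed as E(X) = Σ x P(X = x), and the expected value of a function g(X) as Σ g(x) P(X = x). The sketch below uses g(x) = x² as one example. The function name and the probability distribution are assumptions made up for this example.

```python
def expected_value(distribution, g=lambda x: x):
    """Sum g(x) * P(X = x) over a distribution given as {x: probability}."""
    total_probability = sum(distribution.values())
    if abs(total_probability - 1) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(g(x) * p for x, p in distribution.items())


# Hypothetical distribution of X.
distribution = {1: 0.2, 2: 0.5, 3: 0.3}
e_x = expected_value(distribution)                     # E(X)
e_x2 = expected_value(distribution, lambda x: x ** 2)  # E(X^2)
```

For this distribution E(X) = 1(0.2) + 2(0.5) + 3(0.3) = 2.1 and E(X²) = 1(0.2) + 4(0.5) + 9(0.3) = 4.9.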
Example 1: Give two reasons for using mathematical models.
1. Mathematical models are cheaper and easier to use than the real situation.
2. Mathematical models can help improve our understanding of a real world problem.
Example 2: Explain briefly the role of statistical tests in the process of mathematical modelling.
Statistical tests are used to assess how well a mathematical model matches a real world situation.
Example 3: Describe briefly the process of refining a mathematical model.
Predictions based on the model are compared with observed data. In the light of this comparison, the model is adjusted. The process is repeated.