# S2 Notes (Edexcel)

Topics: Normal distribution, Probability theory, Cumulative distribution function Pages: 7 (1732 words) Published: May 10, 2013
S2 EDEXCEL REVISION NOTES

Binomial Distribution
Binomial probability distribution is defined as:
* $P(X=r) = {}^{n}C_{r} \times p^{r} \times (1-p)^{n-r}$
* Distribution is written as: X~B(n,p)
Conditions include:
* Fixed number of trials
* All trials are independent of one another
* Probability of success remains constant
* Each trial must have the same two possible outcomes (success or failure)
E(X) = np
Var(X) = npq [where q = 1 - p]
$SD(X) = \sqrt{\mathrm{Var}(X)} = \sqrt{npq}$
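The formulas above can be checked numerically. The following is a minimal sketch, assuming illustrative values n = 10 and p = 0.3 (not taken from these notes); the helper name `binomial_pmf` is hypothetical.

```python
from math import comb, sqrt

def binomial_pmf(r: int, n: int, p: float) -> float:
    """P(X = r) for X ~ B(n, p), using nCr * p^r * (1-p)^(n-r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Illustrative values only (assumed, not from the notes)
n, p = 10, 0.3
q = 1 - p

# Probability of every possible outcome r = 0, 1, ..., n
probs = [binomial_pmf(r, n, p) for r in range(n + 1)]

# Mean and variance worked out directly from the distribution
mean = sum(r * pr for r, pr in enumerate(probs))
variance = sum(r**2 * pr for r, pr in enumerate(probs)) - mean**2

print(f"P(X = 3) = {probs[3]:.4f}")
print(f"E(X)   = {mean:.4f}   (np  = {n * p:.4f})")
print(f"Var(X) = {variance:.4f}   (npq = {n * p * q:.4f})")
print(f"SD(X)  = {sqrt(variance):.4f}")
```

The directly computed mean and variance should agree with np and npq up to rounding.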
To calculate the probabilities:
* P(X ≤ x) = read off the tables
* P(X ≥ x) = 1- P(X ≤ x-1)
* P(X < x) = P(X ≤ x-1)
* P(X > x) = 1- P(X ≤ x)
* P(x ≤ X ≤ y) = P(X ≤ y) – P(X ≤ x-1)
* P(x < X < y) = P(X ≤ y-1) – P(X ≤ x)
* P(x ≤ X < y) = P(X ≤ y-1) – P(X ≤ x-1)
* P(x < X ≤ y) = P(X ≤ y) – P(X ≤ x)
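As a check on these rules, the sketch below compares each "tables" expression with a direct sum of the individual probabilities. It assumes illustrative values n = 20, p = 0.25, x = 3 and y = 7; `binomial_cdf` is a hypothetical helper standing in for the cumulative tables. Note that when both inequalities are strict, the upper boundary drops to y-1.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p); this is the value the tables give."""
    if x < 0:
        return 0.0
    return sum(binomial_pmf(r, n, p) for r in range(min(x, n) + 1))

# Illustrative values only (assumed, not from the notes)
n, p = 20, 0.25
x, y = 3, 7

def F(k):
    """Shorthand for the cumulative (table) value P(X <= k)."""
    return binomial_cdf(k, n, p)

def direct(values):
    """Add up P(X = r) over the listed values of r."""
    return sum(binomial_pmf(r, n, p) for r in values)

# Each entry: (value from the cumulative rule, value by direct summation)
checks = {
    "P(X >= x)":      (1 - F(x - 1),        direct(range(x, n + 1))),
    "P(X < x)":       (F(x - 1),            direct(range(0, x))),
    "P(X > x)":       (1 - F(x),            direct(range(x + 1, n + 1))),
    "P(x <= X <= y)": (F(y) - F(x - 1),     direct(range(x, y + 1))),
    "P(x < X < y)":   (F(y - 1) - F(x),     direct(range(x + 1, y))),
    "P(x <= X < y)":  (F(y - 1) - F(x - 1), direct(range(x, y))),
    "P(x < X <= y)":  (F(y) - F(x),         direct(range(x + 1, y + 1))),
}

for label, (from_rule, by_summing) in checks.items():
    print(f"{label:16s} rule = {from_rule:.6f}   direct = {by_summing:.6f}")
```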
Note: If p > 0.5, the tables cannot be used directly, so convert X~B(n,p) to Y~B(n,1-p), where Y = n – X counts the failures
Example:
If X~B(18,0.9), 0.9 is > 0.5 therefore not on the tables
We want to find P(X > 8)
This will become P(Y < 10) where Y~B(18,0.1)
Because...
* X changes to Y = 18 – X
* n stays the same (18)
* NEW value for p is 1.0 – 0.9 = 0.1
* NEW boundary value is 18 – 8 = 10
* And the sign swaps round
We can now read this off the tables as P(Y ≤ 9)
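The conversion can be verified numerically. This is a sketch only, assuming Y = 18 – X counts the failures, so that Y~B(18, 0.1) and P(X > 8) = P(Y < 10) = P(Y ≤ 9); `binomial_cdf` is a hypothetical helper standing in for the tables.

```python
from math import comb

def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    if x < 0:
        return 0.0
    return sum(comb(n, r) * p**r * (1 - p)**(n - r)
               for r in range(min(x, n) + 1))

n, p = 18, 0.9

# Direct calculation with the original variable X ~ B(18, 0.9)
p_x_gt_8 = 1 - binomial_cdf(8, n, p)

# Same probability via the failures Y = 18 - X ~ B(18, 0.1):
# X > 8  <=>  18 - Y > 8  <=>  Y < 10  <=>  Y <= 9
p_y_le_9 = binomial_cdf(9, n, 1 - p)

print(f"P(X > 8)  with X ~ B(18, 0.9): {p_x_gt_8:.6f}")
print(f"P(Y <= 9) with Y ~ B(18, 0.1): {p_y_le_9:.6f}")
```

Both lines should print the same value, which is the one the tables give for P(Y ≤ 9) with n = 18 and p = 0.1.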

Poisson Distribution
Poisson probability distribution is defined as:
* $P(X=r) = \dfrac{e^{-\lambda} \lambda^{r}}{r!}$
* Distribution is written as: X~Po(λ)
Conditions include:
* Events occur at random
* All events are independent of one another
* Average rate of occurrence remains constant
* Zero probability of simultaneous occurrences
E(X) = λ
Var(X) = λ
$SD(X) = \sqrt{\mathrm{Var}(X)} = \sqrt{\lambda}$
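A short numerical check of the Poisson formulas, as a sketch assuming an illustrative rate λ = 2.5 (not from these notes). The infinite sum is cut off at r = 59, where the remaining probability is negligible; `poisson_pmf` is a hypothetical helper name.

```python
from math import exp, factorial, sqrt

def poisson_pmf(r: int, lam: float) -> float:
    """P(X = r) for X ~ Po(lam), using e^(-lam) * lam^r / r!."""
    return exp(-lam) * lam**r / factorial(r)

lam = 2.5  # illustrative rate (assumed)

# Truncate the distribution at r = 59; the tail beyond is negligible here
probs = [poisson_pmf(r, lam) for r in range(60)]

mean = sum(r * pr for r, pr in enumerate(probs))
variance = sum(r**2 * pr for r, pr in enumerate(probs)) - mean**2

print(f"P(X = 0) = {probs[0]:.4f}")
print(f"E(X)   = {mean:.4f}   (lambda = {lam})")
print(f"Var(X) = {variance:.4f}   (lambda = {lam})")
print(f"SD(X)  = {sqrt(variance):.4f}")
```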
To calculate the probabilities:
* P(X ≤ x) = read off the tables
* P(X ≥ x) = 1- P(X ≤ x-1)
* P(X < x) = P(X ≤ x-1)
* P(X > x) = 1- P(X ≤ x)
* P(x ≤ X ≤ y) = P(X ≤ y) – P(X ≤ x-1)
* P(x < X < y) = P(X ≤ y-1) – P(X ≤ x)
* P(x ≤ X < y) = P(X ≤ y-1) – P(X ≤ x-1)
* P(x < X ≤ y) = P(X ≤ y) – P(X ≤ x)
To approximate the binomial with the Poisson, the following conditions have to apply:
* If X~B(n,p) and
  1) n is large [n > 50] and
  2) p is small [p < 0.1]
  then X is approximately Po(np), i.e. λ = np
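To see how close the approximation is, the sketch below compares B(n, p) with Po(np) term by term. The values n = 100 and p = 0.03 are assumptions chosen to satisfy both conditions; they are not from these notes.

```python
from math import comb, exp, factorial

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam)."""
    return exp(-lam) * lam**r / factorial(r)

# Illustrative values: n large (> 50) and p small (< 0.1)
n, p = 100, 0.03
lam = n * p  # the approximating Poisson uses lambda = np

print(" r   binomial   Poisson")
differences = []
for r in range(8):
    b = binomial_pmf(r, n, p)
    po = poisson_pmf(r, lam)
    differences.append(abs(b - po))
    print(f"{r:2d}   {b:.4f}     {po:.4f}")

print(f"largest difference for r = 0..7: {max(differences):.4f}")
```

Making n larger or p smaller (keeping np fixed) should shrink the differences further.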

Continuous Random Variables
For a probability density function (p.d.f)
* $P(a < X < b) = \int_{a}^{b} f(x)\,dx$
* P(X < k) = P(X ≤ k)
For a cumulative distribution function (c.d.f)
* $F(x_0) = P(X \le x_0) = \int_{\text{lower limit}}^{x_0} f(x)\,dx$ (using the p.d.f)
* Finding probabilities – just sub numbers into the F(x), but check the limits first
* Finding the mode – with y = f(x), calculate $\frac{dy}{dx}$, make $\frac{dy}{dx} = 0$ and solve (check the solution is a maximum inside the limits)
* To find the median, make F(x) = 0.5
* To find the lower quartile (Q1), make F(x) = 0.25
* To find the upper quartile (Q3), make F(x) = 0.75
* To find the inter quartile range, calculate Q3 - Q1
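The steps above can be sketched for one specific case. The p.d.f used here, f(x) = 6x(1 - x) for 0 ≤ x ≤ 1, is an assumed example (not from these notes); integrating it gives F(x) = 3x² - 2x³. The equations F(x) = 0.25, 0.5 and 0.75 are solved by bisection, which is just one convenient numerical method; `solve_cdf` is a hypothetical helper name.

```python
def f(x):
    """Assumed example p.d.f: f(x) = 6x(1 - x) on [0, 1], 0 otherwise."""
    return 6 * x * (1 - x) if 0 <= x <= 1 else 0.0

def F(x):
    """c.d.f obtained by integrating f from 0 to x: F(x) = 3x^2 - 2x^3."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return 3 * x**2 - 2 * x**3

def solve_cdf(target, lo=0.0, hi=1.0, tol=1e-12):
    """Solve F(x) = target by bisection (F is increasing on [lo, hi])."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Probability between two values: sub the numbers into F(x)
p_between = F(0.6) - F(0.2)

# Mode: f'(x) = 6 - 12x = 0 gives x = 0.5, a maximum inside the limits
mode = 6 / 12

median = solve_cdf(0.5)
q1 = solve_cdf(0.25)
q3 = solve_cdf(0.75)
iqr = q3 - q1

print(f"P(0.2 < X < 0.6) = {p_between:.4f}")
print(f"mode = {mode}, median = {median:.4f}")
print(f"Q1 = {q1:.4f}, Q3 = {q3:.4f}, IQR = {iqr:.4f}")
```

Because this p.d.f is symmetric about x = 0.5, the mode and median coincide and Q1 and Q3 are equally spaced either side of 0.5.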
In general
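* $E(X) = \int_{\text{lower limit}}^{\text{upper limit}} x f(x)\,dx$
* $\mathrm{Var}(X) = \int_{\text{lower limit}}^{\text{upper limit}} x^{2} f(x)\,dx - [E(X)]^{2}$
* E(aX+b) = aE(X) + b
* $\mathrm{Var}(aX+b) = a^{2}\,\mathrm{Var}(X)$
* $SD(X) = \sqrt{\mathrm{Var}(X)}$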
* P(X=x) is always equal to 0
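These results can be checked with numerical integration. The sketch below assumes the example p.d.f f(x) = 6x(1 - x) on [0, 1] and the values a = 3, b = 2 (none of these are from the notes), and uses a hand-written composite Simpson's rule in place of exact integration.

```python
from math import sqrt

def simpson(g, a, b, n=1000):
    """Approximate the integral of g from a to b (composite Simpson, n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def f(x):
    """Assumed example p.d.f: f(x) = 6x(1 - x) on [0, 1]."""
    return 6 * x * (1 - x)

lower, upper = 0.0, 1.0

# E(X) = integral of x f(x); Var(X) = integral of x^2 f(x) minus [E(X)]^2
mean = simpson(lambda x: x * f(x), lower, upper)
variance = simpson(lambda x: x**2 * f(x), lower, upper) - mean**2
sd = sqrt(variance)

# Linear transformation Y = aX + b, worked out directly from the p.d.f
a, b = 3, 2
mean_y = simpson(lambda x: (a * x + b) * f(x), lower, upper)
var_y = simpson(lambda x: (a * x + b) ** 2 * f(x), lower, upper) - mean_y**2

print(f"E(X) = {mean:.4f}, Var(X) = {variance:.4f}, SD(X) = {sd:.4f}")
print(f"E(aX+b)   = {mean_y:.4f}   (aE(X)+b   = {a * mean + b:.4f})")
print(f"Var(aX+b) = {var_y:.4f}   (a^2 Var(X) = {a**2 * variance:.4f})")
```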
* To convert c.d.f to p.d.f, you have to differentiate the c.d.f
For a 3 part probability density function
* Stage 1) Integrate the first f(x)
* Stage 2) Put the lower limit into that F(x) just calculated
* To calculate E(X), find $\int x f(x)\,dx$ for each part of f(x) and then add them
* To calculate Var(X), find $\int x^{2} f(x)\,dx$ for each part of f(x), add them, and then subtract [E(X)]²
* To calculate SD(X), take the square root of Var(X)
* To calculate the median, use the part of F(x) whose values include 0.5 and put that = 0.5
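A sketch of the piecewise calculation, assuming an example p.d.f that is not from these notes: f(x) = x for 0 ≤ x < 1, f(x) = 2 - x for 1 ≤ x ≤ 2, and 0 otherwise. It also assumes the usual way of building the c.d.f: each part is integrated from its own lower limit, and the c.d.f value reached at the end of the previous part is added on. Integrals are done with a hand-written Simpson's rule.

```python
from math import sqrt

def simpson(g, a, b, n=1000):
    """Approximate the integral of g from a to b (composite Simpson, n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

# Each non-zero part of the assumed p.d.f: (formula, lower limit, upper limit)
pieces = [
    (lambda x: x,     0.0, 1.0),
    (lambda x: 2 - x, 1.0, 2.0),
]

# Total area should be 1 for a valid p.d.f
total_area = sum(simpson(f, a, b) for f, a, b in pieces)

# E(X) and E(X^2): integrate over each part separately, then add
mean = sum(simpson(lambda x, f=f: x * f(x), a, b) for f, a, b in pieces)
ex2 = sum(simpson(lambda x, f=f: x**2 * f(x), a, b) for f, a, b in pieces)

variance = ex2 - mean**2   # subtract [E(X)]^2 once, after adding the parts
sd = sqrt(variance)        # SD is the square root of the total variance

def F(x):
    """c.d.f built part by part, carrying forward the value from the previous part."""
    if x < 0:
        return 0.0
    if x < 1:
        return x**2 / 2                # integral of t from 0 to x
    if x <= 2:
        return 2 * x - x**2 / 2 - 1    # F(1) = 0.5 plus integral of (2 - t) from 1 to x
    return 1.0

print(f"total area = {total_area:.4f}")
print(f"E(X) = {mean:.4f}, Var(X) = {variance:.4f}, SD(X) = {sd:.4f}")
print(f"F(1) = {F(1)}, F(2) = {F(2)}")
```

Here F(1) = 0.5, so the median is x = 1; in general the median comes from whichever part of F(x) passes through 0.5.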

Sampling
Key Definitions
* A statistical model is a statistical process devised to describe or make predictions about the expected behaviour of a real-world problem
* A population is a collection/group/set of individuals or items
* A sample is any subset of a population
* A sampling frame is a complete list or complete identification of the population (e.g. a list, index register, database, map or file)
* A sampling unit is an individual member of the population
* A...