# S2 Notes (Edexcel)

**Topics:** Normal distribution, Probability theory, Cumulative distribution function

**Pages:** 7 (1732 words)

**Published:** May 10, 2013

Binomial Distribution

Binomial probability distribution is defined as:

* $P(X=r) = \binom{n}{r}\, p^{r} (1-p)^{n-r}$, where $\binom{n}{r}$ is nCr

* Distribution is written as: X~B(n,p)

Conditions include:

* Fixed number of trials

* All trials are independent of one another

* Probability of success remains constant

* Each trial must have the same two possible outcomes (success or failure)

E(X) = np

Var(X) = npq [where q = 1 - p]

$SD(X) = \sqrt{Var(X)} = \sqrt{npq}$
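The formulas above can be checked numerically. The sketch below is a minimal illustration, not part of the original notes: the values n = 10 and p = 0.3 are made up, and `binomial_pmf` is a hypothetical helper name.

```python
from math import comb, sqrt

def binomial_pmf(r: int, n: int, p: float) -> float:
    """P(X = r) for X ~ B(n, p): nCr * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

# Illustrative values (assumed, not from the notes): X ~ B(10, 0.3)
n, p = 10, 0.3
q = 1 - p

mean = n * p               # E(X) = np
variance = n * p * q       # Var(X) = npq
std_dev = sqrt(variance)   # SD(X) = square root of npq

# The probabilities for r = 0, 1, ..., n should add up to 1
total = sum(binomial_pmf(r, n, p) for r in range(n + 1))

# E(X) can also be found the long way, as the sum of r * P(X = r)
mean_from_pmf = sum(r * binomial_pmf(r, n, p) for r in range(n + 1))

print(f"P(X = 3) = {binomial_pmf(3, n, p):.4f}")
print(f"E(X) = {mean:.2f}, Var(X) = {variance:.2f}, SD(X) = {std_dev:.4f}")
print(f"sum of probabilities = {total:.6f}")
```

The long-way mean `mean_from_pmf` should agree with np, which is a quick way to confirm the formula has been typed in correctly.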

To calculate the probabilities:

* P(X ≤ x) = read off the tables

* P(X ≥ x) = 1- P(X ≤ x-1)

* P(X < x) = P(X ≤ x-1)

* P(X > x) = 1- P(X ≤ x)

* P(x ≤ X ≤ y) = P(X ≤ y) – P(X ≤ x-1)

* P(x < X < y) = P(X ≤ y-1) – P(X ≤ x)

* P(x ≤ X < y) = P(X ≤ y-1) – P(X ≤ x-1)

* P(x < X ≤ y) = P(X ≤ y) – P(X ≤ x)
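These rules all reduce a probability to values of P(X ≤ x), which is what the tables give. A minimal sketch, assuming illustrative values n = 20, p = 0.25, x = 3 and y = 7 (none of these come from the notes), compares three of the rules with a direct sum of the individual probabilities. The names `binomial_cdf` and `direct` are hypothetical helpers.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

def binomial_cdf(x, n, p):
    """P(X <= x): the value a cumulative table would give."""
    if x < 0:
        return 0.0
    return sum(binomial_pmf(r, n, p) for r in range(min(x, n) + 1))

def direct(lo, hi, n, p):
    """Add up P(X = r) for lo <= r <= hi, as a cross-check."""
    return sum(binomial_pmf(r, n, p) for r in range(lo, hi + 1))

# Illustrative values (assumed): X ~ B(20, 0.25), x = 3, y = 7
n, p = 20, 0.25
x, y = 3, 7

# P(X >= x) = 1 - P(X <= x - 1)
at_least = 1 - binomial_cdf(x - 1, n, p)

# P(x <= X <= y) = P(X <= y) - P(X <= x - 1)
closed = binomial_cdf(y, n, p) - binomial_cdf(x - 1, n, p)

# P(x <= X < y) = P(X <= y - 1) - P(X <= x - 1)
half_open = binomial_cdf(y - 1, n, p) - binomial_cdf(x - 1, n, p)

print(f"P(X >= 3)      = {at_least:.4f}  (direct: {direct(x, n, n, p):.4f})")
print(f"P(3 <= X <= 7) = {closed:.4f}  (direct: {direct(x, y, n, p):.4f})")
print(f"P(3 <= X < 7)  = {half_open:.4f}  (direct: {direct(x, y - 1, n, p):.4f})")
```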

Note: If p > 0.5, the value is not on the tables, so convert X~B(n,p) to Y~B(n,1-p), where Y = n – X counts the failures

Example:

If X~B(18,0.9), 0.9 is > 0.5 therefore not on the tables

We want to find P(X > 8)

This will become P(Y < 10) AND Y~B(18,0.1)

Because...

X changes to Y = 18 – X

n stays at 18

NEW value for p is 1.0 – 0.9 = 0.1

NEW value in the inequality is 18 – 8 = 10

And the sign swaps round

We can now read this off the tables as P(Y ≤ 9)
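A quick numerical check of this kind of conversion is sketched below. It takes Y = 18 – X as the number of failures, so that Y~B(18, 0.1) and P(X > 8) = P(Y < 10) = P(Y ≤ 9). The helper `binomial_cdf` is a hypothetical name standing in for the tables.

```python
from math import comb

def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p), standing in for the tables."""
    if x < 0:
        return 0.0
    return sum(comb(n, r) * p**r * (1 - p) ** (n - r)
               for r in range(min(x, n) + 1))

n, p = 18, 0.9

# Directly (a calculator can do this even though the tables cannot):
# P(X > 8) = 1 - P(X <= 8)
direct = 1 - binomial_cdf(8, n, p)

# Counting failures instead: Y = 18 - X, so Y ~ B(18, 0.1)
# X > 8 is the same event as Y < 10, i.e. Y <= 9
via_failures = binomial_cdf(9, n, 1 - p)

print(f"P(X > 8) directly      = {direct:.6f}")
print(f"P(Y <= 9), Y~B(18,0.1) = {via_failures:.6f}")
```

The two values agree because every outcome with more than 8 successes out of 18 has fewer than 10 failures.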

Poisson Distribution

Poisson probability distribution is defined as:

* $P(X=r) = \dfrac{e^{-\lambda}\,\lambda^{r}}{r!}$

* Distribution is written as: X~Po(λ)

Conditions include:

* Events occur at random

* All events are independent of one another

* Average rate of occurrence remains constant

* Zero probability of simultaneous occurrences

E(X) = λ

Var(X) = λ

$SD(X) = \sqrt{Var(X)} = \sqrt{\lambda}$
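As a rough check that the mean and variance of a Poisson distribution are both λ, the sketch below sums over the probabilities directly. The value λ = 2.5 is an illustrative assumption, and `poisson_pmf` and `poisson_cdf` are hypothetical helper names.

```python
from math import exp, factorial, sqrt

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam): e^(-lam) * lam^r / r!"""
    return exp(-lam) * lam**r / factorial(r)

def poisson_cdf(x, lam):
    """P(X <= x): the value a cumulative table would give."""
    if x < 0:
        return 0.0
    return sum(poisson_pmf(r, lam) for r in range(x + 1))

lam = 2.5  # illustrative value (assumed)

# X has no upper limit, so the sums are cut off at r = 59;
# the probabilities beyond that are negligibly small for this lam
terms = range(60)
mean = sum(r * poisson_pmf(r, lam) for r in terms)
mean_sq = sum(r**2 * poisson_pmf(r, lam) for r in terms)
variance = mean_sq - mean**2

print(f"P(X = 2)  = {poisson_pmf(2, lam):.4f}")
print(f"P(X <= 4) = {poisson_cdf(4, lam):.4f}")
print(f"E(X) = {mean:.4f}, Var(X) = {variance:.4f}, SD(X) = {sqrt(variance):.4f}")
```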

To calculate the probabilities:

* P(X ≤ x) = read off the tables

* P(X ≥ x) = 1- P(X ≤ x-1)

* P(X < x) = P(X ≤ x-1)

* P(X > x) = 1- P(X ≤ x)

* P(x ≤ X ≤ y) = P(X ≤ y) – P(X ≤ x-1)

* P(x < X < y) = P(X ≤ y-1) – P(X ≤ x)

* P(x ≤ X < y) = P(X ≤ y-1) – P(X ≤ x-1)

* P(x < X ≤ y) = P(X ≤ y) – P(X ≤ x)

To approximate the binomial with the Poisson, the following conditions have to apply:

* If X~B(n,p) and

1) n is large [n > 50] and

2) p is small [p < 0.1]

then X is approximately Po(np), i.e. λ = np
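To see how close the approximation is, the sketch below compares B(100, 0.02) with Po(2) term by term. The values n = 100 and p = 0.02 are illustrative assumptions chosen to satisfy both conditions.

```python
from math import comb, exp, factorial

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam)."""
    return exp(-lam) * lam**r / factorial(r)

# Illustrative values (assumed): n > 50 and p < 0.1
n, p = 100, 0.02
lam = n * p  # the Poisson parameter is np

print(" r   binomial   Poisson")
for r in range(7):
    print(f"{r:2d}   {binomial_pmf(r, n, p):.4f}     {poisson_pmf(r, lam):.4f}")

# Largest gap between the two sets of probabilities
max_diff = max(abs(binomial_pmf(r, n, p) - poisson_pmf(r, lam))
               for r in range(n + 1))
print(f"largest difference = {max_diff:.4f}")
```

Making n larger or p smaller (keeping np fixed) should shrink the largest difference further.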

Continuous Random Variables

For a probability density function (p.d.f)

* $P(a < X < b) = \int_{a}^{b} f(x)\,dx$

* P(X < k) = P(X ≤ k)

For a cumulative distribution function (c.d.f)

* $F(x_0) = P(X \le x_0) = \int_{\text{lower limit}}^{x_0} f(x)\,dx$, using the p.d.f

* Finding probabilities – just sub numbers into the F(x), but check the limits first

* Finding the mode – with y = f(x), calculate $\frac{dy}{dx}$, make $\frac{dy}{dx} = 0$ and solve

* To find the median, make F(x) = 0.5

* To find the lower quartile (Q1), make F(x) = 0.25

* To find the upper quartile (Q3), make F(x) = 0.75

* To find the inter quartile range, calculate Q3 - Q1
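The median and quartile steps above can be sketched with a made-up example. Assume the p.d.f. f(x) = 2x on 0 ≤ x ≤ 1 (not from the notes), which gives F(x) = x². The hypothetical helper `solve_cdf` finds the x where F(x) hits a target value by repeatedly halving an interval, which is a numerical stand-in for solving the equation by hand.

```python
def F(x):
    """c.d.f. of the illustrative p.d.f. f(x) = 2x on 0 <= x <= 1."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return x * x

def solve_cdf(target, lo=0.0, hi=1.0, tol=1e-10):
    """Find x with F(x) = target by bisection (F never decreases)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) < target:
            lo = mid   # the solution is to the right of mid
        else:
            hi = mid   # the solution is at mid or to the left
    return (lo + hi) / 2

lower_quartile = solve_cdf(0.25)   # F(x) = 0.25
median = solve_cdf(0.5)            # F(x) = 0.5
upper_quartile = solve_cdf(0.75)   # F(x) = 0.75
iqr = upper_quartile - lower_quartile

print(f"lower quartile = {lower_quartile:.4f}")
print(f"median         = {median:.4f}")
print(f"upper quartile = {upper_quartile:.4f}")
print(f"IQR            = {iqr:.4f}")
```

By hand, the same answers come from solving x² = 0.25, x² = 0.5 and x² = 0.75 and keeping the roots inside the range 0 ≤ x ≤ 1.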

In general

* $E(X) = \int_{\text{lower limit}}^{\text{upper limit}} x\,f(x)\,dx$

* $Var(X) = \int_{\text{lower limit}}^{\text{upper limit}} x^{2} f(x)\,dx - [E(X)]^{2}$

* $E(aX+b) = aE(X) + b$

* $Var(aX+b) = a^{2}\,Var(X)$

* $SD(X) = \sqrt{Var(X)}$

* P(X=x) is always equal to 0

* To convert c.d.f to p.d.f, you have to differentiate the c.d.f

For a 3 part probability density function

* Stage 1) Integrate the first f(x) to get the first part of F(x)

2) Put the upper limit of the first part (which is the lower limit of the next part) into the F(x) just calculated

3) Add this answer to the next f(x), integrated from that limit up to x

* To calculate E(X), find $\int x\,f(x)\,dx$ for each part of f(x) and then add them

* To calculate Var(X), find $\int x^{2} f(x)\,dx$ for each part of f(x), add them, and then subtract $[E(X)]^{2}$

* To calculate SD(X), take the square root of Var(X)

* To calculate the median, use the part of F(x) that passes through 0.5 and put that = 0.5
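A worked sketch of the part-by-part method, using a made-up p.d.f. with two non-zero parts (an assumption, not from the notes): f(x) = x/2 for 0 ≤ x ≤ 1, f(x) = 3/4 for 1 < x ≤ 2, and 0 otherwise. The integrals are done numerically with a simple midpoint rule in the hypothetical helper `integrate`; in an exam they would be done by hand.

```python
from math import sqrt

def f(x):
    """Illustrative two-part p.d.f. (assumed for this example)."""
    if 0 <= x <= 1:
        return x / 2
    if 1 < x <= 2:
        return 3 / 4
    return 0.0

def integrate(g, a, b, steps=20_000):
    """Midpoint-rule approximation of the integral of g from a to b."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def F(x):
    """c.d.f. built part by part from f."""
    if x < 0:
        return 0.0
    if x <= 1:
        return x**2 / 4                  # integral of t/2 from 0 to x
    if x <= 2:
        return 1 / 4 + 3 * (x - 1) / 4   # F(1) plus integral of 3/4 from 1 to x
    return 1.0

# E(X) and E(X^2): integrate over each part separately, then add
mean = (integrate(lambda x: x * f(x), 0, 1)
        + integrate(lambda x: x * f(x), 1, 2))
mean_sq = (integrate(lambda x: x**2 * f(x), 0, 1)
           + integrate(lambda x: x**2 * f(x), 1, 2))

variance = mean_sq - mean**2   # subtract [E(X)]^2 once, after adding the parts
std_dev = sqrt(variance)       # SD is the square root of the variance

# Median: F(1) = 0.25 is below 0.5, so 0.5 is reached in the second part.
# Solve 1/4 + 3(x - 1)/4 = 0.5 for x.
median = 1 + (0.5 - 0.25) / 0.75

print(f"E(X) = {mean:.4f}, Var(X) = {variance:.4f}, SD(X) = {std_dev:.4f}")
print(f"median = {median:.4f}, F(median) = {F(median):.4f}")
```

Checking which part of F(x) contains 0.5 before solving avoids getting a median that lies outside the range of the part being used.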

Sampling

Key Definitions

* A statistical model is a statistical process devised to describe or make predictions about the expected behaviour of a real-world problem

* A population is a collection/ group/ set of individuals or items

* A sample is any subset of a population

* A sampling frame is a complete list or complete identification of the population (e.g. a list, index register, database, map or file)

* A sampling unit is an individual member of the population

* A...
