Normal Distribution
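The Normal distribution is important because of the Central Limit Theorem (CLT), which says that when you sum a large number of i.i.d. random variables (with finite variance), the distribution of the sum looks approximately Normal.
Normal PDF: $f(z) = c\,e^{-z^2/2}$, where $c$ is the normalizing constant.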

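Now we want to find c, the constant that makes the PDF integrate to 1, which requires evaluating $\int_{-\infty}^{\infty} e^{-z^2/2}\,dz$.
It has been proved that this integrand has no closed-form antiderivative. However, there is a trick that looks silly but is actually brilliant: multiply the integral by itself.
$$\int_{-\infty}^{\infty} e^{-x^2/2}\,dx \int_{-\infty}^{\infty} e^{-y^2/2}\,dy = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2+y^2)/2}\,dx\,dy$$
The $x^2 + y^2$ is reminiscent of the equation of a circle, so we switch to polar coordinates ($x^2 + y^2 = r^2$, $dx\,dy = r\,dr\,d\theta$):
$$\int_{0}^{2\pi}\int_{0}^{\infty} e^{-r^2/2}\,r\,dr\,d\theta = 2\pi$$
Thus $\int_{-\infty}^{\infty} e^{-z^2/2}\,dz = \sqrt{2\pi}$ and $c = 1/\sqrt{2\pi}$.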
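As a sanity check on the normalizing constant, the sketch below approximates the integral of $e^{-z^2/2}$ numerically and compares it with $\sqrt{2\pi}$. The midpoint rule, the truncation of the range to [-10, 10], and the step count are assumptions chosen for illustration, not part of the derivation.

```python
import math

def gaussian_integral(limit=10.0, steps=200_000):
    """Midpoint-rule approximation of the integral of exp(-z^2/2) over [-limit, limit]."""
    width = 2 * limit / steps
    total = 0.0
    for i in range(steps):
        z = -limit + (i + 0.5) * width  # midpoint of the i-th subinterval
        total += math.exp(-z * z / 2)
    return total * width

approx = gaussian_integral()
c = 1 / approx  # normalizing constant of the standard Normal PDF

print(approx, math.sqrt(2 * math.pi))
print(c, 1 / math.sqrt(2 * math.pi))
```

The two numbers on each printed line should agree to many decimal places, since the tails beyond |z| = 10 contribute a negligible amount.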
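Mean
$E(Z) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} z\,e^{-z^2/2}\,dz = 0$
By symmetry: if $g$ is an odd function, i.e. $g(-x) = -g(x)$, then $\int_{-a}^{a} g(x)\,dx = 0$.
Variance
$\mathrm{Var}(Z) = E(Z^2) - (E Z)^2 = E(Z^2) = 1$ (by integration by parts)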
Notation
$\Phi$ is the standard Normal CDF: $\Phi(z) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z} e^{-t^2/2}\,dt$
$\Phi(-z) = 1 - \Phi(z)$ by symmetry

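For $Z \sim N(0,1)$ with CDF $\Phi$: $E(Z) = 0$, $\mathrm{Var}(Z) = E(Z^2) = 1$, $E(Z^3) = 0$. All the odd moments of the standard Normal are zero (by symmetry). The even moments, however, are not as easy to calculate by integration.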

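Let $X = \mu + \sigma Z$, where $\mu$ is the mean and $\sigma > 0$ is the standard deviation. Then we say $X \sim N(\mu, \sigma^2)$.
Most statistics books write down the PDF first and then explain the mean and variance, but that is not intuitive.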
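Standardization
$Z = \frac{X - \mu}{\sigma} \sim N(0,1)$
Find the PDF of $X \sim N(\mu, \sigma^2)$
CDF: $P(X \le x) = P\left(\frac{X - \mu}{\sigma} \le \frac{x - \mu}{\sigma}\right) = \Phi\left(\frac{x - \mu}{\sigma}\right)$
The PDF is the derivative of the CDF (using the chain rule).
PDF: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\left(\frac{x - \mu}{\sigma}\right)^2/2}$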
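The claim that the PDF is the derivative of the CDF can be checked numerically. The sketch below writes the standard Normal CDF in terms of `math.erf`, standardizes to get the CDF of a general Normal, and compares a central-difference slope of the CDF with the closed-form PDF. The values of `mu`, `sigma`, and `x` are hypothetical, picked only for the demonstration.

```python
import math

def std_normal_cdf(z):
    """Standard Normal CDF, using the identity Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2), obtained by standardizing x."""
    return std_normal_cdf((x - mu) / sigma)

def normal_pdf(x, mu, sigma):
    """PDF of N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-z * z / 2.0) / (sigma * math.sqrt(2.0 * math.pi))

def numeric_derivative(f, x, h=1e-5):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

mu, sigma, x = 3.0, 2.0, 4.5  # hypothetical values for illustration
slope = numeric_derivative(lambda t: normal_cdf(t, mu, sigma), x)

print(slope, normal_pdf(x, mu, sigma))
```

The factor $1/\sigma$ in the PDF is exactly what the chain rule contributes when differentiating $\Phi((x - \mu)/\sigma)$.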

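Later we'll show that if $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$ are independent, then $X_1 + X_2 \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$.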

68-95-99.7% Rule
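Because $\Phi$ has no closed form and cannot be calculated by hand, someone created a rule of thumb. For $X \sim N(\mu, \sigma^2)$:
$P(|X - \mu| \le \sigma) \approx 0.68$
$P(|X - \mu| \le 2\sigma) \approx 0.95$
$P(|X - \mu| \le 3\sigma) \approx 0.997$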
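The rule of thumb can be reproduced with the error function, which most languages provide even though the Normal CDF has no closed form. A minimal sketch, using the identity $P(|Z| \le k) = \mathrm{erf}(k/\sqrt{2})$ for a standard Normal $Z$; the function name `prob_within` is made up for this example.

```python
import math

def prob_within(k):
    """P(|Z| <= k) for a standard Normal Z, i.e. the mass within k standard deviations."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```

Because standardization maps any $N(\mu, \sigma^2)$ to $N(0, 1)$, the same three numbers apply to every Normal distribution.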
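The properties of variance
$\mathrm{Var}(X + c) = \mathrm{Var}(X)$
If you shift X by a constant c, the mean also shifts by c, so the variance doesn't change.
$\mathrm{Var}(cX) = c^2\,\mathrm{Var}(X)$
Remember to square. This is easy to check: if c is negative and you don't square it, the resulting variance would be negative, which contradicts the definition of variance.
$\mathrm{Var}(X) \ge 0$, and $\mathrm{Var}(X) = 0$ iff $P(X = a) = 1$ for some constant $a$.
The variance is zero if and only if the random variable is a constant with probability 1.
$\mathrm{Var}(X + Y) \ne \mathrm{Var}(X) + \mathrm{Var}(Y)$ in general [equality holds if X, Y are independent].
To see why, let Y = X, which makes the two r.v.s extremely dependent. Then $\mathrm{Var}(X + Y) = \mathrm{Var}(2X) = 4\,\mathrm{Var}(X)$, which is not $2\,\mathrm{Var}(X)$.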
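These properties can be verified exactly on a small example. The sketch below assumes a hypothetical random variable that takes each value in `xs` with equal probability, and uses exact fractions so the comparisons are not affected by rounding.

```python
from fractions import Fraction

def var(values):
    """Variance of a random variable equally likely to take each listed value."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((Fraction(v) - mean) ** 2 for v in values) / n

xs = [1, 2, 4, 7]  # hypothetical support, each value with probability 1/4
c = -3             # a negative constant, to show why c must be squared

shifted = [x + c for x in xs]  # X + c
scaled = [c * x for x in xs]   # cX
doubled = [x + x for x in xs]  # X + Y with Y = X

print(var(xs), var(shifted))         # shifting leaves the variance unchanged
print(var(scaled), c**2 * var(xs))   # scaling multiplies it by c^2
print(var(doubled), 4 * var(xs))     # Var(X + X) is 4 Var(X), not 2 Var(X)
```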

...require that we know whether we have a sample or a population.
2. The following numbers represent the weights in pounds of six 7-year-old children in Mrs. Jones' 2nd grade class: {25, 60, 51, 47, 49, 45}. Find the mean, median, mode, range, quartiles, variance, and standard deviation.
Solution:
mean = 46.166...
median = 48
mode does not exist
range = 35
Q1 = 45
Q2 = median = 48
Q3 = 51
variance = 112.1389 (treating the six children as a population)
standard deviation = 10.59
3. If the variance is 846, what is the standard deviation?
Solution: standard deviation = square root of variance = sqrt(846) = 29.086
4. If we have the following data: 34, 38, 22, 21, 29, 37, 40, 41, 22, 20, 49, 47, 20, 31, 34, 66. Draw a stem-and-leaf plot. Discuss the shape of the distribution.
Solution:
2 | 219200
3 | 48714
4 | 0197
5 |
6 | 6
This distribution is right skewed (positively skewed) because the “tail” extends to the right.
5. What type of relationship is shown by this scatter plot?
[Scatter plot: vertical axis from 0 to 45, horizontal axis from 0 to 20]
Solution: Weak positive linear correlation
6. What values can r take in linear regression? Select 4 values in this interval and describe how they would be interpreted.
Solution: The values are between -1 and +1 inclusive.
-1 means perfect negative linear correlation
+1 means perfect positive linear correlation
0 means no linear correlation
0.5 means moderate positive correlation, etc.
7. Does correlation imply causation?
Solution: No.
8. What do we call the r value?
Solution: The correlation coefficient....
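The summary statistics for the six weights in problem 2 can be reproduced with Python's `statistics` module. This is a sketch under two assumptions: the quartiles are taken as the medians of the lower and upper halves of the sorted data (which matches the stated Q1 and Q3), and the six children are treated as a whole population for the variance and standard deviation.

```python
import statistics

weights = [25, 60, 51, 47, 49, 45]
data = sorted(weights)
half = len(data) // 2

mean = statistics.mean(data)
median = statistics.median(data)
q1 = statistics.median(data[:half])   # median of the lower half
q3 = statistics.median(data[-half:])  # median of the upper half
data_range = data[-1] - data[0]

pop_var = statistics.pvariance(data)  # divides by N (population)
samp_var = statistics.variance(data)  # divides by N - 1 (sample)
pop_sd = statistics.pstdev(data)

print(mean, median, q1, q3, data_range)
print(pop_var, samp_var, pop_sd)
```

The gap between `pop_var` and `samp_var` is why the choice between sample and population matters for a data set this small.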

...ORMAL(0, t) distribution.
A Wiener process with initial value $W_0 = x$ is gotten by adding x to a standard Wiener
process. As is customary in the land of Markov processes, the initial value x is indicated
(when appropriate) by putting a superscript x on the probability and expectation operators. The term independent increments means that for every choice of nonnegative real
numbers $0 \le s_1 < t_1 \le s_2 < t_2 \le \cdots \le s_n < t_n < \infty$, the increment random variables
$W_{t_1} - W_{s_1},\; W_{t_2} - W_{s_2},\; \ldots,\; W_{t_n} - W_{s_n}$
are jointly independent; the term stationary increments means that for any $0 < s, t < \infty$
the increment $W_{t+s} - W_s$ has the same distribution as $W_t - W_0 = W_t$.
In general, a stochastic process with stationary, independent increments is called a Lévy
process; more on these later. The Wiener process is the intersection of the class of Gaussian
processes with the Lévy processes.
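Stationary, independent Normal increments suggest a simple way to approximate a Wiener path on a time grid: start at x and add independent N(0, dt) increments. The sketch below is only a discretized illustration, not a construction of the process; the function name, the grid size, and the seed are assumptions made for the example.

```python
import math
import random

def wiener_path(x=0.0, t_end=1.0, steps=100, rng=random):
    """Approximate a Wiener path started at x on [0, t_end] using `steps` equal time steps."""
    dt = t_end / steps
    path = [x]
    for _ in range(steps):
        # Each increment is an independent draw from N(0, dt).
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path

rng = random.Random(0)  # fixed seed so the run is repeatable
endpoints = [wiener_path(rng=rng)[-1] for _ in range(2000)]

mean_w1 = sum(endpoints) / len(endpoints)
var_w1 = sum((w - mean_w1) ** 2 for w in endpoints) / len(endpoints)

print(mean_w1, var_w1)  # should be near 0 and near t_end = 1
```

With a standard start (x = 0), the simulated endpoint has mean close to 0 and variance close to t, consistent with a NORMAL(0, t) marginal.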
It should not be obvious that properties (1)–(4) in the definition of a standard Brownian
motion are mutually consistent, so it is not a priori clear that a standard Brownian motion
exists. (The main issue is to show that properties (3)–(4) do not preclude the possibility
of continuous paths.) That it does exist was first proved by N. Wiener in about 1920.
His proof was simplified by P. Lévy; we shall outline Lévy's construction in section ??
below. But notice that properties (3) and (4) are compatible. This follows from...

...•H.P.Gautam
The purpose of this article is not to explain further the usefulness of the normal distribution in decision-making, whether in the social sciences or the natural sciences, nor to discuss the theory of how it can be derived. Its only objective is to acquaint enthusiastic readers (especially students) with a simple iterative procedure for finding the numerical value of a normally distributed variable. The procedure is simple in the sense that even students from a non-mathematical background can easily use the technique discussed below to find the value, and it is not an exaggeration to say that after going through the article the reader can not only compute an individual value but also generate the whole table of such values. At most, a scientific calculator is required.
It should be borne in mind that a person using the normal distribution as an analytical tool need not be familiar with its computational aspects, since various computer packages and statistical tables are available on the market. However, he must know how the distribution helps him make the right decisions.
Though one needs to know how to use and interpret these values rather than how to compute them oneself, one will nevertheless have a deeper understanding of how normal...

...time is known to have a skewed-right distribution with a mean of 10 minutes and a standard deviation of 8 minutes. Suppose 100 flights have been randomly sampled. Describe the sampling distribution of the mean waiting time between when the airplane taxis away from the terminal until the flight takes off for these 100 flights.
a) Distribution is skewed-right with mean = 10 minutes and standard error = 0.8 minutes.
b) Distribution is skewed-right with mean = 10 minutes and standard error = 8 minutes.
c) Distribution is approximately normal with mean = 10 minutes and standard error = 0.8 minutes.
d) Distribution is approximately normal with mean = 10 minutes and standard error = 8 minutes.
ANSWER: c
2. Suppose the ages of students in Statistics 101 follow a skewed-right distribution with a mean of 23 years and a standard deviation of 3 years. If we randomly sampled 100 students, which of the following statements about the sampling distribution of the sample mean age is incorrect?
a) The mean of the sampling distribution is equal to 23 years.
b) The standard deviation of the sampling distribution is equal to 3 years.
c) The shape of the sampling distribution is approximately normal.
d) The standard error of the sampling distribution is equal to 0.3 years.
ANSWER: b
3....
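Both answers rest on the same calculation: the standard error of the sample mean is the population standard deviation divided by the square root of the sample size. A minimal sketch, with a helper name chosen for this example:

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)

print(standard_error(8, 100))  # waiting-time question: 0.8 minutes
print(standard_error(3, 100))  # student-age question: 0.3 years
```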

...SAMPLING DISTRIBUTIONS
6.1 POPULATION AND SAMPLING DISTRIBUTION
6.1.1 Population Distribution
Suppose there are only five students in an advanced statistics class and the midterm scores of these five students are:
70 78 80 80 95
Let x denote the score of a student.
• Mean for Population
Based on Example 1, the population mean is:
$\mu = \frac{\sum x}{N} = \frac{70 + 78 + 80 + 80 + 95}{5} = 80.60$
• Standard Deviation for Population
Based on Example 1, the population standard deviation is:
$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} = \sqrt{\frac{327.20}{5}} \approx 8.09$
6.1.2 Sampling Distribution
▪ Sample statistic such as median, mode, mean and standard deviation
6.1.2.1 The Sampling Distribution of the Sample Mean
Reconsider the population of midterm scores of five students given in Example 1. Let's say we draw all possible samples of three scores each (without replacement) and compute the mean of each.
Total number of samples = 5C3 = $\frac{5!}{3!\,(5-3)!} = 10$
Suppose we assign the letters A, B, C, D and E to scores of the five students, so that...
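The full sampling distribution of the sample mean for this small population can be listed by brute force. The sketch below assumes the letters A to E are assigned to the scores in the order they are listed (70, 78, 80, 80, 95); that mapping is an assumption, since the text is cut off before it is given.

```python
from itertools import combinations
from statistics import mean

# Assumed letter-to-score assignment, in the order the scores are listed.
scores = {"A": 70, "B": 78, "C": 80, "D": 80, "E": 95}

# Every possible sample of three students, drawn without replacement.
samples = list(combinations(scores, 3))
sample_means = {s: mean(scores[k] for k in s) for s in samples}

for s, m in sample_means.items():
    print("".join(s), round(m, 2))

print(len(samples))                       # number of samples, 5C3
print(mean(list(sample_means.values())))  # mean of the sample means
```

The mean of the ten sample means equals the population mean, which is the sense in which the sample mean is an unbiased estimator.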

...Business Statistics
Chapter 7
Sampling and Sampling Distributions
Learning Objectives
In this chapter, you learn:
The concept of the sampling distribution
To compute probabilities related to the sample mean and the sample proportion
The importance of the Central Limit Theorem
To distinguish between different survey sampling methods
To evaluate survey worthiness and survey errors
Reasons for Drawing a Sample
Selecting a sample is less time-consuming than selecting every item in the population (census).
Selecting a sample is less costly than selecting every item in the population.
An analysis of a sample is less cumbersome and more practical than an analysis of the entire population.
A Sampling Process Begins With A Sampling Frame
The sampling frame is a listing of items that make up the population
Frames are data sources such as population lists, directories, or maps
Inaccurate or biased results can result if a frame excludes certain portions of the population
Using different frames to generate data can lead to dissimilar conclusions
Types of Samples Used
Nonprobability Sample
Items included are chosen without regard to their probability of occurrence
Probability Sample
Items in the sample are chosen on the basis of known probabilities
Types of Samples Used (continued)
Samples
Non-Probability Samples: Judgement, Quota, Chunk, Convenience
Probability Samples: Simple Random...

...Probability and statistics - Karol Flisikowski
Sampling Distribution of x-bar
How does x-bar behave? To study the behavior, imagine taking many random samples of size n and computing an x-bar for each of the samples. Then we plot this set of x-bars with a histogram.
Central Limit Theorem
The key to the behavior of x-bar is the central limit theorem. It says: Suppose the population has mean μ and standard deviation σ. Then, if the sample size n is large enough, the distribution of the sample mean x-bar will have an approximately normal shape, the center will be the mean of the original population, μ, and the standard deviation of the x-bars will be σ divided by the square root of n.
Central Limit Theorem
If the CLT holds we have,
Normal shape
Center = μ
Spread = σ/√n
When Does CLT Hold?
The answer generally depends on the sample size, n, and the shape of the original distribution. General rule: the more skewed the population distribution of the data, the larger the sample size needed for the CLT to hold.
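The behavior described above can be seen in a small simulation. The sketch below assumes a skewed population (exponential with mean 1 and standard deviation 1), a sample size of 50, and 4000 repeated samples; all three choices, and the fixed seed, are illustrative assumptions rather than values from the slides.

```python
import math
import random
import statistics

def sample_means(n, reps, rng):
    """Draw `reps` samples of size n from an exponential(1) population; return their means."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(42)  # fixed seed so the run is repeatable
n, reps = 50, 4000
xbars = sample_means(n, reps, rng)

center = statistics.mean(xbars)
spread = statistics.pstdev(xbars)

print(center, 1.0)                 # center of the x-bars vs population mean
print(spread, 1.0 / math.sqrt(n))  # spread of the x-bars vs sigma / sqrt(n)
```

Plotting a histogram of `xbars` for several values of n would show the shape moving from right-skewed toward the normal bell shape as n grows.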
CLT
Previous overhead shows...

... standard deviation, variance, standard error of the mean, and confidence intervals. These statistics are used to summarize data and provide information about the sample from which the data were drawn and the accuracy with which the sample represents the population of interest. The mean, median, and mode are measurements of the “central tendency” of the data. The range, standard deviation, variance, standard error of the mean, and confidence intervals provide information about the “dispersion” or variability of the data about the measurements of central tendency.
MEASUREMENTS OF CENTRAL TENDENCY
The appropriateness of using the mean, median, or mode in data analysis is dependent upon the nature of the data set and its distribution (normal vs non-normal). The mean (denoted by x̄) is calculated by dividing the sum of the individual data points (where Σ equals “sum of”) by the number of observations (denoted by n). It is the arithmetic average of the observations and is used to describe the center of a data set.
mean = x̄ = Σx / n
One of the most basic purposes of statistics is simply to enable us to make sense of large numbers. For example, if you want to know how the students in your school are doing in the statewide achievement test, and somebody gives you a list of all 600 of their scores, that’s useless. This everyday problem is even more obvious and staggering when you’re dealing, let’s say, with the population data for the nation....