What are the characteristics of a population for which a mean, median, or mode would be appropriate? Inappropriate?

The analysis of data begins with descriptive statistics such as the mean, median, mode, range, standard deviation, variance, standard error of the mean, and confidence intervals. These statistics summarize data and provide information about the sample from which the data were drawn and the accuracy with which the sample represents the population of interest. The mean, median, and mode are measurements of the “central tendency” of the data. The range, standard deviation, variance, standard error of the mean, and confidence intervals provide information about the “dispersion” or variability of the data about the measurements of central tendency.

MEASUREMENTS OF CENTRAL TENDENCY

The appropriateness of using the mean, median, or mode in data analysis depends on the nature of the data set and its distribution (normal vs non-normal). The mean (denoted by x̄) is calculated by dividing the sum of the individual data points (Σx, where Σ means “sum of”) by the number of observations (denoted by n). It is the arithmetic average of the observations and is used to describe the center of a data set.

mean = x̄ = Σx / n

One of the most basic purposes of statistics is simply to enable us to make sense of large numbers. For example, if you want to know how the students in your school are doing in the statewide achievement test, and somebody gives you a list of all 600 of their scores, that’s useless. This everyday problem is even more obvious and staggering when you’re dealing, let’s say, with the population data for the nation. We’ve got to be able to consolidate and synthesize large numbers to reveal their collective characteristics and interrelationships, and transform them from an incomprehensible mass to a set of useful and enlightening indicators.

The Mean

One of the most useful and widely used techniques for doing this—one which you already know—is the...

...group of interest. The variance and the standard deviation require that we know whether we have a sample or a population.

2. The following numbers represent the weights in pounds of six 7-year-old children in Mrs. Jones' 2nd grade class: {25, 60, 51, 47, 49, 45}. Find the mean, median, mode, range, quartiles, variance, and standard deviation.
Solution:
mean = 46.166...
median = 48
mode does not exist
range = 35
Q1 = 45
Q2 = median = 48
Q3 = 51
variance = 112.139 (population variance, dividing by n = 6)
standard deviation = 10.59

3. If the variance is 846, what is the standard deviation?
Solution: standard deviation = square root of variance = sqrt(846) = 29.086

4. If we have the following data
34, 38, 22, 21, 29, 37, 40, 41, 22, 20, 49, 47, 20, 31, 34, 66
draw a stem-and-leaf plot. Discuss the shape of the distribution.
Solution:
2 | 2 1 9 2 0 0
3 | 4 8 7 1 4
4 | 0 1 9 7
5 |
6 | 6
This distribution is right skewed (positively skewed) because the “tail” extends to the right.

5. What type of relationship is shown by this scatter plot?
[Scatter plot: vertical axis from 0 to 45, horizontal axis from 0 to 20]
Solution: Weak positive linear correlation

6. What values can r take in linear regression? Select 4 values in this interval and describe how they would be interpreted.
Solution: the values are between -1 and +1 inclusive.
-1 means perfect negative linear correlation
+1 means perfect positive linear correlation
0 means no linear correlation
.5 means moderate positive correlation
etc.

7. Does correlation imply causation?
Solution: No....
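
The figures given in the solution to problem 2 can be checked with a few lines of Python. This is only a sketch under stated assumptions: the six children are treated as a whole population (so the variance divides by n), and the quartiles are taken as the medians of the lower and upper halves of the sorted data, which is one of several quartile conventions. The variable names are illustrative.

```python
from statistics import mean, median, pvariance, pstdev, variance

# Weights (pounds) from problem 2.
weights = [25, 60, 51, 47, 49, 45]
ordered = sorted(weights)

avg = mean(weights)                      # 277 / 6
mid = median(weights)                    # average of the two middle values
data_range = max(weights) - min(weights)

# Quartiles as medians of the lower and upper halves (assumed convention).
half = len(ordered) // 2
q1 = median(ordered[:half])
q3 = median(ordered[-half:])

# Population versions divide by n; the sample version divides by n - 1.
pop_var = pvariance(weights)
pop_sd = pstdev(weights)
sample_var = variance(weights)

print(avg, mid, data_range, q1, q3, pop_var, pop_sd, sample_var)
```

If the class were instead treated as a sample from a larger group, the sample variance (dividing by n - 1 = 5) would be the figure to report, and it is noticeably larger than the population variance.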

...hypothesis testing of two samples with two means. I chose a two-sample test of means because it compares two sets of data that are directly related to each other. I believed that rural homes have a lower average number of beds because rural areas are countryside rather than the larger, well-known towns of the state.
The population that my data set represents is the number of in-patient beds in each home, for both non-rural and rural facilities. The data were collected by the Department of Health and Social Services of the State of New Mexico and cover 60 licensed nursing facilities in New Mexico in 1988. The variables recorded were the number of beds in the home, annual medical in-patient days (hundreds), annual total patient days (hundreds), annual total patient care revenue ($hundreds), annual nursing salaries ($hundreds), annual facilities expenditures ($hundreds), and whether the home was located in a non-rural or rural area. The nursing home data set for New Mexico in 1988 was part of the data analyzed by Howard L. Smith, Niell F. Piland, and Nancy Fisher, published in the Journal of Rural Health in winter 1992. This data set can be calculated in four different types of...

...with μ = 110 grams and σ = 25 grams. A sample of 25 vitamins is to be selected. What is the probability that the sample mean will be less than 100 grams?
9)
The amount of pyridoxine (in grams) per multiple vitamin is normally distributed with μ = 110 grams and σ = 25 grams. A sample of 25 vitamins is to be selected. What is the probability that the sample mean will be greater than 100 grams?
10)
The amount of pyridoxine (in grams) per multiple vitamin is normally distributed with μ = 110 grams and σ = 25 grams. A sample of 25 vitamins is to be selected. So, 95% of all sample means will be greater than how many grams?
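
The three pyridoxine questions above all rest on the same fact: the mean of n independent draws from a normal population is itself normal, with mean μ and standard error σ/√n. A minimal sketch of that calculation, using Python's standard-library `statistics.NormalDist` (the variable names are illustrative, not from the source):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 110, 25, 25

# Standard error of the sample mean: sigma / sqrt(n) = 25 / 5 = 5.
standard_error = sigma / sqrt(n)
sample_mean_dist = NormalDist(mu, standard_error)

# P(sample mean < 100) and its complement P(sample mean > 100).
p_less = sample_mean_dist.cdf(100)
p_greater = 1 - p_less

# Value that 95% of sample means exceed: the 5th percentile.
cutoff_95 = sample_mean_dist.inv_cdf(0.05)

print(round(p_less, 4), round(p_greater, 4), round(cutoff_95, 2))
```

Here 100 grams lies two standard errors below the mean, so the first probability is the lower-tail area below z = -2.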
TABLE 7-1
Times spent studying by students in the week before final exams follow a normal distribution with standard deviation 8 hours. A random sample of 4 students was taken in order to estimate the mean study time for the population of all students.
11)
Referring to Table 7-1, what is the probability that the sample mean exceeds the population mean by more than 2 hours?
12)
Referring to Table 7-1, what is the probability that the sample mean is more than 3 hours below the population mean?
13)
Referring to Table 7-1, what is the probability that the sample mean differs from the population mean by less than 2 hours?
14)
Referring to Table 7-1, what is the probability that the sample mean differs from the population mean by more than 3 hours?...
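
The Table 7-1 questions do not need the population mean, because they only ask about the difference between the sample mean and the population mean. That difference is normal with mean 0 and standard error 8/√4 = 4 hours. A hedged sketch of the four calculations, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

sigma, n = 8, 4
standard_error = sigma / sqrt(n)  # 4 hours

# Distribution of (sample mean - population mean).
diff = NormalDist(0, standard_error)

p_exceeds_by_2 = 1 - diff.cdf(2)            # question 11
p_below_by_3 = diff.cdf(-3)                 # question 12
p_within_2 = diff.cdf(2) - diff.cdf(-2)     # question 13
p_beyond_3 = 2 * diff.cdf(-3)               # question 14, by symmetry

print(round(p_exceeds_by_2, 4), round(p_below_by_3, 4),
      round(p_within_2, 4), round(p_beyond_3, 4))
```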

...
PGEG371: Data Analysis & Geostatistics
Normal Distributions
Laboratory Exercise # 3
1st and 5th February, 2015
Read through this instruction sheet then answer the ‘pre-Lab’ quiz BEFORE starting the exercises!
1. Aim
The purpose of this laboratory exercise is to use a Normal Distribution to find information about a data population.
On successful completion of this exercise, you should be able to
Describe what a Normal Distribution is;
Describe what the histogram for a whole population looks like;
Find information about a population using the probability distribution function (PDF) and the cumulative distribution function (CDF);
Calculate the best estimate of the mean and standard deviation, and the confidence interval.
2. Introduction
In this lab section, you will study the PDF and CDF of a Normal Distribution. PDF is the Probability Distribution Function, and CDF is the Cumulative Distribution Function. A Normal Distribution is the simplest model that exists. Could you speculate why?
The Normal (Gaussian) distribution has very important characteristics. Some of them are: (1) it always has the same shape,
(2) the mean value is at the centre of the distribution (any thoughts about the median and mode?),...
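
As a small illustration of the two functions named above, the sketch below evaluates the PDF (the height of the bell curve) and the CDF (the area to the left of a point) for a standard normal distribution. It uses Python's standard-library `statistics.NormalDist`; the choice of mean 0 and standard deviation 1 is an assumption made only for the example, not part of the lab sheet.

```python
from statistics import NormalDist

# Standard normal: mean 0, standard deviation 1 (assumed for illustration).
dist = NormalDist(mu=0, sigma=1)

# PDF: height of the curve at a point. It peaks at the mean.
height_at_mean = dist.pdf(0)
height_at_one = dist.pdf(1)

# CDF: probability of a value at or below a point.
area_below_mean = dist.cdf(0)          # half the area lies below the mean
area_within_one_sd = dist.cdf(1) - dist.cdf(-1)

# For a normal distribution the mean, median, and mode coincide.
centre_values = (dist.mean, dist.median, dist.mode)

print(height_at_mean, height_at_one, area_below_mean,
      area_within_one_sd, centre_values)
```

The symmetry of the curve is what makes the CDF equal exactly one half at the mean.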

...interval scale
c. ratio scale
d. None of these alternatives is correct.
3. Statistical studies in which researchers control variables of interest are
a. experimental studies
b. control observational studies
c. non-experimental studies
d. observational studies
4. A statistics professor asked students in a class their ages. On the basis of this information, the professor states that the average age of all the students in the university is 24 years. This is an example of
a. a census
b. descriptive statistics
c. an experiment
d. statistical inference
5. Qualitative data can be graphically represented by using a(n)
a. histogram
b. frequency polygon
c. ogive
d. bar graph
6. Since the population size is always larger than the sample size, then the sample statistic
a. can never be larger than the population parameter
b. can never be equal to the population parameter
c. can be smaller, larger, or equal to the population parameter
d. can never be smaller than the population parameter
7. The value which has half of the observations above it and half the observations below it is called the
a. range
b. median
c. mean
d. mode
8. When data are positively skewed, the mean will usually be
a. greater than the median
b. smaller than the median
c. equal to the median
d. positive
9. The numerical value of the standard deviation can never...

...others owned 60 or fewer. The remaining student owned 65. The quartiles for the class were 30, 34 and 42 respectively.
Outliers are defined to be any values outside the limits of 1.5(Q3 – Q1) below the lower quartile or above the upper quartile.
On graph paper draw a box plot to represent these data, indicating clearly any outliers. (7) Jan 2001
2) The random variable X is normally distributed with mean 177.0 and standard deviation 6.4.
(a) Find P(166 < X < 185). (4)
It is suggested that X might be a suitable random variable to model the height, in cm, of adult males.
(b) Give two reasons why this is a sensible suggestion. (2)
(c) Explain briefly why mathematical models can help to improve our understanding of real-world problems. (2) Jan 2001
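
Part (a) of the question above is a standardisation exercise: convert 166 and 185 to z-values and take the difference of the two cumulative probabilities. A minimal sketch using Python's `statistics.NormalDist` follows; a printed normal table rounds z to two decimal places, so a table-based answer may differ slightly in the third decimal.

```python
from statistics import NormalDist

# X ~ N(177.0, 6.4^2), as stated in the question.
heights = NormalDist(mu=177.0, sigma=6.4)

# z-values for the two limits.
z_low = (166 - 177.0) / 6.4    # about -1.72
z_high = (185 - 177.0) / 6.4   # exactly 1.25

# P(166 < X < 185) = F(185) - F(166).
p_between = heights.cdf(185) - heights.cdf(166)

print(round(z_low, 2), round(z_high, 2), round(p_between, 3))
```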
3) A fair six-sided die is rolled. The random variable Y represents the score on the uppermost face.
(a) Write down the probability function of Y. (2)
(b) State the name of the distribution of Y. (1)
Find the value of
(c) E(6Y + 2), (d) Var(4Y – 2)....
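
Parts (c) and (d) rely on two standard rules for a linear transformation: E(aY + b) = aE(Y) + b and Var(aY + b) = a²Var(Y). The sketch below works them through for a fair die using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# Fair six-sided die: each face 1..6 has probability 1/6.
faces = range(1, 7)
p = Fraction(1, 6)

# E(Y) and Var(Y) from the definitions.
e_y = sum(y * p for y in faces)
var_y = sum((y - e_y) ** 2 * p for y in faces)

# E(aY + b) = a E(Y) + b; Var(aY + b) = a^2 Var(Y), the constant drops out.
e_6y_plus_2 = 6 * e_y + 2
var_4y_minus_2 = 4 ** 2 * var_y

print(e_y, var_y, e_6y_plus_2, var_4y_minus_2)
```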

...under a Standard Normal curve
a) to the right of z is 0.3632;
b) to the left of z is 0.1131;
c) between 0 and z, with z > 0, is 0.4838;
d) between -z and z, with z > 0, is 0.9500.
Ans: a) z = +0.35 (find 0.5 - 0.3632 = 0.1368 in the normal table)
b) z = -1.21 (find 0.5 - 0.1131 = 0.3869 in the normal table)
c) the area between 0 and z is 0.4838, so z = 2.14
d) the area to the right of +z = (1 - 0.95)/2 = 0.025, therefore z = 1.96
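
The table lookups in the answers above can be cross-checked with the inverse CDF, which returns the z-value for a given left-tail area. This is a sketch using `statistics.NormalDist` from the Python standard library; each area in the question is first converted to a left-tail area.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# a) area to the right of z is 0.3632 -> left-tail area is 1 - 0.3632
z_a = Z.inv_cdf(1 - 0.3632)

# b) area to the left of z is 0.1131
z_b = Z.inv_cdf(0.1131)

# c) area between 0 and z is 0.4838 -> left-tail area is 0.5 + 0.4838
z_c = Z.inv_cdf(0.5 + 0.4838)

# d) area between -z and z is 0.95 -> left-tail area is 0.5 + 0.95 / 2
z_d = Z.inv_cdf(0.5 + 0.95 / 2)

print(round(z_a, 2), round(z_b, 2), round(z_c, 2), round(z_d, 2))
```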
3. Given the Normally distributed variable X with mean 18 and standard deviation 2.5, find
a) P(X < 15);
b) the value of k such that P(X < k) = 0.2236;
c) the value of k such that P(X > k) = 0.1814;
d) P( 17 < X < 21).
Ans: X ~ N(18, 2.5²)
a) P(X < 15)
= P(Z < (15 - 18)/2.5) = P(Z < -1.2) = 0.1151 (4 decimal places)
b) P(X < k) = 0.2236
P(Z < (k - 18)/2.5) = 0.2236
From the normal table, a left-tail area of 0.2236 corresponds to z = -0.76
(k - 18)/2.5 = -0.76, so k = 16.1
c) P(X > k) = 0.1814
P(Z > (k - 18)/2.5) = 0.1814
From the normal table, a right-tail area of 0.1814 corresponds to z = 0.91
(k - 18)/2.5 = 0.91, so k = 20.275
d) P(17 < X < 21)
= P((17 - 18)/2.5 < Z < (21 - 18)/2.5)
= P(-0.4 < Z < 1.2) = 0.8849 - 0.3446 = 0.5403 (4 decimal places)...
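
The four answers to question 3 can be reproduced without a table by working directly with the distribution of X. The sketch below uses `statistics.NormalDist`; small differences from the table-based values come only from rounding z to two decimal places.

```python
from statistics import NormalDist

# X ~ N(18, 2.5^2)
X = NormalDist(mu=18, sigma=2.5)

# a) P(X < 15)
p_a = X.cdf(15)

# b) k with P(X < k) = 0.2236: the 0.2236 quantile
k_b = X.inv_cdf(0.2236)

# c) k with P(X > k) = 0.1814: the (1 - 0.1814) quantile
k_c = X.inv_cdf(1 - 0.1814)

# d) P(17 < X < 21)
p_d = X.cdf(21) - X.cdf(17)

print(round(p_a, 4), round(k_b, 2), round(k_c, 2), round(p_d, 4))
```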

...NORMAL DISTRIBUTION
1. Find the following probabilities; the random variable Z has the standard normal distribution:
a. P(0 < Z < 1.43)
b. P(0.11 < Z < 1.98)
c. P(-0.39 < Z < 1.22)
d. P(Z < 0.92)
e. P(Z > -1.78)
f. P(Z < -2.08)
2. Determine the areas under the standard normal curve between –z and +z:
♦ z = 0.5
♦ z = 2.0
Find the two values of z in the standard normal distribution so that:
P(-z < Z < +z) = 0.84
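
Question 2 above asks for central areas and, in reverse, for the z-value that encloses a given central area. One way to sketch both directions in Python is shown below; `central_area` is a hypothetical helper name introduced only for this example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

def central_area(z: float) -> float:
    """Area under the standard normal curve between -z and +z."""
    return Z.cdf(z) - Z.cdf(-z)

area_half = central_area(0.5)
area_two = central_area(2.0)

# Reverse problem: P(-z < Z < +z) = 0.84 leaves 0.08 in each tail,
# so +z is the 0.92 quantile and -z is its mirror image.
z_upper = Z.inv_cdf(0.5 + 0.84 / 2)
z_lower = -z_upper

print(round(area_half, 4), round(area_two, 4),
      round(z_lower, 3), round(z_upper, 3))
```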
3. At a university, the average height of 500 students of a course is 1.70 m; the standard
deviation is 0.05 m. Find the probability that the height of a randomly selected student is:
1. Below 1.75 m
2. Between 1.68 m and 1.78 m
3. Above 1.60 m
4. Below 1.65m
5. Above 1.8 m
4. Suppose that the IQ index follows the normal distribution with µ = 100 and standard
deviation σ = 16. Miss Chi has an IQ index of 120. Find the percentage of people who
have an IQ index below that of Miss Chi.
5. The length of steel beams made by the Smokers City Steel Company is normally
distributed with µ = 25.1 feet and σ = 0.25 feet.
a. What is the probability that a steel beam will be less than 24.8 feet long?
b. What is the probability that a steel beam will be more than 25.25 feet
long?
c. What is the probability that a steel beam will be between 24.9 and 25.7
feet long?
d. What is the probability that a steel beam will be between 24.6 and 24.9
feet long?
e....
