STA9708
Regression Analysis: Literacy rates and Poverty rates
As we are aware, poverty rates serve as an indicator for a number of issues in the world. Poverty rates are linked with infant mortality, education, child labor, crime, etc. In this project, I will apply the regression analysis learned in the Statistics course to study the relationship between literacy rates and poverty rates among different states in the USA. In my study, the poverty rates will be the independent variable (x) and the literacy rates will be the dependent variable (y). The purpose of this regression is to determine if there is a correlation between the poverty rates and literacy rates in different states within the USA. My null and alternate hypotheses are as follows:
Null hypothesis: Ho: β1 = 0. This hypothesis states that there is no correlation between the literacy and poverty rates.
Alternate hypothesis: Ha: β1 ≠ 0. This is the hypothesis we want to support: there is a correlation between the literacy rates and poverty rates.
The first step was to create a scatter plot of the data and compute the descriptive statistics. The scatter plot shows a positive correlation between the two variables, and the equation of the fitted line is y = 1.0998x + 2.2613 with an R-squared value of 0.5305. The scatter plot is shown below:
Figure 1: Scatter plot of relationship between poverty and literacy rates
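As a rough illustration of how the fitted line reported above can be used, the sketch below plugs a poverty rate into y = 1.0998x + 2.2613 and recovers the correlation coefficient from the R-squared value. The helper name `predict_literacy` and the example poverty rate of 12 are assumptions made for illustration; only the slope, intercept, and R-squared come from the text.

```python
import math

# Slope, intercept, and R-squared as reported for the fitted line in the text.
SLOPE = 1.0998
INTERCEPT = 2.2613
R_SQUARED = 0.5305


def predict_literacy(poverty_rate):
    """Return the literacy rate predicted by the fitted line (hypothetical helper)."""
    return SLOPE * poverty_rate + INTERCEPT


# In simple linear regression r has the sign of the slope, so with a
# positive slope r is the positive square root of R-squared.
r = math.sqrt(R_SQUARED)

if __name__ == "__main__":
    example_rate = 12.0  # illustrative poverty rate, not taken from the data
    print(f"predicted literacy rate: {predict_literacy(example_rate):.2f}")
    print(f"correlation coefficient r: {r:.3f}")
```

Because the underlying state-level data are not reproduced here, the sketch only evaluates the reported equation; it does not refit the regression.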

Based on the coefficient of determination of 0.53, we can say that about 53% of the variation in literacy rates across states is explained by the poverty rate. The Y-intercept represents the predicted literacy rate for a state with a poverty rate of zero. After the scatter plot, I calculated the descriptive statistics of the dependent variable (y), which is the literacy rate. The mean literacy rate is 14.76 and the standard error is 1.00. Shown below is the result from the descriptive statistics:
Literacy rates - descriptive stats

Mean| 14.76036172|
Standard Error| 1.002134285|
Median|...

...
Unit 5 – Regression Analysis
Mikeja R. Cherry
American InterContinental University
Abstract
In this brief, I will demonstrate selected perceptions of the company Nordstrom, Inc., a retailer that specializes in fashion apparel with over 12 million dollars in sales last year. I will research, review, and analyze perceptions of the company, create graphs to show qualitative and quantitative analysis, and provide a summary of my findings.
Introduction
Nordstrom, Inc., founded in 1901, is a retailer that specializes in fashion apparel for men, women, and kids. The company is headquartered in Seattle, Washington, and had over 61,000 employees worldwide as of February 2, 2013 (Business Wire, 2014).
Nordstrom, Inc. offers an online store, e-commerce, retail stores, mobile commerce, and catalogs to its consumers. It operates 117 full-line stores within the United States and 1 store in Canada, 167 Nordstrom Rack stores, 1 clearance store under the Last Chance banner, 1 philanthropic treasure & bond store called Trunk Club, and 2 Jeffrey boutiques. Online shopping is also available at www.nordstrom.com, along with an online private sale subsidiary, Hautelook. Its warehouses, also called fulfillment centers, are located in Cedar Rapids, Iowa, and manage the majority of its shipping needs (Business Source Premier, 2014).
Nordstrom, Inc. continues to make investments in their e-commerce...

...associated with a β1 change in Y.
(iii) The interpretation of the slope coefficient in the model ln(Yi ) = β0 + β1 ln(Xi ) + ui is as
follows:
(a) a 1% change in X is associated with a β1 % change in Y.
(b) a change in X by one unit is associated with a β1 change in Y.
(c) a change in X by one unit is associated with a 100β1 % change in Y.
(d) a 1% change in X is associated with a change in Y of 0.01β1 .
(iv) To decide whether Yi = β0 + β1 X + ui or ln(Yi ) = β0 + β1 X + ui fits the data better, you
cannot consult the regression R² because
(a) ln(Y) may be negative for 0 < Y < 1.
(b) the TSS are not measured in the same units between the two models.
(c) the slope no longer indicates the effect of a unit change of X on Y in the log-linear
model.
(d) the regression R² can be greater than one in the second model.
(v) The exponential function
(a) is the inverse of the natural logarithm function.
(b) does not play an important role in modeling nonlinear regression functions in econometrics.
(c) can be written as exp(e^x).
(d) is e^x, where e is 3.1415...
(vi) The following are properties of the logarithm function with the exception of
(a) ln(1/x) = −ln(x).
(b) ln(a + x) = ln(a) + ln(x).
(c) ln(ax) = ln(a) + ln(x).
(d) ln(x^a) = a ln(x).
(vii) In the log-log model, the slope coefficient indicates
(a) the effect that a unit change in X has on Y.
(b) the elasticity of Y with respect to X.
(c) ∆Y/∆X.
(d) (∆Y/∆X) × (Y/X).
(viii) In the...

...Regression Analysis Exercises
1- A farmer wanted to find the relationship between the amount of fertilizer used and the yield of corn. He selected seven acres of his land on which he used different amounts of fertilizer to grow corn. The following table gives the amount (in pounds) of fertilizer used and the yield (in bushels) of corn for each of the seven acres.
|Fertilizer Used |Yield of Corn |
|120 |138 |
|80 |112 |
|100 |129 |
|70 |96 |
|88 |119 |
|75 |104 |
|110 |134 |
a. With the amount of fertilizer used as an independent variable and yield of corn as a...
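For the fertilizer data in the table above, a least squares line with fertilizer as the independent variable and yield as the dependent variable could be computed along the following lines. This is a minimal sketch in plain Python; the function name `least_squares` and the variable names are illustrative choices, not part of the exercise.

```python
# Fertilizer used (pounds) and corn yield (bushels) for the seven acres in the table.
fertilizer = [120, 80, 100, 70, 88, 75, 110]
corn_yield = [138, 112, 129, 96, 119, 104, 134]


def least_squares(x, y):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross products and sum of squares about the means.
    ss_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = ss_xy / ss_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


a, b = least_squares(fertilizer, corn_yield)
print(f"yield = {a:.4f} + {b:.4f} * fertilizer")
```

The slope `b` estimates the change in yield (bushels) per additional pound of fertilizer within the range of the observed data.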

...REGRESSION ANALYSIS
Correlation only indicates the degree and direction of the relationship between two variables. It does not necessarily connote a cause-effect relationship. Even when there are grounds to believe that a causal relationship exists, correlation does not tell us which variable is the cause and which is the effect. For example, the demand for a commodity and its price will generally be found to be correlated, but the question of whether demand depends on price or vice versa will not be answered by correlation.
The dictionary meaning of ‘regression’ is the act of returning or going back. The term ‘regression’ was first used by Francis Galton in 1877 while studying the relationship between the heights of fathers and sons.
“Regression is the measure of the average relationship between two or more variables in terms of the original units of data.”
The line of regression is the line that gives the best estimate of the value of one variable for any specific value of the other variable.
For two variables in regression analysis, there are two regression lines. One line is the regression of x on y and the other is the regression of y on x.
These two regression lines show the average relationship between the two variables. The regression line of y on x gives the most probable...

...
Memorandum
Subject: Regression to the Mean with Coin Flips
This paper discusses the statistics project, Regression to the Mean with Coin Flips. The paper is divided into four parts, which are summarized below:
Part One: The Questionnaires
This section summarizes the results of questionnaires handed out to a random sample of 110 people. Pie charts are provided, which reflect the responses to each question.
Part Two: 200 Flips
This section discusses the outcome of flipping a normal coin two hundred times. Recorded observations are explained in detail.
Part Three: 200 Simulated Flips
This section discusses a computer-simulated experiment in which a coin is flipped two hundred times. The results are compared in the context of various statistical principles.
Part Four: Overview
This section compares the results of Parts 2 and 3 to the responses to the questionnaires. The section notes the number of people who answered the first two questions on the questionnaire incorrectly. The section also rationalizes whether there is an association and, if so, why such an association exists.
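The computer-simulated flips described for Part Three could be produced with a short script such as the one below. This is only a sketch: the memo does not say which software was used, so the function name `simulate_flips`, the use of Python's `random` module, and the seed value are all assumptions.

```python
import random


def simulate_flips(n_flips=200, seed=None):
    """Simulate n_flips fair coin flips and return a list of 'H'/'T' outcomes."""
    rng = random.Random(seed)  # separate generator so a seed gives repeatable runs
    return [rng.choice("HT") for _ in range(n_flips)]


flips = simulate_flips(200, seed=1)  # seed chosen arbitrarily for repeatability
heads = flips.count("H")
tails = len(flips) - heads
print(f"heads: {heads}, tails: {tails}, proportion of heads: {heads / len(flips):.3f}")
```

Rerunning with different seeds gives a feel for how much the proportion of heads varies from one set of 200 flips to the next.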
Part One: The Questionnaires
For this project, a questionnaire was handed out to a random sample of one hundred and ten people. The questionnaire included the following three questions:
If we flipped a normal coin ten times, and it came up heads seven of those times, would you expect it to come up...

...l
Regression Analysis
Basic Concepts & Methodology
1. Introduction
Regression analysis is by far the most popular technique in business and economics for
seeking to explain variations in some quantity in terms of variations in other quantities, or to
develop forecasts of the future based on data from the past. For example, suppose we are
interested in the monthly sales of retail outlets across the UK. An initial data analysis would
summarise the variability in terms of a mean and standard deviation, but the variation from
outlet to outlet could be very large for a variety of reasons. The size of the local market, the
size of the shop, the level of competition, the level of advertising, etc. would all influence the
sales volume from outlet to outlet. This is where regression analysis can be useful. A
regression analysis would seek to model the influence of these factors on the level of sales. In
statistical terms we would be seeking to regress the variation in sales (the dependent
variable) upon several explanatory variables such as advertising, size, etc.
From a forecasting point of view we can use regression analysis to develop predictions. If we
were asked to make a forecast for the monthly sales of a proposed new outlet in, say, Oxford,
we can simply compute the average outlet sales and put this...

...
Mortality Rates
Regression Analysis of Multiple Variables
Neil Bhatt
993569302
Sta 108 P. Burman
11 total pages
The question posed in this study is whether or not pollution has an impact on the mortality rate. The data come from 60 cities (n = 60). The response variable is Y = mortality rate per 100,000 population, and the explanatory variables are education, percent of the population that is nonwhite, percent of the population that is deemed poor, precipitation, the amount of sulfur dioxide, and the amount of nitrogen dioxide.
Data:
60 Standard Metropolitan Statistical Areas (SMSAs) in the United States, obtained for the years 1959-1961. [Source: GC McDonald and JS Ayers, “Some applications of the ‘Chernoff Faces’: a technique for graphically representing multivariate data”, in Graphical Representation of Multivariate Data, Academic Press, 1978.]
From the data, we can construct a matrix plot to check visually whether a correlation seems to exist before doing any calculations.
Data Distribution:
Scatter Plot Matrix
As one can observe, the data appear to cluster in a way that suggests a relationship between Y = mortality rate and the X variables that potentially influence Y.
From this we construct a correlation matrix in order to see a relationship in matrix form....

...Regression Analysis: A Complete Example
This section works out an example that includes all the topics we have discussed so far in this chapter.
A random sample of eight drivers insured with a company and having similar auto insurance policies was selected. The following table lists their driving experiences (in years) and monthly auto insurance premiums.
|Driving Experience (years) |Monthly Auto Insurance Premium ($) |
|5 |64 |
|2 |87 |
|12 |50 |
|9 |71 |
|15 |44 |
|6 |56 |
|25 |42 |
|16 |60 |
a. Does the insurance premium depend on the driving experience or does the driving experience depend on the insurance premium? Do you expect a positive or a negative relationship between these two variables?
b. Compute SSxx, SSyy, and SSxy.
c. Find the least squares regression line by choosing appropriate dependent and independent variables based on your answer in part a.
d. Interpret the meaning of the values of a and b calculated in part c.
e. Plot the scatter diagram and the regression line.
f. Calculate r and r² and explain what they mean.
g. Predict the monthly auto insurance premium for a driver with 10 years of driving experience.
h. Compute the standard deviation of errors.
i. Construct a 90% confidence interval for B.
j. Test at the 5% significance level whether B is negative.
k. Using α = .05, test whether ρ is different from zero.
Solution a. Based...
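The numerical parts of this exercise (b, c, f, g, and h) can be checked with a short script. The sketch below assumes that the premium is treated as the dependent variable and driving experience as the independent variable; the variable names are illustrative and not taken from the textbook.

```python
import math

experience = [5, 2, 12, 9, 15, 6, 25, 16]    # driving experience in years (x)
premium = [64, 87, 50, 71, 44, 56, 42, 60]   # monthly premium in dollars (y)

n = len(experience)
sum_x, sum_y = sum(experience), sum(premium)

# Part b: sums of squares and cross products (shortcut formulas).
ss_xx = sum(x * x for x in experience) - sum_x ** 2 / n
ss_yy = sum(y * y for y in premium) - sum_y ** 2 / n
ss_xy = sum(x * y for x, y in zip(experience, premium)) - sum_x * sum_y / n

# Part c: least squares line y-hat = a + b*x.
b = ss_xy / ss_xx
a = sum_y / n - b * sum_x / n

# Part f: linear correlation coefficient and coefficient of determination.
r = ss_xy / math.sqrt(ss_xx * ss_yy)
r_squared = r ** 2

# Part g: predicted premium for 10 years of driving experience.
predicted_at_10 = a + b * 10

# Part h: standard deviation of errors, with n - 2 degrees of freedom.
s_e = math.sqrt((ss_yy - b * ss_xy) / (n - 2))

print(f"SSxx={ss_xx}, SSyy={ss_yy}, SSxy={ss_xy}")
print(f"y-hat = {a:.4f} + ({b:.4f})x, r = {r:.2f}, r^2 = {r_squared:.2f}")
print(f"predicted premium at 10 years: {predicted_at_10:.2f}, s_e = {s_e:.4f}")
```

Parts i through k build on `b` and `s_e` but also need critical values from the t distribution, which are left to a table or a statistics library.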