Statistics, MAT 201, Module V-CA5
Alfred Basta
December 20, 2009
Statistics
ANOVA & Least Squares
Look at the data below for the income levels and prices paid for cars for ten people:
| Annual Income Level | Amount Spent on Car |
|38,000 |12,000 |
|40,000 |16,000 |
|117,000 |41,000 |
|17,000 |3,500 |
|23,000 |6,500 |
|79,000 |21,000 |
|33,000 |5,000 |
|66,000 |8,000 |
|15,000 |1,500 |
|52,000 |6,000 |

Answer the following questions:
A. What kind of correlation do you expect to find between annual income and amount spent on car? Will it be positive or negative? Will it be a strong relationship? Base your answer on your personal guess as well as by looking through the data.

Annual income and the amount of money spent on a car are positively correlated: in general, the greater the income, the larger the sum spent on a car. As a share of income, buyers in the low-to-middle income range spent the most, with percentages ranging from about 21% to 40%. The middle-to-high incomes spent a generally smaller share, between 12% and 35%, while the lowest income spent only 10% of income on a car purchase. On a scatter plot the points lie on both sides of an upward-sloping linear regression line. When consumers' incomes increase, the amount spent on cars also rises, which is a positive relationship. Therefore, as long as incomes continue to grow, spending on cars can be expected to keep trending upward.

B. What is the direction of causality in this relationship - i.e. does having a more expensive car...
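For question A, the direction and strength of the relationship can be checked numerically. The sketch below computes Pearson's correlation coefficient and the least squares line for the ten rows of the table; the function name `pearson_and_least_squares` is illustrative and not part of the assignment.

```python
from math import sqrt

# Data copied from the table above (annual income, amount spent on car), in dollars.
income = [38000, 40000, 117000, 17000, 23000, 79000, 33000, 66000, 15000, 52000]
spent = [12000, 16000, 41000, 3500, 6500, 21000, 5000, 8000, 1500, 6000]


def pearson_and_least_squares(x, y):
    """Return (r, a, b): Pearson's r and the line y_hat = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / sqrt(sxx * syy)   # strength and direction of the linear relationship
    b = sxy / sxx               # least squares slope
    a = mean_y - b * mean_x     # least squares intercept
    return r, a, b


r, a, b = pearson_and_least_squares(income, spent)
print(f"r = {r:.3f}, intercept a = {a:.0f}, slope b = {b:.3f}")
```

With these ten points the correlation comes out near 0.89, which is consistent with a strong positive relationship, and the slope suggests roughly 33 cents more spent on a car per extra dollar of income.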

...Linear regression deals with numerical measures that express the relationship between two variables. Relationships between variables can be strong or weak, and direct or inverse. One example is the amount McDonald’s spends on advertising per month and its total sales in that month. Similarly, the amount of study time one puts toward a statistics course can be compared with the grades received using the regression method. Formally, regression analysis produces an equation that allows one to estimate the value of one variable based on the value of another.
Key objectives in performing a regression analysis include estimating the dependent variable Y based on a selected value of the independent variable X. For example, Nike could measure how much it spends on celebrity endorsements and the effect this has on sales in a month. In that case, the amount spent on celebrity endorsements would be the independent variable X. Without the X variable, Y would be impossible to estimate. The general form of the regression equation is Y hat = a + bX, where Y hat is the estimated value of the Y variable for a selected X value. a represents the Y-intercept; that is, it is the estimated value of Y when X = 0. Furthermore, b is the slope of the line, or the average change in Y hat for each change of one unit in the...
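The excerpt does not say how a and b are obtained. Assuming the usual least squares criterion, which picks the line that minimizes the sum of squared vertical distances between the observed Y values and the line, the standard estimates can be written as:

```latex
\hat{Y} = a + bX, \qquad
b = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}, \qquad
a = \bar{Y} - b\,\bar{X}
```

Here $\bar{X}$ and $\bar{Y}$ are the sample means of the n observed pairs.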

...of 1000 flights and proportions of three routes in the sample. He divides them into different sub-groups such as satisfaction, refreshments and departure time, and then samples proportionally to highlight specific subgroups within the population. Mr Kwok used this sampling method because it can reduce the cost per observation in the survey and increase the accuracy achieved at a given cost.
TABLE 1: Data Summaries of Three Routes
| Statistic | Route 1 | Route 2 | Route 3 |
| Distribution | Normal(88.532, 5.07943) | Normal(97.1033, 5.04488) | Normal(107.15, 5.15367) |
| Mean | 88.532 | 97.103333 | 107.15 |
| Std Dev | 5.0794269 | 5.0448811 | 5.1536687 |
| Std Err Mean | 0.2271589 | 0.2912663 | 0.3644194 |
| Upper 95% Mean | 88.978306 | 97.676525 | 107.86862 |
| Lower 95% Mean | 88.085694 | 96.530142 | 106.43138 |
| N | 500 | 300 | 200 |
| Sum | 44266 | 29131 | 21430 |
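The standard errors and confidence limits in Table 1 can be reproduced from the mean, standard deviation and N of each route. The sketch below uses a normal critical value (about 1.96) as an assumption; the table's limits are slightly wider, presumably because the software used a t critical value. The helper name `mean_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

# (mean, standard deviation, number of flights) per route, copied from Table 1.
routes = {
    "Route 1": (88.532, 5.0794269, 500),
    "Route 2": (97.103333, 5.0448811, 300),
    "Route 3": (107.15, 5.1536687, 200),
}


def mean_interval(mean, std_dev, n, level=0.95):
    """Return (std_err, lower, upper) using a normal critical value (an assumption)."""
    std_err = std_dev / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return std_err, mean - z * std_err, mean + z * std_err


results = {name: mean_interval(*stats) for name, stats in routes.items()}
for name, (se, lower, upper) in results.items():
    print(f"{name}: std err = {se:.7f}, 95% CI = ({lower:.3f}, {upper:.3f})")

# Pooled mean passengers per flight across all 1000 sampled flights.
total_passengers = sum(mean * n for mean, _, n in routes.values())
total_flights = sum(n for _, _, n in routes.values())
overall_mean = total_passengers / total_flights
print(f"overall mean = {overall_mean:.3f}")
```

The pooled mean weights each route by its number of flights, so it is pulled toward route 1, which has the most flights.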
From the table above, the total number of passengers is 44,266 for route 1, 29,131 for route 2 and 21,430 for route 3, for a total of 94,827 passengers across the 3 routes.
Although route 1 has the highest number of passengers and flights, it has the lowest mean number of passengers per flight among the 3 routes. From...

...EXERCISE 27 SIMPLE LINEAR REGRESSION
STATISTICAL TECHNIQUE IN REVIEW
Linear regression provides a means to estimate or predict the value of a dependent variable based on the value of one or more independent variables. The regression equation is a mathematical expression of a causal proposition emerging from a theoretical framework. The linkage between the theoretical statement and the equation is made prior to data collection and analysis. Linear regression is a statistical method of estimating the expected value of one variable, y, given the value of another variable, x. The term simple linear regression refers to the use of one independent variable, x, to predict one dependent variable, y.
The regression line is usually plotted on a graph, with the horizontal axis representing x (the independent or predictor variable) and the vertical axis representing the y (the dependent or predicted variable) (see Figure 27-1). The value represented by the letter a is referred to as the y intercept or the point where the regression line crosses or intercepts the y-axis. At this point on the regression line, x = 0. The value represented by the letter b is referred to as the slope, or the coefficient of x. The slope determines the...
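The roles of a and b described above can be illustrated with a short sketch. The coefficient values are hypothetical, chosen only so the arithmetic is easy to follow, and `make_regression_line` is an illustrative helper rather than anything from the exercise.

```python
def make_regression_line(a, b):
    """Return a function that computes the predicted y = a + b * x."""
    def predict(x):
        return a + b * x
    return predict


# Hypothetical coefficients, chosen only for illustration.
a, b = 2.0, 0.5
predict = make_regression_line(a, b)

print(predict(0))                 # value at x = 0 is the y intercept a
print(predict(11) - predict(10))  # change in y per one-unit step in x is the slope b
```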

...
A. DETERMINE IF BLOOD FLOW CAN PREDICT ARTERIAL OXYGEN.
1. Always start with a scatter plot to see whether the relationship between y and x is linear. Next, perform residual analysis and test for violations of the assumptions. (Let y = arterial oxygen and x = blood flow.)
twoway (scatter y x) (lfit y x)
regress y x
rvpplot x
2. Since regression diagnostics failed, we transform our data.
Ratio transformation was used to generate the dependent variable and reciprocal transformation was used to generate the independent variable.
3. Check if the model is adequate by checking the t-statistic, R-squared and F-statistic.
The F-statistic shows that the fitted equation relating x and y is significant overall. The t-tests on the coefficients show that the constant in the equation is not significantly different from 0, while the transformed x significantly explains the dependent variable. The proportion of variability explained by the fitted values is high, at 96.23%. This means that the transformed blood flow data explain 96.23% of the variation in the transformed arterial oxygen data.
4. Check the normality of residuals and equal variances
predict r, resid
kdensity r, normal
pnorm r
qnorm r
rvpplot tx...
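The three adequacy measures named in step 3 can be computed by hand for a simple regression. The sketch below is in Python rather than Stata and uses hypothetical values standing in for the transformed variables, since the study data are not reproduced here; `simple_regression_summary` is an illustrative helper.

```python
from math import sqrt


def simple_regression_summary(x, y):
    """Fit y = a + b * x by least squares and return the fit statistics."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)        # unexplained variation
    sst = sum((yi - mean_y) ** 2 for yi in y)   # total variation
    r_squared = 1 - sse / sst
    mse = sse / (n - 2)                         # residual variance estimate
    t_slope = b / sqrt(mse / sxx)               # t-statistic for H0: slope = 0
    f_stat = (sst - sse) / mse                  # F-statistic with (1, n - 2) df
    return {"a": a, "b": b, "r_squared": r_squared,
            "t_slope": t_slope, "f_stat": f_stat}


# Hypothetical values standing in for the transformed variables (not the study data).
tx = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ty = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
summary = simple_regression_summary(tx, ty)
print(summary)
```

With a single predictor the F-statistic equals the square of the slope's t-statistic, so the two tests always agree in simple regression.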

...Economics 141 (Intro to Econometrics) Professor Yang
Spring 2001
Answers to Midterm Test No. 1
1. Consider a regression model relating Y (the dependent variable) to X (the independent
variable), $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where $\varepsilon_i$ is the stochastic or error term. Suppose that the
estimated regression equation is stated as $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ and $e_i$ is the residual error term.
A. What is $e_i$? Define it precisely. Explain how it is related to $\varepsilon_i$.
$e_i$ is the residual error term in the sample regression function and is defined as $e_i = Y_i - \hat{Y}_i$.
$e_i$ is the sample estimate of the error term $\varepsilon_i$ of the population regression function.
B. What is $\varepsilon_i$? Define it precisely. What are the four reasons for the inclusion of this error term in the population regression function (model)?
$\varepsilon_i$ is the stochastic term in the population regression function. The four reasons for its existence are:
1. Omitted variables
2. Measurement error
3. A different functional form
4. Purely random variation in human behavior
C. Draw a graph where you can clearly show $E(Y_i \mid X_i) = \beta_0 + \beta_1 X_i$ and $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$. Show
also in your graph $\varepsilon_6$ and $e_6$ for $X_6$. This graph will show the true and estimated
regression lines together with their respective error terms.
See Figure 2.1 on pages 18 (& 39) of the textbook for the graph.
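The distinction between the stochastic error term and the residual can also be seen in a small simulation. The sketch below assumes hypothetical population parameters and normally distributed errors, none of which come from the exam; in practice the true errors are never observed.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical population parameters, chosen only for illustration.
beta0, beta1 = 10.0, 2.0
x = [float(i) for i in range(1, 11)]
errors = [random.gauss(0, 3) for _ in x]   # stochastic terms, unobserved in practice
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, errors)]

# Least squares estimates of the intercept and slope.
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
b1_hat = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
b0_hat = mean_y - b1_hat * mean_x

# Residual: observed Y minus fitted Y, measured from the estimated line.
residuals = [yi - (b0_hat + b1_hat * xi) for xi, yi in zip(x, y)]

# Sixth observation: true error (from the true line) versus residual (from the fitted line).
print(f"error for X6 = {errors[5]:.3f}, residual e6 = {residuals[5]:.3f}")
print(f"sum of residuals = {sum(residuals):.2e}")
```

The residuals sum to zero (up to rounding) because the fitted line includes an intercept, while the true errors generally do not; each residual differs from its error by exactly the gap between the estimated and true lines at that X.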
D....

...Chapter 13
Linear Regression and Correlation
True/False
1. If a scatter diagram shows very little scatter about a straight line drawn through the plots, it indicates a rather weak correlation.
Answer: False Difficulty: Easy Goal: 1
2. A scatter diagram is a chart that portrays the correlation between a dependent variable and an independent variable.
Answer: True Difficulty: Easy Goal: 1 AACSB: AS
3. An economist is interested in predicting the unemployment rate based on gross domestic product. Since the economist is interested in predicting unemployment, the independent variable is gross domestic product.
Answer: True Difficulty: Medium Goal: 1 AACSB: REF
4. There are two variables in correlation analysis referred to as the dependent and determination variables.
Answer: False Difficulty: Easy Goal: 1
5. Correlation analysis is a group of statistical techniques used to measure the strength of the relationship (correlation) between two variables.
Answer: True Difficulty: Easy Goal: 2 AACSB: AS
6. The purpose of correlation analysis is to find how strong the relationship is between two variables.
Answer: True Difficulty: Easy Goal: 2
7. Originated by Karl Pearson about 1900, the coefficient of correlation describes the strength of the relationship between two, interval or...

...Linear-Regression Analysis
Introduction
Whitner Autoplex, located in Raytown, Missouri, is one of the AutoUSA dealerships. Whitner Autoplex includes Pontiac, GMC, and Buick franchises as well as a BMW store. Using data found on the AutoUSA website, Team D will use linear regression analysis to determine whether the purchase price of a vehicle purchased from Whitner Autoplex increases as the age of the consumer purchasing the vehicle increases. The data set provides information about the purchase price of 80 domestic and imported automobiles at Whitner Autoplex as well as the age of the consumers purchasing the vehicles. Team D selected the first 30 of the sampled domestic vehicles to use for this test. The business research question Team D will answer is: Does the purchase price of a vehicle increase as the age of the consumer increases? Team D will use a linear-regression analysis to test the relationship between the age of the consumers and the prices of the vehicles.
Five Step Hypothesis Testing
Team D will conduct the hypothesis test using the following five steps:
1. Formulate the hypothesis
2. State the decision rule
3. Calculate the Test Statistic
4. Make the decision
5. Interpret the results
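The five steps above can be sketched for a test of whether price rises with age. The version below is only an illustration under stated assumptions: it tests the correlation with a one-tailed t-test at the 0.05 level, takes the critical value 1.701 from a t table for 28 degrees of freedom, and uses a hypothetical correlation because the Whitner data are not reproduced here. The function name `correlation_t_test` is illustrative.

```python
from math import sqrt


def correlation_t_test(r, n, critical_t):
    """One-tailed test of H0: no positive linear relationship.

    r is the sample correlation, n the sample size, and critical_t the
    table value for n - 2 degrees of freedom at the chosen significance level.
    """
    # Step 3: calculate the test statistic.
    t_stat = r * sqrt(n - 2) / sqrt(1 - r ** 2)
    # Step 4: make the decision using the rule from step 2.
    reject_null = t_stat > critical_t
    return t_stat, reject_null


# Step 1: H0: price does not increase with age; H1: price increases with age.
# Step 2: reject H0 if t exceeds 1.701 (t table, 28 df, one-tailed 0.05 level; assumed).
n = 30    # sample size stated in the text
r = 0.5   # hypothetical correlation, NOT computed from the Whitner data
t_stat, reject_null = correlation_t_test(r, n, critical_t=1.701)

# Step 5: interpret the result.
print(f"t = {t_stat:.3f}; reject H0: {reject_null}")
```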
Step 1- Formulate the Hypothesis
Using the research question: Does the purchase price of an automobile purchased at Whitner Autoplex, increase as...