# Regression Analysis

• Published : February 10, 2013

Confidence intervals and prediction intervals from simple linear regression

The managers of an outdoor coffee stand in Coast City are examining the relationship between coffee sales and daily temperature. They have bivariate data detailing the stand's coffee sales (denoted by $y$, in dollars) and the maximum temperature (denoted by $x$, in degrees Fahrenheit) for each of [pic] randomly selected days during the past year. The least-squares regression equation computed from their data is [pic].

Tomorrow's forecast high is [pic] degrees Fahrenheit. The managers have used the regression equation to predict the stand's coffee sales for tomorrow. They are now interested in both a prediction interval for tomorrow's coffee sales and a confidence interval for the mean coffee sales on days on which the maximum temperature is [pic]. They have computed the following for their data:

• mean square error (MSE) [pic] [pic];

where $x_1, x_2, \ldots, x_n$ denote the maximum temperatures in the sample, and $\bar{x}$ denotes their mean.
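As a rough illustration of where such summary quantities come from, the sketch below computes the least-squares coefficients, the MSE, and the term $\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}$ that appears in the standard interval formulas for simple linear regression. The function name `regression_summary` and the five data points are made up for illustration; they are not the managers' data, which the text does not give.

```python
def regression_summary(xs, ys, x0):
    """Summary quantities for simple linear regression (illustrative sketch)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations of x, and the cross-deviation sum.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx            # least-squares slope
    b0 = y_bar - b1 * x_bar   # least-squares intercept
    # MSE: sum of squared residuals divided by n - 2 degrees of freedom.
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    mse = sse / (n - 2)
    # The term 1/n + (x0 - x_bar)^2 / sum((x_i - x_bar)^2).
    leverage_term = 1 / n + (x0 - x_bar) ** 2 / sxx
    return {"b0": b0, "b1": b1, "mse": mse, "leverage_term": leverage_term}


# Hypothetical data: maximum temperatures (deg F) and coffee sales (dollars).
temps = [50, 60, 70, 80, 90]
sales = [2100, 1900, 1800, 1500, 1400]
summary = regression_summary(temps, sales, x0=75)
print(summary)
```

For this made-up sample the fitted line has slope -18 and intercept 3000, so higher temperatures go with lower predicted sales.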

The least-squares regression equation can be used to predict the value of one variable (called the dependent variable, often denoted by $y$) based on a given value of the other variable (called the independent variable, often denoted by $x$). When we make such a prediction, it is useful to obtain a prediction interval for an individual value of $y$ given a value of $x$. For example, a [pic] prediction interval for an individual value of $y$, given that $x = x_0$, is an interval constructed by a method that will capture the actual value of $y$ (when $x = x_0$) about [pic] of the time. In addition, it can be useful to obtain a confidence interval for the mean of the distribution of $y$ given a value of $x$. For example, a [pic] confidence interval for the mean value of $y$, given that $x = x_0$, is an interval constructed by a method that will capture the mean of the distribution of $y$ (when $x = x_0$) about [pic] of the time.

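Given the least-squares regression equation $\hat{y} = b_0 + b_1 x$, the $100(1-\alpha)\%$ prediction interval for an individual value of $y$, given that $x = x_0$, is

Prediction interval:

$$(b_0 + b_1 x_0) \pm t_{\alpha/2} \cdot s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$

The $100(1-\alpha)\%$ confidence interval for the mean value of $y$, given that $x = x_0$, is

Confidence interval:

$$(b_0 + b_1 x_0) \pm t_{\alpha/2} \cdot s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$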

In each of these formulas, $n$ denotes the sample size, $s$ is the standard error of the estimate (the square root of the MSE), and $t_{\alpha/2}$ is the value that cuts off an area of $\alpha/2$ in the right tail of a t distribution with $n - 2$ degrees of freedom. Note that the only difference between the two formulas is that the prediction interval formula has a $1$ in the sum underneath the square root, while the confidence interval formula does not. Note also that the two intervals are centered at the same value, $b_0 + b_1 x_0$. This means that $b_0 + b_1 x_0$ is the best estimate of both an individual value of $y$ when $x$ equals $x_0$ and the mean value of $y$ when $x$ equals $x_0$.
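The critical value described above is usually read from a t table or obtained from statistical software. As a minimal sketch using only the Python standard library, it can also be approximated by integrating the t density numerically and bisecting on the right-tail area. The helper names (`t_pdf`, `t_upper_tail`, `t_critical`), the Simpson step count, and the search bracket $[0, 100]$ are assumptions chosen for illustration, not part of the original problem.

```python
import math


def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large df.
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # By symmetry P(T > 0) = 0.5, so subtract the area between 0 and t.
    return 0.5 - total * h / 3


def t_critical(tail_area, df):
    """Value cutting off `tail_area` in the right tail (bisection search)."""
    lo, hi = 0.0, 100.0  # assumed bracket; wide enough for typical cases
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > tail_area:
            lo = mid  # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


# Example: a 95% interval (alpha = 0.05) with a hypothetical n = 12, so df = 10.
print(round(t_critical(0.025, 10), 3))
```

In practice a library routine for the t quantile would normally be preferred; the sketch only makes the "area in the right tail" definition concrete.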

We can use these formulas to answer the questions given in the problem.
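Before working through the problem by hand, the two interval formulas can be sketched in code. The function below takes the regression coefficients, the given $x_0$, a critical value from a t table, the MSE, and the term $\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}$ as inputs. The function name `regression_intervals` and every number in the example call are hypothetical, chosen only so the arithmetic is easy to follow; they are not the values from the coffee stand problem.

```python
import math


def regression_intervals(b0, b1, x0, t_crit, mse, leverage_term):
    """Prediction and confidence intervals at x0 (illustrative sketch).

    leverage_term is 1/n + (x0 - x_bar)^2 / sum((x_i - x_bar)^2).
    """
    center = b0 + b1 * x0   # both intervals share this center
    s = math.sqrt(mse)      # standard error of the estimate
    # The prediction interval has the extra 1 under the square root.
    pred_margin = t_crit * s * math.sqrt(1 + leverage_term)
    conf_margin = t_crit * s * math.sqrt(leverage_term)
    return {
        "center": center,
        "prediction": (center - pred_margin, center + pred_margin),
        "confidence": (center - conf_margin, center + conf_margin),
    }


# Hypothetical inputs, not taken from the problem.
result = regression_intervals(
    b0=3000.0, b1=-18.0, x0=75.0, t_crit=2.0, mse=2500.0, leverage_term=0.21
)
print(result)
```

With these made-up inputs both intervals are centered at 1650, and the prediction interval (margin 110) is wider than the confidence interval (margin about 45.8), which reflects the extra variability of a single day's sales compared with the mean.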

1. The data for the [pic] days yielded the regression equation [pic], with $x$ denoting maximum temperature (in degrees Fahrenheit) and $\hat{y}$ denoting estimated coffee sales (in dollars). We're asked to construct a [pic] prediction interval for an individual value for coffee sales when the maximum temperature is [pic] degrees Fahrenheit. The first part of the prediction interval formula, $b_0 + b_1 x_0$, is just the value of $\hat{y}$ for the given maximum temperature value, $x_0$. With [pic], we have

[pic].

Thus, the prediction interval is centered at [pic], which is the best estimate for an individual value for coffee sales when the maximum temperature is [pic] degrees Fahrenheit.

The next part of the prediction interval formula, $t_{\alpha/2}$, is the value that cuts off an area of $\alpha/2$ in the right tail of a t distribution with $n - 2$ degrees of freedom. Because we're computing a [pic] prediction interval, we have that [pic]. Also, for the...