3.97

a.

> poverty=read.csv("C:/Users/Nikita/Downloads/R Excel Data Sets/Excel/poverty.csv")

> names(poverty)

[1] "Location"   "PovPct"     "Brth15to17" "Brth18to19" "ViolCrime"
[6] "TeenBrth"

> plot(poverty$PovPct,poverty$Brth15to17)

> abline(lm(poverty$Brth15to17~poverty$PovPct))

> cor(poverty$PovPct,poverty$Brth15to17)

[1] 0.7302931

[Scatterplot: poverty$Brth15to17 (vertical axis) versus poverty$PovPct (horizontal axis), with the fitted regression line.]

The plot shows that Brth15to17 increases as PovPct increases, so the two variables have a positive association. The correlation of 0.73 is positive, which confirms this.

A correlation of +1 or -1 means that the points fall exactly on a straight line. With a correlation of 0.73 and points that cluster around the fitted line, the linear relationship between these two variables is fairly strong.
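The statement about correlations of +1 and -1 can be checked directly. A minimal sketch in R, using made-up vectors (x, y_up and y_down are illustrative names and values, not part of the poverty data):

```r
# Illustrative data only: y_up and y_down are exact linear functions of x.
x <- 1:10
y_up <- 3 + 2 * x      # points lie exactly on an increasing line
y_down <- 3 - 2 * x    # points lie exactly on a decreasing line

r_up <- cor(x, y_up)
r_down <- cor(x, y_down)

# Exactly linear data give correlations of +1 and -1 (up to rounding error).
print(c(r_up = r_up, r_down = r_down))
```

Any scatter around the line pulls the correlation away from these extremes and towards 0.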

There do not appear to be any outliers in this data set: no point falls far from the overall pattern.

b.

> reg1=lm(poverty$Brth15to17~poverty$PovPct)

> reg1

Call:

lm(formula = poverty$Brth15to17 ~ poverty$PovPct)

Coefficients:

(Intercept)  poverty$PovPct
      4.267           1.373

Equation of the Regression line is:

y=b0+b1x

Brth15to17=4.267+1.373*PovPct

c.

Slope: 1.373

Brth15to17 increases by 1.373, on average, for each one-percentage-point increase in PovPct.

d.

Using the Regression equation:

PovPct=15%

Brth15to17=4.267+1.373*15

Predicted Brth15to17=24.862
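The same kind of prediction can be obtained from R with predict() instead of by hand. A minimal sketch, assuming a small made-up data frame called toy in place of poverty.csv (its values are illustrative only and will not reproduce 24.862). Fitting with the data= argument, rather than writing poverty$... inside the formula, is what lets predict() accept a new PovPct value:

```r
# Hypothetical stand-in for poverty.csv; the values are made up for illustration.
toy <- data.frame(PovPct = c(5, 10, 15, 20, 25),
                  Brth15to17 = c(12, 17, 26, 31, 39))

# Fit with data= so the slope coefficient is named "PovPct".
fit <- lm(Brth15to17 ~ PovPct, data = toy)

# Predicted response at PovPct = 15.
pred <- predict(fit, newdata = data.frame(PovPct = 15))

# The same value computed by hand as b0 + b1 * x.
by_hand <- unname(coef(fit)[1] + coef(fit)[2] * 15)

print(c(predict = unname(pred), by_hand = by_hand))
```

Both numbers agree, because predict() simply evaluates the fitted equation at the new x value.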

3.98

a.

> plot(oldf$Duration,oldf$TimeNext)

> abline(lm(oldf$TimeNext~oldf$Duration))

> cor(oldf$Duration,oldf$TimeNext)

[1] NA

[Scatterplot: oldf$TimeNext (vertical axis) versus oldf$Duration (horizontal axis), with the fitted regression line.]

There is a positive association between the two variables as they seem to be increasing together.

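The correlation is reported as NA, which in R usually means that at least one of the two variables contains missing values, so this output says nothing about the strength of the relationship. Recomputing with cor(oldf$Duration,oldf$TimeNext,use="complete.obs") would give the correlation for the complete cases.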
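An NA from cor() is usually caused by missing values, and lm() still fits because it drops incomplete rows by default. A minimal sketch of how this could be checked, using made-up vectors (duration and time_next are hypothetical stand-ins for oldf$Duration and oldf$TimeNext, and the assumption that the real file contains missing values is unverified):

```r
# Hypothetical data with one missing value; not taken from the real file.
duration <- c(1.8, 2.0, 3.5, 4.0, 4.5, NA)
time_next <- c(55, 58, 72, 80, 85, 90)

# Default behaviour: a single NA in either input makes the result NA.
r_default <- cor(duration, time_next)
print(r_default)

# Count the missing values in each variable to confirm the cause.
print(c(duration = sum(is.na(duration)), time_next = sum(is.na(time_next))))

# Use only the rows where both values are present.
r_complete <- cor(duration, time_next, use = "complete.obs")
print(r_complete)
```

With the real data, the value returned by the use="complete.obs" call is the one to interpret for direction and strength.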

A few points appear to be outliers because they fall away from the overall pattern. One example is the point with Duration=4 and TimeNext=100.

b.

> reg2=lm(oldf$TimeNext~oldf$Duration)

> reg2

Call:

lm(formula = oldf$TimeNext ~ oldf$Duration)

Coefficients:

(Intercept)  oldf$Duration
      34.98          10.66

Equation of the Regression line is:

y=b0+b1x

TimeNext=34.98+10.66*Duration

c.

Slope: 10.66

TimeNext increases by 10.66, on average, for each one-minute increase in Duration.

d.

Duration=4 minutes

Using the regression equation:

TimeNext=34.98+10.66*4

77.62

3.99

a.

> chol<-read.csv(...)

> plot(chol$TwoDay,chol$FourDay)

> abline(lm(chol$FourDay~chol$TwoDay))

[Scatterplot: chol$FourDay (vertical axis) versus chol$TwoDay (horizontal axis), with the fitted regression line.]

> cor(chol$FourDay,chol$TwoDay)

[1] NA

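The plot shows a positive association between the two variables. However, cor() returns NA, which in R usually means that at least one variable contains missing values, so the strength of the relationship cannot be read from this output; cor(chol$FourDay,chol$TwoDay,use="complete.obs") would use only the complete cases. There do not appear to be any outliers in this data set, because no point breaks from the overall pattern.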

b.

> reg3=lm(chol$FourDay~chol$TwoDay)

> reg3

Call:

lm(formula = chol$FourDay ~ chol$TwoDay)

Coefficients:

(Intercept)  chol$TwoDay
    62.3651       0.6627

Equation for Regression line: y=b0+b1x

FourDay=62.3651+0.6627*TwoDay

c.

Slope=0.6627

FourDay increases by 0.6627 units, on average, for each one-unit increase in TwoDay.

d.

TwoDay=200

FourDay=62.3651+0.6627*200

194.9051

TwoDay=250

FourDay=62.3651+0.6627*250

228.0401

TwoDay=300

FourDay=62.3651+0.6627*300

261.1751

e.

For every 50-unit increase in TwoDay cholesterol, the predicted FourDay cholesterol increases by about 33.13 units, which is 50 times the slope (50*0.6627=33.135).

228.0401 - 194.9051 = 261.1751 - 228.0401 = 33.135
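The arithmetic in parts d and e can be reproduced in a few lines of R. This sketch only uses the rounded coefficients reported by reg3 above; the variable names b0, b1, two_day and four_day are chosen here for illustration:

```r
# Rounded coefficients as printed for reg3.
b0 <- 62.3651
b1 <- 0.6627

# Predicted FourDay values at TwoDay = 200, 250 and 300.
two_day <- c(200, 250, 300)
four_day <- b0 + b1 * two_day
print(four_day)          # 194.9051, 228.0401, 261.1751

# Differences between consecutive predictions.
print(diff(four_day))    # 33.135 each time

# The same quantity obtained directly as 50 times the slope.
print(50 * b1)
```

Because the fitted relationship is a straight line, equal steps in TwoDay always give equal steps in the predicted FourDay, whatever the starting value.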

3.100

a.

> sat<-read.csv(...)

> plot(sat$PctTook,sat$Verbal)

> abline(lm(sat$Verbal~sat$PctTook))

> cor(sat$Verbal,sat$PctTook)

[1] -0.8949489

[Scatterplot: sat$Verbal (vertical axis) versus sat$PctTook (horizontal axis), with the fitted regression line.]

There is a negative association between the two variables as shown by the line on the plot and the sign of the correlation. It is a strong relationship as shown by the magnitude of the correlation.

b.

>...