Consider a random variable \(X\) that has a normal distribution with a mean of 5 and a variance of 9. Calculate \(P[X > 4]\).
1 - pnorm(4, mean = 5, sd = 3)
## [1] 0.6305587
pnorm(4, mean = 5, sd = 3, lower.tail = FALSE)
## [1] 0.6305587
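Note that pnorm() is parameterized by the standard deviation, so a variance of 9 enters as sd = 3. As a check by hand, standardizing gives the same value; here \(Z\) is a standard normal random variable and \(\Phi\) its CDF (notation introduced for this sketch, not used elsewhere in the exercises):
\[ P[X > 4] = P\left[Z > \frac{4 - 5}{3}\right] = 1 - \Phi\left(-\frac{1}{3}\right) = \Phi\left(\frac{1}{3}\right) \approx 0.6306 \]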
Consider the simple linear regression model
\[ Y = -3 + 2.5x + \epsilon \]
where
\[ \epsilon \sim N(0, \sigma^2 = 4). \]
What is the expected value of \(Y\) given that \(x = 5\)? That is, what is \(\text{E}[Y \mid X = 5]\)?
-3 + 2.5 * 5
## [1] 9.5
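The calculation above follows from the error term having mean zero, so only the deterministic part of the model contributes to the conditional mean:
\[ \text{E}[Y \mid X = 5] = -3 + 2.5(5) + \text{E}[\epsilon] = 9.5 + 0 = 9.5 \]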
Return to the simple linear regression model
\[ Y = -3 + 2.5x + \epsilon \]
where
\[ \epsilon \sim N(0, \sigma^2 = 4). \]
What is the standard deviation of \(Y\) when \(x = 10\)? That is, what is \(\text{SD}[Y \mid X = 10]\)?
sqrt(4)
## [1] 2
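The value of \(x\) does not enter the answer: given \(X = x\), the term \(-3 + 2.5x\) is a constant, so all of the variability in \(Y\) comes from \(\epsilon\):
\[ \text{Var}[Y \mid X = 10] = \text{Var}[-3 + 2.5(10) + \epsilon] = \text{Var}[\epsilon] = 4, \qquad \text{SD}[Y \mid X = 10] = \sqrt{4} = 2 \]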
For this exercise, use the built-in trees dataset in R. Fit a simple linear regression model with Girth as the response and Height as the predictor. What is the slope of the fitted regression line?
coef(lm(Girth ~ Height, data = trees))[2]
## Height
## 0.2557471
For this exercise, use the built-in trees dataset in R. Fit a simple linear regression model with Girth as the response and Height as the predictor. What is the value of \(R^2\) for this fitted SLR model?
summary(lm(Girth ~ Height, data = trees))$r.squared
## [1] 0.2696518
Consider the simple linear regression model
\[ Y = 10 + 5x + \epsilon \]
where
\[ \epsilon \sim N(0, \sigma^2 = 16). \]
Calculate the probability that \(Y\) is less than 6 given that \(x = 0\).
\[ Y \mid X = 0 \sim N(\mu = 10, \sigma^2 = 16) \]
x = 0
mu = 10 + 5 * x
sigma = 4
pnorm(6, mean = mu, sd = sigma)
## [1] 0.1586553
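As a sanity check on the pnorm() result, the probability can also be approximated by simulating from the model. This is only a sketch: the seed, the number of simulations, and the names n_sims, epsilon, y, and p_hat are choices made here, not part of the exercise. The simulated proportion should land close to 0.1587.

```r
# Simulation check of P[Y < 6 | X = 0]; seed and sample size are arbitrary
set.seed(42)
n_sims = 100000
x = 0
# draw errors with sd = sqrt(16) = 4, then build Y from the model
epsilon = rnorm(n_sims, mean = 0, sd = 4)
y = 10 + 5 * x + epsilon
# proportion of simulated Y values below 6
p_hat = mean(y < 6)
p_hat
```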
Consider the simple linear regression model
\[ Y = 6 + 3x + \epsilon \]
where
\[ \epsilon \sim N(0, \sigma^2 = 9). \]
Calculate the probability that \(Y\) is greater than 1.5 given that \(x = -1\).
\[ Y \mid X = -1 \sim N(\mu = 3, \sigma^2 = 9) \]
x = -1
mu = 6 + 3 * x
sigma = 3
pnorm(1.5, mean = mu, sd = sigma, lower.tail = FALSE)
## [1] 0.6914625
Consider the simple linear regression model
\[ Y = 2 - 4x + \epsilon \]
where
\[ \epsilon \sim N(0, \sigma^2 = 25). \]
Calculate the probability that \(Y\) is greater than 1.5 given that \(x = 3\).
\[ Y \mid X = 3 \sim N(\mu = -10, \sigma^2 = 25) \]
x = 3
mu = 2 - 4 * x
sigma = 5
pnorm(1.5, mean = mu, sd = sigma, lower.tail = FALSE)
## [1] 0.01072411
For Exercises 9 - 15, use the faithful dataset, which is built into R.
Suppose we would like to predict the duration of an eruption of the Old Faithful geyser in Yellowstone National Park based on the waiting time before an eruption. Fit a simple linear model in R that accomplishes this task.
What is the estimate of the intercept parameter?
faithful_model = lm(eruptions ~ waiting, data = faithful)
coef(faithful_model)[1]
## (Intercept)
## -1.874016
What is the estimate of the slope parameter?
faithful_model = lm(eruptions ~ waiting, data = faithful)
coef(faithful_model)[2]
## waiting
## 0.07562795
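The estimates reported by coef() can also be reproduced from the least squares formulas, which may help show where they come from. In this sketch the names Sxy, Sxx, beta_0_hat, and beta_1_hat are chosen here for illustration; the result should match the intercept and slope above.

```r
# Least squares estimates computed directly from the data
x = faithful$waiting
y = faithful$eruptions
# sums of cross-products and squares about the means
Sxy = sum((x - mean(x)) * (y - mean(y)))
Sxx = sum((x - mean(x)) ^ 2)
# slope, then intercept
beta_1_hat = Sxy / Sxx
beta_0_hat = mean(y) - beta_1_hat * mean(x)
c(beta_0_hat, beta_1_hat)
```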
Use the fitted model to estimate the mean duration of eruptions when the waiting time is 78 minutes.
faithful_model = lm(eruptions ~ waiting, data = faithful)
predict(faithful_model, data.frame(waiting = 78))
## 1
## 4.024964
Use the fitted model to estimate the mean duration of eruptions when the waiting time is 122 minutes.
faithful_model = lm(eruptions ~ waiting, data = faithful)
predict(faithful_model, data.frame(waiting = 122))
## 1
## 7.352594
Consider making predictions of eruption duration for waiting times of 80 and 120 minutes. Which prediction is more reliable?
range(faithful$waiting)
## [1] 43 96
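The range above suggests the reasoning: 80 lies inside the observed waiting times (43 to 96 minutes), while 120 lies outside them, so a prediction at 120 is an extrapolation and should be trusted less. A small sketch of that check, with the names new_waiting, waiting_range, and in_range introduced here for illustration:

```r
# Which proposed waiting times fall inside the observed data range?
new_waiting = c(80, 120)
waiting_range = range(faithful$waiting)
in_range = new_waiting >= waiting_range[1] & new_waiting <= waiting_range[2]
in_range
```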
Calculate the RSS for the fitted model.
faithful_model = lm(eruptions ~ waiting, data = faithful)
sum(resid(faithful_model) ^ 2)
## [1] 66.56178
What proportion of the variation in eruption duration is explained by the linear relationship with waiting time?
faithful_model = lm(eruptions ~ waiting, data = faithful)
summary(faithful_model)$r.squared
## [1] 0.8114608
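The same proportion can be recovered from the RSS and the total sum of squares, which ties this exercise to the previous one. This is a sketch; the names rss, sst, and r_squared are introduced here, and the result should agree with the value reported by summary().

```r
# R^2 as one minus the ratio of residual to total variation
faithful_model = lm(eruptions ~ waiting, data = faithful)
rss = sum(resid(faithful_model) ^ 2)
sst = sum((faithful$eruptions - mean(faithful$eruptions)) ^ 2)
r_squared = 1 - rss / sst
r_squared
```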
For this exercise, use the built-in trees dataset in R.
Fit a simple linear regression model with Girth as the response and Height as the predictor. Use this fitted model to give an estimate for the mean Girth of trees that are 81 feet tall.
tree_model = lm(Girth ~ Height, data = trees)
predict(tree_model, data.frame(Height = 81))
## 1
## 14.52712
Suppose both Least Squares and Maximum Likelihood are used to fit a simple linear regression model to the same data. The estimates for the slope and the intercept will be:
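Under the usual SLR assumption of independent normal errors with constant variance (an assumption stated here, since the exercise does not spell it out), the log-likelihood is
\[ \log L(\beta_0, \beta_1, \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i = 1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2 \]
For any fixed \(\sigma^2\), maximizing this over \(\beta_0\) and \(\beta_1\) amounts to minimizing the sum of squares, which is the least squares criterion, so the two methods give the same slope and intercept estimates.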
Consider the fitted regression model:
\[ \hat{y} = -1.5 + 2.3x \]
Indicate all of the following that must be true:
Indicate all of the following that are true:
Suppose you fit a simple linear regression model and obtain \(\hat{\beta}_1 = 0\). Does this mean that there is no relationship between the response and the predictor?
A simple linear model will only detect a linear relationship between two variables.
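A small made-up example may make this concrete. In the sketch below the data are hypothetical: y is an exact function of x, but the relationship is a symmetric quadratic, so the fitted slope comes out as zero (up to floating point error) even though the two variables are strongly related.

```r
# Hypothetical data: y depends on x exactly, but through a symmetric quadratic
x = -5:5
y = x ^ 2
quad_model = lm(y ~ x)
# the fitted slope is zero up to floating point error
slope = unname(coef(quad_model)[2])
slope
```

Plotting y against x for these data would show the curved pattern that the fitted line misses.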