---
title: 'STAT 3202: Practice 09'
author: "Autumn 2018, OSU"
date: ''
output:
  html_document:
    theme: simplex
  pdf_document: default
urlcolor: BrickRed
---
***
## Exercise 1
Consider a random variable $X$ that has a $t$ distribution with $7$ degrees of freedom. Calculate $P[X > 1.3]$.
### Solution
```{r}
1 - pt(1.3, df = 7)
pt(1.3, df = 7, lower.tail = FALSE)
```
***
## Exercise 2
Consider a random variable $Y$ that has a $t$ distribution with $9$ degrees of freedom. Find $c$ such that $P[Y > c] = 0.025$.
### Solution
```{r}
qt(1 - 0.025, df = 9)
qt(0.025, df = 9, lower.tail = FALSE)
```
***
## Exercise 3
For this Exercise, use the built-in `trees` dataset in `R`. Fit a simple linear regression model with `Girth` as the response and `Height` as the predictor. What is the p-value for testing $H_0: \beta_1 = 0$ vs $H_1: \beta_1 \neq 0$?
### Solution
```{r}
tree_model = lm(Girth ~ Height, data = trees)
summary(tree_model)$coefficients["Height", "Pr(>|t|)"]
```
***
## Exercise 4
Continue using the SLR model you fit in Exercise 3. What is the length of a 90% confidence interval for $\beta_1$?
### Solution
```{r}
tree_model = lm(Girth ~ Height, data = trees)
ci_beta_1 = confint(tree_model, parm = "Height", level = 0.90)
ci_beta_1[2] - ci_beta_1[1]
```
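As a check, the same length can be computed by hand. A confidence interval for $\beta_1$ has the form $\hat{\beta}_1 \pm t_{\alpha/2, n - 2} \cdot \text{SE}[\hat{\beta}_1]$, so its length is twice the margin. The following is a minimal sketch; the names `se_beta_1` and `crit` are introduced here only for illustration.

```{r}
tree_model = lm(Girth ~ Height, data = trees)
# standard error of the slope estimate, taken from the coefficient table
se_beta_1 = summary(tree_model)$coefficients["Height", "Std. Error"]
# critical value for a 90% interval with n - 2 degrees of freedom
crit = qt(0.95, df = nrow(trees) - 2)
# the interval length is twice the margin of error
2 * crit * se_beta_1
```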
***
## Exercise 5
Continue using the SLR model you fit in Exercise 3. Calculate a 95% confidence interval for the mean tree girth of a tree that is 79 feet tall. Report the upper bound of this interval.
### Solution
```{r}
tree_model = lm(Girth ~ Height, data = trees)
predict(tree_model, newdata = data.frame(Height = 79), interval = "confidence")[, "upr"]
```
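The upper bound can also be built from the formula $\hat{y}(x) + t_{\alpha/2, n - 2} \cdot s_e \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{S_{xx}}}$. The sketch below should agree with `predict()`; the helper names (`x_new`, `s_e`, `se_mean`, and so on) are assumptions made for this illustration.

```{r}
tree_model = lm(Girth ~ Height, data = trees)
x_new = 79
n = nrow(trees)
x_bar = mean(trees$Height)
Sxx = sum((trees$Height - x_bar) ^ 2)
# residual standard error, the estimate of sigma
s_e = summary(tree_model)$sigma
# point estimate of the mean response at x_new
y_hat = sum(coef(tree_model) * c(1, x_new))
# standard error of the estimated mean response at x_new
se_mean = s_e * sqrt(1 / n + (x_new - x_bar) ^ 2 / Sxx)
# upper bound of the 95% confidence interval
y_hat + qt(0.975, df = n - 2) * se_mean
```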
***
## Exercise 6
Consider a random variable $X$ that has a $t$ distribution with $5$ degrees of freedom. Calculate $P[|X| > 2.1]$.
### Solution
```{r}
pt(-2.1, df = 5) + pt(2.1, df = 5, lower.tail = FALSE)
2 * pt(2.1, df = 5, lower.tail = FALSE)
```
***
## Exercise 7
Calculate the critical value used for a 90% confidence interval about the slope parameter of a simple linear regression model that is fit to 10 observations. (Your answer should be a positive value.)
### Solution
```{r}
conf_level = 0.90
sig_level = 1 - conf_level
n = 10
abs(qt(sig_level / 2, df = n - 2))
```
***
## Exercise 8
Consider the true simple linear regression model
$$
Y_i = 5 + 4 x_i + \epsilon_i \qquad \epsilon_i \sim N(0, \sigma^2 = 4) \qquad i = 1, 2, \ldots, 20
$$
Given $S_{xx} = 1.5$, calculate the probability of observing data according to this model, fitting the SLR model, and obtaining an estimate of the slope parameter greater than 4.2. In other words, calculate
$$
P[\hat{\beta}_1 > 4.2]
$$
### Solution
```{r}
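Sxx = 1.5
beta_1 = 4
sigma = 2
# the slope estimate is unbiased, so its expected value is the true slope
e_beta_1_hat = beta_1
# standard deviation of the slope estimate: sqrt(sigma^2 / Sxx)
sd_beta_1_hat = sqrt(sigma ^ 2 / Sxx)
pnorm(4.2, mean = e_beta_1_hat, sd = sd_beta_1_hat, lower.tail = FALSE)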
```
This calculation uses the sampling distribution of the slope estimate:
$$
\hat{\beta}_1 \sim N\left( \beta_1, \frac{\sigma^2}{S_{xx}} \right)
$$
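A simulation can be used to sanity-check this probability. The sketch below assumes a particular set of predictor values, since the exercise only specifies $S_{xx}$; the values in `x` are hypothetical and are rescaled so that $S_{xx} = 1.5$. The simulated proportion should be close to, but not exactly equal to, the normal probability.

```{r}
set.seed(42)
n = 20
# hypothetical predictor values, rescaled so that Sxx is 1.5
x_raw = seq(0, 1, length.out = n)
x = x_raw * sqrt(1.5 / sum((x_raw - mean(x_raw)) ^ 2))
Sxx = sum((x - mean(x)) ^ 2)
# simulate the slope estimate many times under the true model
beta_1_hats = replicate(10000, {
  y = 5 + 4 * x + rnorm(n, mean = 0, sd = 2)
  sum((x - mean(x)) * y) / Sxx
})
# proportion of simulated estimates above 4.2, followed by the exact probability
mean(beta_1_hats > 4.2)
pnorm(4.2, mean = 4, sd = sqrt(4 / 1.5), lower.tail = FALSE)
```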
***
## Exercise 9
For Exercises 9 - 13, use the `faithful` dataset, which is built into `R`.
Suppose we would like to predict the duration of an eruption of [the Old Faithful geyser](http://www.yellowstonepark.com/about-old-faithful/) in [Yellowstone National Park](https://en.wikipedia.org/wiki/Yellowstone_National_Park) based on the waiting time before an eruption. Fit a simple linear model in `R` that accomplishes this task.
What is the value of $\text{SE}[\hat{\beta}_1]$?
### Solution
```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
summary(faithful_model)$coefficients["waiting", "Std. Error"]
```
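The same value follows from $\text{SE}[\hat{\beta}_1] = s_e / \sqrt{S_{xx}}$, where $s_e$ is the residual standard error. A minimal sketch, with `s_e` and `Sxx` as names chosen for this illustration:

```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
# residual standard error, the estimate of sigma
s_e = summary(faithful_model)$sigma
# sum of squared deviations of the predictor
Sxx = sum((faithful$waiting - mean(faithful$waiting)) ^ 2)
s_e / sqrt(Sxx)
```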
***
## Exercise 10
What is the value of the test statistic for testing $H_0: \beta_0 = 0$ vs $H_1: \beta_0 \neq 0$?
### Solution
```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
summary(faithful_model)$coefficients["(Intercept)", "t value"]
```
***
## Exercise 11
What is the value of the test statistic for testing $H_0: \beta_1 = 0$ vs $H_1: \beta_1 \neq 0$?
### Solution
```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
summary(faithful_model)$coefficients["waiting", "t value"]
```
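The test statistic is the estimate minus the hypothesized value, divided by the standard error. The sketch below rebuilds it from the other columns of the coefficient table; `coef_table` and `t_stat` are names assumed here for illustration.

```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
coef_table = summary(faithful_model)$coefficients
# test statistic: (estimate - hypothesized value) / standard error
t_stat = (coef_table["waiting", "Estimate"] - 0) / coef_table["waiting", "Std. Error"]
t_stat
```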
***
## Exercise 12
Test $H_0: \beta_1 = 0$ vs $H_1: \beta_1 \neq 0$ with $\alpha = 0.01$. What decision do you make?
- Fail to reject $H_0$
- Reject $H_0$
- Reject $H_1$
- Not enough information
### Solution
```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
summary(faithful_model)$coefficients["waiting", "Pr(>|t|)"]
```
- Fail to reject $H_0$
- **Reject $H_0$**
- Reject $H_1$
- Not enough information
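
The decision comes from comparing the p-value to $\alpha$: we reject $H_0$ when the p-value is smaller than $\alpha$. One way to make that comparison explicit (the names `alpha` and `p_value` are assumptions for this sketch):

```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
alpha = 0.01
p_value = summary(faithful_model)$coefficients["waiting", "Pr(>|t|)"]
# TRUE means the p-value is below alpha, so we reject H_0
p_value < alpha
```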
***
## Exercise 13
Calculate a 90% confidence interval for $\beta_0$. Report the upper bound of this interval.
### Solution
```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
confint(faithful_model, parm = "(Intercept)", level = 0.90)[, 2]
```
***
## Exercise 14
For this Exercise, use the `Orange` dataset, which is built into `R`.
Use a simple linear regression model to create a 90% confidence interval for the change in mean circumference of orange trees in millimeters when age is increased by 1 day. Report the lower bound of this interval.
### Solution
```{r}
orange_model = lm(circumference ~ age, data = Orange)
confint(orange_model, parm = "age", level = 0.90)[, 1]
```
***
## Exercise 15
For this Exercise, use the `Orange` dataset, which is built into `R`.
Use a simple linear regression model to create a 90% confidence interval for the mean circumference of orange trees in millimeters when the age is 250 days. Report the lower bound of this interval.
### Solution
```{r}
orange_model = lm(circumference ~ age, data = Orange)
predict(orange_model, interval = "confidence", newdata = data.frame(age = 250), level = 0.90)[, "lwr"]
```
***
## Exercise 16
For this Exercise, use the `cats` dataset from the `MASS` package.
Use a simple linear regression model to create a 99% prediction interval for a cat's heart weight in grams if their body weight is 2.5 kilograms. Report the upper bound of this interval.
### Solution
```{r}
library(MASS)
cat_model = lm(Hwt ~ Bwt, data = cats)
predict(cat_model, interval = "prediction", level = 0.99,
        newdata = data.frame(Bwt = 2.5))[, "upr"]
```
***
## Exercise 17
Consider a 90% confidence interval for the mean response and a 90% prediction interval, both at the same $x$ value. Which interval is narrower?
- Confidence interval
- Prediction interval
- Not enough information; it depends on the value of $x$
### Solution
- **Confidence interval**
- Prediction interval
- Not enough information; it depends on the value of $x$
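
The prediction interval must account for the variability of a new observation in addition to the uncertainty in the estimated mean, so at the same $x$ and the same level it is always wider. A quick illustration using the built-in `faithful` data; the choice of `waiting = 80` and the names `conf_int` and `pred_int` are arbitrary assumptions for this sketch.

```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
new_obs = data.frame(waiting = 80)
conf_int = predict(faithful_model, newdata = new_obs, interval = "confidence", level = 0.90)
pred_int = predict(faithful_model, newdata = new_obs, interval = "prediction", level = 0.90)
# width of the confidence interval for the mean response
conf_int[, "upr"] - conf_int[, "lwr"]
# width of the prediction interval at the same x value
pred_int[, "upr"] - pred_int[, "lwr"]
```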
***
## Exercise 18
Suppose you obtain a 99% confidence interval for $\beta_1$ that is $(-0.4, 5.2)$. Now test $H_0: \beta_1 = 0$ vs $H_1: \beta_1 \neq 0$ with $\alpha = 0.01$. What decision do you make?
- Fail to reject $H_0$
- Reject $H_0$
- Reject $H_1$
- Not enough information
### Solution
- **Fail to reject $H_0$**
- Reject $H_0$
- Reject $H_1$
- Not enough information
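
A two-sided test at level $\alpha$ rejects $H_0: \beta_1 = 0$ exactly when $0$ falls outside the $(1 - \alpha)$ confidence interval for $\beta_1$. Here the 99% interval contains $0$, so we fail to reject at $\alpha = 0.01$. A small sketch of that check, with variable names assumed for illustration:

```{r}
ci_lower = -0.4
ci_upper = 5.2
# TRUE when the hypothesized value 0 lies inside the 99% interval,
# which corresponds to failing to reject H_0 at alpha = 0.01
contains_zero = ci_lower <= 0 & 0 <= ci_upper
contains_zero
```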
***
## Exercise 19
Suppose you test $H_0: \beta_1 = 0$ vs $H_1: \beta_1 \neq 0$ with $\alpha = 0.01$ and fail to reject $H_0$. Indicate all of the following that must always be true:
- There is no relationship between the response and the predictor.
- The probability of observing the estimated value of $\beta_1$ (or something more extreme) is greater than $0.01$ if we assume that $\beta_1 = 0$.
- The value of $\hat{\beta}_1$ is very small. For example, it could not be 1.2.
- The probability that $\beta_1 = 0$ is very high.
- We would also fail to reject at $\alpha = 0.05$.
### Solution
- There is no relationship between the response and the predictor.
- **The probability of observing the estimated value of $\beta_1$ (or something more extreme) is greater than $0.01$ if we assume that $\beta_1 = 0$.**
- The value of $\hat{\beta}_1$ is very small. For example, it could not be 1.2.
- The probability that $\beta_1 = 0$ is very high.
- We would also fail to reject at $\alpha = 0.05$.
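
To see why the third and fifth statements need not hold, consider two hypothetical results. All numbers below (the degrees of freedom, the estimate, the standard error, and the test statistic) are made up for illustration and do not come from any dataset in this practice.

```{r}
# hypothetical degrees of freedom, for example n = 20 observations
df = 18
# a slope estimate of 1.2 with a large standard error still fails to reject
beta_1_hat = 1.2
se_beta_1_hat = 1.0
p_large_se = 2 * pt(abs(beta_1_hat / se_beta_1_hat), df = df, lower.tail = FALSE)
p_large_se
# a test statistic of 2.5 fails to reject at 0.01 but rejects at 0.05
p_between = 2 * pt(2.5, df = df, lower.tail = FALSE)
p_between
```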
***
## Exercise 20
Consider a 95% confidence interval for the mean response calculated at $x = 6$. If instead we calculate the interval at $x = 7$, mark each value that would change:
- Point Estimate
- Critical Value
- Standard Error
### Solution
- **Point Estimate**
- Critical Value
- **Standard Error**
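
The critical value depends only on the confidence level and the degrees of freedom, while the point estimate and the standard error both depend on $x$. The sketch below illustrates this with the built-in `cars` data, which is not part of this exercise and is used here only because $x = 6$ and $x = 7$ fall within the range of its predictor.

```{r}
# `cars` is used only as a convenient built-in example
cars_model = lm(dist ~ speed, data = cars)
pred = predict(cars_model, newdata = data.frame(speed = c(6, 7)), se.fit = TRUE)
# point estimates at x = 6 and x = 7 differ
pred$fit
# standard errors of the estimated mean response also differ
pred$se.fit
# the critical value is the same at both x values
qt(0.975, df = pred$df)
```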
***