---
title: 'STAT 3202: Practice 09'
author: "Autumn 2018, OSU"
date: ''
output:
  html_document:
    theme: simplex
  pdf_document: default
urlcolor: BrickRed
---

***

## Exercise 1

Consider a random variable $X$ that has a $t$ distribution with $7$ degrees of freedom. Calculate $P[X > 1.3]$.

### Solution

```{r}
1 - pt(1.3, df = 7)
pt(1.3, df = 7, lower.tail = FALSE)
```

***

## Exercise 2

Consider a random variable $Y$ that has a $t$ distribution with $9$ degrees of freedom. Find $c$ such that $P[Y > c] = 0.025$.

### Solution

```{r}
qt(1 - 0.025, df = 9)
qt(0.025, df = 9, lower.tail = FALSE)
```

***

## Exercise 3

For this Exercise, use the built-in `trees` dataset in `R`. Fit a simple linear regression model with `Girth` as the response and `Height` as the predictor. What is the p-value for testing $H_0: \beta_1 = 0$ vs $H_1: \beta_1 \neq 0$?

### Solution

```{r}
tree_model = lm(Girth ~ Height, data = trees)
summary(tree_model)$coefficients["Height", "Pr(>|t|)"]
```

***

## Exercise 4

Continue using the SLR model you fit in Exercise 3. What is the length of a 90% confidence interval for $\beta_1$?

### Solution

```{r}
tree_model = lm(Girth ~ Height, data = trees)
ci_beta_1 = confint(tree_model, parm = "Height", level = 0.90)
ci_beta_1[2] - ci_beta_1[1]
```
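
As a cross-check, the length can also be computed by hand. The interval is $\hat{\beta}_1 \pm t_{\alpha/2, n - 2} \cdot \text{SE}[\hat{\beta}_1]$, so its length is $2 \cdot t_{\alpha/2, n - 2} \cdot \text{SE}[\hat{\beta}_1]$. The following is a minimal sketch; the names `se_beta_1` and `crit` are introduced here only for illustration.

```{r}
tree_model = lm(Girth ~ Height, data = trees)
# standard error of the slope estimate, taken from the coefficient table
se_beta_1 = summary(tree_model)$coefficients["Height", "Std. Error"]
# critical value for a 90% interval with n - 2 degrees of freedom
crit = qt(0.05, df = nrow(trees) - 2, lower.tail = FALSE)
# length of the interval: twice the margin of error
2 * crit * se_beta_1
```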

***

## Exercise 5

Continue using the SLR model you fit in Exercise 3. Calculate a 95% confidence interval for the mean tree girth of a tree that is 79 feet tall. Report the upper bound of this interval.

### Solution

```{r}
tree_model = lm(Girth ~ Height, data = trees)
predict(tree_model, newdata = data.frame(Height = 79), interval = "confidence")[, "upr"]
```
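
The same bound can be sketched by hand using the interval for the mean response, $\hat{y}(x) \pm t_{\alpha/2, n - 2} \cdot s_e \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{S_{xx}}}$. The helper names below (`x_new`, `s_e`, `se_mean`) are introduced for illustration only.

```{r}
tree_model = lm(Girth ~ Height, data = trees)
n = nrow(trees)
x_new = 79
Sxx = sum((trees$Height - mean(trees$Height)) ^ 2)
# residual standard error
s_e = summary(tree_model)$sigma
# point estimate of the mean response at x_new
y_hat = sum(coef(tree_model) * c(1, x_new))
# standard error of the estimated mean response at x_new
se_mean = s_e * sqrt(1 / n + (x_new - mean(trees$Height)) ^ 2 / Sxx)
# upper bound of the 95% confidence interval
y_hat + qt(0.975, df = n - 2) * se_mean
```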

***

## Exercise 6

Consider a random variable $X$ that has a $t$ distribution with $5$ degrees of freedom. Calculate $P[|X| > 2.1]$.

### Solution

```{r}
pt(-2.1, df = 5) + pt(2.1, df = 5, lower.tail = FALSE)
2 * pt(2.1, df = 5, lower.tail = FALSE)
```

***

## Exercise 7

Calculate the critical value used for a 90% confidence interval about the slope parameter of a simple linear regression model that is fit to 10 observations. (Your answer should be a positive value.)

### Solution

```{r}
conf_level = 0.90
sig_level = 1 - conf_level
n = 10
abs(qt(sig_level / 2, df = n - 2))
```

***

## Exercise 8

Consider the true simple linear regression model

$$
Y_i = 5 + 4 x_i + \epsilon_i \qquad \epsilon_i \sim N(0, \sigma^2 = 4) \qquad i = 1, 2, \ldots, 20
$$

Given $S_{xx} = 1.5$, calculate the probability of observing data according to this model, fitting the SLR model, and obtaining an estimate of the slope parameter greater than 4.2. In other words, calculate

$$
P[\hat{\beta}_1 > 4.2]
$$

### Solution

```{r}
Sxx = 1.5
beta_1 = 4
sigma = 2
e_beta_1_hat = beta_1
sd_beta_1_hat = sqrt(sigma ^ 2 / Sxx)
pnorm(4.2, mean = e_beta_1_hat, sd = sd_beta_1_hat, lower.tail = FALSE)
```

$$
\hat{\beta}_1 \sim N\left(  \beta_1, \frac{\sigma^2}{S_{xx}} \right)
$$
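
The sampling distribution above can be checked with a small simulation. This is only a sketch under an added assumption: the exercise does not give the $x_i$ values, so the design below is hypothetical, an evenly spaced grid rescaled so that $S_{xx} = 1.5$. The seed and the number of replications are also arbitrary choices.

```{r}
set.seed(42)
n = 20
# hypothetical predictor values, rescaled so that Sxx equals 1.5
x_raw = seq(0, 1, length.out = n)
x = x_raw * sqrt(1.5 / sum((x_raw - mean(x_raw)) ^ 2))
Sxx = sum((x - mean(x)) ^ 2)
# simulate one dataset from the true model and return the slope estimate
sim_beta_1_hat = function() {
  y = 5 + 4 * x + rnorm(n, mean = 0, sd = 2)
  sum((x - mean(x)) * y) / Sxx
}
beta_1_hats = replicate(10000, sim_beta_1_hat())
# simulated proportion of slope estimates above 4.2
mean(beta_1_hats > 4.2)
```

The simulated proportion should be close to, but not exactly equal to, the value returned by `pnorm()` above.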

***

## Exercise 9

For Exercises 9 - 13, use the `faithful` dataset, which is built into `R`. 

Suppose we would like to predict the duration of an eruption of [the Old Faithful geyser](http://www.yellowstonepark.com/about-old-faithful/) in [Yellowstone National Park](https://en.wikipedia.org/wiki/Yellowstone_National_Park) based on the waiting time before an eruption. Fit a simple linear model in `R` that accomplishes this task. 

What is the value of $\text{SE}[\hat{\beta}_1]$?

### Solution

```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
summary(faithful_model)$coefficients["waiting", "Std. Error"]
```
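
For reference, this standard error is $\text{SE}[\hat{\beta}_1] = s_e / \sqrt{S_{xx}}$, where $s_e$ is the residual standard error. A minimal sketch of the by-hand calculation, with `s_e` and `Sxx` as illustrative names:

```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
# residual standard error
s_e = summary(faithful_model)$sigma
# sum of squared deviations of the predictor
Sxx = sum((faithful$waiting - mean(faithful$waiting)) ^ 2)
s_e / sqrt(Sxx)
```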

***

## Exercise 10

What is the value of the test statistic for testing $H_0: \beta_0 = 0$ vs $H_1: \beta_0 \neq 0$?

### Solution

```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
summary(faithful_model)$coefficients["(Intercept)", "t value"]
```

***

## Exercise 11

What is the value of the test statistic for testing $H_0: \beta_1 = 0$ vs $H_1: \beta_1 \neq 0$?

### Solution

```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
summary(faithful_model)$coefficients["waiting", "t value"]
```
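
The statistic reported by `summary()` is $t = (\hat{\beta}_1 - 0) / \text{SE}[\hat{\beta}_1]$. A short sketch that rebuilds it from the coefficient table, with `coefs` and `t_stat` as illustrative names:

```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
coefs = summary(faithful_model)$coefficients
# estimate minus the hypothesized value, divided by the standard error
t_stat = (coefs["waiting", "Estimate"] - 0) / coefs["waiting", "Std. Error"]
t_stat
```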

***

## Exercise 12

Test $H_0: \beta_1 = 0$ vs $H_1: \beta_1 \neq 0$ with $\alpha = 0.01$. What decision do you make?

- Fail to reject $H_0$
- Reject $H_0$
- Reject $H_1$
- Not enough information

### Solution

```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
summary(faithful_model)$coefficients["waiting", "Pr(>|t|)"]
```

- Fail to reject $H_0$
- **Reject $H_0$**
- Reject $H_1$
- Not enough information
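
The decision follows from comparing the p-value to $\alpha$: reject $H_0$ when the p-value is smaller than $\alpha$. A minimal sketch of that rule, with `p_val` and `alpha` as illustrative names:

```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
p_val = summary(faithful_model)$coefficients["waiting", "Pr(>|t|)"]
alpha = 0.01
# reject when the p-value falls below the significance level
ifelse(p_val < alpha, "Reject H0", "Fail to reject H0")
```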

***

## Exercise 13

Calculate a 90% confidence interval for $\beta_0$. Report the upper bound of this interval.

### Solution

```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
confint(faithful_model, parm = "(Intercept)", level = 0.90)[, 2]
```

***

## Exercise 14

For this Exercise, use the `Orange` dataset, which is built into `R`.

Use a simple linear regression model to create a 90% confidence interval for the change in mean circumference of orange trees in millimeters when age is increased by 1 day. Report the lower bound of this interval.

### Solution

```{r}
orange_model = lm(circumference ~ age, data = Orange)
confint(orange_model, parm = "age", level = 0.90)[, 1]
```

***

## Exercise 15

For this Exercise, use the `Orange` dataset, which is built into `R`.

Use a simple linear regression model to create a 90% confidence interval for the mean circumference of orange trees in millimeters when the age is 250 days. Report the lower bound of this interval.

### Solution

```{r}
orange_model = lm(circumference ~ age, data = Orange)
predict(orange_model, interval = "confidence", newdata = data.frame(age = 250), level = 0.90)[, "lwr"]
```

***

## Exercise 16

For this Exercise, use the `cats` dataset from the `MASS` package.

Use a simple linear regression model to create a 99% prediction interval for a cat's heart weight in grams if their body weight is 2.5 kilograms. Report the upper bound of this interval.

### Solution

```{r}
library(MASS)
cat_model = lm(Hwt ~ Bwt, data = cats)
predict(cat_model, interval = "prediction", level = 0.99, 
        newdata = data.frame(Bwt = 2.5))[, "upr"]
```

***

## Exercise 17

Consider a 90% confidence interval for the mean response and a 90% prediction interval, both at the same $x$ value. Which interval is narrower?

- Confidence interval
- Prediction interval
- Not enough information; it depends on the value of $x$

### Solution

- **Confidence interval**
- Prediction interval
- Not enough information; it depends on the value of $x$
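
The reason is that the confidence interval uses the standard error $s_e \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{S_{xx}}}$, while the prediction interval uses $s_e \sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{S_{xx}}}$, which is larger at every $x$. As an illustration only, the sketch below compares the two widths on the `faithful` data at a few arbitrarily chosen waiting times.

```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
# arbitrary illustrative x values
new_x = data.frame(waiting = c(50, 70, 90))
conf_int = predict(faithful_model, newdata = new_x, interval = "confidence", level = 0.90)
pred_int = predict(faithful_model, newdata = new_x, interval = "prediction", level = 0.90)
ci_width = conf_int[, "upr"] - conf_int[, "lwr"]
pi_width = pred_int[, "upr"] - pred_int[, "lwr"]
# widths of both intervals at each x value
cbind(new_x, ci_width, pi_width)
```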

***

## Exercise 18

Suppose you obtain a 99% confidence interval for $\beta_1$ that is $(-0.4, 5.2)$. Now test $H_0: \beta_1 = 0$ vs $H_1: \beta_1 \neq 0$ with $\alpha = 0.01$. What decision do you make?

- Fail to reject $H_0$
- Reject $H_0$
- Reject $H_1$
- Not enough information

### Solution

- **Fail to reject $H_0$**
- Reject $H_0$
- Reject $H_1$
- Not enough information
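
A two-sided test at level $\alpha$ rejects $H_0: \beta_1 = 0$ exactly when the $(1 - \alpha)$ confidence interval for $\beta_1$ does not contain $0$. A small sketch of that check, with `ci_beta_1` and `contains_zero` as illustrative names:

```{r}
# the 99% confidence interval given in the exercise
ci_beta_1 = c(lower = -0.4, upper = 5.2)
# does the interval contain the hypothesized value 0?
contains_zero = ci_beta_1[["lower"]] < 0 && 0 < ci_beta_1[["upper"]]
ifelse(contains_zero, "Fail to reject H0", "Reject H0")
```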

***

## Exercise 19

Suppose you test $H_0: \beta_1 = 0$ vs $H_1: \beta_1 \neq 0$ with $\alpha = 0.01$ and fail to reject $H_0$. Indicate all of the following that must always be true:

- There is no relationship between the response and the predictor.
- The probability of observing the estimated value of $\beta_1$ (or something more extreme) is greater than $0.01$ if we assume that $\beta_1 = 0$.
- The value of $\hat{\beta}_1$ is very small. For example, it could not be 1.2.
- The probability that $\beta_1 = 0$ is very high.
- We would also fail to reject at $\alpha = 0.05$.

### Solution

- There is no relationship between the response and the predictor.
- **The probability of observing the estimated value of $\beta_1$ (or something more extreme) is greater than $0.01$ if we assume that $\beta_1 = 0$.** 
- The value of $\hat{\beta}_1$ is very small. For example, it could not be 1.2.
- The probability that $\beta_1 = 0$ is very high.
- We would also fail to reject at $\alpha = 0.05$.

***

## Exercise 20

Consider a 95% confidence interval for the mean response calculated at $x = 6$. If instead we calculate the interval at $x = 7$, mark each value that would change:

- Point Estimate
- Critical Value
- Standard Error

### Solution

- **Point Estimate**
- Critical Value
- **Standard Error**
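
To see this numerically, the sketch below uses the `faithful` model as a stand-in, with waiting times of 60 and 70 playing the role of $x = 6$ and $x = 7$. These values are arbitrary and chosen only to fall within the range of the data.

```{r}
faithful_model = lm(eruptions ~ waiting, data = faithful)
pred = predict(faithful_model, newdata = data.frame(waiting = c(60, 70)), se.fit = TRUE)
# point estimates: differ between the two x values
pred$fit
# standard errors: differ because (x - x_bar)^2 changes
pred$se.fit
# critical value: depends only on the confidence level and n - 2
qt(0.975, df = pred$df)
```

Because the standard error depends on $x$ only through $(x - \bar{x})^2$, it changes in general, but two $x$ values at the same distance from $\bar{x}$ would share the same standard error.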

***