- Exercise 0A - \(t\)-Test in `R`
- Exercise 0B - Sign Test in `R`
- Exercise 0C - The Cauchy Distribution
- Exercise 0D - Lab Setup
- Exercise 1 - Test Validity, Sampling From Normal
- Exercise 2 - Test Validity, Sampling From Cauchy
- Exercise 3 - Comparing Power, Sampling From Normal
- Exercise 4 - Comparing Power, Sampling From Cauchy

**Goal:** After completing this lab, you should be able to…

- *Use* simulation to verify significance levels.
- *Use* simulation to estimate power.
- *Understand* the pros and cons of parametric and nonparametric tests.

In this lab we will use, but not focus on, `R` Markdown. This document will serve as a template. It is pre-formatted and already contains chunks that you need to complete.

Some additional notes:

- Please see **Carmen** for information about submission and grading.
- You may use this document as a template. You do not need to remove directions. Chunks that require your input have a comment indicating to do so.

## Exercise 0A - \(t\)-Test in `R`

Recall the Deflategate data from previous homework, in particular the measurements for the Patriots.

`pats = c(11.50, 10.85, 11.15, 10.70, 11.10, 11.60, 11.85, 11.10, 10.95, 10.50, 10.90)`

Suppose that we wanted to test

\[ H_0\colon \mu_P = 12.5 \quad \text{vs} \quad H_A\colon \mu_P < 12.5 \]

To do so, we use the `t.test()` function in `R`. We use three of the function’s arguments:

- `x`: the data to be used for the test
- `mu`: the hypothesized value of the mean
- `alternative`: the alternative hypothesis of the test

`t.test(x = pats, mu = 12.5, alternative = "less")`

```
##
## One Sample t-test
##
## data: pats
## t = -11.465, df = 10, p-value = 2.241e-07
## alternative hypothesis: true mean is less than 12.5
## 95 percent confidence interval:
## -Inf 11.32898
## sample estimates:
## mean of x
## 11.10909
```

The output gives, among other things, the value of the **test statistic** as well as the **p-value** of the test.
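The reported statistic can be verified directly from the one-sample \(t\)-statistic formula, \(t = (\bar{x} - \mu_0)/(s/\sqrt{n})\), with the lower-tail p-value coming from the \(t\)-distribution with \(n - 1\) degrees of freedom. A quick sketch:

```r
# verify the t.test() output by hand
pats = c(11.50, 10.85, 11.15, 10.70, 11.10, 11.60, 11.85, 11.10, 10.95, 10.50, 10.90)
n = length(pats)
t_stat = (mean(pats) - 12.5) / (sd(pats) / sqrt(n))
p_val  = pt(t_stat, df = n - 1)  # lower tail, since H_A is "less"
c(t = t_stat, p = p_val)  # matches t = -11.465, p-value = 2.241e-07 above
```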

The `mu` and `alternative` arguments have default values of `0` and `"two.sided"` respectively. So, suppose that we wished to test

\[ H_0\colon \mu = 0 \quad \text{vs} \quad H_A\colon \mu \neq 0 \]

for the following data.

`some_data = c(-0.22, -0.51, 0.12, 0.14, -0.19)`

Then, we would simply use:

`t.test(some_data)`

```
##
## One Sample t-test
##
## data: some_data
## t = -1.0934, df = 4, p-value = 0.3356
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
## -0.4671803 0.2031803
## sample estimates:
## mean of x
## -0.132
```
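Because those are the defaults, the short call is equivalent to spelling out `mu` and `alternative` explicitly; a quick check:

```r
# the short call is equivalent to supplying the default arguments explicitly
some_data = c(-0.22, -0.51, 0.12, 0.14, -0.19)
short = t.test(some_data)
long  = t.test(some_data, mu = 0, alternative = "two.sided")
identical(short$p.value, long$p.value)  # TRUE
```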

## Exercise 0B - Sign Test in `R`

In class we only used the sign test for paired data, but it can also be used as a test about the median of a population.

Again, recall the Deflategate data from previous homework, this time the measurements for the Colts. (One datapoint was modified for illustration purposes.)

`colts = c(12.70, 12.75, 12.10, 12.55)`

Suppose we wished to test

\[ H_0\colon m_C = 12.5 \quad \text{vs} \quad H_A\colon m_C < 12.5 \]

where \(m_C\) is the median of the PSI of the Colts’ footballs.

Instead of looking at the sign of the difference of paired data, we compare each datapoint to the hypothesized median, in particular noting how many observations are greater than the hypothesized value. (You could alternatively use the number less than, but that would have a reversing effect on the alternative hypothesis.)

`(num_inflated = sum(colts > 12.5))`

`## [1] 3`

To carry out the sign test in `R` we use the `binom.test()` function where:

- `x` is the test statistic, the number of observations greater than the hypothesized median
  - in this case, this is measuring how many balls were properly inflated
- `n` is the number of observations
- `alternative` is the alternative hypothesis
  - the “less” alternative here is equivalent to saying there is less than a 50% chance that a ball is inflated properly
  - in other words, “more extreme” in this case is fewer inflated balls
`binom.test(x = num_inflated, n = length(colts), alternative = "less")`

```
##
## Exact binomial test
##
## data: num_inflated and length(colts)
## number of successes = 3, number of trials = 4, p-value = 0.9375
## alternative hypothesis: true probability of success is less than 0.5
## 95 percent confidence interval:
## 0.0000000 0.9872585
## sample estimates:
## probability of success
## 0.75
```

Here the output gives a number of things, including most importantly the **p-value**.
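Since the sign test is just an exact binomial test, the p-value can be verified with the binomial CDF: with the “less” alternative it is \(P(X \leq 3)\) for \(X \sim \text{Binomial}(4, 0.5)\). A quick sketch:

```r
# the sign-test p-value is a lower-tail binomial probability:
# P(X <= 3) for X ~ Binomial(4, 0.5), since the alternative is "less"
pbinom(3, size = 4, prob = 0.5)  # 0.9375, matching binom.test()
```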

Again, most of the time we would use a two-sided test, which, like `t.test()`, is the default for `binom.test()`. So, suppose that we wished to test

\[ H_0\colon m = 0 \quad \text{vs} \quad H_A\colon m \neq 0 \]

for the following data.

We no longer need to supply an `alternative` to `binom.test()` and this time we compare each observation to `0`. (Also, with a two-sided test, how the test statistic relates to the alternative is less important.)

```
some_data = c(-0.22, -0.51, 0.12, -0.14, -0.19)
num_pos = sum(some_data > 0)
binom.test(x = num_pos, n = length(some_data))
```

```
##
## Exact binomial test
##
## data: num_pos and length(some_data)
## number of successes = 1, number of trials = 5, p-value = 0.375
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
## 0.005050763 0.716417936
## sample estimates:
## probability of success
## 0.2
```

## Exercise 0C - The Cauchy Distribution

Whenever you would like to break a statistical procedure, use the Cauchy distribution! It has a couple of interesting properties:

- An **undefined** mean!
- An **undefined** variance!

```
x_vals = seq(from = -5, to = 5, length = 10000)
plot(x_vals, dnorm(x_vals), type = "l", lwd = 2, lty = 1,
     ylim = c(0, 0.45), xlab = "x", ylab = "density", col = "dodgerblue")
lines(x_vals, dcauchy(x_vals), lwd = 2, lty = 3, col = "darkorange")
legend("topleft", legend = c("Normal", "Cauchy"),
       col = c("dodgerblue", "darkorange"), lwd = 2, lty = c(1, 3))
grid()
```
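The undefined mean has a practical consequence: sample means of Cauchy draws never settle down as the sample size grows, while normal sample means do. A small simulation sketch (the seed is arbitrary):

```r
# running means of Cauchy draws do not converge (no law of large numbers here)
set.seed(42)  # arbitrary seed, for reproducibility only
cauchy_draws = rcauchy(10000)
normal_draws = rnorm(10000)
running_mean = function(x) cumsum(x) / seq_along(x)
plot(running_mean(normal_draws), type = "l", col = "dodgerblue",
     ylim = c(-5, 5), xlab = "n", ylab = "running mean")
lines(running_mean(cauchy_draws), col = "darkorange")
legend("topleft", legend = c("Normal", "Cauchy"),
       col = c("dodgerblue", "darkorange"), lty = 1)
```

The normal running mean hugs \(0\), while the Cauchy running mean keeps jumping whenever an extreme observation arrives.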