---
title: 'STAT 3202: Practice 07'
author: "Autumn 2018, OSU"
date: ''
output:
  html_document:
    theme: simplex
  pdf_document: default
urlcolor: BrickRed
---
***
# Exercise 1
Suppose that a researcher is interested in the effect of caffeine on [typing speed](https://en.wikipedia.org/wiki/Words_per_minute). A group of nine individuals is administered a typing test. The following day, they repeat the typing test, this time after taking 400 mg of caffeine. (Note: This is not recommended.) The data gathered, measured in words per minute, are
```{r}
decaf = c(98, 124, 107, 105, 80, 43, 73, 68, 69)
caff = c(104, 128, 110, 108, 86, 53, 72, 73, 72)
```
```{r, echo = FALSE}
library(kableExtra)
kable(data.frame(decaf, caff)) %>%
  kable_styling(bootstrap_options = "striped", full_width = FALSE)
```
Note that these are paired observations.
Use the **sign test** with a significance level of 0.05 to assess whether or not caffeine has an effect on typing speed. That is, test
$$
H_0\colon \ m_D = m_C - m_N = 0 \quad \text{vs} \quad H_A\colon \ m_D = m_C - m_N \neq 0
$$
where
- $m_C$ is the median typing speed in words per minute of individuals using caffeine
- $m_N$ is the median typing speed in words per minute of individuals not using caffeine
Since it is possible that the caffeine makes typing speed worse, use a two-sided test.
You may use the following probabilities calculated in `R`.
```{r}
round(dbinom(x = 0:9, size = 9, prob = 0.5), 3)
```
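As a rough sketch of the mechanics: the sign test only uses the signs of the paired differences, and under $H_0$ the number of positive differences follows a binomial distribution with $p = 0.5$. The helper below is hypothetical (the name `sign_test_p_value` and its arguments are not part of the original exercise). It assumes that zero differences are dropped and that the two-sided p-value is found by doubling the smaller tail.

```{r}
# Hypothetical helper, not part of the original exercise:
# two-sided sign test for paired samples.
sign_test_p_value = function(after, before) {
  d = after - before
  d = d[d != 0]            # assumption: ties (zero differences) are dropped
  n = length(d)            # number of informative pairs
  n_pos = sum(d > 0)       # number of positive differences
  # smaller tail of Binomial(n, 0.5), doubled and capped at 1
  tail_prob = pbinom(min(n_pos, n - n_pos), size = n, prob = 0.5)
  min(1, 2 * tail_prob)
}
```

Calling `sign_test_p_value(caff, decaf)` should agree with the value found by summing the appropriate entries of the `dbinom()` output above.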
Report:
- The **p-value** of the test
- A **decision** when $\alpha = 0.05$.
```{r}
# use this chunk to complete any necessary calculations in R
```
- **P-Value:** Your p-value here.
- **Decision:** Your decision here.
***
# Exercise 2
Does meditation have an effect on [blood pressure](https://en.wikipedia.org/wiki/Blood_pressure)? A group of six college-aged individuals were given a routine physical examination, including a measurement of their [systolic](https://en.wikipedia.org/wiki/Systole) blood pressure (measured in millimeters of mercury). A week after their physicals, the same six individuals returned for a guided [meditation session](https://en.wikipedia.org/wiki/Guided_meditation). Immediately afterwards, their systolic blood pressure was measured. The data gathered are
```{r}
physical = c(125, 108, 185, 135, 112, 133)
meditation = c(120, 114, 160, 131, 124, 125)
```
```{r, echo = FALSE}
library(kableExtra)
kable(data.frame(physical, meditation)) %>%
  kable_styling(bootstrap_options = "striped", full_width = FALSE)
```
Note that these are paired observations.
Use the **sign test** with a significance level of 0.10 to assess whether or not meditation has an effect on blood pressure. That is, test
$$
H_0\colon \ m_D = m_M - m_P = 0 \quad \text{vs} \quad H_A\colon \ m_D = m_M - m_P \neq 0
$$
where
- $m_P$ is the median systolic blood pressure in millimeters of mercury measured without meditation
- $m_M$ is the median systolic blood pressure in millimeters of mercury measured with meditation
Since it is possible that meditation raises blood pressure, use a two-sided test.
You may use the following probabilities calculated in `R`.
```{r}
round(dbinom(x = 0:6, size = 6, prob = 0.5), 3)
```
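One way to cross-check a hand calculation from these probabilities is base `R`'s `binom.test()`, applied to the count of positive differences. The wrapper below is only a sketch: the name `sign_test_binom` is hypothetical, and it assumes zero differences are excluded from the count of trials.

```{r}
# Hypothetical wrapper, not part of the original exercise:
# sign test via an exact binomial test on the number of positive differences.
sign_test_binom = function(after, before) {
  d = after - before
  binom.test(
    x = sum(d > 0),        # positive differences
    n = sum(d != 0),       # assumption: zero differences are excluded
    p = 0.5,
    alternative = "two.sided"
  )$p.value
}
```

For example, `sign_test_binom(meditation, physical)` returns a two-sided p-value that can be compared with $\alpha = 0.10$.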
Report:
- The **p-value** of the test
- A **decision** when $\alpha = 0.10$.
```{r}
# use this chunk to complete any necessary calculations in R
```
- **P-Value:** Your p-value here.
- **Decision:** Your decision here.
***
# Exercise 3
Return to the blood pressure data in Exercise 2. This time test
- $H_0$: The distribution of systolic blood pressure is **the same** with and without meditation
- $H_A$: The distribution of systolic blood pressure is **different** with and without meditation
To do so, use a **permutation test** that permutes the *statistic*
$$
\bar{x}_D
$$
where $\bar{x}_D$ is the sample mean of the paired differences. Assume that the distributions of blood pressure with and without meditation have the same shape, but may have different locations. Use at least 10000 permutations.
```{r}
physical = c(125, 108, 185, 135, 112, 133)
meditation = c(120, 114, 160, 131, 124, 125)
```
- Create a histogram that illustrates the distribution of the statistic used.
- Report the p-value of the test.
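A minimal sketch of one possible approach follows. It assumes the permutations respect the pairing, so labels are swapped only within each individual, which is equivalent to randomly flipping the sign of each difference. The function name `perm_paired_mean_diff` and the shape of its return value are hypothetical.

```{r}
# Hypothetical helper, not part of the original exercise:
# permutation test for paired data based on the mean difference.
perm_paired_mean_diff = function(x, y, n_perm = 10000) {
  d = x - y
  observed = mean(d)       # statistic on the observed data
  permuted = replicate(n_perm, {
    # assumption: swapping labels within a pair flips the sign of its difference
    signs = sample(c(-1, 1), size = length(d), replace = TRUE)
    mean(signs * d)
  })
  list(
    observed = observed,
    permuted = permuted,
    # two-sided p-value: share of permuted statistics at least as extreme
    p_value = mean(abs(permuted) >= abs(observed))
  )
}
```

With `result = perm_paired_mean_diff(meditation, physical)`, the histogram could be drawn with `hist(result$permuted)` and the observed statistic marked with `abline(v = result$observed)`. Calling `set.seed()` beforehand makes the result reproducible.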
```{r}
# use this chunk to complete any necessary permutation calculations
# also calculate statistic on observed data
```
```{r}
# use this chunk to create the histogram
```
```{r}
# use this chunk to calculate the p-value of the test
```
***
# Exercise 4
Which profession pays more: data scientist or actuary? A (far too small) survey of junior (fewer than three years of experience) data scientists and actuaries resulted in the following data:
```{r}
data_sci = c(88000, 121000, 91000, 50000, 78000, 95000)
actuary = c(63000, 75000, 81000, 75000, 85000)
```
Use a **permutation test** that permutes the *statistic*
$$
t = \frac{(\bar{x} - \bar{y}) - 0}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}
$$
where $s_p$ is the pooled sample standard deviation and $n_1$ and $n_2$ are the two sample sizes, to test
- $H_0$: The distribution of salaries is **the same** for junior data scientists and actuaries
- $H_A$: The distribution of salaries is **different** for junior data scientists and actuaries
Assume that the salary distributions for the two professions have the same shape, but may have different locations. Use at least 10000 permutations.
- Create a histogram that illustrates the distribution of the statistic used.
- Report the p-value of the test.
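One possible sketch of this procedure: pool the two samples, repeatedly reassign the group labels at random, and recompute the pooled-variance $t$ statistic each time. The names `pooled_t` and `perm_two_sample_t` are hypothetical, and the pooled standard deviation is assumed to be the usual $s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$.

```{r}
# Hypothetical helpers, not part of the original exercise.

# pooled-variance two-sample t statistic
pooled_t = function(x, y) {
  n1 = length(x)
  n2 = length(y)
  sp = sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
  (mean(x) - mean(y)) / (sp * sqrt(1 / n1 + 1 / n2))
}

# permutation test: shuffle group labels, recompute t each time
perm_two_sample_t = function(x, y, n_perm = 10000) {
  pooled = c(x, y)
  n1 = length(x)
  observed = pooled_t(x, y)
  permuted = replicate(n_perm, {
    idx = sample(length(pooled), size = n1)   # indices relabeled as group 1
    pooled_t(pooled[idx], pooled[-idx])
  })
  list(
    observed = observed,
    permuted = permuted,
    # two-sided p-value: share of permuted statistics at least as extreme
    p_value = mean(abs(permuted) >= abs(observed))
  )
}
```

With `result = perm_two_sample_t(data_sci, actuary)`, the pieces needed for the histogram and the p-value are `result$permuted`, `result$observed`, and `result$p_value`. The observed statistic can be checked against `t.test(data_sci, actuary, var.equal = TRUE)`.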
```{r}
# use this chunk to complete any necessary permutation calculations
# also calculate statistic on observed data
```
```{r}
# use this chunk to create the histogram
```
```{r}
# use this chunk to calculate the p-value of the test
```
***
# Exercise 5
Repeat Exercise 3, but use an appropriate test available in the `R` function `wilcox.test()`.
Report:
- The **p-value** of the test
- A **decision** when $\alpha = 0.05$.
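Because the Exercise 3 data are paired, one reasonable reading of "appropriate" is the Wilcoxon signed-rank test, which `wilcox.test()` performs when `paired = TRUE`. The wrapper below is a sketch under that assumption, and the name `signed_rank_p_value` is hypothetical.

```{r}
# Hypothetical wrapper, not part of the original exercise:
# two-sided Wilcoxon signed-rank test for paired samples.
signed_rank_p_value = function(x, y) {
  # assumption: paired = TRUE, since each individual is measured twice
  wilcox.test(x, y, paired = TRUE, alternative = "two.sided")$p.value
}
```

For example, `signed_rank_p_value(meditation, physical)` returns a p-value to compare with $\alpha = 0.05$.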
```{r}
# use this chunk to complete any necessary calculations in R
```
- **P-Value:** Your p-value here.
- **Decision:** Your decision here.
***