Goal: After completing this lab, you should be able to…

Use simulation to verify significance levels.
Use simulation to estimate power.

In this lab we will use, but not focus on…

R Markdown. This document will serve as a template. It is pre-formatted and already contains chunks that you need to complete.

Some additional notes:

Please see Carmen for information about submission, and grading.
You may use this document as a template. You do not need to remove directions. Chunks that require your input have a comment indicating to do so.

Exercise 0 - Verifying \(\alpha\), Power

The following function will be used throughout this lab:

sim_p_value = function(mu = 0, sigma = 1, n = 25, mu_0 = 0) {

  # sample from population
  sample_data = rnorm(n = n, mean = mu, sd = sigma)

  # calculate statistics
  x_bar = mean(sample_data)
  s = sd(sample_data)
  t = (x_bar - mu_0) / (s / sqrt(n))

  # calculate p-value
  p_value = 2 * pt(-abs(t), df = n - 1)

  # return p-value
  p_value
}

This function simulates p-values for a two-sided, one sample \(t\)-test. That is, we assume that the data is sampled from a normal distribution, \(N(\mu, \sigma^2)\), and we are performing the test:

\[ H_0: \mu = \mu_0 \text{ vs } H_1: \mu \neq \mu_0 \]

The function takes as input four arguments:

mu: The true value of \(\mu\) in the population
sigma: The true value of \(\sigma\) in the population
n: The sample size
mu_0: The hypothesized value of \(\mu\) which we call \(\mu_0\)

Let’s run the function and discuss what is happening.

set.seed(1)
sim_p_value(mu = 10, sigma = 3, n = 25, mu_0 = 13)

## [1] 0.0002035751

First, internally the function samples 25 observations from a normal distribution with a mean of 10 and a standard deviation of 3. Then, the function calculates and returns the p-value for testing

\[ H_0: \mu = 13 \text{ vs } H_1: \mu \neq 13 \]

Note that this is a “small” p-value, so with a significance level of say, \(\alpha = 0.05\) we would reject \(H_0\). (Which makes sense, the true value of \(\mu\) is not 13!)

Recall that \(\alpha\), the significance level is the probably of rejecting the null hypothesis when it is true.

\[ \alpha = P(\text{Reject } H_0 \mid H_0 \text{ True}) \]

Lets verify the significance level of the test

\[ H_0: \mu = 0 \text{ vs } H_1: \mu \neq 0 \]

The true distribution will be normal with a mean of 0 and a standard deviation of 1. We will use a sample size of 25. We will simulate this test 25000 times and use the simulated p-values to estimate the true \(\alpha\) of the test.

set.seed(42)
p_values = replicate(25000, sim_p_value(mu = 0, sigma = 1, n = 25, mu_0 = 0))
head(p_values)

## [1] 0.4798228 0.1843076 0.4695465 0.7457961 0.4990247 0.3441031

Here we use the replicate() function to repeatedly run the sim_p_value() function. This avoids needing to write a for loop.

Notice that these p-values appear roughly uniform.

hist(p_values, probability = TRUE, col = "darkgrey",
     main = "Histogram of P-Values, Null True", xlab = "p-values")
box()
grid()

Suppose we were to use \(\alpha = 0.05\). Well, then we should expect that roughly 5% of these p-values are below 0.05.

mean(p_values < 0.05)

## [1] 0.0498

Pretty close! If we modified the \(\alpha\) value, we would still obtain the right value. (The big take-away here is that p-values are uniformly distributed when the null is true.)

mean(p_values < 0.10)

## [1] 0.1014

mean(p_values < 0.05)

## [1] 0.0498

mean(p_values < 0.01)

## [1] 0.0106

What if the null hypothesis is not true. Instead, some alternative is true. What happens to the distribution of p-values?

Lets calculate the power of the test

\[ H_0: \mu = 0 \text{ vs } H_1: \mu \neq 0 \]

set.seed(42)
p_values_power = replicate(25000, sim_p_value(mu = 0.5, sigma = 1, n = 25, mu_0 = 0))

Let’s keep everything the same as before, except now the null is not true because the true mean is 0.5.

hist(p_values_power, probability = TRUE, col = "darkgrey",
     main = "Histogram of P-Values, Alternative True", xlab = "p-values")
box()
grid()

This is no longer uniform! Now to calculate the power when \(\alpha = 0.05\). Recall that power is given by,

\[ \text{Power} = P(\text{Reject } H_0 \mid H_0 \text{ False}) \]

Then in R we simply note how often we reject.

mean(p_values_power < 0.05)

## [1] 0.66752

This means that even though the null is not true, the mean is 0.5 instead of 0, using a sample of size 25 and \(\alpha = 0.05\), we will only reject the null about 67% of the time. (We’d like this to be 100%, which isn’t possible, so we’d like this to be as high as possible.)

What causes a test to have low or high power? The rest of this lab, you will investigate. Things we will consider:

Sample size
Variance of population
Effect size (How big a difference in mean are we looking for. In other words, which alternative is true. There are a lot of ways for the null hypotheses to be false.)
Significance level

Some things we won’t look at here, but you should think about:

One versus two-sided test. (One sided are more powerful.)
Balance in design. (In a two-sample test, we like the samples from the two populations to be as close to equal as possible)

Exercise 1 - Power, Sample Size

Consider samples of size

\(n = 20\)
\(n = 40\)
\(n = 60\)

from a normal distribution with

\(\mu = 3\)
\(\sigma = 2.5\)

For each \(n\), use 25000 simulations to estimate the power of the test

\[ H_0: \mu = 4 \text{ vs } H_1: \mu \neq 4 \]

when \(\alpha = 0.05\).

set.seed(42)
# perform simulations here
# store results of using replicate() with sim_p_value() three times
# (once for each n)

# calculate power for each n here
# hint: use mean() three times

Based on these results, how does sample size affect power?

YOUR ANSWER HERE.

Exercise 2 - Power, Population Variance

Consider a sample of size

\(n = 25\)

from a normal distribution with mean

\(\mu = 3\)

and standard deviations

\(\sigma = 2\)
\(\sigma = 3\)
\(\sigma = 4\)

For each \(\sigma\), use 25000 simulations to estimate the power of the test

\[ H_0: \mu = 4 \text{ vs } H_1: \mu \neq 4 \]

when \(\alpha = 0.05\).

set.seed(42)
# perform simulations here
# store results of using replicate() with sim_p_value() three times
# (once for each sigma)

# calculate power for each sigma here
# hint: use mean() three times

Based on these results, how does population variance affect power?

YOUR ANSWER HERE.

Exercise 3 - Power, Effect Size

Consider a sample of size

\(n = 20\)

from a normal distributions with means

\(\mu = 5\)
\(\mu = 6\)
\(\mu = 7\)

and standard deviation

\(\sigma = 3.5\)

For each \(\sigma\), use 25000 simulations to estimate the power of the test

\[ H_0: \mu = 4 \text{ vs } H_1: \mu \neq 4 \]

when \(\alpha = 0.05\).

set.seed(42)
# perform simulations here
# store results of using replicate() with sim_p_value() three times
# (once for each mu)

# calculate power for each mu here
# hint: use mean() three times

Based on these results, how does effect size affect power?

YOUR ANSWER HERE.

Exercise 4 - Power, Significance Level

Consider a sample of size

\(n = 15\)

from a normal distributions with mean

\(\mu = 0\)

and standard deviation

\(\sigma = 1\)

For each \(\sigma\), use 25000 simulations to estimate the power of the test

\[ H_0: \mu = 0.5 \text{ vs } H_1: \mu \neq 0.5 \]

when

\(\alpha = 0.01\)
\(\alpha = 0.05\)
\(\alpha = 0.10\)

set.seed(42)
# perform simulations here
# store results of using replicate() with sim_p_value() ONCE
# (use the same simulations for each alpha)

# calculate power for each alpha here
# hint: use mean() three times

Based on these results, how does the significance level affect power?

YOUR ANSWER HERE.

STAT 3202: Lab 05, Power

Spring 2019, OSU

Due: Friday, March 8

Exercise 0 - Verifying \(\alpha\), Power

Exercise 1 - Power, Sample Size

Exercise 2 - Power, Population Variance

Exercise 3 - Power, Effect Size

Exercise 4 - Power, Significance Level