Goal: After completing this lab, you should be able to…

In this lab we will use, but not focus on…

Some additional notes:


Exercise 1 - 2019 Ohio State Basketball

library(tidyverse)

For this lab we will use some elements of the tidyverse as a preview of a lab to come that will focus on using the tidyverse. (If you do not have the tidyverse package installed, you will need to do so. Note that the tidyverse package is actually a collection of other packages.)
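If needed, the package can be installed from CRAN before loading it:

# install the tidyverse collection of packages (only needs to be done once)
install.packages("tidyverse")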

# load data
osu_bb_2019_games = read_csv("https://daviddalpiaz.github.io/stat3202-sp19/data/osu-bb-2019-games.csv")
osu_bb_2019_games

For this exercise we will use data on the OSU Men’s Basketball games from the 2018-2019 season, excluding any games in the soon-to-be-played 2019 NCAA Tournament, where OSU is an 11 seed. While an 11 seed isn’t great, have a look at this video by Jon Bois, which explains some of the weirdness around certain seeds in the tournament.

In particular, we’ll investigate the personal fouls called against OSU compared to their opponents. Specifically, we will look at the difference between the number of personal fouls committed by OSU and by their opponent in each game. That is, we have “paired” data, so we will investigate the differences.

# create the difference data as a separate vector
osu_bb_2019_games %>% mutate(pf_diff = PF - OPPPF) %>% 
  select(pf_diff) %>% unlist() %>% unname() -> pf_diff
head(pf_diff)
## [1]   3  -3  -1 -12 -11   5

For example, in the fifth game of the season, OSU had 11 fewer personal fouls than their opponent, Samford.
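We can check this directly by indexing the vector of differences. (A negative value means OSU committed fewer personal fouls than their opponent.)

pf_diff[5]
## [1] -11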

Suppose we are interested in testing:

\[ H_0: \mu_D = 0 \quad \text{vs.} \quad H_A: \mu_D < 0 \]

where \(\mu_D\) is the true mean difference in personal fouls (OSU minus opponent).

There are a number of ways we could go about testing this, although each approach uses somewhat different or more specific null and alternative hypotheses.

We could consider a t-test:

t.test(pf_diff, alternative = "less")
## 
##  One Sample t-test
## 
## data:  pf_diff
## t = -0.70983, df = 32, p-value = 0.2415
## alternative hypothesis: true mean is less than 0
## 95 percent confidence interval:
##      -Inf 1.008247
## sample estimates:
##  mean of x 
## -0.7272727

Or, we could consider a Wilcoxon signed rank test:

wilcox.test(pf_diff, alternative = "less")
## 
##  Wilcoxon signed rank test with continuity correction
## 
## data:  pf_diff
## V = 244.5, p-value = 0.3608
## alternative hypothesis: true location is less than 0

We could also consider a sign test:

binom.test(x = sum(pf_diff > 0), n = length(pf_diff), p = 0.5, alternative = "less")
## 
##  Exact binomial test
## 
## data:  sum(pf_diff > 0) and length(pf_diff)
## number of successes = 14, number of trials = 33, p-value = 0.2434
## alternative hypothesis: true probability of success is less than 0.5
## 95 percent confidence interval:
##  0.0000000 0.5814382
## sample estimates:
## probability of success 
##              0.4242424

But maybe none of these seem right to us.

qplot(pf_diff, binwidth = 3)

So what should we do?

Use a permutation test based on the statistic

\[ t = \frac{\bar{x}_D}{s_D / \sqrt{n}} \]

to test the above hypotheses, where \(\bar{x}_D\) is the sample mean of the differences and \(s_D\) is the sample standard deviation of the differences. Use 10000 permutations.

set.seed(42)
# generate t statistics for the personal foul data via permutation here
# calculate t statistic on observed data here
# plot the empirical distribution of the permuted statistics
# add a vertical line indicating the observed value
# calculate the p-value here
# that is, calculate the proportion of the permuted statistics that are
# less than the observed value
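If you get stuck, here is a minimal sketch of one possible approach, assuming we permute by randomly flipping the signs of the observed differences (the usual permutation scheme for paired differences under the null hypothesis). The names perm_t and obs_t are simply illustrative.

n = length(pf_diff)

# permute: randomly flip the sign of each difference, then recompute t
perm_t = replicate(10000, {
  flipped = sample(c(-1, 1), size = n, replace = TRUE) * pf_diff
  mean(flipped) / (sd(flipped) / sqrt(n))
})

# t statistic on the observed data
obs_t = mean(pf_diff) / (sd(pf_diff) / sqrt(n))

# empirical distribution of the permuted statistics, with the observed value marked
hist(perm_t, breaks = 30, xlab = "t", main = "Permutation Distribution")
abline(v = obs_t, col = "dodgerblue", lwd = 2)

# p-value: proportion of permuted statistics less than the observed value
mean(perm_t < obs_t)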

Exercise 2 - 2018 Ohio State Football

Does Ohio State football score more points when playing at home, or on the road?

For this exercise we will use data on the OSU Football games from the 2018 season, including postseason games.

# load data
osu_fb_2018_games = read_csv("https://daviddalpiaz.github.io/stat3202-sp19/data/osu-fb-2018-games.csv")
osu_fb_2018_games

For the purposes of this highly simplistic analysis, we will consider games played on a neutral field, like the Rose Bowl, to be “away” games.

# modify data
osu_fb_2018_games$Home = ifelse(is.na(osu_fb_2018_games$Home), "home", "away")
osu_fb_2018_games

Let’s more specifically test:

\[ H_0: \mu_{\text{home}} = \mu_{\text{away}} \quad \text{vs.} \quad H_A: \mu_{\text{home}} \neq \mu_{\text{away}} \]

where \(\mu_{\text{home}}\) and \(\mu_{\text{away}}\) are the true mean points scored in home and away games, respectively.

Here we are assuming that we have two independent samples, one for home and one for away. (We’ll live with this assumption, but we should be highly suspicious of it. There is a ton of dependence in this data. We’re also ignoring opponent strength, and the fact that there were some coaching changes throughout the year….)

There are a number of ways we could go about testing this, although each approach uses somewhat different or more specific null and alternative hypotheses.

We could consider a t-test that does not assume equal variance in the two groups:

t.test(Pts ~ Home, data = osu_fb_2018_games)
## 
##  Welch Two Sample t-test
## 
## data:  Pts by Home
## t = -2.2599, df = 11.122, p-value = 0.04485
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -32.9709597  -0.4576117
## sample estimates:
## mean in group away mean in group home 
##           34.00000           50.71429

We could consider a t-test that does assume equal variance in the two groups:

t.test(Pts ~ Home, data = osu_fb_2018_games, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  Pts by Home
## t = -2.2599, df = 12, p-value = 0.04322
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -32.828765  -0.599806
## sample estimates:
## mean in group away mean in group home 
##           34.00000           50.71429

We could consider a Wilcoxon rank sum test, better known at OSU as the Mann-Whitney U test:

wilcox.test(Pts ~ Home, data = osu_fb_2018_games)
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  Pts by Home
## W = 8.5, p-value = 0.04716
## alternative hypothesis: true location shift is not equal to 0

Again, maybe none of these seem right to us.

osu_fb_2018_games %>% ggplot(aes(x = Pts)) + geom_histogram(binwidth = 20) + facet_wrap(~Home)

osu_fb_2018_games %>% ggplot(aes(x = Pts, col = Home)) + geom_line(stat = "density")

So what should we do?

Use a permutation test based on the statistic

\[ t = \frac{(\bar{x} - \bar{y}) - 0}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \]

to test the above hypotheses. Use 10000 permutations.

If you would like to follow the code used in the practice problems, you will need to create subsets of the points variable for the home and away games. If you would like to use the data as-is, consider the following code:

sample(osu_fb_2018_games$Home)
##  [1] "away" "away" "home" "away" "home" "away" "home" "home" "home" "away"
## [11] "home" "away" "home" "away"
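Alternatively, a minimal sketch of creating the home and away subsets mentioned above (assuming the points column is named Pts, as in the t-tests):

# subset points by the home/away indicator created earlier
pts_home = osu_fb_2018_games$Pts[osu_fb_2018_games$Home == "home"]
pts_away = osu_fb_2018_games$Pts[osu_fb_2018_games$Home == "away"]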
set.seed(42)
# generate t statistics for the scoring data via permutation here
# calculate t statistic on observed data here
# plot the empirical distribution of the permuted statistics
# add a vertical line indicating the observed value (and any value "as extreme")
# calculate the p-value here
# recall that this is a "two-sided" test
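If you get stuck, here is a minimal sketch of one possible approach, assuming the home/away labels are permuted with sample() as above. The helper function calc_t2 and the object names are simply illustrative.

# two-sample t statistic using a pooled standard deviation
calc_t2 = function(pts, group) {
  x = pts[group == "home"]
  y = pts[group == "away"]
  n_1 = length(x)
  n_2 = length(y)
  s_p = sqrt(((n_1 - 1) * var(x) + (n_2 - 1) * var(y)) / (n_1 + n_2 - 2))
  (mean(x) - mean(y)) / (s_p * sqrt(1 / n_1 + 1 / n_2))
}

# permute the home/away labels, recomputing the statistic each time
perm_t = replicate(10000, calc_t2(osu_fb_2018_games$Pts, sample(osu_fb_2018_games$Home)))

# t statistic on the observed data
obs_t = calc_t2(osu_fb_2018_games$Pts, osu_fb_2018_games$Home)

# empirical distribution of the permuted statistics, with values "as extreme" marked
hist(perm_t, breaks = 30, xlab = "t", main = "Permutation Distribution")
abline(v = c(-abs(obs_t), abs(obs_t)), col = "dodgerblue", lwd = 2)

# two-sided p-value: proportion of permuted statistics at least as extreme as observed
mean(abs(perm_t) >= abs(obs_t))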