Goal: After completing this lab, you should be able to…
R) using either maximum likelihood or method of moments estimators.
In this lab we will use, but not focus on…
R Markdown. This document will serve as a template. It is pre-formatted and already contains chunks that you need to complete.
Some additional notes:
R Markdown file contain details on performing the required tasks through a number of examples.
01. It may be extremely useful here.
For this exercise, we will return to the sleep data from class. Please refer to the notes for a complete description of this dataset.
library(readr)
fitbit_sleep = read_csv("https://daviddalpiaz.github.io/stat3202-sp19/data/fitbit-sleep.csv")
The above chunk will read the data as a data frame named fitbit_sleep. (Actually, a tibble, but that is unimportant at the moment.) In general, you should avoid using an absolute reference to data on your own machine. Instead, you should use a relative reference, and include the data with your R Markdown document. Here we avoid the issue altogether and simply read the data directly from the web. This way, the above code will work on anyone’s machine.
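For comparison, a relative reference might look like the following minimal sketch. The `data` folder and the name `fitbit_sleep_local` are hypothetical, chosen only for illustration; the sketch assumes a copy of the file has been saved next to the R Markdown document.

```r
# hypothetical location: a "data" folder stored alongside the R Markdown file
local_path = file.path("data", "fitbit-sleep.csv")

# only attempt the read when a local copy has actually been saved there
if (file.exists(local_path)) {
  fitbit_sleep_local = readr::read_csv(local_path)
} else {
  message("No local copy found at ", local_path)
}
```

Because the path is relative to the document, the same code would work for anyone who receives the document and the `data` folder together.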
percent_deep = fitbit_sleep$min_deep / fitbit_sleep$min_sleep
For this exercise, we will investigate a new variable called percent_deep that is created above. This variable measures the proportion of time asleep spent in the deep sleep stages.
We will consider two probability models for this variable, a beta model and a normal model.
range(percent_deep)
## [1] 0.05882353 0.24561404
Notice the range of the observed data that is calculated above. Furthermore, since this variable is a proportion, it could take values between 0 and 1. (Although, it is rather unlikely that you would spend all night in deep sleep.) If we think of the sample space of the continuous distributions that we know, we should remember that the beta distribution takes as input values between 0 and 1. This makes beta a good candidate.
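As a quick check of this claim about the sample space, the following minimal sketch uses the base R function `dbeta()`. The shape values 2 and 5 are arbitrary and chosen only for illustration; they are not estimates from the sleep data.

```r
# arbitrary shape parameters, chosen only for illustration
shape_a = 2
shape_b = 5

# the beta density is zero for inputs outside of [0, 1]
outside = dbeta(c(-0.5, 1.5), shape1 = shape_a, shape2 = shape_b)

# the density integrates to (approximately) one over [0, 1]
total_area = integrate(dbeta, lower = 0, upper = 1,
                       shape1 = shape_a, shape2 = shape_b)$value
```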
Recall that a random variable \(X\) that follows a beta distribution has probability density function
\[ f(x \mid \alpha, \beta) = \left[\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\right] x^{\alpha - 1}(1 - x)^{\beta - 1}, \quad 0 \leq x \leq 1, \ \alpha > 0, \ \beta > 0 \]
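To connect this density to R, the sketch below codes the formula by hand and compares it to the built-in `dbeta()`, where `shape1` plays the role of \(\alpha\) and `shape2` the role of \(\beta\). The function name `beta_pdf` and the parameter values are assumptions made for illustration.

```r
# hand-coded beta density, written to mirror the formula term by term
beta_pdf = function(x, alpha, beta) {
  const = gamma(alpha + beta) / (gamma(alpha) * gamma(beta))
  const * x ^ (alpha - 1) * (1 - x) ^ (beta - 1)
}

# evaluate both versions on a grid of values inside (0, 1)
x_grid = seq(0.05, 0.95, by = 0.05)
by_hand = beta_pdf(x_grid, alpha = 2, beta = 5)
built_in = dbeta(x_grid, shape1 = 2, shape2 = 5)

# TRUE when the two agree up to numerical tolerance
all.equal(by_hand, built_in)
```

In practice `dbeta()` is preferable, since `gamma()` can overflow for large parameter values.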
Also, the beta distribution has
\[ \text{E}[X] = \frac{\alpha}{\alpha + \beta} \]
and
\[ \text{Var}[X] = \frac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)} \]
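These two formulas can be checked numerically. The sketch below, again with arbitrary illustrative parameter values, compares them to moments computed with `integrate()`.

```r
# arbitrary shape parameters, chosen only for illustration
shape_a = 2
shape_b = 5

# mean and variance from the formulas
mean_formula = shape_a / (shape_a + shape_b)
var_formula = (shape_a * shape_b) /
  ((shape_a + shape_b) ^ 2 * (shape_a + shape_b + 1))

# first and second moments by numerical integration over [0, 1]
first_moment = integrate(function(x) x * dbeta(x, shape_a, shape_b),
                         lower = 0, upper = 1)$value
second_moment = integrate(function(x) x ^ 2 * dbeta(x, shape_a, shape_b),
                          lower = 0, upper = 1)$value
var_numeric = second_moment - first_moment ^ 2

# both comparisons should be TRUE up to numerical tolerance
all.equal(mean_formula, first_moment)
all.equal(var_formula, var_numeric)
```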
Given \(X_1, X_2, \ldots, X_n\), assumed to be a random sample from a beta distribution, it is not possible to find an analytic solution for the MLE. However, we can use the method of moments to obtain estimators:
\[ \tilde{\alpha} = \bar{x}\left(\frac{\bar{x}(1 - \bar{x})}{s^2} - 1\right) \]
\[ \tilde{\beta} = (1 - \bar{x})\left(\frac{\bar{x}(1 - \bar{x})}{s^2} - 1\right) \]
where
\[ \bar{x} = \frac{1}{n}\sum_{i = 1}^{n}x_i \]
\[ s^2 = \frac{1}{n - 1}\sum_{i = 1}^{n}(x_i - \bar{x})^2 \]
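These estimators come from matching the sample mean and variance to the beta mean and variance. Writing \(\kappa = \alpha + \beta\), the mean equation gives \(\alpha = \bar{x}\kappa\) and \(\beta = (1 - \bar{x})\kappa\), and the variance equation becomes \(s^2 = \bar{x}(1 - \bar{x}) / (\kappa + 1)\), so \(\kappa = \bar{x}(1 - \bar{x}) / s^2 - 1\).

The following is a minimal sketch of these calculations on simulated data, not the sleep data. The names `beta_mom` and `neg_log_lik`, the seed, and the true parameter values are all assumptions made for illustration. Although the MLE has no analytic solution, it can be approximated numerically, and the sketch does so with `optim()`, using the method of moments estimates as starting values.

```r
# method of moments estimators for the beta distribution
beta_mom = function(x) {
  x_bar = mean(x)
  s2 = var(x)  # var() uses the n - 1 denominator, matching s^2 above
  common = x_bar * (1 - x_bar) / s2 - 1
  c(alpha = x_bar * common, beta = (1 - x_bar) * common)
}

# simulated data with known, arbitrary parameters (illustration only)
set.seed(42)
sim_data = rbeta(n = 5000, shape1 = 4, shape2 = 20)
mom_est = beta_mom(sim_data)

# negative log-likelihood, parameterized on the log scale so that
# both shape parameters stay positive during the search
neg_log_lik = function(log_par, x) {
  -sum(dbeta(x, shape1 = exp(log_par[1]), shape2 = exp(log_par[2]),
             log = TRUE))
}

# numerical MLE, started at the method of moments estimates
mle_fit = optim(par = log(mom_est), fn = neg_log_lik, x = sim_data)
mle_est = exp(mle_fit$par)

mom_est
mle_est
```

With a sample this large, both sets of estimates should land near the true values of 4 and 20 used in the simulation.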
Do the following:
# perform estimate calculations here
# remove these comments and insert your code in this chunk
hist(percent_deep, xlim = c(0, 1), ylim = c(0, 12), probability = TRUE,
     main = "Histogram of Percent Deep Sleep",
     xlab = "Percent Deep Sleep")
box()
grid()
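One common way to compare probability models with a histogram like this is to overlay the estimated densities using `curve()` with `add = TRUE`. The sketch below assumes that is the goal and uses simulated placeholder data rather than the sleep data; the names `sim_prop`, `alpha_hat`, and `beta_hat`, the colors, and the true parameter values are all illustrative assumptions.

```r
# simulated placeholder proportions (not the sleep data)
set.seed(42)
sim_prop = rbeta(n = 200, shape1 = 4, shape2 = 20)

# method of moments estimates for the beta model
x_bar = mean(sim_prop)
s2 = var(sim_prop)
common = x_bar * (1 - x_bar) / s2 - 1
alpha_hat = x_bar * common
beta_hat = (1 - x_bar) * common

# histogram on the density scale, so that densities can be overlaid
hist(sim_prop, xlim = c(0, 1), ylim = c(0, 12), probability = TRUE,
     main = "Histogram of Simulated Proportions",
     xlab = "Simulated Proportion")
box()
grid()

# estimated beta density (solid) and estimated normal density (dashed)
curve(dbeta(x, shape1 = alpha_hat, shape2 = beta_hat),
      add = TRUE, col = "dodgerblue", lwd = 2)
curve(dnorm(x, mean = x_bar, sd = sqrt(s2)),
      add = TRUE, col = "darkorange", lwd = 2, lty = 2)
```

Setting `probability = TRUE` matters here: without it the histogram shows counts, and the density curves would be on a different vertical scale.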