Goal: The goal of this lab is to simply get started using R Markdown. After completing this lab, you will have created your first R Markdown document from scratch.

In later labs, we will use R Markdown, but our focus will be on using R programming to better understand statistical concepts. For those labs, you will be given an R Markdown template document. For this lab, you will work from scratch.

Please see Carmen for information about submission and grading.


Exercise 0 - Installation

Before we get started, you will need to have the necessary software.

You do not need to finish the lab before the end of class; you have until Friday to submit to Carmen. However, if you’ve brought your own machine to class, please make sure you at least have everything installed and in working order before you leave. (Or stop by office hours ASAP to get it sorted.)

We will never use the default R GUI. We will always use RStudio.

After you have R and RStudio installed, the easiest way to install R Markdown is to simply create a new R Markdown document in RStudio. To do so, click File > New File > R Markdown… Assuming you do not have R Markdown installed, you will be greeted with a prompt asking if you would like to install R Markdown. How convenient! Select yes.

At this point you should see another prompt:

Enter the relevant information (title and author) and select HTML then click OK.

Congratulations, you’ve created your first R Markdown document! Your RStudio should look something like this. We are going to delete most of this text, but you should go ahead and read it now. It serves as a super-mini R Markdown tutorial.

Your next step is to save your document.

Save your document as lab-1-lastname#.Rmd. Do not include the usual “dot” in your lastname.# combination. (You might also consider first creating a folder called lab-01 and saving the file in that folder.)

Our next step is to “knit” this file. Knitting takes the R Markdown document, which combines R code with Markdown (itself a shorthand for HTML), and creates a final, more human-readable file, in this case an HTML file. To do so, click the “Knit” button located in the bar above the code editor.

The result should look something like this:

Before we get started with the rest of the lab, delete everything except for the “header” of the document. Also, replace the Date: field with your lastname.#. (Maybe change the title to STAT 3202: Lab 01, which is more informative.) Then knit your file again.

This will be our starting place:


Exercise 1 - Markdown

Before we get to the R part of R Markdown, let’s learn some Markdown. Take a look at this R Markdown “Cheatsheet” from RStudio. On the second page there is a section on Pandoc Markdown.

Use this information to:

Your schedule should be formatted as a bullet list with the short coursename bolded, like this:

If you prefer to not share your schedule, write a fake one. (But make it obvious that it is fake…)

Consider “knitting” your document periodically to make sure you’re on the right track.
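Knitting can also be triggered from the R console rather than the “Knit” button. The following is only a sketch under some assumptions: the file name is a placeholder for whatever name you chose when saving, and the code assumes the file sits in your current working directory. It uses `rmarkdown::render()`, which knits a document and returns the path of the output file. The button remains the approach this lab expects.

```r
# placeholder file name (an assumption): replace with the name you saved
rmd_file = "lab-1-lastname#.Rmd"

# only knit if the rmarkdown package, pandoc, and the file are all available
can_knit = requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available() &&
  file.exists(rmd_file)

if (can_knit) {
  # render() knits the document and returns the path of the HTML file
  html_file = rmarkdown::render(rmd_file, output_format = "html_document")
  message("Knitted file written to: ", html_file)
} else {
  message("Could not knit. Check that rmarkdown is installed ",
          "(install.packages(\"rmarkdown\")) and that the file name is correct.")
}
```

If the message says the file could not be knitted, `getwd()` will show which folder R is currently looking in.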


Exercise 2 - R and Markdown

Add another header to your document titled “Using R.”

In this section, add three R “chunks.” Do the following in those chunks


Exercise 3 - Simulation Study

Add another header to your document titled “Simulation Study.”

In this section, add the following R code in a chunk.

# function that simulates sample means and variance from a poisson distribution
# for various sample sizes and true lambdas
sim_mean_var = function(sample_size = 50, true_lambda = 2) {
  sim_sample = rpois(n = sample_size, lambda = true_lambda)
  c(mean(sim_sample), var(sim_sample))
}

# generate sample means and variances
set.seed(42)
results = replicate(10000, sim_mean_var())
sample_means = results[1, ]
sample_variances = results[2, ]

This is code we saw in class for obtaining the empirical distribution of \(\bar{X}\) and \(S^2\) when attempting to estimate \(\lambda = 2\) using a sample of size 50 from a Poisson distribution.
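Because a Poisson distribution has mean and variance both equal to \(\lambda\), both \(\bar{X}\) and \(S^2\) are unbiased estimators of \(\lambda\). As an optional sanity check (not a required part of the lab), the sketch below repeats the simulation so that it runs on its own, then summarizes the two empirical distributions. Both averages should land near 2, and in this setting the sample variance would be expected to be the noisier of the two.

```r
# same simulation as above, repeated so this chunk runs on its own
sim_mean_var = function(sample_size = 50, true_lambda = 2) {
  sim_sample = rpois(n = sample_size, lambda = true_lambda)
  c(mean(sim_sample), var(sim_sample))
}

set.seed(42)
results = replicate(10000, sim_mean_var())
sample_means = results[1, ]
sample_variances = results[2, ]

# center of each empirical distribution: both should be close to lambda = 2
mean(sample_means)
mean(sample_variances)

# spread of each empirical distribution: compare how noisy the estimators are
sd(sample_means)
sd(sample_variances)
```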


Exercise 4 - Plotting and Chunk Options

In the “Simulation Study” section, add another chunk with the following code which will produce a plot. When using R Markdown, it is a good idea to use different chunks for plots. Also, one plot per chunk!

Modify this chunk’s Chunk Options to show only the resulting plot in the knitted document and not the code to generate the plot.

# plot results of simulation study
plot(density(sample_means), xlim = range(sample_variances), lwd = 3,
     main = "Sampling Distributions of Sample Mean and Variance: Poisson",
     xlab = "", col = "dodgerblue")
lines(density(sample_variances), lwd = 3, lty = 2, col = "darkorange")
legend("topright", c("mean", "variance"), lty = c(1, 2), lwd = 3,
       col = c("dodgerblue", "darkorange"))
abline(v = 2, col = "darkgrey", lty = 3)

Exercise 5 - How to Learn

Create another header titled “Objects and Functions.” In that section, recreate the following blockquote:

“To understand computations in R, two slogans are helpful:

  • Everything that exists is an object.
  • Everything that happens is a function call.”

— John Chambers

Assume that we hadn’t provided the R Markdown cheatsheet and you needed to figure out how to do this. Click here for a quick demonstration on how to learn anything about R Markdown. (Or really, any topic at all!)

This quotation might not make sense right now, but it is a useful mental model for working with R that we will often return to.
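To make the two slogans a little more concrete, here is a small illustrative sketch. The names `my_mean` and `x` are made up for this example. It shows that a function is itself an object that can be stored under another name, and that familiar operators such as `+` and `[` are function calls in disguise.

```r
# "Everything that exists is an object": a function can be inspected
# and stored under a new name, just like a number or a vector
class(mean)
my_mean = mean
my_mean(c(1, 2, 3))

# "Everything that happens is a function call": operators are functions too,
# so each pair of lines below does the same thing
1 + 2
`+`(1, 2)

x = c(10, 20, 30)
x[2]
`[`(x, 2)
```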


Exercise 6 - Submission

You’re almost done; you just need to knit, zip, and submit.

All submissions must be received by Friday, August 31 at 11:59 PM for credit.


Exercise 7 - Curiosity, Reading, Template, Videos

There is a lot more to learn than what we can discuss and try out in 50 minutes. We have left out a million little details. Even if we had told you every single detail, you wouldn’t remember them. The only real way to learn programming and related activities like R Markdown is to get your hands dirty and practice. (Which is what we will do in lab!)

Start to get into the mindset of trying things, being wrong, and fixing mistakes. For example, while written in R Markdown, this document looks different than those that you are producing. Why? See if you can figure out how to change the theme of your document. Also, see if you can add a table of contents.

For details on R, RStudio, and R Markdown, consider reading Chapters 1 - 6 of “Applied Statistics with R”. This is a poorly titled, freely available, and open-source book that I (Dave) wrote for a regression course, but it contains a quick-and-dirty introduction to R, RStudio, and R Markdown. Of particular interest should be Chapter 6:

This chapter contains:

Also, we should mention that the author of the R Markdown package wrote an entire book on the subject. This book is well beyond the scope of the course, but it is a good reference to be aware of.


Exercise 8 - Reading Solutions

After the lab is due, solutions will be released. Even if you get 100% on the lab, you should read the .Rmd solution file to see if there are more efficient ways of doing what you were asked to do. The more exposure you have to good code, the better.