Overview

This data set gives the number of runs scored by the home and away teams in every major league baseball game played during the 2011 through 2015 seasons.

Details

This data set contains the results of every major league baseball game played from 2011 through 2015. The numbers of runs scored by the home team and by the away team are given for each game. We are interested in modeling the number of runs scored and in determining whether it differs between the home and the away teams.
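
One way to look at the home versus away question is a paired comparison, since each game contributes one home score and one away score. The sketch below is only an illustration under stated assumptions: the data frame `games` is a hypothetical, simulated stand-in that reuses the column names `home_team_runs` and `away_team_runs`, and the paired t-test is one possible choice, not necessarily the analysis intended here. Run counts are skewed, so the test leans on the large number of games.

```r
# Simulated stand-in for the real data (assumption: NOT actual game results).
# Only the column names match the data set described in this document.
set.seed(2)
games <- data.frame(
  away_team_runs = rpois(300, lambda = 4.1),
  home_team_runs = rpois(300, lambda = 4.3)
)

# Per-game difference in runs: positive values mean the home team scored more.
run_diff <- games$home_team_runs - games$away_team_runs
mean_diff <- mean(run_diff)

# Paired t-test, pairing the two scores from the same game.
paired_test <- t.test(games$home_team_runs, games$away_team_runs, paired = TRUE)

print(mean_diff)
print(paired_test$conf.int)
```

With the real data, `games` would be replaced by the data frame read from the CSV file.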

Data Description

Variable            Description
game_date           date the game was played, in yyyy-mm-dd format
game_num            type of game played: 0 = single game, 1 = first game of doubleheader, 2 = second game of doubleheader
away_team_abbrev    code for the name of the away team
away_team_runs      number of runs scored by the away team
home_team_abbrev    code for the name of the home team
home_team_runs      number of runs scored by the home team
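
# Read the comma-separated file; the first row holds the variable names
mlb_data <- read.table("data/mlb_games_2011_2015.csv", header = TRUE, sep = ",")
# Show the first six games
head(mlb_data)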
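
After loading, the date and game-type columns can be converted to more convenient types. The following is a minimal sketch, assuming the column layout described above. The data frame `example_games`, its team codes, and its scores are hypothetical placeholders, and the derived columns `game_type` and `season` are names introduced here for illustration only.

```r
# Hypothetical rows that only mimic the column layout described above;
# the team codes and scores are placeholders, NOT real games.
example_games <- data.frame(
  game_date = c("2011-04-01", "2011-04-02", "2011-04-02"),
  game_num = c(0, 1, 2),
  away_team_abbrev = c("AAA", "BBB", "BBB"),
  away_team_runs = c(3, 5, 2),
  home_team_abbrev = c("CCC", "DDD", "DDD"),
  home_team_runs = c(4, 1, 2),
  stringsAsFactors = FALSE
)

# Parse the yyyy-mm-dd strings into Date objects.
example_games$game_date <- as.Date(example_games$game_date, format = "%Y-%m-%d")

# Turn the 0/1/2 codes into a labelled factor.
example_games$game_type <- factor(
  example_games$game_num,
  levels = c(0, 1, 2),
  labels = c("single game", "doubleheader game 1", "doubleheader game 2")
)

# Extract the calendar year of each game.
example_games$season <- as.integer(format(example_games$game_date, "%Y"))

str(example_games)
```

The same three conversions could be applied to `mlb_data` in place of `example_games`.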

Data Files

Objectives

We will consider modeling the number of runs scored by the home and away teams in major league baseball games from 2011 through 2015. The goal is to find distributions that fit these data well, and to estimate the associated parameters.
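
As a rough illustration of fitting distributions to run counts and estimating their parameters, the sketch below fits a Poisson and a negative binomial distribution by maximum likelihood and compares them by AIC. These two families are assumptions chosen here as common models for counts, not candidates named in this document. The vector `runs` is simulated and merely stands in for a column such as `mlb_data$home_team_runs`; the helper `nb_negloglik` is a hypothetical name.

```r
# Simulated stand-in for a runs column (assumption: NOT real game results).
set.seed(1)
runs <- rnbinom(500, size = 4, mu = 4.2)

# Poisson: the maximum likelihood estimate of the rate is the sample mean.
lambda_hat <- mean(runs)

# Negative binomial: minimize the negative log-likelihood numerically.
# Parameters are optimized on the log scale so they stay positive.
nb_negloglik <- function(par, x) {
  -sum(dnbinom(x, size = exp(par[1]), mu = exp(par[2]), log = TRUE))
}
nb_fit <- optim(c(0, log(mean(runs))), nb_negloglik, x = runs)
nb_hat <- c(size = exp(nb_fit$par[1]), mu = exp(nb_fit$par[2]))

# Compare the two fits by log-likelihood and AIC (lower AIC is better).
pois_loglik <- sum(dpois(runs, lambda_hat, log = TRUE))
nb_loglik <- -nb_fit$value
aic <- c(poisson = 2 * 1 - 2 * pois_loglik,
         negbin  = 2 * 2 - 2 * nb_loglik)

print(lambda_hat)
print(nb_hat)
print(aic)
```

With the real data, the same steps could be run separately on the home and away run columns to compare the estimated parameters.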