This data set contains the number of runs scored by the home and away teams in every game played during the 2011 through 2015 seasons.
These data contain the results of every major league baseball game played from 2011 through 2015. The number of runs scored by the home team and by the away team is given for each game. We are interested in modeling the number of runs scored and in determining whether it differs between the home and away teams.
Variable | Description |
---|---|
game_date | date the game was played, in yyyy-mm-dd format |
game_num | type of game played: 0 = single game, 1 = first game of doubleheader, 2 = second game of doubleheader |
away_team_abbrev | code for the name of the away team |
away_team_runs | number of runs scored by the away team |
home_team_abbrev | code for the name of the home team |
home_team_runs | number of runs scored by the home team |
mlb_data = read.table("data/mlb_games_2011_2015.csv", header = TRUE, sep = ",")
head(mlb_data)
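As a first look at whether home and away teams score differently, one option is to compare the two run columns game by game. The sketch below is only an illustration: the helper name `summarize_home_away` and the toy data frame are hypothetical, and the column names are taken from the variable table above.

```r
# Hypothetical helper (not part of the original analysis): summarizes
# home and away scoring from a data frame with the columns
# home_team_runs and away_team_runs.
summarize_home_away <- function(games) {
  # Per-game difference: positive values mean the home team outscored the away team
  run_diff <- games$home_team_runs - games$away_team_runs
  c(
    mean_home = mean(games$home_team_runs),
    mean_away = mean(games$away_team_runs),
    mean_diff = mean(run_diff),
    # Standard error of the mean paired difference
    se_diff   = sd(run_diff) / sqrt(length(run_diff))
  )
}

# Toy data for illustration only; these are NOT real game results
toy_games <- data.frame(
  home_team_runs = c(5, 3, 0, 7),
  away_team_runs = c(2, 4, 1, 3)
)
toy_summary <- summarize_home_away(toy_games)
print(toy_summary)
```

Once the file has been read in, the same call can be made on the real data with `summarize_home_away(mlb_data)`. A mean difference that is large relative to its standard error would suggest a home-field effect, although a formal test would still be needed.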
We will consider modeling the number of runs scored by the home and away teams in major league baseball games from 2011 through 2015. The goal is to find distributions that fit these data well and to estimate the associated parameters.
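To make the fitting step concrete, here is a minimal sketch that assumes a Poisson distribution as one candidate model for run counts. The Poisson choice is an assumption made for illustration, not necessarily the distribution that fits best, and the helper name `fit_poisson_runs` and the toy vector are hypothetical. The maximum likelihood estimate of the Poisson rate is the sample mean, and comparing observed proportions with fitted probabilities gives a rough check of fit.

```r
# Hypothetical helper: fits a Poisson model to a vector of run counts.
# The Poisson distribution is an assumed candidate, used here for illustration.
fit_poisson_runs <- function(runs) {
  runs <- runs[!is.na(runs)]
  lambda_hat <- mean(runs)  # MLE of the Poisson rate parameter
  run_values <- 0:max(runs)
  # Observed proportion of games at each run total (zero if never observed)
  observed <- as.numeric(table(factor(runs, levels = run_values))) / length(runs)
  # Probability of each run total under the fitted Poisson model
  expected <- dpois(run_values, lambda_hat)
  list(
    lambda = lambda_hat,
    comparison = data.frame(runs = run_values, observed = observed, expected = expected)
  )
}

# Toy data for illustration only; these are NOT real run totals
toy_runs <- c(0, 2, 3, 3, 4, 5, 7)
toy_fit <- fit_poisson_runs(toy_runs)
print(toy_fit$lambda)
print(toy_fit$comparison)
```

With the data loaded, `fit_poisson_runs(mlb_data$home_team_runs)` and `fit_poisson_runs(mlb_data$away_team_runs)` would give separate fits for the home and away teams. If the observed proportions depart noticeably from the fitted probabilities, for example if the sample variance is well above the sample mean, a more flexible distribution may be worth considering.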