Authors: David D. Tung
In this paper, we will investigate the problem of obtaining confidence intervals for a baseball team's Pythagorean expectation, i.e. their expected winning percentage and expected games won. We study this problem from two different perspectives. First, in the framework of regression models, we obtain confidence intervals for prediction, i.e. more formally, prediction intervals for a new observation, on the basis of historical binomial data for Major League Baseball teams from the 1901 through 2009 seasons, and apply this to the 2009 MLB regular season. We also obtain a Scheffé-type simultaneous prediction band and use it to tabulate predicted winning percentages and their prediction intervals, corresponding to a range of values for log(RS=RA). Second, parametric bootstrap simulation is introduced as a data-driven, computer-intensive approach to numerically computing confidence intervals for a team's expected winning percentage. Under the assumption that runs scored per game and runs allowed per game are random variables following independent Weibull distributions, we numerically calculate confidence intervals for the Pythagorean expectation via parametric bootstrap simulation on the basis of each team's runs scored per game and runs allowed per game from the 2009 MLB regular season. The interval estimates, from either framework, allow us to infer with better certainty as to which teams are performing above or below expectations. It is seen that the bootstrap confidence intervals appear to be better at detecting which teams are performing above or below expectations than the prediction intervals obtained in the regression framework.
Comments: 27 Pages.
[v1] 8 May 2010
Unique-IP document downloads: 818 times
Add your own feedback and questions here:
You are equally welcome to be positive or negative about any paper but please be polite. If you are being critical you must mention at least one specific error, otherwise your comment will be deleted as unhelpful.