# Stats Term Paper

Pages: 5 (1419 words) Published: January 8, 2013
The Probability of On-Base Percentage Affecting Total Team Wins

I. Abstract
In the baseball world, On-Base Percentage is the key element to a team’s advancement to post-season play, because offensive domination is essential to the desired outcome, namely, winning. This paper describes the relationship between On-Base Percentage and the total number of wins a Major League Baseball team earns, using a linear regression model and descriptive graphs.

II. Introduction

Essentially, scoring runs is the only way to win games in baseball. However, if players cannot get on base, no one can cross home plate, and the team loses. So, does the number of games a baseball team wins correlate with the team’s On-Base Percentage? First, what contributes to a team’s On-Base Percentage: the total number of hits? Walks? On-Base Percentage is calculated from the following five statistics: Walks, Hits, At-Bats, Hit-by-Pitches, and Sacrifice Flies. (A sacrifice fly is a fly ball that is caught for an out but still allows a teammate to score a run, so the batter gives up his own chance to reach base.)
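As a minimal sketch of how these five statistics combine, the standard formula divides the times a batter reaches base (hits, walks, hit-by-pitches) by the plate appearances that count toward On-Base Percentage (at-bats, walks, hit-by-pitches, sacrifice flies). The function name and the sample totals below are illustrative assumptions, not figures from the Red Sox data set.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    times_on_base = hits + walks + hit_by_pitch
    opportunities = at_bats + walks + hit_by_pitch + sacrifice_flies
    if opportunities == 0:
        raise ValueError("OBP is undefined with no qualifying plate appearances")
    return times_on_base / opportunities


# Hypothetical totals, chosen only to illustrate the arithmetic.
example_obp = on_base_percentage(
    hits=150, walks=50, hit_by_pitch=5, at_bats=500, sacrifice_flies=5
)
print(f"{example_obp:.3f}")  # 205 / 560, prints 0.366
```

Note that sacrifice flies appear only in the denominator, so they lower On-Base Percentage even though they can drive in a run.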

“The Oakland A’s and several other teams have found great success by fielding a competitive team by stressing on-base-percentage…” (Lewis 2003). This research paper uses statistical and descriptive analysis to show that On-Base Percentage affects a team’s total number of wins. The data set consists of the last 30 seasons of the Boston Red Sox, showing their On-Base Percentage and their total number of wins.

III. Descriptive Statistics

This section describes the main features of the data set quantitatively. It reports several statistics for each variable, such as the mean, standard deviation, minimum, maximum, and range. These are just a few of the statistics that form the basis of most quantitative data analysis. Table 1 shows the descriptive statistics for both variables, X and Y:

Table 1

| Statistic | On-Base Percentage (X) | Wins (Y) |
| --- | --- | --- |
| Mean | 0.346266667 | 85.73333 |
| Standard Error | 0.00211178 | 1.728685 |
| Median | 0.3485 | 86 |
| Mode | 0.352 | 95 |
| Standard Deviation | 0.011566698 | 9.468399 |
| Sample Variance | 0.000133789 | 89.65057 |
| Kurtosis | 0.733099408 | 3.056961 |
| Skewness | -0.94043351 | -1.40174 |
| Range | 0.047 | 44 |
| Minimum | 0.315 | 54 |
| Maximum | 0.362 | 98 |
| Sum | 10.388 | 2572 |
| Count | 30 | 30 |
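
Most entries in Table 1 can be recomputed from the season values listed in Tables 2.1-2.3. The following is a minimal sketch using only Python's standard library, not necessarily the tool used to produce the table. The `describe` helper is a hypothetical name, and skewness and kurtosis are left out because the standard library has no built-in for them.

```python
import math
import statistics

# Season values transcribed from Tables 2.1-2.3 (Boston Red Sox, 1983-2012).
obp = [
    .336, .341, .347, .346, .352, .357, .351, .344, .340, .321,  # 1983-1992
    .330, .334, .357, .359, .352, .348, .350, .341, .334, .345,  # 1993-2002
    .360, .360, .357, .351, .362, .358, .352, .339, .349, .315,  # 2003-2012
]
wins = [
    78, 86, 81, 95, 78, 89, 83, 88, 84, 73,  # 1983-1992
    80, 54, 86, 85, 78, 92, 94, 85, 82, 93,  # 1993-2002
    95, 98, 95, 86, 96, 95, 95, 89, 90, 69,  # 2003-2012
]


def describe(values):
    """Return the basic descriptive statistics for one variable."""
    n = len(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return {
        "mean": statistics.mean(values),
        "standard_error": sd / math.sqrt(n),
        "median": statistics.median(values),
        "standard_deviation": sd,
        "sample_variance": statistics.variance(values),
        "range": max(values) - min(values),
        "minimum": min(values),
        "maximum": max(values),
        "sum": math.fsum(values),
        "count": n,
    }


obp_stats = describe(obp)
wins_stats = describe(wins)
for name in obp_stats:
    print(f"{name:>20}  {obp_stats[name]:<14.9g} {wins_stats[name]:.7g}")
```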

IV. Model & Data

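The simple linear regression model is used “to model the relationship between a scalar dependent variable Y and one or more explanatory variables denoted X” (Levine 411). This paper uses this model to form the linear regression equation $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where $Y_i$ is the number of wins in season $i$, $X_i$ is the On-Base Percentage in that season, $\beta_0$ is the intercept, $\beta_1$ is the slope, and $\varepsilon_i$ is the random error term.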
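
To illustrate how the intercept and slope of this equation can be estimated, here is a minimal sketch of ordinary least squares in closed form, applied to the season values from Tables 2.1-2.3. The function name `fit_simple_ols` is a hypothetical helper, and ordinary least squares is assumed as the fitting method, since that is the usual estimator for this model.

```python
from statistics import mean

# Season values transcribed from Tables 2.1-2.3 (Boston Red Sox, 1983-2012).
obp = [
    .336, .341, .347, .346, .352, .357, .351, .344, .340, .321,  # 1983-1992
    .330, .334, .357, .359, .352, .348, .350, .341, .334, .345,  # 1993-2002
    .360, .360, .357, .351, .362, .358, .352, .339, .349, .315,  # 2003-2012
]
wins = [
    78, 86, 81, 95, 78, 89, 83, 88, 84, 73,  # 1983-1992
    80, 54, 86, 85, 78, 92, 94, 85, 82, 93,  # 1993-2002
    95, 98, 95, 86, 96, 95, 95, 89, 90, 69,  # 2003-2012
]


def fit_simple_ols(x, y):
    """Return (b0, b1), the intercept and slope minimizing squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx           # estimated slope
    b0 = y_bar - b1 * x_bar    # estimated intercept
    return b0, b1


b0, b1 = fit_simple_ols(obp, wins)

# Residuals are the observed wins minus the wins predicted by the fitted line.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(obp, wins)]
ss_res = sum(e ** 2 for e in residuals)
ss_tot = sum((yi - mean(wins)) ** 2 for yi in wins)
r_squared = 1 - ss_res / ss_tot  # share of the variation in wins explained by OBP

print(f"intercept = {b0:.3f}, slope = {b1:.3f}, R^2 = {r_squared:.4f}")
```

The slope is expressed in wins per full unit of On-Base Percentage, so dividing it by 1000 gives the estimated change in wins for each additional point (.001) of On-Base Percentage.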

Table 2.1

| Year | OBP | Wins |
| --- | --- | --- |
| 1983 | .336 | 78 |
| 1984 | .341 | 86 |
| 1985 | .347 | 81 |
| 1986 | .346 | 95 |
| 1987 | .352 | 78 |
| 1988 | .357 | 89 |
| 1989 | .351 | 83 |
| 1990 | .344 | 88 |
| 1991 | .340 | 84 |
| 1992 | .321 | 73 |

Table 2.2

| Year | OBP | Wins |
| --- | --- | --- |
| 1993 | .330 | 80 |
| 1994 | .334 | 54 |
| 1995 | .357 | 86 |
| 1996 | .359 | 85 |
| 1997 | .352 | 78 |
| 1998 | .348 | 92 |
| 1999 | .350 | 94 |
| 2000 | .341 | 85 |
| 2001 | .334 | 82 |
| 2002 | .345 | 93 |

Table 2.3

| Year | OBP | Wins |
| --- | --- | --- |
| 2003 | .360 | 95 |
| 2004 | .360 | 98 |
| 2005 | .357 | 95 |
| 2006 | .351 | 86 |
| 2007 | .362 | 96 |
| 2008 | .358 | 95 |
| 2009 | .352 | 95 |
| 2010 | .339 | 89 |
| 2011 | .349 | 90 |
| 2012 | .315 | 69 |

The data set provides the measurements of both variables that the tables and graphs present. Tables 2.1-2.3 above use data taken from Baseball-Reference.com (Red Sox Seasons > 1983-2012 > OBP). They cover every Red Sox season from 1983 to 2012 (30 seasons). The data gives each season’s On-Base...