Regression Analysis for the Pricing of Players in the Indian Premier League

Executive Summary

The selling price of players at the IPL auction is affected by more than one factor. Many of these factors affect each other, and others impact the selling price only indirectly. With more than 25 independent variables and no clear relationship among them, the challenge in a multiple regression analysis is to form the regression model as carefully as possible. Of the various tools available, we used SPSS to run our regression analysis. One reason for preferring SPSS over the others was the ease with which extraneous independent variables can be eliminated. The two methodologies used for choosing the best model in this project are:

* Forward Model Building: Independent variables are added to the model incrementally, in order of their significance, until we reach the optimum model.
* Backward Elimination: The complete set of independent variables is regressed, and the least significant predictors are eliminated in turn to arrive at the optimum model.
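The backward elimination procedure described above can be sketched in a few lines of Python. This is only an illustration, not the SPSS procedure used in this project: it assumes NumPy is available, uses a fixed cutoff of |t| < 2 as a rough stand-in for a 5% significance test, and the function names (`ols_t_stats`, `backward_eliminate`) are hypothetical.

```python
import numpy as np

def ols_t_stats(X, y):
    """Fit OLS with an intercept and return (coefficients, t statistics)."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])            # prepend the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)    # least-squares coefficients
    resid = y - A @ beta
    dof = n - A.shape[1]                            # residual degrees of freedom
    sigma2 = resid @ resid / dof                    # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)           # coefficient covariance matrix
    return beta, beta / np.sqrt(np.diag(cov))

def backward_eliminate(X, y, names, t_cutoff=2.0):
    """Start from all predictors and drop the weakest one until all pass the cutoff.

    t_cutoff is an assumed rule of thumb, not the criterion SPSS applies.
    """
    keep = list(range(X.shape[1]))
    while keep:
        _, t = ols_t_stats(X[:, keep], y)
        abs_t = np.abs(t[1:])                       # ignore the intercept
        weakest = int(np.argmin(abs_t))
        if abs_t[weakest] >= t_cutoff:
            break                                   # every remaining predictor is significant
        del keep[weakest]                           # remove the least significant predictor
    return [names[i] for i in keep]

# Synthetic demonstration data (not the IPL data set).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 2] + rng.normal(scale=0.5, size=200)
selected = backward_eliminate(X_demo, y_demo, ["x0", "x1", "x2", "x3"])
print(selected)
```

Forward model building follows the same pattern in reverse: start from an empty model and, at each step, add the candidate variable with the largest |t| as long as it clears the cutoff.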

Our analysis has shown that the following variables are the most significant predictors of the selling price:

* COUNTRY : whether the player is of Indian origin or not
* AGE_1 : whether the player is below 25 years or not
* T_RUNS : total number of test runs scored by the player
* ODI_RUNS : total number of runs scored in ODI matches
* ODI_WICKET : total number of wickets taken by the player
* RUNS_S : total number of runs scored by the player
* BASE_PRICE : the base price of the player set in IPL

Using the calculated coefficients, the regression model equation can be stated as below:

SOLD PRICE = -13366.247 + 219850.349 (COUNTRY) + 204492.531 (AGE_1) - 59.957 (T_RUNS) + 53.878 (ODI_RUNS) + 491.636 (ODI_WICKET) + 194.445 (RUNS_S) + 1.442 (BASE_PRICE)
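To make the equation concrete, the sketch below evaluates it for a single player. The coefficients are copied from the equation above; the coding of the dummy variables (COUNTRY = 1 for an Indian player, AGE_1 = 1 for a player below 25 years) is an assumption based on the variable descriptions, and the example player is entirely hypothetical.

```python
# Coefficients taken from the fitted regression equation in the text.
INTERCEPT = -13366.247
COEFFICIENTS = {
    "COUNTRY": 219850.349,    # assumed coding: 1 = Indian, 0 = otherwise
    "AGE_1": 204492.531,      # assumed coding: 1 = below 25 years, 0 = otherwise
    "T_RUNS": -59.957,
    "ODI_RUNS": 53.878,
    "ODI_WICKET": 491.636,
    "RUNS_S": 194.445,
    "BASE_PRICE": 1.442,
}

def predict_sold_price(player):
    """Return the model's predicted selling price for a dict of predictor values."""
    return INTERCEPT + sum(coef * player[name] for name, coef in COEFFICIENTS.items())

# Hypothetical player, made up for illustration only.
example_player = {
    "COUNTRY": 1,
    "AGE_1": 0,
    "T_RUNS": 0,
    "ODI_RUNS": 1000,
    "ODI_WICKET": 10,
    "RUNS_S": 500,
    "BASE_PRICE": 100000,
}
print(round(predict_sold_price(example_player), 3))
```

Each coefficient is the change in predicted price for a one-unit change in that variable with the others held fixed, so in this model every additional test run lowers the prediction by 59.957 and every additional unit of base price raises it by 1.442.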

Analysis of Results

* The following is a snapshot of the estimated best regression model (explained in depth as part of the answer to Q1):

Model Summary

Model | R     | R Square | Adjusted R Square | Std. Error of the Estimate
1     | .772a | .597     | .573              | 265690.463

a. Predictors: (Constant), BASE_PRICE, AGE_1, RUNS_S, ODI_WICKET, COUNTRY, T_RUNS, ODI_RUNS

* In the estimated regression model, BASE_PRICE is found to be the highest-impact predictor. This implies that the base price of a player is the single strongest determinant of his selling price.
* The analysis shows that T_RUNS, i.e. the number of runs scored in test matches, negatively impacts the selling price of the player. It may seem surprising, though on reflection it is not unexpected, that superior performance by a batsman in test matches reduces his worth in IPL auctions.
* The positive coefficient of AGE_1 indicates that players below 25 years are expected to command a higher price than older players.
* Players from India are expected to command much higher bids than their foreign counterparts, as evidenced by the positive coefficient of COUNTRY.
* The total number of runs scored by a player (RUNS_S) positively impacts his selling price.
* The R Square value of the model is 0.597 (and the adjusted R Square value is 0.573). The model therefore explains only about 60% of the variation in selling price, which indicates that our regression model has limitations.
* The standard error of the estimate is large, at 265690.463.
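The gap between R Square and adjusted R Square comes from the standard penalty for the number of predictors. A minimal sketch of that formula follows. The sample size is not stated in this section, so `n_obs` below is a hypothetical input, and the function name is illustrative.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof

# R Square and the seven predictors come from the Model Summary;
# n_obs=100 is a made-up sample size used only to show the call.
print(adjusted_r_squared(0.597, n_obs=100, n_predictors=7))
```

Because the penalty grows with the number of predictors, adding a variable that explains little (as happens in stepwise model building) can lower adjusted R Square even though R Square itself never decreases.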

Q3 What is the impact of ability to score “SIXERS” on the player’s price?

To analyze the impact of the variable “SIXERS”, we add it to the regression model. We then observe that the probability of the t statistic for SIXERS is 0.862 and the corresponding value for RUNS_S is 0.0504, which places it in the rejection region. So this means that the impact of this...