FAQ

The Wages of Wins appeared in 2006.  Since that time, a number of questions and comments have appeared again and again in this forum (and in other places).  What follows are the responses to these questions and comments (some of which already appeared in Stumbling on Wins).  One should note… this page is a work in progress.  And when it is completed it will be placed at Stumblingonwins.com.

Why are the player evaluations from Wins Produced so different from what people “know”?  Don’t you ever watch a basketball game?  Are you telling me player X is better than player Y?  Are you serious? 

These questions all express a similar sentiment.  What we see from Wins Produced is often different from what we see from the conventional wisdom.  And this leads a few people to wonder if economists are capable of watching a basketball game. 

This issue was addressed in Stumbling on Wins.  Here is what we say in Appendix A of the book:

Chapter Three reports the leaders in Wins Produced from the 2008-09 season.  This list reports a few names – like Chris Paul, LeBron James, Dwyane Wade, and Dwight Howard – that people generally believe are among the best.  But how many NBA fans would rank Troy Murphy, David Lee, and Antonio McDyess among the best players in the game?   

Our response to this complaint is that it’s essentially true. Wins Produced is inconsistent with common perceptions of player performance.  Common perceptions are driven by points scored.  Non-scoring factors tend to be minimized or ignored.  Given this disconnect between how the factors are perceived and the impact these factors have on wins, it’s not surprising that a model that measures wins would give results that differ with how people perceive the game.  Or to put it another way, both PERs and NBA Efficiency are consistent with perceptions of performance; but neither is very consistent with wins. 

Let’s add to this… yes, we have seen a basketball game or two.  And yes, we are serious.

The box score statistics in basketball can’t measure a player’s performance accurately because these do not take into account the impact of teammates. 

Some people believe that the NBA box score doesn’t do a very good job of capturing a player’s contribution to team success.  In Stumbling on Wins, we express disagreement with this sentiment.  Here is what we specifically said in Appendix A: 

Statistics from football tend to be inconsistent. This suggests a player’s numbers are influenced by his teammates.  Although it’s suspected this is true in the NBA, the consistency of performance across time suggests that teammates don’t have much impact on an individual player’s productivity.  Consequently, it seems safe to assume that the statistics tracked for an individual player represent that player’s contribution to team success. 

Let’s add to this… J.C. Bradbury argues there are two issues one needs to consider in evaluating a performance metric (see Chapter Three of Stumbling on Wins).  First, the measure should explain current outcomes.  In addition, the measure should be consistent across time.  Wins Produced satisfies both criteria.

So we tend to think – as we note in the book – that Wins Produced is a “reasonable” measure of a player’s contribution.

And as will be noted, measures like plus-minus and adjusted plus-minus – which are not based on the box score – are not very consistent across time.  This suggests these measures do not accurately capture the performance of an individual player.

What about Alternative Box Score Measures? [much of this discussion is taken from Stumbling on Wins and was originally noted in Berri and Bradbury (2010)]

Relative to baseball, the statistics tabulated for basketball players have a stronger link to current wins.  Basketball players are also more consistent across time, suggesting that the statistics tracked for basketball players are more often about the player being examined (and not the player’s teammates or luck).

Despite better data, research in basketball has one serious handicap.  A researcher in need of a performance measure for a baseball player can turn to an established measure, such as OPS or other more complex statistics. When we first started conducting research in basketball, though, it became clear that existing metrics didn’t capture productivity very accurately.

Consider the two most commonly cited measures, NBA Efficiency and Game Score.  The former is quite similar to Dave Heeren’s TENDEX model, which Heeren developed in 1959 and which is perhaps the oldest summary statistic in basketball.  The latter is the simplified version of John Hollinger’s Player Efficiency Rating (PER).  Although PER makes a number of adjustments beyond the Game Score formulation seen below, the end results are essentially the same.  For the 2008-09 season, PER and Game Score per 48 minutes for the 445 NBA players employed had a 0.99 correlation.

NBA Efficiency = PTS + ORB + DRB + STL + BLK + AST – TO – All Missed Shots

Game Score = PTS + 0.4 * FGM – 0.7 * FGA – 0.4*(FTA – FTM) + 0.7 * ORB + 0.3 * DRB + STL + 0.7 * AST + 0.7 * BLK – 0.4 * PF – TO

A similar story is told about NBA Efficiency and Game Score.  These measures look different, but for the 2008-09 season there was a 0.99 correlation between a player’s NBA Efficiency and Game Score.
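To make the two formulas concrete, here is a minimal Python sketch.  The dictionary keys and the sample stat line are hypothetical illustrations (not data from the book), and “All Missed Shots” is assumed here to mean missed field goals plus missed free throws.

```python
def nba_efficiency(s):
    """NBA Efficiency as written above (assumes misses = missed FG + missed FT)."""
    missed_fg = s["FGA"] - s["FGM"]
    missed_ft = s["FTA"] - s["FTM"]
    return (s["PTS"] + s["ORB"] + s["DRB"] + s["STL"] + s["BLK"] + s["AST"]
            - s["TO"] - missed_fg - missed_ft)


def game_score(s):
    """Game Score as written above."""
    return (s["PTS"] + 0.4 * s["FGM"] - 0.7 * s["FGA"]
            - 0.4 * (s["FTA"] - s["FTM"])
            + 0.7 * s["ORB"] + 0.3 * s["DRB"] + s["STL"]
            + 0.7 * s["AST"] + 0.7 * s["BLK"]
            - 0.4 * s["PF"] - s["TO"])


# Hypothetical single-game stat line, for illustration only.
line = {"PTS": 20, "FGM": 8, "FGA": 16, "FTM": 4, "FTA": 5,
        "ORB": 2, "DRB": 6, "STL": 1, "BLK": 1, "AST": 4,
        "TO": 3, "PF": 2}

eff = nba_efficiency(line)
gs = game_score(line)
```

Both functions take the same stat line, which makes it easy to see that the two measures reward the same things with slightly different weights.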

These measures all align because each tells a similar story about player scoring.  For example, imagine a player who takes twelve shots from two-point range.  If he makes four shots, his NBA Efficiency will rise by eight.  The eight misses, though, will cause his value to decline by eight. So a player breaks even with respect to NBA Efficiency by converting on 33% of his shots from two-point range.  From three-point range, a player only needs to make 25% of his shots to break even.

Most NBA players can exceed these thresholds.  Therefore, the more shots most NBA players take, the higher their NBA Efficiency totals will be.  As a consequence, players who take a large number of shots tend to dominate the player rankings produced by this measure.

For Game Score the same problem exists, only the problem is a bit worse.  As detailed at stumblingonwins.com, the break-even point on two-point shots for Game Score is 29.2%.  From three-point range a player breaks even if he hits on 20.6% of his shots.  If a player surpasses these break-even points – and again, most players can do this – then the more shots he takes, the higher his value will be.
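The break-even percentages quoted above follow directly from the formulas.  If a made shot adds g to a measure and a missed shot subtracts m, the break-even shooting percentage p solves p*g - (1 - p)*m = 0, so p = m / (g + m).  The short Python sketch below works through that arithmetic; the function name is our own label for illustration.

```python
def break_even_pct(gain_per_make, loss_per_miss):
    """Shooting percentage p where p * gain - (1 - p) * loss = 0."""
    return loss_per_miss / (gain_per_make + loss_per_miss)


# NBA Efficiency: a make adds the points scored, a miss subtracts 1.
nba_eff_2pt = break_even_pct(2, 1)
nba_eff_3pt = break_even_pct(3, 1)

# Game Score: a make adds points + 0.4 (FGM) - 0.7 (FGA);
# a miss costs the 0.7 charged for the attempt.
gs_2pt = break_even_pct(2 + 0.4 - 0.7, 0.7)
gs_3pt = break_even_pct(3 + 0.4 - 0.7, 0.7)

for label, p in [("NBA Efficiency, 2-pt", nba_eff_2pt),
                 ("NBA Efficiency, 3-pt", nba_eff_3pt),
                 ("Game Score, 2-pt", gs_2pt),
                 ("Game Score, 3-pt", gs_3pt)]:
    print(f"{label}: {100 * p:.1f}%")
```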

Because these measures reward a player for just taking shots, they don’t tend to explain wins very well.  As detailed at stumblingonwins.com, a team’s NBA Efficiency only explains 32% of the variation in team wins.  A team’s Game Score and PER explain 31% and 33% of the variation in wins, respectively.  One might note, though, that these measures don’t include the team defensive adjustment employed in the calculation of Wins Produced.  Unfortunately, if you add the team defensive adjustment to NBA Efficiency, Game Score, and PER, explanatory power only rises to 58%, 60%, and 56%, respectively.

What about Plus-Minus and Adjusted Plus-Minus? [much of this discussion is taken from Stumbling on Wins and was originally noted in Berri and Bradbury (2010)]

The problems with NBA Efficiency and the Player Efficiency Rating were also noted by Wayne Winston (2009).  Winston sought to solve these problems by developing adjusted plus-minus.  Here are some thoughts on this approach.

The plus-minus approach involves looking at how many points a team scores and allows when a player is on and off the court.  Because a player’s plus-minus depends on the quality of his teammates, though, people have turned to a measure called adjusted plus–minus. This approach involves employing a regression that is designed to control for the impact of a player’s teammates.

Although an attempt is made to control for player interactions, as Berri and Bradbury (2010) noted, an examination of 239 players revealed that only 7% of the variation in a player’s adjusted plus–minus value in 2008-09 was explained by what he did in 2007-08.

And if we turn to a sample of 87 players who switched teams in these years, only 1% of the variation in adjusted plus–minus in 2008-09 was explained by the player’s adjusted plus–minus in 2007–08. Furthermore, the relationship between performances in each of these seasons—for the players who switched teams—was statistically insignificant. So if we change all of a player’s teammates, his adjusted plus–minus appears to change as well.
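The “percent of variation explained” figures above are, for a simple one-variable regression, the square of the correlation between a player’s value in one season and his value in the next.  Here is a minimal Python sketch of that calculation; the ratings are hypothetical numbers made up for illustration, not values from BasketballValue.com.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def share_explained(last_season, this_season):
    """R-squared of a one-variable regression: the squared correlation."""
    return pearson_r(last_season, this_season) ** 2


# Hypothetical ratings for the same five players in consecutive seasons.
rating_0708 = [3.1, -1.2, 0.4, 5.0, -2.7]
rating_0809 = [-0.5, 2.0, 1.1, 1.8, -3.0]

r2 = share_explained(rating_0708, rating_0809)
```

A value of r2 near zero means last season’s number tells us almost nothing about this season’s number.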

There is another issue with adjusted plus–minus noted by Berri and Bradbury (2010).  For each player, a coefficient is estimated that represents a player’s value, theoretically holding all else constant. Each coefficient comes with a standard error, and the size of these errors suggests that for the vast majority of players, one cannot differentiate the adjusted plus–minus coefficient from zero. In general, if a coefficient is twice the size of the standard error, then one is 95% confident that the coefficient is actually different from zero (i.e. there is only a 5% chance that the coefficient is zero). Of the 666 player observations from the 2007-08 and 2008-09 seasons, only 10% had a coefficient that was twice the value of the standard error. Only 20% of coefficients were at least 1.5 times the value of the standard errors. In sum, for most players it appears the results are not statistically significant, and therefore one cannot say whether most players (according to adjusted plus–minus) have any impact on team outcomes at all.
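The counting exercise described above is simple to reproduce once one has the coefficients and standard errors.  The Python sketch below shows the idea; the coefficients and standard errors are hypothetical values chosen for illustration, and the function name is our own.

```python
def share_significant(coefs, std_errs, threshold=2.0):
    """Fraction of coefficients at least `threshold` standard errors from zero."""
    hits = sum(1 for b, se in zip(coefs, std_errs) if abs(b) >= threshold * se)
    return hits / len(coefs)


# Hypothetical adjusted plus-minus coefficients and standard errors.
coefs = [4.0, -1.0, 2.5, 0.5, -6.5]
std_errs = [1.5, 2.0, 1.6, 3.0, 3.0]

at_two = share_significant(coefs, std_errs, threshold=2.0)
at_one_and_half = share_significant(coefs, std_errs, threshold=1.5)
```

Lowering the threshold from 2 to 1.5 can only keep or raise the share, which is why the 1.5 figures reported here are always the larger ones.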

Proponents of adjusted plus–minus have argued that increasing the amount of data results in smaller standard errors. This is true. BasketballValue.com reports coefficients for 292 players who played in both 2007–08 and 2008-09. For this data set, 15% of players had a coefficient that was twice the value of a standard error.  Looking at the 1.5 threshold, 26.0% of coefficients surpass this mark. An even greater gain is seen if five years of player data is examined. Examining the results for 373 players who played for five seasons, one sees that 39% of coefficients are at least twice the value of the standard error. And 50% surpass the 1.5 threshold.

Although more data does increase the level of statistical significance, it’s still the case that most players—even when five years of data is employed—are not found by this method to have a statistically significant impact on outcomes.

Adjusted plus-minus is designed to account for everything a player does on the court, including on-the-ball defense. The box score data—as proponents of adjusted plus-minus note—does not fully measure a player’s contribution to defense. Consequently, when a disparity between a box score measure and adjusted plus-minus is uncovered, one might conclude that the disparity reflects the inability of the box score data to capture on-the-ball defense. Unfortunately, such differences might also reflect the substantial noise in adjusted plus-minus. And it is simply not clear how one could tell the difference between the ability to capture defense and the noise in the adjusted plus-minus system.

One should note…the data on adjusted plus–minus comes from BasketballValue.com. This data was compiled by Aaron Barzilai. According to BasketballValue.com, the calculations were done in the spirit of the work of Dan Rosenbaum. Rosenbaum’s work, in turn, is based on the work of Wayne Winston and Jeff Sagarin. We do not have access to the original work of Winston-Sagarin so we cannot say the issues raised apply to the work of Winston-Sagarin. Winston (2009)—in discussing his work—does note that there is “… a lot of noise in the system. It takes many minutes to get an accurate player rating” (p. 215).

Wins Produced Overvalues Rebounds

UPDATE: In the summer of 2011, an adjustment was made to Wins Produced to take into account the diminishing returns we see with respect to defensive rebounds.  As noted below, this is a real effect.  Previously, though, it was argued (as you can see below) that including this effect didn’t change much.  And that is true.  Nevertheless, people often asked to see what the Wins Produced numbers would look like if we took this effect into account.  So rather than have two sets of WP numbers, the model was just updated to take this into account.

What follows are the many arguments offered for why rebounds were valued “correctly” originally.  If you do not like these arguments… well, then you should be happy with the new version of Wins Produced.

Response #1 – The Consistency of Rebounds

As noted in Stumbling on Wins, per-minute rebounding is very consistent across time.  The correlation coefficient for rebounds per-minute – comparing this season to last season in the NBA — is over 0.9.  When you adjust for position played, the coefficient is still 0.83.   To put these numbers in perspective, here are some correlation coefficients of other statistics tracked in team sports: 

  • Baseball – OPS (on-base percentage + slugging percentage): 0.65
  • Baseball – Walks per nine innings: 0.64
  • Baseball – Batting average: 0.47
  • Baseball – Earned run average: 0.37
  • Football – Rushing yards per attempt (for running backs): 0.36
  • Football – QB Rating: 0.35
  • Hockey – Save percentage (for goalies): 0.24
  • Football – Interceptions per attempt: 0.07

Once again, J.C. Bradbury (who provided the correlations for the baseball statistics) argues that when a statistic is consistent across time we can generally believe that the statistic reflects the skill of the player, and not the team around the player.  And we tend to think that statistics like OPS are about the hitter in baseball (even if this correlation is lower than what we see for rebounds).  In contrast, interceptions in football and save percentage in hockey don’t seem to reflect the skills of quarterbacks and goalies.

When we look at rebounds, we see a higher correlation than all of these statistics.  And this high correlation still exists when we consider position played.  This leads one to conclude that rebounding is a skill that is primarily about the player credited with the rebound (an observation made about Kevin Love in Sports Illustrated in the December 20, 2010 issue).  In other words, a player’s rebound totals are not really about his teammates’ ability to play defense or his teammates’ willingness to let the player in question “steal” all their rebounds.

Response #2 — Rebounds Are Not the Same for All Teams

If a player’s rebounds are all “stolen” from his teammates, then teams would have to be getting the same number of rebounds.  So do all teams end up with the same number of rebounds? UPDATE: Don’t have a link to this, but believe it or not, someone actually made the rather silly argument that all teams get the same number of rebounds.  What follows is the response to this silly argument.

This is actually fairly easy to check.  If we want to compare the level of variation in two series of numbers we turn to the coefficient of variation (standard deviation divided by the mean).

The coefficient of variation (for every NBA team from 1990-91 to 2009-10) for various statistics is as follows:

  • defensive efficiency: 0.035
  • offensive efficiency: 0.037
  • defensive rebound percentage: 0.039
  • opponent’s adjusted field goal percentage: 0.040
  • adjusted field goal percentage: 0.043
  • offensive rebound percentage: 0.106

There is more variation in offensive rebound percentage than there is with respect to defensive rebound percentage.  But it is still the case that defensive rebounds vary about as much as offensive efficiency, defensive efficiency, opponent’s adjusted field goal percentage (which has only a -0.20 correlation with defensive rebound percentage), and adjusted field goal percentage.  And since we understand that all teams are not the same with respect to the efficiency measures (at least, I hope that is understood), we should also see that all teams are not the same with respect to defensive rebounds.
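For readers who want to run this check on their own data, here is a minimal Python sketch of the coefficient of variation.  The team values are hypothetical, and the use of the population standard deviation (rather than the sample version) is an assumption; the original calculation may have used either.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


# Hypothetical defensive rebound percentages for five teams (not real data).
drb_pct = [0.71, 0.73, 0.74, 0.76, 0.78]

cv = coefficient_of_variation(drb_pct)
```

Because the measure is scaled by the mean, it lets us compare the spread of series measured in different units, such as efficiency per possession and rebound percentage.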

Response #3 — Do We Overvalue Rebounds (Responding Again and Again)?

Okay, rebounds are consistent and rebounds vary across teams.  Yet, you may still think rebounds are overvalued. So let’s address that question (again).

Back in November of 2006 – or more than four years ago – this question was addressed: Do We Overvalue Rebounds?  The answer to this question involved adjusting the value of a rebound.  At the time, values of 0.7 were arbitrarily chosen.  The results… well, nothing much really changed.

In writing Stumbling on Wins we re-visited this issue.  Here is what we say in our latest book (this is from Appendix A):

The existence of diminishing returns leads some to suspect that the impact of productive players is inflated by Wins Produced.  In Chapter Eight this concern was addressed when it was noted that although diminishing returns do exist, the effect is quite small. 

When one looks at specific statistics, one does see large effects with respect to points scored and field goal attempts.  One also sees an effect with respect to defensive rebounds (although it’s only about half of what we see with respect to scoring).  People tend not to be troubled by the possibility the value of scorers is over-stated.  When people see a player like Ben Wallace (a player known for rebounding) lead the league in Wins Produced in 2001-02, then questions are raised.

To address these concerns, two versions of Position Adjusted Win Score (PAWS) were constructed.  The first only counted half of a player’s rebounds. Re-ranking the players with this adjusted version of PAWS revealed that Ben Wallace was still the top ranked player in the game in 2001-02.  This is because the revised version of PAWS per-minute and WP48 have a 0.95 correlation.  One can also construct PAWS by giving offensive rebounds a weight of 0.7 and defensive rebounds a weight of 0.3 (following Hollinger’s lead).  With these values Ben Wallace was still the top ranked player in 2001-02.   This is also not surprising since this version of PAWS per minute and WP48 still has a 0.95 correlation.

The values chosen for rebounds in the book – like the exercise presented four years ago – were somewhat arbitrary.  Recently WP48 was re-estimated, but this time a defensive rebound would only be worth half as much as points, field goal attempts, offensive rebounds (just to note… we did not find diminishing returns for offensive rebounds), turnovers, and steals.   This adjustment follows the diminishing returns results reported in Stumbling on Wins for defensive rebounds (although we are completely ignoring diminishing returns for all other factors).

The re-estimation of WP48 specifically involved subtracting from each player’s production half the value of their DRB numbers.  The next step involved adding back in half the value of the team’s DRB numbers, according to the minutes each player played (so half a DRB is credited to a player, half is credited to the team).  WP48 was then re-estimated.
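The re-allocation step can be sketched in a few lines of Python.  This is only an illustration of the bookkeeping on raw defensive rebound counts; the actual re-estimation worked with the value of those rebounds inside the WP48 calculation.  The player names, minutes, and rebound totals are hypothetical.

```python
def reallocate_drb(players, share=0.5):
    """Keep (1 - share) of each player's DRB and spread the remaining
    `share` of the team's DRB across players according to minutes played."""
    team_drb = sum(p["drb"] for p in players)
    team_minutes = sum(p["minutes"] for p in players)
    adjusted = {}
    for p in players:
        kept = (1 - share) * p["drb"]
        pooled = share * team_drb * p["minutes"] / team_minutes
        adjusted[p["name"]] = kept + pooled
    return adjusted


# Hypothetical three-player "team", for illustration only.
team = [
    {"name": "A", "minutes": 30, "drb": 10},
    {"name": "B", "minutes": 30, "drb": 2},
    {"name": "C", "minutes": 20, "drb": 4},
]

adjusted = reallocate_drb(team)
```

Note that the team total is unchanged by this step; rebounds are only shifted from the players who collected them toward the teammates who shared the floor.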

Here is what you see for last year’s 441 players with the two different approaches to measuring WP48.

  • The correlation between WP48 by each method is 0.98
  • The correlation between Wins Produced by each method is 0.99
  • The correlation between ADJ P48 by each method is 0.97

And here are the top 20 players in Wins Produced in 2009-10.  Two rankings are offered.  The first is how Wins Produced was reported at the end of the 2009-10 season.  The second is with DRB valued at 50%.  As one can see, the player rankings are not very different.

Specifically, one can see that…

  • Marcus Camby is ranked 6th by the previous method. With DRB re-allocated he was ranked 9th.
  • David Lee was ranked 11th before, and with DRB re-allocated he would be ranked 15th.
  • Of the players ranked in the top 20 with the traditional approach, 19 are in the top 20 with DRB valued at 50%.  And the one player who drops out (Troy Murphy) only drops from 16 to 21.

In other words, even if half of Murphy’s defensive rebounds came from his teammates, he is still an amazing player.

And this appears to indicate – once again — that diminishing returns doesn’t make any real difference (as was said in Stumbling on Wins, and as was said four years ago, and for more on this issue, one should read this post from the Sport Skeptic).

Response #4 – WP48 isn’t just about Rebounds

What is the role rebounds play in a player’s WP48?

To answer this question we need data.  Well, how about more than 8,700 player observations from 1977-78 to 2007-08.  Across this data WP48 was regressed upon the following statistics: Points per field goal attempt, free throw percentage, rebounds, turnovers, steals, assists, blocked shots, and personal fouls.  These statistics were adjusted for position played and measured on a per-minute basis.

Estimating the regression doesn’t actually answer the question (at least, it doesn’t if you estimate a linear model).  What we want to know is the relative importance of each statistic, or how responsive WP48 is to changes in each statistic.  And another name for “responsiveness” is “elasticity” (a concept you may remember if you ever took Microeconomics).  More specifically, we need to look at how a 1% change (or a 10% change or whatever percentage change you wish) in each factor impacts WP48.

The elasticity results – derived from the aforementioned regression and reported below – might prove surprising to some:

  • Points per field goal attempt: 5.2%
  • Rebounds: 3.2%
  • Free throw percentage: 1.2%
  • Personal fouls: -1.1%
  • Assists: 1%
  • Turnovers: -0.9%
  • Steals: 0.7%
  • Blocked shots: 0.2%
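For readers who have not seen an elasticity derived from a linear regression, here is a minimal Python sketch.  It assumes the common “elasticity at the sample means” approach (slope times mean of x divided by mean of y) and uses a single explanatory variable; the regression described above included several statistics, and the exact method used to derive the figures listed here is not spelled out, so treat this as an illustration only.  The data are hypothetical.

```python
def ols_slope(x, y):
    """Slope from a one-variable ordinary least squares regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den


def elasticity_at_means(x, y):
    """Percent change in y for a 1% change in x, evaluated at the means."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return ols_slope(x, y) * mx / my


# Hypothetical points per field goal attempt and WP48 for five players.
pts_per_fga = [0.90, 0.95, 1.00, 1.05, 1.10]
wp48 = [0.040, 0.070, 0.100, 0.130, 0.160]

e = elasticity_at_means(pts_per_fga, wp48)
```

Unlike a raw regression coefficient, the elasticity is unit-free, which is what allows statistics measured on very different scales to be compared.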

Rebounding certainly matters.  After all, getting and keeping possession of the ball is important; and rebounds are the primary way a team gains possession (without letting the other team score).  But WP48 is more “responsive” to shooting efficiency from the field.  A 1% change in points per field goal attempt (or adjusted field goal percentage times two) leads to a 5.2% change in WP48.  UPDATE: Obviously, rebounds are less important to the new version of WP.

And that result reinforces a story that has been told again and again.  Scoring totals – by themselves – are not what matters in the NBA.  What matters is the ability to put the ball in the hoop.  In sum, shooting efficiency is important and players who score inefficiently are not really helping.   Furthermore, metrics like Player Efficiency Rating and NBA Efficiency – which do not properly capture the importance of shooting efficiency – do not properly capture a player’s impact on wins.

Of course, one should add that this isn’t just about performance metrics.  The importance of shooting efficiency also reminds us that the emphasis placed on scoring totals in the NBA – which we can see in the study of free agents, the voting for post season awards, the allocation of minutes, and the NBA draft – is misguided.  Inefficient scorers may be rewarded by decision-makers.  But these players do not contribute much to wins.

Summarizing All That Was Said About Wins Produced and Rebounds

  • Rebounds are one of the most consistent statistics in team sports.  Players who rebounded well in the past tend to do so in the future.  Players who were not good at rebounding in the past tend to be poor at this aspect of the game in the future.
  • Rebounds also vary across teams.  And teams that rebound well tend to employ better rebounders (or at least, avoid employing really bad rebounders).
  • Diminishing returns does exist.  This is seen with respect to Wins Produced in general and also with respect to defensive rebounds (but not, apparently, with respect to offensive rebounds).  The effect, though, is small. This is seen when we estimate the size of the diminishing returns effect.  It is also seen when we re-estimate player productivity with different values for defensive rebounds.
  • Although rebounds do have a substantial impact on wins, it is shooting efficiency – not rebounding – that is the most important determinant of a player’s Wins Produced.

Wins Produced only considers a player’s offense. Defense is not considered.

When people look at the box score statistics they tend to focus on scoring.  So it is probably not surprising that some people think that the box score is only about offense.

The box score, though, includes statistics that reflect activity at both ends of the court.  In fact, the box score statistics do a wonderful job of explaining wins in the NBA.

There is something missing, though, from the box score.  Most of the data tracked is linked to an individual player.  Data on opponent’s points scored, opponent’s made field goals, opponent’s turnovers that are not steals, team turnovers, and team rebounds, though, are not linked back to individuals.  To incorporate these factors in the calculation of Wins Produced, a team defensive adjustment [labeled DEFTM48 or TMDEF48] is calculated (as detailed at the Calculating Wins Produced page and in Stumbling on Wins).

As noted at the Calculating Wins Produced page, this adjustment is quite small and does not substantially alter our per-minute evaluation of players: “The average value, in absolute terms, of DEFTM48 is 0.011, so again this is a very small adjustment. And as we saw with MATE48, DEFTM48 has very little impact on our assessment of individual players. The correlation coefficient between P48 and Adj. P48 in 1977-78 was 0.9977.”

In Stumbling on Wins we also added the following:

“…it should be noted that TMDEF48 incorporates the five factors reported in Table A.2 that are tracked for the team, but not tracked for individual players. These include opponent’s points scored, opponent’s made field goals, opponent’s turnovers that are not steals, team turnovers, and team rebounds. These statistics are allocated across players according to the minutes each player plays. This approach essentially follows from Scott, Long, and Somppi (1985); Berri (1999); and Oliver (2004).

Such an approach assumes that defense is essentially a team activity. The validity of this assumption is bolstered by the fact that teams typically play defense together. This is especially true in the NBA today, where zone defenses are legal. This approach allows one to differentiate players who play on good and bad defensive teams. However, it fails to differentiate between players who are relatively better or worse on an individual team.

An alternative approach was suggested by Ty Willihnganz of The Courtside Analyst. Willihnganz has augmented Win Score (the simplified version of Wins Produced) by incorporating defensive data from 82games.com. This is a Web site (primarily known for plus-minus data) that reports how well a player’s supposed defensive assignment performs.

As noted in Chapter 3, “The Search for Useful Stats”, we question the ability of a plus-minus measure (and adjusted plus-minus) to completely and accurately capture a player’s value. It does seem possible, though, that such data could better capture a player’s defensive ability. In other words, perhaps plus-minus data—as Willihnganz attempts— could supplement what we learn from the standard box score data.

It appears that the player evaluations offered by Willihnganz are quite consistent with what we see from our calculations (which are based solely on box score data). In considering the few differences that exist, though, it’s important to remember the primary problem with plus-minus data. A player’s plus-minus value, as we observed in Chapter 3, appears to depend on his teammates. This feature is illustrated by the inconsistency of these measures. Willihnganz has noted that such inconsistency also plagues the data he employed on the performance of each player’s opponent. Such inconsistency suggests that the data he utilizes is not fully capturing a player’s defensive ability. Consequently, although we find Willihnganz’s approach interesting, we are not convinced it’s necessarily an improvement over our approach to capturing defense.”

To summarize… defense is part of the Wins Produced calculation.  The specific approach taken has a history in the academic literature and is essentially employed by Dean Oliver in the calculation of his basketball performance measures, which are employed in his measure of individual won-loss records (and also by Justin Kubatko in his calculation of Win Shares).

Despite the history behind this approach, the team defensive adjustment has led to some confusion.  Specifically, the following statement has appeared in the past:

Any rating system that utilizes a “team adjustment” to account for the aspects of defense that are not captured by individual statistics will equate to team wins

This statement – recently offered by Kevin Pelton at Basketball Prospectus – is not supported by the empirical evidence.  Before we get to the evidence, though, let’s talk about what appears to be the origin of this sentiment.

The Wins Produced story – as told in November of 2010 – notes that Dean Oliver (author of Basketball on Paper) played a role in the creation of the model.  Prior to The Wages of Wins appearing, Dean was essentially the only on-line basketball stats person who had any significant interaction with the authors of The Wages of Wins.

After the book appeared (and actually, soon after the book was just announced), we learned of the existence of a small on-line community whose members call themselves APBRmetricians.  And many members of this group were quite unhappy with Wins Produced and The Wages of Wins.

Leading this group were Dan Rosenbaum and David Lewin.  At the time, the latter was just an undergraduate at Macalester College.  Rosenbaum, though, was an assistant professor of economics at UNC-Greensboro.  Before The Wages of Wins was even published (or at least, before Rosenbaum ever had a chance to read the book), he was voicing his displeasure with our work.

Our initial advice was “read the book”.  After this advice was supposedly followed, though, the criticism continued.  And that led to our second piece of advice: “Publish your critique”.  This advice follows from the incentives facing academics.  Professors are not just asked to teach college courses.  We are also expected to do research, and this research is supposed to be reviewed by our peers.  Research that passes peer-review is then published, and it is the quantity and quality of our publications that are primarily used to evaluate our performance (just for the record – as of January, 2011 – Martin Schmidt has 35 publications in academic journals and academic collections while David Berri has 43 publications in journals and collections).

Given the nature of academia – and since Rosenbaum held an academic position at the time – we expected that if he had a substantive critique, it could be published. And in turn, we would be able to add to our publication totals by publishing a response.

But although the criticisms began in early 2006, it was not until the end of 2007 that anything resembling an academic article appeared.  The specific working paper that did finally appear, “The Pot Calling the Kettle Black: Are NBA Statistical Models More Irrational Than “Irrational” Decision-Makers?”, appeared to have more than a few problems.  Still, we were willing to wait for it to be published someplace.

So we waited, and waited, and waited…

Well, as of 2009, the paper still hadn’t appeared.  Although it was not published, a response was offered as part of a paper published by David Berri and J.C. Bradbury [“Working in the Land of the Metricians”, published in the Journal of Sports Economics in 2010].  And these critiques were also presented in the end notes of Stumbling on Wins.

Here is much of what was said:

1. Lewin and Rosenbaum offer a model of player salaries that fails to note a substantial literature, fails to account for many factors previously found to explain player salaries, and produces a result that contradicts the central thesis of the Lewin and Rosenbaum paper.

The salary model estimated for The Wages of Wins, in Berri, Brook, and Schmidt (2007), and in Stumbling on Wins followed from the model presented in Jenkins (1996).  From Stumbling on Wins… specifically, only players who recently signed a contract were examined (in the model of player salaries). Lewin and Rosenbaum (2007) recently illustrated why the Jenkins approach is necessary. These authors examined a data set that included all NBA players. The results reported by these authors indicated that scoring totals were the primary determinants of player salary. The results also indicated, though, that shooting efficiency and steals had a negative – and statistically significant – impact on player salaries. Such a result suggests that players who miss more shots get paid more.

Before anyone believes such analysis, though, it’s important to note that the data set included players who signed contracts years before the performance data was generated. Furthermore, it appears players were evaluated who were still playing under their rookie contract. The failure to restrict the salary data might explain such odd findings.

A bit more on this model… Lewin and Rosenbaum argue that decision-makers are not shown to be “irrational” by previously published work.  Their own model, though, tells us that players can get paid more by missing shots.  Again, this is not a result we uncover in our study of NBA salaries.  But this result does contradict the central argument made in Lewin and Rosenbaum.

One should also add that there is a substantial literature examining the determinants of player salaries [detailed in Berri (2006)].  More than a dozen papers have been published on this topic since the 1980s.  None of these papers are cited by Lewin and Rosenbaum.  And had these authors bothered to look at this literature they would have seen that player salaries in the NBA are not strictly a function of player performance on the court.  Other factors people have considered include player experience, injury status, market size of the signing team, past success of the team employing the player, and position played.  In sum, the salary model – because the authors failed to review the relevant literature – is quite poor (and again, contradicts the argument the authors were trying to make).

2. Lewin and Rosenbaum’s approach to evaluating models is – to say the least – somewhat odd.

In Stumbling on Wins we discuss what appears to be the most obvious approach to evaluate two competing player performance measures:

The approach taken to differentiate batting average and OPS looks at how each factor explains current outcomes. The measure that explains current outcomes better is considered the superior measure. This is the same approach noted in Appendix A to differentiate Wins Produced, NBA Efficiency, Player Efficiency Rating (PER), and Game Score.

Berri and Bradbury (2010) critiqued an alternative approach advocated by Lewin and Rosenbaum (2007). These authors were examining a variety of measures used to evaluate NBA players (Wins Produced, PER, etc.). They begin by regressing a team’s efficiency differential (points scored per possession minus points surrendered per possession) on a team’s PER (or whatever metric was being examined). The result of this regression, plus the regression’s residual (or error term), was then used to evaluate players. This evaluation was then used to predict a team’s efficiency differential for the next season. The results indicated that the models could explain between 75% and 77% of future wins, suggesting that all models were the same. Of course, as any student of econometrics would know, any model plus the error term (as Lewin and Rosenbaum actually noted) would explain 100% of current wins. Appendix A notes that when one does not include the error term in the evaluation of a model, it’s clear Wins Produced does a better job of explaining wins than PERs or NBA Efficiency.
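The point about the error term can be seen in a small sketch.  What follows is not the Lewin and Rosenbaum procedure or their data; it is a minimal illustration, assuming synthetic numbers, of why “fitted value plus residual” reproduces the outcome exactly no matter how good or bad the metric is.  The names `good_metric` and `noise_metric` are hypothetical and exist only for this example.

```python
import numpy as np

def fit_with_residual(metric, outcome):
    """Ordinary least squares of outcome on a single metric (with intercept).

    Returns the fitted values and the residuals (outcome minus fitted).
    """
    X = np.column_stack([np.ones_like(metric), metric])
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    fitted = X @ coef
    return fitted, outcome - fitted

def r_squared(pred, outcome):
    """Share of the variation in outcome accounted for by pred."""
    ss_res = np.sum((outcome - pred) ** 2)
    ss_tot = np.sum((outcome - outcome.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic data (an assumption for illustration, not real NBA numbers).
rng = np.random.default_rng(0)
n_teams = 30
outcome = rng.normal(0.0, 4.0, n_teams)                 # stand-in for efficiency differential
good_metric = outcome + rng.normal(0.0, 1.0, n_teams)   # a metric that tracks the outcome
noise_metric = rng.normal(0.0, 1.0, n_teams)            # a metric unrelated to the outcome

results = {}
for name, metric in [("good", good_metric), ("noise", noise_metric)]:
    fitted, resid = fit_with_residual(metric, outcome)
    results[name] = (
        r_squared(fitted, outcome),          # the metric on its own
        r_squared(fitted + resid, outcome),  # the metric plus its own error term
    )

for name, (alone, with_resid) in results.items():
    print(f"{name}: alone={alone:.3f}, plus residual={with_resid:.3f}")
```

Once the residual is added back, the two metrics become indistinguishable, because the residual is by construction whatever the metric failed to explain.  Only the first number in each pair, the fit without the error term, says anything about the metric itself.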

Again, as noted above, PERs and NBA Efficiency – with the team defensive adjustment employed in the calculation of Wins Produced – explain 60% or less of the variation in current wins.

Let’s add to this observation with something noted in a forthcoming contribution to The Handbook of Sports Economics [eds. Stephen Shmanske and Leo Kahane; Oxford University Press].

One can go one step further and allow the individual components of the team defensive adjustment (detailed in Berri (2008) and employed in the calculation of Wins Produced) to vary. Such a step does raise the explanatory power of PERs to 82%. Wins Produced, though, explains 95% of wins, so even with the team defensive adjustments components added, the more popular measures come up short.

One should note that PERs – by itself – only explains about 33% of team wins.  If you add in all the defensive variables – and you let the coefficients take on any value – you can raise the explanatory power to 82%.  But then, it is the team defensive factors that are offering the bulk of your explanatory power.  So what you learn about individual players from PERs is still not helping much.  Finally – as noted – even if you let the team defensive variables take on any value, you still can’t match the explanatory power of Wins Produced.  And that means the argument offered by Kevin Pelton appears to be incorrect.  One cannot add the team defensive variables to any model and explain wins as well as Wins Produced.

3. Lewin and Rosenbaum offer another “odd” test.

Lewin and Rosenbaum also seek to “test” models by regressing the adjusted plus-minus coefficients (estimated for each player) on the player evaluation offered by various box score metrics.  As noted above, the adjusted plus-minus coefficients are frequently statistically insignificant.  Given the insignificance of these coefficients, it is not clear what the regression the authors are estimating is actually telling us.  Furthermore, it is also not clear which model is being tested. Specifically, if one regresses adjusted plus-minus on Wins Produced, are you testing Wins Produced?  Or are you testing the validity of adjusted plus-minus?

Again, a better approach to testing models is to look at how well the model explains current wins and the consistency of the model across time.  From these two perspectives, Wins Produced outperforms NBA Efficiency (doesn’t explain current wins as well), Player Efficiency Rating (doesn’t explain current wins as well), Win Shares (as will be noted, is not as consistent over time), and Adjusted Plus-Minus (very inconsistent over time).

Let’s close this discussion by noting that in the conclusion to Lewin and Rosenbaum the authors argue their evidence is “overwhelming”.  Unfortunately, as we go through each section of the paper we see there is very little the authors could do to transform their “overwhelming” evidence into a paper that should be published.  Each section of the paper appears to have significant errors that undermine the argument the authors were trying to make.  Again, this paper – for what appears to be obvious reasons – was never published in a peer-reviewed academic journal.  It was “reviewed” by the on-line APBRmetrics community, but the problems noted above did not appear to be noted by any members of this group.  That suggests that submitting academic work to this group is not equivalent to submitting work to an academic journal.

