Last summer Kevin Durant made his professional debut in the NBA summer leagues. His performance inspired the following two posts:
These two columns made two observations:
a. Kevin Durant played very badly last summer.
b. The media covering summer league basketball didn’t seem to notice that Kevin Durant played very badly last summer.
Durant followed this summer league performance by playing poorly for much of his rookie season (and again the media didn’t seem to notice).
The Durant story might lead us to believe that something can be learned from summer league basketball (other than the media’s inability to look past scoring totals). But is that really possible? Does a player’s performance in summer league really tell us about a player’s future performance in real NBA games?
The 2008 Las Vegas Summer League Numbers
Before I answer this question I thought I would examine the performances in the 2008 Las Vegas summer league. A few days ago Erich Doerr graciously sent me the data from Vegas. And with data in hand, we can now offer the first evaluation (if we ignore the Orlando summer league) of the 2008 NBA draft class.
Table One presents an evaluation of each player who
a. was chosen in the 2008 draft, and
b. played at least 50 minutes in the 2008 Las Vegas Summer League.
Each player is evaluated in terms of PAWS48, or Position Adjusted Win Score per 48 minutes. In essence, each player’s Vegas numbers were evaluated in terms of Win Score, and this number was compared to what an average NBA player – at the player’s position – would have done in 48 minutes. Positive numbers indicate the player was above average (and negative numbers mean… well, I think you can figure that out).
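For readers who want the mechanics, here is a small sketch of the calculation. The Win Score formula below is Berri’s published box-score formula; the stat line and the position-average value in the example are made up purely for illustration.

```python
def win_score(pts, reb, stl, ast, blk, fga, fta, tov, pf):
    """Win Score: Berri's simple box-score productivity measure."""
    return (pts + reb + stl + 0.5 * ast + 0.5 * blk
            - fga - 0.5 * fta - tov - 0.5 * pf)

def paws48(ws, minutes, position_avg_ws48):
    """Position Adjusted Win Score per 48 minutes.

    position_avg_ws48 is the league-average Win Score per 48 minutes
    at the player's position (a value you would look up; the number
    used below is hypothetical)."""
    ws48 = ws * 48.0 / minutes
    return ws48 - position_avg_ws48

# Hypothetical stat line: 20 pts, 8 reb, 2 stl, 4 ast, 1 blk,
# 15 FGA, 6 FTA, 3 TO, 2 PF in 32 minutes.
ws = win_score(20, 8, 2, 4, 1, 15, 6, 3, 2)     # 10.5
adj = paws48(ws, 32, 6.0)                        # assumes a 6.0 position average
print(ws, adj)
```

A positive result from `paws48` means the player beat the average at his position, which is exactly how the table should be read.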
Of the 32 players who satisfied each of the above conditions, only eight (or 25%) were above average. In other words, most rookies played badly – by NBA standards – in Vegas.
Of the eight “good” players, only three – Kevin Love, Jerryd Bayless, and D.J. Augustin – were taken in the first round. The remaining five above average performers were second round choices, with Maarty Leunen and James Gist taken in the last few picks of the draft.
When we look at the “bad,” or below-average, players, we find a few more lottery picks. O.J. Mayo, Joe Alexander, and Anthony Randolph were each below average performers.
Explaining the Numbers
So what does this mean? Do these numbers indicate that Maarty Leunen is going to be a better pro than Joe Alexander? Will James Gist offer more than O.J. Mayo?
To answer these questions, I thought I would look at the relationship between what the 2007 draft class did in Las Vegas and in their 2007-08 rookie season. Specifically, I collected data on 24 players who
a. were chosen in the 2007 draft,
b. played at least 50 minutes in the 2007 Las Vegas summer league, and
c. played at least 100 minutes in the 2007-08 regular season.
I then regressed a player’s PAWS48 from his 2007-08 rookie campaign upon his PAWS48 in the 2007 Las Vegas summer league.
The regression resulted in the following equation:
PAWS48 in the NBA = -0.903 + 0.139*PAWS48 in Las Vegas
O.J. Mayo posted a -6.2 PAWS48 in Vegas. Given this model, Mayo’s expected PAWS48 in the NBA is -0.903 + 0.139*(-6.2) = -1.8. With an expected mark below zero, we now have “conclusive proof” that Mayo will not be a “good” NBA player.
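The point prediction is just arithmetic on the fitted line, which can be sketched as follows (the coefficients are the ones reported above):

```python
# Coefficients from the regression reported above
INTERCEPT = -0.903
SLOPE = 0.139

def predict_nba_paws48(vegas_paws48):
    """Point prediction of rookie-season PAWS48 from Vegas PAWS48."""
    return INTERCEPT + SLOPE * vegas_paws48

# Mayo's Vegas mark of -6.2 yields about -1.8
mayo = predict_nba_paws48(-6.2)
print(round(mayo, 1))  # -1.8
```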
Really Explaining the Numbers
Now some people might stop reading after the last sentence and thus believe that I have “proven” Mayo is not going to be a good NBA player. The reality is quite different (there is a reason “conclusive proof” is in quotation marks).
To understand what this model is actually saying we need to spend a bit more time thinking about the estimated relationship between Vegas PAWS48 and NBA PAWS48. The equation indicates that each one unit increase in Vegas PAWS48 increases the NBA PAWS48 by 0.139. This, though, is just an estimate. And with this estimate comes a standard error.
Before I reveal this standard error, let me review what Ian Ayres calls the “Two Standard Deviation” rule. In Ayres’s book “Super Crunchers” (a book I really enjoyed and need to talk about more), he states the following:
“There is a 95% chance that a normally distributed variable will fall within two standard deviations (plus or minus) of its mean.”
What does this mean for our regression? The value 0.139 is just an estimate. There is a 95% chance that the “true” value of this coefficient falls within roughly two standard errors (again, plus or minus) of this value. The standard error for this coefficient is 0.133, which means there is a 95% chance that the “truth” lies between -0.136 and 0.415 (the precise interval uses a critical value slightly above two for a sample this small, but going two standard errors in each direction gets you very close).
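The interval is easy to reproduce. The sketch below uses the coefficient and standard error from the regression above; the rule-of-thumb factor of two gives roughly (-0.13, 0.41), and the exact t critical value for 22 degrees of freedom (24 observations minus 2 estimated parameters) gives the slightly wider interval quoted in the text.

```python
# Coefficient and standard error from the regression above
coef = 0.139
se = 0.133

# Rule-of-thumb 95% interval: two standard errors each way
lo = coef - 2 * se   # about -0.127
hi = coef + 2 * se   # about  0.405

# Exact t critical value for 95% confidence, 22 degrees of freedom
t_crit = 2.074
lo_t = coef - t_crit * se   # about -0.137
hi_t = coef + t_crit * se   # about  0.415

print((lo, hi), (lo_t, hi_t))
```

Either way, the interval straddles zero, which is the whole point of the discussion that follows.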
Given our confidence interval it could be that
a. the “true” coefficient is positive, so the better a player does in Vegas, the better he does in the NBA, or
b. the coefficient is negative, so the better a player does in Vegas, the worse he does in the NBA, or
c. the “true” value is zero, which means there is no relationship between Vegas performance and NBA productivity.
In sum, we now know that the relationship between the Vegas numbers and the NBA numbers is positive, or negative, or non-existent. Or in other words, we haven’t learned much of anything.
Actually, let me amend that statement. We did learn something. When we get such results we conclude that the estimated relationship is “not statistically significant.” And once we see this, our discussion of the estimated relationship stops. We would not proceed to forecast Mayo’s NBA performance (as I did earlier). We were not able to find a relationship, and hence our “Mayo story” has to end.
Really, Really, Explaining the Numbers
Although we cannot use this model to evaluate NBA players, there is more to the story. For example, did we actually “prove” that what happens in Vegas stays in Vegas? No, it turns out our model doesn’t even let us reach this conclusion. The simple model has some issues.
First of all, our sample was quite small. We only had 24 observations. Perhaps if we had data from more seasons we could find a relationship.
In addition, and this is perhaps more important, our model may not have been specified correctly. Our model of NBA PAWS48 considers only one explanatory variable [Vegas PAWS48]. Perhaps if we considered other explanatory variables, the estimated relationship between Vegas PAWS48 and NBA PAWS48 would be different. Specification of a model is extremely important; if your model is mis-specified, your estimated coefficients – and the statistical significance of these coefficients – can be affected, which makes the interpretation of your results difficult.
Okay, obviously when you run a regression – and interpret your findings – there are quite a few issues to consider. In fact, there is quite a bit to learn before you even start playing around with this stuff (JC Bradbury wrote a brief post a couple of years ago at Sabernomics detailing what you might want to start reading).
Did We Learn Anything?
Alright, let’s review what we learned from the analysis of the 2008 Las Vegas summer league. I think we learned
a. O.J. Mayo did play poorly in Vegas.
b. other players – who are not as well known – played better.
c. these results, though, do not indicate that O.J. Mayo will have problems in the NBA.
Let me close by noting that one can offer better analysis of the relationship between college and professional performance. And unlike what we see with our Vegas analysis, there is a statistically significant relationship between what we see in college and what we will see in the NBA.
When we look at O.J. Mayo’s college numbers we do see evidence that he might struggle as a pro. Of course, the key word is “might.” Regression analysis is not a crystal ball. It simply reveals tendencies that decision-makers should consider. So despite what Mayo did at USC, it’s possible he will be a productive NBA player. It’s just somewhat more likely that he won’t.
Finally, A Guide to Evaluating Models contains useful hints on how to interpret and evaluate statistical models.