Explaining Wins: I Knew It!

End of an Era

Yesterday I reviewed some excellent work by Patrick regarding the “low usage, high efficiency” player myth. The basic claim is twofold. Many claim that Wins Produced overvalues these players. And with that claim comes the implication that being this type of player is easy. Patrick pointed out very simply that these types of players are rare, and I showed that they were also not the players leading the way among top Wins Produced players.

Hoopdon, who comments frequently, wrote a follow-up post. His goal was to explain how low usage, high efficiency players benefit from playing alongside high usage players or in good systems. I am going to critique his post in this post (meta, I know!) and hopefully explain some of the problems with trying to explain things the way Hoopdon did.

Sample Size

I’ll be honest, I was upset when I started reading Hoopdon’s post:

Many claim efficient low-usage players can be vastly overrated by WP, while high-usage less efficient players can be underrated.  I don’t generally agree with the premise, however there are certain cases where it can be true.  This isn’t so much a function of WP being crap, but like in any holistic statistical metric, certain data points will fall outside the norm for whatever reason.

Now, this isn’t strictly false, but I was irritated by how it was presented. Our point yesterday was to say very clearly: these players are rare, a majority of low usage players aren’t high efficiency, and even the low usage / high efficiency players aren’t highly ranked! This opening completely ignores that point. The “in certain cases” language seems off to me. The truth is that there are 8 players that fit this mold and only 3 of them are in the top 20. Starting with this premise ignores the data. There have been 308 players to play 400+ minutes this season. 8 is not a large group by any means.

And there’s something key behind this. With only 8 data points (and Hoopdon only reviews 5) it is hard to draw a huge conclusion from this group. Moving on.
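To put those counts in perspective, here is a quick back-of-the-envelope sketch. The numbers come straight from the paragraphs above; the variable names are just mine for illustration.

```python
# Counts quoted in the post; the variable names are illustrative only.
qualified_players = 308   # players with 400+ minutes this season
fit_the_mold = 8          # low usage / high efficiency players
in_top_20 = 3             # of those 8, how many are in the top 20
reviewed_by_hoopdon = 5   # of those 8, how many Hoopdon's post covers

share_of_league = fit_the_mold / qualified_players
share_in_top_20 = in_top_20 / fit_the_mold
share_reviewed = reviewed_by_hoopdon / fit_the_mold

print(f"Fit the mold: {share_of_league:.1%} of qualified players")
print(f"Of those, in the top 20: {share_in_top_20:.1%}")
print(f"Of those, reviewed by Hoopdon: {share_reviewed:.1%}")
```

In other words, the whole group is under 3% of the players with real minutes, and the follow-up post only looks at part of that group.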

What are you controlling for?

In examining Hoopdon’s analysis, his explanations went like this:

And as Dave mentions in our comments, Hoopdon does this by providing large snapshots of player careers. Now, our issue is pretty clear. Hoopdon is attempting to attribute the performance of a very limited group of players to very specific things. Specific coaches, teams and players are singled out. In examining the five players, here are just a few things I noticed that aren’t controlled for:

  • The players play different positions
  • The players’ ages (Sefolosha enters his prime right as he improves)
  • The teams they played for (Kinda mentioned with Spurs, but others?)
  • Other players (Durant and Dirk are singled out, what about the other players?)
  • The players themselves (Some players change)

I like looking into issues like these. As I’ve explained with my recent fascination with Andre Drummond, outliers are fun. The thing is, making a broad, sweeping explanation requires controlling for all the things that might impact the outcome. A recent example is the Celtics win streak (snapped by the Bobcats, hah!). Here is some logic you could employ:

  • In games where Rondo played, the Celtics had a losing record.
  • In games where Rondo did not play, the Celtics had a winning record.
  • In games where the Celtics played without Rondo and against Byron Mullens, they lost.

From this data, can we claim that the Celtics should play without Rondo and never want to play against Byron Mullens? No! Not unless we control for everything else. And as we pointed out at the start of the Celtics win streak, there were other things at work. This is one reason trying to explain small sample sizes can be rough. So many things change that pinning the explanation on one thing is difficult.
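Here is a rough sketch of why splits like these can mislead. It simulates a team whose chance of winning is exactly the same with or without a given player, then counts how often the records still come out "losing with him, winning without him" purely by chance. The game counts (10 with, 6 without) and the 50% win probability are made-up assumptions for illustration, not the Celtics' actual numbers.

```python
import random

def split_record_flips(n_with=10, n_without=6, p_win=0.5, trials=10_000, seed=0):
    """Share of simulated stretches where a team with one fixed win
    probability shows a losing record in one set of games and a winning
    record in the other. All default values are hypothetical."""
    rng = random.Random(seed)  # seeded so the result is repeatable
    flips = 0
    for _ in range(trials):
        # Same win probability in both groups: the player has no real effect.
        wins_with = sum(rng.random() < p_win for _ in range(n_with))
        wins_without = sum(rng.random() < p_win for _ in range(n_without))
        losing_with = wins_with < n_with - wins_with
        winning_without = wins_without > n_without - wins_without
        if losing_with and winning_without:
            flips += 1
    return flips / trials

print(f"{split_record_flips():.1%} of simulated stretches show the 'pattern'")
```

Under these assumptions the pattern shows up in roughly one stretch out of eight even though the player changes nothing, which is why a split like this is a place to start asking questions rather than an answer.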

Now, year to year, we have an out. We do notice players tend to be consistent. But that doesn’t mean they don’t change. And yes, further analysis can reveal specific things that impact that change, some of which are reviewed in the great book Stumbling on Wins (age is huge, and yes, we do notice Gregg Popovich and San Antonio improve players). The key is, when we do this analysis, we examine the whole population, specify the variables we want to control for, and see if the results exist and if they are meaningful.

And that’s really our problem with analysis like the above. Trying to explain low usage / high efficiency players by cherry-picking their situations and pointing to things you think matter doesn’t work. Yes, it can be the start of the analysis, but it is not conclusive. And notice how Hoopdon wraps up his point:

So what does this all mean?  Well, it means that low-usage players can be risky gambles.  Put them in the correct situation, and the production can be staggering.  Put them in the wrong situation, and it could get ugly.

Wait, what? We’ve examined 5 players, with many variables to control for. As far as I can tell, that controlling hasn’t been done. Can we really make a judgement on all low-usage players based on cherry-picked data for five players? No! Even the individual analysis of each player is by no means conclusive. For example, Jason Kidd’s efficiency ranged between 50.0% and 57% with Dallas. That’s a pretty big range. Can we really hand-wave and say Dirk and Carlisle made him better?

Now, this is coming down a bit hard on Hoopdon. Let me be clear. I am very happy with his analysis. I like where it is going. The mistake is stopping far short of what is needed to draw a conclusion, and then drawing not just a conclusion but a huge one about both a class of player and a metric. And I will also say that hopefully Hoopdon wants this type of feedback. If not, well, I hope you enjoyed the post anyway.

