While listening to a recent Hoopspeak, I heard Kevin Pelton reference +/- several times. Here is the simple truth about +/-: it’s a bad stat (Editor’s note: Clarification. It is a bad stat for analyzing which players are responsible for winning games). Take assists (and we’ll be talking about them more in a bit). How well a player did at assisting in the prior year explains 87% of the variation in his assists the next year! How about +/-? It’s down at 23%! And the key is that we can explain 95% of wins with the stats found in your box score. The least consistent of these stats is field goal percentage at 47%, which is still over twice as consistent as +/-. In short, all of the stats found in the classic box score can explain wins very well, most of them are attributed to individual players, most are very consistent year to year, and all of them are more consistent than +/-.
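For readers who want to see what "explains X% of the variation" means in practice, here is a minimal Python sketch. It assumes the consistency figures above are the squared correlation (R²) between each player's value in one season and his value in the next; that is a common way to measure year-to-year consistency, but the exact method behind the 87% and 23% figures is not spelled out here. The function name `year_to_year_r2` and all of the numbers are made up for illustration and are not real player data.

```python
from statistics import mean


def year_to_year_r2(prev, curr):
    """Share of the variation in `curr` explained by a straight-line fit
    on `prev` (the squared Pearson correlation)."""
    if len(prev) != len(curr) or len(prev) < 2:
        raise ValueError("need two equal-length samples with at least 2 players")
    mean_prev, mean_curr = mean(prev), mean(curr)
    # Sums of squares and cross-products around the means.
    sxy = sum((x - mean_prev) * (y - mean_curr) for x, y in zip(prev, curr))
    sxx = sum((x - mean_prev) ** 2 for x in prev)
    syy = sum((y - mean_curr) ** 2 for y in curr)
    if sxx == 0 or syy == 0:
        raise ValueError("a season with no variation cannot be correlated")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical values for five players, two consecutive seasons each.
assists_prev = [2.0, 4.0, 6.0, 8.0, 10.0]
assists_curr = [2.5, 3.5, 6.5, 7.5, 10.5]   # tracks last season closely
plus_minus_prev = [-4.0, -2.0, 0.0, 2.0, 4.0]
plus_minus_curr = [1.0, -3.0, 2.0, -1.0, 1.5]  # bounces around

print(f"assists R^2: {year_to_year_r2(assists_prev, assists_curr):.2f}")
print(f"+/-     R^2: {year_to_year_r2(plus_minus_prev, plus_minus_curr):.2f}")
```

With real data you would pass in one list per season, aligned by player and ideally filtered to players with meaningful minutes in both years. A stat that mostly reflects the player should score high; a stat that mostly reflects teammates, opponents, and luck should score low.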
And yet, the love of +/- as a stat won’t die. Kevin Pelton is well regarded in the stats community and he sees it as the stat to use in several situations. And this view is not uncommon. In fact, I got a front row seat to a discussion on stats and plus-minus at last year’s Sloan between Dean Oliver and John Dewan. Let’s review!
The full talk is here, cued to the moment this subject comes up. Dean talks first.
Assists are pretty dirty. I remember something that Bill [James] wrote a long time ago about statistics, evaluating them on how clear they are, how reproducible to some degree, and how well they represent what’s going on. And I think an assist is an important thing to capture. You want to get some idea for how people helped set up a basket. But they’re so subjective. I know the NBA is very liberal in handing them out and other leagues are not.
Dean then points out that the rule for an assist (“a pass directly leading to a basket”) is very ambiguous.
You want to use the data because it gets at something important but it’s so subjective that it’s difficult.
On the other end we have this +/- statistic that shows up in the box score right now. It is very clearly defined, it’s unambiguous, you know what it is — it’s how well the team does while you’re on the court. But how well it reflects on the individual’s contribution is very vague.
Now, here’s the thinking Dewan jumps in with.
Every statistic is misleading. Every statistic taken in different context. Every statistic has a flaw. What we’re trying to do is assemble that have some meaning and have the best meaning. If someone scores 28 points in a losing effort and he does that consistently in a losing effort, is that more meaningful than the fact that he had his plus-minuses: -6 you know, then it’s -8, then it’s -4. He’s scoring points but he consistently has a negative. There are outside influences on plus minus. Yeah, if you’re always playing with your starting unit your plus minus generally should look better than your backup unit unless your backup unit is better than the other other backup unit. Then you could have really good plus minuses cause you’re with….it’s all dependent upon the unit. In the end I think it is more meaningful than all the traditional basketball stats even though it has flaws.
Dewan stumbles down the rabbit hole that I’ve seen many analysts on the web also go down. He purports that stats at face value aren’t enough, and that a consistent +/- will be more meaningful. Even as he fumbles around to explain why, he caps it with his proclamation that — flaws and all — it is better than ALL other traditional stats.
Thankfully Dean does end the discussion with:
I know coaches like to use it because it is unambiguous. ‘Yeah, we played really well with him out there.’ But it doesn’t reflect easily on an individual because there are so many other factors out there.
Have I mentioned how awesome Dean is? The major problem with Dewan’s assertion? His claims are testable. He LIKES +/- more than other statistics, but that doesn’t mean reality will match his preference.
Dean helps explain the issue better. Plus-minus is a very clear stat. It doesn’t rely on scorekeepers who may prefer hometown heroes to away players. We don’t have to bicker over whether the rebound belongs to Dwight or to Kobe. The thing is that, despite being very clear cut, the stat itself is not useful. Over time it is not consistent. It is noisy and subject to many factors. And as I’ve pointed out with lineup data, it’s often not possible to get sample sizes that are even close to meaningful.
Compare this to assists, which Dean also brings up. We find these matter a great deal in winning (side note: in the original Wins Produced formulas, Dave and crew left assists out because they didn’t think they mattered. But like good scientists they examined their impact and found out that they matter a great deal). And what’s more, these are fairly stable year to year. Yes, these are more subjective. But the key is that this stat is useful!
The Hammer and the nail
A famous quote is “When you have a hammer, everything looks like a nail.” But I’d like to extend that further. You see, Dewan is a baseball guy. And while the amazing breakthroughs of baseball almost perfectly translate to basketball, I don’t see many baseball ‘experts’ make that connection. The reason is that, like many, they start with the assumption:
Basketball is complicated. There are lots of players and interaction effects. Unlike baseball, which is mostly one on one, basketball is difficult!
And when you go in with that mindset, you will want a metric that claims to capture that complexity. And as soon as you start with a metric like +/- to explain basketball, the problem of basketball stays in the complicated and intricate realm. Indeed, we’ve seen iteration after iteration of “Adjusted Plus Minus” to improve upon the fact that all prior versions are bad at explaining wins. It’s a self-fulfilling prophecy. If you bring complicated metrics to try and solve a problem, you will remain convinced the problem is complicated and thus be convinced the only thing to solve the problem with is a complicated metric.
The problem you’re trying to solve
One other reason I see for the use of stats like +/- is to solve things like defense, which can be difficult to attribute to the individual. I did like Bill James’ point on stats. Is the stat clear? Is it reproducible? Is it useful? The last one is an issue I see a lot in sports analysis. We are often convinced that because we’ve identified a complicated problem, it must be important. A player being “unclutch” or unable to play in close games because of poor free throw shooting is often seen as huge. Yet, when we break it down, ‘crunch time’ is a very minimal part of a player’s game. Defense is often paraded out as a major problem. And yet, it may be largely a team activity, and the parts we can’t track to players via individual stats are usually not that big. In short, are noisy stats like +/- really that useful when they’re applied to issues that aren’t that major? As a comparison: are you more worried about the stars your NBA team is targeting with their extra $20 million, or about who they are going to pick up with their mid-level exception?
The Real Problem
Alright, +/- is bad. Some of you will say we’ve said this before. My two outs? First, our site isn’t that large and we get lots of new readers regularly. This may be your first time reading about this. Second, as Kevin Pelton shows, +/- is still used a lot! And that’s my final gripe.
In sports, we have the ability to test our theories. And while I have seen many fall prey to misusing statistics, it is far more common that I see the wrong tests being applied. The common test is to see if a stat verifies your own beliefs. For example, take this terrible post on why PER is better than Wins Produced, which argues the point because the authors like how Harden looks with PER. Joe Sill’s original paper on RAPM defends it by explaining “See! The players you think are good are good!” And as I’ve noted in the last few days, even NBA coaches fall victim to this. In sports and stats, you need to test your theories and test them right. The love of +/- is unforgivable because it has been shown to be a bad stat. I repeatedly see people say “I know +/- is noisy but…” and that’s the issue. There is no follow-up there. If a metric is bad, we move on. Our goal should be advancing statistics. And to do this requires actually testing our ideas and leaving behind the bad ones. Adding up the good and subtracting out the bad? It’s a crazy notion, I know, but one I think can work.