Obama stopped by the Google campus during his 2008 campaign. As a joke he was asked a “Google Interview Question” about sorting one million 32-bit integers. His response?
Well, I think Bubble Sort would be the wrong way to go.
I’ve probably lost most of you who come here for basketball stats or PER bashing. But stay with me. Computer Science can be a complicated field. At least, I like to think so, given that it’s my field. Sometimes proving your proficiency can take a lot of work. At the same time, proving your lack of proficiency can be relatively easy. Sorting is a subject that’s particularly important in Computer Science. This is especially true for a company like Google, which likes to keep lots of data and have it sorted. So I’m going to explain to the non-CS people why Obama’s answer is funny to a room full of computer folk.
Bubble Sort is a terrible sort. Yes, it works at sorting a group of items, but it takes a long time. Pretend I give you a deck of cards out of order and ask you to put it in order. Bubble Sort would take roughly 13 times as long as the “standard” sorts and 52 times as long as the quickest sorts.* Clearly, Bubble Sort is a bad solution. And if anyone suggested it, we’d know they weren’t a good person to ask or hire when it comes to sorting.
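For the curious, here is a small Python sketch of where numbers like those can come from. It is my own illustration, not an official benchmark: the function name bubble_sort and the variable names are made up for this example, and I am assuming the “standard” sorts grow like n·ln(n) (natural log) and the quickest sorts grow like n, which is one reading that reproduces the 13 and 52 figures for a 52-card deck.

```python
import math

def bubble_sort(items):
    """Return a sorted copy of items plus the number of comparisons made."""
    data = list(items)
    comparisons = 0
    # After each pass the largest remaining item has "bubbled" to the end,
    # so the next pass can stop one position earlier.
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data, comparisons

n = 52  # cards in a deck

# Growth-rate comparison only (assumption: natural log for the "standard" sorts).
ratio_vs_nlogn = n**2 / (n * math.log(n))  # a little over 13
ratio_vs_linear = n**2 / n                 # 52

# A deck in exactly reverse order, sorted the slow way.
sorted_deck, comparisons = bubble_sort(range(n, 0, -1))
print(comparisons, round(ratio_vs_nlogn, 1), ratio_vs_linear)
```

This simple version of Bubble Sort always makes n(n-1)/2 comparisons, no matter how the deck starts out. The ratios compare growth rates only, so treat them as a rough guide rather than a stopwatch measurement.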
I’m going to get a little mean, so forgive me. Player Efficiency Rating (PER) is like Bubble Sort. In this new age of being able to easily make a blog and call yourself a stat expert, it can be hard to know whom to listen to. Sometimes seeing if someone really understands a metric takes a lot of work. (For instance, it’s fair to say many people cannot construct Win Shares or PER or fully understand how either works.) That said, when people use PER, sadly I realize they are not experts on advanced stats. When it comes to explaining player performance (as measured in wins), it does a pretty poor job (great paper from Bradbury and Berri here). And this is not anything new or complicated. Wayne Winston and Dave Berri have both gone through the work of showing via PER’s own math that it rewards poor shooting (here and also here). And as both have mentioned, the math behind why PER is bad is not that complicated (unlike PER itself).
Even when it comes to explaining popular opinion, which many claim it does, PER is far more complicated than NBA Efficiency (EFF) or Points Per Game (PPG), both of which are highly correlated with PER.
There is really no reason to use PER.** At least, not if the advancement of useful stats in basketball is your goal. The truth is that PER is still heavily used. ESPN has a vested interest in using it. Many analysts on the web use it because, candidly, it’s hard to get stats. It’s a lot of work to break apart metrics and understand how they work (something we’re trying to do a better job of with Wins Produced). The simple fact is that when people bring up PER or use sentences like “I know +/- is noisy, but…”, well, all I can say is that it’s a wrong way to go.
* For C.S. folks out there, I am only comparing the difference in Big-O notations. I did not actually calculate out the difference in worst-case scenarios for each sort.
** I will note that while Dave has studied how PPG and EFF explain pay (paper here), he does not have the same study for PER yet. His recollection is that Game Score does roughly as well as EFF. Until I have the numbers, though, I will let PER fans hold out hope that it may have one decent use: explaining bad paying decisions in the NBA. And even if that is the case, continuing to use PER means using it for this purpose, which many do not.