“And by that destiny to perform an act Whereof what’s past is prologue, what to come In yours and my discharge.”
― William Shakespeare
It’s a tough job being an oracle.
As you know, I’ve been making picks for the Truehoop Stat Geek Smackdown. It’s been a blast, but some bad luck has knocked me out of contention to win.
Everyone makes picks in different ways. Some use their guts. Some consult the stars and prophets. We, as men of science, build models.
Any model we build is going to contain variation.
The interesting part is that the result is not that surprising. So much so that we never let it get us down; rather, we look for patterns and lessons to build on as we continue to make our predictions.
Consider this then your peek behind the curtain. This is how the sausage gets made.
The first and most important lesson is that predictions can come down to plain old dumb luck. This is gospel for any game of chance. We build models and look at data. We take all the available information from all the available sources and we use that to project the most likely scenario based on a set of given variables.
In short, randomness happens. The trick, however, is to differentiate between variation that is inherent in and predicted by the model and variation that is not. This is the difference between common-cause and special-cause variation.
Common-cause variation is the variation that is expected within the bounds of the system. You roll a pair of dice and you expect to get snake eyes at a certain frequency. Bad teams (even historically bad ones) will still manage to win some games. We can predict and account for this variability.
Special-cause variation means that something new and unexpected has occurred. It is by definition something that was not predicted by the model. It is a change, or an unforeseen variable, altering the process and rendering it unpredictable.
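To put rough numbers on the common-cause idea, here is a minimal sketch. The dice math is exact. The "bad team" figures (a 20% chance per game over a 10-game stretch) are made up purely for illustration, and games are assumed to be independent, which real ones are not quite.

```python
from fractions import Fraction
from math import comb

# Exact chance of snake eyes (two ones) on a roll of two fair dice.
snake_eyes = Fraction(1, 6) * Fraction(1, 6)


def prob_at_least(wins, games, p_win):
    """Binomial chance of at least `wins` wins in `games` independent games."""
    return sum(
        comb(games, k) * p_win**k * (1 - p_win) ** (games - k)
        for k in range(wins, games + 1)
    )


# Hypothetical bad team: 20% to win any single game, 10-game stretch.
p_three_plus = prob_at_least(3, 10, 0.20)

print(f"Snake eyes: {snake_eyes} ({float(snake_eyes):.1%})")
print(f"Bad team wins 3+ of 10: {p_three_plus:.1%}")
```

If an outcome lands somewhere the model already gave real probability to, that is common cause. If it keeps landing where the model said "basically never", it is time to go looking for a special cause.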
The Conference Finals were a perfect contrast in those two types of variation. I may have written at length about it too. Let’s review the result and see what it says about the model before we get to the pick for the finals.
After the jump of course.
In Round 1 the model went 6-2 with 3 dead-on picks. The two misses were due to injury (Chicago) and coach irrationality (Denver). The Denver pick in particular was still very specifically common-cause variation (i.e. predictable), as the actual result (Lakers in 7) was the second most likely result at 17.2%.
Round 2 was more of the same: 4-0 on picks with 3 exact picks. The one result that was somewhat unlikely was Heat in 6. Two factors at play here were that it really wasn’t that unlikely (18.5%) and the Bosh/Wade injury daily double for the Heat.
Let’s talk about the Conference Finals then. Keep in mind that I had to pick an upset to win the Smackdown. Based on the data at hand, the logical choice was the Celtics.
For the Conference Finals, we have a perfect study in variation. I went 0-2 for picks. The model really went 1-1. Had I used the full season numbers as opposed to the second half numbers for the Miami series, the model would have flipped options 1 and 2 and favored Miami in seven, then Boston in 6. As it was, the actual result was well within the error of the process. This is common-cause variation.
OKC over San Antonio caught everyone by surprise. Here something clearly changed in the process of the series. I can illustrate.
While the young guns for OKC all jumped a level, the San Antonio stars ran into a wall. Generally players are who they are. In the playoffs in particular we can count on young players for the most part to fall short. As they gain experience they start improving. Having a young team like OKC all hitting their stride at the same time is not a common occurrence. San Antonio also showed their age with their young star (Leonard) being good but inconsistent and their older vets, particularly Duncan, showing their age at exactly the wrong time. Coach Brooks increasing Ibaka’s minutes was also key. Throw in a massive performance or two in each of the last four games (Thabo game 3, KD & Ibaka in game 4, Harden in game 5 and KD and Thabo again to close it out). This is the definition of special cause variation.
OKC changed on San Antonio in the process of the series.
But is it enough for the Finals?
Let’s get to the picks.
The method really has three parts.
- Setting the player Value
- Projecting Minute Allocation
- Running the Playoff Model
One important note is that experience in the playoffs matters. Some of my competitors have pointed this out this year. I have been pointing this out for a while. Basically, playoff vets get favorable calls. To account for this bias, I’ve given teams that feature playoff vets who have produced at a high level in the playoffs a 4% boost (or about half of the basic homecourt). Three series feature this: Thunder-Nuggets, Heat-Pacers and Celtics-Sixers, and in two of the three it is the deciding factor.
For setting the player value, I ended up calculating the ADJP48 (Raw unadjusted Wins Produced, go here for more detail) for the season for every player and adjusting it to take out the effects of homefield advantage. I won’t go into full detail (not just yet anyway) of that here but you can see part of that work here.
The next bit is the tricky part. You have to guess at what the playoff minute allocation will be for each team. The key idea here is the half baked notion. The half baked notion is this: what wins in the regular season is not necessarily what gets you the trophy. What’s the difference? Minute allocation & how wins produced are affected by that allocation. We continuously hear terms like playoff rotation & playoff minutes thrown around come playoff time. When we take a look at the data we’ll see that the pundits may just be right (hell has officially frozen over).
The half baked notion tells us that a good deep team filled with average and above average players will get you in the playoffs but to get far in the playoffs you need your wins to be concentrated in your Top 6.
To illustrate, let’s look at the regular season data. I’m using all the data from every season since the merger. I will be ranking the players on each roster by minutes played and then allocating wins accordingly. The data looks like this:
A few interesting points from this table:
- Your starting five account for 82% of your wins in the regular season.
- Your second unit is important over the course of an 82-game regular season, accounting for 18% of your wins.
- After that everybody else is statistically meaningless.
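For anyone who wants to try the same breakdown on a single roster, here is a rough sketch of the bookkeeping. It assumes "starting five" means the top five players by minutes and "second unit" means players six through ten, which is my reading of the ranking described above. The roster numbers are invented for illustration and are not taken from the table.

```python
def win_share_by_rotation(players):
    """Split a team's wins into starters, second unit, and everyone else.

    `players` is a list of (minutes, wins) pairs for one team.
    Players are ranked by minutes played, most first.
    """
    ranked = sorted(players, key=lambda p: p[0], reverse=True)
    total = sum(wins for _, wins in ranked)
    groups = {
        "starters (1-5)": ranked[:5],
        "second unit (6-10)": ranked[5:10],
        "rest (11+)": ranked[10:],
    }
    return {
        name: sum(wins for _, wins in group) / total
        for name, group in groups.items()
    }


# Hypothetical 12-man roster: (season minutes, wins produced). Made-up numbers.
roster = [
    (3000, 14.0), (2800, 10.0), (2600, 8.0), (2400, 6.0), (2200, 4.0),
    (1800, 3.0), (1500, 2.0), (1200, 1.5), (900, 1.0), (600, 0.5),
    (300, 0.0), (100, 0.0),
]

for name, share in win_share_by_rotation(roster).items():
    print(f"{name}: {share:.0%}")
```

Run that over every roster in the sample, average by group, and you get the kind of split listed above.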
Now let’s look at the playoff data. Again, I’m using all the data from every season since the merger. I will be ranking the players on each roster by minutes played and then allocating wins accordingly. The data looks like this:
You can clearly see the obvious differences:
- Your starting five account for 94% of your wins in the playoffs.
- Only the first guy off your bench matters, accounting for 5% of your wins.
- After that everybody else is statistically meaningless.
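To see why the shift in minutes matters, here is a small sketch. It assumes a team's per-game production is just each player's per-48-minute value scaled by his minutes and summed, which is a simplification and not necessarily how my full model does it. The per-48 values and both minute distributions are hypothetical; each distribution adds up to the 240 player-minutes in a regulation game.

```python
def team_production_per_game(rotation):
    """Sum each player's per-48-minute production scaled by minutes per game.

    `rotation` is a list of (per48_value, minutes_per_game) pairs.
    """
    return sum(per48 * mpg / 48.0 for per48, mpg in rotation)


# Hypothetical per-48 production for a 10-man rotation, best player first.
per48 = [0.300, 0.200, 0.150, 0.120, 0.100, 0.080, 0.050, 0.040, 0.020, 0.000]

# Made-up minute allocations, 240 player-minutes each.
regular_minutes = [34, 32, 30, 28, 28, 24, 20, 18, 14, 12]
playoff_minutes = [40, 38, 36, 34, 32, 26, 14, 10, 6, 4]

regular = team_production_per_game(list(zip(per48, regular_minutes)))
playoff = team_production_per_game(list(zip(per48, playoff_minutes)))

print(f"Regular season rotation: {regular:.3f} per game")
print(f"Playoff rotation:        {playoff:.3f} per game")
```

Same players, same per-minute quality, but the tightened rotation produces more because the minutes moved from the end of the bench to the top of the roster. A team whose value is spread evenly across ten guys gets much less of a bump.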
If I apply all these concepts and shake vigorously, the projected Lineups for the Conference Finals look like so:
The last part is to fire up the math, calculate win probabilities and feed it to my model. I am not posting the whole thing here but I will give it to you in picture form.
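Since the whole model is not posted, here is a minimal sketch of the general idea of turning per-game win probabilities into series odds. This is a generic best-of-seven calculation, not my actual model. It assumes each game is independent, and the function name, parameters and the example probabilities (60% at home, 45% on the road) are all hypothetical. The default schedule is the 2-3-2 Finals format, where the team with homecourt hosts games 1, 2, 6 and 7.

```python
from collections import defaultdict


def series_outcomes(p_home, p_road, home_games=(1, 2, 6, 7), wins_needed=4):
    """Return {(winner, length): probability} for a best-of-seven series.

    p_home / p_road: chance the homecourt team wins a game at home / on the road.
    home_games: game numbers hosted by the homecourt team (2-3-2 by default).
    """
    outcomes = defaultdict(float)
    # states[(a, b)] = probability the series stands at a wins to b wins,
    # where a counts the homecourt team's wins.
    states = {(0, 0): 1.0}
    for game in range(1, 2 * wins_needed):
        p = p_home if game in home_games else p_road
        next_states = defaultdict(float)
        for (a, b), prob in states.items():
            for a2, b2, step in ((a + 1, b, p), (a, b + 1, 1 - p)):
                if a2 == wins_needed:
                    outcomes[("home", game)] += prob * step
                elif b2 == wins_needed:
                    outcomes[("road", game)] += prob * step
                else:
                    next_states[(a2, b2)] += prob * step
        states = next_states
    return dict(outcomes)


# Hypothetical inputs: homecourt team wins 60% at home and 45% on the road.
odds = series_outcomes(0.60, 0.45)
for (winner, length), prob in sorted(odds.items(), key=lambda kv: -kv[1]):
    print(f"{winner} team in {length}: {prob:.1%}")
```

For the earlier rounds, passing `home_games=(1, 2, 5, 7)` gives the 2-2-1-1-1 format instead. The ranked list of outcomes is what lets you say things like "the actual result was the second most likely".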
2012 NBA Finals Pick:
We are breaking out all the stops today. Let’s start with the oracles. I asked the I-Ching:
Who will win the NBA Finals? The answer was “Heaven reflects the Flame of clarity:
The Superior Person analyzes the various levels and working parts of the social structure, and uses them to advantage.”
I’m thinking the Flame of clarity took her talents to South Beach. One Vote for the Heat then.
Let’s take a look at what everyone else seems to be looking at, schedule and opponent adjusted Point Margin.
For this one I take every game and calculate point margin after taking into account opponent, schedule and altitude. Once all the adjustments came in, OKC had the best point margin in the regular season. They were about a half point better than Miami. A couple of important points on this:
- Miami took the last 22 games off.
- Miami in the playoffs is slightly better than OKC.
- These numbers do not account for the fact that minutes allocation is way different in the playoffs. The smart coach plays his best players more. This is a big plus for Miami (see Fisher, Derek).
So point margin on the surface is a slight win for OKC. Throw in Home court and it’s an easy call for OKC. That’s a vote for OKC.
We know better though. As explained, we have to account for who is actually going to play. When we do that we get the following:
For the playoffs, I try to model things in multiple ways. Typically, I look at the full season, after the trade deadline and for the playoffs. If there is a major injury during the season I make the attempt to segregate the effect out (for example Dirk last year). For cases of major roster upheaval (as with the Celtics this year) I will tend to go with the roster that will show up to the series as much as possible.
For the Finals, all three models show Miami as the better team once we take out the trash on the end of their bench. The closest is the after-the-deadline model, where with homecourt OKC has a slight edge. You do remember me mentioning Miami taking the last 22 games off, right? So big edge here to Miami.
Let’s talk meta model. When I first look at all the playoff teams I like to look at a series of factors that I’ve found that all NBA champions (at least since the merger) have in common, to separate the teams that are truly in it from those that are fatally flawed. Both these teams, weirdly enough, came up flawed when compared against past champions. OKC is too young. They lack the veteran star that every champion has. They are exactly a year short. OKC in 2013 will have those vets in Durant and Harden. Miami failed the big man test. They lacked the dominant big who controls the paint, which is a hallmark of every championship team.
So a choice between Miami and OKC is a choice between too young and too small, right? Not so fast. By necessity, Miami discovered that the dominant big man they needed was already on the roster. LeBron James has stepped up in dramatic fashion to fulfill that role for Miami. LeBron was an unstoppable force of nature in the last two series. Combine that with the fact that Miami shows up as the better team on a neutral floor no matter how I slice it, and I have to think that history will reward the vets and disappoint the first timers once again. Given the 2-3-2 format the pick is then Road Team in 5.
Once all the precincts are in? We get Miami in 5.
That said? Given the format and the teams, there is a 60% chance the series goes longer. I’m kinda hoping that’s the case.
Enjoy the Finals.