Jeremy Lin and the Ghost of NBA Draft’s Past

“Science is the belief in the ignorance of experts.”
― Richard P. Feynman

Yes, before you ask: as is contractually required of any and all bloggers, I will be talking about the unlikely Jeremy Lin. Now, I know we touched on this yesterday, but our goal today is different. My take will be different. You see, rather than waxing poetic about the unbelievable and unpredictable nature of basketball or focusing on how no one could have seen this coming, I’m going to focus on how we kind of did.

Because when faced with a supposedly unsolvable problem, we brought the science and science once again beat the experts.

The problem I’m alluding to is evaluating talent in the NBA draft. Anyone who knows me knows I love to write about the draft. For those who don’t: hello, you must be new here. Just in case, let me illustrate that by throwing some links up for your viewing pleasure.

This led to a lengthy draft strategy segment in my guide to running an NBA franchise (Build me a winner rev.2).

The key takeaway was that I needed to build an effective draft model to predict player performance based on publicly available data. I built two (go here for the model build, part 1 and part 2). In very general terms, the models use the available data to predict future performance for each player coming into the draft from college. Based on that prediction, a ranking is done and a draft recommendation is generated.

Now this model is a work in progress: I build it, publish it, then go back at some future point to review whether it worked. I will make corrections as needed over time.

One of the key ideas is having a public build to allow for peer review and answer the skeptics.

For the purposes of this discussion I will focus on the last 2010 build (see here) because, at the request of some of our loyal readers, I had included the best undrafted rookies. Can you guess who was number one?

Do you want to answer for the class?

Mr. Lin was actually the tenth-ranked prospect overall on our draft board and easily the best undrafted. The model had him slightly below the draft threshold. Given this and a few other similar data points, I moved the threshold down slightly, to .090 WP48 for Model #1 and .060 WP48 for Model #2. You will see the results of that in the numbers that follow.

Why should you care exactly?

It’ll make more sense if I just give you the full story:

That’s every drafted player coming from the NCAA from 1995 thru 2010 who’s played at least 400 minutes in the NBA (2010 shows additional players who haven’t played those minutes yet). It shows the player’s draft year, where he was picked, the model predictions and the player’s actuals for his first 4 years and his career. For 2010, for example, we can see both of the Knicks’ starting guards in the top 10, but this could simply be coincidence. Did the models actually do anything?

A simple test is to look at the correlation between where a player was actually picked, where the models suggested picking him, and where he actually ranks within his draft class in terms of production. Draft order vs. production shows minimal correlation, with an R-squared of about 5%. It jumps to 25% for the predicted production rank.
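For anyone who wants to run the same kind of check, here is a minimal sketch of an R-squared calculation in Python. The ranks below are made up purely for illustration (they are not the draft data from this post), and `r_squared` is a helper name I am assuming for the example.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return cov * cov / (var_x * var_y)


# Made-up ranks for five players in one draft class (illustration only).
actual_rank = [1, 2, 3, 4, 5]   # rank by actual production
draft_slot = [3, 5, 1, 2, 4]    # order in which they were picked
model_rank = [2, 1, 3, 5, 4]    # order a model would have picked them

print(f"draft order vs production: {r_squared(draft_slot, actual_rank):.2f}")
print(f"model rank vs production:  {r_squared(model_rank, actual_rank):.2f}")
```

With real data you would run this once per draft class (or pool the classes) and compare the two numbers, which is all the 5% vs. 25% comparison is doing.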

A more complex and interesting test is to look at:

  • The probability of landing a better than average player (>.090 WP48)
  • The probability of landing a good player (>.150 WP48)

If I do this for all picks by the models, for all draft picks, and for model picks taken after the top 5 picks, I get:

The models perform as well as or better than the majority of lottery picks. The only real difference is superstar talent at the number one pick (which isn’t really an every year affair).

So to review: using publicly available data, we built a model that picks draft winners at a 75% rate, which is better in general than having the #1 pick in the draft, and big winners at a 40% rate, which is better than everything but the #1 pick.

Science!

-Arturo

P.S. How about one more bonus table?

 

 

The biggest losers in the NBA

Losing is a disease as contagious as bubonic plague attacking one but infecting all.
-The Natural (1984)

.0986 Wins Produced per 48 minutes or thereabouts.

What’s that, you ask? The average production of a player in the NBA for the 2011-12 season, through all games played as of February 9th. Why should I care? Because that’s what your team and your players need to beat to be a winning team.

That is the top 6 in terms of minutes for every team in the league as of the morning of February 10th, 2012. Please note the column in the middle called point margin per game. This column has a very simple explanation. Take for example Mr. Nick Young of the Washington Wizards. His point margin per game is -3 points. What does that mean exactly? It means that if Mr. Young plays a game for your team at his average of 30.7 minutes per game, at his average level of performance (-.054 wins per 48 minutes), against average opposition (.0986 wins per 48 minutes), his team, the aforementioned Wizards, will be three points in the hole at the end of that game solely because of that player’s efforts.
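As a rough check on that number, here is a small Python sketch of the arithmetic. It assumes a conversion of roughly 31 points of margin per 1.000 of WP48 above average, which is my reading of the team-level math used elsewhere on this blog rather than something stated in this post; the function and constant names are hypothetical.

```python
LEAGUE_AVG_WP48 = 0.0986      # league-average WP48 quoted above
POINTS_PER_UNIT_WP48 = 31.0   # assumed conversion from WP48 to points per 48 minutes


def point_margin_per_game(wp48, minutes_per_game,
                          avg_wp48=LEAGUE_AVG_WP48,
                          points_per_unit=POINTS_PER_UNIT_WP48):
    """Points a player adds (or costs) per game relative to an average player."""
    margin_per_48 = (wp48 - avg_wp48) * points_per_unit
    return margin_per_48 * minutes_per_game / 48.0


# Nick Young: -.054 WP48 in 30.7 minutes per game.
nick_young = point_margin_per_game(-0.054, 30.7)
print(f"Nick Young: {nick_young:+.1f} points per game")
```

Under those assumptions the sketch lands at about three points in the hole per game, in line with the table.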

Simply put, playing this particular player is a losing proposition. The interesting bit is that he is not alone. Let’s talk losers.

Here come the losers (Image by Jack "The King" Kirby)

Let’s look at the 20 worst offenders in the league at this point, by game and for the season.
How exactly does this kind of thing happen? Of the 30 teams in the league, only 16 show up on this list of ignominy. Let’s focus on them.

8 teams show up with one player (Lakers, Raptors, Thunder, Wolves, Mavericks, Clippers, Bucks and Heat).  A few of these are understandable.

  • World Peace is generally undervalued by the model as his contributions are hard to measure. The Lakers get a bye for keeping the Peace.
  • Perk is in a similar category for the Thunder.
  • Wesley Johnson is young and still has room to grow for the Wolves. Norris Cole gets the same pass for Los Heat.
  • Odom is missing the beach apparently and playing way below standard for the Mavs. I don’t expect this to continue indefinitely.
  • Brian Cook is tenth on the rotation for the Clippers and can be considered a warm body at this point.
  • The last two in this group (DeRozan for the Raptors and Captain Jack for the Bucks) are indefensible. Captain Jack should not be #2 on the depth chart for a team with playoff aspirations at this point in his career. As for DeRozan? If he is your number 1 option in the rotation, you have real problems. Then again, sadly, this is an old story for the Raptors.

There are 6 teams with 2 players in the Losers’ bracket:

  • Cleveland: It could be argued that they need to have Jamison out there to sell. Mychel Thompson falls under the warm body category to me.
  • Detroit: A rookie (Knight) and yet another warm body (Daye).
  • Houston: A rookie deep in the rotation (Morris) and a player in Scola who seems to be in a precipitous decline.
  • New Jersey: Two bench players on a bad team in Shawne Williams and Okur, who like Scola may be in rapid decline.
  • New York: Amare, who I will continue to argue is totally miscast in the Melo iso show and could be decent with a better PG (say, a Harvard grad), and Toney Douglas, who is probably out of a job.
  • Orlando: On behalf of all Celtic fans, I’d like to thank the Magic front office for taking Baby off our hands. God bless you and protect you. You’ll need it. Larry Hughes is there collecting a paycheck.

The last two teams on this list are just terrible with three and four players on this list:

  • My old friends the Wizards have two players in Nick Young and Jordan Crawford who at this point, based on the data, will never reach the level expected from a bench player on a good NBA team. Andray Blatche might reach that level but he should not be drawing more than the NBA minimum.
  • The Bobcats, my lord, the Bobcats are a mess. Cory Higgins is a deep-in-the-rotation rookie, and given the talent drought on this team they should really be playing him more to see if they have anything. Byron Mullens is another unknown quantity, so they might as well play him. Corey Maggette may be ready for the glue factory at this point after a respectable (and long) twelve year career. Tyrus Thomas probably needs out of the Queen City.

At the end of the exercise, what can we conclude? There are a few key reasons why these lovable losers get playing time on these teams. Most are understandable (or at least come with a good story): they’re defensive contributors on good teams, they’re having an off year, they’re losing the battle with Father Time, they’re dealing with injuries, they’re inexperienced rookies or unknown quantities getting their shot at the big time. There are only a few cases of truly bad management in the bunch. To me, the Raptors, Magic and Wizards are all guilty of throwing good money after bad and doing it consistently, year in and year out. They pay, play and trade for bad players. Were I a fan of these teams, I would be very pessimistic about the near future prospects of my team.

-Arturo

 

The MVP Race so far: The King and I

Grady Fuson: Artie, who do you like?
Scout Artie: I like Perez. He’s got a classy swing, it’s a real clean stroke.
Scout Barry: He can’t hit the curve ball.
Scout Artie: Yeah, there’s some work to be done, I’ll admit that.
Scout Barry: Yeah, there is.
Scout Artie: But he’s noticeable.
Matt Keough: And an ugly girlfriend.
Scout Barry: What does that mean?
Matt Keough: Ugly girl friend means no confidence.
Scout Barry: Okay.
John Poloni: Oh, now, you guys are full of it, Artie’s right. This guy’s got an attitude and an attitude is good. I mean it’s the kind of guy who walks into a room his dick has already been there for two minutes.
Scout Pote: He passes the eye candy test. He’s got the looks, he’s great at playing the part. He just needs to get some playing time.
Matt Keough: I’m just saying his girlfriend is a six at best.

– Moneyball

Perhaps my favorite scene in the movie Moneyball, this illustrates beautifully the ignorance of the select group of people (scouts) who are supposed to be in the know. This is a recurring theme in sports and in life: years of experience and know-how boil down to how we focus on our feelings and our gut. We miss the truth that is right in front of us because we rely on fallible perceptions.

This leads to the kind of inefficient marketplaces for talent that Billy Beane and the Oakland A’s managed to exploit, at least for a while, thru the judicious use of math and economics.

Perhaps my favorite case in point is the NBA MVP race (which long time readers may note is a particular obsession of mine). I’m impatient as ever to get started so let’s make it short and sweet.
A picture is worth a thousand words, no? That is a word cloud illustrating the top 6 players in terms of minutes played for each team. Each player is colored using his team’s colors and sized based on the Wins Produced (all numbers from NBA Geek) for his team per game played. So my MVP evaluation takes into account production and playing time to ascertain who is truly the Most Valuable.

A Familiar Face at the top

Unsurprisingly, LeBron sits on the throne. In fact, the top 5 for this year so far (LeBron, Chandler, Paul, Love and Howard) features only one new face in Mr. Chandler. Howard has dropped from his lofty perch at the end of last season but there may be extenuating circumstances.

Ships passing in the night?

Still here? I gave you a neat infographic, some nice pictures and a straightforward conclusion. Seems like more than enough to me. Do you expect me to give you a table illustrating the top 6 for each team?

Oh, all right.

-Arturo

The Variable season: NBA Rankings for 2012 thru 2/2/12

“Those who are victorious plan effectively and change decisively. They are like a great river that maintains its course but adjusts its flow.”
— Sun Tzu

A third of the way into this wild ride of a season, let’s take stock.

I am certain that the one constant for this NBA season will be change. The schedule, the teams, the timing and the rotations have all been deeply impacted by the cram that is the lockout-shortened 2011-2012 season. Strange doings will abound. Up will become down, down will become up, Bargnani will play well and so will the Clippers. Portland will unfathomably remain injury free almost to the All-Star break. Granted, some universal truths will hold: the Knicks will still overpay scorers, the Nuggets and Jazz will kill people on back to backs and Michael will remain on the golf course.

Stay on the course Mike, you can't really make any difference at the office.

The trick then lies in identifying that which is true and that which was true and using that to reach some conclusions.

The most basic truth in basketball remains: the winner is the team that outscores the opponent. Point margin is the truest currency of team value that there is. In fact, it’s not only clear, it’s simple math (full detail is Here).

Games won correlate with average point margin per game, which in turn correlates with Wins Produced, or:

Team Win % = Team WP48= (Avg Point Margin for Team (season))/31 +.500

Granted as with any expression of value there are some significant external factors to consider that can obscure the truth. Not all ten point wins are created equal and neither are all teams that outscore their opponents. Let’s review them shall we?

Image illustrating the reaching of unsustainable conclusions courtesy of xkcd.com

Factor #1: The law of large numbers (LLN) describes the result of performing the same experiment a large number of times. It’s a simple enough theorem: the average of the results obtained from a large sample (or number of trials) gets closer and closer to the real value of something the larger the sample gets. Conversely, the error (or more accurately the possibility of it) gets larger and larger the smaller the sample. What does this mean for us exactly?

The higher the number of games a particular team plays in its current configuration the more likely we are to know how good they are. A good game is a data point. A series of good games is a possible trend. A good season? A fact.
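One way to put rough numbers on this: if we treat each game as an independent coin flip with a fixed win probability (a simplifying assumption of mine, not something the rankings rely on), the standard error of an observed winning percentage shrinks with the square root of games played. A minimal Python sketch, with a hypothetical helper name:

```python
import math


def win_pct_standard_error(true_win_pct, games):
    """Standard error of the observed win % over `games` independent games."""
    if games <= 0:
        raise ValueError("games must be positive")
    if not 0.0 <= true_win_pct <= 1.0:
        raise ValueError("true_win_pct must be between 0 and 1")
    return math.sqrt(true_win_pct * (1.0 - true_win_pct) / games)


# How much noise surrounds a true .600 team at various points in a season.
for games in (10, 22, 66, 82):
    se = win_pct_standard_error(0.600, games)
    print(f"{games:>2} games: observed win % is off by about +/- {se:.3f}")
```

The point is only directional: a ten game stretch says far less about a team than a full season does.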

Factor #2: Homecourt Advantage (originally seen Here)

Some teams do better than others:

Altitude and rest days affect the Homecourt advantage (HCA) and they interact with one another. Average HCA is at 59.9%. Altitude is directly proportional to HCA. Rest days are a little stranger. Altitude directly interacts with rest. Denver and Utah kill teams at home if they have a rest edge but they get killed themselves if the other team is coming in with at least a two day rest edge.

It adds up and must be accounted for.

Factor #3: Strength of Schedule

The Washington Wizards are 4-19. Beating them is not the same as beating the Miami Heat, or even the Chicago Bulls, or the Sixers for that matter. All wins are not created equal. Opponents matter. We will account for that.

The Rankings as of 2/02/2012

Point Margin, check. Homecourt, check. Strength of schedule, check. Are we missing anything? Of course: the game data. God Bless Basketball Reference.

Now let’s put this all together and make a ranking. As before, I will work out the following numbers:

  • Point Margin per Game: (Pts scored by team - Pts scored by opponent) divided by games played.
  • Home court Point Margin per Game: Point Margin per Game due to the schedule and homecourt advantage.
  • Adjusted Point Margin per Game: Point Margin per Game - Home court Point Margin per Game. Schedule independent point margin (neutral site at sea level).
  • Adjusted Opponent Point Margin: The average Point Margin per Game of a team’s opponents.
  • Real Point Margin (RPM): Point Margin per Game - Home court Point Margin per Game + Adjusted Opponent Point Margin. Expected Point Margin at a neutral site against perfectly average opposition. This is the number I use to rank.
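The definitions above chain together in a straightforward way. Here is a minimal Python sketch of that chain; the `TeamLine` class, its field names and the sample numbers are all hypothetical, and the home court and opponent terms are taken as given inputs rather than derived.

```python
from dataclasses import dataclass


@dataclass
class TeamLine:
    points_for: float          # total points scored by the team
    points_against: float      # total points scored by opponents
    games: int                 # games played
    home_court_margin: float   # per-game margin credited to schedule/homecourt
    opponent_margin: float     # average per-game point margin of opponents

    def point_margin(self):
        """Point Margin per Game."""
        return (self.points_for - self.points_against) / self.games

    def adjusted_point_margin(self):
        """Schedule-independent margin (neutral site)."""
        return self.point_margin() - self.home_court_margin

    def real_point_margin(self):
        """Expected margin at a neutral site vs. average opposition."""
        return self.adjusted_point_margin() + self.opponent_margin


# Invented numbers for a team 23 games into the season.
team = TeamLine(points_for=2300, points_against=2208, games=23,
                home_court_margin=0.5, opponent_margin=-0.3)
print(f"PM {team.point_margin():+.1f}, "
      f"adjusted {team.adjusted_point_margin():+.1f}, "
      f"RPM {team.real_point_margin():+.1f}")
```

A team that has feasted on a home-heavy schedule against weak opponents sees its raw margin trimmed on both counts, which is the whole purpose of the adjustment.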

A key difference is that I will now be looking for trends. Ideally, I’d like to see the average real point margin for the season as well as rolling 10 game samples for the season to date. That might look something like this:

(Editor’s Note: I guess I am really out of practice. I screwed up the following two tables initially; they’re fixed now. Really need to stop hanging out with Dirk in the offseason :-) )

 

Fixed now !


Lots of cool data there, but what can we conclude from it? The next step is to take the expected Real Point Margin for the season and average it with the Real Point Margin for the last ten games. Rank everyone accordingly. Look for and identify trends. Make a nice table:


Given the history of the NBA, there are currently ten teams in contention. How do I figure? All but one of the teams that have won the title since 1980 won at least 52 games, or 63% of their games (the lone exception being Houston in 1995, and they swung a critical late season deal to add Clyde Drexler).
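That 52-win bar can be translated into a point margin using the win % equation from earlier in this post. A quick Python sketch; the 31-point factor comes from that equation, while the function name and the conversion to a 66 game schedule are my own illustration:

```python
POINTS_PER_FULL_WIN_PCT = 31.0  # from Team Win % = margin / 31 + .500


def margin_for_win_total(wins, games=82):
    """Average point margin implied by a given win total."""
    win_pct = wins / games
    return (win_pct - 0.5) * POINTS_PER_FULL_WIN_PCT


contender_win_pct = 52 / 82
contender_margin = margin_for_win_total(52)
# The lockout-shortened 2011-12 schedule has 66 games per team.
wins_in_66_game_season = contender_win_pct * 66

print(f"contender win %: {contender_win_pct:.3f}")
print(f"implied margin:  {contender_margin:+.2f} points per game")
print(f"66 game pace:    {wins_in_66_game_season:.1f} wins")
```

In other words, a contender by this standard outscores opponents by a bit more than four points a night.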

Let’s review those results starting with that top ten.

  • Philly, OKC and Miami as your top three should be unsurprising, with OKC improving and the Sixers and Heat cooling down.
  • Four thru eight provide surprises, with the Blazers, Grizzlies, Nuggets and the Zombie Celtics suddenly charging to the top of the list.
  • Chicago and Dallas are both slowing down at 7 and 9 and the Hawks remain about the same to round out the top 10.
  • Outside the top 10, Milwaukee and Indiana are improved to the point of contending for a playoff spot. The Warriors have improved their way into a worse chance in the lottery. The Magic are basically a condemned building at this point. The Bobcats are looking forward to a tough relegation battle with Toronto, the Kings, Wizards and Pistons.

So a third of the way thru, only a third of the teams are currently in the hunt. Health (San Antonio), age (Lakers), contracts (Orlando), and just time playing together (Clippers) have played a role in knocking some teams off.

But this could all change very, very quickly.

We’ll be sure to keep you posted.

-Arturo

Arturo’s Awesome Primer: Everything you need to know about the 2011-2012 NBA Season

“If you thought that science was certain – well, that is just an error on your part.”
— Richard P. Feynman

Well, it’s been an interesting season so far. Teams bubbling up. Teams crashing down. As always, it’s human nature to rush off and make dramatic pronouncements particularly when you want to tell a good story.

Reality is a good deal more complicated than that.

The law of large numbers (LLN) describes the result of performing the same experiment a large number of times. It’s a simple enough theorem: the average of the results obtained from a large sample (or number of trials) gets closer and closer to the real value of something the larger the sample gets. Conversely, the error (or more accurately the possibility of it) gets larger and larger the smaller the sample. What does this mean?

Let’s not get ahead of ourselves. Rushing to judgement based on a small sample is premature. A larger sample size is called for before we can make any definitive conclusions.

We can however use what we know to maximize what we can actually learn from the current sample. And that is precisely what I’ve been doing with my time.

We are going to have some fun today.

Because today is the day I put together all that I’ve learned about point margins, homecourt advantage and strength of schedule, and give you Team Rankings.

Let’s start with recapping what we know.

Warning: Science ahead!

Point Margin (The Win Cheat Sheet and Point Margin Produced Rev 1.1 (originally seen Here))

I’ve previously shown that on a game to game basis Wins Produced correlates at 99.8% with point margin (Point Margin for a game = 0.0377 + 15.5 * Wins Produced for that game), and for the season a 95% correlation has been shown repeatedly (the difference is down to blowouts). The gist of it is that Wins Produced for a team correlates with the team’s average point margin, which correlates strongly with games won.

Do some additional maths and you can come up with some nice and nifty equations:

Expected Avg Point Margin for Team (season) = 31*(Wins Produced (team for the season) -41 )/82

Wins Produced (team for the season) = (Expected Avg Point Margin for Team (season)*82)/31 +41

Team Win % = Team WP48= (Avg Point Margin for Team (season))/31 +.500

And

+1 Points per game = 2.645 wins over .500 (43.645 wins)

+3.1 Points per game = 10% increase in Winning %

+10 Points per game= 26.45 wins over .500 (67.45 wins)

+1 WP = +.378 Points per game

+10 WP = +3.78 Points per game

And for Players:

Point Margin Produced per 48= (WP48-.099)* 31.1
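The team-level rules of thumb above all fall out of the same 31-points-per-1.000-of-win-% conversion. Here is a short Python sketch that reproduces them; the function names are mine, and an 82 game season is assumed throughout.

```python
def wins_from_margin(avg_margin, games=82, points_per_full_win_pct=31.0):
    """Expected season wins for a team with the given average point margin."""
    return avg_margin * games / points_per_full_win_pct + games / 2


def margin_from_wins(wins, games=82, points_per_full_win_pct=31.0):
    """Expected average point margin for a team with the given win total."""
    return points_per_full_win_pct * (wins - games / 2) / games


def win_pct_from_margin(avg_margin, points_per_full_win_pct=31.0):
    """Expected winning percentage for the given average point margin."""
    return avg_margin / points_per_full_win_pct + 0.5


print(f"+1 ppg  -> {wins_from_margin(1):.3f} wins")
print(f"+10 ppg -> {wins_from_margin(10):.2f} wins")
print(f"+3.1 ppg -> {win_pct_from_margin(3.1):.3f} win %")
print(f"+1 win  -> {margin_from_wins(42):+.3f} ppg")
```

Each printed value should line up with the corresponding rule of thumb, since they are the same equations rearranged.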

Homecourt Advantage (The Unfair Advantage (originally seen Here))

The basic equation goes something like this:

Probability of home team winning a game (Win %)

= (Projected Wins Home Team - Projected Wins Road Team)/82 + .606

= (Proj. Home Team Win % - Proj. Road Team Win %) + Homecourt Advantage (.606)

This is the simple equation I came up with for the home team winning a single game. The base assumption is that, based on the data set (all regular season games from 1999 thru 2008), the home team wins 60.6% of the time. This was good and worked fairly well. As I got older and wiser (or at least more creaky), I decided to add some more factors in:

  • Add in the effect of rest days and back to backs.
  • Add in the effect of altitude
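Before those extra factors are layered in, the base equation can be sketched in Python. The clamp to the 0-1 range is my own addition (the linear formula can overshoot for very lopsided matchups), and the function name is hypothetical:

```python
def home_win_probability(home_proj_wins, road_proj_wins,
                         games=82, base_home_rate=0.606):
    """Base chance the home team wins, before rest and altitude adjustments."""
    raw = (home_proj_wins - road_proj_wins) / games + base_home_rate
    # Clamp so the result is always a valid probability (not in the original formula).
    return min(1.0, max(0.0, raw))


# Two evenly matched teams: the home side wins at the league base rate.
print(f"41 vs 41 wins: {home_win_probability(41, 41):.3f}")
# A projected 50 win team hosting a 41 win team.
print(f"50 vs 41 wins: {home_win_probability(50, 41):.3f}")
```

Rest and altitude would then replace the flat `base_home_rate` with a value that depends on the venue and on each team's days off.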

I did some maths and figured the homecourt advantage in each scenario over playing at a neutral site. For this post I went even further and figured out the value of that advantage in points (using the handy-dandy equations in the previous section):

In summary, both altitude and rest days affect the Homecourt advantage (HCA) and they interact with one another. Average HCA is at 59.9%. Altitude is directly proportional to HCA. Rest days are a little stranger. Altitude directly interacts with rest. Denver and Utah kill teams at home if they have a rest edge but they get killed themselves if the other team is coming in with at least a two day rest edge.

It just so happens that this kinda adds up. Apply that to a regular season played by identical clones and you get:

So if I assume all teams are equal, Utah and Denver both get a 10% boost in winning percentage when they play at home. This is good for four extra wins a season versus the average.

Strength of Schedule

Simple logic here. The Washington Wizards (hi Ted!!) are not the Miami Heat, or even the Chicago Bulls, or the Sixers for that matter. Wait, I’m getting a little ahead of myself.

All wins are not created equal. Opponents matter. We will account for that. A typical NBA schedule (I used 2010 here) confers Home court as follows over the course of a whole season:

So Utah and Denver get a four point edge over the Lakers, Clippers, Mavs, Rockets, Wizards, Warriors and Celts. This advantage goes away for the most part in the playoffs.

Some of those playoff losses make a wee bit more sense

The Rankings as of 1/11/2012

So, Point Margin, check. Homecourt, check. Strength of schedule, check. Are we missing anything?

Of course, the game data. God Bless Basketball Reference.

Now let’s put this all together and make a ranking. I will work out the following numbers:

  • Point Margin per Game: (Pts scored by team - Pts scored by opponent) divided by games played.
  • Home court Point Margin per Game: Point Margin per Game due to the schedule and homecourt advantage.
  • Adjusted Point Margin per Game: Point Margin per Game - Home court Point Margin per Game. Schedule independent point margin (neutral site at sea level).
  • Adjusted Opponent Point Margin: The average Point Margin per Game of a team’s opponents.
  • Real Point Margin (RPM): Point Margin per Game - Home court Point Margin per Game + Adjusted Opponent Point Margin. Expected Point Margin at a neutral site against perfectly average opposition. This is the number I use to rank.
  • Neutral Site Win %: RPM/31 + .500
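Once each team has an RPM, the ranking and the Neutral Site Win % are mechanical. A minimal Python sketch, using placeholder team names and invented RPM values rather than the actual rankings:

```python
def neutral_site_win_pct(rpm):
    """Neutral Site Win % = RPM/31 + .500."""
    return rpm / 31.0 + 0.5


def rank_teams(rpm_by_team):
    """Return (team, rpm, neutral-site win %) tuples, best RPM first."""
    ranked = sorted(rpm_by_team.items(), key=lambda item: item[1], reverse=True)
    return [(team, rpm, neutral_site_win_pct(rpm)) for team, rpm in ranked]


# Placeholder teams and RPM values, for illustration only.
example = {"Team A": 6.2, "Team B": -3.1, "Team C": 0.0}

for place, (team, rpm, win_pct) in enumerate(rank_teams(example), start=1):
    print(f"{place}. {team}: RPM {rpm:+.1f}, neutral site win % {win_pct:.3f}")
```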

This is meant as a measure of just how strong each team projects to be based on the data of the season to date. We still need to account for injuries and incorporate what we know of player historical performance. We will address this in a, say it with me, future post.

A few notes:

  • The Sixers look totally legit by any definition. They in fact form a tight group of three (Philly, Chicago and Miami) at the top that must be considered the favorites for the title at this point.
  • Atlanta is the only eastern team in the 2nd tier with Portland, the Lakers, Clippers, Nuggets and Thunder in a logjam out west. Denver, with their unfair advantage, has a stellar shot at the #1 seed out west (barring acts of god or George Karl).
  • The bottom of the East is putrid. Memphis, the second worst team in the West, would be 7th in the East.
  • Some playoff teams from 2011 that look cooked: Hornets, Grizzlies and my beloved Celts.

And before we go, let’s attempt to add the effect of schedule back in to the equation:

As always, the schedule breaks greatly in favor of the Nuggets who look good for the #1 seed in the West. Keep in mind I’m using older schedules for this. Special accounting will have to be done for this season.

In the future, of course. The very uncertain future.

A future where this just might be a critical playoff matchup.

-Arturo