The Inconsistent Quarterback Story Told Again in Less than 3,000 Words

The Wages of Wins discusses how performance can be measured in both the NBA and NFL.  The Wages of Wins Journal, though, almost exclusively focuses on the NBA.   Why isn’t performance in the NFL discussed more frequently?  The answer to this question can be illustrated by comparing the play of Jay Cutler and Kyle Orton. 

Cutler and Orton Defy the Pundits

The Chicago Bears finished the 2008 season with a 9-7 record, a mark that fell just short of qualifying for the playoffs.  In discussing Chicago’s problems, people tended to focus on the team’s quarterback.  As Table One reports, Kyle Orton – the Bears’ starting quarterback in 2008 – was ranked 25th (out of 32 quarterbacks) in both the NFL’s QB Rating system and the Wages of Wins metrics (i.e. QB Score, Net Points, Wins Produced).

Table One: Final Quarterback Rankings for 2008

In the offseason it became clear that Jay Cutler – a player who ranked 7th in Net Points per Play (and Wins Produced per 100 plays or WP100) – was available.  So the Bears sent Kyle Orton – plus two first round draft picks and a third round pick – to the Broncos for Cutler.  Fans of the Bears rejoiced at this move.  And fans of the Denver Broncos became very, very angry.  In the pre-season the views of both groups of fans were confirmed.  The Bears finished the exhibition season with a 3-1 mark, while the Broncos – led by a less than impressive Orton – finished 1-3.  Many NFL pundits were heard expressing the conventional wisdom:  You simply don’t trade away a “franchise” quarterback. 

And then the real games were played.  As December begins, the Broncos are 7-4 while the Bears are 4-7.  When we look at each quarterback’s stats – reported in Table Two – we see that the 2008 result has been essentially reversed.  Orton now ranks 9th in the NFL in Wins Produced per 100 plays (WP100) while Cutler is ranked 25th. 

Table Two: Week Twelve Quarterback Rankings in 2009

The reversal in the ranking of these two quarterbacks is hardly unique.  Nine of the quarterbacks ranked in the top 10 this year qualified for the rankings last year.  Of these nine, only four – Drew Brees, Peyton Manning, Philip Rivers, and Matt Schaub – were ranked in the top ten at the end of last year.  And we see the same story at the bottom of the rankings.  Seven of the players ranked in the bottom ten qualified for the rankings last year.  Of these, only two – JaMarcus Russell, and Derek Anderson – ranked in the bottom ten in 2008.

Despite such inconsistency, fans of the NFL – and apparently at least some decision-makers – can be impressed by a quarterback’s past numbers.  Consequently, the Bears can be tempted to give up three draft picks and a starting quarterback for an apparent “franchise” signal caller.  And the Chiefs can give up a second round pick and significant dollars for Matt Cassel (currently ranked 26th). 

The problem facing decision-makers in the NFL is the numbers – which are often cited – don’t tell us very much about the future performance of a quarterback.  A quarterback’s statistics depend on his teammates and the quality of his coaching.  Change the teammates and coaches and you often see the numbers change as well.  Unlike basketball – where player statistics are remarkably consistent from season to season – football numbers suffer from very significant interaction effects.  This means those numbers – which told us that Cassel and Cutler are “great” quarterbacks – may not tell us much about what these quarterbacks will do when these players change teams.

And it’s important to note that this isn’t just some numbers or some quarterbacks.  Less than 25% of a quarterback’s completion percentage and passing yards per attempt are explained by what the quarterback did with respect to these statistics last season.  Less than 10% of touchdowns per pass attempt this season are explained by last year; and when we turn to interceptions per attempt, explanatory power falls to less than 2% (these results come from an examination of 399 quarterbacks who played consecutive seasons from 1994 to 2007).  When we turn to measures such as QB Score, the NFL’s quarterback rating, or the numbers at FootballOutsiders.com, again we see inconsistency (explanatory power is less than 20%). 
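To put these percentages in more familiar terms, here is a minimal sketch that converts each share of explained variation into the largest season-to-season correlation it allows.  It assumes that “explained” refers to the R-squared of a simple regression of this season’s value on last season’s value (with a single predictor, R-squared is the squared correlation).  The dictionary name and labels are only illustrative.

```python
import math

# Upper bounds on explanatory power quoted above, as fractions.
# Assumption: each is the R-squared of a one-predictor regression
# (this season's value regressed on last season's value).
max_r_squared = {
    "completion pct / yards per attempt": 0.25,
    "touchdowns per attempt": 0.10,
    "interceptions per attempt": 0.02,
    "QB Score / QB Rating / Football Outsiders": 0.20,
}


def max_abs_correlation(r_squared):
    """Largest |r| consistent with a given R-squared (one predictor)."""
    return math.sqrt(r_squared)


max_r = {stat: max_abs_correlation(r2) for stat, r2 in max_r_squared.items()}

for stat, r in max_r.items():
    print(f"{stat}: |r| below {r:.2f}")
```

Under that assumption, even the most stable of these statistics has a year-to-year correlation below 0.5, and interceptions per attempt come in below roughly 0.14.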

Such results tell us that what we see from Cutler and Orton in 2008 and 2009 should not be surprising.  Predicting performance of quarterbacks in the NFL is simply very difficult (and this is not just the story I tell, but also the story told by Brian Burke at Advanced NFL Stats).

This is really a fascinating story.  But the story was essentially told in The Wages of Wins.  And I told it again during the 2006, 2007, and 2008 NFL seasons.   Consequently, this is what I said towards the end of my discussion of the final quarterback rankings in 2008: “…the measurement of performance in football really only tells one story.  The interaction effects in football cause the performance statistics to be inconsistent.  So the players we see perform well today are not necessarily going to perform well tomorrow.  Although I like telling that story, it’s really about all I ever say about the NFL. Consequently, this very long post … might be my last post on football.”

Looking at the NFL Draft Again

But now another aspect of this story has sparked some interest.  Rob Simmons and I recently wrote an academic article examining the relationship between where a quarterback is selected in the draft and how he performs in the NFL.  For many the results were surprising.  As Rob and I report, where a quarterback is taken in the draft is not related to how that quarterback performs in the NFL. 

Once again… it’s difficult to predict the future performance of NFL quarterbacks.  On draft day NFL decision-makers have an even more difficult challenge.  People in the NFL must project how well a quarterback will play in the NFL before he ever plays with — and against — NFL talent.  Now if predicting performance of actual NFL quarterbacks is hard, what should one expect to see when it comes to projecting performance of quarterbacks that are not in the NFL?

Well, here is what Rob and I found.

1.  We did find several factors that predict where a quarterback will get drafted.  Specifically, we find that taller, faster, and smarter (i.e. better Wonderlic scores) quarterbacks get drafted first. 

2. The factors that predict draft position, though, don’t predict NFL performance. 

3. Given this result, we shouldn’t be surprised that where a quarterback is drafted doesn’t predict how well a quarterback will perform in the NFL.

This is how point #3 was described a few days ago:

… here is a sample of what we found.  After a quarterback has played five seasons in the NFL (minimum 500 career plays), here are the correlation coefficients between draft position and various career statistics:

Completion Percentage: -0.01

Passing Yards per Pass Attempt: -0.02

Touchdowns per Pass Attempt: -0.12

Interceptions per Pass Attempt: 0.00

QB Score per Play: -0.01

Net Points per Play: -0.02

Wins per Play: -0.02

QB Rating: -0.06

Directly below this data — and I mean, directly below this data – I wrote the following sentences:

Our data set runs from 1970 to 2007 (adjustments were made for how performance changed over time). We also looked at career performance after 2, 3, 4, 6, 7, and 8 years.  In addition, we also looked at what a player did in each year from 1 to 10.  And with each data set our story looks essentially the same.  The above stats are not really correlated with draft position.
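For a sense of scale, the correlations quoted above can be squared to give the share of variation in each statistic that moves (linearly) with draft position.  This is only a sketch of that arithmetic using the eight coefficients listed; the variable names are mine and nothing here re-runs the underlying study.

```python
# Correlation between draft position and each career statistic,
# copied from the list quoted above.
draft_correlations = {
    "Completion Percentage": -0.01,
    "Passing Yards per Pass Attempt": -0.02,
    "Touchdowns per Pass Attempt": -0.12,
    "Interceptions per Pass Attempt": 0.00,
    "QB Score per Play": -0.01,
    "Net Points per Play": -0.02,
    "Wins per Play": -0.02,
    "QB Rating": -0.06,
}

# Squaring a correlation gives the fraction of variance shared
# between the two variables in a simple linear relationship.
variance_shared = {stat: r ** 2 for stat, r in draft_correlations.items()}

# Find the statistic most closely tied to draft position.
largest = max(variance_shared, key=variance_shared.get)

for stat, share in variance_shared.items():
    print(f"{stat}: {share:.2%}")
print("Largest:", largest)
```

Even the strongest relationship in the list, touchdowns per pass attempt, corresponds to less than 2% of the variation.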

We should note that although draft position and performance are not related – and our story is the same regardless of when we look at the relationship — draft position and salary are clearly correlated.  To illustrate, JaMarcus Russell has collected millions of dollars to play quarterback in the NFL.  But he clearly has not performed at a level consistent with all those dollars.  And a similar story can be told about David Carr, Ryan Leaf, Tim Couch, Joey Harrington, etc…  Quarterbacks who are drafted early clearly get paid more. They just don’t seem to perform any better.

Reacting to Some Reactions

There have been a few reactions to this result that I would like to address.  Here is a sample of what I have seen.

1. A problem with reading comprehension

Let me start with a response that suggests people don’t always read what’s being said. Despite the sentences I highlighted above, I have read statements like the following (this is comment #10 on Jason Lisk’s post at Pro-Football-Reference.com from one of the bloggers that Steven Pinker cited):

The Berri choice to exclude QBs who didn’t play five years in the league is a pretty fundamental error to make.

Hmmm… pretty fundamental error?  Perhaps a more fundamental error is not reading a single paragraph that, once again, appeared directly beneath the results I posted. 

2.  Per-play vs. Aggregate Measures, Part One

Beyond the issue of reading comprehension skills is the objection some people have voiced to how we examined the correlation between draft position and NFL performance.  Rob and I focused on per play measures — such as completion percentage, yards per pass attempt, interceptions per pass attempt, touchdowns per pass attempt, NFL’s quarterback rating, QB Score per play, Wins Produced per play, and Net Points per play – in examining the link between draft position and NFL performance (again, at a host of different points in a quarterback’s career). 

People have argued, though, that it’s better to look at aggregate measures such as total touchdown passes or total yards.  Such examinations show a stronger correlation between draft position and performance (although not that strong).  And these examinations show that “better” quarterbacks – where “better” is defined in terms of total touchdowns or total yards – tend to be picked first (again, this is not a strong tendency).  Of course, one could define quarterbacks in terms of total interceptions thrown and show the opposite.  Quarterbacks chosen first in the draft throw more interceptions, and since interceptions are not good, this means quarterbacks taken first tend to be “worse”.

The results with respect to interceptions — and passing yards and touchdowns — are driven by the fact quarterbacks taken first tend to play more.  So by focusing on the aggregate measures one is really looking at the link between one decision (a team liked the quarterback on draft day) and another (the team decided it will play the quarterback it liked on draft day). 

The persistence of draft day evaluations in the NFL is reminiscent of a 1999 article by Colin Camerer and Roberto Weber that looked at the NBA draft.  The Camerer-Weber article examined the factors that predicted minutes per game in the NBA.  What they found was that draft position could still predict playing time – even after performance was controlled for – years into a player’s career.  It wasn’t that performance didn’t predict playing time.  No, the important finding was that draft position – independent of NBA performance – predicted playing time.  Such results suggest that NBA teams had trouble ignoring sunk costs in making decisions.

This is essentially what Jason Lisk reported (in a less sophisticated study) with respect to quarterbacks and the NFL draft.  Even after controlling for performance, Lisk reported that draft position predicted a quarterback’s playing time.

Such a story confirms the approach Rob and I took in our examination of quarterbacks and the NFL draft.  Aggregate numbers are biased because draft position is an independent predictor of playing time.  Therefore, one should focus on per-play metrics.
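The playing-time bias described in this section can be seen in a toy example.  The eight quarterbacks below are entirely hypothetical (invented numbers, not data from our study): their touchdown rate per attempt is constructed to be unrelated to draft pick, while their pass attempts fall steadily as the pick gets later.  The `pearson` helper is a plain textbook correlation written out so the sketch has no dependencies.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# HYPOTHETICAL quarterbacks: made-up values for illustration only.
draft_pick = [1, 2, 3, 4, 5, 6, 7, 8]
# Per-play quality arranged so it has no linear relation to pick.
td_per_attempt = [0.04, 0.05, 0.04, 0.05, 0.05, 0.04, 0.05, 0.04]
# Earlier picks are simply given more chances to throw.
attempts = [2000, 1800, 1600, 1400, 1200, 1000, 800, 600]

# Aggregate measure: career touchdown passes.
total_tds = [rate * n for rate, n in zip(td_per_attempt, attempts)]

per_play_corr = pearson(draft_pick, td_per_attempt)
aggregate_corr = pearson(draft_pick, total_tds)

print(f"pick vs. TDs per attempt: {per_play_corr:.2f}")
print(f"pick vs. total TDs:       {aggregate_corr:.2f}")
```

In this made-up sample the per-play correlation is essentially zero, yet total touchdowns are strongly negatively correlated with draft pick (earlier picks have more touchdowns), purely because earlier picks threw more passes.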

3. Per Play vs. Aggregate Measures, Part Two

One doesn’t need to consider the bias in playing time, though, to defend the choice of per play measures.   In evaluating players in sports we tend to focus on measures that consider how many opportunities the player was given.  For example, in baseball we tend to look at batting average, on-base percentage, slugging percentage, OPS, ERA, etc…  In basketball we tend to focus on per-minute measures.  And in football, the basic quarterback rating measure is entirely defined in terms of performance per pass attempt.

We tend to think quarterbacks are “better” when they have a higher completion percentage and throw fewer interceptions per pass attempt.  Draft position, though, doesn’t predict these measures (or any of the per play measures reported above).  But if teams were getting it “right” on draft day, shouldn’t the quarterbacks taken first have a higher completion percentage, or get more yards per pass attempt, or throw fewer interceptions per pass attempt, or produce more wins per play, etc…?

4. Draft Position and Never Playing

Steven Pinker had one more reaction to the construction of our study.  Pinker – in the New York Times – noted that lower drafted quarterbacks don’t “merit many plays”.  And this somehow establishes that teams are drafting correctly.  Again, though, this is using one evaluation to justify another.  We expect that NFL teams are going to discount players who were already discounted. 

For us to study the link between draft position and performance, we can only consider players who actually performed.  It’s possible that those quarterbacks who never performed were really bad quarterbacks.  But since they never played, we don’t know that (and Pinker also doesn’t know this).  What we do know is that for those quarterbacks who did play, draft position and performance aren’t related.   

Another way to think about this is to consider the careers of Kurt Warner and Tom Brady.  The numbers tell us that Warner and Brady are among the best quarterbacks of the past decade.  Yet both quarterbacks were passed over by teams on draft day (Warner was never selected and Brady was a 6th round draft choice). Are we to believe that Warner and Brady were the only quarterbacks passed over who could really play?  It seems likely that at least some of the quarterbacks who never played really could have contributed to an NFL team.  But once again, we will never know, since these quarterbacks never played.

And one should add once again… draft position and salary are clearly related.  Teams pay much more for a quarterback taken with one of the first ten slots in the draft.  But the evidence doesn’t indicate that these quarterbacks perform better than those taken later in the first round, second round, third round, etc…. 

5. Reacting to an Odd Interpretation of Our Results

All that being said, let me say what we are not saying.  Jason Lisk – in the blog post linked to above – notes that past NFL performance predicts future playing time.  Such a result is not surprising.  Past performance predicts future salaries in the NFL (hence Cassel gets a big payday after last season in the NFL).  How Lisk interpreted these results, though, was somewhat odd.  Here is what Lisk said towards the end of his post:

If you believe that the only reason Carson Palmer has played a lot more than Gibran Hamdan is because Palmer was drafted alot higher, then you can accept Gladwell’s position.

I certainly don’t recall Malcolm Gladwell saying that draft position was the “only” (this is Lisk’s word) predictor of future playing time.  What Gladwell argued – and what we argued – is that draft position couldn’t predict future performance.   At no point have I ever argued that NFL decision-makers don’t consider past performance in determining playing time or salaries.  In fact – as noted above – we have argued that NFL teams do consider past performance.  Unfortunately, past performance is a poor predictor of the future.  Hence, it’s not clear that the acquisitions of Cutler or Cassel will ever generate the returns envisioned when those players were acquired.

So we agree with Lisk when he argues that past performance predicts future playing time.  Where we don’t agree is with the assertion that at some point we argued something else.

Another Study Confirming Our Story

Let me close with a comment left by fellow economist Kevin Quinn at Malcolm Gladwell’s blog (you have to go through a large number of comments to get to Quinn’s thoughts):

I am a sports economist and have investigated the predictability of eventual NFL performance by QBs based on the information available just before the draft. While my approach and methods differed somewhat from those employed by… Dave Berri, my results essentially confirm his findings.

Kevin co-authored a working paper that examined the NFL draft and came to – as Kevin notes – a very similar conclusion (across a smaller sample than Rob and I considered).  Again, this result – given what we see when we look at the consistency of performance in the NFL – is not surprising. 

And hopefully this extremely lengthy post answers all the reactions to the study Rob Simmons and I published (and yes, this post is less than 3,000 words – although not very far below this mark).

- DJ

The WoW Journal Comments Policy

For more on the Wages of Wins football metrics see

The New QB Score

Consistent Inconsistency in Football

Football Outsiders and QB Score

The Value of Player Statistics in the NFL

Maybe It Is Time to Stop Blaming the Coach in Toronto

The Toronto Raptors lost to the Atlanta Hawks by 31 points on Wednesday night.

The loss gave the Raptors 13 defeats in their first 20 games.  The team’s efficiency differential – offensive efficiency (points scored per possession) minus defensive efficiency (points surrendered per possession) – of -5.9 suggests the Raptors are only going to win about 26 games in 2009-10.  In sum, the only NBA team in Canada isn’t very good.

After the Atlanta game, there were grumbles out of Toronto that the players are blaming the coach – Jay Triano — for their troubles.  (HT to TrueHoop).  Such grumbles are reminiscent of the story told a year ago in Toronto.  Last December, the Raptors decided to fire head coach Sam Mitchell (and replace him with Triano). The thinking at the time was that an 8-9 record was simply unacceptable.  And somehow if the players had a different coach, life in Toronto would be different.

At the time I expressed a great deal of skepticism regarding this perspective.  How players perform on the court determines the outcome of each game.   And coaches don’t appear to have much impact on the players’ performances.  Essentially, if you give a coach productive players a team will tend to win; and if a coach doesn’t have many productive players he gets to lead a loser.  The Raptors of last year didn’t have many productive players.  Consequently, we shouldn’t have been surprised that the Raptors only won 25 more games after Mitchell left the scene.

This past summer the Raptors seemed to get this message.  Toronto decided to keep Triano as head coach, while a number of new players were added.  Unfortunately – as the early results indicate — most of the new players haven’t helped.  Again, a differential of -5.9 suggests this team is going to struggle to reach 30 wins this year.

If we move from efficiency differential to Wins Produced, we can see where Toronto’s team makeover went wrong. 

Let’s start with the good news.  Chris Bosh – the team’s star – is on pace to produce 13.3 wins this season.  Jose Calderon – who led the Raptors in Wins Produced in 2008-09 – has struggled.  But Calderon is still on pace to produce 5.9 wins in 2009-10.

So Bosh and Calderon are on pace to produce 19.2 wins.  And the team is on pace to win about 26 games.  A bit of simple math reveals that everyone not named Bosh and Calderon must be on pace to produce only about seven wins.
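The simple math, written out as a short sketch (the 26-win team projection is the rounded figure from the efficiency differential above, so the remainder is approximate as well):

```python
# On-pace Wins Produced for the two leading Raptors, from the text.
bosh_wins = 13.3
calderon_wins = 5.9

# Approximate team projection implied by the -5.9 differential.
projected_team_wins = 26

top_two = bosh_wins + calderon_wins
everyone_else = projected_team_wins - top_two

print(round(top_two, 1))        # 19.2
print(round(everyone_else, 1))  # 6.8
```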

The perception in Toronto is that everyone else is led by Hedo Turkoglu.  Toronto gave Turkoglu a $53 million contract this past summer.  Although Turkoglu was a sought-after free agent last summer, he really has only been an average player across his career.  And he’s now 30 years of age (young for an economist, old for a basketball player).  Hence, we shouldn’t be surprised that he’s only on a pace to produce 2.9 wins this season.  His Wins Produced per 48 minutes [WP48] is 0.052, a mark that’s well below the average mark of 0.100 (and even if he were average, he wouldn’t be helping that much).  So the Raptors – as one could have expected – aren’t getting much return on this investment.
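Here is a rough sketch of how WP48 and a wins pace fit together.  It assumes only that Wins Produced equals WP48 times minutes played, divided by 48, which is what a per-48-minute rate means.  The season minutes are backed out from the two numbers quoted for Turkoglu, so they are implied rather than reported, and the function names are mine.

```python
def wins_produced(wp48, minutes):
    """Wins produced by a player with a given WP48 over some minutes."""
    return wp48 * minutes / 48


def implied_minutes(wins, wp48):
    """Minutes needed to produce `wins` at a given WP48 (the inverse)."""
    return wins * 48 / wp48


# Turkoglu: on pace for 2.9 wins with a WP48 of 0.052.
turkoglu_minutes = implied_minutes(2.9, 0.052)

# What an average player (WP48 of 0.100) would produce in those minutes.
average_player_wins = wins_produced(0.100, turkoglu_minutes)

print(f"implied season minutes: {turkoglu_minutes:.0f}")
print(f"average player in same minutes: {average_player_wins:.1f} wins")
```

On these assumptions, an average player given the same minutes would be on pace for a bit under six wins, or fewer than three more than Turkoglu.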

In addition to Turkoglu, the Raptors also held the 8th pick in the 2009 NBA draft (a reward for being so bad last season).  With this pick the Raptors selected DeMar DeRozan.  His draft position suggests DeRozan could be above-average.  His college numbers tell a very different story.  Of the 47 players taken out of college last year, DeRozan ranked 39th in PAWS40 [Position Adjusted Win Score per 40 minutes].  The early returns on DeRozan are consistent with these college numbers.  After 20 games, his WP48 stands at 0.013.  Again, that’s below average. 

Although DeRozan’s production is quite low, it’s well beyond what the Raptors are getting from Antoine Wright.  Last season Wright produced -2.4 wins.  This season he’s on pace to produce -5.0 wins.

Turkoglu, DeRozan, and Wright are not the only players the Raptors added.  Toronto also added Amir Johnson, Marco Belinelli, and Jarrett Jack.  Of this trio, only Johnson was above average last season (WP48 of 0.145).  And of this trio, only Johnson is above average this season (WP48 of 0.169).

Of the players who have played 200 minutes this season, only Bosh, Calderon, and Johnson are above average.  Everyone else – and that includes Andrea Bargnani [WP48 of 0.043], Belinelli, DeRozan, Jack, Turkoglu, and Wright – is below average. 

Again, none of this should be surprising. The Raptors’ struggles are simply not about their coach.  This really is all about the players.  Toronto has assembled a roster of players with very few productive performers.  And these players can grumble about their coach all they want.   But until the Raptors employ better players, a better outcome is not likely to be seen.

- DJ


Our research on the NBA was summarized HERE.

The Technical Notes at wagesofwins.com provide substantially more information on the published research behind Wins Produced and Win Score.

Wins Produced, Win Score, and PAWSmin are also discussed in the following posts:

Simple Models of Player Performance

Wins Produced vs. Win Score

What Wins Produced Says and What It Does Not Say

Introducing PAWSmin — and a Defense of Box Score Statistics

Finally, A Guide to Evaluating Models contains useful hints on how to interpret and evaluate statistical models.

The Fabled Year of the Super Teams and the Somewhat Struggling Spurs

Before the season started I wrote an NBA Preview with the following title: Previewing the Year of the Super Teams.  The column made two observations. 

  • There had never been an NBA season where three or more teams won more than 75% of their games.
  • In 2009-10 we will see this happen. 

Hence, 2009-10 was going to be the year of the Super Teams!!!

About 20% of the 2009-10 season has now been played.  And when we look at efficiency differential (offensive efficiency minus defensive efficiency), here is the list of teams on pace to win more than 62 games this season.

1. Boston Celtics (9.7 differential, 66.0 projected wins)

Yes, the list currently is occupied by one team.  The LA Lakers have a differential of 8.1 and are projected to win 61.7 games. Now that projection ignores the fact that Pau Gasol is now going to be playing all of the Lakers’ games (assuming he stays healthy).  So I think the Lakers can surpass the 62 win mark.  But after the Lakers, the next best team is the Denver Nuggets (7.7 differential, 60.7 projected wins); and after the Nuggets, no team is currently projected to win more than 60 games.

The primary issue has been injuries (and suspensions).  At least, I think that’s primarily the story in Cleveland and Orlando (two of my leading “Super Team” candidates). 

But what about the San Antonio Spurs?  The Spurs have won five straight games.  Across the entire season, though, the Spurs have a 5.0 differential and are projected to win only 53.7 games.  This is hardly the mark of a Super Team (although a fantastic mark for many NBA teams).
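The differentials and projections quoted so far let us back out roughly how many wins a point of differential is worth.  This is just a sketch and not the model behind the projections: it assumes an 82-game season, a straight-line relationship, and that a differential of zero corresponds to 41 wins.

```python
GAMES = 82
BREAK_EVEN_WINS = GAMES / 2  # assumed wins for a zero differential

# (efficiency differential, projected wins), as quoted in the post.
teams = {
    "Boston Celtics": (9.7, 66.0),
    "LA Lakers": (8.1, 61.7),
    "Denver Nuggets": (7.7, 60.7),
    "San Antonio Spurs": (5.0, 53.7),
}

# Wins above break-even per point of differential, team by team.
implied_wins_per_point = {
    team: (wins - BREAK_EVEN_WINS) / diff
    for team, (diff, wins) in teams.items()
}

for team, slope in implied_wins_per_point.items():
    print(f"{team}: {slope:.2f} wins per point of differential")
```

All four teams imply a value between 2.5 and 2.6 wins per point over a full season.  The small differences are what one would expect from differentials rounded to one decimal place.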

When we look closely at the Spurs we see that injuries are mostly the story in San Antonio as well.  Manu Ginobili has already missed six games.  If Ginobili averages 30 minutes per game – and maintains his current Wins Produced per 48 minutes [WP48] of 0.222 – the Spurs would be on pace to win 57.9 games. And if Ginobili returns to the WP48 we saw last year (0.335), the Spurs would be on pace for 62.6 wins (assuming Ginobili can average 30 minutes per game).

Ginobili is not the only player to suffer an injury.  Last year, Tony Parker posted a 0.166 WP48.  This year Parker has missed four games and his WP48 is only 0.090.  If Parker can average 30 minutes per game – and return to what we saw last year – we can add 4.3 wins to the Spurs’ projection.

While Ginobili and Parker have struggled, Tim Duncan and Matt Bonner are posting the best numbers of their respective careers. Back in 2002-03, Duncan posted a 0.375 WP48.  Last year – at the age of 32 – his mark was only 0.265. This year, though, Duncan’s WP48 has soared to 0.413.  Like Duncan, Bonner is also soaring.  His WP48 has risen from 0.158 last year to 0.223 this season.  And if Duncan and Bonner revert to what we saw last year, the Spurs’ projection will decline by 7.6.

If somehow Duncan and Bonner don’t completely return to what we saw last year – and Ginobili and Parker do return to what we saw last year – the Spurs will surpass the 60 win mark.   Those are quite a few ifs, though.  Still, it’s possible the Spurs will be one of these “Super Teams” I mentioned back in October.
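Putting the four “ifs” together is a matter of adding and subtracting.  The sketch below assumes the adjustments quoted above are each measured from the current 53.7-win projection and can simply be summed, which is my simplification for illustration.

```python
current_projection = 53.7

# Ginobili back to last year's WP48 at 30 minutes per game: 62.6 wins.
ginobili_gain = 62.6 - current_projection
# Parker back to last year's WP48 at 30 minutes per game.
parker_gain = 4.3
# Duncan and Bonner falling all the way back to last year's marks.
duncan_bonner_loss = 7.6

# Everyone reverts completely to last year.
full_reversion = (
    current_projection + ginobili_gain + parker_gain - duncan_bonner_loss
)

# Largest Duncan/Bonner decline that still leaves the Spurs at 60 wins.
max_loss_for_60 = current_projection + ginobili_gain + parker_gain - 60

print(round(full_reversion, 1))   # 59.3
print(round(max_loss_for_60, 1))  # 6.9
```

So if all four players fully revert, the Spurs land just short of 60 wins.  Duncan and Bonner would need to hold on to a small part of this season’s improvement for the team to get over that mark.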

Even if that happens, though, it looks like there was a serious flaw with what I said in October.  It appears I seriously under-estimated the importance of injuries.   Essentially I looked at what each team would achieve if everyone stayed healthy.  But of course injuries are going to happen.  Hence, 2009-10 may not be the year of the Super Team.  Yes, I think – after just 20% of the season has been played — I was wrong.

Let me close with two observations that seem correct (see, I can’t keep with the “I was wrong” theme).  First, it looks like DeJuan Blair will be a productive NBA player.  After 15 games his WP48 is currently 0.287.  Such a mark (and I haven’t checked everyone yet) might lead the 2009-10 rookie class.  So just as was thought last summer, it looks like teams shouldn’t have passed on Blair in the 2009 draft.

And there is the case of Richard Jefferson.  RJ’s WP48 mark is currently 0.078.   Jefferson has not been above average since 2005-06, and it looks like he might not be above average again.   In other words, it looks like Jefferson is going to continue to be ranked among the NBA’s most overrated talents.

Then again, I thought this was the year of the Super Teams. So maybe there is hope for Jefferson after all.

- DJ
