Joe Posnanski wrote the following for NBCSports: “VANGUARD AFTER THE REVOLUTION: Bill James sparked a baseball insurrection, but he has regrets about the world he wrought”
The article is essentially a reflection on the impact Bill James has had on baseball analysis. And it notes — as the title suggests — that James is not thrilled with everything that people have offered with respect to statistics and baseball.
Two specific quotes stood out for me. First, here is Bill James from 1988:
“As I saw it, baseball had two distinct mountains of material. On the one hand, there was a mountain of traditional wisdom, things that people said over and over again. On the other hand, there was a mountain of statistics. My work was to build a bridge between those two mountains. A statistician is concerned with what baseball statistics ARE. I had no concern with what they are. I didn’t care, and I don’t care, whether Mike Schmidt hit .306 or .296 against left-handed pitching. I was concerned with what the statistics MEAN.
“Sportswriters, in my opinion, almost never use baseball statistics to try to understand baseball. They use statistics to decorate their articles. They use statistics as a club in the battle for what they believe intuitively to be correct. That is why sportswriters often believe that you can prove anything with statistics, an obscene and ludicrous position, but one which is the natural outgrowth of the way that they themselves use statistics. What I wanted to do was teach people instead to use statistics as a sword to cut toward the truth.”
I would agree with this assessment. Stats are often thrown into stories with no sense of whether they are credible or not. And there is a sense you can prove anything you like with numbers. This is only true if you have no idea how to analyze numbers (and unfortunately, this is an accurate description of many who think they know how to analyze numbers).
And here is Bill James today discussing Wins Above Replacement (a subject I wrote about a few weeks ago). As you can see, he is not thrilled with this approach.
“Well, my math skills are limited and my data-processing skills are essentially nonexistent. The younger guys are way, way beyond me in those areas. I’m fine with that, and I don’t struggle against it, and I hope that I don’t deny them credit for what they can do that I can’t.
“But because that is true, I ASSUMED that these were complex, nuanced, sophisticated systems. I never really looked; I just assumed that the details were out of my depth. But sometime in the last year I was doing some research that relied on these WAR systems, so I took a look at them, and … they’re not very impressive. They’re not well thought through; they haven’t made a convincing effort to address many of the inherent difficulties that the undertaking presents. They tend to get so far into the data, throw up their arms and make a wild guess. I don’t know if I’m going to get the time to do better of it, or if it will be left to others, but … we’re not at anything like an end point here. I assumed that these systems were a lot better than they actually are.”
I would add that this critique also applies to much of what I have seen in basketball analysis over the years. Models such as PER, Win Shares, and Adjusted Plus-Minus all have serious problems that are not well understood by the people who like to cite this work.
Unfortunately, many people do “assume” these models are better than they are. And as I have discovered, there isn’t much one can do to remove that “assumption” (of course, that doesn’t mean you stop trying!).
Let me close by noting that Bill James is 20 years older than I am. And I fully expect that in 20 years 1) PER and the other models will still be with us and 2) I will still be noting why these models are not very good.