Ari Caroline has been busy (and he’s about to explain part of why). But he’s back with a fantastic piece on Moneyball, healthcare and how to evaluate models.
Jonathan Cohn wrote a great cover story for the March issue of The Atlantic titled “The Robot Will See You Now.” I say this with a degree of bias but also with a bit of remorse.
You see, the article features the work that my Quantitative Analysis and Strategic Initiatives team (QuantStrat for short) and the physician-experts at Memorial Sloan-Kettering Cancer Center (MSK) have been doing with IBM. It describes our effort to train IBM’s Watson supercomputer to understand a cancer case and, subsequently, to try to replicate the decision-making process used by the Memorial Sloan-Kettering doctors. However, before our work with IBM became public, Jonathan was already at work on the article. His original focus had been on the unique concept of having a “Moneyball” team in healthcare. Watson was the sexier project and won the day for The Atlantic piece but, at Memorial Sloan-Kettering, we remain as excited as ever about the “other” work that we do and its potential to transform healthcare.
The nature of a “Moneyball” team in an academic medical center is surprising to a lot of people. For those who haven’t read the Michael Lewis book or seen the movie, I often have to backtrack and explain that what our QuantStrat team does has almost nothing to do with “money” and nothing at all to do with “ball”. The parallel with Billy Beane is in the use of data and analytics to help an organization make better decisions. In sports, those decisions can be which players to sign and which ones to play. In a hospital, they can be how many operating rooms to build, where to place a new clinic or, more excitingly, which treatment is likely to work best for a particular patient. At Memorial Sloan-Kettering, we have a separate team that does the financial accounting, so my team has the luxury of focusing on mission ends like advancing cancer care and, with Gd’s Help, saving lives.
My recent contributions to Wages of Wins (no, I haven’t given up on Floor Stretch! I’ve just been real busy) prompted the question of what qualities I value in any quantitative model, be it focused on basketball or on tackling a horrible disease like cancer. The natural corollary to that question is what, in particular, attracted me to Wages of Wins when there are all kinds of sports statistics groups out there.
Reflecting on this question has prompted me to try and articulate the qualities that I see as essential to any useful quantitative model. In doing so, it becomes clear what, in particular, was so compelling to me about Prof. Berri’s work. So without further ado, I posit for your consideration and comment: The Three Qualities of a Good Quantitative Model
Is it predictive?
While predictive power is essential for any quantitative model, unfortunately, many analysts start and stop with this quality. I would argue that it is actually pretty meaningless without the other two qualities that I outline below so I won’t focus on this one for now.
Is it transparent?
The second quality is that a model must have transparency and translation. You can have the most predictive model in the world but, in my experience, if it is essentially a black box, the model will not end up being very useful. There are a couple of reasons that a model needs to be translatable in terms of real-world phenomena. First, if it doesn’t match any logical understanding of a system and the system dynamics, there is a risk that you have found some statistical anomaly that, given a new set of circumstances, won’t continue to hold. We’ve seen the value of the systems understanding in our basketball analysis when we consider statistics like minutes played and offensive rebounds. An off-the-cuff statistical model would show great predictive power for minutes played and weak predictive power for offensive rebounds. However, this doesn’t jibe with even a basic understanding of how the system works. Minutes played can’t logically be independently valuable. That would mean that you could improve a bad player’s contribution to the team by playing him more! Similarly, despite the weak correlation offensive rebounds have with wins, we know from a systems perspective that they are important. By definition, they give you another chance at a shot (and one that is likely to be closer to the basket)!
It is this system understanding that prompts us to dig deeper and refine our statistical model. When we do, it becomes clear that minutes played are only valuable when they are used by an otherwise quality player. Offensive rebounds only emerge as statistically valuable once we understand that they require a negative event (a missed shot) to happen first and, therefore, we must hold missed shots constant in our model.
Perhaps the more important reason is that if a model can’t be translated in such a way that the subject matter experts (be they coaches or doctors) can understand it, it is unlikely that they will change their behavior in response to it. It is this realization that has prompted my QuantStrat group to dedicate as much time to creating intuitive data visualizations as we do to building the statistical models. We also maintain a constant dialogue with the physicians, nurses and administrators to ensure that the statistical phenomena we are seeing are grounded in reality.
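For readers who want to see the offensive-rebound point in numbers, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the data are synthetic, the coefficients (0.3, -0.6 and 1.0) are invented so that the effect is easy to see, and the names simulate_teams and ols_slopes are hypothetical. It is not the Wages of Wins model or anything we run at MSK. It only shows how a stat that truly helps can look worthless until the negative event it depends on is held constant.

```python
import numpy as np


def simulate_teams(n_teams=5000, seed=0):
    """Generate synthetic (made-up) team data for illustration only."""
    rng = np.random.default_rng(seed)
    # Missed shots: the "negative event" that has to happen first.
    missed = rng.normal(40.0, 5.0, n_teams)
    # Offensive rebounds can only follow misses, so they rise with them.
    oreb = 0.3 * missed + rng.normal(0.0, 1.5, n_teams)
    # Assumed truth: misses hurt the outcome, each rebound helps by 1.0.
    outcome = -0.6 * missed + 1.0 * oreb + rng.normal(0.0, 3.0, n_teams)
    return missed, oreb, outcome


def ols_slopes(y, *predictors):
    """Ordinary least squares; returns the slopes, dropping the intercept."""
    X = np.column_stack([np.ones_like(y), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]


if __name__ == "__main__":
    missed, oreb, outcome = simulate_teams()
    # Off-the-cuff model: outcome on offensive rebounds alone.
    naive = ols_slopes(outcome, oreb)[0]
    # Refined model: same rebounds, but holding missed shots constant.
    controlled = ols_slopes(outcome, oreb, missed)[0]
    print(f"rebound slope, rebounds only:      {naive:.2f}")
    print(f"rebound slope, misses held fixed:  {controlled:.2f}")
```

With these invented numbers, the rebounds-only slope should land near zero while the slope with missed shots held constant should land near the assumed true value of 1.0. The statistics didn’t change; our understanding of the system told us which variable had to be in the model.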
Can we act on it?
This is the quality of a model that is most frequently, and most tragically, overlooked. It doesn’t really help to know that a group of patients have a certain genetic mutation if there is no known treatment to target that mutation (though it could be valuable in prompting new research).
Football turnovers are a good sports analogy for this. Turnover differential is the strongest predictor of game outcomes in football. The problem is, we don’t have a proven way to act on that knowledge. Turnovers by quarterbacks are generally not consistent over time, so you can’t just say, “Change the quarterback.” Sack differential is somewhat correlated with turnovers, but do you attack this by adding a defensive end, an offensive tackle or an outside linebacker? Turns out, the system dynamics in football are quite complex.
These three qualities of quantitative analysis have emerged as the most essential in our healthcare setting, and it is these same qualities that attracted me to Prof. Berri’s work and the Wages of Wins team. I’m not one to try and defend any model as being perfect. Models, by definition, are imperfect tools for reflecting reality. However, any model that has the three qualities above at its core is one with which I am proud to be associated.