An apt analog to William Langewiesche’s story of the 2009 crash of Air France Flight 447 is Bob Wachter’s account of the non-fatal overdosing of a pediatric patient at the UCSF Medical Center. Neither story is new, and I make no claims that my feeble insights below are novel.*
In fact, Wachter mentions Flight 447’s fatal crash as he explains how automation can create new vectors for disaster, even as it closes off old ones. Both he and Langewiesche provide evidence that automation—auto-pilots in aviation and clinical decision support and order fulfillment features in electronic medical systems—improves safety on average while courting danger in a subset of cases. This is not a knock on their work or the compelling anecdotes they use to drive their narratives, but it’s a plea for a bit of perspective as you read either story, and I highly recommend both.
At the heart of both is a cascade of errors that begins with a human’s (or humans’) misunderstanding of the mode in which an automated system is operating. Wachter offers a very nice example of such a “mode error,” which I’m certain you can relate to: ACCIDENTALLY TYPING WITH CAPS LOCK ON. The caps lock key toggles the keyboard mode such that all (or most) keys behave differently.
When typing, an inadvertent caps lock toggle can cause annoying mode errors, like failing to properly enter a password. When flying an aircraft or ordering medications for a patient, mode errors can be deadly, even if they’re usually annoyances that get remedied before disaster strikes.
The pilots aboard Flight 447 didn’t recognize that their plane had switched modes, relying less on auto-pilot and ceding more control to them. They misinterpreted this sudden grant of autonomy as a confusing set of malfunctions. Likewise, the physician who initiated the sequence of errors that landed Pablo Garcia in the ICU, and might have killed him, didn’t recognize a mode change: the electronic medication entry system had switched from interpreting entries in milligrams to milligrams per kilogram of patient weight, thus multiplying a 160 mg dose by a factor of 39.
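To make the dosing mode error concrete, here is a minimal sketch of how the same keyed-in number can mean two very different orders. Everything in it is hypothetical: the `DoseMode` enum and `ordered_dose_mg` function are illustrative names, not the hospital system's interface, and the 39 kg weight is an assumption chosen only so the multiplier matches the factor of 39 mentioned above.

```python
from enum import Enum


class DoseMode(Enum):
    """Hypothetical entry modes for an order screen."""
    MG = "mg"
    MG_PER_KG = "mg/kg"


def ordered_dose_mg(entered_value, mode, weight_kg):
    """Return the total dose in mg implied by the entered number and the mode."""
    if mode is DoseMode.MG:
        # The number is taken as the total dose.
        return entered_value
    # The number is taken as a per-kilogram dose and scaled by patient weight.
    return entered_value * weight_kg


# Assumption: weight picked to reproduce the factor of 39 from the text.
WEIGHT_KG = 39.0

intended = ordered_dose_mg(160, DoseMode.MG, WEIGHT_KG)
actual = ordered_dose_mg(160, DoseMode.MG_PER_KG, WEIGHT_KG)
print(intended, actual, actual / intended)
```

The person typing does nothing different in the two cases. Only the mode, which is easy to miss, changes what the entry means.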
Two of the many ways automation can harm are the failure of humans to recognize mode changes and the failure of systems to make those changes more obvious without exacerbating “alarm fatigue.” Both failures feed on humans’ often well-earned trust in automation. When we ignore the warnings an automated system gives us, that’s in part because the system has served us very well in the past, sparing us from far more errors than it creates. Despite their intent, the vast majority of alarms (car alarms, fire alarms, the flashing “check engine” light, and the like) are not signals of immediate danger, so our learned response is to treat them as such and to ignore them when possible. Infrequently, this will be a mistake. It won’t always lead to disaster (because we have other means of obtaining the right information and correcting our first, false assumption), but it could do so.
Such assumptions are not unique to automated systems. I’m well aware that every wail from my children is not a signal of deadly distress. Their sounds of alarm don’t always mean what they think they mean. Likewise, the political candidate who warns of the end of America if his opponent is elected is no longer alarming.
Our trust in (or conferring of) authority is not unique to the machine-human relationship either. Though I do trust many machines, I trust a great number of humans too. They’ve earned it. And yet they err, and their errors cause me harm, just as mine cause harm to others. Naturally, we should be aware of the harms of machines, of humans, and of the marriage of the two. We should strive to reduce the potential for grave error, provided we can do so in ways that don’t invite greater costs (by which I do not merely mean money).
A careful read of the accounts of Flight 447 and patient Pablo Garcia reveals the overwhelming benefits of automation in aviation and medicine, as well as the dangers that still remain. There is much more work to do, as both authors expertly document. Humans are highly imperfect. So are our systems designed to protect us from ourselves.
I write about nutrition far more now than I used to. Part of that is because, as with health policy, the more I’ve learned how little of what we say is based on data and evidence, the more irritated I’ve become (see my Twitter avatar).
I recently came across a Viewpoint in JAMA that is illustrative of how things are changing in nutrition. It’s by Dariush Mozaffarian and David Ludwig, and it talks about the Dietary Guidelines Advisory Committee report. Here’s something I already wrote about at The Upshot:
In the new DGAC report, one widely noticed revision was the elimination of dietary cholesterol as a “nutrient of concern.” This surprised the public, but is concordant with more recent scientific evidence reporting no appreciable relationship between dietary cholesterol and serum cholesterol or clinical cardiovascular events in general populations.
But they want to focus on something else:
A less noticed, but more important, change was the absence of an upper limit on total fat consumption. The DGAC report neither listed total fat as a nutrient of concern nor proposed restricting its consumption. Rather, it concluded, “Reducing total fat (replacing total fat with overall carbohydrates) does not lower CVD [cardiovascular disease] risk.… Dietary advice should put the emphasis on optimizing types of dietary fat and not reducing total fat.” Limiting total fat was also not recommended for obesity prevention; instead, the focus was placed on healthful food-based diet patterns that include more vegetables, fruits, whole grains, seafood, legumes, and dairy products and include less meats, sugar-sweetened foods and drinks, and refined grains.
The complex lipid and lipoprotein effects of saturated fat are now recognized, including evidence for beneficial effects on high-density lipoprotein cholesterol and triglycerides and minimal effects on apolipoprotein B when compared with carbohydrate. These complexities explain why substitution of saturated fat with carbohydrate does not lower cardiovascular risk. Moreover, a global limit on total fat inevitably lowers intake of unsaturated fats, among which nuts, vegetable oils, and fish are particularly healthful. Most importantly, the policy focus on fat reduction did not account for the harms of highly processed carbohydrate (eg, refined grains, potato products, and added sugar)—consumption of which is inversely related to that of dietary fat.
As with other scientific fields from physics to clinical medicine, nutritional science has advanced substantially in recent decades. Randomized trials confirm that diets higher in healthful fats, replacing carbohydrate or protein and exceeding the current 35% fat limit, reduce the risk of cardiovascular disease. The 2015 DGAC report tacitly acknowledges the lack of convincing evidence to recommend low-fat–high-carbohydrate diets for the general public in the prevention or treatment of any major health outcome, including heart disease, stroke, cancer, diabetes, or obesity. This major advance allows nutrition policy to be refocused toward the major dietary drivers of chronic diseases.
As I’ve said repeatedly, I don’t know that the evidence is so clear that we should be making declarative statements telling anyone how to eat. But it’s amazing just how much the tide has turned not just against carbohydrates, but towards fat. I spent decades being told to reduce my fat intake, lower and lower and lower; that may have been the wrong thing to do.
Some ideas for reducing publication bias and increasing the credibility of published scientific findings, from Brendan Nyhan:
- Pre-accepted articles, based on pre-registered protocol and before findings are known. Brendan reports that this is already happening at AIMS Neuroscience, Cortex, Perspectives on Psychological Science, Social Psychology, and for a planned special issue of Comparative Political Studies.
- Results-blind peer review. A similar idea to pre-accepted articles, this would evaluate submissions on all aspects of a paper (data, methods, import) apart from the actual findings. Brendan notes that this has been attempted at Archives of Internal Medicine.
- Verifying replication data and code, basically providing everything necessary to replicate the study. Already standard at the American Journal of Political Science and American Economic Review.
- Reward higher quality and faster reviews with credits, redeemable for faster review of one’s own manuscripts. I’m not aware of any journal that has attempted any program aimed at reducing review times, let alone succeeded in doing so.
- Forward reviews of promising manuscripts to section journals. That is, if the flagship journal can’t accept, but recommends publication in an affiliated journal, streamline the process by treating the flagship’s review as the first round. Something like this already happens with JAMA journals and the American Economic Review and its affiliated journals.
- Triple blind reviewing would blind the editor from the authors, not just the authors and reviewers from one another. Already standard at Mind, Ethics, and American Business Law Journal.
As Brendan writes, all of these have limitations and none can remove all potential bias or gaming. Yet it’s hard to argue they’re not worth considering.
First, you put the Clipper Skipper out to pasture, because he has the unilateral power to screw things up. You replace him with a teamwork concept—call it Crew Resource Management—that encourages checks and balances and requires pilots to take turns at flying. Now it takes two to screw things up. Next you automate the component systems so they require minimal human intervention, and you integrate them into a self-monitoring robotic whole. You throw in buckets of redundancy. You add flight management computers into which flight paths can be programmed on the ground, and you link them to autopilots capable of handling the airplane from the takeoff through the rollout after landing. […] As intended, the autonomy of pilots has been severely restricted, but the new airplanes deliver smoother, more accurate, and more efficient rides—and safer ones too.
It is natural that some pilots object. […] [A]n Airbus man told me about an encounter between a British pilot and his superior at a Middle Eastern airline, in which the pilot complained that automation had taken the fun out of life. […]
In the privacy of the cockpit and beyond public view, pilots have been relegated to mundane roles as system managers, expected to monitor the computers and sometimes to enter data via keyboards, but to keep their hands off the controls, and to intervene only in the rare event of a failure. […] Since the 1980s, when the shift began, the safety record has improved fivefold, to the current one fatal accident for every five million departures. No one can rationally advocate a return to the glamour of the past. […]
Once you put pilots on automation, their manual abilities degrade and their flight-path awareness is dulled: flying becomes a monitoring task, an abstraction on a screen, a mind-numbing wait for the next hotel […] [a] process known as de-skilling. […]
The automation is simply too compelling. The operational benefits outweigh the costs. The trend is toward more of it, not less. And after throwing away their crutches, many pilots today would lack the wherewithal to walk.
Safer by design yet operationally boring, is this the future of medicine? A meaningful subset of it? Not in a million years?
The rest is even better. Read the whole thing.
In a recent piece in Health Affairs, Sherry Glied, Stephanie Ma, and Ivanna Pearlstein discuss how much people who work in health make versus other sectors, and what that might mean for health care spending. I discuss their piece, and what I think it means, in my latest post at the AcademyHealth blog.
The following originally appeared on The Upshot (copyright 2015, The New York Times Company).
Both studies examined Medicare’s 32 Pioneer Accountable Care Organizations. This program, and a related, similar one with a larger number of participants, offers health care organizations the opportunity to earn bonuses in exchange for accepting some financial risk, provided they meet a set of quality targets.
One study, published in the New England Journal of Medicine, found that in their first year, Pioneer A.C.O.s reduced spending 1.2 percent, relative to comparable patients who received care elsewhere. The other study, published in the Journal of the American Medical Association, found a 3.6 percent spending reduction in the first year and considerably less in the second year. (Though the studies reach broadly similar conclusions, their different savings estimates stem from methodological differences.)
Across a variety of measures, the two studies found that Pioneer A.C.O. quality of care held steady or improved.
Even if the overall savings are modest and assessed only in the first year or two of the program, the studies’ findings are good news for Medicare. Inspired by some of the nation’s most revered health care organizations — like Kaiser Permanente and the Mayo Clinic — Medicare’s A.C.O. program is its flagship reform initiative. It’s intended to promote the delivery of more efficient and effective care, paying more for value than for volume. Medicare has announced it intends to accelerate the transition from volume to value in the coming years. The new studies offer some confidence that it can do so while reducing spending and without harming quality.
However, there is still cause for concern. Thirteen A.C.O.s left the Pioneer program after the first year. Even though those A.C.O.s had saved money too, according to the studies, this is a troubling sign. A program that fails to retain its members cannot succeed in the long term. And, as these two studies cover only the first two years, despite the encouraging findings they do not provide information about what happened in the longer term.
Because the program is voluntary, an organization that can earn more by leaving, or one that anticipates it cannot recoup investments necessary to succeed, will not participate. One reason organizations may have dropped out is that payments decrease quickly as organizations become more efficient.
Dr. Michael McWilliams, lead author of the New England Journal of Medicine study, suggested that Medicare may achieve greater success over time with a more gradual approach that better balances the goal of achieving savings with the need to retain participants.
“Building on this early success will require greater rewards for A.C.O.s that generate savings,” Dr. McWilliams said.
Dr. McWilliams’s study also found that organizations that consolidated hospitals with physician practices performed no better than those that did not. This suggests that such consolidation — which has been rampant in the industry and drives up prices paid by commercial insurers — is not necessary to reduce Medicare spending and improve care.
“If financial integration between physicians and hospitals fosters more effective responses to new payment models, those efficiencies have not yet manifested among A.C.O.s,” Dr. McWilliams said.
The voluntary nature of the program also challenges study of it. Self-selection invites the possibility that organizations that opt in could be different from those that don’t, perhaps better able to reduce spending and improve quality. Randomizing organizations into the program — akin to a randomized controlled trial of medical therapies — would offer more certain estimates of the program. But it’s not practical to force such a large change on health care organizations, and possibly dangerous to experiment so directly with factors that could affect patient care.
The two studies’ researchers used a different approach to tease out estimates of the program’s effects. They compared changes in cost and quality experienced by beneficiaries in A.C.O.s with those of comparable beneficiaries served by other organizations. Dr. McWilliams’s study also tested whether those changes corresponded to the timing of A.C.O. participation. Since no changes were detected before program initiation, that provides confidence that the findings are because of the program itself.
The good news is that, at least in the first year or two of participation, A.C.O.s seem to spend less and deliver equivalent or better care than other health care organizations. The bad news is that many organizations drop out of the program, even as they’re succeeding.
the battle to set in place a health care system that works for all Americans is far from over.
Cannon is absolutely right.
King was a victory because it prevented millions of people from losing insurance coverage. But it did not advance the cause of health care reform a centimeter with respect to the status quo ante King.
In The New Republic, I argue that the principal goals of health care reform remain to be fulfilled. Getting them fulfilled will require us to win new political fights to extend universal health insurance in every state. We need to keep working on innovative health care delivery models that control the growth in health care expenditures while improving the quality of care.
Above all, we need to get more empirical:
Health care reform has to be driven by results, not political beliefs. Programs should be selected based on evidence, such as the results of randomized clinical trials. Every new reform should collect rigorous data to determine whether it works.
From JAMA Pediatrics,
Importance: It is important to estimate the burden of and trends for violence, crime, and abuse in the lives of children.
Objective: To provide health care professionals, policy makers, and parents with current estimates of exposure to violence, crime, and abuse across childhood and at different developmental stages.
Design, Setting, and Participants: The National Survey of Children’s Exposure to Violence (NatSCEV) includes a representative sample of US telephone numbers from August 28, 2013, to April 30, 2014. Via telephone interviews, information was obtained on 4000 children 0 to 17 years old, with information about exposure to violence, crime, and abuse provided by youth 10 to 17 years old and by caregivers for children 0 to 9 years old.
Main Outcome and Measure: Exposure to violence, crime, and abuse using the Juvenile Victimization Questionnaire.
Violence against children is a significant issue. How often does it occur? This is an attempt to give us some numbers.
More than 37% of youth experienced a physical assault, as defined by the researchers, in the study year; more than half had experienced at least one in their lifetime. More than 9% experienced an assault-related injury in the study year.
Two percent of girls experienced a sexual assault or sexual abuse, as did 4.6% of girls 14 to 17 years old.
More than 15% were mistreated by a caregiver, and one-third of that was physical abuse.
Almost 6% of children witnessed an assault between parents.
There were only two significant changes in these data from 2011. The first was a decline in exposure to dating violence. The second was a decline in lifetime exposure to household theft. Both of those things are good, but the overall numbers show we have a lot of work to do.
I encourage you to go read the whole manuscript.
Recent studies have credited Pioneer ACOs with some savings, though little to none after accounting for program costs. How do those study results square with predictions? I actually don’t know, and I’m not sure anybody does. The Pioneer Program is a demonstration program within the Center for Medicare and Medicaid Innovation. Demos don’t get scored (I’m told).
I can, however, find* predictions of savings from Medicare’s Shared Savings Program, which is a larger and slightly different ACO program than Pioneer. (Learn about the differences here.) Those predictions are below.
When reading the following, keep in mind that today Medicare costs over $500 billion per year. So, when discussing 10-year budget savings, compare figures below to something on the order of $5 trillion. In other words, $5 billion savings over 10 years is roughly 0.1 percent of total Medicare spending.
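The back-of-the-envelope comparison above can be sketched in a few lines. This is only a rough check under the same simplification the paragraph makes: Medicare spending is treated as a flat $500 billion per year for ten years, with no growth. The function name and constants are illustrative.

```python
# Assumption from the text: roughly $500 billion per year, held flat for the window.
ANNUAL_MEDICARE_SPENDING = 500e9
YEARS = 10


def savings_share_percent(savings_dollars,
                          annual_spending=ANNUAL_MEDICARE_SPENDING,
                          years=YEARS):
    """Return savings as a percentage of total spending over the window."""
    total_spending = annual_spending * years  # about $5 trillion
    return 100.0 * savings_dollars / total_spending


# $5 billion in savings over 10 years, as a share of total spending.
print(f"{savings_share_percent(5e9):.1f}%")
```

Any of the 10-year savings figures below can be passed to `savings_share_percent` to put it on the same scale.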
 In December 2008, the Congressional Budget Office (CBO) provided some clues to its thinking about ACOs (which, at the time, it called Bonus Eligible Organizations, or BEOs). Under the assumption that 20 percent of beneficiaries participated in such an organization by 2014 and 40 percent by 2019, CBO scored $5.3 billion in savings over 2010-2019. Today, only 7 million beneficiaries are associated with ACOs, or about 14 percent. The savings would decline over that span for several reasons, one of which is that as organizations became more efficient, bonus payments would grow. CBO warned that its prediction was highly uncertain because it was not clear precisely how organizations would respond to financial incentives and because of the voluntary nature of the program. Regulations are evolving to attract and retain more organizations, which often means paying them more.
 In scoring the ACA in March 2010, the Congressional Budget Office (CBO) predicted savings of about $5 billion over the original 10-year window from the Shared Savings Program.
 In 2010, the CMS actuary credited the Shared Savings Program, among many others, with zero savings for every year through 2019.
 For what it’s worth, prior attempts at care coordination programs tended not to produce savings. Researchers have found modest savings for the Physician Group Practice Demo and savings from private-sector ACO-like contracts.
My guesses: the CMS actuary is probably wrong on the low side, but possibly not by much. CBO’s estimate could be close to right, but possibly too high. We already see some savings from Pioneer ACOs, for example, but fewer beneficiaries are associated with ACOs than expected and modifications of regulations may increase program costs.
People who think ACOs will definitely turn the tide of health care spending once and for all are overconfident. Maybe ACO savings can grow over future decades, but they’d have to do some significant compounding to reach a substantial portion of total Medicare spending this century. One way they could compound is if ACOs succeeded in controlling the diffusion of expensive, new health care technology.** I am unaware of anyone making an explicit argument that they will do so.
* By “find” I mean that I asked Loren Adler, and he emailed me links.
** If I’m doing the math right, starting from 0.1% savings relative to all of Medicare spending in the first decade and doubling every decade, we’d reach only about 0.8% by the 2040s; it would take sustained doubling through the 2090s to reach roughly a quarter of spending.
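For anyone who wants to check the compounding in the second footnote, here is a small sketch. It takes "doubled every decade" literally and assumes the first decade is 2010-2019 at 0.1% of spending; both are my reading rather than anything scored by CBO, and the function name is illustrative. It prints the savings share for each decade so the figures can be compared directly.

```python
def savings_share_by_decade(start_percent=0.1, first_decade=2010,
                            last_decade=2090, growth=2.0):
    """Map each decade's starting year to its savings share, in percent.

    Assumes the share is multiplied by `growth` once per decade.
    """
    shares = {}
    share = start_percent
    for decade in range(first_decade, last_decade + 1, 10):
        shares[decade] = share
        share *= growth
    return shares


for decade, share in savings_share_by_decade().items():
    print(f"{decade}s: {share:.1f}% of Medicare spending")
```

Changing `growth` shows how sensitive the end-of-century figure is to the assumed rate of compounding.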