The majority of investigators who used a $50,000 per QALY threshold beginning around 1996 were unclear as to the origin of the practice. Many used this value without citing a source for the practice. Still others cited sources that did not provide either theoretical or empirical support for thresholds of $50,000 per QALY or LY. [...]
As Garber and Phelps stated, “The $50,000 criterion is arbitrary and owes more to being a round number than to a well-formulated justification for a specific dollar value”. This could account for the failure of subsequent studies to adjust the value for inflation or changing levels of income or healthcare budgets, a common criticism of the $50,000 per QALY value that applies to any fixed monetary threshold.
Prior, recent, QALY threshold posts here and here. Searching TIE, I found a post defending QALYs from 2011 here. (Yes, it’s by me, but I had no memory of it.) Here’s a well-cited paper on willingness to pay for a QALY I’ve never blogged about (or don’t recall and can’t find evidence of) by Richard Hirth, Michael Chernew, and colleagues.
Christina has a debilitating, severe case of Crohn’s disease, an extreme form of inflammatory bowel disease. Hers is a story of struggle, hope, near death, and life on the cutting edge of medical science.
This is also personal. Christina is my sister-in-law, and her story is told by Jack, her husband, a health and science reporter with New Hampshire Public Radio.
Below are links to three new papers on risk adjustment in the ACA Marketplaces. I have not read them in full, but intend to. This is an area I’ve found hard to get clarity on for some time. I hope these papers fill the void.
Previous coverage of this question here. In a recent NEJM Perspective, Peter Neumann, Joshua Cohen, and Milton Weinstein considered it.
The $50,000-per-QALY ratio has murky origins. It is often attributed to the U.S. decision to mandate Medicare coverage for patients with end-stage renal disease (ESRD) in the 1970s: because the cost-effectiveness ratio for dialysis at the time was roughly $50,000 per QALY, the government’s decision arguably endorsed that cutoff point implicitly. However, the link to dialysis is inexact — and even something of an urban legend, given that the cost-effectiveness ratio for dialysis was probably more like $25,000 to $30,000 per QALY, the ESRD decision was controversial, and even at the time Medicare was covering some treatments costing more than $50,000 per QALY.
Furthermore, the $50,000-per-QALY standard did not gain widespread use until the mid-1990s, long after the ESRD decision, and seems to stem more from a series of articles that proposed rough ranges ($20,000 to $100,000 per QALY) for defining cost-effective care. The field settled on $50,000 per QALY as an arbitrary but convenient round number, after several prominent cost-effectiveness analyses in the mid-1990s referenced that threshold and helped to congeal it into conventional wisdom. Researchers continue to cite the threshold regularly, although in recent years more have been referencing $100,000 per QALY. [...]
Given the evidence suggesting that $50,000 per QALY is too low in the United States, it might best be thought of as an implied lower boundary. Instead, we would recommend that analysts use $50,000, $100,000, and $200,000 per QALY. If one had to select a single threshold outside the context of an explicit resource constraint or opportunity cost, we suggest using either $100,000 or $150,000.
Not to detract from the piece at all, but just as a point of humor, I like the, “If you had to pick one, here are two” hedge.
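To make the multiple-threshold recommendation concrete, here is a minimal sketch of checking one incremental cost-effectiveness ratio against $50,000, $100,000, and $200,000 per QALY. The three thresholds come from the quoted passage; the function names and the example treatment (costing $60,000 more and adding 0.5 QALYs) are invented for illustration and come from no study.

```python
# Thresholds in dollars per QALY, as suggested in the quoted passage.
THRESHOLDS = (50_000, 100_000, 200_000)


def icer(delta_cost, delta_qaly):
    """Incremental cost-effectiveness ratio: extra dollars per extra QALY."""
    if delta_qaly <= 0:
        raise ValueError("this sketch only handles treatments that add QALYs")
    return delta_cost / delta_qaly


def verdicts(delta_cost, delta_qaly, thresholds=THRESHOLDS):
    """Map each threshold to whether the treatment's ratio falls at or below it."""
    ratio = icer(delta_cost, delta_qaly)
    return {t: ratio <= t for t in thresholds}


# Invented example: $60,000 in added cost for 0.5 added QALYs.
example_ratio = icer(60_000, 0.5)
example = verdicts(60_000, 0.5)
print(example_ratio)  # 120000.0
print(example)        # {50000: False, 100000: False, 200000: True}
```

The same treatment fails at two thresholds and passes at the third, which is presumably why the authors would rather see all three reported than one.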
All statistical studies for causal effects are seeking the same type of answer, and real world randomized experiments and comparative observational studies do not form a dichotomy, but rather are on a continuum, from well-suited for drawing causal inferences to poorly suited. For example, a randomized experiment with medical patients in which 90% of them do not comply with their assignments and there are many unintended missing values due to patient dropout is quite possibly less likely to lead to correct inferences for causal inferences than a carefully conducted observational study with similar patients, with many covariates recorded that are relevant to well-understood reasons for the assignment of treatment versus control conditions, and with no unintended missing values.
The first part of the RCM [Rubin Causal Model] is conceptual, and it defines causal effects as comparisons of “potential outcomes” under different treatment conditions on a common set of units. [...] The second part concerns the explicit consideration of an “assignment mechanism.” The assignment mechanism describes the process that led to some units being exposed to the treatment condition and other units being exposed to the control condition. The careful description and implementation of these two “design” steps is absolutely essential for drawing objective inferences for causal effects in practice, whether in randomized experiments or observational studies, yet the steps are often effectively ignored in observational studies relative to details of the methods of analysis for causal effects. One of the reasons for this misplaced emphasis may be that the importance of design in practice is often difficult to convey in the context of technical statistical articles, and, as is common in many academic fields, technical dexterity can be more valued than practical wisdom.
A crucial idea when trying to estimate causal effects from an observational dataset is to conceptualize the observational dataset as having arisen from a complex randomized experiment, where the rules used to assign the treatment conditions have been lost and must be reconstructed.
Running regression programs is no substitute for careful thinking, and providing tables summarizing computer output is no substitute for precise writing and careful interpretation.
The next step is to think very carefully about why some units (e.g., medical patients) received the active treatment condition (e.g., surgery) versus the control treatment condition (e.g., no surgery): Who were the decision makers and what rules did they use? [...] In common practice with observational data, however, this step is ignored, and replaced by descriptions of the regression programs used, which is entirely inadequate. What is needed is a description of critical information in the hypothetical randomized experiment and how it corresponds to the observed data.
It is remarkable to me that so many published observational studies are totally silent on how the authors think that treatment conditions were assigned, yet this is the single most crucial feature that makes their observational studies inferior to randomized experiments.
No amount of fancy analysis can salvage an inadequate data base unless there is substantial scientific knowledge to support heroic assumptions. This is a lesson that many researchers seem to have difficulty learning. Often the dataset being used is so obviously deficient with respect to key covariates that it seems as if the researcher was committed to using that dataset no matter how deficient.
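A toy example may help show why the assignment mechanism and the key covariates matter so much. The sketch below is my own illustration, not anything from the quoted papers. The records are invented, and I assume that surgeons mostly operate on severe cases and that surgery adds exactly 10 points to the outcome within each severity group.

```python
from statistics import mean

# Invented records: (severity, got_surgery, outcome_score).
# Assumed assignment rule: severe patients are far more likely to get surgery.
records = (
    [("mild", False, 80)] * 8 + [("mild", True, 90)] * 2
    + [("severe", False, 40)] * 2 + [("severe", True, 50)] * 8
)


def diff_in_means(rows):
    """Treated-minus-control difference in mean outcome."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return mean(treated) - mean(control)


def stratified_diff(rows):
    """Average the within-severity differences, weighted by stratum size."""
    total = len(rows)
    estimate = 0.0
    for level in sorted({s for s, _, _ in rows}):
        stratum = [r for r in rows if r[0] == level]
        estimate += len(stratum) / total * diff_in_means(stratum)
    return estimate


naive = diff_in_means(records)        # ignores how treatment was assigned
adjusted = stratified_diff(records)   # compares like with like
print(naive)     # -14
print(adjusted)  # 10.0
```

Ignoring severity, surgery looks harmful (-14). Comparing within severity groups recovers the +10 built into the data. If severity had not been recorded, no analysis of these records could recover it, which is the point about deficient datasets.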
The following originally appeared on The Upshot (copyright 2014, The New York Times Company).
As a candidate in 2008, President Obama promised that health reform would reduce family premiums by up to $2,500, equivalent today to about a 15 percent reduction from the 2013 level. Though Mr. Obama might have been including the effects of premium subsidies in his calculation, a key premise of the Affordable Care Act is that competition among health insurers will drive premiums downward. So it’s worth asking: How much savings can additional competition produce?
The most direct answer to this question comes from analysis by Leemore Dafny and Christopher Ody of Northwestern University and Jonathan Gruber of M.I.T. They estimated the effect of greater competition on premiums for the second-cheapest silver-rated plans in the 34 exchanges that rely on at least some operational assistance from the federal government, known as “federally facilitated” exchanges. Their findings were based on a statistical model that predicts the effect of competition in the marketplace on premiums, controlling for other factors that could affect premiums like the demographics, income and hospital price levels in each market.
Many insurers did not participate in many of these exchanges in 2014. UnitedHealthcare, the nation’s largest insurer with 84 million policies in force in 2010, did not participate in any exchanges. Had it done so, Ms. Dafny and colleagues estimated that premiums would have been 5.4 percent lower. Had all insurers in each state’s 2011 individual market participated in that state’s exchange in 2014, premiums would have been 11 percent lower, saving $1.7 billion in federal premium subsidies.
(While the study looked only at the specific type of plan whose premiums are used to calculate federal subsidies — the second-cheapest silver-rated plan — Ms. Dafny said that premiums for other types of plans were typically highly correlated.)
But it’s not likely that all states would benefit by the same amount. A lot depends on the existing level of competition, which varies considerably. The exchanges in West Virginia and New Hampshire, for example, each have only one insurer this year, while the one in New York has 17.
Studies show that market entrances by insurers have a greater effect when there are few in a market, compared with when there are already many. (A more precise measure of competition takes into consideration enrollments into plans offered by insurers, not just number of insurers: A four-insurer market in which one has 90 percent of all enrollment is less competitive than a four-insurer market in which enrollment is split evenly.)
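One standard enrollment-weighted measure of concentration is the Herfindahl-Hirschman index (HHI), the sum of squared market shares. The article does not say which measure the studies used, so treat the HHI here as my assumption. In the sketch below, the 90 percent share comes from the example above, but the 4/3/3 split of the remaining 10 percent is invented.

```python
def hhi(shares_pct):
    """Herfindahl-Hirschman index: sum of squared percent shares (maximum 10,000)."""
    if abs(sum(shares_pct) - 100) > 1e-9:
        raise ValueError("shares must sum to 100 percent")
    return sum(s ** 2 for s in shares_pct)


# Two four-insurer markets. Higher HHI means more concentrated, less competitive.
lopsided = hhi([90, 4, 3, 3])      # 8100 + 16 + 9 + 9
even = hhi([25, 25, 25, 25])       # 4 * 625
print(lopsided, even)  # 8134 2500


def equal_share_hhi(n):
    """HHI when n insurers split enrollment evenly."""
    return 10_000 / n


# How much one more equal-sized entrant lowers concentration.
drop_2_to_3 = equal_share_hhi(2) - equal_share_hhi(3)
drop_18_to_19 = equal_share_hhi(18) - equal_share_hhi(19)
print(round(drop_2_to_3, 1), round(drop_18_to_19, 1))  # 1666.7 29.2
```

The last two numbers show changes in concentration, not in premiums. Still, they are consistent with the finding that a new entrant matters much more in a market with few insurers than in one with many.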
Of course, other things influence premiums and their growth, like costs associated with hospital, physician or pharmaceutical markets. Rigorous research controls for such confounding factors when assessing the effects of competition.
Steve Pizer of Northeastern University, Roger Feldman of the University of Minnesota and I exploited a rapid government revision of payment rates to private plans participating in Medicare Advantage markets in 2001. This revision allowed us to observe plan designs under two sets of payment rates offered close in time, and before confounding cost factors could have changed. From this natural experiment, we found that greater competition lowers premiums and raises benefits.
Researchers from the Congressional Budget Office and Vanderbilt University recently examined the effects of competition on premiums in the 34 regional markets of Medicare’s prescription drug program, known as Part D. In 2010, an average Part D region was served by 18 insurers.
The researchers found that if one additional insurer had entered the market, premiums would have fallen by 0.4 percent. The savings are modest because there are diminishing returns to competition: Adding one insurer to a market with 18 has a much smaller effect than adding one to a market in which only a few insurers participate.
Other work by Ms. Dafny also demonstrates diminishing returns to competition. She examined data from a sample of large employers from 1998-2005. In years in which employers earned higher operating profits, they paid higher premiums, but only in markets with 10 or fewer insurers. She found the largest effects in markets with fewer than five insurers; the more insurers in the market, the lower the premium increases.
Similarly, recent analysis by the Urban Institute found higher premiums in less competitive markets.
The University of Pennsylvania health economist Robert Town, co-author of a comprehensive review of competition in health care markets, thinks that insurers are being drawn to exchanges by favorable market conditions: “The increase is being driven, I believe, by insurers seeing that the volume of enrollment on the exchanges is meeting/exceeding predictions.”
How might we make exchanges with low levels of competition even more attractive to potential participants? Ms. Dafny thought that some changes in how exchanges operate could help. She suggested favorable placement of new insurers’ plans “in search results, on-site ads, or automatic enrollment of certain individuals.” Another possibility is allowing some organizations “to offer insurance on a trial basis before having to satisfy all of the standards imposed by state departments of insurance.”
For a given individual, gains from competition could be offset by other effects on net premiums, like subsidy level or age. So even if President Obama’s promised $2,500 savings actually materialize for some people, they won’t for others. By one interpretation, it was ambitious for the president to have ever made that promise. Premiums almost always go up year to year. However, if competition can be enhanced, particularly where it is weakest, premiums could come down considerably in the future, relative to what they would otherwise be in more concentrated markets.
The premise that consumers can save money in more competitive health insurance markets is reasonable. The challenge is to lure enough competitors to reap the benefits.
I did not expect my children to express sadness around their birthdays. As they’ve aged they’ve been better able to articulate why. “I don’t want to grow up,” they’ve told me, crying. Crying!
Their emotions are mixed. Birthdays are exciting, and they have a good time too. But the sense of the gradual, inexorable chipping away at their youth is palpable, even to them, as it is to me.
I’m 42. I get it. I don’t want to age anymore either. This is a common feeling among adults. I guess kids feel it too, or my kids anyway. Maybe it’s genetic, because I was the same when I was young. I recall trying to hang on to what it felt like to be five as I turned six, then seven, then eight, …
Despite my attempts, I could not do it. My brain aged with my body, and my mind changed with it, as it should. Not only did the memories fade, but so did the feelings. At some point I could not conjure up the sense of fiveness. It was gone. Completely gone. And I was sad. I could never go back, not even in my imagination, and this inevitable irreversibility is not what I wanted.
As a teenager, perhaps 15, I recall discussing something like this with my wonderful and wise step-father. When he realized what I was getting at he asked me what was so special about being young. My answer was paradoxical: “Because you change a lot when you’re a kid. Your mind changes, and that’s fun. It’s interesting.” I wanted to stay young precisely because I enjoyed the rapid change of growing up. Perhaps fiveness was exciting to me because there was so much newness, so much change. But to experience it also meant gradually becoming not five.
Then my step-dad said something important that I will never forget. “You think adults don’t change?”
“Well, once you grow up, you’re kind of done, right?” I said.
“Oh no! Anybody can change. Personal growth does not stop when you’re a grown up.” I believed him because I had learned that he’s usually right. And now I know from experience that he spoke the truth.
Without knowing it, my step-dad and I were discussing the end of history illusion. This is the phenomenon that when one contemplates the future—the next decade, say—one tends to imagine very little change in one’s personality, values, and preferences. However, when one thinks back over the past decade, one recognizes considerably more change. The amazing thing is that, according to work by Jordi Quoidbach, Daniel Gilbert, and Timothy Wilson, this discrepancy between the predicted change in the future and the recognized change in the past exists at every age.
Throughout their lives, people “regard the present as a watershed moment at which they have finally become the person they will be for the rest of their lives.” There is no more growth, we think. But we’re always wrong. This is the end of history illusion.
In a sequence of studies, Quoidbach, Gilbert, and Wilson compared people’s predictions of change in personality, values, and preferences over the next ten years with recollections of people ten years older about the prior ten years. For example, they compared how 20 year olds thought they’d change (or not) to how 30 year olds had actually changed over the same decade of life, 20-30. They did this for every age between 18 and 68 years old. The charts below, based on metrics of changes in personality, values, and preferences, summarize their findings.
As the charts show, reported changes were above predicted changes for all ages except after age 55/65 with regard to preferences. This suggests that we change as we age a lot more than we expect to, which is what my step-father was telling me. The charts also show that we do change less over time, just not as much less as we think.
“Time is a powerful force that transforms people’s preferences, reshapes their values, and alters their personalities,” the authors wrote.
When I get sad about aging, this insight brightens my mood. It’s not over! Of course I will never be five or even feel fiveness, but I’m not done experiencing the very thing I cherished about youth: change.
My kids aren’t either, despite what they may think.
If you are sad at birthdays because it means the end of the personal growth you’ve been enjoying, you’re wrong. You have not become who you will always be.
Watch Dan Gilbert explaining the end of history illusion in his TED talk:
McDonald was in an unusual position and did a difficult thing. He knew more about the solid rocket booster’s O-rings than just about any other engineer on the planet. Because low temperatures degraded their ability to keep hot gas from escaping the booster’s joints, he did not approve of launching the Challenger in the extremely cold conditions on January 28, 1986. He was in the room when the decision was made to do so anyway, over his objection. And in the subsequent investigation into the cause of the explosion of the Challenger, he was one of the few who spoke the full truth without attempts to paper over the flawed decision-making process.
What gave NASA cover to launch—but in the full context of the situation was inexcusable anyway—was a signed blessing of the O-ring’s adequacy from the solid rocket booster’s manufacturer, Morton Thiokol. Despite the low temperatures, McDonald’s superiors at Morton Thiokol decided the O-rings would be fine. This was what drew me to the story.
Why did Morton Thiokol executives make the go-for-launch call when their engineers and the data strongly argued the risk of failure was high?
I imagine it was a tough call. I sure hope it was! Try to put yourself in their position. On one side, you’ve got engineers telling you the O-rings can’t handle the conditions, and they have some data to support that position, though it’s not an airtight argument. There’s always room for some doubt, some probability things will be fine.
On the other side you have what? “System pressure,” as David Newman would call it, also known as “conflicts of interest.” The pressure to please the client, NASA, was high. NASA was, at that time, considering Morton Thiokol’s next contract. A lot depended on keeping the money flowing. Jobs were at stake. It’s no small thing to displease a client, lose a contract, and have to lay off hundreds of workers who are counting on you. NASA had its own form of system pressure, in wanting to maintain a tight schedule of launches to show Congress—which controls the purse strings after all—it could perform as promised.
System pressure should never have ratcheted up so high that it created strong incentives to launch on January 28, 1986. That was NASA’s fault. Perhaps overly politicized “oversight” by Congress can and did play a role as well. (This is not unique to NASA and the Shuttle.)
But even in much less vital circumstances—ones even you and I face—there is some system pressure. We become invested in our positions, feeling our reputations ride on them. We have some responsibility to maintain our salaries and even grow them. Some of us are responsible for creating revenue that others and their families rely on. There are professional and cultural norms that we are loath to cross.
Sometimes, though by no means always, these forces push against doing the right thing. We are conflicted, at least somewhat. And sometimes they do so when there’s some ambiguity as to just what the right thing is. Here’s where it’s easy (or easier) to shade, to lean, to allow those system pressures to tip the scales so we can have it all. We find a way to justify doing the thing that doesn’t disrupt the status quo, even when without those system pressures we would not do that thing.
It’s very hard to be fully attuned to when this is happening. Most of the time it doesn’t matter much. Few decisions are anywhere near as important as whether or not to launch the Space Shuttle in temperatures below those at which its components have been tested. But sometimes a decision matters just enough that one is risking one’s integrity and credibility (if not worse) by succumbing to system pressure when it opposes what is empirically the (more) right call.
McDonald did a hard, brave thing by resisting system pressure. He paid a price for it, though his career seems to have gone quite well anyway. It’s no small feat, what he did, and some of his colleagues couldn’t do it. Is it so clear you or I could in the same circumstance? In what ways do you or I allow system pressure to chip away at our credibility and integrity, if only imperceptibly? I find it disturbingly interesting to ponder these questions.
The book is long, both because it is so detailed and because it tells the history of the aftermath of the disaster linearly. It was investigated several ways: by Presidential Commission, by congressional committees, and in various lawsuits. In each part of the story some of the same arguments and episodes are covered. I found some parts of the book overly technical. But it’s easy enough to skim and skip. If you don’t read it, at least listen to the Freakonomics episode.