• Does Medicare Advantage have a quality and efficiency advantage?

    Either Medicare Advantage is doing something amazing or data limitations are skewing our view of it. I imagine most people’s priors will drive them to interpret the findings of Bruce Landon and colleagues in one of those two ways.

    Medicare Advantage (MA) plans have greater flexibility than traditional Medicare (TM). MA can offer more benefits, selectively contract with providers, impose utilization controls (like referral requirements), and implement care coordination programs without large regulatory burdens or new acts of Congress. MA plans must also be responsive to the market, which should provide incentives for higher quality and greater efficiency.

    Put it all together and, in theory at least, MA should outperform TM in efficiency and quality. But does it?

    Most studies fail to convince one way or the other because researchers are not permitted the same degree of access to MA data as they are to TM data. For the latter, full claims over many years are available* (though quality measures not derived from claims data are not). For the former, some aggregate measures of utilization supplied by the plans are usually all we get, and when we do get them, they don’t span many years. (However, more quality measures are available from MA plans.)

    Comparing MA to TM is like trying to compare two houses, one of which you can live in, the other of which you can only observe through a few keyholes.

    In 2006 and 2007 (and only 2006 and 2007), the Centers for Medicare and Medicaid Services (CMS) offered a glimpse of MA through a new keyhole: relative resource use (RRU) data. These plan-level data measure utilization with standardized prices, which removes geographic and MA- or TM-specific price differences. They do so for diabetic patients in both years and those with cardiovascular disease in 2007 only. They are also stratified by age, sex, diabetes type (1 or 2), cardiovascular disease (acute myocardial infarction, congestive heart failure, angina, or coronary artery disease), and the presence or absence of at least one major comorbidity.
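
    To make the standardized-price idea concrete, here is a minimal sketch in Python (with entirely hypothetical service categories, prices, and counts; it is not the actual CMS RRU methodology). Utilization is valued at a single national price per service, so differences across plans reflect quantities of care rather than what was paid for it.

        # Hypothetical illustration of standardized-price resource use.
        # Service categories and prices are made up; the real RRU measure
        # is more detailed and risk-stratified.
        STANDARD_PRICES = {
            "inpatient_day": 2000.0,
            "office_visit": 100.0,
            "procedure": 800.0,
        }

        def standardized_resource_use(utilization):
            """Value utilization counts at standardized (not actual) prices.

            `utilization` maps a service category to the number of units used.
            Because every plan's use is priced identically, geographic and
            MA-vs-TM payment differences drop out of the comparison.
            """
            return sum(STANDARD_PRICES[svc] * n for svc, n in utilization.items())

        # Two enrollees who receive identical care get identical scores,
        # even if their plans actually paid very different negotiated prices.
        ma_enrollee = {"inpatient_day": 2, "office_visit": 5, "procedure": 1}
        tm_enrollee = {"inpatient_day": 2, "office_visit": 5, "procedure": 1}
        assert standardized_resource_use(ma_enrollee) == standardized_resource_use(tm_enrollee)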

    Individual-level Healthcare Effectiveness Data and Information Set (HEDIS) data for MA plans—which measure quality of ambulatory care—are also available for those years (and many others). Using 2007 RRU data to measure efficiency and HEDIS to measure quality, Landon et al. constructed similar resource use and quality metrics for a 20% random sample of TM beneficiaries. Quality metrics included, for diabetics, A1C testing in the current year and a diabetic retinal exam in the current or prior year; for both diabetics and patients with cardiovascular disease, LDL cholesterol testing in the current year. These quality metrics are only applicable to and computed for 65- to 75-year-olds.

    To control for geographic variation in service delivery and quality and demographic differences between MA and TM, the authors weighted the TM sample such that it matched their MA sample demographically within each zip code. This also controls for zip code level socioeconomic differences across the two samples.
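
    As a rough sketch of that kind of reweighting (hypothetical field names and cells; the paper’s exact weighting scheme may differ), one could weight each TM beneficiary so that, within each zip code, the TM age/sex mix matches the MA mix:

        from collections import Counter

        def zip_matching_weights(tm_members, ma_members):
            """Weights that make the TM demographic mix match MA within each zip code.

            Each member is a dict like {"zip": "02115", "age_band": "70-74", "sex": "F"}.
            Returns one weight per TM member. A simplified illustration, not the
            paper's actual procedure.
            """
            cell = lambda m: (m["zip"], m["age_band"], m["sex"])
            ma_counts = Counter(cell(m) for m in ma_members)
            tm_counts = Counter(cell(m) for m in tm_members)

            # Scale every TM member in a (zip, age, sex) cell so the weighted TM
            # count equals the MA count for that same cell.
            return [ma_counts[cell(m)] / tm_counts[cell(m)] for m in tm_members]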

    On average, RRU was about 20 percentage points lower for MA than TM. Lower utilization was observed in MA across both disease types and service categories (inpatient, surgery and procedures, evaluation and management). However, as shown in the figure below, for newer (entered the program in 2006 or 2007), smaller (<25,000 enrollees), and for-profit HMO or PPO MA contracts,** RRU was higher in MA than TM for inpatient care.

    [Figure 1: relative resource use (RRU), MA vs. TM]

    The chart below combines resource use and a composite of diabetes care quality for MA HMOs vs TM. (A chart with similar patterns for cardiovascular disease is provided in the paper’s appendix.) The former is on the horizontal axis (low spending to the left, high to the right). The latter is on the vertical axis (low quality downward, high upward). Each data point (circle or triangle) is the difference between a specific HMO contract and TM. Larger symbols are for larger contracts (>25,000 members); triangles for new contracts, circles for older ones; blue for nonprofit and purple for for-profit.


    By and large, MA HMOs use fewer resources and provide better quality, though this is more often the case for larger, established ones (relatively more big circles in the upper left and relatively more small triangles in the lower right). (The authors did not include a similar analysis of PPOs. They wrote me that most PPOs were small, new, and for profit, and there were many fewer of them than HMOs in 2007.)

    The authors point out several limitations of the analysis:

    • It only considered a few aspects of quality, as constrained by data availability.
    • It is possible MA plans experienced favorable selection in the time period assessed, even within disease type, and even controlling for demographics, comorbidities, and socioeconomic status, to the extent the authors could and did.
    • The data are quite old, from 2007; that’s the latest year available.

    To these, we should add:

    • Only two disease types were considered, again because of data limitations.
    • It is possible MA plans upcoded relative to TM such that the MA cohort appeared relatively sicker, which, after adjustments, might make resource use look relatively lower.
    • Beneficiaries with “concomitant specified dominant medical conditions including active cancer, end-stage renal disease, human immunodeficiency virus/AIDS, and organ transplants” were not included in RRUs and, hence, excluded from the analysis. It’s possible MA plans provide disproportionately inefficient or poor care to such beneficiaries.
    • The analysis included all ages 65 and older, but the contract-level HEDIS quality measures used in the analysis are only applicable to and computed for ages up to 75, so it’s possible there are some offsetting quality differences for older enrollees.
    • Within-zip-code socioeconomic differences could not be controlled for.
    • PPOs were excluded from the quality/efficiency analysis (the figure just above).

    Even if the results accurately depict the efficiency and quality of MA, relative to TM, it must be emphasized that MA plans were paid well above their costs in 2007 and are still paid above them today, though not by as much. In other words, whatever their efficiency, taxpayers are not benefiting; whatever their quality, that comes at a higher price.

    Still, either MA is doing something amazing—broadly providing substantially better care with less utilization, which is something few initiatives in TM have ever been able to do—or the results are, at least in part, artifacts of analytic limitations. The best way to decide which is to do more research with more complete data. Until we’re offered more than selected glimpses through keyholes at MA, we may never get the chance to do that.

    * These days, with the exception of any substance use disorder related claims.

    ** The analysis included HMO and PPO enrollees, not those in other plan types (PFFS, special needs plans, cost plans, etc.) because those other plan types are both small and exceedingly different.


  • The TIE daily email is working again

    TIE’s daily email (to which you can subscribe here) was out for a couple of weeks. Did you notice? Since this may happen again, here’s what you should do if you’re someone who counts on the emails:

    1. If you don’t get a TIE email for a couple of days, assume it’s broken. We rarely go more than a day without a post, and when we do, it’s typically only on weekends.
    2. Go to the home page and catch up on posts. (There are other ways to keep up as well.) Even though there’s no email, there are probably still new posts.
    3. If you notice new posts (and no email), please alert us. We sometimes aren’t aware the email isn’t working.
    4. Be patient. Fixing these things is sometimes tricky, and we run TIE on no budget and almost no time.

    Also, even though the email is working again, many, many posts never made it out. You can click back through all the TIE posts from the home page to catch up.


  • I’m loving my new productivity-enhancing schedule

    Several weeks ago, I made a substantial change to how and when I do various work tasks and non-work activities. It seems to have not only boosted my productivity, but also improved my mental health (both completely subjective and N=1). FWIW, I thought I’d share.

    Before the change, here’s roughly how I spent my weekdays and how I felt about it:*

    • ~5:30AM-6:30AM: Catch up on email, news, Twitter. If time permitted (which was rare), also write blog posts.
    • ~6:30AM-8:00AM: Cycle among podcasts, email, news, Twitter on my commute. If time permitted (which was rare), also read. Already there’s a problem here. Writing and reading were the things I felt I should get to, but clearly I was permitting other tasks to take priority in the early morning hours. This made me feel bad and immediately “behind.”
    • ~8:00AM-4:00PM: Cycle among email, news, Twitter, and various work tasks. This caused me to feel generally scatterbrained and unfocused. I didn’t like it, but felt I needed to “keep up.” I had alerts for email going on my computer and phone, which encouraged me to switch to it whenever someone sent me an email. Why I should let someone else dictate my workflow never really crossed my mind … until recently.
    • ~4:00PM-9:00PM: Cycle among email, news, Twitter, family stuff. If time permitted (which was rare until after 8PM), try to write blog posts. Basically more of the same scatterbrained nonsense.

    Not surprisingly, this is generally dumb behavior for two reasons. First, I was trying to cycle too quickly among tasks, which is inefficient and made me feel (and, I think, caused me to be) relatively unproductive and unfocused. Second, there was zero attempt to match times of day to tasks for which my brain is best suited. I was not happy, but I didn’t really recognize it until recently.

    I changed all this, so my days are now closer to:

    • ~5:30AM-6:30AM: Write blog posts because this is the absolute best time for me to write. It’s what my brain wants to do. If no topic is available, I read papers.
    • ~6:30AM-8:00AM: Catch up on email, news, Twitter, after which, read. Podcasts are reserved for the walking part of my commute during which none of those are optimal.
    • ~8:00AM-4:00PM: Segment into a long morning work time (~2.5 hours) and another afternoon work time of the same length, during which I do just one work task at a time, to completion or the end of the time period, whichever comes first. Compress all email, news, Twitter checking into a midday and afternoon check. Turn off all alerts. No dings. No vibrations. Nothing. Try to schedule meetings and calls during the remaining times. In other words, protect some large chunks for focused work. Also, only respond to emails requiring responses. Save the rest for later or delete. Do a lot of deleting without even reading. Unsubscribe from lots of stuff. Create filters to trash for the unsubscribe-able. :)
    • ~4:00PM-9:00PM: Read on my commute (or podcasts while walking). Then family stuff until 8PM or so, at which point just deal with emails I’d put off and easy, brainless home and work administrative tasks. (We all have lots of this crap.) Read if time permits. Don’t do any other work. This is the time of day during which writing is much harder. I’m tired. So doing the stuff with low cognitive demands now is optimal.

    Having done this for a few weeks, I feel dramatically less scatterbrained (more focused). I think I’m getting more done more efficiently, and I’m happier. I no longer go through the day thinking I need to get to this writing task or read that paper. I get to what I can get to when it makes most sense. I don’t worry about anything else but maintaining the discipline of my schedule. (Sometimes meetings and other demands intervene, but at least I’m not causing additional interruption through bad behavior anymore.)

    It’s no longer about what I have to get to; it’s simply about doing the thing I’ve dedicated this moment to. When I find my mind drifting toward “you’ve got other stuff you have to get to,” I push that thought away and keep focusing on the task at hand. When my writing time or my allocated chunk of work time is over, I put the job away and do the next thing I’m supposed to do. It’s the difference between managing time (good) and juggling tasks (bad).

    For me, it just works. Your mileage may vary.

    * I also made a big weekend change, which amounts to not checking email/news/Twitter except once in the morning and once at night.


  • Impact factor

    Via Hilda Bastian, this cartoon accompanies this post by her on open access journals:

    [Cartoon: impact factor]


  • AcademyHealth: Medicare Advantage upcoding

    Medicare beneficiaries enrolled in Medicare Advantage (MA) are probably at least a little healthier than those who enroll in traditional Medicare (TM). But, according to recent work, MA enrollees appear much sicker than they would if enrolled in TM due to how their illnesses are coded. Learn all about this upcoding in my latest AcademyHealth post.


  • Health care providers respond to financial incentives. They agree. And it’s not an insult.

    Health care providers rarely admit that their care is influenced by financial incentives. In The 8 Basic Payment Methods in Health Care, Kevin Quinn disagrees.

    [P]ayment methods clearly affect whether, how, and how much care is provided. Examples include hospital length of stay, diagnostic imaging in physician offices, home health care visits, coordination among physicians and hospitals, the volume and mix of services under fee-for-service medicine, and much more. Financial incentives seem particularly potent in situations of clinical ambiguity, such as diagnostic tests, follow-up visits, and some procedures. Effects of financial incentives often become more evident over time, such as decisions to open and close business lines and medical students’ choice of specialty.

    If financing didn’t affect health care delivery, nobody would argue that payment cuts will harm patient care. However, that’s a standard argument providers make every time a Medicare payment cut is proposed. And though it may not apply to every circumstance, it’s not necessarily a bad argument. My point is that making the argument admits that how and how much providers are paid affects the care they deliver.

    Similarly, how much I get paid affects my children’s education. Just as clinicians provide care under resource constraints, I parent under resource constraints. I cannot afford to live in the town in America with the absolute best schools, but my kids might learn more if I could. I cannot afford to take a year off to live with my kids in Paris or Hong Kong, but, arguably, it’d be an educational enhancement for my kids.

    My kids don’t get every possible educational advantage money could buy; they get every possible educational advantage I can reasonably afford with the money I have. Likewise, clinicians cannot provide every patient with every possible thing that might make them healthier (which, in many cases, wouldn’t be health care anyway). By and large, clinicians do the best they can with the resources they have available.

    Saying that the nature of health care delivery responds to how and how much providers are paid isn’t an insult any more than saying I’d provide my kids with different educational experiences if I were paid more (or less or differently). It’s just an admission of the fact that doing stuff requires resources and resources cost money. Different kinds and amounts of payment cause different kinds and amounts of resources to be affordable, and therefore purchased, which affects the nature of care. Pay me in vouchers redeemable only for a home in the nearby town with a better school system or for flights to Paris and my children’s lives would no doubt be different.

    Quinn’s paper is subtly awesome and a recommended read. I will write more about it in the future. He also wrote one of my favorite opening paragraphs to a paper.


  • An important message for TIE email subscribers*

    This post is pinned to the top of TIE. If you’re on the home page, scroll down for new posts. We’re still posting daily!

    TIE’s daily email is not working. We are working on a fix, but it’s proven to be a difficult issue. If you rely on the email but still want to read TIE, you can do so on TIE’s website, and follow in other ways too. You do not need to unsubscribe and resubscribe to the email list. It’s our problem, not yours.

    * Yes, I’m aware that those who rely on email subscriptions may not see this post.

  • How to Know Whether to Believe a Health Study

    The following originally appeared on The Upshot (copyright 2015, The New York Times Company).

    Every day, new health care research findings are reported. Many of them suggest that if we do something — drink more coffee, take this drug, get that surgery or put in this policy — we will have better (or worse) health, or longer (or shorter) lives.

    And every time you read such news, you are undoubtedly left asking: Should I believe this? Often the answer is no, but we may not know how to distinguish the research duds from the results we should heed.

    Unfortunately, there’s no substitute for careful examination of studies by experts. Yet, if you’re not an expert, you can do a few simple things to become a more savvy consumer of research. First, if the study examined the effects of a therapy only on animals or in a test tube, we have very limited insight into how it will actually work in humans. You should take any claims about effects on people with more than a grain of salt. Next, for studies involving humans, ask yourself: What method did the researchers use? How similar am I to the people it examined?

    Sure, there are many other important questions to ask about a study — for instance, did it examine harms as well as benefits? But just assessing the basis for what researchers call “causal claims” — X leads to or causes Y — and how similar you are to study subjects will go a long way toward unlocking its credibility and relevance to you.

    Let’s look closer at how to find answers. (If the answers are not in news media reports, which they should be, you’ll have to chase down the study — and admittedly that’s not easy. Many are not available without cost on the web.)

    It’s instructive to consider an ideal, but impossible, study. An ideal study of a drug would make two identical copies of you, both of which experience exactly the same thing for all time, with one exception: Only one copy of you gets the drug. Comparing what happens to the two yous would tell us the causal consequences of that drug for you.

    Clearly, there are a few complications in the real world. We only have one of you to play with. Also, you don’t participate in most studies, if any. The people researchers examine are never exactly like you. So how do we extract some value from the imperfect?

    Researchers employ various methods to infer what would happen to people who might be like you in two different circumstances, such as taking or not taking a drug. The most widely trusted approach is the randomized controlled trial. In the most basic randomized trial, individuals are randomly assigned to treatment (e.g., they get the new drug) and control (e.g., they get a placebo or nothing).

    This random assignment is powerful. If done with enough people, it causes the two groups to be statistically identical to each other except for the experience of the treatment (or not). Whatever changes are observed can usually be attributed to that treatment with a good degree of confidence.
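
    A tiny simulation makes the point (illustrative only, with made-up characteristics): flip a coin for each of many people and the two resulting groups end up nearly identical on traits the researcher never even looked at.

        import random
        import statistics

        random.seed(0)

        # A made-up population: each person has an age and a baseline risk score.
        people = [{"age": random.gauss(60, 10), "risk": random.random()}
                  for _ in range(10_000)]

        treated, control = [], []
        for person in people:
            # A coin flip, not the person's health or preferences, decides the group.
            (treated if random.random() < 0.5 else control).append(person)

        # With enough people, average age (and anything else) is nearly the same in
        # both groups, so later differences in outcomes can be credited to treatment.
        print(round(statistics.mean(p["age"] for p in treated), 1),
              round(statistics.mean(p["age"] for p in control), 1))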

    Though a randomized trial makes two groups statistically identical to each other — apart from treatment received — it still doesn’t mean either group is identical to you. If the individuals selected to participate in the trial happen to be very similar to you — similar ages, income, living environment and so forth — that increases the chances that the results would apply to you. But if you’re, say, a 65-year-old, middle-class New Yorker, a study whose subjects were poor 30-somethings in rural China may not translate to your experience.

    This is one of the chief limitations of randomized trials. They’re typically focused on narrow populations that meet strict criteria — those most likely to benefit from treatment. Many drug trials exclude older patients or children because of ethical or safety concerns. Many, particularly much earlier trials, didn’t include women. We know a lot less about how drugs affect groups who weren’t studied than we might like. Harm could even come if it was assumed that findings from those who were studied applied to people who weren’t.

    My colleague Aaron Carroll provided an example of just this problem. Based on the results of randomized trials that included only adults, prescriptions of drugs known as proton pump inhibitors to infants with gastroesophageal reflux disease grew sevenfold between 2000 and 2004. Only later, in 2009, did a direct study of infants find that those drugs caused them harm, with no benefit.

    A type of study other than a randomized trial is less likely to have this kind of problem. Rather than recruiting and randomizing a narrow set of patients to generate new data, researchers can turn to “nonexperimental” or “observational” database studies. These database studies use large data sets, like those available from Medicare, Medicaid, the Veterans Health Administration or very large surveys. Some studies of this kind are large enough to allow researchers to report differences in treatment effects across groups. Perhaps women respond differently, for example.

    And because they don’t have to generate new data, nonexperimental studies are typically cheaper than randomized trials and produce results more quickly.

    People like you are more likely to be represented in a nonexperimental database study, so your top concern might be whether the findings are valid. After all, such a study doesn’t rely on the clean comparisons of randomized groups of people. Instead, it often compares groups of people who could have self-selected into receiving treatment or not. Maybe those who opted to receive it are systematically different — healthier, sicker, more careful, for example — and that’s what drives the findings. If so, what might appear causal isn’t, giving rise to the familiar “correlation does not imply causation.”

    That concern is why researchers employ techniques to try to adjust for differences across comparison groups in nonexperimental studies. These can get complex in a hurry, and few news media reports could describe them in detail. But that doesn’t mean they’re all sketchy or all ironclad. The key fact is that they all rely on different assumptions than a randomized trial, and those assumptions can and should be probed to gain confidence in causal inferences.
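
    One simple flavor of such an adjustment (a simplified sketch with hypothetical fields, not any particular study’s method) is to compare treated and untreated people only within groups that share an observed trait, then average those within-group differences. The catch, which randomization avoids, is that this only handles traits you actually observe.

        from collections import defaultdict

        def adjusted_difference(records, trait="age_band"):
            """Average the treated-vs-untreated outcome gap within strata of one trait.

            `records` is a list of dicts like
                {"treated": True, "age_band": "65-74", "outcome": 1.0}.
            Stratifying removes confounding by the chosen trait, but only by
            traits that are observed and used; unobserved differences remain.
            """
            strata = defaultdict(lambda: {"t": [], "c": []})
            for r in records:
                strata[r[trait]]["t" if r["treated"] else "c"].append(r["outcome"])

            gaps = []
            for s in strata.values():
                if s["t"] and s["c"]:  # only strata containing both groups are comparable
                    gaps.append(sum(s["t"]) / len(s["t"]) - sum(s["c"]) / len(s["c"]))
            return sum(gaps) / len(gaps) if gaps else None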

    Most news media reports acknowledge when a study is nonexperimental, and sometimes you can find a sentence or two about how the researchers sought to adjust for differences and tested assumptions. You should also look for statements from experts about whether those adjustments and tests were sufficient. However, these rely on judgment. There is always room for doubt.

    Ultimately, no single study is perfect. Whether it’s a randomized trial or a nonexperimental one, one can never be absolutely sure study findings are valid and applicable to you. The best bet is to wait, if you can, until evidence accumulates from many studies using a range of methods and applied to different populations.

    Few things are miracle cures, but when one shows up, we’ll see its signature in not just one study, but in many. Yes, that can take time. But if you want solid evidence you can count on, you cannot also be impatient.


  • The managed care backlash was very effective

    From The Impact of the Political Response to the Managed Care Backlash on Health Care Spending: Evidence from State Regulations of Managed Care, by Maxim Pinkovskiy:

    My results indicate that because of the political response to the managed care backlash, health care spending in a state with average HMO penetration in 1995 grew by 0.16 percentage points more per year than it would have otherwise, which is larger than the average change in the health care share across states in 2005. To assess the magnitude of my result, I use my regression to make a dynamic counterfactual forecast of the evolution of each state’s health care share under the assumption that the number of backlash regulations was equal to zero in every state and year, and aggregate the forecasts to predict the counterfactual for the U.S. health care share for each specification I run. I find that under the counterfactual of no political response to the managed care backlash, the U.S. health care share in 2005 would have been 11.52%, nearly two percentage points of GDP lower than the actually observed level, and somewhat below the 2000 level of 11.94%.

    Here’s that result in chart form:


    The paper includes a nice summary of the managed care backlash and research pertaining to it. It does not cite my favorite paper on it (which I quote here).


  • AcademyHealth: Who has universal health coverage?

    True or false? All the other wealthy nations the U.S. might consider its peers have achieved universal health care coverage. The answer in my latest AcademyHealth post.


