• Methods: Intention-to-treat

    In JAMA, Michelle Detry and Roger Lewis explain the “intention-to-treat” (ITT) principle:

    [I]n a trial in which patients are randomized to receive either treatment A or treatment B, a patient may be randomized to receive treatment A but erroneously receive treatment B, or never receive any treatment, or not adhere to treatment A. In all of these situations, the patient would be included in group A when comparing treatment outcomes using an ITT analysis. Eliminating study participants who were randomized but not treated or moving participants between treatment groups according to the treatment they received would violate the ITT principle.
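
    The grouping rule in that passage can be sketched in a few lines of Python. This is only an illustration: the `Patient` record, its field names, and the three helper functions are hypothetical and do not come from the JAMA article. It shows which arm each of the four patients described above would be counted in under ITT, compared with an as-treated or per-protocol grouping.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    # Hypothetical record layout, chosen only for this illustration.
    assigned: str            # arm chosen by randomization: "A" or "B"
    received: Optional[str]  # treatment actually given; None if never treated
    adherent: bool           # whether the patient followed the protocol as directed


def itt_group(p: Patient) -> Optional[str]:
    """Intention-to-treat: analyze in the randomized arm, whatever happened later."""
    return p.assigned


def as_treated_group(p: Patient) -> Optional[str]:
    """As-treated: analyze by treatment received; None means dropped from the analysis."""
    return p.received


def per_protocol_group(p: Patient) -> Optional[str]:
    """Per-protocol: keep only patients who received and adhered to their assigned arm."""
    if p.received == p.assigned and p.adherent:
        return p.assigned
    return None


patients = [
    Patient("A", "A", True),    # randomized to A, treated as intended
    Patient("A", "B", True),    # randomized to A, erroneously received B
    Patient("A", None, False),  # randomized to A, never treated
    Patient("A", "A", False),   # randomized to A, did not adhere
]

for p in patients:
    print(itt_group(p), as_treated_group(p), per_protocol_group(p))
```

    Only the ITT column keeps all four patients in group A; the other two groupings move or drop patients based on events that happened after randomization.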

    Why do this?

    The effectiveness of a therapy is not simply determined by its pure biological effect but is also influenced by the physician’s ability to administer, or the patient’s ability to adhere to, the intended treatment. The true effect of selecting a treatment is a combination of biological effects, variations in compliance or adherence, and other patient characteristics that influence efficacy. Only by retaining all patients intended to receive a given treatment in their original treatment group can researchers and clinicians obtain an unbiased estimate of the effect of selecting one treatment over another.

    Treatment adherence often depends on many patient and clinician factors that may not be anticipated or are impossible to measure and that influence response to treatment.
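
    A small worked calculation may make the bias concrete. All numbers below are invented for illustration, as is the `arm_recovery` helper. The sketch assumes a treatment with no biological effect at all, and an unmeasured trait (here labeled "frail" versus "robust") that lowers both adherence and the chance of recovery.

```python
def arm_recovery(strata, adherers_only):
    """Expected recovery rate in one arm.

    strata: list of (population share, adherence probability, recovery probability).
    If adherers_only is True, each stratum is weighted by its adherence
    probability, which mimics a per-protocol analysis.
    """
    weight = 0.0
    recovered = 0.0
    for share, adherence, recovery in strata:
        w = share * (adherence if adherers_only else 1.0)
        weight += w
        recovered += w * recovery
    return recovered / weight


# Hypothetical population: half frail (20% recover), half robust (60% recover).
# Treatment A changes nothing biologically, but frail patients rarely adhere to it.
arm_a = [(0.5, 0.3, 0.2), (0.5, 0.9, 0.6)]
# Arm B is assumed to have perfect adherence.
arm_b = [(0.5, 1.0, 0.2), (0.5, 1.0, 0.6)]

itt_diff = arm_recovery(arm_a, adherers_only=False) - arm_recovery(arm_b, adherers_only=False)
pp_diff = arm_recovery(arm_a, adherers_only=True) - arm_recovery(arm_b, adherers_only=True)

print(f"ITT difference:          {itt_diff:.3f}")
print(f"Per-protocol difference: {pp_diff:.3f}")
```

    Under these assumptions the ITT comparison correctly shows no difference, while the per-protocol comparison shows a 10-point advantage for A that comes entirely from who adhered, not from the treatment.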

    Why not do this?

    [1] Noninferiority trials, which are designed to demonstrate that an experimental treatment is no worse than an established one, require special considerations. […] The intervention in group A may incorrectly appear noninferior to the intervention in group B, simply as a result of nonadherence rather than because of similar biological efficacy. […]

    [2] Although the ITT principle is important for estimating the efficacy of treatments, it should not be applied in the same way in assessing the safety (eg, medication adverse effects) of interventions. […]

    [3] [I]t would be unfortunate to falsely conclude, based on the ITT analysis of a phase 2 clinical trial, that a novel pharmaceutical agent is not effective when, in fact, the lack of efficacy stems from too high a dose and patients’ inability to be adherent because of intolerable adverse effects. In that case, a lower dose may yield clinically important efficacy and a tolerable adverse effect profile.

    In these cases, one may be more interested in an estimate of the effect of treatment-on-the-treated (TOT), or a per-protocol analysis.
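
    Point [1] can be illustrated with simple arithmetic. The sketch below is a simplification with made-up success rates and a made-up noninferiority margin; it compares point estimates only and ignores confidence intervals, which a real noninferiority analysis would use. It assumes that a patient randomized to A who actually receives B gets B's success rate.

```python
def itt_difference(rate_a, rate_b, crossover_fraction):
    """Expected ITT difference (B minus A) when a fraction of arm A receives B instead."""
    observed_a = (1.0 - crossover_fraction) * rate_a + crossover_fraction * rate_b
    return rate_b - observed_a


def appears_noninferior(observed_diff, margin):
    """A is declared noninferior if it trails B by less than the margin (point estimate only)."""
    return observed_diff < margin


RATE_A = 0.60   # hypothetical true success rate of experimental treatment A
RATE_B = 0.72   # hypothetical true success rate of established treatment B
MARGIN = 0.10   # hypothetical noninferiority margin

for crossover in (0.0, 0.3):
    diff = itt_difference(RATE_A, RATE_B, crossover)
    print(f"crossover={crossover:.0%}  ITT difference={diff:.3f}  "
          f"appears noninferior: {appears_noninferior(diff, MARGIN)}")
```

    With full adherence, A trails B by 12 points and fails the margin. With 30% of arm A receiving B, the ITT difference shrinks to about 8.4 points and A appears noninferior, even though its biological efficacy has not changed.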

    If you’re aware of good papers that explain the use and interpretation of common research methods, let me know in the comments, which are open for one week after this post’s time stamp, or by email or Twitter.


    • Intent to treat analysis can dilute the statistical power of a study when there are a significant number of randomized but ineligible subjects. For example, ondansetron (Zofran) is approved for the treatment of chemotherapy-induced nausea. For regimens such as cisplatin, where nearly all patients develop severe nausea and vomiting, it was possible to demonstrate efficacy in relatively small trials. However, for regimens such as FEC, where some but not all patients develop nausea, intent to treat analysis would not have judged the effects statistically significant. Patients were only considered eligible to receive treatment or control if they developed nausea, but they were randomized to treatment vs. control arms at the time of chemotherapy. If you analyze the study based on the treatment-eligible patients who were randomized, the drug is found to significantly reduce symptoms, and the FDA approved the product for this application. On the other hand, if you dogmatically apply intent to treat analysis including all of the patients who were randomized, regardless of whether they ever became eligible for treatment, the study does not achieve P<0.05 statistical significance. In this case, the per protocol analysis better reflects real world application; it would be a significant burden on patients to delay enrollment and randomization until they developed symptoms, because this would inevitably delay treatment for their symptoms.

      This discussion has specific relevance to studies examining the efficacy of personalized medicine products. Extended time periods may elapse between enrollment and initially obtaining a specimen and the time when a finished product is ready to be administered to the patient. During this intervening period, the patient's condition may change, or they may become ineligible for treatment for other reasons. In these cases, as was the case for Zofran, per protocol analysis will be more appropriate than strict intent to treat analysis.
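
      The loss of power described in this comment can be sketched with a normal-approximation power calculation. The numbers are hypothetical and are not taken from any ondansetron trial; the function `power_two_proportions` is an illustrative helper, and the calculation treats the number of eligible patients as fixed.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Uses the unpooled normal approximation and ignores the negligible
    chance of rejecting in the wrong direction.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    se = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_arm)
    return nd.cdf(abs(p1 - p2) / se - z_crit)


# Hypothetical trial: 100 patients randomized per arm at the time of chemotherapy.
N_RANDOMIZED = 100
NAUSEA_FRACTION = 0.5   # assumed share who develop nausea and become eligible
SEVERE_CONTROL = 0.6    # assumed rate of severe symptoms among eligible controls
SEVERE_DRUG = 0.3       # assumed rate of severe symptoms among eligible treated

# Analysis restricted to eligible patients: fewer patients, undiluted rates.
power_eligible = power_two_proportions(
    SEVERE_CONTROL, SEVERE_DRUG, N_RANDOMIZED * NAUSEA_FRACTION)

# Analysis of everyone randomized: patients without nausea count as
# "no severe symptoms" in both arms, which scales both rates down.
power_itt = power_two_proportions(
    SEVERE_CONTROL * NAUSEA_FRACTION, SEVERE_DRUG * NAUSEA_FRACTION, N_RANDOMIZED)

print(f"power, eligible patients only: {power_eligible:.2f}")
print(f"power, all randomized:         {power_itt:.2f}")
```

      Under these assumptions the all-randomized analysis has noticeably lower power than the eligible-only analysis of the same trial, and the gap widens as the eligible fraction falls.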


    • I think the ITT should be preferred to “as treated” because it would always produce a more conservative estimate. The per protocol analysis removes the real world impact, and reduces the generalizability of the findings. Also, retesting the data using per protocol after finding no effect by an ITT would inflate your type I error rate.

      I would want to know the effect sizes for something that was not statistically significant for an ITT but was in a per protocol. My guess is that it was weak. Statistical significance does not indicate that something is clinically or practically significant/useful/effective.
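
      The type I error point can be checked with a rough Monte Carlo sketch. It makes strong simplifying assumptions that are not from the comment: there is no true effect, and the ITT and per-protocol test statistics are modeled as standard normal variables with correlation `rho` (they share most of the same patients, so the correlation is set high). The function name and parameters are hypothetical.

```python
import random
from math import sqrt


def familywise_error(rho, n_sims=20000, z_crit=1.96, seed=0):
    """Estimate false-positive rates under the null hypothesis.

    Returns (rate for the ITT test alone,
             rate when "significant" means ITT or per-protocol rejects).
    """
    rng = random.Random(seed)
    hits_itt = 0
    hits_either = 0
    for _ in range(n_sims):
        z_itt = rng.gauss(0.0, 1.0)
        # Per-protocol statistic, correlated with the ITT statistic.
        z_pp = rho * z_itt + sqrt(1 - rho ** 2) * rng.gauss(0.0, 1.0)
        itt_sig = abs(z_itt) > z_crit
        hits_itt += itt_sig
        # Report a finding if ITT rejects, or if the per-protocol reanalysis does.
        hits_either += itt_sig or abs(z_pp) > z_crit
    return hits_itt / n_sims, hits_either / n_sims


itt_rate, either_rate = familywise_error(rho=0.8)
print(f"false-positive rate, ITT only:            {itt_rate:.3f}")
print(f"false-positive rate, ITT or per-protocol: {either_rate:.3f}")
```

      The ITT-only rate should sit near the nominal 5%, while the "try per protocol if ITT fails" strategy lands above it. How far above depends on the assumed correlation between the two analyses.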

    • At the BMJ, Philip Sedgwick runs a weekly column under “Endgames” in which he discusses some common statistical issue regarding the interpretation of published studies. He presents a multiple choice question first, then explains the answer; each column refers to an actual study from the literature. These are short and sweet and often clarify ideas which can improve the reader’s ability to make good use of the literature.

      Regarding ITT, there is actually some good science to suggest that abstinence alone is effective in preventing teen pregnancy and STDs. You could set up a study which randomized high school students to either abstinence or usual adolescent behavior. The per protocol analysis would show that abstinence worked perfectly. However, in the real world, you could anticipate that there would be some crossovers from one intervention group to the other, and the ITT analysis would yield a more conservative, but more accurate estimate of the effectiveness of a program of going to high schools and teaching girls to simply say, “Not tonight, dear, I have a headache.”

      There have been acrimonious debates about precisely this issue, with the moralists focusing on the per protocol analysis and the public health community focusing on the ITT analysis, so to speak. In order to illustrate the point, a thought experiment is sufficient; you need not set up the actual RCT to appreciate the difference between the two analyses.

    • It seems like this has broader applicability than just medical testing. Besides abstinence, the theory and practice of, there are plenty of recommendations and rules for road safety, and anyone residing in the Boston area can observe our gap between theory and practice.

      For an example of an actual experiment, extra mirrors were added to heavy goods vehicles in parts of Europe to improve visibility of cyclists and pedestrians when turning, yet the theorized improvements to safety did not materialize (OECD, “Cycling, Health and Safety”, http://dx.doi.org/10.1787/9789282105955-en ). Apparently if you don’t look, you don’t see. Who knew?