II-Smoking Cessation: Selection Effects

This is the third in a series of posts looking more closely at the methods used to quantify the benefits of smoking cessation using this paper as an example; earlier posts:

An RCT of smoking cessation couldn’t and wouldn’t be done, so some sort of observational data must be used to quantify its benefits. Choosing the Cancer Prevention Study II (CPS-II) brought both benefits and costs:

  • Biggest benefit: large number of person-years of follow-up (7.2 million for females, 4.3 million for males)
  • Biggest cost: selection effects that its use introduced

By selection, I mean the participants in CPS-II differed from the overall population:

  • They were whiter (93% of CPS-II v. 80% in the 1990 Census).
  • They had higher levels of education (30% with a college degree in CPS-II v. 9% of U.S. adults).

What does this mean for our estimates of the relative risk of death by smoking status? The direction of any bias due to selection is ambiguous, unlike measurement error in smoking status, which produces a predictably conservative bias toward finding no effect. Using racial differences as an example, two stories seem plausible. First, whites could receive more benefit from cessation because they have a longer background life expectancy than minorities. On the other hand, non-whites could receive more benefit from cessation at later ages because of healthy survivor effects. I am unsure in which direction the net effect of this sort of selection bias would go.
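To make the ambiguity concrete, here is a minimal Python sketch of direct standardization. The race shares (93% v. 80% white) come from the figures above; every death rate is a made-up number chosen only to illustrate the two stories, not a CPS-II estimate, and the function and variable names are my own. Under scenario A (whites benefit more) the CPS-II mix makes cessation look more beneficial than it would in a Census-like population; under scenario B (non-whites benefit more) the sign of the bias flips.

```python
# Hypothetical death rates per 1,000 person-years, by race and smoking status.
# These are invented for illustration; they are NOT estimates from CPS-II.
SCENARIO_A = {  # whites get the larger relative benefit from quitting
    "white": {"current": 20.0, "former": 12.0},
    "non_white": {"current": 24.0, "former": 18.0},
}
SCENARIO_B = {  # non-whites get the larger relative benefit from quitting
    "white": {"current": 20.0, "former": 15.0},
    "non_white": {"current": 24.0, "former": 14.4},
}

# Population mixes quoted in the post.
CPS_II_WEIGHTS = {"white": 0.93, "non_white": 0.07}
CENSUS_WEIGHTS = {"white": 0.80, "non_white": 0.20}


def standardized_rate(rates, weights, status):
    """Directly standardized rate: weighted average of the stratum rates."""
    return sum(weights[group] * rates[group][status] for group in weights)


def former_vs_current_ratio(rates, weights):
    """Rate ratio, former v. current smokers, for a given population mix."""
    former = standardized_rate(rates, weights, "former")
    current = standardized_rate(rates, weights, "current")
    return former / current


if __name__ == "__main__":
    for label, rates in (("A", SCENARIO_A), ("B", SCENARIO_B)):
        cps = former_vs_current_ratio(rates, CPS_II_WEIGHTS)
        census = former_vs_current_ratio(rates, CENSUS_WEIGHTS)
        print(f"Scenario {label}: CPS-II mix {cps:.3f}, Census mix {census:.3f}")
```

A ratio further below 1 means a larger apparent benefit from quitting, so the comparison of the two columns shows which way the CPS-II mix would mislead in each scenario.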

In the end, 11.5 million person-years of follow-up covers a multitude of sins, and we were able to control for race (white v. non-white) and education when estimating the relative risks of mortality that underlie our models. However, even though we had large enough cell counts to control for these variables, the worry remains that the population to which we apply the relative risk estimates differs from the CPS-II population used to obtain them.

Other available cohort studies were even whiter and based on one geographic locale (Framingham), or were based on the experience of General Practitioners in the U.K. (British Doctors Study). It would be great to have a long-term follow-up database that is representative of the overall population in terms of race and education to update this study, particularly given the increase in the Hispanic population of the U.S. The Health and Retirement Study, which is approaching 20 years of follow-up, could be an option for updating; the tradeoff would be smaller cell sizes in exchange for a more representative sample. Are there other databases that should be considered?
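The cell-size side of that tradeoff can be sketched with the usual large-sample approximation for a rate ratio, where the standard error of the log ratio is roughly sqrt(1/d1 + 1/d2) for death counts d1 and d2 in the two groups being compared. The death counts below are hypothetical placeholders of my own, not figures from CPS-II or the Health and Retirement Study; the point is only that a cell with 100 times fewer deaths has a standard error about 10 times larger.

```python
import math


def log_rate_ratio_se(deaths_a, deaths_b):
    """Approximate standard error of log(rate ratio) from two death counts."""
    return math.sqrt(1.0 / deaths_a + 1.0 / deaths_b)


def rate_ratio_ci(ratio, deaths_a, deaths_b, z=1.96):
    """Approximate 95% confidence interval for a rate ratio."""
    se = log_rate_ratio_se(deaths_a, deaths_b)
    return ratio * math.exp(-z * se), ratio * math.exp(z * se)


# Hypothetical death counts for one race-by-education cell (not real data).
LARGE_COHORT_CELL = (4000, 6000)  # former smokers, current smokers
SMALL_COHORT_CELL = (40, 60)      # same cell in a much smaller cohort

if __name__ == "__main__":
    for label, cell in (("large", LARGE_COHORT_CELL), ("small", SMALL_COHORT_CELL)):
        low, high = rate_ratio_ci(0.6, *cell)  # 0.6 is an assumed point estimate
        print(f"{label} cohort cell: 95% CI roughly {low:.2f} to {high:.2f}")
```

A more representative cohort only helps if the cells that matter, such as Hispanic former smokers at a given age, still contain enough deaths to estimate a ratio with useful precision.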



