The Cookie Crumbles: A Retracted Study Points to a Larger Truth

The following originally appeared on The Upshot (copyright 2017, The New York Times Company). 

Changing your diet is hard. So is helping our children make healthy choices. When a solution comes along that seems simple and gets everyone to eat better, we all want to believe it works.

That’s one reason a study by Cornell researchers got a lot of attention in 2012. It reported that you could induce more 8-to-11-year-olds to choose an apple over cookies if you just put a sticker of a popular character on it. That and similar work helped burnish the career of the lead author, Brian Wansink, director of the Cornell Food and Brand Lab.

Unfortunately, we now know the 2012 study actually underscores a maxim: Nutrition research is rarely simple.

Last week the study, which was published in a prestigious medical journal, JAMA Pediatrics, was formally retracted, and doubts have been cast on other papers involving Mr. Wansink.

When first published, the study seemed like an enticing example of behavioral economics, nudging children to make better choices.

Before the study period, about 20 percent of the children chose an apple, and 80 percent the cookie. But when researchers put an Elmo sticker on the apple, more than a third chose it. That’s a significant result, and from a cheap, easily replicated intervention.

While the intervention seems simple, any study like this is anything but. For many reasons, doing research in nutrition is very, very hard.

First, the researchers have to fund their work, which can take years. Then the work has to be vetted and approved by an Institutional Review Board, which safeguards subjects from potential harm. I.R.B.s are especially vigilant when studies involve children, a vulnerable group. Even if the research is of minimal risk, this process can take months.

Then there’s getting permission from schools to do the work. As you can imagine, many are resistant to allowing research on their premises. Often, protocols and rules require getting permission from parents to allow their children to be part of studies. If parents (understandably) refuse, figuring out how to do the work without involving some children can be tricky.

Finally, many methodological decisions come into play. Let’s imagine that we want to do a simple test of cookies versus apples, plus or minus stickers — as this study did. It’s possible that children eat different things on different days, so we need to make sure that we test them on multiple days of the week. It’s possible that they might change their behavior once, but then go back to their old ways, so we need to test responses over time.

It’s possible that handing out the cookie or apple personally might change behavior more than just leaving the choices out for display. If that’s the case, we need to stay hidden and observe unobtrusively. This matters because in the real world it’s probably not feasible to have someone handing out these foods in schools, and we need the methods to mirror what will most likely happen later. It’s also possible that the choices might differ based on whether children can take both the apple and the cookie (in which case they could get the sticker and the treat) or whether they had to choose one.

I point out all these things to reinforce that this type of research isn’t as simple as many might initially think. Without addressing these questions, and more, the work may be flawed or not easily generalized.

These difficulties are some of the reasons so much research on food and nutrition is done with animals, like mice. We don’t need to worry as much about I.R.B.s or getting a school on board. We don’t have to worry about mice noticing who’s recording data. And we can control what they’re offered to eat, every meal of every day. But the same things that make animal studies so much easier to perform also make them much less meaningful. Human eating and nutrition are typically more complex than anything a mouse would encounter.

Overcoming these problems and producing spectacular results in preteens are some of the reasons this study on cookies and apples, and others like it, are so compelling. The authors have transformed this work into popular appearances, books and publicity for the Food and Brand Lab.

But cracks began to appear in Mr. Wansink’s and the Food and Brand Lab’s work not long ago, when other researchers noted discrepancies in some of his studies. The numbers didn’t add up, and odd things appeared in the data of several papers, including the study on apples and cookies. The issues were significant enough that JAMA Pediatrics retracted the original article, and the researchers posted a replacement.

The problems didn’t end there. As Stephanie Lee at BuzzFeed recently reported, it appears that the study wasn’t conducted on 8-to-11-year-olds as published. It was done on 3-to-5-year-olds.

Just as mice can’t be easily extrapolated to humans, research done on 3-to-5-year-olds doesn’t necessarily generalize to 8-to-11-year-olds. Putting an Elmo sticker on an apple for a small child might matter, but that doesn’t mean it will for a fifth grader. On Friday, the study was fully retracted.

Making things worse, this may have happened in other publications. Ms. Lee has also reported on a study published in Preventive Medicine in 2012 that claimed that children are more likely to eat vegetables if you give them a “cool” name, like “X-ray Vision Carrots.” That study, too, may be retracted or corrected, along with a host of others.

As a researcher, and one who works with children, I find it hard to understand how you could do a study of 3-to-5-year-olds, analyze the data, write it up and then somehow forget and imagine it happened with 8-to-11-year-olds. The grant application would have required detail on the study subjects, as well as justification for the age ranges. The I.R.B. would require researchers to be specific about the ages of the children studied.

I reached out to the authors of the study to ask how this could have happened, and Mr. Wansink replied: “The explanation for mislabeling of the age groups in the study is both simple and embarrassing. I was not present for the 2008 data collection, and when I later wrote the paper I wrongly inferred that these children must have been the typical age range of elementary students we usually study. Instead, I discovered that while the data was indeed collected in elementary schools, it was actually collected at Head Start day cares that happened to meet in those elementary schools.”

This is a level of disconnect that many scientists would find inconceivable, and I do not mean to suggest that this is the norm for nutrition research. It does, however, illustrate how an inattention to detail can derail what might be promising work. The difficulties of research in this area are already significant. Distrust makes things worse. The social sciences are already suffering from a replication problem; when work that makes a big splash fails to hold up, it hurts science in general.

We want to believe there are easy fixes to the obesity epidemic and nutrition in general. We want to believe there are simple actions we can take, like putting labels on menus, or stickers on food, or jazzing up the names of vegetables. Sadly, all of that may not work, regardless of what advocates say. When nutrition solutions sound too good to be true, there’s a good chance they are.
