It seems like every day you hear about studies “proving” that one food causes cancer, another causes heart disease, and a magical supplement will help you live forever. The media loves to report sensationalized stories about the dangers and benefits of everything from coffee to red meat to multivitamins. And you can’t blame them! These are the types of stories that get the eyeballs and the Facebook “likes.” Unfortunately, these are also the types of stories that are easily misrepresented and misinterpreted. But how can someone without a background in statistics possibly figure out what’s real and what isn’t?
By the end of this article, you should have a basic understanding of how to interpret mainstream fitness research. What are the main types of studies conducted? What are the advantages and disadvantages of these studies? Which statistics are useful to look at and which statistics are there to mislead you?
With this knowledge and a little practice, you’ll have the ability to interpret research for yourself and come to your own conclusions. It’s time to take back the power and make your own fitness decisions!
Why You Can’t Trust Observational Studies
In fitness research, epidemiology is the study of populations to determine the effects of different foods, drinks, supplements, and lifestyles on health. Also called observational studies, these are the types of studies you usually hear about in the news. There are several different types of observational studies, but we’ll focus on ones called cohort studies.
Cohort studies look at groups of people, called populations, over a period of time. By tracking these populations (sometimes numbering in the hundreds of thousands!), researchers try to determine the way certain factors (like food) affect certain outcomes (like cancer).
There are two types of cohort studies: prospective and retrospective. Prospective studies are carried out from the present into the future: subjects are watched over a period of time while data is collected. Retrospective studies look at historical data from a time in the past up until the present. Prospective studies are seen as the more valuable of the two.
Unfortunately, observational studies are only able to measure correlation, not causation. One of the basic tenets of statistics is that “correlation does not imply causation.” This is a fancy way of saying that even though two things happen at the same time (correlation) it doesn’t mean that one thing caused the other thing to happen (causation). For example: Are some people healthy because they take multivitamins or are healthy people the ones who already tend to take multivitamins? Think about that for a second… (Surveys of supplement takers suggest that the latter case is true.)
By their very design, observational studies simply can’t provide evidence to determine cause and effect. This means that you absolutely should not use these studies to make any changes to your lifestyle and behavior – that’s what randomized controlled trials are for.
Randomized Controlled Trials Are The Gold Standard
The randomized controlled trial (RCT) is the cornerstone of evidence-based fitness. RCTs are used to test the effectiveness of a treatment or intervention within a population. Unlike observational studies, which can only find correlations, these trials are designed to find causes. If you want to make an educated lifestyle change, RCTs are the way to go. They are known as the “gold standard” of research.
Randomized controlled trials begin with a hypothesis and a bunch of people (or sometimes mice). The subjects are randomly separated into groups (hence randomized). Typically, one group is given a treatment like a supplement, a type of diet, or an exercise protocol, while another group serves as a “control.” Control groups don’t receive any real treatment – they’re used as an objective comparison to see whether the treatment actually had an effect on the group being tested.
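To make the “randomized” part concrete, here is a minimal Python sketch of random assignment. The subject names, the group size of 20, and the seed are made up for illustration; real trials often use more elaborate schemes than a simple shuffle.

```python
import random

def randomize(subjects, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)   # seeded only so the example is repeatable
    shuffled = list(subjects)   # copy, so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# 20 hypothetical participants (names are placeholders, not real data)
subjects = [f"subject_{i}" for i in range(1, 21)]
treatment, control = randomize(subjects, seed=42)

print(len(treatment), "people get the treatment")
print(len(control), "people serve as controls")
```

Because chance alone decides who lands in which group, the two groups should look alike on average, which is what lets the control group act as a fair comparison.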
The two most common types of RCTs are parallel studies and crossover studies. In a parallel study, one group of people is exposed to a treatment while another group of people usually receives a placebo. In a crossover study, each participant is given both the treatment as well as the placebo during different time periods. Crossover studies are powerful tools because they allow people to act as their own controls.
A word of warning: Industry-funded research is notorious for providing really positive and impressive results. Studies done by drug companies and supplement manufacturers can be cleverly designed to elicit favorable results while unsuccessful trials often go unpublished. Take industry-funded results with a grain of salt and always look for independent research for verification. Always check the “Conflicts of Interest” section first before you jump into the details of the paper.
Another word of warning: Mice are not little men. Studies done on mice often show really promising results, but more often than not, the treatments don’t show the same impressive results in people. You should also take rodent studies with a grain of salt and always look for human research for verification.
What Is The Significance of “Statistical Significance”?
One of the most common phrases you come across when looking at research is statistical significance. Statistical significance shows up in both observational studies and RCTs. It’s a tricky number to calculate unless you’ve taken a few statistics courses, so here’s the English translation: if results are statistically significant, you can be pretty sure that those results weren’t due to dumb luck. Something SUPER important to keep in mind: just because something is statistically significant, it doesn’t mean that it is clinically significant.
Statistical significance doesn’t say anything about the magnitude of the results. Clinical (also known as practical) significance, on the other hand, tells you if the magnitude of the results is big enough to make the treatment worthwhile. There’s no set level for clinical significance; it’s really just a judgment call based on experience.
Say you’re testing a new weight loss pill. You give the pill to 1,000 obese people and all 1,000 lose 2 lbs. each. These subjects began the study at a hefty 300 lbs. and ended the study at a “trim” 298 lbs. You can be pretty sure that these results are statistically significant: a change that consistent across 1,000 people almost definitely wasn’t dumb luck. However, are these results clinically significant? Probably not. Even though a study shows statistically significant results, it doesn’t mean a thing if the treatment isn’t clinically significant. If the pill helped each person to lose 100 lbs., that would be clinically significant.
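The weight loss pill example can be roughed out in a few lines of Python. This is only a sketch: it uses a simple one-sample t statistic against “no change at all,” and the 5 lb. spread in individual results is an assumption added for illustration (in the example above everyone loses exactly 2 lbs., which leaves no spread to calculate with).

```python
import math

def t_statistic(mean_change, sd_change, n):
    """One-sample t statistic for an average change versus zero change."""
    standard_error = sd_change / math.sqrt(n)
    return mean_change / standard_error

n = 1000                   # subjects, from the example
mean_change_lbs = -2.0     # average weight change, from the example
sd_change_lbs = 5.0        # ASSUMPTION: spread of individual results
start_weight_lbs = 300.0   # starting weight, from the example

t = t_statistic(mean_change_lbs, sd_change_lbs, n)
percent_of_body_weight = abs(mean_change_lbs) / start_weight_lbs * 100

print(f"t statistic: {t:.1f}")
print(f"weight lost: {percent_of_body_weight:.2f}% of starting weight")
```

Under these assumptions the t statistic comes out around -12.6, far past the rough cutoff of about 2 that researchers commonly use, so the result would count as statistically significant. The second number is the one that matters to you: less than 1% of starting body weight.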
The #1 Misleading Statistic In Fitness & Nutrition Research
By far, the single most misleading research statistic presented by the media is relative risk. This statistic is used both in observational studies and RCTs. It is a measure of how harmful or helpful something might be.
When you hear in the news that eating red meat causes a 50% increase in heart attack risk, what are you supposed to do? You’re obviously supposed to throw away all of the meat in your fridge, become a strict vegetarian, and be happy that you figured out how to live forever, right?[1]
The 50% figure that’s quoted in the news is almost always a number called relative risk. Relative risk is the rate of some outcome in the intervention group relative to the rate of that outcome in a different group. Take the red meat example. Researchers want to see if people who eat meat have more fatal heart attacks than people who don’t eat meat, so they set up a huge observational study involving 200,000 meat eaters and non-meat eaters over the course of 10 years. They find that 6 out of 1,000 people (0.6%) in the meat group died from a heart attack while 4 out of 1,000 people (0.4%) in the no-meat group died from a heart attack. 6 deaths is a 50% increase over 4 deaths. The next morning’s headlines read “Red Meat Kills! Study Finds 50% Increase in Heart Attack Deaths Caused by Red Meat!” However, by now you should know that observational studies can’t actually tell you if one thing “caused” another thing.
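Here is the arithmetic behind that headline as a small Python sketch. The function and variable names are just for illustration; the numbers are the ones from the hypothetical red meat study.

```python
def relative_risk(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Rate of the outcome in one group divided by the rate in the other."""
    rate_exposed = events_exposed / n_exposed
    rate_unexposed = events_unexposed / n_unexposed
    return rate_exposed / rate_unexposed

# Hypothetical red meat study: 6 per 1,000 vs. 4 per 1,000 fatal heart attacks
rr = relative_risk(6, 1000, 4, 1000)
relative_increase_pct = (rr - 1) * 100

print(f"relative risk: {rr:.2f}")
print(f"relative increase: {relative_increase_pct:.0f}%")
```

Notice that the calculation only ever compares one rate to the other. It never tells you how small both rates are to begin with.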
Despite the normally useless nature of relative risk, there is one time when it’s interesting: when it’s REALLY, REALLY big. For example, one study showed the relative risk of getting lung cancer for male smokers versus non-smokers to be 2,300%. That means male smokers are 23 times more likely to get lung cancer. Once relative risk starts to exceed a few hundred percent, it might be worth looking into. Compared to a 23-fold risk, a 50% increase (a relative risk of 150% on the same scale) just isn’t very impressive.[2]
This brings us to a statistic that’s far more important than relative risk: absolute risk. The change in absolute risk is simply the difference in the rates of an outcome between the intervention group and another group. This metric is also used in both observational studies and RCTs. In the hypothetical red meat study, 0.6% of people who ate meat died from a heart attack, while 0.4% of people who didn’t eat meat died from a heart attack. This is an increase in absolute risk of 0.2 percentage points: 0.6% – 0.4% = 0.2%. In other words, that scary 50% increase works out to just 2 extra deaths for every 1,000 people! Does 0.2% seem clinically significant? Probably not. A good benchmark for clinical significance might be 5-10%. If 8% of the meat group died and only 0.4% in the no-meat group died, then you may have something to worry about!
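The same hypothetical study can be run through an absolute risk calculation. As before, this is a sketch with made-up function and variable names; the numbers come straight from the examples above.

```python
def absolute_risk_difference(events_exposed, n_exposed,
                             events_unexposed, n_unexposed):
    """Difference between the two groups' outcome rates (as a fraction)."""
    return events_exposed / n_exposed - events_unexposed / n_unexposed

# Hypothetical red meat study: 0.6% vs. 0.4%
ard = absolute_risk_difference(6, 1000, 4, 1000)
print(f"absolute difference: {ard * 100:.1f} percentage points")
print(f"extra deaths per 1,000 people: {ard * 1000:.0f}")

# The "something to worry about" scenario: 8% vs. 0.4%
ard_big = absolute_risk_difference(80, 1000, 4, 1000)
print(f"absolute difference: {ard_big * 100:.1f} percentage points")
```

The first case comes out to 0.2 percentage points, well under the 5-10% benchmark. The second comes out to 7.6, which falls inside it.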
Fitness & Nutrition Research Checklist
Understanding fitness research can be difficult. It takes a lot of practice to be able to sift through all the hype, smoke, and mirrors and drill down into what the studies actually show. Does the research demonstrate causation or just correlation? Is the source reliable? Are the results statistically and clinically significant? Does the study have real world implications?
Here are the most important things to keep in mind:
- Remember that correlation does not imply causation. Observational studies inherently aren’t able to measure a cause-and-effect relationship.
- Was the study industry sponsored? If so, take the results with a grain of salt.
- Relative risk is only worth noting if it’s really, really big.
- Is the magnitude of absolute risk clinically significant? If absolute risk is greater than 5-10%, it’s probably clinically significant.
- If the data actually looks interesting, look for a similar study performed as an RCT to verify the results.
Nothing is more rewarding than using your own brain to come to your own conclusions to make decisions which can change your own life. The next time someone says to you, “They say you shouldn’t drink more than 1 cup of coffee per week,” you can confidently reply, “Show me the research.”
Recommended Fitness & Nutrition Research Resources:
- Google Scholar
- Relative vs. Absolute Risk Reduction
- Number Needed to Treat
- Ben Goldacre: Battling Bad Science
[1] A note to all my vegetarian friends: This example isn’t meant to encourage or discourage any type of behavior! This is just a hypothetical example. “Meat” could be replaced with salt, saturated fat, coffee, wine, statins, hormone replacement therapy, or a million other things.
[2] In the 1960s, Sir Austin Bradford Hill, Professor Emeritus of Medical Statistics at the University of London, outlined a set of criteria which could be met in order to strengthen the argument for a cause-and-effect relationship detected by an observational study. For more information, see http://epiville.ccnmtl.columbia.edu/assets/pdfs/Hill_1965.pdf.