Episode: 2695
Title: HPR2695: Problems with Studies
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2695/hpr2695.mp3
Transcribed: 2025-10-19 07:37:16

---

This is HPR Episode 2695, entitled "Problems with Studies," and is part of the series Health and Healthcare. It is hosted by Ahuka, is about 13 minutes long, and carries a clean flag. The summary is: some principles for evaluating medical studies.

This episode of HPR is brought to you by AnHonestHost.com. Get a 15% discount on all shared hosting with the offer code HPR15. That's HPR15. Better web hosting that's honest and fair at AnHonestHost.com.

Hello, this is Ahuka, welcoming you to Hacker Public Radio and another exciting episode in our series about health and taking care of yourself. We've now moved into evaluating studies, and there are problems that need to be addressed when we take a look at them. We don't want to misconstrue what's going on.

As we said last time around, medical science has made great strides, and when we look at problems, we should not lose sight of all of the significant accomplishments that medical science has made. I for one have no wish to return to the days when the leading cause of death for women was childbirth and a man was considered old if he reached the age of 40. However, it is worth taking the time to develop a better idea of what constitutes quality in research and how results should be interpreted.

So what are some of the problems we run into? Well, the first one is that generally no one cares about negative results. Now, they should care; it is still important to know these things. But let's modify our previous example, the one about children eating breakfast. Let's consider whether it is a hot breakfast that improves student performance and raises test scores. Our control group would then be students who only got a cold breakfast, like, say, juice and cereal with milk.

We do a study, and it turns out there is no statistically significant evidence that hot breakfasts produce better performance than cold breakfasts. Well, with this in mind, we could then target our assistance to children a little better, maybe save some money.

But we run into a problem of bias in publishing, which is that journals tend to prefer to publish studies that have positive results. This is very pronounced in studies of the efficacy of drugs. Pharmaceutical companies in the United States have to do careful studies for any new drug to establish (a) the safety of the drug and (b) the efficacy of the drug, and I think there are similar rules in other countries.

So what happens if a company does extensive testing and cannot find proof that the drug does anything? The study goes onto a dusty shelf somewhere and is never heard from again. Now, this matters for a couple of reasons. The first is that data is data, and even negative results have utility. But even more important is that sometimes studies that by themselves do not show positive results can be combined with other studies, and a positive result can come out. This is because combining studies, something called meta-analysis, gives you a larger sample size, which can improve the power of the combined studies.
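
As a rough illustration of why pooling helps, here is a minimal sketch in Python of a simple fixed-effect (inverse-variance) meta-analysis. The effect sizes and standard errors are hypothetical, chosen only to show that two studies that are individually non-significant can combine into a significant pooled result.

    # Minimal sketch of a fixed-effect (inverse-variance) meta-analysis.
    # The effect estimates and standard errors below are hypothetical.
    import math

    # (effect estimate, standard error) for two small studies, each
    # individually non-significant (|z| < 1.96).
    studies = [(0.30, 0.18), (0.25, 0.16)]

    # Inverse-variance weights: more precise studies count for more.
    weights = [1 / se**2 for _, se in studies]
    pooled = sum(w * est for (est, _), w in zip(studies, weights)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))

    # Two-sided z-test of the pooled effect against zero.
    z = pooled / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability

    print(f"pooled effect = {pooled:.3f} +/- {pooled_se:.3f}")
    print(f"z = {z:.2f}, p = {p:.4f}")  # roughly z = 2.3, p = 0.02

Each study alone has a z of about 1.6 to 1.7 (p around 0.1), but the pooled estimate clears the conventional 0.05 threshold, which is exactly the extra power the larger combined sample buys you.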

Okay. The next problem, and this is serious, is that no one cares about replication. Well, there are a few people who do, but remember, in the last episode we said that about one out of 20 studies will reach the wrong conclusion even if no one did anything wrong. It's just, you know, randomness.
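
To see where that one-in-20 figure comes from, here is a small simulation sketch (mine, not from the episode): every "study" below compares two groups drawn from the same distribution, so the null hypothesis is true by construction and any significant result is a false positive.

    # Simulate the false positive rate at the usual 0.05 threshold.
    import random
    from statistics import mean, stdev

    random.seed(42)

    def welch_t(a, b):
        """Welch's t statistic for two independent samples."""
        va, vb = stdev(a) ** 2, stdev(b) ** 2
        return (mean(a) - mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)

    n_studies, false_positives = 2000, 0
    for _ in range(n_studies):
        group1 = [random.gauss(0, 1) for _ in range(30)]
        group2 = [random.gauss(0, 1) for _ in range(30)]
        # With 30 per group, |t| > 2.0 corresponds roughly to p < 0.05.
        if abs(welch_t(group1, group2)) > 2.0:
            false_positives += 1

    print(f"{false_positives / n_studies:.1%} of null studies came out 'significant'")
    # Expect roughly 5%, i.e. about one in 20.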

Now, the best defense against that is to have other scientists replicate the study and see if they get the same results. And this is a huge problem, because quality journals generally are not interested in publishing replication studies, and the tenure committees at universities therefore do not consider replication studies as valuable research.

Everyone wants something original. But many original studies have problems that could be addressed through replication studies, and when it's been studied, we have found that a large number of these medical studies cannot be replicated. This is particularly acute in social psychology. This has become known as the replication crisis. For an example, there was a 2018 paper in Nature Human Behaviour which looked at 21 studies and found that only 13 could be successfully replicated. That's slightly over half, which is not an outstanding record.

Now, there's a fellow who, if you're at all interested in this, is a name that you would want to follow: Professor John Ioannidis, that's I-O-A-N-N-I-D-I-S, a professor at Stanford University. He's been a leading voice on this replication problem. He was an advisor to the Reproducibility Project: Cancer Biology. This project looked at five influential studies in cancer research (influential meaning other people were basing their research off of them) and found that two were confirmed, two were uncertain, and one was out-and-out disproved.

Now, we want to be careful here. In the vast majority of cases this does not derive from fraud, although there are examples of fraud that we'll mention. It can derive from small differences in the research approach, and if we understand what those differences are, that might bring out areas for further research and can point you in the direction of important factors.

And if an influential study is just out-and-out wrong, it's going to lead to wasted time and resources on work that is based on it. Pharmaceutical companies have found, for instance, that they cannot replicate the basic research underlying some of their proposed drugs, which causes waste and diverts resources that could be used more productively.

Now, there's a lot of controversy over replication, and part of it is that there is sometimes an inference that if a scientist's results cannot be replicated, the scientist was at worst fraudulent or at best sloppy and lacking in care.

Now, there are examples of fraud. Andrew Wakefield published a study claiming to find a relationship between vaccination and autism that was just totally bogus, although people are still citing it. But I mean, this guy's license to practice medicine was taken away, the paper was retracted, and he's now, you know, making a living talking to fringe groups about vaccination, but no one in medical science is going to take him at all seriously. Another one, Hwang Woo-suk, was also guilty of fraud; his case had to do with stem cell research, where it turns out he had basically fabricated some results.

Now, sometimes a respected researcher who did not do anything wrong can somehow get tainted. An example of that is Yoshiki Sasai, who ended up committing suicide because someone else in his lab made some mistakes and he got tagged with all of that. So you can understand it is a little bit of a controversial area, but the basic point is that you cannot rely on any one study for anything. There is a replication problem here.

Now, this could be dealt with, okay? We could start allocating a percentage of the research budget to replication studies, and we could start rewarding people who do good work in this area, but it's going to take some work to change that.

Now, another thing that is a problem is something referred to as p-hacking. Well, there's an interesting buzzword, isn't it? Now, as we discussed in our previous episode, the statistical significance of a result is based on the p-value, which is the probability that a result at least this strong would arise from random chance alone, if there were no real relationship. And this is important because people are not, in general, good at dealing with probability and statistics, and a formal test can provide a valuable check on confirmation bias. Confirmation bias is when you see what you want to see. If you want to know confirmation bias, you know, think of how many times you look in the mirror and say, oh, I'm not that much overweight.
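
To make the p-value idea concrete, here is a tiny worked example (my numbers, not from the episode): if a coin comes up heads 60 times out of 100 flips, how surprising is that for a fair coin?

    # Exact two-sided p-value for 60 heads in 100 flips of a fair coin.
    from math import comb

    n, k = 100, 60

    # P(at least 60 heads) under the fair-coin (null) hypothesis...
    p_upper = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    # ...doubled, because 40 or fewer heads would be equally extreme
    # in the other direction.
    p_two_sided = 2 * p_upper

    print(f"two-sided p = {p_two_sided:.3f}")  # about 0.057

So 60 heads in 100 flips just misses the conventional 0.05 cutoff: a fair coin would produce a result that lopsided, in one direction or the other, almost 6% of the time.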

Now, we've noted how this should happen. A good researcher should come up with an hypothesis, then collect the data, then do a test to see if there is a statistically significant result. That is the ideal. But as we saw above, if you find nothing, you have a problem. Studies that fail to find anything do not generally get published. Researchers who do not get published do not get tenure, and will probably have trouble getting grants to do any more research. So there is tremendous pressure to get a result that is significant and can be published. Your whole career can rest on it.

Now, one result of that pressure is a practice known as p-hacking, which in general means looking for ways to get a significant result that is publishable, by any means necessary. I've got a few links in the show notes if you want to follow up on this one, including a nice video that gets into this. Now, colloquially, p-hacking has been referred to as torturing the data until it confesses.

One variation is to skip the initial hypothesis and just go looking for any relationship you can find in the data. You might think that no harm is done, since if the researcher finds a relationship, that's a good thing. But remember, the idea is to find true relationships. If you grab enough variables and run them against each other, you will definitely find correlations, but they will not generally be anything other than coincidence. The whole point of stating an hypothesis in advance of collecting the data is that your hypothesis should be grounded in a legitimate theory of what is going on. For example, someone once did a study showing that you could predict the winner of the US presidential election by which team won the previous Super Bowl. Now, there's no way that this correlation means anything other than random chance, helped by a small sample size.
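
Here is a small sketch of that variable-grabbing problem (my illustration, not from the episode): generate 20 variables of pure noise, test all 190 pairs, and count how many "significant" correlations turn up by coincidence.

    # Dredging pure noise for correlations: at p < 0.05 we expect
    # about 190 * 0.05 = 9.5 spurious hits.
    import itertools
    import random
    from statistics import mean

    random.seed(7)

    n_obs, n_vars = 50, 20
    data = [[random.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_vars)]

    def pearson_r(x, y):
        """Pearson correlation coefficient of two equal-length samples."""
        mx, my = mean(x), mean(y)
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
        vx = sum((a - mx) ** 2 for a in x)
        vy = sum((b - my) ** 2 for b in y)
        return cov / (vx * vy) ** 0.5

    # With 50 observations, |r| > 0.28 corresponds roughly to p < 0.05.
    hits = [pair for pair in itertools.combinations(range(n_vars), 2)
            if abs(pearson_r(data[pair[0]], data[pair[1]])) > 0.28]

    print(f"{len(hits)} 'significant' correlations among 190 pure-noise pairs")

None of those hits reflect a real relationship; they are exactly the kind of coincidence that stating an hypothesis in advance is meant to guard against.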

Another approach is to drop some parameters of the original study to get a better p-value, or you might find an excuse to eliminate some of the data until you get a result. But if the p-value is the gate you have to get through to get published and have a career, some people will look for a reason to go through that gate.

So the point of this analysis is that you have to be careful about uncritically accepting any result you come across. Reports on television news or in newspapers deserve skepticism. I never accept any one study as definitive. You need to see repeated studies that validate the result before you can start to accept it. Now, of course, most civilians are going to have trouble doing all of that, which sets us up for the next article, where we look at how healthcare providers can help with this. And above all else, if you want one rule to live by: if Gwyneth Paltrow suggests something, do the opposite. She may be the first recorded case of an IQ measured in negative numbers.

And with that, this is Ahuka for Hacker Public Radio, signing off, and as always, reminding you to support Free Software. Bye-bye.

You've been listening to Hacker Public Radio at HackerPublicRadio.org. We are a community podcast network that releases shows every weekday, Monday through Friday. Today's show, like all our shows, was contributed by an HPR listener like yourself. If you ever thought of recording a podcast, then click on our Contributing link to find out how easy it really is. Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club, and is part of the Binary Revolution at BinRev.com. If you have comments on today's show, please email the host directly, leave a comment on the website, or record a follow-up episode yourself. Unless otherwise stated, today's show is released under a Creative Commons Attribution-ShareAlike 3.0 license.