How a diverse dataset can help close the racial pain gap
Researchers trained an algorithm to predict knee pain better than the decades-old standard
By Melanie Ehrenkranz
Emma Pierson, a senior researcher at Microsoft Research New England, said that her medical collaborator on a recent study shared an accurate, if not very reassuring, fact about pain: “We don’t understand it very well.”
Pierson is a computer scientist who develops machine learning methods to address inequality in healthcare.
A research paper she published in January with other researchers explores pain disparities in underserved populations, specifically looking at osteoarthritis of the knee and how it disproportionately affects people of color. The researchers found that their algorithm detected signs of pain that doctors and existing systems had missed.
The motivation behind the study was this mysterious pain gap
Pierson said the basic idea was to train a machine learning algorithm to find additional signals in the knee X-ray that aren’t captured by standard risk scores and medical assessments, and to see whether this algorithmic approach could narrow the racial pain gap for knee osteoarthritis and, eventually, for other medical problems.
The study points out that the correlation between
radiographic measures (using X-rays)
and pain is contested—people whose X-rays don’t show
severe disease might experience
severe pain, and vice versa. The current standard system
for measuring osteoarthritis
is the Kellgren-Lawrence grade (KLG), which was
developed more than 50 years ago
in
white populations. “It’s plausible they
are not capturing factors relevant to pain
in more diverse populations living and working very
differently,” Pierson said.
“You take a score that's 60 years old, yeah it might not
capture the full story.”
“The current standard system... was developed more than
50 years ago
in white populations.”
And the KLG system is just one example of a diagnostic test that fails patients of color. For instance, there’s a kidney test that automatically adjusts scores for Black patients based on a discredited scientific theory about race and genetic difference. Because of this unfounded adjustment, Black patients have been more likely to miss out on vital treatments. The practice remains widespread.
Machine learning and healthcare go pretty far back
The concept of machine learning in the diagnostic process is hardly new: a study describing “an algorithm to assist in the selection of the most probable diagnosis of a given patient” was indexed by the National Library of Medicine in 1986. But neglecting to reexamine decades-old medical systems, especially ones built on discredited, racially biased theories or on foundations that exclude nonwhite communities altogether, perpetuates healthcare inequality and harms vulnerable communities.
“On the one hand, having an algorithm is sort of like the illusion of objectivity in science,” Dr. Ezemenari M. Obasi, director of the HEALTH Research Institute at the University of Houston, who studies health disparities, told Mashable, citing the importance of checks and balances in assessing how a diagnostic algorithm might disproportionately harm or benefit certain groups. “Otherwise, you’re creating a scientific way of justifying the unequal distribution of resources.”
An outdated algorithm lacking meaningful checks and balances is just one possible reason the radiographic measure fails to identify pain in people of color yet continues to be widely used. Racial bias in how clinicians assess pain in patients of color is another factor.
And so the researchers trained a convolutional neural network to predict knee pain scores using a diverse dataset from an NIH-funded study. The dataset comprised 4,172 patients in the United States who had, or were at high risk of developing, knee osteoarthritis. The algorithm’s predictions explained more of the variance in pain than the KLG did, showing that the X-rays carried signals of pain that the current system doesn’t detect. The researchers attribute this success to the diverse dataset.
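To make the setup concrete, here is a minimal sketch of this kind of model, not the paper’s actual architecture or training pipeline: a standard convolutional backbone regressing a patient-reported pain score from an X-ray. The choice of ResNet-18, the learning rate, and the tensor shapes are all illustrative assumptions, and the code assumes a recent PyTorch and torchvision install.

```python
# A minimal sketch, not the study's actual pipeline: regress a
# patient-reported pain score directly from a knee X-ray.
import torch
import torch.nn as nn
from torchvision import models

# Start from a standard convolutional backbone and replace the
# classification head with a single regression output: the model
# predicts one pain score per X-ray.
model = models.resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, 1)

criterion = nn.MSELoss()  # regression on a continuous pain score
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(xray_batch, pain_batch):
    """One gradient step. xray_batch: (N, 3, H, W) image tensor;
    pain_batch: (N,) tensor of patient-reported pain scores."""
    optimizer.zero_grad()
    pred = model(xray_batch).squeeze(1)
    loss = criterion(pred, pain_batch)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The key design point is the training target: the label is the patient’s own pain report, not a doctor’s severity grade, so the network is free to find whatever image features actually track pain.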
The researchers also retrained the neural network on a non-diverse training set. Both versions were better at detecting pain in X-rays than the KLG, but the model trained on the diverse dataset was better at predicting pain and narrowing the racial and socioeconomic pain gap.
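Here is a hedged sketch of how such a comparison could be run: measure how much variance in pain each severity score explains, and how large a Black-white pain gap remains after adjusting for it. The function and variable names below are hypothetical stand-ins, not the study’s code or data.

```python
# A sketch of the comparison, assuming NumPy and scikit-learn.
# severity: per-patient severity measure (KLG or model score);
# pain: patient-reported pain; is_black: boolean race indicator.
import numpy as np
from sklearn.linear_model import LinearRegression

def variance_explained(severity, pain):
    """R^2 of pain regressed on a single severity measure."""
    X = severity.reshape(-1, 1)
    return LinearRegression().fit(X, pain).score(X, pain)

def residual_pain_gap(severity, pain, is_black):
    """Mean Black-white pain difference remaining after
    adjusting for the severity measure."""
    X = severity.reshape(-1, 1)
    residual = pain - LinearRegression().fit(X, pain).predict(X)
    return residual[is_black].mean() - residual[~is_black].mean()

# A measure that captures more of the pain signal shows a higher
# variance_explained() and leaves a smaller residual_pain_gap().
```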
Pierson said that she didn’t go in assuming they would
find signals not being captured by the existing,
conventional scores, but there were reasons to believe
it wasn’t impossible, but looking at the risk
versus return, the impact was high if their algorithmic
approach did find undetected signals.
“Machine learning models trained on
diverse datasets were
better at predicting pain and narrowing the racial and
socioeconomic pain gap.”
“It’s quite clear empirically that diversity of training
set is important,” she said,
adding that, when it comes to medicine in the broader
context, “you shouldn’t throw all
the women out of the study or only do your analysis on
white European ancestry.”
Using machine learning to reduce (rather than perpetuate) bias in healthcare settings
Pierson’s research illustrated that patients of color in pain have been disproportionately misdiagnosed by a system designed around white populations, which ultimately impacts their treatment options. Using the algorithmic predictor trained on the diverse dataset, more Black patients would be eligible for knee surgery. The neural network also found that those with the most severe pain were the most likely to be taking painkillers, like opioids. The researchers note that surgical intervention could help lower opioid use among certain racial and socioeconomic populations, since it would address the underlying pain.
“Using the algorithmic predictor that was trained on the diverse dataset, more Black patients would be eligible for knee surgery.”
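As a toy illustration of that eligibility point, consider selecting the same number of surgical candidates under each severity measure and comparing who qualifies. Everything below is hypothetical; the arrays are made-up stand-ins, not the study’s data or allocation method.

```python
# A hypothetical illustration of severity-based eligibility.
import numpy as np

def eligible_share(severity, is_black, n_slots):
    """Fraction of the n_slots highest-severity patients
    who are Black under a given severity measure."""
    top = np.argsort(severity)[-n_slots:]
    return is_black[top].mean()

# If the algorithmic score captures pain that the KLG misses in
# Black patients, then for the same number of surgical slots
# eligible_share(model_score, is_black, n) will exceed
# eligible_share(klg, is_black, n).
```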
The researchers don’t see an algorithmic approach as a replacement for humans. Instead, it can be used as a decision aid: rather than a human or an algorithm alone making the final call, the radiologist can look at both the X-ray and the algorithm’s results to see if they might have missed something.
Pierson also said that their findings show “the potential for more equitably allocating surgery using these algorithmic severity scores.” While not yet flawless, these approaches show promise in closing racial disparities in pain assessment and treatment options.
“I think there are follow-ups along the directions of surgery allocations and decision aids,” Pierson said. “Those are not hypothetical things, those are things I’m actively interested in.”
An unconventional approach to pain or the new standard?
This approach is unconventional in that the model wasn’t trained to do what the doctor does. It was trained to see what doctors and existing systems are missing. Rather than learn from the doctor, the algorithm was learning from the patient. When clinical knowledge is incomplete or inaccurate, you can go beyond the systems in play and learn from the patient directly.
“Rather than learn from the doctor, the algorithm was
learning from the patient.”
Another important takeaway from this research is that algorithms can be used for pure knowledge discovery: by training an algorithm to read thousands of X-rays, the researchers were able to link features of the images to pain, detections that radiologists had missed. Because of the black-box nature of such algorithms, it’s unclear exactly what the model is “seeing.” Still, the idea can be applied to other medical practices with archaic foundations that may not capture the lived experiences of a diverse patient population.
Pierson said the study was predicated on the existence of this diverse, publicly available dataset with suitable privacy protections. “Without that data collection, the study wouldn’t have been possible,” she said. Part of the onus ultimately falls on data-collection efforts that are inclusive and ethical. This type of study proves those efforts are not in vain: they can quite literally ease the pain.
Melanie Ehrenkranz
is a writer with a focus on tech, culture, power, and the
environment.
She has been featured in Gizmodo, Vice’s Motherboard,
Medium’s OneZero,
National Geographic, and more.