How a diverse dataset
can help close the racial pain gap
Researchers trained an algorithm to predict knee pain better than the decades-old standard
Emma Pierson, a senior researcher at Microsoft Research New England, said that her medical
collaborator on a recent study shared an accurate, though not very reassuring, fact about pain.
“We don’t understand it very well.”
Pierson is a computer scientist who develops machine learning methods to address inequality in healthcare.
A research paper
she published in January alongside other researchers explores pain disparities
in underserved populations, specifically looking at osteoarthritis in the knee and how it
disproportionately affects people of color. The researchers found that their algorithm detected signs
of pain that doctors and existing measures had missed.
The motivation behind the study was this mysterious pain gap
Pierson said that the basic idea was to train a machine learning algorithm to find any
additional signals in the knee X-ray that aren’t captured by standard risk scores
and medical assessments, and to see whether this algorithmic approach could narrow the racial
pain gap for knee osteoarthritis and, subsequently, for other medical problems.
The study points out that the correlation between radiographic measures (using X-rays)
and pain is contested—people whose X-rays don’t show severe disease might experience
severe pain, and vice versa. The current standard system for measuring osteoarthritis
is the Kellgren-Lawrence grade (KLG), which was developed more than 50 years ago in
white populations. “It’s plausible they are not capturing factors relevant to pain
in more diverse populations living and working very differently,” Pierson said.
“You take a score that's 60 years old, yeah it might not capture the full story.”
“The current standard system... was developed more than 50 years ago in white populations.”
And the KLG system is just one example of a diagnostic test that fails patients of color.
For instance, there’s a kidney test that automatically adjusts scores for Black patients
based on a discredited scientific theory on race and genetic differences. Because of this
unfounded basis for adjusting diagnostic algorithms, nonwhite patients were more likely
to miss out on vital treatments. This system is still in widespread use.
Machine learning and healthcare go pretty far back
The concept of machine learning in the diagnostic process is hardly new—a study of
“an algorithm to assist in the selection of the most probable diagnosis of a given patient”
was published in the National Library of Medicine in 1986. But neglecting to reexamine
decades-old medical systems—especially ones that were built on the foundation of discredited
racially-biased theories or on a foundation that fails to include nonwhite communities at
all—is perpetuating healthcare inequality and harming vulnerable communities.
“On the one hand, having an algorithm is sort of like the illusion of objectivity in science,”
Dr. Ezemenari M. Obasi, director of the HEALTH Research Institute at the University of Houston
who studies health disparities, told Mashable, citing the importance of checks and balances when
it comes to how a diagnostic algorithm might disproportionately harm or benefit certain groups.
“Otherwise, you’re creating a scientific way of justifying the unequal distribution of resources.”
An outdated algorithm without meaningful checks and balances is just one possible reason
the radiographic measure might fail to identify pain in people of color and yet continue to be widely used.
Racial bias is an external factor in how pain is assessed in patients of color
And so the researchers trained a convolutional neural network to predict the pain score in knees using
a diverse dataset from an NIH-funded study.
The dataset comprised 4,172 patients in the United States who either had knee osteoarthritis
or were at high risk of developing it. The algorithm’s predictions explained more of the variance
in pain than KLG did, showing that the X-rays did carry signals for pain that the current system didn’t
detect. The researchers attribute the success to the diverse dataset.
They even retrained the neural network on a non-diverse training set. Both instances were better at
detecting pain in X-rays than KLG, but the machine learning models trained on diverse datasets were
better at predicting pain and narrowing the racial and socioeconomic pain gap.
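The paper’s headline comparison is statistical: how much of the variance in patients’ reported pain each severity measure explains. The toy sketch below illustrates that metric only; it is not the study’s code, and the data, noise levels, and names (`coarse_grade`, `continuous_score`) are all invented for illustration.

```python
# Toy illustration of the variance-explained comparison, not the study's
# code: a coarse grade (like KLG) discards within-grade detail, so it
# explains less of the variance in pain than a continuous score (like
# the CNN's output). All data here is synthetic.
import random

random.seed(0)

def r_squared(y_true, y_pred):
    """Fraction of the variance in y_true explained by y_pred."""
    mean = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1 - ss_res / ss_tot

# Synthetic cohort: a hidden severity drives pain, plus individual noise.
severity = [random.random() for _ in range(1000)]
pain = [10 * s + random.gauss(0, 1) for s in severity]

# A coarse grade rounds severity into a few discrete levels...
coarse_grade = [round(s * 2) * 5 for s in severity]        # levels: 0, 5, 10
# ...while a continuous score tracks the underlying severity closely.
continuous_score = [10 * s + random.gauss(0, 0.5) for s in severity]

print(f"coarse grade R^2:     {r_squared(pain, coarse_grade):.2f}")
print(f"continuous score R^2: {r_squared(pain, continuous_score):.2f}")
```

The continuous score explains more of the variance because the rounding step in the coarse grade throws information away, which is the study’s intuition for why a continuous algorithmic severity score can outperform a five-level grade.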
Pierson said that she didn’t go in assuming they would find signals the existing,
conventional scores were missing, but there were reasons to believe it was possible. And weighing
the risk against the return, the potential impact was high if their algorithmic approach did find undetected signals.
“Machine learning models trained on
diverse datasets were better at predicting pain and narrowing the racial and socioeconomic pain gap.”
“It’s quite clear empirically that diversity of training set is important,” she said,
adding that, when it comes to medicine in the broader context, “you shouldn’t throw all
the women out of the study or only do your analysis on white European ancestry.”
Using machine learning to reduce (rather than perpetuate) bias in healthcare settings
Pierson’s research illustrated that patients of color in pain have been disproportionately
misdiagnosed by a system designed for white populations. That ultimately impacts treatment options.
Using the algorithmic predictor that was trained on the diverse dataset, more Black patients would
be eligible for knee surgery. The neural network also found that those with the most severe pain
were most likely to be taking painkillers, like opioids. The researchers note that knee surgery
intervention could help lower opioid use among certain racial and socioeconomic populations,
since it would help with their pain.
“Using the algorithmic predictor that was trained on the diverse dataset, more Black patients would be eligible for knee surgery.”
The researchers don’t see an algorithmic approach as a replacement for humans. Instead,
it can be used as a decision aid. So rather than just a human or an algorithm making the final call,
the radiologist can look at both the X-ray and the results from the algorithm, to see if they might
have missed something.
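A hypothetical sketch of that decision-aid pattern: neither the radiologist’s grade nor the model decides alone, and cases where the two disagree sharply are surfaced for a second look. The function name, threshold, and scaling below are invented for illustration, not taken from the study.

```python
# Hypothetical decision-aid sketch (names and threshold invented):
# flag X-rays where the model's severity score sharply exceeds the
# human-assigned KLG grade, so a radiologist can take a second look.

def flag_for_review(klg_grade: int, model_severity: float, gap: float = 0.3) -> bool:
    """Flag an X-ray when the model sees substantially more severity
    than the human-assigned grade (both rescaled to the 0..1 range)."""
    klg_scaled = klg_grade / 4  # KLG grades run 0 through 4
    return (model_severity - klg_scaled) > gap

# A knee graded KLG 1 but scored 0.8 by the model gets a second look.
print(flag_for_review(1, 0.8))   # True
print(flag_for_review(4, 0.5))   # False
```

The point of the pattern is that the flag only routes attention; the final call stays with the human reader, as the researchers describe.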
Pierson also said that their findings show “the potential for more equitably allocating surgery
using these algorithmic severity scores.” While not yet a flawless system, these approaches show
promise in closing the racial disparities in pain assessment and treatment options.
“I think there are follow-ups along the directions of surgery allocations and decision aids,” Pierson said.
“Those are not hypothetical things, those are things I’m actively interested in.”
An unconventional approach to pain or the new standard?
This approach is unconventional in that the algorithm wasn’t trained to do what the doctor does.
It was trained to see what doctors and existing systems are missing. Rather than learn from the doctor,
the algorithm was learning from the patient. When clinical knowledge is incomplete or inaccurate,
you can go beyond the systems in play and learn from the patient directly.
“Rather than learn from the doctor, the algorithm was learning from the patient.”
What’s also an important takeaway from this research is that algorithms can be used for pure
knowledge discovery—by training an algorithm to read thousands of X-rays, they were able to
equate certain parts of the image to pain, detections that radiologists missed. Because of the
black-box nature of algorithms, it’s unclear exactly what the model is “seeing.” Still, it’s
a notion that can be applied to other medical practices with archaic foundations that might
not capture the lived experiences of the diverse demographic of patients.
Pierson said that the study was predicated on the existence of this diverse, publicly available
dataset with suitable privacy protections. “Without that data collection, the study wouldn’t have
been possible,” she said. Part of the onus ultimately falls on data-collection efforts, ones that
are inclusive and ethical. This type of study proves that those efforts are not in vain—they can
quite literally ease the pain.
is a writer with a focus on tech, culture, power, and the environment.
She has been featured in Gizmodo, Vice’s Motherboard, Medium’s OneZero,
National Geographic, and more. You can follow her work here.
Put simply, we encourage free syndication. If you’re interested in sharing,
posting or Tweeting our full articles, or even just a snippet, just reach out to firstname.lastname@example.org.
We also ask that you attribute Loka, Inc. as the original source. And if you post on the web,
please link back to the original content on Loka.com. Pretty straightforward stuff.