Welcome to Loka’s podcast, “What fascinates you?” Conversations with entrepreneurs, engineers,
and visionaries who are driven to bring innovations to life. I'm Bobby Mukherjee and today's
podcast is about the power of telling stories with your data and how more diversity in our data
– and our teams – can uncover much greater outcomes.
Here to share her story through video chat is Ahna Girshick – life-long researcher, creator, and
former manager of computational genomics at Ancestry.com. Like many of our guests, Ahna’s career
journey is as fascinating as the innovations she pursues.
Her research has spanned the worlds of science and data – from neuroscience to machine learning and computational storytelling.
She has received over 3,000 citations and press from The New York Times and NPR, and has been
featured at the Museum of Modern Art for her work with Philip Glass and Björk.
If you work with data in any capacity, if you want to connect better with your customers,
or if you want to see more diversity in the tech world, I think you'll come away with something
valuable from this interview.
Hello. Welcome to the show.
Hi, thank you. It's nice to be here.
Pleasure to have you here. So, the first thing I wanted to talk about: when you and I were
talking previously, you mentioned that your dad was a scientist and your mom was an artist,
both at Stanford, which made me wonder, what were conversations like at your dinner table growing up?
I really grew up surrounded by both arts and sciences, from both my parents. From my father,
I saw his scientific journey. It was very academic. It felt very innovative and stimulating.
And then I was also surrounded by the arts. My mother was painting inside the house, in her home studio,
so that felt very creative and beautiful. And my parents had quite a bit of appreciation for each other's
professions, but at the same time, even as a kid, I could see how different they were. And so, as a kid,
I loved the arts. I loved the sciences. And it didn't really dawn on me until probably fairly late in high
school, and then increasingly in college, that most of our society is kind of set up in this... it's very specialized,
right? So our educational systems, our professional tracks, where people tend to self-segregate, generally,
you know, it could be scientific, it could be artistic, or it could be many other things, but
there's deeper and deeper specialization. So that's great. I love going deep, but it also creates
some limitations on diversity of ways of thinking, working, and communicating. And so that was a little bit
disappointing, I think, for me as a young adult, and confusing.
Do you remember, at an early age, if you found yourself drawn to one over the other?
I think I was probably a little bit more drawn to math and science. But I'm not sure how natural that was.
It might've been because I internalized this kind of story from my mother that, you know, you don't want
to become a starving artist. Like, I'm going to get a good, get a computer science degree or something like
that. I'm not quite sure, but I was always looking for ways to connect them. And I still have been, my whole career.
So one of the things that you taught me when we were talking the last time was you provided this
lens on how to look at data, and you talked about the power of stories and how data can be very dry and
not very meaningful in and of itself, and it can become so much more powerful if you can craft a
powerful story around it.
So before we dive into the specific notion of creating stories with data, I was just curious, at a higher
level: what role do stories play in your life?
Stories are this uniquely human format for communicating information. We can trace stories back
almost 5,000 years. And you know, I'm a mother,
and so, quickly, I discovered that even for very young children, storytelling has this huge power. I think
we're wired to be captivated by stories because they give us this framework for interpreting our lives.
And they're also very connecting, which is why they run strong in families and communities, and get passed
down over centuries.
So I'm not a natural storyteller. I'm not a published author of literature. Most of us aren't. But I
think when I was at Ancestry, during my time as a research scientist there, I became really interested
in this idea of data-driven stories and saw that my work there was really kind of like computational
storytelling, which was a way for me to fit,
you know, maybe it's the story I was telling myself, but it connected into my career quest to connect
AI and technology into the human experience.
Let's dig into that a bit. So you worked at Ancestry as a manager of computational genomics research
for a little under five years. What were your major roles during that time?
I started as a research scientist, and then later on I was managing a research team. Ancestry
is, uh, about a 40-year-old company. Their main focus is family history, and users can create family
trees. And when I joined, my association with building a family tree, which is called genealogy,
you know, was that it was a hobby for a kooky great-uncle who I didn't really want to talk to or something
like that. And I think a lot of people have that association. But, you know, I'm also a data geek,
and I quickly understood that family trees, especially at the scale Ancestry has, they have
over a hundred million family trees.
They're this very rich data source. And they're spatial, because usually in family trees, people
say so-and-so was born in this place and they've been in this place and died in this place,
and that covers the whole globe. And they're also temporal, because they say my mother was born
in this year, my grandfather was born in this year, going back: the birth dates, death dates,
you know, all the significant events, when the children were born, et cetera. So you can
think of them as this raw material for creating a computational history. And then when you
aggregate them, you can discover historical trends. So, you know, we have textbook history, which
was written by historians, their version of the story.
And then there's this kind of computational, aggregated family tree story, which is potentially
the same and possibly a little different, right? So, uh, hold on to that thought because I'm going
to return to it. But another way to look at your past is genetics. And I was in the AncestryDNA
science research labs.
So we were very focused on genetics, and genetics tells you about your past because all the DNA you have,
you inherited from your parents, who inherited it from their parents, all the way back, right? And, you know,
it's funny because in school, you don't get to combine or even major in, like, history and genetics, or
combine those data sets.
But that's what we did. So we were combining historical data from the family trees and the genetics data,
which I think is something pretty awesome in data science: you find these disparate data
sources and you get to work sort of cross-disciplinarily. So I had the opportunity to do that research.
We called it, you know, I think of it as data storytelling. So the way we did this is first we built a social
network. Think of it like Facebook, but instead of friendships connecting two individuals, the connection is determined
by genetic relatedness, how much DNA you share. So siblings are going to be directly connected and
very close in that network.
Whereas people from opposite parts of the world are going to be very far apart in that network. So think of this as
a massive genetic social network. And then you can use clustering algorithms, you don't really need to know
what those are, to find clusters of individuals in that social network. These are large clusters,
in the tens of thousands or hundreds of thousands of people, and each cluster
represents a group of people who share more DNA with one another than they do with others.
And generally when you share DNA, especially when we're going back like eight generations, it means you share a common history.
You know, it was only in the past few hundred years that people have been traveling across the world to meet their mate, right?
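To make the network-and-clusters idea above concrete, here is a minimal sketch in Python. The interview does not say which clustering algorithm Ancestry used, so this is not their method: it swaps in the simplest possible stand-in, connected components over links whose shared-DNA score passes a threshold. The names, scores, threshold, and the `cluster_by_relatedness` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (person_a, person_b, shared_dna).
# The scores are illustrative relatedness values, not real measurements.
SHARED_DNA = [
    ("ana", "ben", 2600.0),
    ("ben", "cara", 900.0),
    ("ana", "cara", 850.0),
    ("dev", "eli", 1700.0),
    ("cara", "dev", 5.0),  # very weak link between the two groups
]


def find(parent, x):
    """Return the representative of x's group (union-find with path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def cluster_by_relatedness(edges, min_shared):
    """Group people connected by links with shared_dna >= min_shared.

    This is plain connected components, a stand-in for the (unspecified)
    clustering algorithm mentioned in the interview.
    """
    parent = {}
    for a, b, _ in edges:
        parent.setdefault(a, a)
        parent.setdefault(b, b)

    # Merge the groups of any two people whose link is strong enough.
    for a, b, shared in edges:
        if shared >= min_shared:
            root_a, root_b = find(parent, a), find(parent, b)
            if root_a != root_b:
                parent[root_a] = root_b

    groups = defaultdict(set)
    for person in parent:
        groups[find(parent, person)].add(person)
    return sorted(sorted(members) for members in groups.values())


clusters = cluster_by_relatedness(SHARED_DNA, min_shared=20.0)
print(clusters)
```

With the threshold at 20, the weak link between "cara" and "dev" is ignored and two clusters come out; lowering the threshold far enough merges everyone into one group, which is why a real system would need a more careful algorithm than this.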
Right. So I definitely believe in the power of storytelling. And then you take something
like data, which untouched can be really dry and about the furthest thing away from a story.
So if I'm trying to create better stories with data, one of the things that I just picked up
is first you have to have a character or characters in your story that people can empathize
with and relate to.
Yeah. I mean, I think connecting to people is a powerful technique. It's probably not the only
technique, but we empathize with other people, especially if we can
relate to them.
There are data products out there that are geared towards customers, and so there, the people
are there already. And then there's also data journalism in the news, where it's looking at
large populations of people. So it's not necessarily targeted towards individuals, but there's
still that human connection.
Right. So I think that seems to be, at least one of the key ingredients in trying to make
a more compelling story.
Right. So if you're making a story, even if it's about climate change or something like that,
and you're looking at the data on that, how does that affect people?
Because we care about people. Oh, you could talk about how it affects animals too, I suppose, right?
No, no. I mean, exactly. But I think, again, the key ingredient is: have you created a character,
whether it's a person, animal, or whatever form, that people will relate to and empathize with?
That's the key thing.
So switching gears a bit, something that I would really love your perspective on, having had
this tremendous journey in the field of AI, being a practitioner, you know, just your perspective:
is it different for women in the field of AI and machine learning than for men?
Like, I have anecdotal answers to that question for myself, but I think what I know is that,
you know, the more diversity we can bring to teams building AI systems, the more algorithms
can reflect that diversity, which I think is a very good thing. You know, diversity comes in many forms.
It's not just gender, right? It's race, but it's also a diversity of educational institutions and diversity
of training, right? So matching computer scientists with anthropologists or artists or journalists,
or matching a Stanford PhD with someone who is self-educated.
There are many forms, but I think when we are willing to create a more diverse team,
sometimes that might feel hard or different, but it ends up creating different types of solutions.
I definitely have seen that happen with us and with other teams, and I'm constantly
looking for opportunities to make that happen because the outcomes are fantastic.
So here we are, hopefully in the tail end of the pandemic. In your mind,
did 2020 change things for data, AI, and storytelling?
Yeah. I mean, 2020 changed the world, right? It was just a moment, right? I mean,
two things stand out to me. One, we didn't talk about COVID research, but because of COVID,
I noticed that self-reported health data became more mainstream and more accessible.
So addressing the COVID pandemic while we were all remote, it kind of forced the healthcare
industry to be more accepting of that self-reported data.
And also people were motivated to help. So one of the things I did at Ancestry was to help
coordinate a COVID research study. We had nearly a million volunteers sign up to anonymously
participate, you know, to contribute to the scientific understanding of the genetics and other
risk factors underlying COVID.
And I saw many other efforts like this, where organizations were collecting data in cell phone
apps that, like, ping you every day, and you could enter your zip code and what symptoms you were
feeling, that sort of thing. And the healthcare industry, which has traditionally been rather
low-tech, was, you know, looking for this data, because it's only en masse.
So that was one big change. And I'm hoping that trajectory will continue because self-reported
data is really valuable, really powerful in what it can do for science and healthcare.
And then the other big issue, of course, of 2020 was, you know, the racial justice movements that
sort of engulfed the country.
But I think they also helped drive this kind of more honest dialogue about racial disparities
in the workplace and in machine learning. Last week on PBS, I saw the Coded Bias documentary, which,
if you haven't seen it, is amazing. It's about bias in AI. It's disturbing, but it didn't
tell me anything
I didn't already know. And it kind of highlights bias in training data, the lack of diversity
of engineers designing the algorithms, but also the business forces dominating that conversation.
And so my hope is that those AI ethics organizations that you mentioned will be applying pressure
and working with the big AI companies.
There's only a few really big ones right now, you know, just to prioritize transparency,
for example. That would be a great start: how the algorithms are working and where
the data is coming from. And so, you know, this goes back to data storytelling: people
who use that product want to understand why this algorithm is,
you know, maybe something serious, like telling them they have cancer or something, right? Like, where did it,
how did it make that decision, based on what data, and how was that learned? How do you contextualize it
within a population to understand? Because if you're making a big decision about your healthcare or
getting a job, you know, you want to understand that whole context behind it.
Where my optimism comes from for 2021, I think, is that those observations, as you said, lay the groundwork
for some momentum in that direction. So you see better outcomes with things like model
explainability and less confusion about how algorithms are making these weird decisions that
we don't believe they should be.
So I think that is great cause for optimism.
Well, I know this has been fantastically useful and engaging. I've learned a ton. I have many more
things to ask you, but I really, really appreciate your time. Thank you so much for being on the show.
Yeah, it was a pleasure.
That was Ahna Girshick, researcher, creator, former manager of computational genomics at Ancestry.com,
and barrier breaker in AI.
If you're interested in learning more about Ahna and her research, you can visit her firstname.lastname@example.org.
And if you enjoyed our show, please like and rate us. Until next time, this is Bobby Mukherjee.