FinTech, ML model
explainability, and Google’s new ideas – a transcript of
Episode 9
Bobby:
Hi and welcome to the Loka podcast. I'm your host Bobby
Mukherjee.
On this episode, I talk to Chanchal Chatterjee who leads
a team as a machine learning specialist at Google.
This is a great episode for anyone who wants a primer on
how machine learning is being used, particularly in the
financial services industry. We cover a lot of different
use cases for machine learning, including fraud and risk.
And for those who really want to take a deep dive into ML
model explainability, Google is clearly a thought leader
in applying ML to real problems. This episode covers a
lot of Google's tools that help with that, including
TensorFlow, and newer ideas like AutoML. Enjoy.
All right. So I'm here with my guest, and first things
first, I'd like him to go ahead and introduce himself.
Chanchal:
Hi, my name is Chanchal. I am a machine learning
specialist at Google Cloud. My job is to go to customers,
turn their business problems into machine learning
problems, and lead a team to execute on these machine
learning problems and create solutions.
Bobby:
Well, terrific. Chanchal, thanks so much for taking the
time. It's a real pleasure to be here. The first thing
I'd like to understand is, you know, there are so many
different facets of computer science that one could
study. For you, when was your first introduction to this
world of machine learning, and what started to make you
gravitate towards it?
Chanchal:
Yeah, I did machine learning a long time ago, when
machine learning was not the hottest thing to do.
The reason I did machine learning is that I was trying to
solve the optical character recognition problem, and I
didn't know exactly how to solve it using vision. Every
character was slightly different.
I went to Purdue University, where I met my professor,
who was an old-time friend. And I said, how can I solve
this problem where you have numerous characters coming
through and they're constantly changing? So we took the
problem up as an adaptive learning problem, and we found
out that the solution wasn't very interesting.
So I said we should solve some more interesting problems,
like trying to find more important features out of
characters or any other signals, such as voice signals.
And that's how I embarked on the journey of machine
learning. That was the beginning: a more interesting
problem to solve. That's always a good bet.
Bobby:
One of the things that really fascinated me about you
was how you had taken a deep dive and become a specialist
in use cases within, among other things, the financial
services vertical.
And I thought we could spend the balance of our time
talking through as many of these use cases as we have
time for. You know, it's dealer's choice. So why don't
you pick out the first one in financial services?
Chanchal:
We have many, many machine learning problems, and at
Google we see hundreds of these problems. Several of the
top 100 banks are our customers. Categorizing all of
them would be too hard to do, but at a high level,
financial fraud is a very big area of interest.
The next one is risk: how to mitigate risk and how to
quantify risk as margins or interest rates.
There are other areas, like how to redact sensitive data
so that no sensitive data lands in any public place,
especially the cloud. Another area is conversational
agents. Google is big on conversational agents.
We have Google Home, which is very popular. We also have
a product on the cloud called Dialogflow.
So that's another topic. Then, Google has been doing ads
and recommendations for 20 years, so we have
recommendation as another solution. Search would be
another. And one of the big areas that is of great
interest to me is model explainability, which is very
important from the consumer standpoint, from the
regulator standpoint, from the developer standpoint, and
from the management standpoint. Other areas of great
interest are forecasting, user behavior, anomaly
detection, and smart ticketing.
You name it. There are hundreds of applications.
Bobby:
So all of that's really fascinating. If you were a
beginner trying to understand where to start in
financial services, and specifically in applying machine
learning in that industry, which one of these use cases
would you pick to walk through as an example of how to
apply it?
Chanchal:
It all depends. You start from the business first. You
check which problems are the most important from the
business standpoint. We don't start from the technology;
we start from the business and then convert that into
technology problems. So for example, from the business
side, fraud is more of a preventive mechanism.
That is where you prevent loss.
Whereas risk, or how to charge more margin or interest,
is more of a revenue-generating solution.
So, I guess from a layman's point of view, revenue
generation will be very important from the bottom line
perspective or the top line perspective for any business
entity. However, compliance is very important in
financial services because you can incur terrible fines.
So financial fraud and similar problems, and model
explanation, or regulator-based model explanation, will
be very important from the loss prevention standpoint.
So those would be the two top categories in mind: things
that are helping you on the revenue-increasing side, and
things that are helping you limit your losses on the
expense side of the equation.
Bobby:
Let's take fraud, for example. At a high level, and
generically, because every bank is going to be set up
differently, if we just take a generic use case, what
are the inputs that are being fed into a machine
learning construct? And then how does that play out?
Chanchal:
So there are multiple types of fraud. There's business
banking fraud, which is very transaction based. Then you
have credit card fraud, which is also very transaction
based. So online e-business and credit card are the two
top categories of fraud. The other is money laundering,
or anti-money laundering. Of course, that's the other
category, which is very important.
I put fraud and anti-money laundering as two separate
categories. And within fraud you have banking fraud,
credit card fraud, check fraud, ATM image fraud, and so
on and so forth. So let's look at transactional fraud,
which is banking and credit card. These are actually
very complex problems to solve.
Because you have real-time transactions, everything from
the time a credit card is swiped to the time the fraud
is detected has to be done in far less than a second.
And there are a lot of bottlenecks in the system.
There could be a transaction server that is very slow,
for example. So the whole machine learning problem
becomes a very short latency problem to solve, like 25
milliseconds.
So some of these are challenges.
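To make the latency point concrete, here is a minimal Python sketch (not from the episode) of checking a single scoring call against a 25 millisecond budget. The score_transaction function and its "ten times the usual amount" rule are hypothetical stand-ins for a real fraud model.

```python
import time

LATENCY_BUDGET_MS = 25.0  # the budget quoted in the conversation


def score_transaction(amount, avg_amount_30d):
    """Hypothetical stand-in for a real fraud model: flag large spikes."""
    return 1.0 if amount > 10 * avg_amount_30d else 0.0


def score_within_budget(amount, avg_amount_30d, budget_ms=LATENCY_BUDGET_MS):
    """Score one transaction and report whether the call met the budget."""
    start = time.perf_counter()
    score = score_transaction(amount, avg_amount_30d)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return score, elapsed_ms, elapsed_ms <= budget_ms


score, elapsed_ms, on_time = score_within_budget(10_000.0, 100.0)
```

In a real system the measured span would also include feature lookup and network time, which is where the bottlenecks mentioned above tend to sit.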
Bobby:
Presumably the ML building blocks that are present today
were not available maybe five or six years ago.
Chanchal:
So, on the ML building blocks, what is great about
cloud-based machine learning solutions is that we have
the entire ML infrastructure, what we call the ML
platform.
It is very hard and complex to build ML platforms. I
have done that many times in my career, and you only
build bits and pieces. In order to build an ML solution,
you have to build the entire ecosystem around the
machine learning code. It has been famously published by
some Googlers that machine learning code is only 5% or
10% of the entire problem.
The other issues that you have to worry about in machine
learning are how to ingest the data, both real-time and
stored data, how to preprocess the data, how to extract
the features, and how to do continuous training.
How do you orchestrate the training? Data is what we
call non-stationary, which means that your consumer base
today and tomorrow will be very different, and so will
the transactions today and tomorrow.
So the changes are very time sensitive. You cannot build
a model and use it forever. You have to constantly
retrain, sometimes hourly or daily, and therefore you
need an infrastructure that does this whole training
orchestration.
Then you have to store these models, which is a model
repository. It has to be properly versioned. It has to remit
attack.
Then you also have to have a proper serving
infrastructure so that you can serve the right model. A
credit card transaction, an e-banking purchase, a mobile
app, and a website might all have different models, and
therefore you need a proper, intelligent serving
infrastructure, plus monitoring.
So there are lots of pieces. And it's very difficult to
build all these pieces, which has already been done by
Google.
Yeah. So there's no need to reinvent the wheel and make
all of these yourselves.
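Two of the pieces described above, a versioned model repository and serving that picks a model per channel, can be sketched in a few lines of Python. This is a toy in-memory illustration, not how Google's platform works; the ModelRegistry class, the channel names, and the threshold "models" are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ModelRegistry:
    """Toy in-memory model repository with versions and channel routing."""
    versions: dict = field(default_factory=dict)  # name -> list of models
    routes: dict = field(default_factory=dict)    # channel -> model name

    def register(self, name, model):
        """Store a new version of a model and return its version number."""
        self.versions.setdefault(name, []).append(model)
        return len(self.versions[name])

    def route(self, channel, name):
        """Serve the given channel with the named model."""
        self.routes[channel] = name

    def latest_for(self, channel):
        """Return (version, model) currently served for a channel."""
        models = self.versions[self.routes[channel]]
        return len(models), models[-1]


registry = ModelRegistry()
registry.register("card_fraud", lambda amount: amount > 5_000)
registry.register("card_fraud", lambda amount: amount > 8_000)  # retrained
registry.register("web_fraud", lambda amount: amount > 1_000)
registry.route("card_swipe", "card_fraud")
registry.route("website", "web_fraud")

version, model = registry.latest_for("card_swipe")
```

Here the card swipe channel is served by version 2 of the card model, while the website channel keeps its own model.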
Bobby:
So it just dawned on me that you said something very
interesting when you were talking, just generically,
about the example of credit card fraud.
Credit cards have been around for decades and decades,
and databases that store credit card transactions have
also been around, I'd say, for as long as Oracle, the
company. What's an example of either a hardware or a
software-based ML building block that came onto the
market, you know, from Google, that allowed one to take
this transactional data that has been on these older
systems and start to make recommendations about fraud in
the sub-second category, which wasn't possible, I don't
know, five or 10 years ago?
Chanchal:
Yeah. So there are three things. The first is the
availability of large volumes of data from multiple
sources. Data was very fragmented before. Now we have
the ability to consolidate data into a single source.
There are basically three types of data. One is
real-time data, which is streaming data, and that can be
the transactions, the clickstream, the app and web logs.
So there is lots of streaming data. You can mix that
with social media data, for example. I could be saying
that I'm on vacation, and I'm posting on Twitter that
I'm having fun in the hills.
So social media data can also reveal a lot about your
behavior. Then there is the second category of data,
which is batch data.
In a banking context, batch data would be things like
accounts, marketing data, and order management.
Then you also have repository data, which is typically
stored in CRM systems or enterprise data warehouses. It
is a lot easier today to have all this data in a single
source.
For example, Google Cloud offers BigQuery, which is a
data warehouse and an incredible place to store all your
warehouse data and batch data in one place. It takes a
simple query to extract it. You can pull streaming data
with Google Cloud Pub/Sub or any other public sources.
A lot of technical advances have happened that have made
data easy to access.
The second big innovation is algorithms. Before, it was
very difficult to take multiple types of data and build
algorithms that would actually find insightful features.
People used to have very well-trained PhDs go in and do
feature engineering, which is to look at these huge
volumes of data and find which features are important.
Now the advent of neural networks has made it very easy
to extract those features. So that's the second
innovation that has happened. And the third big
innovation is compute. We have so much compute power
with GPUs, with TPUs, and with the very powerful
processors from Intel, AMD, and all these companies,
that we can now process huge amounts of data in a short
amount of time.
So these are the three pillars that have really
propelled machine learning. When I did my PhD, we used
to build machine learning models with neural networks,
but it was so hard to either build or deploy them. Now
it's so much easier.
Bobby:
Yeah. And I'm guessing back then you were doing Sun
SPARC work.
Chanchal:
The highest-powered processors were mostly 550 megahertz
processors. So those were immensely underpowered.
But this is interesting, because a lot of these
algorithms that are used today have been around for
decades; you just couldn't get the full power out of
them. But that's not entirely true. Yes, algorithms like
convolutional networks were discovered 20, 30 years ago,
but they were never effectively used for face
recognition, for example. So the use of the data, and
also some innovations that have happened in those
algorithms, have really enhanced the ability to increase
performance. I was looking at some of the historical
performance numbers that we had. Five years ago, in text
recognition, a human being generally made a mistake 3%
of the time. Machines made a mistake 5 to 10% of the
time. Today, a lot of the text recognition solutions
that Google or many other companies produce can get to
1% or less, and they're trying to reduce it down to
0.1%.
So from a place where we were far worse than humans, we
are now better than humans. It's not entirely because of
compute; a lot of algorithmic innovations have happened.
That's pretty exciting to see, that rate of innovation
that's going on right now.
Bobby:
Let's talk about some of these building blocks, just to
give my listeners more context, because they come up a
lot, and maybe you could give a layman's explanation of
some of these things. So let's take TensorFlow. Purely
from a user standpoint, how does that building block fit
into something like fraud?
Chanchal:
So TensorFlow is a machine learning platform. It is a
way by which you can write complex machine learning code
in a simple way. There is a construct on top of
TensorFlow called Keras. Keras is also owned by Google.
It basically simplifies the complexity of TensorFlow so
that you can write machine learning code. Back when I
did my PhD in 1996, in order to build a decision tree,
we had to take the C4.5 decision tree library. You had
to spend hours and hours figuring out what function to
call for each of the steps.
And it was very difficult. It took you days, if not
weeks, to implement. Today with Keras, or with
scikit-learn, for example, which is another package, you
can do a decision tree in one line. Now, what did
TensorFlow do? TensorFlow is immensely useful for
complex deep learning. A neural network was so complex
to build 10 or 20 years ago, even five years ago. Today
with Keras, in five lines of code, you can build a very
complex deep learning network.
And you can train your machine learning models and you
can run your machine learning models.
It will be a very short snippet of code. Wow. It just
allows you to accelerate your progress. Absolutely. It's
so easy to build, and therefore it is so easy to
experiment. And TensorFlow within Cloud Machine Learning
Engine has many, many very exciting components.
One of the components I particularly want to speak about
is called hyperparameter tuning.
When you build a complex machine learning model, it has
many, many parameters. How many layers should the neural
network have? How many nodes? What are the different
activation functions, which are the functions that
change the output?
So you could come up with thousands, if not millions, of
these different parameter combinations. How to move
through this complex parameter space to find the optimum
combination is very difficult, and that is what
hyperparameter tuning does. It's a very complex
algorithm that Google invented. It is partly
open-sourced, and it's available in the cloud to do
that.
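The idea of searching a parameter space can be illustrated with plain random search, which is a much simpler technique than the Google algorithm mentioned above and is swapped in here only for illustration. The search space and the validation_loss function are made up; in practice the loss would come from training and evaluating a real model.

```python
import random

SEARCH_SPACE = {
    "layers": [1, 2, 3, 4],
    "nodes": [16, 32, 64, 128],
    "activation": ["relu", "tanh", "sigmoid"],
}


def validation_loss(params):
    """Made-up objective standing in for training and evaluating a model."""
    penalty = 0.0 if params["activation"] == "relu" else 0.5
    return abs(params["layers"] - 3) + abs(params["nodes"] - 64) / 64 + penalty


def random_search(trials, seed=0):
    """Sample random combinations and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(trials):
        params = {
            name: rng.choice(values)
            for name, values in SEARCH_SPACE.items()
        }
        loss = validation_loss(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


best_params, best_loss = random_search(trials=50)
```

With only 48 combinations this space could be searched exhaustively; random sampling matters when the space has the thousands or millions of combinations described above.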
Bobby:
And in the spirit of making it easy to experiment, maybe
this doesn't fit perfectly, but tell us a little bit
about what the promise of AutoML is.
Chanchal:
So there are three ways of doing machine learning on
Google Cloud.
One is that you can build machine learning solutions
from scratch.
We call it "your data, your model". You, the customer,
bring in your data and your data scientists. We
basically provide you the AI infrastructure for you to
build the models. The second one is AutoML. There you
bring in your data and we provide you the models. These
are predefined models, not trained models. So you bring
in your data and you train it with our models.
So we bring in all our experience in order to create
these models for you. The third one is called API.
API is where we have pre-trained models. You bring in
your data, the models are already trained, and you can
use them as is. So that's the beauty of using machine
learning on the cloud: you can now use any of these
three options to build your model, from very simple to
very complex.
Bobby:
That makes a lot of sense. And I think people are pretty
excited about the possibilities of what they can now do
in a much shorter period of time, thanks to AutoML. One
of the things you touched on earlier was that there are
many aspects to deploying a machine learning solution.
And one aspect of it is the data ingestion and
processing stage. I mean, it's table stakes. You must do
this. So talk to me a little bit about the challenges in
that, and some of the building blocks or things that
you've seen that have helped.
Chanchal:
So in the financial services industry, a lot of the
confidential and proprietary data resides on premises.
Bringing it to the cloud is always a challenge because
there's a lot of personal data there.
So you need to de-identify the data and redact the
sensitive content. We don't want any of that on the
cloud. So that is obviously a challenge. We have an API
called the DLP API, which allows you to redact data.
You can also use third-party solutions in order to
redact sensitive content. That's the first challenge.
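As a rough picture of what redaction means, the Python sketch below replaces a couple of sensitive patterns with placeholders before data leaves the premises. It does not use or imitate the DLP API; the two regular expressions and the placeholder labels are illustrative assumptions and would miss many real-world formats.

```python
import re

# Illustrative patterns only; real detectors cover many more formats.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "CARD_NUMBER": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def redact(text):
    """Replace each match with a placeholder naming the kind of data."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


redacted = redact("Card 4111 1111 1111 1111 belongs to jane@example.com")
# redacted == "Card [CARD_NUMBER] belongs to [EMAIL]"
```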
The second challenge you have is the multiple sources of
data. For example, in financial services, you might have
a fraud repository. You might have some other system
where you have stored data, like a Hadoop repository.
So you might have 10 different repositories in your
bank, each of which is different. Consolidating and
aggregating all of that towards the cloud involves a lot
of different types of ETL work. Even after you get the
data to the cloud, it is also very complex to preprocess
the data so that you can create a single source of truth
from which you can build the fraud model.
Bobby:
And just pausing for a sec. Could you talk a little bit
about what would be an example of preprocessing
and why that's important?
Chanchal:
Preprocessing is very important. For example, say I
typically withdraw a hundred dollars from the bank, and
all of a sudden, one day I withdraw $10,000. How would
you know that the $10,000 is an anomaly?
You need to know the average transaction for the last 30
days. The average transaction for the last 30 days
compared to the current transaction is what we call a
feature, and that 30-day average is the necessary
component for extracting the feature. So that's why you
need to preprocess the data: in order to extract that
feature, for example.
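The feature described here can be sketched in Python as the ratio of the current amount to the 30-day average. The function names and the sample history are hypothetical, and a real pipeline would compute this per customer over a transaction store rather than an in-memory list.

```python
from datetime import date, timedelta


def average_last_30_days(history, today):
    """Mean withdrawal amount over the 30 days before `today`."""
    cutoff = today - timedelta(days=30)
    recent = [amount for day, amount in history if cutoff <= day < today]
    return sum(recent) / len(recent) if recent else 0.0


def amount_ratio_feature(history, today, current_amount):
    """Current amount divided by the 30-day average (0.0 with no history)."""
    average = average_last_30_days(history, today)
    return current_amount / average if average else 0.0


today = date(2020, 1, 31)
# Ten past withdrawals of $100, one per day.
history = [(today - timedelta(days=d), 100.0) for d in range(1, 11)]
ratio = amount_ratio_feature(history, today, 10_000.0)  # 100.0
```

A ratio of 100 against a usual value near 1 is the kind of signal a fraud model can pick up as an anomaly.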
Bobby:
Again, the cloud platform is what allows a lot of that
ingestion and preprocessing to happen much more easily
today than it would have before.
Chanchal:
Correct, and it can be stored perpetually. For example,
you can create the last 30 days' average transaction
offline, on a customer-by-customer basis, and keep it
stored in your internal repository so that you can
create the features on the fly.
This is just an example of one feature. You can do
hundreds of features like this, and all of those have to
be processed. A new customer comes in, and all of a
sudden a new feature vector has to be created for that
customer. And all of that becomes a lot easier to do
once you've got that base going in the cloud. Correct.
And for that you need an automatic orchestration
platform. These are not done manually. These are all
done automatically as a new data point comes in or a new
customer comes in. So there has to be a set of triggers
based on the amount of data, the type of users coming
in, the type of consumers coming in. You have to
constantly have internal logic in order to create these
things and initiate the training process.
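A trigger of the kind described, one that starts retraining once enough new data or new customers have arrived, might look like the following Python sketch. The RetrainTrigger class and its thresholds are hypothetical; a production orchestrator would also schedule the training job itself.

```python
from dataclasses import dataclass


@dataclass
class RetrainTrigger:
    """Fire when enough new rows or new customers have accumulated."""
    max_new_rows: int = 1000
    max_new_customers: int = 10
    new_rows: int = 0
    new_customers: int = 0

    def observe(self, rows=0, customers=0):
        """Record incoming data; return True when retraining should start."""
        self.new_rows += rows
        self.new_customers += customers
        due = (self.new_rows >= self.max_new_rows
               or self.new_customers >= self.max_new_customers)
        if due:  # reset the counters once a retraining run is triggered
            self.new_rows = 0
            self.new_customers = 0
        return due


trigger = RetrainTrigger(max_new_rows=500, max_new_customers=3)
events = [trigger.observe(rows=200), trigger.observe(rows=200),
          trigger.observe(rows=200), trigger.observe(customers=3)]
# events == [False, False, True, True]
```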
Bobby:
So on the training process, one of the things you talked
about earlier was the need for a model repository. What
have you seen as being effective today in monitoring
models once they're deployed?
Chanchal:
Correct. So we have multiple ways of monitoring the
whole model. We have what is called TensorBoard, which
is a great way of visualizing the model training process
and the prediction process.
In terms of the internals of the training and collecting
the analytics, we have what is called Stackdriver.
Stackdriver basically collects the analytics of CPU
usage, GPU usage, all that consumption.
The other thing we should mention, besides GPUs, is the
TPU. I have done projects in retail where TPUs have
given a 3x advantage in cost and a 3x advantage in
processing. We were building a very large, complex
neural network model, and the data was also very large.
It was 3D volumetric data. With that model, we couldn't
do much training with the GPU. When we moved to the TPU,
we could fit what we call a batch size, that is, a batch
of 16, into the memory in order to make it work. So I've
seen myself that TPUs have been pretty beneficial. And
when you said the 3x advantage, was that over GPUs or
just traditional CPUs? Over GPUs. Yeah. I saw that in
one of the projects.
We don't want to compare GPUs to TPUs. I'm just telling
you, we are still a big user of GPUs. We love Nvidia,
and we work with them. We want a lot of GPUs to be on
the cloud, but in this one example, I've seen that we've
got a 3x advantage in price and a 3x advantage in speed.
That's pretty compelling.
Bobby:
Let's do an overview and then maybe drill down a little
bit deeper into the notion of conversational agents.
Maybe kick us off with what a conversational agent is,
and then how that is applicable and applied in, say, the
financial services use case.
Chanchal:
Absolutely. So conversational agents are very helpful in
call center management.
We have a complete product called Conversational AI,
which is conversational agent management all the way
from input to fulfillment, but there is a subset of that
which is called Dialogflow. Dialogflow is a customized
solution that we have built in order to build chatbot
agents.
So this is where we can create delightful and natural
conversational experiences. The key component of that is
that you have agents for the different questions that a
person could ask. We train those agents with different
types of utterances, because the same question can be
asked in many different ways.
And then on the fulfillment side, we can integrate with
messaging, with Google Home, and with most other types
of output, and those are called fulfillment. The other
big thing is that it has to support multiple languages.
We have at least 30 supported languages.
You also need to have what you call session flow
visualization so that you can see how the sessions are
happening.
And it also helps to have prebuilt agents. Asking your
name, for example, is very common in a lot of these. So
there are prebuilt agents. We have many prebuilt agents,
at least 30, maybe more.
Again, at a high level, how does the training of the
conversational agent happen?
It is very simple, actually. You take the multiple types
of utterances. You identify an agent for each different
question, and you take the utterances and you train the
agent. It is very fast and very simple. Wow. The
complexity is on the fulfillment side.
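The "one agent per question, trained on many utterances" idea can be illustrated with a deliberately naive Python sketch that matches a new utterance to the agent with the most similar training example by word overlap. Dialogflow uses trained language models rather than anything like this; the agent names and example utterances are invented.

```python
AGENTS = {
    "check_balance": ["what is my balance", "how much money do I have"],
    "report_lost_card": ["I lost my card", "my card was stolen"],
}


def tokens(text):
    """Lowercase a sentence and split it into a set of words."""
    return set(text.lower().split())


def best_agent(utterance):
    """Pick the agent whose training utterance overlaps most (Jaccard)."""
    words = tokens(utterance)
    best_name, best_score = None, 0.0
    for name, examples in AGENTS.items():
        for example in examples:
            example_words = tokens(example)
            score = len(words & example_words) / len(words | example_words)
            if score > best_score:
                best_name, best_score = name, score
    return best_name  # None when nothing overlaps at all


agent = best_agent("how much money is in my account")  # "check_balance"
```

Adding more example utterances per agent is what makes the same question recognizable when it is asked in many different ways.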
Bobby:
Got it. Let's look at recommendation. Again, what is it
at a high level, and how could we apply it?
Chanchal:
In financial services, one of the biggest things is
portfolio management, wealth management.
These are instances where you want to have a
personalized service. When you create an automatic
trading platform or an automatic portfolio management
platform like Wealthfront, you do need to have a
personalized service.
So given a person's risk appetite, their point in life,
their income, and so on and so forth, you can have a
combination of equities and bonds, the two major types
of assets. And then you can create a sort of pyramid of
what you need to invest in to get the return that you
want. So this is an example of a personalized
recommendation.
Google has been doing recommendation for a lot of years,
and we have open-sourced many recommendation solutions.
For example, historically, we have done products like
matrix factorization, factorization machines, and
distributed WALS. If you search for Google Cloud WALS,
you will find the open source code for that.
We also have later innovations like wide and deep
learning, Brain-based watch recommendation, and deep
retrieval.
And then most of the ahead-of-the-curve recommendation
engines that we have now are time delta sequence
modeling and reinforcement learning. This is where you
are dealing with non-stationary data, data that is
changing over time. Reinforcement learning is an active
learning method.
So these are all a new generation of learning methods.
Google has also been using recommendation in a lot of
our products, like search, apps, maps, videos, music,
and so on and so forth. And so we have two types of
solutions. One is Recommendations AI, which is an AutoML
solution. It's a retail-centric product, and it has been
very beneficial in terms of increasing click-through
rate, customer adoption, which is also known as CVR,
revenue generation, and so on and so forth. The other
one is that we have open source products, like
distributed WALS matrix factorization, where you can
download the code, or you can just go to the cloud and
use it as is.
Bobby:
I could see how that got its start more in the
e-commerce retail world, because people could sit and
understand it. Companies like Amazon and so forth made
that very famous: if you read these books, you might
want to read some of these other books. But you can
apply that same principle.
Chanchal:
Yeah. So that concept of recommendation comes from this:
basically, you have a set of users, you have a set of
products, and you have a set of ratings. You take the
users, the products, and the ratings, and you match them
to recommend a new product.
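The users, products, and ratings setup can be sketched with a small user-based collaborative filter in Python. This is a simpler technique than the matrix factorization and WALS methods named earlier and is used here only for illustration; the users, books, and ratings are invented.

```python
from math import sqrt

RATINGS = {  # user -> {product: rating}
    "ann": {"book_a": 5, "book_b": 4, "book_c": 1},
    "bob": {"book_a": 5, "book_b": 5, "book_d": 4},
    "cat": {"book_c": 5, "book_d": 1},
}


def similarity(a, b):
    """Cosine similarity over the products both users rated."""
    shared = set(RATINGS[a]) & set(RATINGS[b])
    if not shared:
        return 0.0
    dot = sum(RATINGS[a][p] * RATINGS[b][p] for p in shared)
    norm_a = sqrt(sum(RATINGS[a][p] ** 2 for p in shared))
    norm_b = sqrt(sum(RATINGS[b][p] ** 2 for p in shared))
    return dot / (norm_a * norm_b)


def recommend(user):
    """Rank unseen products by similarity-weighted ratings of other users."""
    scores = {}
    for other in RATINGS:
        if other == user:
            continue
        weight = similarity(user, other)
        for product, rating in RATINGS[other].items():
            if product not in RATINGS[user]:
                scores[product] = scores.get(product, 0.0) + weight * rating
    return sorted(scores, key=scores.get, reverse=True)


suggestions = recommend("ann")  # ["book_d"], the only book ann has not rated
```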
There are some very interesting variations of that. One
that I have seen is where you do editorial
recommendations. So take a big publishing house as your
customer. They have editors, and they want to know how
an editor should recommend content to the users.
So that's a completely different model. There you don't
have the users and products with ratings in between.
Your user is the editor and the products are all your
content, so there can be slightly different variations.
The same algorithm cannot be used there. All I'm saying
is that recommendations can have specific, separate
specializations.
Bobby:
You can see how that can be very powerful. It blows the
mind, the possibilities on that front.
Chanchal:
The second one is very interesting because you are
creating trends, right? Like in journalism, you could
create a new trend.
Bobby:
That's pretty wild. Let's talk about one thing I think
is near and dear to your heart: model explainability.
What is that? And how can we use it?
Chanchal:
So in the banking sector, model explainability is very,
very useful in many areas, like credit lending,
transaction fraud, anti-money laundering, and so on. But
a key thing to understand is that model explainability
is not a monolithic thing. You have multiple audiences.
You have the developers as an audience, for whom model
explainability is mostly about how to build better
models. So we look at the model, look at its output and
what features went into it, so that we can enhance those
features, build another model, and iterate through that
process.
The second audience for model explainability is the
consumers. You go to the bank and you ask for a loan of
250K and you only get 200K, or you get denied. So the
person says, why did my loan get denied?
You cannot just say it's because the model came up with
that. You have to give the reasons: your income wasn't
sufficient, or you have too many loans, or whatever.
And you have to come up with that from the model itself.
The third audience is regulators. Regulators want model
explainability so that they can see that you're
compliant with regulations. And the fourth is
management. I'll only address the first three.
Each of them has different goals. Engineers want to
produce better models.
Consumers want to see better analysis. So: engineers,
models; consumers, analysis; regulators, reports. Each
of them has a different end result.
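For the consumer audience, "coming up with the reasons from the model itself" can be illustrated with a linear scoring model, where each feature's contribution relative to a reference applicant is easy to read off. This Python sketch is an assumption-laden toy: the weights, bias, reference applicant, and feature names are all invented, and real credit models and reason-code methods are more involved.

```python
WEIGHTS = {"income_k": 0.02, "existing_loans": -0.8, "years_employed": 0.1}
BIAS = -1.0
# Hypothetical reference applicant used as the comparison baseline.
REFERENCE = {"income_k": 100, "existing_loans": 1, "years_employed": 5}


def score(applicant):
    """Linear model score; zero or above means the loan is approved."""
    return BIAS + sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)


def reasons_for_denial(applicant, top_n=2):
    """Features that pull the score down most versus the reference."""
    deltas = {
        name: WEIGHTS[name] * (applicant[name] - REFERENCE[name])
        for name in WEIGHTS
    }
    negative = [name for name in deltas if deltas[name] < 0]
    return sorted(negative, key=deltas.get)[:top_n]


def decide(applicant):
    """Return (approved, reasons); reasons are empty when approved."""
    approved = score(applicant) >= 0
    reasons = [] if approved else reasons_for_denial(applicant)
    return approved, reasons


approved, reasons = decide(
    {"income_k": 60, "existing_loans": 3, "years_employed": 2}
)
# approved is False; reasons == ["existing_loans", "income_k"]
```

The same contributions could feed a developer's feature analysis or a regulator's report, which is the sense in which one model serves several audiences.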
Bobby:
Using the same basic principle. So as you look forward
into 2020, and you're spending all this time in
financial services and talking about these different
things, which of these types of applications do you
think are still in their early days, where there's going
to be significant uptake in the year ahead, more than
others? Do any jump out at you as getting more traction,
for whatever reason? Yes, that's right.
Chanchal:
Yeah. So I think that a lot of the use cases we talked
about will be very, very useful in financial services
and in other services too.
I view risk or margin prediction as going to be a huge
topic of discussion, because it's a revenue generator.
How do you appropriately choose the margin in any type
of lending situation? I think that's a big topic of
discussion. If you can predict how a stock is going to
do, you can charge an appropriate amount of margin.
And if you can explain that well to your customer, I
think you can make a lot of money. And then again, even
for things like hedge funds and others, if you could
predict which stocks are going to do better, then that
will be big. Yeah. And who to invest in, right. Stock
prediction is not just an individual matter.
It's also a societal matter, right? I mean, who knew
Snapchat would be so big, right?
So understanding the underpinnings of society that led
to some of these successes is also a very difficult
machine learning problem to solve.
Bobby:
It's interesting. So if you think about the financial
services world and all of these different use cases
within it, take an extreme example. Let's say that next
year was a super down economy year and the market drops
20%, or there's an even bigger crash. I'm just saying.
Chanchal:
We'd have a good year for short trading. If you could
predict that accurately, you could still make a lot of
money in hard-to-borrow securities, by finding which
stocks will go down the most.
Bobby:
Yeah. I mean, famously, I think they made some Hollywood
movies out of people doing exactly that, like The Big
Short, about 2008, for those reasons. So who knows,
maybe there's an opportunity there. I feel like I've
borrowed you for a long time, and I really appreciate
it. This has been an absolute pleasure.
Thank you so very much. Thank you very much. I
appreciate it.