You and your colleagues have recently published an article describing an algorithm for banks that can identify irresponsible clients based on their spending patterns. Can you tell us about this project?

It’s part of a bigger study conducted by the School of Translational Information Technologies within a Russian Science Foundation grant project and in partnership with Bank St. Petersburg.

Generally, credit scoring has quite a history. Italian bankers performed it as early as the Middle Ages when trying to decide whom they could loan money to. To a certain extent, we can say that these early practices were the precursor of contemporary statistics, which in turn led to machine learning.

The bank gave us depersonalized data on clients who had already received their loans. Unfortunately, not all of those clients paid the money back, even though the bank’s scoring system had considered them trustworthy. So the bank’s representatives were naturally interested in why this happened.

How can machine learning help us answer this question?

We were given data on the categories of products purchased with the credit card and the amount of money spent in each category.

Our first hypothesis was that the clients who default on their loans can be divided into two categories. Roughly, the first category contained irresponsible clients who knew from the start that they weren’t going to pay the money back, or who didn’t think about returning it, hoping it would all somehow sort itself out later. The second category contained clients who would have otherwise paid the loan back but couldn’t fulfill the agreement because of unexpected challenges that affected their financial state.

The word “unexpected” is key here because it refers to an event that was hard to predict; the scoring system is therefore not to blame for approving these clients for loans, whereas those in the first category should have been rejected. Crucially, we didn’t know for sure which clients were irresponsible and which had fallen victim to random financial challenges.

If we can apply machine learning to single out the clients who borrow money carelessly or outright irresponsibly, then the scoring system will become more efficient.

Illustration by Dmitry Lisovsky, ITMO.NEWS

How can we achieve that? 

Here we make our second hypothesis, which is based on how we divided those who defaulted on their loans. We suggest that clients who default out of ill will behave least like the clients who repay their debts. In other words, the financial behavior of those who took a loan but couldn’t return it due to an unexpected crisis will resemble the spending structure of people who returned their loans to the bank.

Armed with this hypothesis, we developed an algorithm based on an autoencoder, a type of neural network that produces low-dimensional representations of objects while preserving their most significant characteristics. To put it simply, at the start we have the accumulated spending of the bank’s clients across 13 categories. The autoencoder then has to compress this data into a two-dimensional representation of what it received as input.

This representation should allow it to correctly restore the original input. During training, we also added a regularizer for this internal representation, so that the internal representations of all responsible clients would be as similar as possible. They were meant to blend together into one stereotypical responsible client.
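
To make the setup concrete, here is a minimal sketch of the idea in PyTorch. Everything in it – the layer sizes, the loss weight, the variable names – is an illustrative assumption rather than the published architecture: an autoencoder compresses 13 spending categories into a two-dimensional code, and an extra penalty pulls the codes of responsible clients toward their common center.

```python
# Minimal sketch (assumed architecture, not the published one): an autoencoder
# maps 13 spending categories to a 2D code and back, and a regularizer pulls
# the codes of responsible clients toward one "stereotypical" center.
import torch
import torch.nn as nn

class SpendingAutoencoder(nn.Module):
    def __init__(self, n_categories: int = 13, latent_dim: int = 2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_categories, 8), nn.ReLU(),
            nn.Linear(8, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 8), nn.ReLU(),
            nn.Linear(8, n_categories),
        )

    def forward(self, x):
        z = self.encoder(x)          # 2D internal representation
        return self.decoder(z), z    # reconstruction and the code itself

def loss_fn(model, x, is_responsible, alpha=0.1):
    """Reconstruction loss for everyone, plus a 'blend together' penalty
    applied only to the codes of responsible clients."""
    x_hat, z = model(x)
    recon = nn.functional.mse_loss(x_hat, x)
    z_good = z[is_responsible]                 # codes of responsible clients
    center = z_good.mean(dim=0, keepdim=True)  # their current centroid
    pull = ((z_good - center) ** 2).mean()     # variance penalty
    return recon + alpha * pull
```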

There was a risk that the data of problematic clients would be indistinguishable from that of responsible ones and no clear pattern would emerge, but in fact the pattern is there. As expected, unintentional debtors are harder to distinguish than malicious ones.

The regularization didn’t take into account those who didn’t pay back their loans. Thanks to that, we could tell apart the clients who had fallen victim to unexpected challenges and those who defaulted on their loans “intentionally,” so to say. Having filtered out the “random” debtors, the scoring system can now better detect those who won’t pay back their loans, which supports both of our hypotheses.
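
In code, that separation step might look like the hedged sketch below, which continues the assumptions of the earlier example. Defaulters whose codes sit far from the responsible-client centroid look malicious; those close to it look like victims of circumstance. The distance measure and the threshold are placeholders for illustration.

```python
# Hedged sketch: classify defaulters by how far their 2D codes sit from the
# centroid of responsible clients' codes. Threshold and metric are assumed.
import torch

@torch.no_grad()
def label_defaulters(model, x_responsible, x_defaulters, threshold=1.0):
    center = model.encoder(x_responsible).mean(dim=0)
    dist = torch.linalg.norm(model.encoder(x_defaulters) - center, dim=1)
    return {
        "likely_malicious": dist > threshold,       # behaves unlike repayers
        "likely_unintentional": dist <= threshold,  # behaves like repayers
    }
```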

Which patterns did the algorithm identify? What were the links between spending and a client’s creditworthiness?

Unfortunately, we don’t know the exact answer to this question. The autoencoder builds representations of data that are hard to interpret, so it’s challenging for us to understand which characteristics affect the classification. Nevertheless, the classification works, and with its help we can separate malicious debtors from responsible clients.

Illustration by Dmitry Lisovsky, ITMO.NEWS

But if the algorithm can’t answer the question of which spending pattern is more suspicious – huge car maintenance bills or regular lunches at restaurants – then how can banks trust it?

That’s a wonderful question! Generally, the problem of interpretation arises when people try to implement AI in a field where they don’t yet trust it. People want to be in control; they want to see what this “strange machine” will do and where it will try to trick them.

I think this problem could be overcome if people saw that AI works well, is stable, and doesn’t fail often; they would then realize that they don’t need to understand every step the computer takes. The main thing is that it works. This won’t get rid of all the questions, but it will help us come to terms with them. We still don’t know everything about how our brains work, yet we tend to trust their conclusions.

Apart from that, it is possible to make the representations more interpretable; we just didn’t aim for that in our project.
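
For illustration only – the project didn’t pursue this – one common, model-agnostic way to probe such a representation is permutation importance: shuffle each of the 13 spending categories in turn and measure how much the model’s score shifts. The sketch below continues the assumed names from the earlier examples.

```python
# Illustrative only (not done in the project): permutation importance over the
# 13 spending categories, measured through the distance-to-centroid score.
import torch

@torch.no_grad()
def category_importance(model, x, center):
    base = torch.linalg.norm(model.encoder(x) - center, dim=1).mean()
    scores = []
    for j in range(x.shape[1]):
        x_perm = x.clone()
        x_perm[:, j] = x[torch.randperm(len(x)), j]  # shuffle one category
        d = torch.linalg.norm(model.encoder(x_perm) - center, dim=1).mean()
        scores.append((d - base).abs().item())  # higher = more influential
    return scores
```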

Can this system be fooled? 

Of course. For instance, we worked with the data of clients who were already approved for loans. There was a category for cash withdrawals, and we don’t know what the clients spent this cash on. Luckily, there weren’t many such cases, as people preferred to pay by card. Still, if the system is implemented in the future, it can also account for that.

How can banks implement the system? 

Our hypothesis was based only on the behavior of clients who were already approved for loans. We had no information on how they spent money from other accounts, the size of their salary, and so on. It was important for us to help the bank tell responsible clients apart from malicious ones and potentially separate them using machine learning.

But you don’t have to work only with credit card data or only with the 13 categories that we had. The same can be done for different types of data. What’s important here is the concept that allows us to separate responsible clients, random debtors, and malicious clients. Based on this, we can already improve the systems used at banks, especially given that banks do collect information on a client’s spending patterns after issuing loans.

Naturally, our study doesn’t guarantee that the categories can always be easily identified. We could train our neural network and draw the right conclusions only for a particular sample of Bank St. Petersburg’s clients. But if the algorithm also works on other data, it will be easier for banks to identify irresponsible clients at the loan approval stage. It is also possible to create a system that analyses the behavior of clients who were already approved for credit cards. Then, if the system spots something suspicious, it can warn the client that unless they change their behavior, they risk defaulting on their loan.
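
That monitoring idea could be sketched roughly as follows, again under the assumed names from the earlier examples; the drift threshold and the warning logic are hypothetical.

```python
# Speculative sketch: periodically re-encode a client's recent spending and
# warn them if their code drifts away from the responsible-client centroid.
import torch

@torch.no_grad()
def monthly_check(model, client_spending, center, warn_threshold=1.0):
    z = model.encoder(client_spending.unsqueeze(0)).squeeze(0)
    drift = torch.linalg.norm(z - center).item()
    if drift > warn_threshold:
        return "warning: spending pattern drifting toward default risk"
    return "ok"
```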

Illustration by Dmitry Lisovsky, ITMO.NEWS

This sounds a little intrusive. We often hear that corporations spy on people. Wouldn’t the neural network be a kind of warden that “punishes” people for buying one croissant too many?

That’s a great question! Let me try and defend the algorithm. It is designed so that the bank issues fewer loans that are doomed to default from the start and manages the ones already issued at least a little better. It benefits everyone, because when the bank doesn’t receive payments on loans, the cost falls on the shoulders of responsible clients, who then pay higher interest rates on their own loans.

Thus, by lowering the number of unpaid loans, we are indirectly helping to lower interest rates. This means that the collected information turns into a benefit for society in general. That’s one part of the problem.

The second part has to do with the fact that the system can be wrong, and a client can indeed be warned for buying a croissant with their credit card. But this can be solved with a large amount of data: the more information a neural network has, the better it finds connections and the less it gets confused by our metaphorical croissants.