White-box Machine Learning

Introduction

In general, there are two types of machine learning: white-box and black-box. Black-box machine learning does not show its work: an algorithm learns patterns in data, makes decisions based on those patterns, and answers questions without human intervention or oversight, and without revealing how it reached those answers.

White-box models, on the other hand, show the process by which they produce answers to questions, in addition to returning the answers themselves. This allows them to be fine-tuned to reason and return answers according to the user’s needs.

Here, we’ll explain in a bit more detail what white-box machine learning is and how it works. We’ll also outline its pros and cons in contrast to black-box machine learning for fraud detection and prevention applications.

What is White-box Machine Learning?

White-box machine learning is an approach in which the algorithm returns both an answer to the question it was asked and the process by which it reached that answer. A human can then check the rules and factors considered to determine whether both the answer and the logic behind it are correct.

Let’s put this in the context of fraud prevention. Unlike a black-box algorithm, a white-box algorithm won’t just return how likely it thinks a customer is to be a fraudster based on their input ID credentials. It will also illustrate which considerations played key roles (or not) in giving that person a higher or lower risk score.

Was the person trying to log into their account through a VPN? Did their IP address represent somewhere they don’t usually do business? Were their ID credentials valid, and did they all correspond to the same person?

Based on how heavily factors like these were weighed, a financial institution using a white-box setup could fine-tune the system’s rules to more accurately separate the fraudsters from the legitimate customers.
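
As a rough illustration, the idea of a score that comes with its own explanation can be sketched in a few lines of Python. The factor names and point values below are assumptions made up for this example, not values from any real fraud system.

```python
# Hypothetical factor weights; a real system would learn or configure these.
WEIGHTS = {
    "uses_vpn": 25,
    "unusual_ip_location": 30,
    "id_credentials_mismatch": 40,
}

def score_login(signals):
    """Return a risk score plus the points each factor contributed."""
    contributions = {
        name: weight if signals.get(name, False) else 0
        for name, weight in WEIGHTS.items()
    }
    return sum(contributions.values()), contributions

score, contributions = score_login(
    {"uses_vpn": True, "unusual_ip_location": True, "id_credentials_mismatch": False}
)
print(f"risk score: {score}")  # risk score: 55
for name, points in contributions.items():
    print(f"  {name}: +{points}")
```

A black-box version of the same function would return only the number 55. Returning the per-factor breakdown alongside it is what lets a reviewer see that the VPN and the unusual location drove the score, while the ID check did not.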

How Does a White-box Machine Learning Model Work?

A white-box model of machine learning works similarly to other machine learning models, which tend to be black-box. However, it involves a few extra processes because it has to show the reasoning it used to produce a result, not just the result itself. The upside is that a white-box algorithm can be modified quickly and incrementally to fine-tune its performance.

Figure: White-box machine learning model

Step 1: The algorithm is trained with sample data

The first step in any machine learning system is to feed the algorithm large quantities of sample data related to the questions it will be used to answer. This lets the algorithm discover patterns in the data and build rules from them for when to make one judgment rather than another.
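
To make "discovering a rule from sample data" concrete, here is a minimal sketch that learns a single threshold rule from labeled transactions. The data, the `fit_threshold` helper, and the choice of transaction amount as the only feature are all assumptions for illustration; real training uses far more data and many more features.

```python
def fit_threshold(samples):
    """Pick the amount threshold that misclassifies the fewest samples.

    samples: list of (amount, is_fraud) pairs.
    Rule learned: flag a transaction when amount >= threshold.
    """
    best_threshold, best_errors = None, len(samples) + 1
    for candidate in sorted({amount for amount, _ in samples}):
        # Count samples where the candidate rule disagrees with the label.
        errors = sum((amount >= candidate) != is_fraud for amount, is_fraud in samples)
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold, best_errors

# Made-up training data: (transaction amount, was it fraud?)
training_data = [
    (20, False), (45, False), (60, False), (120, False),
    (900, True), (1500, True), (75, False), (2000, True),
]
threshold, errors = fit_threshold(training_data)
print(f"Learned rule: flag when amount >= {threshold} ({errors} training errors)")
```

Because the learned rule is a single readable threshold, a reviewer can inspect it directly, which is the property the later steps rely on.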

Step 2: Real-world data is input to ask a question

Next, the algorithm is asked a question by being given data from a relevant real-world scenario. It then uses the pattern-based rules it built during training to make a series of decisions. These decisions result in an answer to the question the system was asked.

Step 3: The algorithm visualizes its process and provides its answer

In revealing the answer it came up with, the algorithm also provides an explanation of how it got to that conclusion. This involves showing which decisions were made, as well as how heavily the algorithm weighed any relevant factors in making each choice. This is the main way in which white-box machine learning algorithms differ from black-box ones.
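
Steps 2 and 3 can be sketched together: a new case is run through the rules, and the function returns a trace of every rule it checked along with the answer. The `Rule` class, the three rules, and the `DECLINE_AT` cut-off are hypothetical stand-ins for whatever training would actually produce.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    weight: int                      # points added when the rule fires
    applies: Callable[[dict], bool]  # test run against the input case

# Hypothetical rules standing in for what training would produce.
RULES = [
    Rule("amount_over_900", 50, lambda case: case["amount"] >= 900),
    Rule("new_device", 20, lambda case: case["new_device"]),
    Rule("country_mismatch", 30, lambda case: case["ip_country"] != case["card_country"]),
]
DECLINE_AT = 60  # assumed cut-off score

def decide(case):
    """Return the answer, the total score, and a trace of every rule checked."""
    trace, score = [], 0
    for rule in RULES:
        fired = rule.applies(case)
        if fired:
            score += rule.weight
        trace.append({"rule": rule.name, "fired": fired, "weight": rule.weight})
    answer = "decline" if score >= DECLINE_AT else "approve"
    return answer, score, trace

answer, score, trace = decide(
    {"amount": 1200, "new_device": True, "ip_country": "US", "card_country": "US"}
)
print(answer, score)  # decline 70
for step in trace:
    print(step)
```

The trace lists rules that did not fire as well as those that did, so a reviewer can see what was ruled out, not only what contributed.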

Step 4: Users assess the results and make adjustments

Users of a white-box system can now look at the returned process and answer together to determine whether the algorithm is working properly. For example, the algorithm may have come to the correct conclusion but made some incorrect decisions along the way. Or it may have given an incorrect answer because it weighed a particular factor too heavily or too lightly somewhere along the line.

Based on their analysis, users can tweak the rules and factor weights the algorithm follows at will. Then they can return to step 2 and test the algorithm again to see if it produces a different result. Again, this generally can’t be done in a black-box machine learning model.
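
The adjust-and-retest loop might look something like the following sketch. The weights, the cut-off of 50, and both test cases are invented for illustration; the point is only that a reviewer can change one weight and immediately re-run known cases.

```python
def score_case(case, weights):
    """Sum the weights of every factor present in the case."""
    return sum(weight for factor, weight in weights.items() if case.get(factor))

def decide(case, weights, decline_at=50):
    """Decline when the total score reaches the (assumed) cut-off."""
    return "decline" if score_case(case, weights) >= decline_at else "approve"

# A known-legitimate customer who was travelling and used a VPN.
legit_case = {"uses_vpn": True, "unusual_ip_location": True}

weights = {"uses_vpn": 30, "unusual_ip_location": 30, "id_credentials_mismatch": 40}
before = decide(legit_case, weights)  # VPN use is weighed too heavily here

# The reviewer lowers the VPN weight after seeing it drove the false positive.
tuned = {**weights, "uses_vpn": 10}
after = decide(legit_case, tuned)

# Re-test a case with mismatched ID credentials to confirm it is still declined.
fraud_case = {"unusual_ip_location": True, "id_credentials_mismatch": True}
still_caught = decide(fraud_case, tuned)

print(before, after, still_caught)  # decline approve decline
```

Re-running a known fraud case after the change is a simple guard against fixing one false positive by opening a gap elsewhere.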

Black-box vs. White-box Machine Learning

White-box machine learning models are often contrasted with black-box models. The latter return answers to questions they are asked, but do not return any insights into what decisions were made in order to come up with those answers.

Figure: Black-box vs. white-box machine learning model processes compared

Hence, the main difference in black-box vs. white-box machine learning is transparency. White-box algorithms are transparent about how they work, while black-box algorithms are not. There are pros and cons to each approach.

| White-box Machine Learning | Black-box Machine Learning |
| --- | --- |
| Provides a visualization of the decisions made in order to arrive at a conclusion | Provides a conclusion only, with no insight into how it was arrived at |
| Allows humans to verify that both the answer an algorithm arrives at and the logic used to get there are correct | Provides an answer only; does not allow humans to check the logic to see if the correct decisions were made or the correct answer was arrived at |
| Makes it easier to test and configure an algorithm on the fly by showing which decisions are made, and when | Requires more guesswork regarding which rules and parameters to change in order to achieve a desired result, as the algorithm’s process isn’t shown |
| Works slower, as the algorithm has to not only compute a result but also visualize the process by which it arrived at that result | Works faster, as the algorithm only has to output answers without also explaining its processes |
| Requires more human supervision and intervention to adjust rules and parameters on the fly | Works unsupervised, so it can produce conclusions faster without needing humans to tweak or approve rules |
| Tends to be better at detecting historical patterns because of its tightly curated rulesets | Tends to be better at detecting new or unusual patterns because of its unsupervised nature |

The Benefits of a White-box Machine Learning Model

White-box models of machine learning tend to perform better than black-box ones in terms of detecting and preventing fraud. That’s because there isn’t a universal method for determining whether or not a customer or transaction is legitimate.

There are a lot of variables involved in detecting fraud, and they differ depending on the type of financial institution, where in the world it operates, and how it works. Because a white-box model shows how it determines a customer’s or transaction’s risk level, rather than just what that risk level is, it can be customized to fit an institution’s specific use cases.

For example, white-box algorithms can:

  • Be trained on an institution’s historical data to learn how cases specific to that organization were handled and resolved
  • Be given pre-set rules and decision trees to follow, based on the institution’s experience and needs
  • Help with manual reviews by showing how factors were weighed, and how decisions were made, in assigning fraud scores to transactions
  • Be tested and adjusted on the fly to produce more accurate results by tweaking the rules and factor weights used in choosing to approve or decline a transaction
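
As one illustration of the pre-set rules and decision trees mentioned in the list above, the sketch below walks a small hand-written tree and records the path it took. The questions, outcomes, and tree shape are hypothetical; an institution would define its own.

```python
# Each internal node asks one yes/no question; leaves hold the outcome.
TREE = {
    "question": "id_verified",
    "no": "decline",
    "yes": {
        "question": "amount_over_limit",
        "no": "approve",
        "yes": {
            "question": "known_device",
            "yes": "approve",
            "no": "manual_review",
        },
    },
}

def walk(tree, case):
    """Follow the tree for one case, recording each question and answer."""
    path, node = [], tree
    while isinstance(node, dict):
        branch = "yes" if case[node["question"]] else "no"
        path.append((node["question"], branch))
        node = node[branch]
    return node, path

outcome, path = walk(
    TREE, {"id_verified": True, "amount_over_limit": True, "known_device": False}
)
print(outcome)  # manual_review
for question, branch in path:
    print(f"  {question}? {branch}")
```

The recorded path is what a manual reviewer would read: it shows that the transaction reached review because of an unknown device on a large amount, not because the ID check failed.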

Leverage White-box Machine Learning for Better Fraud Prevention

Financial fraud comes in many different forms; there isn’t a one-size-fits-all approach to detecting and preventing it. So when using machine learning for these purposes, it’s usually an advantage to use a transparent white-box setup. This lets you check whether the algorithm is accurately detecting the specific kinds of threats your organization wants to block, and adjust it quickly if it isn’t.

Unit21’s platform takes white-box anti-fraud/AML machine learning to the next level with a feature called Alert Scoring. It trains on an organization’s data – both the initial data feed, and passively as new alerts are encountered – to determine on a scale of 0 to 100 how likely each alert is to be a true positive.

Machine learning empowers the system to learn based on previous alert dispositions and behaviors for more precise scoring. This allows an organization to prioritize alerts that are the most likely to require further investigation, while reducing time wasted on alerts that are likely to be false positives.
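
The general idea of learning from past dispositions to rank alerts can be sketched as follows. This is not Unit21’s implementation: the scoring method (the historical true-positive rate per rule, scaled to 0 to 100), the rule names, the fallback rate for unseen rules, and the data are all assumptions chosen to keep the example small.

```python
from collections import defaultdict

def learn_rates(past_alerts):
    """Share of past alerts per rule that were dispositioned as true positives."""
    counts = defaultdict(lambda: [0, 0])  # rule -> [true positives, total]
    for rule, was_true_positive in past_alerts:
        counts[rule][0] += int(was_true_positive)
        counts[rule][1] += 1
    return {rule: tp / total for rule, (tp, total) in counts.items()}

def prioritize(open_alerts, rates, default_rate=0.5):
    """Attach a 0-100 score to each alert and sort highest first."""
    scored = [
        (round(100 * rates.get(rule, default_rate)), alert_id, rule)
        for alert_id, rule in open_alerts
    ]
    return sorted(scored, reverse=True)

# Made-up history: (rule that raised the alert, was it a true positive?)
history = [
    ("structuring", True), ("structuring", True), ("structuring", False), ("structuring", True),
    ("velocity", False), ("velocity", False), ("velocity", True), ("velocity", False),
]
queue = prioritize(
    [("A-1", "velocity"), ("A-2", "structuring"), ("A-3", "new_rule")],
    learn_rates(history),
)
for score, alert_id, rule in queue:
    print(f"{alert_id} ({rule}): {score}")
```

With this made-up history, the structuring alert is ranked first at 75, the alert from an unseen rule falls back to 50, and the velocity alert comes last at 25, so an analyst working from the top of the queue sees the likeliest true positives first.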

To learn more about how Unit21 uses white-box machine learning to combat fraud, schedule a demo with us today.