Keeping Machine Learning Algorithms Humble and Honest

Algorithms need to be fully transparent in their decisions, easily validated, and monitored by a human expert.


Today, in industries from manufacturing and life sciences to financial services and retail, we rely on algorithms to conduct large-scale machine learning analyses. They are hugely useful for problem-solving and for augmenting human expertise within an organization. But they are now under the spotlight for many reasons, and regulation is on the horizon. Gartner projects that four of the G7 countries will establish dedicated associations to oversee artificial intelligence and ML design by 2023. It remains vital that we understand algorithms’ reasoning and decision-making processes at every step.

Algorithms need to be fully transparent in their decisions, easily validated, and monitored by a human expert. Machine learning tools must introduce this full accountability to evolve beyond unexplainable “black box” solutions and eliminate the easy excuse of “the algorithm made me do it!”

Put Bias in Its Place

Bias can be introduced into the machine learning process as early as the initial data upload and review stages. There are hundreds of parameters to take into consideration during data preparation, so it can often be challenging to strike a balance between removing bias and retaining useful data.

Gender, for example, might be a useful parameter when looking to identify specific disease risks or health threats. But using gender in many other scenarios is utterly unacceptable if it risks introducing bias and, in turn, discrimination. Machine learning models will inevitably exploit any parameters — such as gender — in data sets to which they have access. Users must understand the steps taken for a model to reach a specific conclusion.
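
One simple way to see whether outcomes in a data set already differ by a sensitive parameter is to compare the rate of positive outcomes per group before any model is trained. The sketch below is only an illustration: the records, the "gender" and "approved" field names, and the helper functions are all hypothetical, and a gap between groups is a prompt for human review rather than proof of discrimination.

```python
from collections import defaultdict


def positive_rate_by_group(records, group_key, label_key):
    """Return the share of positive labels for each value of group_key."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in records:
        group = row[group_key]
        totals[group] += 1
        if row[label_key]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(rates):
    """Difference between the highest and lowest group rate."""
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: 4 records per group.
records = [
    {"gender": "F", "approved": 1},
    {"gender": "F", "approved": 0},
    {"gender": "F", "approved": 0},
    {"gender": "F", "approved": 0},
    {"gender": "M", "approved": 1},
    {"gender": "M", "approved": 1},
    {"gender": "M", "approved": 1},
    {"gender": "M", "approved": 0},
]

rates = positive_rate_by_group(records, "gender", "approved")
gap = parity_gap(rates)
print(rates, gap)
```

A large gap suggests that a model given access to this parameter, or to anything correlated with it, could learn to reproduce the imbalance.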

Lifting the Curtain

Removing the complexity of the data science procedure will help users discover and address bias faster — and better understand the expected accuracy and outcomes of deploying a particular model.

Machine learning tools with built-in explainability allow users to demonstrate the reasoning behind applying ML to tackle a specific problem and ultimately justify the outcome. The first steps toward this explainability would be features in the ML tool to enable the visual inspection of data — with the platform alerting users to potential bias during preparation — and metrics on model accuracy and health. Those would include the ability to visualize what the model is doing.

Beyond this, ML platforms can take transparency further by introducing full user visibility, tracking each step through a consistent audit trail. Such tracking records how and when data sets have been imported, prepared, and manipulated. It also helps ensure compliance with national and industry regulations — such as the European Union’s GDPR "right to explanation" clause — and helps effectively demonstrate transparency to consumers.
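
As a rough illustration of what such an audit trail could look like, the sketch below records each step with a timestamp and chains the entries together with hashes so that later edits to the log can be detected. The AuditTrail class, its method names, and the example steps are assumptions made for illustration, not the design of any particular ML platform.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(body):
    """Stable SHA-256 hash of a JSON-serializable dict."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AuditTrail:
    """Hypothetical append-only log of data preparation steps."""

    def __init__(self):
        self.entries = []

    def record(self, step, details, timestamp=None):
        """Append one step; each entry stores the hash of the previous one."""
        if timestamp is None:
            timestamp = datetime.now(timezone.utc).isoformat()
        prev_hash = self.entries[-1]["hash"] if self.entries else ""
        body = {
            "step": step,
            "details": details,
            "timestamp": timestamp,
            "prev_hash": prev_hash,
        }
        entry = dict(body, hash=_digest(body))
        self.entries.append(entry)
        return entry

    def verify(self):
        """Return True if no entry has been altered, removed, or reordered."""
        prev_hash = ""
        for entry in self.entries:
            body = {k: entry[k] for k in ("step", "details", "timestamp", "prev_hash")}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True


trail = AuditTrail()
trail.record("import", {"source": "patients.csv", "rows": 1200})
trail.record("prepare", {"action": "drop_missing", "rows": 1150})
print(trail.verify())
```

Because every entry depends on the one before it, changing an earlier record after the fact makes verification fail, which is the property an auditor or regulator would want to rely on.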

A further advantage is that users can quickly replicate the same preparation and deployment steps, guaranteeing the same results from the same data, which is particularly vital for saving time on repetitive tasks. We find, for example, that users in life sciences are particularly keen on replicability and visibility in ML, where both become essential in areas such as clinical trials and drug discovery.
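
Replicability of this kind can be sketched as a saved "recipe" of named preparation steps that is replayed against the raw data, with a fingerprint of the output used to confirm that two runs match. The step names, the sample rows, and the replay and fingerprint helpers below are hypothetical and assume deterministic steps.

```python
import hashlib
import json

# Hypothetical registry of deterministic preparation steps.
STEPS = {
    "drop_missing": lambda rows: [
        r for r in rows if all(v is not None for v in r.values())
    ],
    "lowercase_site": lambda rows: [dict(r, site=r["site"].lower()) for r in rows],
}


def replay(rows, recipe):
    """Apply the named steps in order, leaving the input rows untouched."""
    for name in recipe:
        rows = STEPS[name](rows)
    return rows


def fingerprint(rows):
    """Hash of the prepared data, used to compare two runs."""
    return hashlib.sha256(json.dumps(rows, sort_keys=True).encode("utf-8")).hexdigest()


raw = [
    {"site": "Oxford", "dose": 1.5},
    {"site": "LEEDS", "dose": None},
]
recipe = ["drop_missing", "lowercase_site"]

first_run = replay(raw, recipe)
second_run = replay(raw, recipe)
print(fingerprint(first_run) == fingerprint(second_run))
```

Storing the recipe alongside the fingerprint lets a second user rerun the preparation later and check that the result is identical.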

Model Accountability

There are so many different model types that it can be a challenge to select and deploy the best model for a task. Deep neural network models, for example, are inherently less transparent than probabilistic methods, which typically operate in a more “honest” and transparent manner.

Here’s where many machine learning tools fall short: They are fully automated, offering no opportunity to review and select the most appropriate model. That fact may help users rapidly prepare data and deploy a machine learning model, but it provides little to no prospect of visual inspection to identify data and model issues.

An effective ML platform must be able to help identify and advise on resolving possible bias in a model during the preparation stage. Then it needs to provide support through to creation. In creation, it can visualize what the chosen model is doing and provide accuracy metrics. Then, in deployment, it can evaluate model certainty and provide alerts when a model requires retraining.
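
One deliberately simple way to sketch the deployment-stage check is to track how confident a classifier is across recent predictions and raise an alert when the average drops below a chosen level. The 0.7 threshold, the function names, and the probability values are assumptions for illustration; a real platform would likely use richer certainty and drift measures.

```python
def mean_confidence(probabilities):
    """Average of the highest class probability across predictions."""
    return sum(max(p) for p in probabilities) / len(probabilities)


def needs_retraining(probabilities, threshold=0.7):
    """Flag the model when its average confidence falls below threshold."""
    return mean_confidence(probabilities) < threshold


# Hypothetical class probabilities from two batches of recent predictions.
confident_batch = [[0.9, 0.1], [0.8, 0.2], [0.15, 0.85]]
uncertain_batch = [[0.55, 0.45], [0.6, 0.4], [0.48, 0.52]]

print(needs_retraining(confident_batch))
print(needs_retraining(uncertain_batch))
```

Average confidence is only a proxy: a model can be confidently wrong, so an alert like this should complement accuracy checks on labeled data, not replace them.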

Testing Procedures

To build greater visibility into data preparation and model deployment, we should look toward ML platforms that incorporate testing features. Users should be able to test a model on a new data set and receive scores for its performance. Those scores help identify bias and guide changes to the model.
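
A minimal sketch of such a test is to score the model's predictions on a new data set separately for each group, so that a model that performs well overall but poorly for one group stands out. The labels, predictions, group names, and the accuracy_by_group helper below are hypothetical.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return the fraction of correct predictions for each group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        if truth == pred:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Hypothetical test set: true labels, model predictions, and group membership.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
groups = ["A", "A", "A", "B", "B", "B"]

scores = accuracy_by_group(y_true, y_pred, groups)
print(scores)
```

Here the model is correct on every record in group A but on only one of three in group B, a disparity that a single overall accuracy figure would hide.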

During model deployment, the most effective platforms will also extract extra features from data that are otherwise difficult to identify and help the user understand what is going on with the data at a granular level, beyond the most apparent insights.

The goal is to put power directly into the hands of the users, enabling them to actively explore, visualize, and manipulate data at each step, rather than merely delegating to an ML tool and risking the introduction of bias.

Driving the Ethics Debate Forward

Introducing explainability and enhanced governance into ML platforms is an essential step towards ethical machine learning deployments, but we can and should go further.

Researchers and solution vendors hold responsibility as ML educators to inform users of the uses and abuses of bias in machine learning. We need to encourage businesses in this field to set up dedicated education programs on machine learning, including specific modules that cover ethics and bias. Those modules should explain how users can identify these dangers and, in turn, tackle or avoid them outright.

Raising awareness in this manner will be a crucial step towards establishing trust in sensitive AI and ML deployments such as medical diagnoses, financial decision-making, and criminal sentencing.

Break Open the Boxes

AI and machine learning offer truly limitless potential to transform the way we work, learn, and tackle problems across industries. Ensuring these operations are conducted in an open and unbiased manner is paramount to winning and retaining both consumer and corporate trust in these applications.

The goal is truly humble, honest algorithms that work for us, let us make unbiased, categorical predictions, and consistently provide context, explainability, and accuracy insights.

Recent research shows that 84% of CEOs agree that AI-based decisions must be explainable to be trusted. The time is ripe to embrace AI and ML solutions with baked-in transparency.

Davide Zilli is client services director at Mind Foundry, a pioneer in the development and use of "humble and honest" algorithms from the very beginning of its applications development.
