A new technique called monotonic selective risk could be deployed to reduce the error rate for underrepresented groups in AI models.
As AI is used in more high-stakes decisions, ensuring those decisions are as fair and accurate as possible, free of inherent bias, has become a pursuit of several academic and corporate bodies.
Researchers at MIT and the MIT-IBM Watson AI Lab have published a new paper cautioning against the use of selective regression in some scenarios, as the technique can reduce a model's overall performance for groups that are underrepresented in a dataset.
These underrepresented groups tend to be women and people of color, and this failure to account for them has led to several reports of AI being racist and sexist. In one account, an AI used for risk assessment wrongly flagged black prisoners at twice the rate of white prisoners. In another, pictures of people without any context were identified as doctors at a higher rate when they showed men, and as homemakers at a higher rate when they showed women.
With selective regression, an AI model can make one of two choices for each input: make a prediction or abstain. The model only predicts when it is confident in the decision, which, in several tests, has improved model performance by filtering out inputs that cannot be properly assessed.
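The predict-or-abstain mechanic can be sketched in a few lines. The bootstrap ensemble, the variance-based confidence rule, and the threshold below are all illustrative assumptions, not the specific method evaluated in the paper:

```python
# Minimal sketch of selective regression: predict only when confidence is high.
# The ensemble-disagreement confidence measure and the threshold tau are
# illustrative assumptions, not the method from the MIT paper.

import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data: y = 2x + noise
X = rng.uniform(-1, 1, size=200)
y = 2 * X + rng.normal(0, 0.1, size=200)

# Fit an ensemble of linear models on bootstrap samples;
# disagreement among them serves as an uncertainty estimate.
coefs = []
for _ in range(10):
    idx = rng.integers(0, len(X), len(X))
    a, b = np.polyfit(X[idx], y[idx], 1)
    coefs.append((a, b))

def selective_predict(x, tau=0.05):
    """Return a prediction if the ensemble's std is below tau, else abstain (None)."""
    preds = np.array([a * x + b for a, b in coefs])
    if preds.std() > tau:
        return None          # abstain: hand this input off to a human
    return preds.mean()
```

Lowering `tau` makes the model abstain more often (lower coverage); raising it makes the model answer more inputs at the cost of less reliable predictions.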
However, removing inputs in this way can amplify biases that already exist in the dataset. This can lead to further inaccuracies for underrepresented groups once the AI model is deployed in real life, where it cannot remove or reject members of those groups the way it did during development.
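A toy example with made-up numbers shows the failure mode: abstaining on low-confidence inputs improves the overall error, yet the error on the minority group gets worse, because the confidence scores are poorly calibrated for that group:

```python
# Hypothetical numbers illustrating bias amplification under abstention.
# (group, squared_error, confidence) for six examples
examples = [
    ("majority", 0.01, 0.9),
    ("majority", 0.04, 0.8),
    ("majority", 0.25, 0.2),   # inaccurate and low confidence: correctly dropped
    ("minority", 0.04, 0.3),   # accurate, but the model is unsure -> dropped
    ("minority", 0.36, 0.7),   # inaccurate, yet the model is confident -> kept
    ("minority", 0.25, 0.1),
]

def mean_error(rows):
    return sum(e for _, e, _ in rows) / len(rows)

kept = [r for r in examples if r[2] >= 0.5]   # abstain below 0.5 confidence

overall_before = mean_error(examples)
overall_after = mean_error(kept)
minority_before = mean_error([r for r in examples if r[0] == "minority"])
minority_after = mean_error([r for r in kept if r[0] == "minority"])

# Overall error improves (0.158 -> 0.137), but the minority group's
# error worsens (0.217 -> 0.360): the averaged metric hides the harm.
```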
“Ultimately, this is about being more intelligent about which samples you hand off to a human to deal with. Rather than just minimizing some broad error rate for the model, we want to make sure the error rate across groups is taken into account in a smart way,” said senior MIT author Greg Wornell, the Sumitomo Professor in Engineering in the Department of Electrical Engineering and Computer Science (EECS).
The MIT researchers introduced a new technique, called monotonic selective risk, that aims to improve the model's performance for every subgroup. Rather than relying on abstention alone, the approach trains two models: one includes sensitive attributes such as race and sex, while the other does not. The two models make decisions in tandem, and the model without the sensitive data is used to calibrate for biases in the dataset.
“It was challenging to come up with the right notion of fairness for this particular problem. But by enforcing this criterion, monotonic selective risk, we can make sure the model performance is actually getting better across all subgroups when you reduce the coverage,” said Abhin Shah, an EECS graduate student.
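The criterion itself can be stated simply: as coverage goes down, the risk (error rate) of every subgroup should not go up. The sketch below checks that property on synthetic data using an idealized, label-aware confidence score, a deliberate simplification, since real selective models must estimate confidence without the labels; it is a check of the criterion, not the training procedure from the paper:

```python
# Sketch of the monotonic selective risk criterion: when a selective model
# lowers its coverage (abstains more), the error for *every* subgroup should
# not increase. Data and the confidence score are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)

n = 1000
group = rng.integers(0, 2, n)                     # 0 = majority, 1 = minority
y = rng.normal(0, 1, n)
pred = y + rng.normal(0, 0.2, n) * (1 + group)    # noisier on the minority group
conf = -np.abs(pred - y)   # idealized confidence (uses labels, for illustration)

def group_risk_at_coverage(coverage, g):
    """Mean squared error on group g among the most-confident `coverage` fraction."""
    k = int(coverage * n)
    keep = np.argsort(conf)[-k:]                  # indices of most-confident inputs
    mask = group[keep] == g
    return np.mean((pred[keep][mask] - y[keep][mask]) ** 2)

# Monotonicity check: each group's risk at 50% coverage <= its risk at full coverage
for g in (0, 1):
    assert group_risk_at_coverage(0.5, g) <= group_risk_at_coverage(1.0, g)
```

With a miscalibrated confidence score, this per-group check can fail even while the overall error improves, which is exactly the situation the criterion is designed to rule out.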
When tested on a medical insurance dataset and a crime dataset, the new technique lowered the error rate for underrepresented groups without significantly impacting the model's overall performance. The researchers intend to apply the technique to new applications, such as predicting house prices, student GPAs, and loan interest rates, to see whether it can be calibrated for other tasks.