Algorithmic exclusion could be harming AI models’ ability to make accurate predictions, because the data available at the training stage lacks demographic variety.
As algorithms become responsible for more important decisions, such as which job applications employers see, how much funding goes to a city district, and whether a person receives parole, it is important to acknowledge some of the issues with treating AI as always accurate.
Much has been written about algorithmic bias, which occurs when an AI or ML model is built on data that propagates existing societal biases. A related but distinct concept is algorithmic exclusion, a further driver of social inequality on the internet and smartphones.
“Though much of the digital economics literature has focused on inequality and access and usage of the internet, algorithmic exclusion is a new and important concern for understanding digital exclusion and inequality,” said MIT Sloan School of Management professor and author of the paper, Catherine Tucker. “Algorithmic exclusion occurs when algorithms are unable to even make predictions because they lack the data to do so.”
Algorithmic exclusion is part of a concept known as data deserts, first described in 2014 as zones that generate far less data than average. In her piece, Tucker mentions that the city of Boston built an app, Street Bump, which used sensors in citizens’ smartphones to detect road bumps. The goal was to identify road problems faster by collecting data on where roads were in the worst condition. The project had to be abandoned shortly afterward, however, because richer neighborhoods provided far more data through the app than poorer ones, and would thus have received more than their fair share of road repairs.
In this case and others of algorithmic exclusion, the AI model is not preferring one neighborhood over another per se; rather, the lack of data coming from poorer neighborhoods means it cannot make an accurate judgment as to whether their road quality is up to par.
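The Street Bump dynamic can be sketched in a few lines. This is a hypothetical illustration, not the city’s actual system: the neighborhood names, defect counts, and reporting rates are all assumptions chosen to show how unequal data volume alone skews a report-driven allocation.

```python
import random

random.seed(0)

# Two neighborhoods with the SAME true number of road defects, but very
# different reporting rates (e.g., due to unequal smartphone adoption).
TRUE_DEFECTS = 100
REPORT_RATE = {"wealthy": 0.8, "under-resourced": 0.2}  # assumed rates

# Each defect is reported only if a resident's phone happens to log it.
observed = {
    name: sum(random.random() < rate for _ in range(TRUE_DEFECTS))
    for name, rate in REPORT_RATE.items()
}

# An algorithm allocating repairs by report volume sees a large gap
# even though the underlying road conditions are identical.
for name, count in observed.items():
    print(f"{name}: {count} reports (true defects: {TRUE_DEFECTS})")
```

The model here is not biased against the under-resourced neighborhood in its logic; it simply never sees most of that neighborhood’s defects.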
“One reason why algorithmic exclusion has often been neglected in the literature on social justice and algorithms is that it occurs when algorithms make predictions about individuals on a real-time basis,” said Tucker. “Many of the most talked-about examples of algorithmic bias involve cases where algorithms are trained on population data to make generalized predictions about individuals. Algorithmic exclusion, by contrast, occurs when an algorithm needs an individual’s data to make a prediction.”
Alongside data deserts, another factor in algorithmic exclusion is fragmented data, in which marginalized communities have less complete records or are unable to provide full records due to a lack of access. For example, while smartphone penetration in the United States is above 90 percent for those under 25 years old, it is below 60 percent for those over 65. When a form requires a mobile number or an email address, many of those forms will be submitted incomplete or not at all.
This combination of missing and incomplete data can lead to missing predictions, in which an algorithm cannot make an accurate prediction and so does not make one at all. It can also lead to inaccurate predictions, in which an algorithm takes fragmented data and uses it as the basis for a prediction anyway.
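The two failure modes above can be made concrete with a small sketch. The scoring function, field names, and 24-month threshold below are illustrative assumptions, not anything from Tucker’s paper: the point is only to show an algorithm abstaining when a required field is absent (a missing prediction) versus scoring a thin record as if it were complete (an inaccurate prediction).

```python
def predict_creditworthiness(record: dict):
    """Return a score in [0, 100], or None when required data is absent.

    Hypothetical example: fields and thresholds are assumptions.
    """
    required = ("income", "payment_history")
    if any(record.get(field) is None for field in required):
        # Missing prediction: no data, so the algorithm abstains entirely.
        return None
    history = record["payment_history"]  # 1 = on-time month, 0 = missed
    # Inaccurate-prediction risk: a fragmented history is scored as if it
    # were a full 24-month record, dragging the score down.
    coverage = min(len(history) / 24, 1.0)
    on_time = sum(history) / max(len(history), 1)
    return round(on_time * 100 * coverage, 1)

# No payment history at all: the algorithm makes no prediction.
print(predict_creditworthiness({"income": 40000}))
# Only 4 months of history: a prediction is made, but from fragmented data.
print(predict_creditworthiness({"income": 40000,
                                "payment_history": [1, 1, 0, 1]}))
```

In the second call the applicant paid on time three months out of four, yet the thin record yields a far lower score than the same behavior over a full history would, which is the kind of inaccuracy fragmented data produces.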
Discussing ways to alleviate algorithmic exclusion, Tucker suggested rethinking the privacy debate, which she argues has mostly benefited the richer and more privileged in society while potentially further excluding marginalized communities. Focusing on safeguards for privacy, rather than banning tools such as cookies, tracking, and data collection, could enable a wider variety of responses and potentially improve the accuracy and scope of algorithms.