
How biased is Machine Learning? Why the status quo is not neutral

In our most recent HeiCAD Lecture, Dr Brent Mittelstadt, Senior Research Fellow in data ethics at the Oxford Internet Institute, University of Oxford, and Turing Fellow at the Alan Turing Institute, talked about bias preservation in fair machine learning. He started out by mentioning Article 10 on data and data governance of the Artificial Intelligence Act recently proposed by the European Commission (see this HeiCAD Lecture), which states that training, validation and testing data sets for AI systems shall be examined in view of possible biases. Mittelstadt distinguished between two types of bias, technical and societal, both of which inform each other. As the names suggest, the former arises from technical constraints or technical considerations, while the latter originates in society, for example in organisations, institutions, or culture at large.

At the moment, the majority of bias and fairness tests are not compatible with EU non-discrimination law. The law aims to achieve substantive equality, i.e. equality of opportunity and equality of results, in contrast to formal equality, where everyone is treated equally regardless of context. Formal equality can be mapped onto direct discrimination, i.e. less favourable treatment explicitly based on a protected characteristic. Indirect discrimination, in contrast, is defined in EU non-discrimination law as a disproportionate adverse impact on a protected group resulting from a seemingly neutral provision, criterion or practice; prohibiting it is how the law pursues substantive equality.

How does the law align with technical work on fairness in machine learning (ML)? In his recent paper, Mittelstadt examined the most popular fairness metrics in ML and classified them into two groups based on conditional independence. Bias-preserving metrics are always satisfied by a perfect classifier that predicts its target labels with zero error, thereby replicating any bias present in the data: they seek to replicate the error rates found in the training data (the status quo) in the outputs of the trained model. Bias-transforming metrics are not necessarily satisfied by a perfect classifier: they are not concerned with replicating error rates and typically seek to match decision rates between groups. The paper reviews 20 of the most popular fairness metrics in the ML research community; two thirds of them are bias-preserving and do not align well with EU non-discrimination law. The paper can be found here.
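To make the distinction concrete, here is a minimal sketch (not taken from the talk or the paper) using two commonly cited example metrics: an equalised-odds gap, which compares error rates between groups and is therefore bias-preserving, and a demographic-parity gap, which compares decision rates and is therefore bias-transforming. The data and metric choices are illustrative assumptions only.

```python
# Illustrative sketch: bias-preserving vs bias-transforming fairness metrics.
import numpy as np

def error_rates(y_true, y_pred):
    """False positive and false negative rates for one group."""
    fpr = np.mean(y_pred[y_true == 0] == 1) if np.any(y_true == 0) else 0.0
    fnr = np.mean(y_pred[y_true == 1] == 0) if np.any(y_true == 1) else 0.0
    return fpr, fnr

def equalised_odds_gap(y_true, y_pred, group):
    """Bias-preserving: largest difference in FPR/FNR between the two groups."""
    fpr_a, fnr_a = error_rates(y_true[group == 0], y_pred[group == 0])
    fpr_b, fnr_b = error_rates(y_true[group == 1], y_pred[group == 1])
    return max(abs(fpr_a - fpr_b), abs(fnr_a - fnr_b))

def demographic_parity_gap(y_pred, group):
    """Bias-transforming: difference in positive decision rates between groups."""
    return abs(np.mean(y_pred[group == 0]) - np.mean(y_pred[group == 1]))

# Toy data (assumed): group 1 receives far fewer positive labels, e.g. due to
# historical bias in the labelling process.
rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=1000)
y_true = np.where(group == 0, rng.random(1000) < 0.6, rng.random(1000) < 0.3).astype(int)

# A "perfect" classifier simply reproduces the (possibly biased) labels.
y_pred = y_true.copy()
print(equalised_odds_gap(y_true, y_pred, group))   # 0.0 -> satisfied by construction
print(demographic_parity_gap(y_pred, group))       # ~0.3 -> bias in the data is preserved
```

The perfect classifier satisfies the error-rate criterion automatically while leaving the disparity in decision rates untouched, which is exactly the sense in which bias-preserving metrics carry the status quo forward.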

The status quo is not neutral, and we do not even have enough data to estimate how big the problem actually is. The entire talk can be re-watched in the Mediathek if you are affiliated with HHU.

Author: Dr. Joana Grah