
Algorithmic Bias: Why And How Do Computers Make Unfair Decisions?

Computers behind social welfare systems, courts, or security forces make unfair decisions all the time. Transparency is key to mitigating the risks.

by LibertiesEU

If you’ve tried your hand at online dating, binge-watched Netflix, or just done a bit of online shopping during lunch, you’ve put yourself at the mercy of algorithms. It’s almost impossible to use the internet without giving away some of your personal information, whether that’s your physical characteristics, your birth date, or simply your browsing history. And this means that artificial intelligence has profiled you, and algorithms have used this profile to make predictions about you. What you like to watch or buy, whom you would like to date, where you want to go on holiday – the options you see, and the choices you make, are heavily driven by algorithms.


But algorithms are inherently biased for a number of reasons. Fundamentally, when we train an algorithm to perform a task, we feed it data that reflects what’s going on in society now and in the past. For example, an algorithm that screens job applications might set its criteria for what makes a good CV based on who holds those jobs at the moment. If the database used to train the algorithm is mostly made up of white men, then the algorithm is likely to decide that the ideal candidate is a white man. The data we use to 'train' algorithms reflects society's inequalities, and so those biases get baked into the algorithm. Whether we qualify for a loan, are fit for a job, or should have a place in a study program are all things that algorithms may now determine. And the consequences this has on us and our society could be dramatic.
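To make this concrete, here is a deliberately crude sketch, with every profile and number invented for illustration, of how a screener 'trained' on historical hires simply learns whatever pattern dominates the data it is given:

```python
# Crude, invented sketch: a "screener" trained on who already holds the job.
# No real hiring system works this simply; the point is the mechanism.
from collections import Counter

past_hires = ["white man"] * 8 + ["woman", "minority man"]  # skewed history

def train(history):
    # The learned "ideal candidate" is just the most common past profile.
    return Counter(history).most_common(1)[0][0]

def screen(candidate, ideal):
    # Candidates matching the learned profile are scored far higher.
    return 1.0 if candidate == ideal else 0.2

ideal = train(past_hires)
print(ideal)                  # → white man
print(screen("woman", ideal)) # → 0.2
```

Nothing in the code is malicious; the bias comes entirely from the skewed history it was trained on.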

How does a computer make decisions?

In a nutshell, an algorithm is a set of instructions that tells a computer how to interpret certain pieces of information and distill them into a decision. This follows a process of input, transformation, and output. Let’s say you’re ordering lunch at McDonald’s. The data for input are the options available to you. You could order a burger, a chicken sandwich, a fish sandwich, or a salad. Then you contemplate the side effects of these various options. The first three would play the congas on your digestive tract. But the salad would mean eating foot lettuce. All of this together is the data – the input – that will be considered when making your order.

The next stage is the computation. You apply your criteria of what makes a good meal to the menu. What have you eaten in the past and what did you like and not like? How much do you want to spend the rest of the day in the bathroom if you order the fish? What bacteria are in the foot lettuce? Does it really taste like feet? Is the burger actually, you know, beef? You’re wearing nice clothes – which option has the least chance of dripping sauce onto them? All of this information is considered to produce a decision. It could be the chicken, because there’s no chance of ketchup dripping. Or it could be an about-face and you leave and wonder why you were in a McDonald’s in the first place. But you’re hungry, and you want something light, and the salad doesn’t have ketchup either. So you go for foot salad.

This is the output, the last stage. You aggregated all the data, your brain computed it, and the result was the decision: “I’ll have a salad, please.” This is how a computer, using an algorithm, makes a decision. It brings together a bunch of data, like your browsing history, applies a set of criteria and computations to it, and arrives at a decision, like what ads to show you as you browse the internet.
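The lunch-order walk-through above can be sketched as a tiny program, with the menu as input, fixed criteria as the computation, and one choice as the output (all the items, attributes and scoring rules are invented for illustration):

```python
# Toy sketch of the input → computation → output pipeline.
MENU = {  # input: the options, with some invented attributes
    "burger":  {"price": 5, "messy": True,  "heavy": True},
    "chicken": {"price": 5, "messy": False, "heavy": True},
    "fish":    {"price": 6, "messy": True,  "heavy": True},
    "salad":   {"price": 4, "messy": False, "heavy": False},
}

def decide(menu, want_light=True, wearing_nice_clothes=True):
    # computation: apply fixed criteria to every option and score it
    def score(item):
        s = 0
        if want_light and not item["heavy"]:
            s += 2
        if wearing_nice_clothes and not item["messy"]:
            s += 1
        s -= item["price"] * 0.1  # cheaper is slightly better
        return s
    # output: the single highest-scoring option
    return max(menu, key=lambda name: score(menu[name]))

print(decide(MENU))  # → salad
```

Like a real algorithm, it only weighs the criteria it was given: if "does it taste like feet?" isn't in the code, it plays no part in the decision.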

But while that metaphor might help to explain the basics of an algorithm, it’s important to understand that there’s nothing human about them. In the computation stage, algorithms apply certain criteria to reach a decision, and they fix those criteria based on the data they're fed during training. Algorithms will only consider the data they’re told to consider. A person deciding whom to hire for a job might realise that they should disregard a candidate's ethnicity, whereas an algorithm will just blindly apply the criteria it has learnt from previous successful candidates, regardless of whether past hiring practices were discriminatory. So while a simple lunchtime decision-making process can be an easy way to relate to how algorithms work, remember that algorithms do not function like humans: they are best at solving mathematical problems within a defined data set.

And the process suffers from a severe lack of transparency. Many tech companies consider algorithms trade secrets. A company whose algorithm makes the most accurate predictions about you is better placed to sell you something, or to profitably deny you insurance, or a thousand other things. Companies don’t want to just say “this is how we did it”, because then they would lose their competitive advantage.

And this lack of transparency means that for all we know, algorithms could be applying a whole bunch of unethical criteria. For instance, your race or religion could count for you or against you, or your health history could be used against you even when it would be against the law to base a decision on it. The only way to make sure algorithms aren't applying bad criteria is to make them transparent, so that researchers and watchdogs can run tests on them and see if they consistently make discriminatory decisions.

What decisions do computers make?

So many decisions are now made by computers. It’s not just the rather mundane things like which movies Netflix recommends to you or which outfits are suggested by an online clothing retailer. Increasingly, decisions that are truly important to you and could seriously affect your life are also made by computers.

Some study programs use algorithms to cull their applicant pool and determine which applicants to offer places. Insurance companies use algorithms to determine whether or not you qualify for coverage. And, of course, tech companies use algorithms to decide what advertisements to show you, based on what items you'd be most likely to purchase.

What does algorithmic bias mean?

Algorithmic bias refers to certain attributes of an algorithm that cause it to create unfair or subjective outcomes. When it does this, it unfairly favors someone or something over another person or thing.

Algorithmic bias can exist because of many factors. It could be the design of the algorithm – like when the way the algorithm is written gives some data more importance than other data. It could be in the collection and selection of data – some things that should be computed by the algorithm are omitted, or data that shouldn’t be involved is included. For example, an algorithm deciding on car insurance premiums might be fed data about car accidents that includes the gender of the drivers involved. On this basis, the algorithm might decide that women are involved in fewer crashes, so women drivers automatically receive lower-cost insurance.
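A minimal sketch of that insurance example, using invented accident rates and prices, shows how the output changes depending on whether the gender column is in the data at all:

```python
# Sketch of data selection bias in pricing; all rates are invented.
ACCIDENT_RATE = {"woman": 0.05, "man": 0.08}  # hypothetical past data

def premium(base_price, driver, use_gender):
    if use_gender:
        # gender is in the selected data, so it directly drives the price
        return base_price * (1 + ACCIDENT_RATE[driver])
    # gender excluded: everyone pays the pooled average rate
    avg = sum(ACCIDENT_RATE.values()) / len(ACCIDENT_RATE)
    return base_price * (1 + avg)

print(round(premium(100, "woman", use_gender=True), 2))  # → 105.0
print(round(premium(100, "man", use_gender=True), 2))    # → 108.0
print(round(premium(100, "man", use_gender=False), 2))   # → 106.5
```

The choice of which columns to feed the algorithm is itself a decision, made by people, that determines who pays more.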

Or it could be down to humans. Algorithms are engineered by people, at least at some level, and therefore they may include certain biases held by the people who created them. Everyone is biased about something. For example, airbags were designed on assumptions about the male body, because the designers were men, making them dangerous for women. The same sort of bias that went into designing airbags can be included when designing algorithms.

Algorithmic bias is found across platforms, from social media to search engines to online retailers. And the impact this has on us can be significant, from serious breaches of privacy to the perpetuation of inequality and social divisions based on race, gender, religion, sexuality or many other things. That’s why it’s so important for tech companies to be transparent about their algorithms. But, hey, trade secrets. You understand, right?

Examples of algorithmic bias

As you probably understand by now, examples of algorithmic bias are manifold. And some of the most important examples are continuing to shape our society. A good example is policing. Police use algorithms to analyze data and predict where crimes might occur in the future. This is called predictive policing. Computers are fed data about past arrests and then use it to spit out places where police should patrol in the future. So what’s the problem? The level of criminality is the same across all groups in society. Despite this, the police often overpolice certain ethnic minorities. This means they spend more time and energy policing areas where certain ethnic minorities live. As a consequence of overpolicing particular groups, there are relatively higher numbers of people from these ethnic minorities in the criminal justice system. So the data that is fed to the algorithm for training has been created by policing according to a stereotype. And trained on this information, the algorithm perpetuates discrimination and inequality because it tells the police to keep focusing on certain ethnic minorities.
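The feedback loop described above can be simulated in a few lines (all numbers are invented): two areas with the same underlying crime rate, but a historical arrest record skewed toward one of them. Because the algorithm sends patrols where past arrests were, and new arrests can only happen where patrols are, the initial skew never washes out of the data:

```python
# Toy simulation of a predictive-policing feedback loop.
# Both areas have the SAME underlying crime rate; only the
# historical arrest record (the training data) is skewed.
TRUE_CRIME_RATE = 0.1
arrests = {"area_a": 10.0, "area_b": 30.0}  # invented, biased history

for _ in range(5):  # five years of "data-driven" patrolling
    total = sum(arrests.values())
    for area in arrests:
        # the algorithm assigns patrols in proportion to past arrests
        patrols = 100 * arrests[area] / total
        # arrests only happen where patrols are sent
        arrests[area] += patrols * TRUE_CRIME_RATE

share_b = arrests["area_b"] / sum(arrests.values())
print(round(share_b, 2))  # → 0.75: the original skew persists in the record
```

Year after year, area B keeps producing three quarters of all arrests, even though people there commit no more crime than anyone else: the record confirms the stereotype that created it.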


Similarly, judges are increasingly using algorithms to hand down sentences. It’s a practice known as predictive sentencing. Based on specific data points about the offender, like socioeconomic status, neighborhood and family background, and answers the offender gives to a risk assessment questionnaire, an algorithm is used to predict how likely that person is to commit a crime in the future – and thus how long their sentence should be. The problem is similar to the issue with predictive policing: the assumptions that inform the decision are heavily biased. Because certain minorities are disproportionately represented in criminal cases, the algorithm creates discriminatory criteria – if you're from a certain ethnic minority, it decides you need a longer sentence because there are lots of people from that ethnic minority in the system who reoffend. If you're white, not so much.

But even law-abiding citizens are victims of algorithmic bias, often without even knowing it. Facial recognition technology is a good example. Facial recognition software captures pictures of people’s faces as they walk past, often in very public places like train stations or shopping malls. These images are then fed into computers that use algorithms to make predictions about the person – everything from how likely that person is to be a criminal or commit a crime in the future to what the person’s gender is or even their sexuality.

Algorithmic bias even pops up in word association. Did you know that most people perceive European names as more pleasant than names with origins outside of Europe? Or that the words “woman” and “girl” are more likely to be associated with the arts, while the fields of math and science are more associated with males? Gender splits also arise in recruitment, often favoring men over women for jobs. Amazon’s global workforce is 60 percent male, and men hold 74 percent of the company’s managerial positions. Its hiring algorithm was trained on the CVs of past successful applicants to help identify good new ones, and because those CVs came overwhelmingly from men, the algorithm treated being male as a ‘quality’ that enhances an application and downgraded female applicants simply because they were women. This realization led the company to abandon its recruiting algorithm.

And every time you go online, algorithmic bias can present itself. We’ve mentioned that you get profiled so that companies can predict what you’ll want to buy and thus show you appropriate ads. But this online profiling can also be harmful. It can be used to predict your political leanings or what sort of news (or propaganda) you’ll be interested in. This means your data can be used to micro-target you with misinformation in an attempt to manipulate how you’ll vote in elections. This happens every day, and the scale of it means that it does indeed present a threat to our democratic system.

What causes the problem?

Algorithmic bias can be caused by many things. It can be rooted in issues with the data that is fed into the algorithm. Maybe the data is incomplete, or it’s irrelevant and thus inappropriate in the algorithm. Or maybe the data is just plain wrong. This could mean that the mistake was made while collecting the data, before it even met the algorithm. This can cause even the best-intentioned programmer to deliver biased and unjust results.

Algorithmic bias can also occur even when all the data is correct and appropriate. What if the mathematical model gives more weight to certain data over other data, even though it shouldn’t? This will bias the output despite the data used being fundamentally valid. And because we don’t know the details about so many companies’ algorithms, we cannot discount the possibility that this is done on purpose – that proper and correct data was intentionally manipulated in a biased or improper way in order to produce a certain output.
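As a toy illustration of that weighting problem (the features, weights and threshold are all invented), the very same correct input data can produce approval or rejection depending only on the weights the model assigns:

```python
# Toy loan scorer: identical, valid input data; only the weights differ.
# Assume (for illustration) that scores above 0.5 mean approval.
applicant = {"income": 0.9, "repayment_history": 0.8, "postcode": 0.2}

def loan_score(features, weights):
    # weighted sum over the (perfectly correct) input data
    return sum(features[k] * weights[k] for k in features)

fair_weights = {"income": 0.5, "repayment_history": 0.5, "postcode": 0.0}
skewed_weights = {"income": 0.1, "repayment_history": 0.1, "postcode": 0.8}

print(round(loan_score(applicant, fair_weights), 2))    # → 0.85 (approved)
print(round(loan_score(applicant, skewed_weights), 2))  # → 0.33 (rejected)
```

Without transparency, an outsider seeing only the rejection has no way to tell whether the data was wrong or the weights were rigged.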

And this brings us back to, well, us – humans. We mentioned it earlier, but it’s important to highlight that human bias is often the cause of algorithmic bias. It’s the “man behind the curtain” who has true control. As Jonathan Grudin, a researcher at Microsoft, puts it: “The algorithms are not in control; people create and adjust them.” And humans can bias the output at multiple levels, from the data people choose to include to how the programmer chooses to build the algorithm. In this way, human bias could be the hardest one to remove, and the leading cause of algorithmic bias.

How can algorithmic bias be detected and mitigated?

In order to detect and root out bias, we must first understand what causes it. We’ve discussed some of the main causes, and it’s important that governmental bodies, with the help of civil society groups, other stakeholders, and tech companies themselves, address these causes. And it’s critically important that throughout this process, the privacy of people and their data is not compromised any more than it already has been.

Mitigating algorithmic bias can be done in a number of ways, but the key is transparency. There should be greater transparency over the algorithms used by tech companies, governmental agencies and other bodies. Trade secrets are acceptable, but not when they facilitate widespread discrimination and privacy abuse.

In fact, ‘trade secrets’ is not a legitimate excuse to avoid transparency and evade regulation. The food we eat, the cars we drive, the medicines we take have all been checked by regulators to ensure that they’re safe. And this is done without exposing company secrets. The same should be true for algorithms. Independent regulators can check to ensure that they do not contain unethical bias and do not compromise privacy or human rights. The algorithm's ‘secret sauce’ does not need to be made public.

With greater transparency, governments and civil society groups alike will be better able to monitor how, when and why algorithms are used, and call out instances of unjust bias that leads to breaches of fundamental rights. And the enforcement of existing laws, such as non-discrimination legislation, can help force bad actors to change discriminatory algorithms.

Whether we like it or not, computers are here to stay, and their decision-making powers will only increase. But with greater transparency and a greater effort to remove algorithmic bias, we can use computers and algorithms in a way that enhances our lives and creates a safer, more just society.

FAQs

What does algorithmic bias mean?

Algorithmic bias refers to certain attributes of an algorithm that cause it to create unfair or subjective outcomes. When it does this, it unfairly favors someone or something over another person or thing.

Can a computer make an unfair decision?

Yes, computers make unfair decisions all the time. But algorithmic bias has a number of causes, so there could be many overlapping reasons for this.

How can biases be detected?

Biases can be detected not simply by looking at the algorithms themselves and the data they use, but by each step of the process, including the collection of data and the construction of the algorithm. Transparency is key.
