
Correcting Bias in A.I.: Lessons from Philosophy of Science

image of screen covered in binary code

One of the major issues surrounding artificial intelligence is how to deal with bias. In October, for example, Uber drivers held a protest decrying the algorithm the company uses to verify its drivers as racist. The software frequently fails to recognize Black drivers, and because they cannot get verified, many are unable to work. In 2018, a study showed that a Microsoft algorithm failed to identify 1 in 5 darker-skinned females and 1 in 17 darker-skinned males.

Instances like these prompt much strategizing about how we might stamp out bias once and for all. But can you completely eliminate bias? Is the solution to the problem a technical one? Why does bias occur in machine learning, and are there any lessons that we can pull from outside the science of AI to help us consider how to address such problems?

First, it is important to address a certain conception of science. Historically, scientists – mostly influenced by Francis Bacon – espoused the notion that science was purely about investigation into the nature of the world for its own sake in an effort to discover what the world is like from an Archimedean perspective, independent of human concerns. This is also sometimes called the “view from nowhere.” However, many philosophers who would defend the objectivity of science now accept that science is pursued according to our interests. As philosopher of science Philip Kitcher has observed, scientists don’t investigate any and all forms of true claims (many would be pointless), but rather they seek significant truth, where what counts as significant is often a function of the interests of epistemic communities of scientists.

Next, because scientific modeling is influenced by what we take to be significant, it is often influenced by assumptions we take to be significant, whether there is good evidence for them or not. As Cathy O’Neil notes in her book Weapons of Math Destruction, “a model…is nothing more than an abstract representation of some process…Whether it’s running in a computer program or in our head, the model takes what we know and uses it to predict responses to various situations.” Modeling requires that we understand the evidential relationships between inputs and predicted outputs. According to philosopher Helen Longino, evidential reasoning is driven by background assumptions because “states of affairs…do not carry labels indicating that for which they are or for which they can be taken as evidence.”

As Longino points out in her book, these background assumptions often cannot be completely empirically confirmed, and so our values frequently drive which background assumptions we adopt. For example, clinical depression involves a myriad of symptoms, but no single unifying biological cause has been identified. So, what justifies our grouping all of these symptoms into a single illness? According to Kristen Intemann, what allows us to infer the concept “clinical depression” from a group of symptoms are assumptions that these symptoms impair functions we consider essential to human flourishing, and it is only through such assumptions that we are justified in grouping the symptoms under a condition like depression.

The point philosophers like Intemann and Longino are making is that such background assumptions are necessary for making predictions based on evidence, and that these background assumptions can be value-laden. Algorithms and models developed in AI also involve such background assumptions. One of the bigger ethical issues involving bias in AI can be found in criminal justice applications.

Recidivism models are used to help judges assess the danger posed by each convict. But people do not carry labels saying they are recidivists, so what would you take as evidence that would lead you to conclude someone might become a repeat offender? One assumption might be that if a person has had prior involvement with the police, they are more likely to be a recidivist. But if you are Black or brown in America where stop-and-frisk exists, you are already disproportionately more likely to have had prior involvement with the police, even if you have done nothing wrong. So, because of this background assumption, a recidivist model would be more likely to predict that a Black person is going to be a recidivist than a white person who is less likely to have had prior run-ins with the police.
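
To see how such a background assumption gets built into a model, consider a deliberately simplified sketch. The feature names and weights below are hypothetical, invented for illustration, and are not drawn from any real risk-assessment tool:

```python
# Minimal sketch of how a background assumption becomes a model feature.
# Features and weights are hypothetical, chosen only for illustration.

def recidivism_score(prior_police_contacts: int, prior_convictions: int, age: int) -> float:
    """Toy linear risk score; higher means 'riskier'."""
    return (0.5 * prior_police_contacts   # the contested background assumption
            + 0.3 * prior_convictions
            - 0.02 * age)

# Two defendants with identical conduct and records, except that one lives in a
# heavily policed neighborhood and has accumulated more stops.
lightly_policed = recidivism_score(prior_police_contacts=1, prior_convictions=1, age=25)
heavily_policed = recidivism_score(prior_police_contacts=6, prior_convictions=1, age=25)

print(lightly_policed, heavily_policed)  # 0.3 vs. 2.8 -- the assumption alone drives the gap
```

Nothing about the two defendants' actual behavior differs; the disparity in predicted risk enters entirely through the assumption that police contact is evidence of future crime.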

But the background assumption that prior contact with the police is a good predictor of recidivism is questionable, and in the meantime this assumption creates biases in the application of the model. To further add to the problem, as O’Neil notes in her analysis of the issue, recidivism models used in sentencing involve “the unquestioned assumption…that locking away ‘high-risk’ prisoners for more time makes society safer,” adding “many poisonous assumptions are camouflaged by math and go largely untested and unquestioned.”

Many who have examined the issue of bias in AI often suggest that the solutions to such biases are technical in nature. For example, if an algorithm produces a bias because of biased data, the proposed solution is to use more data to eliminate that bias. In other cases, attempts are made to define “fairness” technically, where a researcher may require models that have equal predictive value across groups, or equal rates of false positives and false negatives across groups. Many corporations have also built AI frameworks and toolkits that are designed to recognize and eliminate bias. O’Neil notes how many responses to biases created by crime prediction models simply focus on gathering more data.
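
To make those technical definitions of “fairness” less abstract, here is a minimal sketch of one of them: checking whether false positive and false negative rates match across two groups. The labels and predictions are invented for illustration; real audits work the same way at scale:

```python
# Sketch: comparing error rates across groups, one common technical
# rendering of "fairness." All data below is invented for illustration.

def error_rates(y_true, y_pred):
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    return fp / negatives, fn / positives   # false positive rate, false negative rate

# Hypothetical outcomes (1 = reoffended) and model predictions for two groups.
group_a = dict(y_true=[0, 0, 1, 1, 0, 1], y_pred=[0, 1, 1, 1, 0, 0])
group_b = dict(y_true=[0, 0, 1, 1, 0, 1], y_pred=[1, 1, 1, 0, 0, 0])

print("Group A (FPR, FNR):", error_rates(**group_a))
print("Group B (FPR, FNR):", error_rates(**group_b))
# An "equalized odds" style definition demands these pairs match; other
# definitions (e.g., equal predictive value) can conflict with this one.
```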

On the other hand, some argue that focusing on technical solutions to these problems misses the issue of how assumptions are formulated and used in modeling. It’s also not clear how well technical solutions may work in the face of new forms of bias that are discovered over time. Timnit Gebru argues that the scientific culture itself needs to change to reflect the fact that science is not pursued as a “view from nowhere.” Recognizing how seemingly innocuous assumptions can generate ethical problems will necessitate greater inclusion of people from marginalized groups.  This echoes the work of philosophers of science like Longino who assert that not only is scientific objectivity a matter of degree, but science can only be more objective by having a well-organized scientific community centered around the notion of “transformative criticism,” which requires a great diversity of input. Only through such diversity of criticism are we likely to reveal assumptions that are so widely shared and accepted that they become invisible to us. Certainly, focusing too heavily on technical solutions runs the risk of only exacerbating the current problem.

Who Is Accountable for Inductive Risk in AI?

computer image of programming decision trees

Many people are familiar with algorithms and machine learning when it comes to applications like social media or advertising, but it can be hard to appreciate the full range of applications to which machine learning has been put. For example, in addition to regulating all sorts of financial transactions, an algorithm might be used to evaluate teaching performance, or in the medical field to help identify illness or those at risk of disease. With this large array of applications comes a large array of ethical factors which become relevant as more and more real-world consequences are considered. For example, machine learning has been used to train AI to detect cancer. But what happens when the algorithm is wrong? What are the ethical issues when it isn’t completely clear how the AI is making decisions and there is a very real possibility that it could be wrong?

Consider the use of machine learning to predict whether someone charged with a crime is likely to be a recidivist. Because of massive backlogs in various court systems, many have turned to such tools to move defendants through the court system more efficiently. Criminal risk assessment tools consider a number of details of a defendant’s profile and then produce a recidivism score. Lower scores usually mean a more lenient sentence, while higher scores usually produce harsher sentences. The reasoning is that if you can accurately predict criminal behavior, resources can be allocated more efficiently for rehabilitation or for prison sentences. Also, the thinking goes, decisions are better made based on data-driven recommendations than on the personal feelings and biases that a judge may have.

But these tools have significant downsides as well. As Cathy O’Neil discusses in her book Weapons of Math Destruction, statistics show that in certain counties in the U.S. a Black person is three times more likely to get a death sentence than a white person, and the computerized risk models intended to reduce such prejudice are no less prone to bias. As she notes, “The question, however, is whether we’ve eliminated human bias or simply camouflaged it with technology.” She points out that questionnaires used in some models include questions like “the first time you ever were involved with the police,” which is likely to yield very different answers depending on whether the respondent is white or Black. As she explains, “if early ‘involvement’ with the police signals recidivism, poor people and racial minorities look far riskier.” So, the fact that such models are susceptible to bias also means they are not immune to error.

As mentioned, researchers have applied machine learning in the medical field as well. Again, the benefits are not difficult to imagine. Cancer-detecting AI has been able to identify cancer that humans could not. Faster detection of a disease like lung cancer allows for quicker treatment and thus the ability to save more lives. Right now, about 70% of lung cancers are detected in late stages, when they are harder to treat.

AI not only has the potential to save lives, but to also increase efficiency of medical resources as well. Unfortunately, just like the criminal justice applications, applications in the medical field are also subject to error. For example, hundreds of AI tools were developed to help deal with the COVID-19 pandemic, but a study by the Turing Institute found that AI tools had little impact. In a review of 232 algorithms for diagnosing patients, a recent medical journal paper found that none of them were fit for clinical use. Despite the hype, researchers are “concerned that [AI] could be harmful if built in the wrong way because they could miss diagnoses and underestimate the risk for vulnerable patients.”

There are lots of reasons why an algorithm designed to detect or sort things might make errors. Machine learning requires massive amounts of data, so the ability of an algorithm to perform correctly will depend on the quality of the data it is trained with. As O’Neil has pointed out, a problematic questionnaire can lead to biased predictions. Similarly, incomplete training data can cause a model to perform poorly in real-world settings. As Koray Karaca explains in a recent article on inductive risk in machine learning, creating a model requires precise methodological choices to be made. But these choices are often driven by background assumptions – plagued by simplification and idealization – that create problematic uncertainties. Different assumptions can create different models and thus different possibilities of error. And there is always a gap between a finite amount of empirical evidence and an inductive generalization, meaning that there is always an inherent risk in using such models.

If an algorithm determines that I have cancer and I don’t, it could dramatically affect my life in all sorts of morally salient ways. On the other hand, if I have cancer and the algorithm says I don’t, it can likewise have a harmful moral impact on my life. So is there a moral responsibility involved and if so, who is responsible? In a 1953 article called “The Scientist Qua Scientist Makes Value Judgments” Richard Rudner argues that “since no scientific hypothesis is completely verified, in accepting a hypothesis the scientist must make the decision that evidence is sufficiently strong or that the probability is sufficiently high to warrant the acceptance of the hypothesis…How sure we need to be before we accept a hypothesis will depend on how serious a mistake would be.”

These considerations regarding the possibility of error and the threshold for sufficient evidence represent calculations of inductive risk. For example, we may judge that the consequences of asserting that a patient does not have cancer when they actually do are far worse than the consequences of asserting that a patient does have cancer when they actually do not. Because of this, and given our susceptibility to error, we may accept a lower standard of evidence for determining that a patient has cancer and a higher standard for determining that a patient does not, in order to minimize the worst consequences if an error occurs. But how do algorithms do this? Machine learning involves optimization of a model by testing it against sample data. Each time an error is made, a learning algorithm updates and adjusts parameters to reduce the total error, which can be calculated in different ways.
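
One place this value judgment becomes concrete is the decision threshold: the model outputs a probability, but someone must decide how much probability is enough to act on. A minimal sketch, with an invented model output and illustrative thresholds rather than anything resembling clinical guidance:

```python
# Sketch: the same model output converted to a decision under two different
# thresholds. Choosing the threshold is a value judgment about which error
# (a missed cancer vs. a false alarm) is worse. Numbers are illustrative.

def flag_for_followup(predicted_prob: float, threshold: float) -> bool:
    return predicted_prob >= threshold

patient_prob = 0.22  # hypothetical model output: 22% chance of malignancy

# If we treat false positives and false negatives as equally bad:
print(flag_for_followup(patient_prob, threshold=0.5))   # False -> patient sent home

# If we judge a missed cancer far worse than an unnecessary follow-up,
# we demand less evidence before acting:
print(flag_for_followup(patient_prob, threshold=0.15))  # True -> follow-up ordered
```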

Karaca notes that optimization can be carried out either in cost-sensitive or -insensitive ways. Cost-insensitive training assigns the same value to all errors, while cost-sensitive training involves assigning different weights to different errors. But the assignment of these weights is left to the modeler, meaning that the person who creates the model is responsible for making the necessary moral judgments and preference orderings of potential consequences. In addition, Karaca notes that inductive risk concerns arise for both the person making methodological choices about model construction and later for those who must decide whether to accept or reject a given model and apply it.
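
The difference Karaca describes can be shown with a toy calculation. In the cost-sensitive version below, the modeler's moral judgment appears directly as the numbers assigned to each kind of error; the counts and weights are invented for illustration:

```python
# Sketch: cost-insensitive vs. cost-sensitive error totals. The weights
# encode the modeler's judgment that a false negative (a missed cancer)
# is ten times worse than a false positive. All numbers are invented.

errors = {"false_positive": 8, "false_negative": 3}        # counts on some test set

# Cost-insensitive: every error counts the same.
plain_loss = errors["false_positive"] + errors["false_negative"]   # 11

# Cost-sensitive: errors are weighted by judged severity.
weights = {"false_positive": 1.0, "false_negative": 10.0}          # a moral choice
weighted_loss = sum(weights[k] * n for k, n in errors.items())     # 38.0

print(plain_loss, weighted_loss)
# A learning algorithm minimizing the weighted total will trade several
# false alarms away in order to avoid a single missed case.
```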

What this tells us is that machine learning inherently involves making moral choices, and that these choices bear out in evaluations of acceptable risk of error. The question of how “successful” the model is is bound up with our own concern about risk. But this poses an additional question: How is there accountability in such a system? Many companies hide the results of their models or even their existence. But, as we have seen, moral accountability in the use of AI is of paramount importance. At each stage of assessment we encounter an asymmetry of information, one that forces the victims of such AI to “prove” the algorithm wrong against the available evidence demonstrating how “successful” the model is.

Hotline Ping: Chatbots as Medical Counselors?

photograph of stethoscope wrapped around phone

In early 2021, the Trevor Project — a mental health crisis hotline for LGBTQIA+ youths — made headlines with its decision to utilize an AI chatbot as a method for training counselors to deal with real crises from real people. They named the chatbot “Riley.” The utility of such a tool is obvious: if successful, new recruits could be trained at all times of day or night, trained en masse, and trained to deal with a diverse array of problems and emergencies. Additionally, training workers on a chatbot greatly minimizes the risk of something going wrong if someone experiencing a severe mental health emergency got connected with a brand-new counselor. If a new trainee makes a mistake in counseling Riley, there is no actual human at risk. Trevor Project counselors can learn by making mistakes with an algorithm rather than a vulnerable teenager.

Unsurprisingly, this technology soon expanded beyond the scope of training counselors. In October of 2021, the project reported that chatbots were also used to screen youths (who contact the hotline via text) to determine their level of risk. Those predicted to be most at-risk, according to the algorithm, are put in a “priority queue” to reach counselors more quickly. Additionally, the Trevor Project is not the only medical/counseling organization utilizing high-tech chatbots with human-like conversational abilities. Australian clinics that specialize in genetic counseling have recently begun using a chatbot named “Edna” to talk with patients and help them make decisions about whether or not to get certain genetic screenings. The U.K.-based Recovery Research Center is currently implementing a chatbot to help doctors stay up-to-date on the conditions of patients who struggle with chronic pain.

On initial reading, the idea of using AI to help people through a mental or physical crisis might make the average person feel uncomfortable. While we may, under dire circumstances, feel okay about divulging our deepest fears and traumas to an empathetic and understanding human, the idea of typing out all of this information to be processed by an algorithm smacks of a chilly technological dystopia where humans are scanned and passed along like mere bins of data. Of course, a more measured take shows the noble intentions behind the use of the chatbots. Chatbots can help train more counselors, provide more people with the assistance they need, and identify those people who need to reach human counselors as quickly as possible.

On the other hand, big data algorithms have become notorious for the biases and false predictive tendencies hidden beneath a layer of false objectivity. Algorithms themselves are no more useful than the data we put into them. Chatbots in Australian mental health crisis hotlines were trained by analyzing “more than 100 suicide notes” to gain information about words and phrases that signal hopelessness or despair. But 100 is a fairly small sample. On average, there are more than 130 suicides every day in the United States alone. Further, only 25-30% of people who commit suicide leave a note at all. Those who do leave a note may be having a very different kind of mental health crisis than those who leave no note, meaning that these chatbots would be trained to recognize only the clues present in (at best) about a quarter of successful suicides. Further, we might worry that stigma surrounding mental health care in certain communities could disadvantage teens that already have a hard time accessing these resources. The chatbot may not have enough information to recognize a severe mental health crisis in someone who does not know the relevant words to describe their experience, or who is being reserved out of a sense of shame.

Of course, there is no guarantee that a human correspondent would be any better at avoiding bias, short-sightedness, and limited information than an algorithm would be. There is, perhaps, good reason to think that a human would be much worse, on average. Human minds can process far less information, at a far slower pace, than algorithms, and our reasoning is often imperfect and driven by emotions. It is easy to imagine the argument being made that, yes, chatbots aren’t perfect, but they are much more reliable than a human correspondent would be.

Still, it seems doubtful that young people would, in the midst of a mental health crisis, take comfort in the idea of typing their problems to an algorithm rather than communicating them to a human being. The fact is that most consumers strongly prefer talking with humans over chatbots, even when the chatbots are more efficient. There is something cold about the idea of making teens — some in life-or-death situations — make it through a chatbot screening before being connected with someone. Even if the process is extremely short, it can still be jarring. How many of us avoid calling certain numbers just to avoid having to interact with a machine?

Yet, perhaps a sufficiently life-like chatbot would neutralize these concerns, and make those who call or text in to the hotline feel just as comfortable as if they were communicating with a person. Research has long shown that humans are able to form emotional connections with AI extremely quickly, even if the AI is fairly rudimentary. And more seem to be getting comfortable with the idea of talking about their mental health struggles with a robot. Is this an inevitable result of technology becoming more and more a ubiquitous part of our lives? Is it a consequence of the difficulty of connecting with real humans in our era of solitude and fast-paced living? Or, maybe, are the robots simply becoming more life-like? Whatever the case may be, we should be diligent that these chatbots rely on algorithms that help overcome deep human biases, rather than further ingrain them.

The Ethics of Policing Algorithms

photograph of silhouettes watching surveillance monitors

Police departments throughout the country are facing staffing shortages. There are a number of reasons for this: policing doesn’t pay well, the baby boomer generation is retiring and subsequent generations have reproduced less, and recent occurrences of excessive use of force by police have made the police force in general unpopular with many people. Plenty of people simply don’t view it as a viable career choice. In response to shortages, and as a general strategy to save money, many police departments throughout the country have begun relying on algorithms to help them direct their efforts. This practice has been very controversial.

The intention behind policing algorithms is to focus the attention of law enforcement in the right direction. To do this, they take historical information into account. They look at the locations in which the most crime has occurred in the past. As new crimes occur, they are added to the database; the algorithm learns from the new data and adjusts accordingly. These data points include details like the time of year that crimes occurred. Police departments can then plan staffing coverage in a way that is consistent with this data.
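
In rough outline, the mechanism is bookkeeping over past incidents. The following is a minimal sketch of that loop, with invented areas, counts, and staffing levels, not the workings of any actual product:

```python
# Sketch of a hotspot-style allocation loop: count historical incidents per
# area, update as new reports arrive, and assign patrols in proportion to
# the counts. All figures are invented for illustration.
from collections import Counter

incident_history = Counter({"downtown": 120, "eastside": 340, "westside": 90})

def record_incident(area: str) -> None:
    incident_history[area] += 1        # the model "learns" from each new report

def allocate_officers(total_officers: int) -> dict:
    total = sum(incident_history.values())
    return {area: round(total_officers * n / total)
            for area, n in incident_history.items()}

record_incident("eastside")
print(allocate_officers(30))
# Areas with more *recorded* crime receive more patrols, which in turn
# produces more recorded crime there -- the feedback loop critics describe.
```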

Proponents of policing algorithms argue that they make the best use of taxpayer resources; they direct funds in very efficient ways. Police don’t waste time in areas where crime is not likely to take place. If this is the case, departments don’t need to hire officers to perpetually cover areas where crime historically does not happen.

There are, however, many objections to the use of such algorithms. The first is that they reinforce racial bias. The algorithms make use of historical data, and police officers have, historically, aggressively policed minority neighborhoods. In light of the history of interactions in these areas, police officers may be more likely to deal with members of these communities more severely than members of other communities for the same offenses. Despite comprising only 13% of the population, African Americans comprise 27% of all arrests in the United States and are twice as likely to be arrested as their white counterparts. This is unsurprising if policing algorithms direct police officers to focus their attention on communities of color because this is where they have always focused their attention. If two young people are in possession of marijuana, for example, a young person of color is more likely to be arrested than a young white person if police are omnipresent in the community of color and absent from the affluent white community. This serves to reinforce the idea that different standards apply to different racial and socioeconomic groups. For example, all races commit drug-related crimes at roughly equal rates, but African Americans are far more likely to be arrested and sentenced harshly than are white people.

In addition, some are concerned that while police are busy over-policing communities of color, other communities in which crime is occurring will be under-protected. When emergencies happen in these communities, there will be longer response times. This can often make the difference between life and death.

Many argue that policing algorithms are just another example of an institution attempting to provide quick, band-aid fixes for problems that require deeper, more systemic change. If people are no longer choosing to pursue law enforcement careers, that problem needs to be resolved head-on. If people aren’t choosing to pursue careers in law enforcement because the job has a bad reputation for excessive force, then that is just one among many reasons to stop police officers from using disproportionate force. There are many ways to do this: police could be required to wear body cameras that must be on at all times while officers are responding to calls. Officers could be required to go through more training, including sessions that emphasize anger management and anti-racism. Some police departments throughout the country have become notorious for hiding information regarding police misconduct from the public. Such departments could clean up the reputation of the profession by being perfectly transparent about officer behavior and by dealing with offending officers immediately rather than waiting to take action in response to public pressure.

Further, instead of focusing algorithms on locations for potential policing, our communities could focus the same resources on locations for potential crime prevention. The root causes of crime are not mysteries to us. Poverty and general economic uncertainty reliably predict crime. If we commit resources to providing social services to these communities, we can potentially stop crime before it ever happens. The United States incarcerates both more people per capita and more people overall than any other country in the world. Incarceration is bad for many reasons: it stunts the growth and development of incarcerated individuals, getting in the way of their flourishing and achieving their full potential. It also costs taxpayers money. If we have a choice as taxpayers between spending money on crime prevention and spending money on incarceration after crimes have already taken place, many would argue that the choice is obvious.

Automation in the Courtroom: On Algorithms Predicting Crime

photograph of the defense's materials on table in a courtroom

From facial recognition software to the controversial robotic “police dogs,” artificial intelligence is becoming an increasingly prominent aspect of the legal system. AI even allocates police resources to different neighborhoods, determining how many officers are needed in certain areas based on crime statistics. But can algorithms determine the likelihood that someone will commit a crime, and if they can, is it ethical to use this technology to sentence individuals to prison?

Algorithms that attempt to predict recidivism (the likelihood that a criminal will commit future offenses) sift through data to produce a recidivism score, which ostensibly indicates the risk a person poses to their community. As Karen Hao explains for the MIT Technology Review,

The logic for using such algorithmic tools is that if you can accurately predict criminal behavior, you can allocate resources accordingly, whether for rehabilitation or for prison sentences. In theory, it also reduces any bias influencing the process, because judges are making decisions on the basis of data-driven recommendations and not their gut.

Human error and racial bias contribute to over-incarceration, so researchers are hoping that color-blind computers can make better choices for us.

But in her book When Machines Can Be Judge, Jury, and Executioner: Justice in the Age of Artificial Intelligence, former judge Katherine B. Forrest explains that Black offenders are far more likely to be labeled high-risk by algorithms than their white counterparts, a fact which further speaks to the well-documented racial bias of algorithms. As Hao reminds us,

populations that have historically been disproportionately targeted by law enforcement—especially low-income and minority communities—are at risk of being slapped with high recidivism scores. As a result, the algorithm could amplify and perpetuate embedded biases and generate even more bias-tainted data to feed a vicious cycle.

Because this technology is so new and lucrative, companies are extremely protective of their algorithms. The COMPAS system (Correctional Offender Management Profiling for Alternative Sanctions), created by Northpointe Inc., is the most widely used recidivism predictor in the legal system, yet no one knows what data set it draws from or how its algorithm generates a final score. We can assume the system looks at factors like age and previous offenses, but beyond that, the entire process is shrouded in mystery. Studies also suggest that recidivism algorithms are alarmingly inaccurate; Forrest notes that systems like COMPAS are incorrect around 30 to 40 percent of the time. This means that for every ten people COMPAS labels low-risk, three or four will eventually relapse into crime. Even with a high chance for error, recidivism scores are difficult to challenge in court. In a lucid editorial for the American Bar Association, Judge Noel L. Hillman explains that,

A predictive recidivism score may emerge oracle-like from an often-proprietary black box. Many, if not most, defendants, particularly those represented by public defenders and counsel appointed under the Criminal Justice Act because of indigency, will lack the resources, time, and technical knowledge to understand, probe, and challenge the AI process.

Judges may assume a score generated by AI is infallible, and change their ruling accordingly.

In his article, Hillman makes a reference to Loomis v. Wisconsin, a landmark case for recidivism algorithms. In 2016, Eric Loomis was arrested for driving a car that had been involved in a drive-by shooting. During sentencing, the judge tacked an additional six years onto his sentence due to his high COMPAS score. Loomis attempted to challenge the validity of the score, but the courts ultimately upheld Northpointe’s right to protect trade secrets and not reveal how the number had been reached. Though COMPAS scores aren’t currently admissible in court as evidence against a defendant, the judge in the Loomis case did take it into account during sentencing, which sets a dangerous precedent.

Even if we could predict a person’s future behavior with complete accuracy, replacing a judge with a computer would make an already dehumanizing process dystopian. Hillman argues that,

When done correctly, the sentencing process is more art than science. Sentencing requires the application of soft skills and intuitive insights that are not easily defined or even described. Sentencing judges are informed by experience and the adversarial process. Judges also are commanded to adjust sentences to avoid unwarranted sentencing disparity on a micro or case-specific basis that may differ from national trends.

In other words, attention to nuance is lost completely when defendants become data sets. The solution to racial bias isn’t to bring in artificial intelligence, but to strengthen our own empathy and sense of shared humanity, which will always produce more equitable rulings than AI can.

Stereotyping and Statistical Generalization

photograph of three different multi-colored pie charts

Let’s look at three different stories and use them to investigate statistical generalizations.

Story 1

This semester I’m teaching a Reasoning and Critical Thinking course. During the first class, I ran through various questions designed to show that human thinking is subject to predictable and systematic errors. Everything was going swimmingly. Most students committed the conjunction fallacy, ignored regression towards the mean, and failed the Wason selection task.

I then came to one of my favorite examples from Kahneman and Tversky: base rate neglect. I told the students that “Steve is very shy and withdrawn, invariably helpful but with little interest in people or in the world of reality. A meek and tidy soul, he has a need for order and structure, and a passion for detail,” and then asked how much more likely it is that Steve is a librarian than a farmer. Most students thought it was moderately more likely that Steve was a librarian.

Delighted with this result, I explained the mistake. While Steve is more representative of a librarian, you need to factor in base-rates to conclude he is more likely to actually be a librarian. In the U.S. there are about two million farmers and less than one hundred and fifty thousand librarians. Additionally, while 70% of farmers are male, only about 20% of librarians are. So for every one librarian named Steve you should assume there are at least forty-five farmers so named.

This culminated in my exciting reveal: even if you think that librarians are twenty times more likely than farmers to fit the personality sketch, you should still think Steve is more than twice as likely to be a farmer.
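
Here is the arithmetic behind that reveal, using the rough figures quoted above:

```python
# Worked base-rate calculation for "Steve," using the rough figures above.

farmers, librarians = 2_000_000, 150_000
male_farmers = 0.70 * farmers        # 1,400,000
male_librarians = 0.20 * librarians  #    30,000

prior_odds = male_farmers / male_librarians    # ~46.7 male farmers per male librarian
likelihood_ratio = 20                          # grant the sketch fits a librarian 20x better

posterior_odds = prior_odds / likelihood_ratio # ~2.3
print(f"Steve is about {posterior_odds:.1f}x more likely to be a farmer than a librarian")
```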

This is counter-intuitive, and I expected pushback. But then a student asked a question I had not anticipated. The student didn’t challenge my claim’s statistical legitimacy; he challenged its moral legitimacy. Wasn’t this a troubling generalization from gender stereotypes? And isn’t reasoning from stereotypes wrong?

It was a good question, and in the moment I gave an only so-so reply. I acknowledged that judging based on stereotypes is wrong, and then I…

(1) distinguished stereotypes proper from empirically informed statistical generalizations (explaining the psychological literature suggesting stereotypes are not statistical generalizations, but unquantified generics that the human brain attributes to intrinsic essences);

(2) explained how the most pernicious stereotypes are statistically misleading (e.g., we accept generic generalizations at low statistical frequencies about stuff we fear), and so would likely be weakened by explicit reasoning from rigorous base-rates rather than intuitive resemblances;

(3) and pointed out that racial disparities present in statistical generalizations act as important clarion calls for political reform.

I doubt my response satisfied every student — nor should it have. What I said was too simple. Acting on dubious stereotypes is often wrong, but acting on rigorous statistical generalizations can also be unjust. Consider a story recounted in Bryan Stevenson’s Just Mercy:

Story 2

“Once I was preparing to do a hearing in a trial court in the Midwest and was sitting at counsel table in an empty courtroom before the hearing. I was wearing a dark suit, white shirt, and tie. The judge and the prosecutor entered through a door in the back of the courtroom laughing about something.

When the judge saw me sitting at the defense table, he said to me harshly, ‘Hey, you shouldn’t be in here without counsel. Go back outside and wait in the hallway until your lawyer arrives.’

I stood up and smiled broadly. I said, ‘Oh, I’m sorry, Your Honor, we haven’t met. My name is Bryan Stevenson, I am the lawyer on the case set for hearing this morning.’

The judge laughed at his mistake, and the prosecutor joined in. I forced myself to laugh because I didn’t want my young client, a white child who had been prosecuted as an adult, to be disadvantaged by a conflict I had created with the judge before the hearing.”

This judge did something wrong. Because Bryan Stevenson is black, the judge assumed he was the defendant, not the defense. Now, I expect the judge acted on an implicit racist stereotype, but suppose the judge had instead reasoned from true statistical background data. It is conceivable that more of the Black people who enter that judge’s courtroom — even those dressed in suit and tie — are defendants than defense attorneys. Would shifting from stereotypes to statistics make the judge’s behavior ok?

No. The harm done had nothing to do with the outburst’s mental origins, whether it originated in statistics or stereotypes. Stevenson explains that what is destructive is the “accumulated insults and indignations caused by racial presumptions,” the burden of “constantly being suspected, accused, watched, doubted, distrusted, presumed guilty, and even feared.” This harm is present whether the judge acted on ill-formed stereotypes or statistically accurate knowledge of base-rates.

So, my own inference about Steve is not justified merely because it was grounded in a true statistical generalization. Still, I think I was right and the judge was wrong. Here is one difference between my inference and the judge’s. I didn’t act as though I knew Steve was a farmer — I just concluded it was more likely he was. The judge didn’t act the way he would if he thought it was merely likely Stevenson was the defendant. The judge acted as though he knew Stevenson was the defendant. But the statistical generalizations we are considering cannot secure such knowledge.

The knowledge someone is a defendant justifies different behavior than the thought someone is likely a defendant. The latter might justify politely asking Stevenson if he is the defense attorney. But the latter couldn’t justify the judge’s actual behavior, behavior unjustifiable unless the judge knows Stevenson is not an attorney (and dubious even then). A curious fact about ethics is that certain actions (like asserting or punishing a criminal) require, not just high subjective credence, but knowledge. And since mere statistical information cannot secure knowledge, statistical generalizations are unsuitable justifications for some actions.

Statistical disparities can justify some differential treatment. For instance, seeing that so few of the Black people in his courtroom are attorneys could justify the judge in funding mock trial programs only at majority Black public schools. Indeed, it might even justify the judge, in these situations, only asking Black people if they are new defense attorneys (and just assuming white people are). But it cannot justify behavior, like harsh chastisement, that requires knowledge the person did something wrong.

I didn’t do anything that required knowledge that Steve was a farmer. So does this mean I’m in the clear? Maybe. But let’s consider one final story from the recent news:

Story 3

Due to COVID-19 the UK canceled A-level exams — a primary determinant of UK college admissions. (If you’re unfamiliar with A-levels, they are sort of like really difficult subject-specific SAT exams.) The UK replaced the exams with a statistical generalization: it subjected the grades that teachers and schools submitted to a statistical normalization based on the historical performance of the student’s school. Why did Ofqual (the Office of Qualifications and Examinations Regulation) feel the need to normalize the results? Well, for one thing, the predicted grades that teachers submitted were 12% higher than last year’s scores (unsurprising without any external test to check teacher optimism).

The normalization, then, adjusted many scores downward. If the Ofqual predicted, based on historical data, that at least one student in a class would have failed the exam then the lowest scoring student’s grade was adjusted to that failing grade (irrespective of how well the teacher predicted the student would have done).
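
A much-simplified sketch of rank-based normalization conveys the basic idea (this is not Ofqual's actual algorithm, and the students and grade distribution below are invented):

```python
# Much-simplified sketch of rank-based normalization: students are ranked by
# their teacher-predicted grades, then assigned the school's historical grade
# distribution. Not Ofqual's actual algorithm; all data is invented.

teacher_predictions = {"Amira": "A", "Ben": "A", "Chloe": "B", "Dev": "B", "Ella": "C"}

# Historical distribution for this school: its five entrants in past years
# typically earned one A, two Bs, one C, and one fail (U).
historical_grades = ["A", "B", "B", "C", "U"]

ranked = sorted(teacher_predictions, key=lambda s: teacher_predictions[s])  # "A" < "B" < "C"
normalized = dict(zip(ranked, historical_grades))

print(normalized)
# {'Amira': 'A', 'Ben': 'B', 'Chloe': 'B', 'Dev': 'C', 'Ella': 'U'}
# Ella is assigned a fail because the school's history predicts one,
# regardless of how well her teacher expected her to do.
```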

Unsurprisingly, this sparked outrage and the UK walked back the policy. Students felt the system was unfair since they had no opportunity to prove they would have bucked the trend. Additionally, since wealthier schools tended to perform better on the A-levels in previous years, the downgrading hurt students in poorer schools at a higher rate.

Now, this feels unfair. (And since justifiability to the people matters for government policy, I think the government made the right choice in walking back the policy.) But was it actually unfair? And if so, why?

It’s not an issue of stereotypes — the changes weren’t based on hasty stereotypes, but rather on a reasonable statistical generalization. It’s not an issue of compounding algorithmic bias (of the sort described in O’Neil’s book), as the algorithm didn’t produce results more unequal than actual test results. Nor was the statistical generalization used in a way that requires knowledge. College admissions don’t assume we know one student is better than another. Rather, they use lots of data to make informed guesses about which students will be the best fit. The algorithm might sometimes misclassify, but so could any standardized test.

So what feels unfair? My hunch is the algorithm left no space for the exceptional. Suppose four friends who attended a historically poor performing school spent the last two years frantically studying together in a way no previous group had. Had they sat the test, all could have secured top grades — a first for the school. Unfortunately, they couldn’t all sit the test, and because their grades are normalized against previous years the algorithm eliminates their possibility of exceptional performance. (To be fair to the UK, they said students could sit the exams in the fall if they felt they could out-perform their predicted score).

But what is unfair about eliminating the possibility of exceptional success? My further hunch is that seeing someone as having the possibility of exceptional success is part of what it is to see them as an individual (perhaps for Kantian reasons of seeing someone as a free first cause of their own actions). Sure, we can accept that most people will be like most people. We can even be ok with wealthier schools, in the aggregate, consistently doing better on standardized tests. But we aren’t ok with removing the possibility for any individual to be an exception to the trend.

When my students resisted my claim that Steve was likely a farmer, they did not resist the generalization itself. They agreed most farmers are men and most librarians are women. But they were uncomfortable moving from that general ratio to a probabilistic judgment about the particular person, Steve. They seemed to worry that applying the generalization to Steve precluded seeing Steve as an exception.

While I think the students were wrong to think the worry applied in this case — factoring in base-rates doesn’t prevent the exceptional from proving their uniqueness — they might be right that there is a tension between seeing someone within a statistical generalization and seeing someone as an individual. It’s a possibility I should have recognized, and a further way acting on even good statistical generalizations might sometimes be wrong.

Racist, Sexist Robots: Prejudice in AI

Black and white photograph of two robots with computer displays

The stereotype of robots and artificial intelligence in science fiction is largely of a hyper-rational being, unafflicted by the emotions and social infirmities like biases and prejudices that impair us weak humans. However, there is reason to revise this picture. The more progress we make with AI the more a particular problem comes to the fore: the algorithms keep reflecting parts of our worst selves back to us.

In 2017, research showed compelling evidence that AI picks up deeply ingrained racial- and gender-based prejudices. Current machine learning techniques rely on algorithms interacting with people in order to better predict correct responses over time. Because of the dependence on interacting with humans for standards of correctness, the algorithms cannot detect when bias informs a correct response or when the human is engaging in a non-prejudicial way. Thus, the best working AI algorithms pick up the racist and sexist underpinnings of our society. Some examples: the words “female” and “woman” were more closely associated with arts and humanities occupations and with the home, while “male” and “man” were closer to maths and engineering professions. Europeans were associated with pleasantness and excellence.
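
The kind of measurement behind findings like these can be sketched simply: compare how close, in a word-embedding space, gendered words sit to two sets of attribute words. The vectors below are random placeholders, so the script runs but proves nothing on its own; reproducing the 2017 result requires a real pretrained embedding such as GloVe or word2vec:

```python
# Sketch of the association measurement behind the 2017 findings: compare
# cosine similarity of target words to two attribute sets. Placeholder
# random vectors stand in for a real pretrained embedding (e.g., GloVe),
# which is what the original studies used.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["woman", "man", "art", "poetry", "math", "engineering"]
embeddings = {w: rng.normal(size=50) for w in vocab}   # placeholder vectors

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def association(target, set_a, set_b):
    """Mean similarity to set_a minus mean similarity to set_b."""
    t = embeddings[target]
    return (np.mean([cosine(t, embeddings[w]) for w in set_a])
            - np.mean([cosine(t, embeddings[w]) for w in set_b]))

# With real embeddings, "woman" tends to score higher toward the arts set
# and "man" toward the math/engineering set.
print(association("woman", ["art", "poetry"], ["math", "engineering"]))
print(association("man",   ["art", "poetry"], ["math", "engineering"]))
```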

In order to prevent discrimination in housing, credit, and employment, Facebook has recently been forced to agree to an overhaul of its ad-targeting algorithms. The functions that determined how to target audiences for ads relating to these areas turned out to be racially discriminatory – not by design, since the designers of the algorithms certainly didn’t encode racial prejudices, but because of the way they are implemented. The associations learned by the ad-targeting algorithms led to disparities in the advertising of major life resources. It is not enough to program a “neutral” machine learning algorithm (i.e., one that doesn’t begin with biases). As Facebook learned, the AI must have anti-discrimination parameters built in as well. Characterizing just what this amounts to will be an ongoing conversation. For now, the ad-targeting algorithms cannot take age, zip code, gender, or other legally protected categories into consideration.

The issue facing AI is similar to the “wrong kind of reasons” problem in philosophy of action. The AI can’t tell a systemic human bias from a reasoned consensus: both make us converge on an answer, and both lead the algorithm to select what we converge on. It is difficult to say what, in principle, the difference between a systemic bias and a reasoned consensus is. It is difficult, in other words, to give the machine learning instrument parameters to tell when there is the “right kind of reason” supporting a response and when there is the “wrong kind of reason” supporting it.

In philosophy of action, the difficulty of drawing this distinction is illustrated by a case where, for instance, you are offered $50,000 to (sincerely) believe that grass is red. You have a reason to believe, but intuitively this is the wrong kind of reason. Similarly, we could imagine a case where you will be punished unless you (sincerely) desire to eat glass. The offer of money doesn’t show that “grass is red” is true, and the threat doesn’t show that eating glass is choice-worthy. But each somehow promotes the belief or desire. For the AI, a racist or sexist bias leads to a reliable response in the way that the offer and threat promote a behavior – it is disconnected from a “good” response, but it’s the answer to go with.

For International Women’s Day, Jeanette Winterson suggested that artificial intelligence may have a significantly detrimental effect on women. Women make up 18% of computer science graduates and thus are largely left out of the design and direction of this new horizon of human development. This exclusion can exacerbate the prejudices inherent in the design of algorithms that will only become more critical to more arenas of life.

The Persistent Problem of the Fair Algorithm

photograph of a keyboard and screen displaying code


At first glance, it might appear that the mechanical procedures we use to accomplish such mundane tasks as loan approval, medical triage, actuarial assessment, and employment screening are innocuous. Designing algorithms to process large chunks of data and transform various individual data points into a single output offers great power in streamlining necessary but burdensome work. Algorithms advise us about how we should read the data and how we should respond. In some cases, they even decide the matter for us.

It isn’t simply that these automated processes are more efficient than humans at performing these computations (emphasizing the relevant data points, removing statistical outliers and anomalies, and weighing competing factors). Algorithms also hold the promise of removing human error from the equation. A recent study, for example, has identified a tendency for judges on parole boards to become less and less lenient in their sentencing as the day wears on. By removing extraneous elements like these from the decision-making process, an algorithm might be better positioned to deliver true justice.

Similarly, another study established the general superiority of mechanical prediction to clinical prediction in various settings from medicine to mental health to education. Humans were most notably outperformed when a one-on-one interview was conducted. These findings reinforce the position that algorithms should augment (or perhaps even replace) human decision-making, which is often plagued by prejudice and swayed by sentiment.

But despite their great promise, algorithms carry a number of concerns. Chief among these are problems of bias and transparency. Often seen as free from bias, algorithms stand as neutral arbiters, capable of combating long-standing inequalities such as the gender pay-gap or unequal sentencing for minority offenders. But automated tools can just as easily preserve and fortify existing inequalities when introduced to an already discriminatory system. Algorithms used in assigning bond amounts and sentencing underestimated the risk of white defendants while overestimating that of Black defendants. Popular image-recognition software reflects significant gender bias. Such processes mirror and thus reinforce extant social bias. The algorithm simply tracks, learns, and then reproduces the patterns that it sees.

Bias can be the result of a non-representative sample size that is too small or too homogenous. But bias can also be the consequence of the kind of data that the algorithm draws on to make its inferences. While discrimination laws are designed to restrict the use of protected categories like age, race, or sex, an algorithm might learn to use a proxy, like zip codes, that produces equally skewed outcomes.
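
A toy example shows how a proxy can reproduce the very pattern anti-discrimination law forbids. Everything below, the zip codes, approval rates, and applicants, is fabricated for illustration:

```python
# Sketch: a model that never sees race but keys on zip code can still
# reproduce a discriminatory pattern, because zip code correlates with
# group membership and the historical approval data already encodes
# past bias. All data is fabricated for illustration.

# Historical approval rates per zip code (the "training data"),
# shaped by decades of discriminatory lending.
historical_approval_rate = {"10001": 0.80, "10002": 0.35}

# Residential segregation makes zip code a near-proxy for group membership.
applicants = [
    {"name": "A", "group": "white", "zip": "10001"},
    {"name": "B", "group": "Black", "zip": "10002"},
]

def approve(applicant) -> bool:
    # The "race-blind" rule: approve if the zip's historical rate clears 0.5.
    return historical_approval_rate[applicant["zip"]] > 0.5

for a in applicants:
    print(a["name"], a["group"], approve(a))
# Otherwise identical applicants receive different outcomes purely via the proxy.
```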

Similarly, predictive policing — which uses algorithms to predict where a crime is likely to occur and determine how to best deploy police resources — has been criticized as “enabl[ing], or even justify[ing], a high-tech version of racial profiling.” Predictive policing creates risk profiles for individuals on the basis of age, employment history, and social affiliations, but it also creates risk profiles for locations. Feeding the algorithm information which is itself race- and class-based creates a self-fulfilling prophecy whereby continued investigation of Black citizens in urban areas leads to a disproportionate number of arrests. A related worry is that tying police patrol to areas with the highest incidence of reported crime grants less police protection to neighborhoods with large immigrant populations, as foreign-born citizens and non-US citizens are less likely to report crimes.

These concerns of discrimination and bias are further complicated by issues of transparency. The very function the algorithm was meant to serve — computing multiple variables in a way that surpasses human ability — inhibits oversight. It is the algorithm itself which determines how best to model the data and what weights to attach to which factors. The complexity of the computation as well as the use of unsupervised learning — where the algorithm processes data autonomously, as opposed to receiving labelled inputs from a designer — may mean that the human operator cannot parse the algorithm’s rationale and that it will always remain opaque. Given the impenetrable nature of the decision-mechanism, it will be difficult to determine when predictions objectionably rely on group affiliation to render verdicts and who should be accountable when they do.

Related to these concerns of oversight are questions of justification: What are we owed in terms of an explanation when we are denied bail, declined for a loan, refused admission to a university, or passed over for a job interview? How much should an algorithm’s owner need to be able to say to justify the algorithm’s decision and what do we have a right to know? One suggestion is that individuals are owed “counterfactual explanations” which highlight the relevant data points that led to the determination and offer ways in which one might change the decision. While this justification would offer recourse, it would not reveal the relative weights the algorithm places on the data nor would a justification be offered for which data points an algorithm considers relevant.
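
To make the idea of a counterfactual explanation concrete, consider a toy linear loan-scoring model: the explanation reports how much each feature would have to change to flip the decision. The weights, threshold, and applicant below are invented:

```python
# Sketch of a counterfactual explanation for a toy linear loan model: find,
# for each feature, the change that would flip a denial into an approval.
# Weights, threshold, and applicant data are invented for illustration.

weights = {"income": 0.4, "credit_score": 0.5, "debt": -0.6}
threshold = 50.0

def score(applicant: dict) -> float:
    return sum(weights[f] * applicant[f] for f in weights)

def counterfactual(applicant: dict) -> dict:
    """For each feature, how much it would have to change to reach the threshold."""
    gap = threshold - score(applicant)
    return {f: gap / w for f, w in weights.items()}

applicant = {"income": 40, "credit_score": 55, "debt": 10}   # score = 37.5 -> denied
print(score(applicant) >= threshold)                          # False
print(counterfactual(applicant))
# e.g., "raise credit_score by 25" or "reduce debt by ~20.8" would flip the outcome.
# The explanation offers recourse without disclosing the relative weights themselves.
```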

These problems concerning discrimination and transparency share a common root. At bottom, there is no mechanical procedure which would generate an objective standard of fairness. Invariably, the determination of that standard will require the deliberate assignation of different weights to competing moral values: What does it mean to treat like cases alike? To what extent should group membership determine one’s treatment? How should we balance public good and individual privacy? Public safety and discrimination? Utility and individual right?

In the end, our use of algorithms cannot sidestep the task of defining fairness. It cannot resolve these difficult questions, and is not a surrogate for public discourse and debate.

Workplace Diversity: A Numbers Game

Anyone who has applied for a job is likely familiar with the stress it can bring. Governed by unspoken rules and guidelines that at times seem arbitrary, the hiring process has traditionally been seen as an anxiety-producing but necessary part of starting a career. For some, however, this process is stressful for an entirely different reason: the fear of discrimination by employers. How, then, should the process be reformed to provide a more equitable environment?
