Automation in the Courtroom: On Algorithms Predicting Crime

[Photograph of the defense's materials on a table in a courtroom]

From facial recognition software to the controversial robotic “police dogs,” artificial intelligence is becoming an increasingly prominent aspect of the legal system. AI even allocates police resources to different neighborhoods, determining how many officers are needed in certain areas based on crime statistics. But can algorithms determine the likelihood that someone will commit a crime, and if they can, is it ethical to use this technology to sentence individuals to prison?

Algorithms that attempt to predict recidivism (the likelihood that a criminal will commit future offenses) sift through data to produce a recidivism score, which ostensibly indicates the risk a person poses to their community. As Karen Hao explains for the MIT Technology Review,

The logic for using such algorithmic tools is that if you can accurately predict criminal behavior, you can allocate resources accordingly, whether for rehabilitation or for prison sentences. In theory, it also reduces any bias influencing the process, because judges are making decisions on the basis of data-driven recommendations and not their gut.

Human error and racial bias contribute to over-incarceration, so researchers are hoping that color-blind computers can make better choices for us.

But in her book When Machines Can Be Judge, Jury, and Executioner: Justice in the Age of Artificial Intelligence, former judge Katherine B. Forrest explains that Black offenders are far more likely to be labeled high-risk by algorithms than their white counterparts, a fact which further speaks to the well-documented racial bias of algorithms. As Hao reminds us,

populations that have historically been disproportionately targeted by law enforcement—especially low-income and minority communities—are at risk of being slapped with high recidivism scores. As a result, the algorithm could amplify and perpetuate embedded biases and generate even more bias-tainted data to feed a vicious cycle.

Because this technology is so new and lucrative, companies are extremely protective of their algorithms. The COMPAS system (Correctional Offender Management Profiling for Alternative Sanctions), created by Northpointe Inc., is the most widely used recidivism predictor in the legal system, yet no one knows what data set it draws from or how its algorithm generates a final score. We can assume the system looks at factors like age and previous offenses, but beyond that, the entire process is shrouded in mystery. Studies also suggest that recidivism algorithms are alarmingly inaccurate; Forrest notes that systems like COMPAS are wrong roughly 30 to 40 percent of the time, which means that a substantial share of the people labeled low-risk will eventually relapse into crime, while many labeled high-risk never will. Even with a high chance of error, recidivism scores are difficult to challenge in court. In a lucid editorial for the American Bar Association, Judge Noel L. Hillman explains that,

A predictive recidivism score may emerge oracle-like from an often-proprietary black box. Many, if not most, defendants, particularly those represented by public defenders and counsel appointed under the Criminal Justice Act because of indigency, will lack the resources, time, and technical knowledge to understand, probe, and challenge the AI process.

Judges may assume a score generated by AI is infallible, and change their ruling accordingly.
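Forrest's accuracy figures and Hillman's black-box worry compound one another: a single error rate tells a judge very little about what a particular label means. A minimal, purely hypothetical calculation (the counts below are invented for illustration and are not COMPAS data) shows how the same overall accuracy can hide very different error rates for the two labels:

```python
# Hypothetical illustration only: these counts are invented, not COMPAS data.
# It shows how "wrong about 35% of the time" splits into two different error rates.

cohort = {
    # (true outcome, label assigned): people in an imagined cohort of 1,000
    ("reoffended", "high-risk"): 250,        # correctly flagged
    ("did not reoffend", "high-risk"): 200,  # wrongly flagged
    ("reoffended", "low-risk"): 150,         # wrongly cleared
    ("did not reoffend", "low-risk"): 400,   # correctly cleared
}

total = sum(cohort.values())
correct = cohort[("reoffended", "high-risk")] + cohort[("did not reoffend", "low-risk")]
accuracy = correct / total  # 0.65, i.e. "wrong 35% of the time"

low_risk = cohort[("reoffended", "low-risk")] + cohort[("did not reoffend", "low-risk")]
reoffend_among_low_risk = cohort[("reoffended", "low-risk")] / low_risk               # ~27%

high_risk = cohort[("reoffended", "high-risk")] + cohort[("did not reoffend", "high-risk")]
never_reoffend_among_high_risk = cohort[("did not reoffend", "high-risk")] / high_risk  # ~44%

print(f"overall accuracy: {accuracy:.0%}")
print(f"people labeled low-risk who reoffend: {reoffend_among_low_risk:.0%}")
print(f"people labeled high-risk who never reoffend: {never_reoffend_among_high_risk:.0%}")
```

A judge told only that the tool is roughly 65 percent accurate has no way of knowing which of these failure modes is in play, and the proprietary black box prevents anyone from checking.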

In his article, Hillman references Loomis v. Wisconsin, a landmark case for recidivism algorithms. In 2013, Eric Loomis was arrested for driving a car that had been involved in a drive-by shooting. At sentencing, the judge cited Loomis's high COMPAS score in handing down a six-year prison term. Loomis challenged the validity of the score, but the courts ultimately upheld Northpointe's right to protect its trade secrets and not reveal how the number had been reached. Though COMPAS scores aren't admissible as evidence of a defendant's guilt, the judge in the Loomis case did take the score into account during sentencing, which sets a dangerous precedent.

Even if we could predict a person’s future behavior with complete accuracy, replacing a judge with a computer would make an already dehumanizing process dystopian. Hillman argues that,

When done correctly, the sentencing process is more art than science. Sentencing requires the application of soft skills and intuitive insights that are not easily defined or even described. Sentencing judges are informed by experience and the adversarial process. Judges also are commanded to adjust sentences to avoid unwarranted sentencing disparity on a micro or case-specific basis that may differ from national trends.

In other words, attention to nuance is lost completely when defendants become data sets. The solution to racial bias isn’t to bring in artificial intelligence, but to strengthen our own empathy and sense of shared humanity, which will always produce more equitable rulings than AI can.

The Persistent Problem of the Fair Algorithm

[Photograph of a keyboard and screen displaying code]



At first glance, it might appear that the mechanical procedures we use to accomplish such mundane tasks as loan approval, medical triage, actuarial assessment, and employment screening are innocuous. Designing algorithms to process large chunks of data and transform various individual data points into a single output offers great power to streamline necessary but burdensome work. Algorithms advise us about how we should read the data and how we should respond. In some cases, they even decide the matter for us.

It isn’t simply that these automated processes are more efficient than humans at performing these computations (emphasizing the relevant data points, removing statistical outliers and anomalies, and weighing competing factors). Algorithms also hold the promise of removing human error from the equation. A recent study, for example, identified a tendency for judges on parole boards to become less and less lenient in their decisions as the day wears on. By removing extraneous elements like these from the decision-making process, an algorithm might be better positioned to deliver true justice.

Similarly, another study established the general superiority of mechanical prediction to clinical prediction in various settings from medicine to mental health to education. Humans were most notably outperformed when a one-on-one interview was conducted. These findings reinforce the position that algorithms should augment (or perhaps even replace) human decision-making, which is often plagued by prejudice and swayed by sentiment.

But despite their great promise, algorithms raise a number of concerns. Chief among these are problems of bias and transparency. Because they are often seen as free from bias, algorithms are cast as neutral arbiters, capable of combating long-standing inequalities such as the gender pay gap or unequal sentencing for minority offenders. But automated tools can just as easily preserve and fortify existing inequalities when introduced to an already discriminatory system. Algorithms used in assigning bond amounts and sentencing have underestimated the risk of white defendants while overestimating that of Black defendants. Popular image-recognition software reflects significant gender bias. Such processes mirror and thus reinforce extant social bias. The algorithm simply tracks, learns, and then reproduces the patterns that it sees.

Bias can be the result of a non-representative sample, one that is too small or too homogeneous. But bias can also be a consequence of the kind of data the algorithm draws on to make its inferences. While discrimination laws are designed to restrict the use of protected categories like age, race, or sex, an algorithm might learn to use a proxy, like zip code, that produces equally skewed outcomes.
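The proxy problem is easy to reproduce in miniature. The sketch below is a hedged illustration built entirely on synthetic data (the feature names, correlations, and model are invented, and no real system is being modeled): a classifier that is never shown the protected attribute still flags the two groups at very different rates, because zip code effectively encodes group membership.

```python
# Minimal synthetic sketch: a model that never sees the protected attribute can
# still reproduce group disparities when another feature (here, zip code) is
# strongly correlated with group membership. All of the data are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

group = rng.integers(0, 2, n)                                # protected attribute, never given to the model
zip_code = np.where(rng.random(n) < 0.9, group, 1 - group)   # 90% aligned with group
prior_arrests = rng.poisson(1.0 + 1.5 * group)               # the over-policed group has more recorded arrests

# Historical "high-risk" labels that already encode biased enforcement patterns.
label = (rng.random(n) < 0.2 + 0.4 * group).astype(int)

X = np.column_stack([zip_code, prior_arrests])               # note: `group` itself is excluded
model = LogisticRegression().fit(X, label)
flagged = model.predict(X)

for g in (0, 1):
    print(f"share flagged high-risk, group {g}: {flagged[group == g].mean():.0%}")
# Even though the protected attribute is excluded, the flag rates differ sharply,
# because zip code and arrest history carry the same information.
```

Nothing in the sketch uses the protected attribute directly; the disparity comes entirely from historical labels and a correlated stand-in, which is exactly the dynamic that makes legal prohibitions on protected categories insufficient on their own.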

Similarly, predictive policing — which uses algorithms to predict where a crime is likely to occur and determine how to best deploy police resources — has been criticized as “enabl[ing], or even justify[ing], a high-tech version of racial profiling.” Predictive policing creates risk profiles for individuals on the basis of age, employment history, and social affiliations, but it also creates risk profiles for locations. Feeding the algorithm information which is itself race- and class-based creates a self-fulfilling prophecy whereby continued investigation of Black citizens in urban areas leads to a disproportionate number of arrests. A related worry is that tying police patrol to areas with the highest incidence of reported crime grants less police protection to neighborhoods with large immigrant populations, as foreign-born citizens and non-US citizens are less likely to report crimes.
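That self-fulfilling dynamic can be made vivid with a deliberately simple simulation (a hedged sketch with invented numbers, not a model of any deployed predictive-policing product): two neighborhoods have identical underlying crime, but recorded incidents depend on how many patrols are present, and patrols are allocated according to the records.

```python
# Toy feedback loop with invented numbers: patrols are sent where incidents were
# recorded, but incidents are only recorded where patrols were present.
true_crime_rate = [10.0, 10.0]   # two neighborhoods with identical underlying crime
patrols = [6.0, 4.0]             # a slightly uneven starting allocation of 10 patrols
discovery_per_patrol = 0.08      # share of crimes that a single patrol ends up recording

recorded_history = [0.0, 0.0]
for year in range(10):
    recorded = [rate * discovery_per_patrol * p
                for rate, p in zip(true_crime_rate, patrols)]
    recorded_history = [h + r for h, r in zip(recorded_history, recorded)]
    total = sum(recorded_history)
    # Next year's ten patrols are split in proportion to the recorded history.
    patrols = [10.0 * h / total for h in recorded_history]

print("cumulative recorded incidents:", [round(h, 1) for h in recorded_history])
print("final patrol allocation:      ", [round(p, 1) for p in patrols])
# Crime is identical in both neighborhoods, yet the allocation simply locks in
# the initial 6-to-4 skew: the records end up justifying the deployment that
# produced them.
```

The data never lie, exactly, but they only ever describe where the police were looking, and the allocation rule treats that as a fact about the neighborhoods themselves.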

These concerns of discrimination and bias are further complicated by issues of transparency. The very function the algorithm was meant to serve — computing multiple variables in a way that surpasses human ability — inhibits oversight. It is the algorithm itself which determines how best to model the data and what weights to attach to which factors. The complexity of the computation as well as the use of unsupervised learning — where the algorithm processes data autonomously, as opposed to receiving labelled inputs from a designer — may mean that the human operator cannot parse the algorithm’s rationale and that it will always remain opaque. Given the impenetrable nature of the decision-mechanism, it will be difficult to determine when predictions objectionably rely on group affiliation to render verdicts and who should be accountable when they do.

Related to these concerns of oversight are questions of justification: What are we owed in terms of an explanation when we are denied bail, declined for a loan, refused admission to a university, or passed over for a job interview? How much should an algorithm’s owner need to be able to say to justify the algorithm’s decision, and what do we have a right to know? One suggestion is that individuals are owed “counterfactual explanations,” which highlight the relevant data points that led to the determination and offer ways in which one might change the decision. While this justification would offer recourse, it would not reveal the relative weights the algorithm places on the data, nor would it explain why the algorithm treats those data points as relevant in the first place.
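What a counterfactual explanation does, and does not, reveal can be seen with a toy linear scoring model (a hedged sketch; the feature names, weights, and threshold below are invented for illustration and do not describe any real lender's system):

```python
# Toy counterfactual explanation for an invented linear "loan score" model.
# The feature names, weights, and threshold are illustrative only.
weights = {"income_k": 0.8, "late_payments": -5.0, "years_employed": 1.5}
threshold = 50.0

def score(applicant):
    return sum(weights[k] * applicant[k] for k in weights)

def counterfactual(applicant):
    """For each feature, the change in that feature alone that would clear the threshold."""
    gap = threshold - score(applicant)
    return {k: round(gap / w, 2) for k, w in weights.items() if w != 0}

applicant = {"income_k": 40, "late_payments": 3, "years_employed": 2}
print("score:", score(applicant), "approved:", score(applicant) >= threshold)
print("changes that would flip the decision:", counterfactual(applicant))
# Output: the applicant is told that 37.5k more income or 6 fewer late payments
# would have changed the outcome (recourse of a sort), but the weights and the
# threshold themselves stay hidden.
```

The applicant learns which single change would have flipped the outcome, but not how the factors are weighted, why those factors were chosen, or whether the suggested change is even feasible, which is precisely the limitation noted above.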

These problems concerning discrimination and transparency share a common root. At bottom, there is no mechanical procedure that can generate an objective standard of fairness. Invariably, determining that standard will require deliberately assigning different weights to competing moral values: What does it mean to treat like cases alike? To what extent should group membership determine one’s treatment? How should we balance public good and individual privacy? Public safety and non-discrimination? Utility and individual rights?
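One way to see that no mechanical procedure can settle this is a toy calculation in the spirit of the well-known debates over recidivism scores (the counts below are invented, not drawn from any real study): when two groups have different base rates of the predicted outcome, a label that is equally reliable when acted on for both groups will generally burden their innocent members unequally, so one fairness criterion has to give way to another.

```python
# Toy illustration with invented counts: when base rates differ, a risk label
# that is equally "precise" for two groups cannot also give them equal false
# positive rates, so one fairness criterion has to give way.

def rates(tp, fp, fn, tn):
    precision = tp / (tp + fp)   # of those flagged high-risk, how many reoffend
    fpr = fp / (fp + tn)         # of those who never reoffend, how many were flagged
    base_rate = (tp + fn) / (tp + fp + fn + tn)
    return base_rate, precision, fpr

groups = {
    #           tp,  fp,  fn,  tn   (hypothetical cohorts of 1,000 people each)
    "group A": (300, 200, 200, 300),
    "group B": (180, 120, 120, 580),
}

for name, counts in groups.items():
    base, prec, fpr = rates(*counts)
    print(f"{name}: base rate {base:.0%}, precision of the flag {prec:.0%}, "
          f"false positive rate {fpr:.0%}")

# Both groups get a flag that is right 60% of the time when used, yet innocent
# members of group A are flagged at 40% versus 17% for group B. Equalizing the
# false positive rates instead would break the equal precision.
```

Which of those two notions of equal treatment should prevail is a moral question, not a statistical one, and that is the sense in which the standard of fairness cannot be generated mechanically.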

In the end, our use of algorithms cannot sidestep the task of defining fairness. It cannot resolve these difficult questions, and is not a surrogate for public discourse and debate.