Q: Are AI tools used to make sentencing decisions in UK courts?

AI risk-scoring tools are not used directly in sentencing in courts in England and Wales in the way they have been in some US jurisdictions. However, algorithmic risk assessment tools such as Durham Constabulary's HART have been used at the custody stage to inform decisions about diversion, bail, and remand, and risk scores generated by the Offender Assessment System (OASys) appear in the pre-sentence reports that judges read before passing sentence, creating an indirect influence on sentencing outcomes.

AI in Courts and Sentencing

Q: What is COMPAS and why is it controversial?

COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) is a risk assessment tool widely used in the United States to inform bail, sentencing, and parole decisions. A 2016 investigation by ProPublica found that COMPAS was nearly twice as likely to incorrectly flag Black defendants as future criminals compared with white defendants, and nearly twice as likely to incorrectly label white defendants as low risk. The tool's vendor, Equivant (formerly Northpointe), disputed the methodology, but the case became a touchstone in the global debate about algorithmic bias in criminal justice.

Q: Does using an AI risk score in court breach a defendant's right to a fair trial?

This is an unsettled legal question. In the US case Loomis v Wisconsin (2016), the Wisconsin Supreme Court held that using COMPAS in sentencing did not violate due process, provided the judge did not treat the score as determinative. In the UK, Article 6 of the European Convention on Human Rights guarantees the right to a fair trial, which includes the right to examine the evidence used against a defendant. Whether an opaque algorithmic score used to influence a custodial decision satisfies Article 6's requirements has not been definitively resolved.

Risk assessment tools — what they are and how they work

Algorithmic risk assessment tools in criminal justice produce a numerical score or categorical rating — typically low, medium, or high — that purports to estimate the likelihood that an individual will reoffend, fail to attend court, or pose a risk of harm. They generate this score by comparing a set of data about an individual against patterns observed in historical datasets of people who did or did not reoffend. The inputs typically include factors such as age, prior criminal history, accommodation status, employment, and in some tools, characteristics of family members or social associates. The score is intended to assist decision-makers — custody officers, prosecutors, judges, parole boards — rather than to make decisions autonomously, though the practical influence of a score on a human decision-maker is difficult to measure and likely to be significant.
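The scoring step described above can be sketched in a few lines of Python. This is an illustrative toy only: the feature names, weights, intercept, and band thresholds are invented for the example and are not taken from COMPAS, HART, OASys, or any other real tool. It assumes a simple logistic model, which is one common way to turn a weighted set of inputs into a probability.

```python
import math

# Hypothetical weights, invented for illustration. A real tool would fit
# its parameters to a historical dataset of reoffending outcomes.
WEIGHTS = {
    "prior_convictions": 0.35,
    "age_under_25": 0.60,
    "unstable_accommodation": 0.40,
    "unemployed": 0.30,
}
INTERCEPT = -2.0

# Hypothetical cut-offs that turn a probability into a categorical band.
BANDS = [(0.30, "low"), (0.60, "medium")]


def risk_probability(person: dict) -> float:
    """Logistic model: a weighted sum of inputs squashed into the range 0 to 1."""
    z = INTERCEPT + sum(WEIGHTS[k] * person.get(k, 0) for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def risk_band(probability: float) -> str:
    """Map a probability onto the low / medium / high rating shown to a decision-maker."""
    for upper, label in BANDS:
        if probability < upper:
            return label
    return "high"


# Usage: two prior convictions and aged under 25, no other factors recorded.
example = {"prior_convictions": 2, "age_under_25": 1}
print(risk_band(risk_probability(example)))  # medium
```

Even in this toy, the rating depends entirely on design choices (which inputs are included, how they are weighted, where the band boundaries sit) that the person being scored never sees.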

Two distinct categories of tool are often conflated in coverage of this area, and the distinction matters. Risk of reoffending tools attempt to predict whether an individual is likely to commit further offences, and are used most often in pre-sentence reports, parole hearings, and community supervision decisions. Risk of harm tools focus specifically on the likelihood that an individual will cause serious harm to others, and feed into decisions about whether to remand in custody, impose licence conditions, or refer to mental health services. The evidential basis for the two categories is different, the appropriate use cases are different, and the consequences of error are different. Treating them as interchangeable, as some coverage and some policy documents do, creates confusion that obscures the specific risks associated with each.

COMPAS and the American experience

No single episode has shaped the global conversation about AI in criminal justice more than ProPublica's 2016 investigation into COMPAS — the Correctional Offender Management Profiling for Alternative Sanctions tool produced by the company then known as Northpointe and subsequently rebranded as Equivant. COMPAS was used extensively across the United States to inform bail, sentencing, and parole decisions. ProPublica's analysis of COMPAS scores assigned to more than seven thousand defendants in Broward County, Florida, found that the tool was nearly twice as likely to incorrectly flag Black defendants as future criminals compared with white defendants, and nearly twice as likely to incorrectly label white defendants as low risk.

The vendor disputed the methodology, arguing that ProPublica had applied one definition of fairness (equal false positive rates across racial groups) that was mathematically incompatible with another standard definition (equal predictive accuracy across groups, meaning that a given score corresponds to the same likelihood of reoffending whatever the defendant's race), and that COMPAS satisfied the latter standard even if it did not satisfy the former. This turned out to be not a minor technical quibble but a genuinely deep point about the nature of algorithmic fairness: several computer scientists demonstrated formally that these definitions cannot be satisfied simultaneously when the groups being compared have different underlying reoffending rates and the tool is less than perfectly accurate. The COMPAS controversy effectively launched an academic field, algorithmic fairness in machine learning, that continues to produce important research, very little of which has translated into regulatory requirements on tool vendors.
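The incompatibility can be checked with simple arithmetic. The sketch below uses invented numbers, not the Broward County figures: it assumes two hypothetical groups with different underlying reoffending rates and a tool that is equally "accurate" for both in the vendor's sense (the same share of high-risk flags turn out to be correct, and the same share of actual reoffenders are caught). The function name and all values are assumptions made for illustration.

```python
def false_positive_rate(base_rate: float, ppv: float, tpr: float) -> float:
    """False positive rate implied by a group's base rate and the tool's accuracy.

    base_rate: share of the group who go on to reoffend
    ppv:       share of high-risk flags that are correct (predictive value)
    tpr:       share of actual reoffenders who are flagged high risk
    """
    true_pos = tpr * base_rate                # flagged and did reoffend
    false_pos = true_pos * (1 - ppv) / ppv    # flagged but did not reoffend
    return false_pos / (1 - base_rate)        # as a share of all non-reoffenders


# Same predictive value and same detection rate for both hypothetical groups.
PPV, TPR = 0.6, 0.7

fpr_a = false_positive_rate(base_rate=0.5, ppv=PPV, tpr=TPR)
fpr_b = false_positive_rate(base_rate=0.3, ppv=PPV, tpr=TPR)

print(f"Group A false positive rate: {fpr_a:.3f}")  # 0.467
print(f"Group B false positive rate: {fpr_b:.3f}")  # 0.200
```

With predictive value held equal, the group with the higher base rate necessarily has the higher false positive rate. Equalising the false positive rates instead would force the predictive values apart, which is the trade-off at the centre of the dispute.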

The legal challenge to COMPAS came in Loomis v Wisconsin (2016), where the Wisconsin Supreme Court considered whether sentencing a defendant partly on the basis of a COMPAS score violated his due process rights, given that the score was generated by a proprietary algorithm whose methodology he could not examine. The court held that it did not, provided that the sentencing judge did not treat the score as determinative. The ruling was widely criticised by legal academics as insufficiently protective of defendants' rights, and the US Supreme Court declined to hear an appeal.

The UK context: HART and custody decisions

Durham Constabulary's Harm Assessment Risk Tool — HART — became the most discussed UK example of algorithmic risk assessment in criminal justice when it was developed in partnership with Cambridge University's Institute of Criminology and deployed operationally from around 2017. The tool categorises suspects at the custody stage into low, medium, or high risk of serious further offending, with the intention of informing decisions about whether to offer diversion programmes, impose bail conditions, or remand in custody.

A 2019 review by the Oxford Internet Institute raised a number of concerns about HART that have become touchstones in the UK policy debate. The review found that the model used postcode as a proxy variable — meaning that where an individual lived influenced their risk score. Since postcode correlates with socioeconomic deprivation, and deprivation correlates with race in many UK communities, the use of postcode as an input introduced a form of indirect discrimination that the Public Sector Equality Duty arguably required the force to address. Durham subsequently modified the tool to remove postcode as a direct input, though critics questioned whether the underlying socioeconomic factors it captured were adequately excluded by that change.
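The proxy mechanism can be illustrated with a deliberately tiny, entirely synthetic example. Nothing here reflects HART's actual model or data: the area names, group labels, and per-area scores are all hypothetical. The point is only that a scoring function which never sees a protected characteristic can still produce different average scores for different groups when those groups are unevenly distributed across areas.

```python
from statistics import mean

# Entirely synthetic records: (postcode_area, group). The group label is
# never passed to the scoring function; it is kept only to audit the results.
RECORDS = [
    ("AREA_1", "X"), ("AREA_1", "X"), ("AREA_1", "X"), ("AREA_1", "Y"),
    ("AREA_2", "X"), ("AREA_2", "Y"), ("AREA_2", "Y"), ("AREA_2", "Y"),
]

# Hypothetical scores a model might have learned per area from historical data.
AREA_SCORE = {"AREA_1": 0.7, "AREA_2": 0.3}


def score(postcode_area: str) -> float:
    """Score an individual using postcode area alone."""
    return AREA_SCORE[postcode_area]


def mean_score_by_group(records):
    """Audit step: average the scores separately for each group."""
    by_group = {}
    for area, group in records:
        by_group.setdefault(group, []).append(score(area))
    return {group: mean(scores) for group, scores in by_group.items()}


for group, avg in sorted(mean_score_by_group(RECORDS).items()):
    print(f"Group {group}: mean score {avg:.2f}")
# Group X: mean score 0.60
# Group Y: mean score 0.40
```

The same reasoning explains why removing the postcode field does not necessarily remove the effect: any remaining input that is unevenly distributed across areas can carry the same signal.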

HART is not the only such tool in use in the UK criminal justice system, though others tend to attract less public scrutiny. The Offender Assessment System (OASys), used by the Probation Service within His Majesty's Prison and Probation Service (HMPPS) to inform pre-sentence reports, has relied on actuarial risk factors for many years and has been updated to include more sophisticated predictive modelling. Risk scores generated by OASys appear in the pre-sentence reports that judges read before passing sentence. This creates a channel of algorithmic influence on criminal justice outcomes that is rarely described as such in public debate.

Article 6 and the right to examine evidence

Article 6 of the European Convention on Human Rights guarantees the right to a fair trial, which encompasses the right of a defendant to have adequate time and facilities to prepare a defence, and to examine and challenge the evidence against them. In the context of algorithmic risk assessments, this raises the question of whether a defendant can meaningfully challenge a score whose methodology is opaque — either because it is generated by a proprietary commercial product or because the algorithm's complexity makes its reasoning practically inaccessible even to experts.

The question has not been definitively resolved in England and Wales. The position of the courts has been that risk assessments are one input among several and that a judge considering a risk score alongside other evidence, and giving reasons for their decision, is not in breach of Article 6 merely because the score is generated algorithmically. What remains untested is a case in which an algorithmic score is demonstrably central to a decision — a custodial remand, for example, or a parole refusal — and the defendant argues that they have been deprived of the ability to effectively challenge the basis for that decision. The Human Rights Act framework provides the tools for such a challenge; the case law has not yet been developed.

Part 3 of the Data Protection Act 2018, which implemented Article 11 of the EU Law Enforcement Directive, is also relevant here: it restricts automated decision-making in a law enforcement context and requires that any such decisions be subject to meaningful human review. The question of whether a custody officer or judge reviewing an algorithmic score constitutes meaningful review, rather than deference to an output they have neither the time nor the expertise to interrogate, is one that lawyers and academics have raised but courts have not yet squarely addressed.

Parole and release decisions

The Parole Board for England and Wales considers risk to the public as a central criterion in decisions about whether prisoners who have served the minimum term of an indeterminate or extended sentence should be released. Risk assessments using tools such as OASys and the Violence Risk Appraisal Guide (VRAG) form part of the evidence considered at parole hearings, alongside oral evidence from psychologists, probation officers, and the prisoner themselves. The weight given to actuarial risk scores, relative to clinical judgement and current contextual factors, has been a persistent source of criticism from criminal justice practitioners and academics. They argue that the historical pattern-matching at the core of actuarial tools is a poor basis for predicting one person's future behaviour, because the individual before the board may differ substantially from the historical population on which the tool was trained.

Prisoners who believe their parole decisions have been influenced by incorrect or inappropriately applied risk assessments face significant practical difficulties in challenging them. The Parole Board does not currently provide detailed reasoning for refusals that would enable a prisoner to identify which factor or tool output was determinative. The Howard League for Penal Reform and the Prison Reform Trust have both called for greater transparency in how algorithmic tools influence Parole Board decisions, arguing that the current approach is incompatible with the board's obligations under natural justice and the Human Rights Act.

The Ministry of Justice and AI efficiency

The Ministry of Justice has expressed significant interest in the potential for AI to reduce court backlogs and improve the efficiency of case management. This interest sits in some tension with the more cautious position on algorithmic risk assessment in sentencing decisions, and reflects a broader ambiguity in government policy: AI as a tool for administrative efficiency is generally encouraged, while AI that directly influences individual rights and liberties is subject to more hedged support. The distinction is not always as clear as policy documents imply, because administrative efficiency decisions — how quickly a case is processed, how it is prioritised, which legal aid applications are approved — have consequences for individuals' rights even when they do not take the form of a sentence or bail decision.

