By Cheta Nwanze
There is some speculation that the attack on the first day of the war by Israel and the US against Iran, which killed at least 160 schoolgirls, may have been carried out by an autonomous AI system. This is scary news, especially if you are not white.
Alvi Choudhury spent ten hours in a police cell in Southampton, England, for a crime he did not commit. As a matter of fact, the crime was committed 130 kilometres from his home in Milton Keynes, in a city he has never visited. He was arrested because an algorithm told police officers that he was the suspect. The actual suspect was younger, lighter-skinned, and clean-shaven. The officers who interviewed Choudhury later admitted they knew it was not him before the interview began. But the algorithm had spoken, and for a time, that was enough.
Choudhury’s case is a small story, the kind that generates outrage for a news cycle and then vanishes. But it is also a warning. If we cannot trust facial recognition software to correctly identify a burglary suspect without locking up the wrong Bangladeshi software engineer, what confidence should we have that the same technology, scaled to industrial proportions and mounted on drones, will correctly identify combatants on a battlefield?
The question is not hypothetical. In Gaza, the Israeli military reportedly used a targeting system called Lavender that, at its peak, flagged 37,000 people as suspected militants for potential strikes. Human operators were expected to spend roughly twenty seconds reviewing each recommendation before authorising a bomb. The system had a ten percent error rate, meaning one in every ten people flagged for death was not a combatant; of 37,000 names, that is roughly 3,700 people. That error rate was deemed acceptable for high-intensity operations.
Ten percent. Acceptable.
Now apply the lesson of the Choudhury case to that logic. Choudhury’s face was flagged because Bangladeshi features like his were underrepresented in the training data. A Home Office-commissioned study found that, in certain settings, Asian faces produce false-positive rates of 4 percent, compared with 0.04 percent for white faces: a hundredfold difference. In Gaza, an AI trained primarily on one demographic might systematically misclassify civilians from another as threats. The ten percent error rate is not evenly distributed. It is concentrated among the people the training data did not adequately represent.
This is where the recent clash between the Pentagon, Anthropic, and OpenAI becomes relevant. When the US Department of War demanded that AI companies accept “all lawful use” of their models on classified military networks, Anthropic refused. Its CEO drew a line: no mass surveillance, no autonomous weapons. The Pentagon gave a deadline. When Anthropic requested more time to negotiate the final language, the Pentagon designated it a “supply chain risk” and blacklisted it from future contracts.
Within hours, OpenAI announced it had reached an agreement of its own. The company secured the right to implement technical safeguards and inserted contract language prohibiting the use of its systems to “independently direct autonomous weapons.” It argues that this multi-layered approach, combining cloud-only deployment, on-site engineers, and termination rights, provides stronger protection than policy alone.
Perhaps. But the most vocal safety advocate is now out of the room. Oxford’s Professor Mariarosaria Taddeo called that “a real problem.” The companies that refused to bend have been punished; those that collaborated have been rewarded. The message to the industry is unmistakable: fall in line or be locked out.
In the ongoing strikes on Iran, AI systems identified bombing targets faster than “the speed of thought,” as one expert put it. The scale and speed of modern warfare now exceed human capacity for oversight. The International Committee of the Red Cross warns that autonomous systems “could lead to escalation and reduce the threshold of going to war, thus putting civilians at greater risk.”
When the sensor-to-shooter timeline compresses from days to seconds, the twenty-second review becomes a formality. When the machine recommends a target, the human approves. When the machine is wrong, there is no one to hold accountable. The United Nations asks: Who bears responsibility for crimes committed by autonomous weapons? The commander? The programmer? The manufacturer? As one expert noted, someone might say, “Actually, no one did it. The machine did it.”
The thread connecting Alvi Choudhury’s wrongful arrest to the strikes on Iran is the training data. Both systems fail because they were not trained on representative populations. Both errors were foreseeable. Both were, in different ways, deemed acceptable.
Thames Valley Police admitted to Choudhury that his arrest “may have been the result of bias within facial recognition technology.” They also told him they saw no need to escalate the issue for “wider organisational learning.” No lessons would be learned. No systems would be improved. The error would simply be filed away, and the algorithm would continue.
That refusal to learn is precisely what makes the future of autonomous weapons so terrifying. If police forces cannot be bothered to correct a system that wrongfully arrests innocent citizens, what confidence can we have that military contractors will adequately test systems that decide who lives and who dies? The ten percent error rate accepted in Gaza is the same logic scaled to industrial proportions. The bias that put an innocent man in a cell for ten hours is the same bias that, in an autonomous weapon, could put a village in a cemetery.
The guardrails are gone. The speed is increasing. The errors are baked into the code. And when the machine gets it wrong, there will be no one to blame, no lessons learned, and no mechanism for justice. Only the aftermath. Only the dead. Only the knowledge that we saw this coming and chose to look away.
Nwanze is a partner at SBM Intelligence
