Inside Project Maven’s Accountability Trap
As the Pentagon expands Project Maven’s role in military planning and targeting, the danger is not faster war but a system in which human judgment appears present while accountability becomes harder to trace.
Project Maven is no longer the Pentagon’s old experiment in sorting drone footage. It is becoming something more ambitious and more consequential: part of the infrastructure through which the United States will process, prioritize, and act on battlefield information.
That shift should change the argument about military artificial intelligence (AI). The familiar debate is about control. Will humans stay in the loop? Will machines be allowed to make life-and-death decisions? Will autonomy cross some final legal or moral line? Those are real concerns. But they are no longer the most urgent ones.
The more immediate danger is that as systems like Maven move deeper into military planning, targeting, and operations, responsibility gets harder to trace. Who made the call? On what basis? With what chance for review, dissent, or delay? The central risk of military AI is not simply that a model will be wrong. It is that AI-enabled warfare can create an attribution gap, a decision architecture in which responsibility is spread across analysts, commanders, software systems, data pipelines, vendors, and compressed timelines until accountability survives on paper but starts to fail in practice.
That matters because militaries do not adopt AI in a vacuum. They insert it into institutions already under pressure to move faster, process more information, and act inside dense, high-velocity data environments. In that setting, AI does not just add capability. It changes the conditions under which human judgment is exercised. It can shrink the time available for legal review, narrow the space for challenge, and make it much harder afterward to tell whether a human decision-maker genuinely assessed a recommendation or merely ratified it.
Human-in-the-Loop Does Not Ensure Human Control
For years, the comforting phrase has been that humans remain in the loop. But human presence is not the same as human control. In high-pressure military environments, oversight can become nominal rather than meaningful. Operators may still be formally authorized to review AI-generated outputs while lacking the time, context, or institutional backing needed to question them. The result is not necessarily machine autonomy in the science-fiction sense. It is something more mundane, and in many ways more dangerous: human approval without meaningful human judgment.
This is often treated as a technical problem. It is not, at least not mainly. It is a political and institutional one.
Military organizations have always depended on friction: verification, legal review, command consultation, documentation, opportunities for challenge. None of those pauses guaranteed wise or humane outcomes. But they forced people to explain themselves. They made responsibility visible. They created moments when someone had to say not only what should be done, but why.
AI-enabled systems are attractive in part because they promise to strip that friction out. They can ingest signals, imagery, communications, and intelligence at a scale no human staff can match. That promise is real. But acceleration is not neutral. When workflows are redesigned around machine-assisted speed, the pauses that once made review possible begin to look like inefficiencies. And once those pauses disappear, it becomes much harder to tell whether a human decision-maker actually exercised judgment or merely kept the process moving.
That is where the military AI debate often goes wrong. The key question is not whether AI is “making” decisions in some abstract sense. The key question is whether humans can still be said to exercise accountable judgment inside workflows optimized for rapid execution and supported by systems whose reasoning, confidence, provenance, or limits may not be fully visible at the moment of action.
Agentic AI Is Reshaping Decision-Making and Judgment
This problem gets sharper as military AI becomes more agentic. The issue is not only that newer systems can process more data faster, but that they can take over more of the interpretive work that once made human judgment meaningful. The machine is no longer just sorting information for humans. It is increasingly helping decide what matters, what deserves more scrutiny, what counts as enough confidence, and when the process can move forward.
That is a bigger shift than most public discussions admit. It means the machine is not merely speeding up a human decision. It is shaping the evidentiary and operational frame within which the human decision is later made. That is why the usual defense of AI-enabled warfare starts to look thin. It is not enough to say that a human still makes the final call. Final sign-off does not restore meaningful control if the system has already structured the pool of evidence, filtered what gets seen, and set the tempo under which judgment is exercised. A person may still authorize force. That is different from saying that human judgment remained a meaningful part of why this target, this moment, and this use of force were chosen.
In any serious account of responsibility, human judgment has to remain part of the explanation of the outcome, not merely part of the formal workflow. If a later investigation asks why a target was struck, it should not be enough to point to the fact that a person clicked approve. The deeper question is what actually drove the choice. Which information was surfaced? Which uncertainties were suppressed? Which alternatives disappeared upstream? If those parts of the process were already machine-shaped, then the human role may still be visible while becoming much thinner in substance.
This is also why Maven matters beyond the Pentagon’s modernization story. The real question is not only whether systems like Maven can generate larger target banks, synthesize more data, or move information faster across the kill chain. The harder question is whether speed and scale begin to crowd out the distinctly human task of asking why a target matters, what strategic effect is actually being sought, and whether the process still leaves room for validation, legal review, and dissent.
When tactical throughput outruns strategic clarity, humans remain in the loop, but their role starts to look supervisory rather than genuinely deliberative. That is not a semantic problem. It is a governance problem. It means an institution can preserve the appearance of human control while steadily weakening the substance of it.
And the machine cannot be analyzed on its own. Strategy, standards, and political intent come first. If the purpose of an operation is unclear, or if civilian-protection norms and legal constraints are already being bent to fit political will, AI will not correct that failure. It will help operationalize it faster. The danger is not just technical opacity. It is that institutional choices and political priorities can be translated into software, then obscured behind the language of algorithmic complexity.
That is why the reporting from Gaza still matters, even if it should not be treated as a universal template for every military use of AI. What it shows is how machine-assisted targeting can be folded into an environment where standards have already eroded, procedures have thinned out, and tactical speed has been elevated over careful judgment. Seen that way, AI does not remove humanity from war on its own. It can amplify political and institutional decisions that have already pushed humanity to the margins.
Vendor Responsibility and the Limits of “Responsible AI”
This is also where the conversation about vendor responsibility needs more honesty. Governments increasingly rely on commercial firms to build, integrate, and maintain AI-enabled systems. Vendors can provide audit logs, provenance tools, access controls, and other compliance features. Those tools can help. But they do not solve the underlying governance problem, and they do not run themselves. A private company can build a system that fits an organization’s policies. It cannot substitute for clear legal authorities, documentation rules, escalation pathways, and institutions empowered to say no.
That is why the rhetoric of “responsible AI” so often feels too easy. Responsibility is not a design feature that can simply be engineered into a system and checked off. The legitimacy of military AI will depend less on abstract assurances than on whether responsibility remains assignable when it matters most: after a strike, after an error, after civilian harm, after the pressure to move quickly has already done its work.
Restoring Accountability Through Traceability and Institutional Design
That requires more than ethics principles and vendor promises. It requires institutional design. Militaries should treat traceability as an operational requirement, not an afterthought. They should deliberately reintroduce friction where AI removes it, especially in decisions where speed is most likely to overwhelm judgment. And they should build oversight across levels, from commanders and legal advisers to civilian harm mitigation structures and post hoc investigators.
Still, even that may not be enough. If the records exist, can human beings realistically absorb and act on them in time? Can traceability still matter if the pace of operations has already turned judgment into throughput? That may be the hardest question of all, because it goes to the heart of what these systems are for. They are built to compress time. But the lawful and legitimate use of force often depends on preserving the very pauses that speed makes hardest to keep.
That is the real test for Project Maven and for the wider turn toward AI-assisted warfare. The question is not whether a human remains somewhere in the chain. It is whether, after a strike, an investigation, or a wrongful death, an institution can still answer four basic questions clearly: What did the system recommend? Who reviewed it? What checks were available? And who is responsible for the final judgment?
If it cannot, then the problem is not simply technical opacity. It is a failure of governance. And in war, failures of governance do not stay procedural for long. They become a way of making violence easier to authorize, easier to scale, and easier to deny.
About the Authors: Martin Wählisch and Averill Campion
Martin Wählisch is the inaugural associate professor of transformative technologies, innovation, and global affairs at the University of Birmingham’s Centre for Artificial Intelligence in Government, where he holds dual appointments in the School of Government and the School of Computer Science. Working at the intersection of technology and global affairs, he spearheads initiatives focused on interdisciplinary collaboration, innovative governance, and impactful technological solutions to global public challenges.
Averill Campion is a non-resident fellow at the New Lines Institute, where she focuses on strategic technology alliances and emerging tech policy. Originally from Texas, she has spent the past decade gaining international experience, earning a PhD from ESADE Business School in Barcelona, Spain, and an MPA and MSc from University College London and Aston Business School in the UK. Her academic research examines interorganizational collaboration and trust-building in public-sector AI adoption and has been published in international peer-reviewed journals.