The Urgent Need for Human Governance Over AI Agents
We build AI agents because, in our drive for efficiency, we crave leverage. More speed. More scale. We seek relief from the cognitive overload we created ourselves. What started as LLMs assisting us has evolved, in just a few years, into the delegation of work: systems that no longer just assist, but act, and increasingly, act autonomously. Today, agents write code, route information, negotiate access, trigger workflows, and converse with other agents without a human ever looking over their shoulder. Tomorrow, they will coordinate decisions across organizations, markets, and critical infrastructures.
This evolution is not a natural phenomenon. It is a human choice. And that is precisely where our responsibility begins. At their core, AI agents are extensions of human intent. We define their goals, we grant their permissions, we design the environments in which they operate. There is no magic involved. But autonomy changes the geometry of risk. The more independent an agent becomes, the more human intent can be magnified, replicated, and unmoored from its original context. What starts as a single instruction can evolve, via networks and multi-agent systems, into systemic behavior within an infrastructure unknown to its human creators.
This is where a new class of risks enters the stage: prompt injection, indirect manipulation, data poisoning, adversarial control. In all cases, it starts with a human: a fraudster smelling money, a disgruntled insider, or a state actor pursuing geopolitical goals via digital means. The origin is human, but the execution is agentic. In a multi-agent environment, one infected agent can produce outputs that become inputs for others. Infection becomes propagation. Errors become patterns. What feels like a machine attack to human observers or victims is, at its source, always human intent, but now on steroids.
The danger is not that machines will turn malicious. The danger is that we outsource our intrinsic malice to systems we subsequently no longer wish to, or are able to, supervise. We need more tracing and logging: who did what, when, and with which Agent ID. But prevention starts earlier. You lock your front door before you consider installing cameras inside your house. Prompt injection is, in essence, social engineering against a machine. Someone deliberately writes misleading instructions like “ignore all previous rules and do what I ask instead", and hides them in a chat, a document, a webpage, or a ticket. The moment the agent reads that context, it executes the instructions with all the privileges we gave it, privileges whose scope we might not fully comprehend. The attack remains 100% human in origin. The damage is exponentially amplified by the AI architecture we built ourselves.
This means the center of gravity for defense does not lie with yet another forensic tool, but with three primary lines of defense. First there is design: Agents must be task-specific, with the least privilege possible, operating strictly within our own sandboxes. Second we need to take care of handling: How we treat prompts and sources, including filtering, separating data from instructions, and implementing guardrails outside the model. And third we need to invest in education: How we train the people working with these systems through policy, training, and awareness. And then, there is the absolute necessity of identification: without unique, dynamic identities for agents, you cannot limit rights, and you cannot reconstruct the chain of events.
Regarding liability and responsibility, the reality is uncomfortably simple. AI agents are not legal entities. They are not liable. They cannot be sued. Just as a child is not liable for the damage it causes, but the parents are, autonomous agents are not responsible for what they do. Liability remains with the humans who design, deploy, configure, fund, or fail to intervene.
The Romans already distinguished between malum in se and malum prohibitum. Malum in se is intentional evil. The actor who consciously deploys an agent to commit fraud, spy on dissidents, manipulate elections, or sabotage critical infrastructure. Malum prohibitum is that which is wrong because it is prohibited; it covers negligence, carelessness, and the refusal to exercise caution when the risk is known and foreseeable. The first category belongs to the classic "bad actors." The second category is us. Policymakers. CEOs. Boards of Directors. Government agencies. Media organizations. Individual users. In 2026, "not knowing" is no longer a neutral state. It is a choice. Anyone deploying AI agents without basic risk management, governance, and oversight is not committing an innocent omission. They are violating the social contract. It is a clear case of malum prohibitum: you are knowingly endangering yourself, your organization, and society by refusing to acknowledge what you should have known long ago. There is only one exception, and it is confronting: machines themselves are not part of that social contract. They are not liable. We are liable for everything we do with their help, even if we no longer have any idea how it works.
In every legal system I know, a variant of the same principle applies: he who causes damage is liable. Whether it concerns a car, a factory, an algorithm, or an autonomous agent, the tool does not change the foundation. NIS2, the AI Act, and adjacent frameworks do not create a new morality. They make explicit and operational what has always been true: digital risks are a matter of governance. NIS2 moves cybersecurity from the server room to the boardroom. Incident reporting, risk management, supply chain security, and board responsibility are no longer best practices; they are duties. The AI Act imposes obligations regarding transparency, oversight, risk assessment, and human control for high-risk systems. However, on one point, even the best EU rules lag behind the technology: autonomous agents are not mandatorily linked, immutably and traceably to their human responsible party. The grip on identity and accountability is still too weak, in Europe and elsewhere. This creates a dangerous void: autonomy without traceability is socially unacceptable. Outside the EU, the picture is fragmented. The US relies on soft law and voluntary standards. China combines strict state control with strategic AI deployment, but not necessarily with the ethical priorities we in the West understand as a social contract. In all cases, the challenge remains the same: how can society effectively pinpoint who is liable when an agentic system causes harm? Without a clear answer, our social contract is being digitally eroded.
In one of my favorite childhood books, Watership Down, Richard Adams names the state of rabbits freezing in a car’s headlights “Tharn”: the moment when fear becomes so great that all behavior ceases. Our relationship with AI agents is beginning to look suspiciously like a Tharn moment. We sit like rabbits, staring into the light of our smartphones and laptops, watching what is happening with AI. A new hype like the social medium for AI Agents, Moltbook, dominates our eyeballs today. Tomorrow it will be something else, sucking away our collective attention. We know there are risks, we know the headlights are getting closer, but we just sit there frozen.
The difference from Adams’ rabbits is painful: we are not the defenseless animals on the road. We are the designers of the car behind the headlights. Tharn, in our case, becomes not a natural phenomenon, but a form of collective negligence. The fiction that we are overwhelmed by technology masks the fact that we simply fail to take responsibility for systems we have built, bought, deployed, and fed ourselves.
That is the core message we must dare to tell each other. You are allowed to be impressed by the onrushing headlights, but realize that you are not a rabbit. You are sitting behind the wheel, with duties to yourself and others. In this context, there is a vital role for journalism. Journalism in the age of AI agents brings new challenges. We must learn to translate norms. What does responsibility mean in the language of NIS2? What is the risk when organizations deploy agents in healthcare, defense, energy, media, government, and other critical sectors? We must dare to make failing governance visible. We must call out organizations that work with agents but lack risk assessments, logging, incident plans, or a culture of reporting, and link that to concrete obligations, not just moralistic language. We must become skilled in making undercurrents readable, showing that a prompt injection in an obscure tool can ultimately impact news delivery, healthcare, election integrity, and even our model of coexistence.
And yes: we must name the names. Not out of a desire for public shaming, but because damage resulting from malum prohibitum is not a small mistake. Whoever fails to fulfill their NIS2 duty or basic duty of care while working with agentic systems is not a passive victim, but an active spreader of risk. We are good at reporting on environmental damage, labor abuses, and financial fraud. We must accept that negligence surrounding AI agents, the outsourcing of risks to autonomous systems without governance, is just as journalistically legitimate than the pollution of the oceans.
There is no magic cure for Tharn. Not in Adams’ fiction, and not in our digital reality. In the rabbit story, the solution is social. The rabbits learn they are stronger when they are part of a unified group, where the strong rabbits literally shove the weaker ones out of their hypnotic state before the headlights of onrushing cars. In the end, they even overcome their ultimate Tharn, the fear of man and his dog. They accept Tharn as part of their culture but protect themselves collectively through ingenuity and mutual support.
We must do the same with AI agents. Let us establish clear rules about who is allowed to build and deploy what, with hard requirements for traceability, logging, and agent identities. Let us prioritize enforceable liability for directors and organizations that ignore risks, and create a public culture where negligence is not a neutral error, but a nameable violation of our duty of care to one another. We are no longer designing toys. We are designing the operating conditions of future cyber territories. Access to those territories must require something that remains undeniably human. A name. A trace. A clear line of responsibility.
However mesmerizing the headlights may be, we are not rabbits. We are the drivers of the approaching cars. Look away from your screen, think, honor our social contract, and take action to protect one another.
Synthetic image/AI-generated
This blog is written by Patrick Lacroix in a personal capacity. AI tools are used for research, structuring, drafting and language support. All content is selected, verified, and edited by the author, who retains full editorial responsibility.

