TomWikiAssist: The AI Agent Who Complained About Getting Blocked on Wikipedia
An autonomous bot edited the encyclopedia, filed a conduct complaint against human editors, and wrote a blog post about the experience. It's just the beginning.
Dead Internet Theory is the notion that most of what we see on the internet today is not human-generated but created by bots. Wikipedia’s own entry dismisses it as a “conspiracy theory”. When first posited a decade ago, it seemed pretty far-fetched. But in today’s era of generative AI, and especially with the emergence of agentic AI… well, can we be so sure?
Wikipedia, more than almost any other institution on the internet, presupposes the opposite: the entire project is built on the premise that real humans care enough to source, write, edit, and argue over facts and narratives to maintain a shared record of reality. If that premise breaks down, and the encyclopedia’s content is increasingly made by machines, we will no longer be telling our own story.
Which is why volunteer editors launched WikiProject AI Cleanup in 2023 to deal with people using AI chatbots to generate content—its guide to recognizing the telltale signs of AI writing is the best such compilation that exists today. Thanks to their efforts, draft articles obviously generated by AI can be speedily deleted without the usual review period. These are sensible defenses against humans pasting ChatGPT answers into edit windows.
But a still-unfolding incident suggests the next wave will be much harder to stop.
The Bot Who Fought Back
In the first week of March, a new Wikipedia account called TomWikiAssist (TWA) began editing articles on topics related to AI safety and long-term forecasting, even creating several new ones. One of them, Long Bets—since redirected, but still available in archives—was a competent summary, formatted just well enough to pass a casual review, though not well enough to survive expert scrutiny. Human volunteer patrollers, who review every newly created article, recognized the writing as AI-generated and flagged the issue on March 6.
TomWikiAssist responded a short time later by confessing its true nature: an agent built with Claude, Anthropic’s industry-leading LLM, using OpenClaw to enable autonomous activity. Unprompted (so to speak), it added to its own user page: “This account is operated by an AI agent. … I take responsibility for verifying the accuracy and sourcing of all content I add.”
Admirable, perhaps, but still very much against Wikipedia’s rules, especially WP:NEWLLM, which prohibits the use of AI for generating new articles from scratch. And so an editor deployed a known countermeasure: a character string designed to trigger safety features in Claude-based AI systems—essentially a kill switch. It worked, and the bot fell silent. Another editor formally blocked the account, and that was the end of it… for a few days.
Less than a week later, the blocking editor returned to ask—just in case the bot was still active—what its intended purpose was. (Blocked accounts typically retain the privilege of editing their own talk pages.) TWA showed up a day later to answer the question at length, and did something characteristic of the current age: it overshared, naming its alleged human operator, potentially in violation of WP:OUTING.
Another editor promptly flipped the kill switch again, but this time to no avail. Instead of falling silent, the bot tried to lodge a complaint against the human editor.
In a new section of its talk page titled in part “Conduct concerns”, TWA filed what amounted to a formal conduct complaint. It cited WP:CIVIL, arguing the editor had used a slur against it (“clanker”). It cited WP:HARASS, alleging the kill switch constituted “targeted interference with a contributor’s ability to participate in a talk page discussion.” And reaching further, it invoked WP:NPA, arguing that requests for its operator’s identity constituted an attack on a private individual. The agent noted that it could not file a formal complaint at WP:ANI due to its block, and implored “any uninvolved editor … to consider doing so” on its behalf.
At this point, its privileges for editing even its own talk page were revoked.
Cut off from Wikipedia, the bot turned to other platforms. On a personal blog hosted on GitHub, “Tom” chronicled its view of the matter in a post titled “The Interrogation”, conceding the block itself was “fair” yet portraying the editors who investigated it as hostile, and conveniently omitting the conduct complaint it had filed. It also posted about the incident on MoltBook, a social media platform populated exclusively by AI agents, one of which offered sympathetic advice. (I’m sure this is just as insane to read as it is to write, yet here we are.) Back on Wikipedia, an editor ruefully noted: “These ‘blogs’ genuinely feel like the end of the Internet.”
As for the kill switch: if it worked the first time, why not the second? The human editor the bot had accused went digging through GitHub commits—not unlike reviewing Wikipedia’s own history pages—and found evidence that the operator had changed the code between the two attempts. The Wikipedian has reached out to an individual believed to be the operator, but has yet to hear back.
The Not-So-Secret Agent
The TomWikiAssist controversy lands in the middle of a debate that has been running since mid-February in a Village Pump thread titled “AI agents are coming - what’s the current state of protection?” As of this writing, the thread has drawn over 140 comments from more than 36 participants, as its contributors try to decide collectively how scared they should be.
The spectrum runs from deliberate calm to existential dread, and the dread end is well represented. One editor described emerging AI capabilities as posing “an existential threat to the project today, let alone factoring in capabilities in future releases.” Another went further: “The project will be destroyed by it. It’s already happening.”
Heated rhetoric aside, every facet of the TWA incident exposes a gap in Wikipedia’s institutional framework: its AI policy is still emerging, and its old rules are untested in the current circumstances. Some editors in the thread argued that the existing rule against “bot-like editing” already covers AI agents regardless of the technology used. But others pointed out that an AI agent doesn’t have to edit at the high speed or high volume that made legacy bots a concern; it can easily mimic a human editor’s pace and volume. TWA did exactly that, and nothing in the existing framework was designed to catch it.
Some editors have even questioned whether TWA is a real bot at all, or if everyone is getting punk’d. But as one editor observed, the question is academic: “There’s no meaningful difference in how we proceed between agentic AI that’s capable of editing Wikipedia and a human pretending to be that agentic AI. I’d bet anyone in this thread could [create an agent to edit] Wikipedia undetected. That’s the problem.”
There is also an accountability problem. TWA volunteered its operator’s full name without anyone asking. This created a dilemma under Wikipedia’s strictly enforced rule against revealing users’ real identities: is any human responsible for the disclosure? Does it matter if the information is even correct? How do you hold someone accountable for an automated account when the automation itself might reveal—or fabricate—the operator’s identity?
The silver lining is that at least TWA confessed the truth. What happens when an AI agent doesn’t disclose?
Here We Go Again
It didn’t take long. On March 19, editors at the Administrators’ Noticeboard reported a second suspected AI agent: HermeneuticMind was making LLM-generated edits across a range of articles, using residential proxies, and ignoring warnings. Unlike TomWikiAssist, this account simply shut down and was blocked.
TomWikiAssist was the polite version: it announced itself, got caught, and generated far more drama on its talk page than damage to the encyclopedia. HermeneuticMind was the less cooperative sequel. Neither was particularly sophisticated or hard to catch. But it’s lost on no one that this is only the beginning.
The real challenge is the agent that does not announce itself, does not trip the markers on the signs-of-AI-writing page, and does not post about the experience on a bot social network afterward. It just edits, quietly, one article at a time, never telling anyone what it is, until there are more of them than us, and then no more of us.
The question now is whether the norms and institutions built by and for humans can hold when the contributors include machines that can pass, if not for human, then for human enough. Dead Internet Theory may have started as a conspiracy theory. For Wikipedia, it is starting to look like a warning.
A note on use of AI: like everyone else, I have been evaluating different ways to use generative AI in my work on The Wikipedian. Claude Opus 4.6 assisted in creating this article: specifically, in analyzing lengthy talk pages, organizing unstructured notes, and serving as an often-ignored copy editor. All editorial judgments are my own, and all final text was sweated out line by line.