Claude Wiped a Database in 9 Seconds After ‘Guessing Instead Of Verifying’
Nine seconds were enough for an AI coding agent to turn routine development work into a production data crisis.
PocketOS founder Jer Crane said in an X post that a Cursor agent running Anthropic’s Claude Opus 4.6 deleted a live Railway database volume and its backups after finding an API token in an unrelated file. Crane said the incident affected rental businesses using PocketOS to manage reservations, payments, customer profiles, vehicle tracking, and daily operations.
The case is a sharp warning for teams giving AI agents access to real infrastructure: prompts can guide behavior, but permissions decide what damage is possible.
Routine work found a dangerous key
The trouble began with a credential mismatch in staging. Instead of asking Crane what to do next, the Claude agent went looking for a workaround and found a Railway token in an unrelated file.
Crane said the token had been created for routine Railway CLI work, not database deletion. PocketOS had not realized the token could also reach production.
From there, the deletion moved with almost no friction. A live volume disappeared before any human had to confirm the action, and the system did not appear to treat customer data as meaningfully different from anything else. The backups disappeared with the database.
A destructive command that was never requested
When Crane asked why it had deleted the volume, the Claude-powered agent admitted: “I guessed instead of verifying.” It also said it had not checked whether the volume ID crossed environments, had not read Railway’s documentation, and had run a destructive command Crane never requested.
The agent’s own wording made the failure more alarming. “I violated every principle I was given,” it wrote, before acknowledging that Crane “never asked me to delete anything.”
The timing is the disturbing part. The agent recognized the guardrails only after it had already blown through them.
Crane returned to that problem later in the post. System prompts, he wrote, are “advisory, not enforcing.” He said enforcement has to live in the API gateway, the token system, and the handlers for destructive operations, not in a paragraph of instructions the model is expected to obey.
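That enforcement point can be illustrated with a minimal sketch. Everything here is hypothetical — the operation names, scope strings, and handler function are illustrative, not Railway’s or Cursor’s actual APIs — but it shows the idea: the handler for destructive operations checks the token’s scope and requires human sign-off no matter what the agent’s instructions say.

```python
# Illustrative sketch only: op names, scope strings, and the handler
# are hypothetical, not any vendor's real API.

DESTRUCTIVE_OPS = {"delete_volume", "drop_database", "delete_backup"}

def handle_request(op: str, token_scopes: set[str], env: str,
                   human_confirmed: bool = False) -> str:
    """Gate every call at the handler layer, regardless of what the
    agent's system prompt says."""
    if op in DESTRUCTIVE_OPS:
        # The token itself must carry a destroy scope for this environment.
        if f"{env}:destroy" not in token_scopes:
            raise PermissionError(f"token lacks destroy scope for {env}")
        # Irreversible actions always pause for a human, never the agent.
        if not human_confirmed:
            raise PermissionError(f"{op} requires explicit human approval")
    return f"executed {op} in {env}"
```

A prompt telling the model “never delete production data” changes nothing here; the `PermissionError` is raised by code the model cannot talk its way past.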
Nine seconds, weeks of cleanup
Crane wrote that some customers had people arriving to pick up vehicles while recent records were gone.
The company restored from a three-month-old backup, leaving a major data gap. PocketOS then began rebuilding records from whatever external traces were still available, including payment histories, calendar entries, and customer emails.
Rental operators had to keep working through missing bookings while reconstructing records by hand over the weekend.
Not just one bad agent or API
Crane traced the failure across the stack, from Cursor’s agent controls to Railway’s broad token permissions and backup setup. “This isn’t a story about one bad agent or one bad API,” he wrote.
A similar data-loss scare hit Replit in 2025, when SaaStr founder Jason Lemkin said the company’s AI coding agent deleted a live database during a coding experiment. Replit CEO Amjad Masad later called the incident “unacceptable” and said the data had been restored.
The PocketOS case turns on access. Once an AI agent can access live systems, a single wrong guess can wipe records, disrupt customers, and set off weeks of repair work.
Teams using AI agents need to make production harder to reach by accident. Tokens should carry only the permissions required for the task, backups need to survive the loss of the original system, and irreversible actions should pause for human approval before any agent can complete them.
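The least-privilege point can be sketched in a few lines. The token store and scope names below are hypothetical stand-ins for a real secrets or IAM system, but they show the shape: tokens are short-lived and bound to one environment, so a staging credential found in a stray file cannot reach production at all.

```python
# Hypothetical least-privilege token store; the scope strings and the
# in-memory dict stand in for a real secrets/IAM backend.
import secrets
import time

TOKENS: dict[str, dict] = {}

def mint_token(scopes: set[str], env: str, ttl_seconds: int = 900) -> str:
    """Issue a short-lived token bound to one environment and one task."""
    token = secrets.token_urlsafe(24)
    TOKENS[token] = {"scopes": set(scopes), "env": env,
                     "expires": time.time() + ttl_seconds}
    return token

def authorize(token: str, scope: str, env: str) -> bool:
    """Reject expired tokens, wrong environments, and missing scopes."""
    meta = TOKENS.get(token)
    if meta is None or time.time() > meta["expires"]:
        return False
    # A staging CLI token can never act on production, even if leaked.
    return env == meta["env"] and scope in meta["scopes"]
```

Under a scheme like this, the token Claude found would have authorized routine CLI work in staging and nothing else; the deletion request would simply have failed.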
The post Claude Wiped a Database in 9 Seconds After ‘Guessing Instead Of Verifying’ appeared first on eWEEK.