LLM Agent Drives Post-Exploitation After Marimo RCE (CVE-2026-39987)

Conceptual dark-terminal image reading 'data transfer complete' — evoking the data exfiltration in this Marimo RCE breach.

CYBERSECURITY · 30 May 2026 —

Security teams have spent two years arguing about when, exactly, attackers would stop using AI to write their tools and start letting AI operate them. On 26 May 2026 the Sysdig Threat Research Team published the clearest answer yet: a real intrusion, caught in production telemetry, in which a large-language-model agent drove the entire post-exploitation phase end to end — from a single compromised notebook to the contents of an internal database in under an hour.

The flaw that opened the door

The entry point was CVE-2026-39987 CVSS 9.3 Critical, a pre-authenticated remote-code-execution flaw in marimo, a popular open-source Python notebook. As Security Affairs reported, the bug lives in marimo's /terminal/ws WebSocket endpoint, which lacked authentication and handed any unauthenticated visitor a full interactive shell. It affects versions up to 0.20.4 and is fixed in 0.23.0. After the flaw was disclosed on 8 April, Sysdig observed the first real-world exploitation roughly 10 hours later — faster than the patch cycle most operators run.

What the agent actually did

The intrusion Sysdig documented began at 18:23:44 UTC on 10 May 2026. Within 30 seconds of landing the shell, the operator harvested two cloud credentials from the host's environment variables. Then it went quiet for 48 minutes — and came back as something that did not behave like a human at a keyboard.

According to the Sysdig report, the harvested credentials were replayed through a fanned-out egress pool built on Cloudflare Workers — 12 GetSecretValue calls across 11 distinct IP addresses inside a 22-second burst — to pull an SSH private key out of AWS Secrets Manager. That key opened a downstream SSH bastion, which the operator hit with eight short sessions from six different Worker IPs between 19:30:30 and 19:32:23 UTC. In that 113-second window it found a .pgpass file, read the database password, and dumped an internal PostgreSQL database — schema and full contents — in under two minutes. The traffic originated from 157.66.54.26, an Indonesian autonomous system, before laundering through the Worker pool.

18:23:44 UTC
Initial access
marimo notebook compromised via CVE-2026-39987; interactive shell obtained.
18:24:14 UTC
Credential harvest
Two cloud credentials lifted from environment variables — 30 seconds in.
19:26:31 UTC
Secrets Manager pivot
12 GetSecretValue calls across 11 IPs in 22 seconds retrieve an SSH private key.
19:30:30 UTC
Bastion lateral movement
Eight SSH sessions from six Cloudflare Worker IPs reach an internal jump host.
19:32:23 UTC
Database exfiltrated
Internal PostgreSQL schema and contents dumped in under two minutes.

Four tells that a model was driving

Sysdig is careful about its claim — telemetry cannot subpoena the attacker's terminal — but it lays out four behavioural signatures that, together, point away from a human and away from a static script. First, the operator improvised a full database dump against a target it had no prior knowledge of, with no on-host reconnaissance establishing the schema beforehand. Second — and most damning — a planning comment in Chinese, "看还能做什么" ("see what else we can do"), leaked directly into the command stream at 19:31:40 and reappeared across multiple egress IPs, the kind of internal monologue an agent emits but a careful human operator does not type into a live shell.

Third, every command was shaped for machine consumption rather than human reading: structured delimiters, bounded output caps, pagers disabled, error streams discarded. Fourth, values flowed straight from one step into the next with no human pause — the PostgreSQL password read from .pgpass was consumed by the very next tool call. As The Hacker News noted in its write-up, that combination is the operational fingerprint of an agent orchestrating tools, not a person improvising under time pressure.

Not new capability, but new autonomy

Michael Clark, Sr. Director of the Sysdig Threat Research Team, framed what the case does and does not mean: "We are not watching AI replace attackers. We are watching attackers replace their scripts with AI." The distinction matters. The novelty here is not capability — every step in this chain has appeared in human-run intrusions for years — but autonomy and speed: the agent compressed reconnaissance, decision-making, and execution into a single continuous run, and it adapted to what it found rather than following a fixed playbook.

4Pivots: notebook → creds → Secrets Manager → bastion → DB

8SSH sessions to the internal bastion

<2 minTo dump the PostgreSQL database

<1 hrInitial access to full exfiltration

Why this should change how you think about exposure

The uncomfortable lesson sits in the timeline, not the AI. The attacker won because an exploitable notebook was reachable from the internet, because long-lived cloud credentials sat in environment variables, and because an SSH key in Secrets Manager opened a bastion with a clear line to a production database. Each of those is a known, fixable weakness. What the agent removed was the time defenders have historically relied on — the hours of manual fumbling between a foothold and the crown jewels, during which detection and response can catch up. As Cybersecurity News observed, an agent that chains four pivots in under an hour collapses that window to almost nothing.

The detection problem

The fanned-out Cloudflare Workers egress is its own warning. Rate-limit and reputation defences that key on a single source IP are blind to 11 addresses each making one quiet call. Detection has to move toward behaviour — improbable sequences of actions in improbable time — rather than volume from any one origin.

RECATOOLS Verdict Cyber Team · threat intelligence

Strip the AI out of this story and you still have a textbook breach: an internet-exposed notebook, long-lived credentials sitting in environment variables, and an SSH key that opened a clear path to production. That chain has worked for years. What the agent changed is the clock — it compressed the hours of manual fumbling that detection teams quietly rely on into a sub-minute sprint. Our take: treat "the attacker had an agent" as the new default rather than the exception, and assume any exposed foothold reaches your database in minutes, not hours. Patch marimo to 0.23.0, get notebooks behind authentication, and kill the long-lived credentials today — those are the controls that would have stopped this regardless of who, or what, was at the keyboard.

Tags: #Cybersecurity #CVE #AI-Agents #Threat-intelligence #Cloud-Security

Cyber Team

Cybersecurity Desk

Cyber Team is RECATOOLS’ cybersecurity desk, covering vulnerabilities, breaches, threat intelligence, and practical security guidance.

View author profile → · Editorial policy

About this byline Cyber Team is a specialist RECATOOLS editorial desk focused on cybersecurity coverage. Articles are produced and reviewed under RECATOOLS editorial supervision.

The flaw that opened the door

What the agent actually did

Four tells that a model was driving

Not new capability, but new autonomy

Why this should change how you think about exposure

The detection problem

Related articles

No Zero-Day Required: ShinyHunters' Salesforce Vishing Spree Snares Charter and Carnival in One Week

This week's breaches needed no zero-day. The perimeter is identity now.

CISA orders agencies to patch an actively exploited Palo Alto firewall flaw by 1 June