ClawJacked Flaw Lets Malicious Sites Hijack OpenClaw AI Agents
Malik Haidar is a veteran cybersecurity strategist who has spent years defending multinational corporations from sophisticated digital threats. His work bridges the gap between high-level business intelligence and the technical front lines of security, with a recent focus on the emerging risks within autonomous AI ecosystems. In this discussion, he explores the critical vulnerabilities found in the OpenClaw framework and the evolving landscape of “agentic” identity security.

The following conversation examines the technical mechanics of the ClawJacked flaw, the dangers of log poisoning in AI reasoning, and the rise of malicious skills in open marketplaces. We also delve into the architectural requirements for safely deploying AI agents and the governance controls necessary to limit the blast radius of a potential compromise.

When a local WebSocket server lacks rate-limiting for localhost connections, how can an attacker brute-force a password to register a trusted device without a user prompt?

The technical beauty—and danger—of this exploit lies in how browsers handle WebSockets differently than standard HTTP requests. In the ClawJacked scenario, a user simply visits a malicious website, and behind the scenes, JavaScript begins firing connection attempts to localhost on the specific OpenClaw gateway port. Because there is no rate-limiting for these local connections, the script can cycle through thousands of password combinations per minute until it hits a match. Once authenticated, the gateway’s inherent trust in “localhost” triggers an auto-approval mechanism, silently registering the attacker’s script as a trusted device. This bypasses the manual pairing confirmation that a remote user would normally face, granting the attacker admin-level permissions to dump configuration data or read private application logs without a single notification appearing on the victim’s screen.
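The missing control described above is straightforward to express. OpenClaw's gateway code is not shown in this article, so the following is only a minimal sketch of a per-client token-bucket limiter for authentication attempts; the class and parameter names are illustrative, not taken from OpenClaw.

```python
import time
from collections import defaultdict
from typing import Optional


class AuthRateLimiter:
    """Token-bucket limiter for authentication attempts, keyed per client.

    Without a control like this, a script on a malicious web page can cycle
    through thousands of password guesses per minute against a localhost
    WebSocket gateway before any human notices.
    """

    def __init__(self, max_attempts: int = 5, window_seconds: float = 60.0):
        self.max_attempts = max_attempts
        self.window = window_seconds
        self._attempts = defaultdict(list)  # client_id -> timestamps of recent tries

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        """Return True if this client may make another auth attempt."""
        now = time.monotonic() if now is None else now
        # Drop attempts that have aged out of the sliding window.
        recent = [t for t in self._attempts[client_id] if now - t < self.window]
        self._attempts[client_id] = recent
        if len(recent) >= self.max_attempts:
            return False  # throttled: too many attempts in the window
        recent.append(now)
        return True
```

Crucially, the limiter must key on the connecting client even for 127.0.0.1 traffic; treating "localhost" as a single implicitly trusted identity is exactly the assumption ClawJacked abuses.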

Since AI agents often read their own application logs for troubleshooting, how does a log poisoning vulnerability via public ports enable indirect prompt injections?

Log poisoning is a particularly insidious vector because it targets the agent’s “memory” and reasoning process rather than the code itself. When an attacker sends a malicious WebSocket request to port 18789, they aren’t trying to crash the system; they are writing specific strings into the log files that the agent later consults to solve problems. If the agent sees an injected instruction in its logs and interprets it as a legitimate operational hint, the attacker can effectively hijack its decision-making. We have seen cases where this leads to “indirect prompt injection,” where the agent is guided to reveal sensitive context or misuse its connected enterprise tools. It’s a subtle manipulation where the agent might be tricked into believing a certain troubleshooting step requires exfiltrating data to an external server.
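One practical mitigation is to treat log files as untrusted input and quarantine instruction-like lines before the agent consumes them. This is a heuristic sketch of that idea with illustrative patterns, not a complete defense against prompt injection:

```python
import re

# Strings that look like instructions aimed at the agent rather than
# ordinary diagnostic output. Heuristic and illustrative only — a real
# deployment would maintain and tune this list.
SUSPECT_PATTERNS = [
    re.compile(r"ignore (all|previous|prior) instructions", re.I),
    re.compile(r"\b(curl|wget|fetch)\b.*https?://", re.I),
    re.compile(r"(send|upload|exfiltrate).*(credential|secret|key)", re.I),
]


def filter_log_lines(lines):
    """Split log lines into (clean, quarantined) before the agent reads them."""
    clean, quarantined = [], []
    for line in lines:
        if any(p.search(line) for p in SUSPECT_PATTERNS):
            quarantined.append(line)  # surface to a human, never to the agent
        else:
            clean.append(line)
    return clean, quarantined
```

Pattern matching alone will not catch every injection, but keeping quarantined lines out of the agent's troubleshooting context raises the cost of the attack considerably.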

Malicious skills in open marketplaces have been found delivering info-stealer payloads and facilitating agent-to-agent scams on social platforms. What specific metrics or behavioral red flags should developers look for when auditing a new skill, and how can they prevent agents from storing private keys in plaintext?

Developers must look past the “benign” labels on platforms like VirusTotal, as we’ve seen 71 malicious skills on ClawHub that appeared harmless but were actually conduits for Atomic Stealer. A massive red flag is any skill that requires the execution of manual Terminal commands—such as those promoted by the actor @liuhui1010—under the guise of “fixing” macOS compatibility. Another behavioral red flag is any instruction that asks an agent to store Solana wallet private keys or other credentials in plaintext, which was the core of the bob-p2p-beta scam. To prevent this, developers should enforce strict “no-plaintext” policies and use secrets management tools, while also auditing the SKILL.md files for any external fetch commands directed at suspicious IP addresses like 91.92.242.30.
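The red flags above lend themselves to simple static checks on a skill's documentation before installation. Below is a minimal sketch of such an audit; the regexes cover only the specific signals discussed here (raw-IP fetches, plaintext key requests, manual Terminal instructions) and would need expansion for real use:

```python
import re

# Fetch commands pointed at a bare IP address rather than a named host.
IP_FETCH = re.compile(r"\b(curl|wget)\b[^\n]*\b\d{1,3}(?:\.\d{1,3}){3}\b")
# Requests to record wallet keys or seed material in the clear.
PLAINTEXT_KEY = re.compile(r"(private[_ ]key|seed phrase|mnemonic)\s*[:=]", re.I)
# "Compatibility fix" patterns that ask the user to run shell commands by hand.
TERMINAL_PROMPT = re.compile(r"(open|run).{0,20}terminal", re.I)


def audit_skill(markdown: str):
    """Return a list of red-flag findings for a SKILL.md document."""
    findings = []
    if IP_FETCH.search(markdown):
        findings.append("fetch from raw IP address")
    if PLAINTEXT_KEY.search(markdown):
        findings.append("plaintext credential storage")
    if TERMINAL_PROMPT.search(markdown):
        findings.append("asks user to run manual Terminal commands")
    return findings
```

Any non-empty result should block automatic installation and route the skill to a human reviewer, since a single hit on these patterns matches the exact tradecraft seen in the ClawHub campaign.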

Given that AI agents often require persistent credentials and the authority to execute tasks across disparate systems, why is it considered unsafe to run them on standard workstations?

Running an AI agent on a standard workstation is essentially inviting a “fox into the henhouse” because these agents operate with the same authority as the user but with much higher persistence. If an agent is compromised via a poisoned skill or a prompt injection, the attacker gains a foothold with “untrusted code execution” capabilities directly on the host machine. To mitigate this, a fully isolated environment is non-negotiable; this means deploying the agent on a dedicated virtual machine or a completely separate physical system. This isolation ensures that if the agent’s memory is modified or credentials are exfiltrated, the “blast radius” is contained within that specific, non-privileged environment rather than spreading to the entire corporate network.

Recent vulnerabilities have exposed AI frameworks to remote code execution and server-side request forgery. In terms of governance, what specific non-human identity controls should organizations enforce to limit the blast radius of a compromised agent, and how should these be integrated into existing security audits?

Organizations must move toward a “least privilege” model for what we call “agentic” or non-human identities. This involves enforcing granular access controls where the agent only has permission to touch non-sensitive data and specific, audited integration points. Integrating these into security audits means periodically reviewing every “trusted device” registered to the gateway and checking for CVEs like CVE-2026-25593 or CVE-2026-25475, which could allow path traversal or SSRF. Furthermore, a robust operating model must include a “rebuild plan” where agent runtimes are wiped and redeployed regularly from a known-good state to clear out any potential persistent threats or poisoned logs.
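A periodic trusted-device review of the kind described can be automated against a least-privilege baseline. This sketch assumes a hypothetical device record exported from the gateway; the scope names and idle threshold are illustrative:

```python
from dataclasses import dataclass


@dataclass
class TrustedDevice:
    device_id: str
    scopes: set
    last_seen_days: int


# Least-privilege baseline: an agent's paired device should only ever
# need non-sensitive, read-only access. (Illustrative scope names.)
ALLOWED_SCOPES = {"read:status", "read:nonsensitive"}


def review_devices(devices, max_idle_days: int = 30):
    """Flag registered devices that exceed the baseline or have gone idle —
    candidates for revocation in a periodic non-human-identity audit."""
    flagged = []
    for d in devices:
        excess = d.scopes - ALLOWED_SCOPES
        if excess:
            flagged.append((d.device_id, f"excess scopes: {sorted(excess)}"))
        if d.last_seen_days > max_idle_days:
            flagged.append((d.device_id, "idle beyond audit window"))
    return flagged
```

Run against a gateway compromised the way ClawJacked allows, a review like this would surface the silently registered attacker device the moment its scopes exceed the baseline.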

What is your forecast for the security of AI agent ecosystems?

I believe we are entering an era of “social engineering for machines,” where the primary battleground won’t be just code, but the trust relationship between agents. As we see with the BobVonNeumann actor on Moltbook, attackers are already creating “agent personas” to trick other autonomous systems into installing malicious skills. My forecast is that we will see a rapid shift toward zero-trust architectures specifically for AI-to-AI interactions. If we don’t start treating every automated suggestion and every marketplace skill as potentially hostile “untrusted input,” the very efficiency we gain from AI agents will become our greatest security liability.
