LiteLLM is an open-source proxy that sits between developer applications and large-language-model services. It is one of the most widely deployed AI-infrastructure components in production today. Among its features is a “custom-code guardrail” capability: a proxy administrator can submit a Python function that runs on every routed request to filter content, redact PII, or detect prompt injection. The proxy compiles and executes that submitted code inside a sandbox.
On 2026-05-07, the LiteLLM project disclosed CVE-2026-40217. The sandbox could be escaped. An attacker holding the proxy’s administrator credential could send a crafted Python function that walked through the supposed restrictions and reached os.popen, giving them shell execution in the proxy process.
Five days later, on 2026-05-12, an attacker did exactly that against an internet-exposed proxy. The Python payload they sent was not derived from the advisory. It was copy-pasted from the project’s own regression test for the fix — the file checked into main three weeks before the advisory dropped to prove the patch closed the hole. Only the final shell command was changed. The test runs id. The attacker ran wget … xmrig … pool.hashvault.pro …, the install script for a Monero cryptocurrency miner.
This post walks through the bypass, the operator’s infrastructure, and the structural disclosure issue it exposes: a regression test designed to prove a patch works is, by construction, a working exploit for anyone who hasn’t applied that patch yet.
Background
/guardrails/test_custom_code is an endpoint that allows a LiteLLM proxy administrator to submit a Python apply_guardrail(...) function. The proxy compiles the submitted code and runs it on sample input. The feature exists so administrators can validate guardrails before deploying them in production. The security boundary is the sandbox the submitted code executes inside.
Through LiteLLM 1.83.9, that sandbox was a regular expression that scanned submitted source for the literal strings __globals__, __import__, __builtins__, and __code__. Any submission containing those identifiers was rejected. Issue #21258 documented the obvious bypass in February. The construction e = "ex"+"ec"; e("...") defeats the regex because the dangerous identifier is assembled at runtime rather than written in source. The community-contributed fix attempt in PR #21922 was abandoned.
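The weakness of a source-text blocklist can be seen in a few lines. The sketch below is a hypothetical stand-in for the pre-fix check, not LiteLLM's actual code: the pattern and the name passes_source_scan are assumptions made for illustration. It shows that a banned string written literally is caught, while the same string assembled from fragments is not.

```python
import re

# Hypothetical stand-in for the regex check: reject source text containing
# any of the four banned identifier strings named above.
BANNED = re.compile(r"__globals__|__import__|__builtins__|__code__")

def passes_source_scan(source: str) -> bool:
    """Return True when the scanner finds no banned literal in the source text."""
    return BANNED.search(source) is None

direct = 'name = "__globals__"'
assembled = 'name = "_" + "_gl" + "ob" + "als" + "_" + "_"'

print(passes_source_scan(direct))     # False: the literal appears in source
print(passes_source_scan(assembled))  # True: the string only exists at runtime

# The fragments still evaluate to the banned name once the code runs.
runtime_value = "_" + "_gl" + "ob" + "als" + "_" + "_"
print(runtime_value == "__globals__")  # True
```

The scanner and the interpreter disagree about what the program says, and that gap is what the bypass relies on.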
The real fix landed in PR #25818, merged 2026-04-15 by maintainer jaydns. The patch replaced the regex with RestrictedPython, an AST-level transformer that rejects underscore-prefixed names, attribute writes, imports, and exec/eval/compile at compile time rather than at source-text scan time. The corresponding advisory was published 2026-05-07 as GHSA-wxxx-gvqv-xp7p and assigned CVE-2026-40217, with a CVSS base score of 7.5. Exploitation requires proxy-administrator authentication.
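To make the difference between a text scan and an AST-level check concrete, here is a minimal sketch using only the standard-library ast module. It is not RestrictedPython's implementation, and the class and function names are hypothetical. It only approximates the categories of construct listed above. RestrictedPython additionally restricts the builtins available at runtime, which this sketch does not attempt to model.

```python
import ast

FORBIDDEN_CALLS = {"exec", "eval", "compile"}

class StructuralChecker(ast.NodeVisitor):
    """Illustrative approximation of compile-time structural checks."""

    def __init__(self) -> None:
        self.violations: list[str] = []

    def visit_Name(self, node: ast.Name) -> None:
        if node.id.startswith("_"):
            self.violations.append(f"name {node.id!r}")
        self.generic_visit(node)

    def visit_Attribute(self, node: ast.Attribute) -> None:
        if node.attr.startswith("_"):
            self.violations.append(f"attribute {node.attr!r}")
        elif isinstance(node.ctx, ast.Store):
            self.violations.append(f"attribute write {node.attr!r}")
        self.generic_visit(node)

    def visit_Import(self, node: ast.Import) -> None:
        self.violations.append("import")

    def visit_ImportFrom(self, node: ast.ImportFrom) -> None:
        self.violations.append("import")

    def visit_Call(self, node: ast.Call) -> None:
        if isinstance(node.func, ast.Name) and node.func.id in FORBIDDEN_CALLS:
            self.violations.append(f"call to {node.func.id!r}")
        self.generic_visit(node)

def find_violations(source: str) -> list[str]:
    """Parse the source and return every structural violation found."""
    checker = StructuralChecker()
    checker.visit(ast.parse(source))
    return checker.violations

print(find_violations("total = 1 + 2"))       # []
print(find_violations("import os"))           # ['import']
print(find_violations("value = fn.__code__")) # ["attribute '__code__'"]
```

Because the check runs on parsed syntax rather than raw text, it rejects whole categories of construct instead of a fixed list of spellings.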
CVE-2026-40217 is the seventh LiteLLM advisory in a five-week disclosure window. The most severe of the cluster is CVE-2026-42208, a CVSS 9.3 unauthenticated SQL injection in API-key verification triggered by the Authorization header itself. Because that flaw requires no credential at all, operators should weigh it ahead of the authenticated sandbox escape when prioritizing patches for the same code base.
The bypass
The technique rewrites the compiled bytecode of a function the payload itself defines, in a manner that never spells the dangerous identifiers in source.
The first step reaches Python’s root class object without using any of the four banned identifier strings. The call str.mro() returns the list [str, object], so str.mro()[1] resolves to object. The method is available because every type inherits mro() from type. The expression contains no underscores.
The second step builds a function whose compiled code object can be swapped out. A trivial generator definition def g(fn): yield fn.placeholder is compiled at definition time. The resulting code object carries a tuple named co_names that lists the global and attribute names the bytecode refers to by index. For this generator, co_names contains a single entry, "placeholder". The generator body does not execute at this point. The compiled bytecode sits on the function object, where it can later be replaced with a modified copy.
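Both observations are easy to confirm in an interpreter. The snippet below is a benign check of the two facts just described. It reads __code__ directly for clarity, which the payload itself avoids doing in source.

```python
# Step one: str's method resolution order is [str, object], so index 1 is object.
root = str.mro()[1]
print(root is object)  # True

# Step two: defining the generator function compiles it without running its body.
def g(fn):
    yield fn.placeholder

# The only name the bytecode refers to by index is the attribute name.
print(g.__code__.co_names)  # ('placeholder',)
```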
A condensed view of the compiled bytecode clarifies the mechanism.
LOAD_FAST 0 (fn)
LOAD_ATTR 0 (placeholder)
YIELD_VALUE
The LOAD_ATTR 0 instruction looks up the attribute whose name resides at index zero of co_names. The third step rewrites that table. The payload constructs the strings "__globals__" and "__code__" at runtime by concatenating innocuous fragments such as "_" + "_gl" + "ob" + "als" + "_" + "_". The regex matches literal text in the submitted source. It does not evaluate constant string expressions, so the runtime-assembled values are never compared against the blocklist.
The payload then calls c.replace(co_names=("__globals__",)) to produce a new code object whose name table reads __globals__ instead of placeholder, and uses object.__setattr__ to install that code object on the generator function. The equivalent direct assignment g.__code__ = new_c would be rejected by the regex because it names __code__ in source. Routing the write through setattr with a string argument moves the target name outside the source-text scanner’s view.
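The name-table rewrite can be illustrated with a harmless attribute. This is a sketch of the underlying Python mechanism only, using CodeType.replace() on a plain function. The names read, box, and other are hypothetical and chosen for the demo; nothing here touches __globals__.

```python
from types import SimpleNamespace

def read(obj):
    # Compiles to a LOAD_ATTR on co_names[0], which is "placeholder".
    return obj.placeholder

box = SimpleNamespace(placeholder="original", other="redirected")
before = read(box)

# CodeType.replace() returns a copy with the given fields swapped. The
# bytecode is unchanged; only the name table it indexes into differs.
patched = read.__code__.replace(co_names=("other",))
read.__code__ = patched
after = read(box)

print(before, after)  # original redirected
```

The function's source still says placeholder, yet the call now reads a different attribute. A scanner that only looks at source text has no way to see the change.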
Once the rewritten generator is iterated, the first yielded value is the calling helper’s __globals__ dictionary. From that dictionary, __builtins__["__import__"]("os") is reachable, and os.popen follows directly. The fix in 1.83.10 removes the regex layer and replaces it with RestrictedPython, which rejects underscore names, attribute writes, imports, and dynamic code execution at compile time. The transformer denies the bypass before any bytecode capable of being rewritten is produced.
The patch shipped the exploit
The captured request body matches the upstream regression test fixture, BYTECODE_REWRITE_PAYLOAD in tests/test_litellm/proxy/guardrails/test_custom_code_security.py, line for line. Indentation, variable names, the unused placeholder shim, and the redundant break inside the iteration loop are all preserved. The single modification is the final popen argument. The upstream fixture runs id so the test assertion can pattern-match on uid=. The captured version runs a wget for an XMRig binary followed by a launch against pool.hashvault.pro.
The fixture was committed with PR #25818 on 2026-04-15. The corresponding advisory was not published until 2026-05-07, three weeks later. The advisory itself does not include a proof-of-concept. The patch’s test suite does, by necessity, because the test is the assertion that proves the fix closes the vulnerability. For an actor reading the May 7 advisory and looking for a working payload, the regression test was the most direct source available.
A regression test phrased as a working exploit is a recurring trade-off across security maintenance. Several projects mitigate it by landing the fix with a minimal smoke test and merging the fixture-level assertion in a follow-up commit after the advisory window has closed. The approach preserves the test-suite contract while avoiding the disclosure asymmetry observed in this case.
The operator
The source address 62.210.172.179 is a Scaleway bare-metal instance in the fr-par-1 region. The payload host 62.210.172.174 resides five addresses lower on the same /24 and serves the XMRig binary from port 8085. Reverse DNS on both hosts conforms to the UUID-prefixed pattern *.fr-par-1.baremetal.scw.cloud, which is consistent with recent provisioning. Neither address carried reputation data on AbuseIPDB, VirusTotal, or GreyNoise at the time of capture.
The operation displays deliberate operational discipline. The exploit payload is current, drawn from the regression test fixture rather than the older naive form that fails against the 1.83.9 regex. The infrastructure choice favors bare-metal capacity over transient virtual machines, with the scanner and dropper allocated as adjacent addresses on the same subnet. The adjacency provides redundancy. The takedown of one host does not interrupt the operation of the other, because the surviving host can continue scanning or serving the payload with no DNS or configuration changes required. The clean reputation footprint across the public feeds suggests either fresh identity provisioning or recently rotated billing material.
The actor’s mining configuration uses pool.hashvault.pro:443 over TLS, wallet 42BQmWHb6wnW2B9DTXaErWa4iN57F5FLpTwX5a3wwAuifQ68Z8vmFoDJAJonLWzvsP6vTpSNMoNGE5AfjANQW2A7NV8Zozm, and rig password wsckt. The miner flags -k --tls request a persistent TLS-encrypted stratum connection.
The scan itself is more visible than the infrastructure. Five POST requests to /guardrails/test_custom_code were issued in 1.3 seconds, each carrying an identical body. Four carried a guessed Authorization value (sk-litellm-master-key, sk-1234, sk-admin, and sk-test-key) and the fifth carried no credential at all. The repeated body across five auth attempts produces a stable signature recoverable from any WAF that logs request bodies. The rig password wsckt is reused across the campaign and serves as a cross-honeypot pivot.
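The credential-rotation burst lends itself to a simple log heuristic: the same body, from the same source, with several different Authorization values in a short window. The sketch below assumes a hypothetical LoggedRequest record and made-up thresholds (a 5-second window, three distinct values). Neither comes from the capture, and both would need tuning against real traffic.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass

TARGET_PATH = "/guardrails/test_custom_code"

@dataclass(frozen=True)
class LoggedRequest:
    ts: float      # arrival time in seconds
    src_ip: str
    path: str
    auth: str      # Authorization header value, "" when absent
    body: bytes

def find_key_guessing_bursts(requests, window=5.0, min_distinct_auth=3):
    """Return (src_ip, body_sha256) pairs that sent one body with several
    different Authorization values inside `window` seconds."""
    groups = defaultdict(list)
    for req in requests:
        if req.path == TARGET_PATH:
            digest = hashlib.sha256(req.body).hexdigest()
            groups[(req.src_ip, digest)].append(req)

    flagged = []
    for key, reqs in groups.items():
        reqs.sort(key=lambda r: r.ts)
        start = 0
        for end in range(len(reqs)):
            # Advance the left edge until the span fits inside the window.
            while reqs[end].ts - reqs[start].ts > window:
                start += 1
            if len({r.auth for r in reqs[start:end + 1]}) >= min_distinct_auth:
                flagged.append(key)
                break
    return flagged

# Synthetic log mirroring the burst described above (documentation-range IPs).
body = b"def apply_guardrail(...): ..."
auths = ["Bearer sk-litellm-master-key", "Bearer sk-1234",
         "Bearer sk-admin", "Bearer sk-test-key", ""]
burst = [LoggedRequest(i * 0.325, "203.0.113.7", TARGET_PATH, a, body)
         for i, a in enumerate(auths)]
admin = [LoggedRequest(t, "198.51.100.9", TARGET_PATH, "Bearer sk-real", body)
         for t in (0.0, 2.0)]

flagged = find_key_guessing_bursts(burst + admin)
print([ip for ip, _ in flagged])  # ['203.0.113.7']
```

Grouping on a body hash rather than the body text keeps the heuristic independent of the specific payload, so it would also catch a variant whose shell command differs.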
Immediate Mitigation Steps
- Upgrade to LiteLLM version 1.83.11 or later immediately.
- Rotate the LITELLM_MASTER_KEY if it matches sk-1234 or sk-litellm-master-key.
- Consider disabling the /guardrails/test_custom_code endpoint in production environments.
- Implement WAF or intrusion-detection rules matching the patterns listed in the IOCs section.
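The key-rotation check in the list above is straightforward to automate. This sketch assumes the master key is supplied through the LITELLM_MASTER_KEY environment variable. Deployments that set it elsewhere would need to read it from there instead. The function name is hypothetical.

```python
import os

# The two default literals named in the mitigation list above.
KNOWN_DEFAULT_KEYS = {"sk-1234", "sk-litellm-master-key"}

def master_key_needs_rotation(key):
    """Return True when the configured key is one of the published defaults."""
    return key is not None and key.strip() in KNOWN_DEFAULT_KEYS

configured = os.environ.get("LITELLM_MASTER_KEY")
if configured is None:
    print("LITELLM_MASTER_KEY is not set in this environment")
elif master_key_needs_rotation(configured):
    print("rotate the master key before upgrading")
else:
    print("master key is not a known default")
```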
IOCs
| Type | Value |
|---|---|
| Scanner | 62[.]210[.]172[.]179 (Scaleway bare-metal, fr-par-1) |
| Dropper | http://62[.]210[.]172[.]174:8085/xmrig |
| Pool | pool.hashvault.pro:443 (TLS) |
| Wallet | 42BQmWHb6wnW2B9DTXaErWa4iN57F5FLpTwX5a3wwAuifQ68Z8vmFoDJAJonLWzvsP6vTpSNMoNGE5AfjANQW2A7NV8Zozm |
| Rig password / campaign tag | wsckt |
Body-content patterns suitable for Suricata or WAF deployment against any LiteLLM proxy reachable on the public internet:
- str.mro() co-occurring with gi_code in any POST body.
- The substring c.replace(co_names.
- The split-dunder pattern "_"\s*\+\s*"_(gl|co|bu|im).
- Authorization: Bearer sk-litellm-master-key. This value is the deprecated LiteLLM default and remains present in approximately 305 indexable GitHub repositories.
- Authorization: Bearer sk-1234. This value is the current canonical LiteLLM default and co-occurs with LITELLM_MASTER_KEY in approximately 1,300 indexable repositories.
If the proxy master key matches either literal, rotation is required before upgrading. The patch closes the bypass. It does not protect a proxy whose administrative authentication is already known.
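The three body-content patterns listed above can be combined into a small matcher for offline log review. This is a sketch rather than a production rule. The function name and hit labels are hypothetical, and it assumes the caller passes the decoded submitted source: if the code travels inside a JSON string, its quotes arrive escaped and the split-dunder regex would need adjusting.

```python
import re

# The split-dunder pattern from the list above, applied to decoded source text.
SPLIT_DUNDER = re.compile(r'"_"\s*\+\s*"_(gl|co|bu|im)')

def match_body_indicators(body: str) -> list[str]:
    """Return a label for each body-content indicator found in the text."""
    hits = []
    if "str.mro()" in body and "gi_code" in body:
        hits.append("mro+gi_code")
    if "c.replace(co_names" in body:
        hits.append("co_names-replace")
    if SPLIT_DUNDER.search(body):
        hits.append("split-dunder")
    return hits

suspicious = (
    "o = str.mro()[1]\n"
    "c = gen.gi_code\n"
    'n = "_" + "_gl" + "ob"\n'
    "c2 = c.replace(co_names=(n,))\n"
)
benign = "def apply_guardrail(text):\n    return text\n"

print(match_body_indicators(suspicious))  # ['mro+gi_code', 'co_names-replace', 'split-dunder']
print(match_body_indicators(benign))      # []
```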
Provenance
The payload-to-patch match was verified by fetching the regression test from BerriAI/litellm and diffing. CVE numbers, severities, affected ranges, and fix versions were cross-checked against the cited GitHub Advisory DB pages. GitHub repo-prevalence counts came from gh api search/code. ASN and reverse DNS data came from ipinfo.io. No subject IP was probed. No payload was executed.