LLMjacking is the practice of stealing AWS, GCP, or Azure credentials specifically to run large-language-model inference on the victim’s account. The attacker exfiltrates credentials, validates them against the cloud provider’s API, and then either resells the access on underground markets or burns it directly through services like AWS Bedrock or Azure OpenAI. Sysdig first documented the pattern at scale in 2024. Reported victim costs now exceed $100,000 per day per compromised account, and sanctions evasion is documented as one of the motives.
The standard LLMjacker behavior is opportunistic: harvest everything, validate everything, test everything. On 2026-04-24 we recorded one that behaved differently. A DigitalOcean droplet hit a honeypot persona we run that imitates an Open WebUI deployment. The lure’s /.env returned five distinct credentials. Five hours later the attacker had validated exactly one: an AWS access key. The other four were never tested. One of those four was a database connection string whose hostname is a Canarytokens DNS subdomain. The hostname fires an alert on a single nslookup from anywhere on the internet. The attacker never resolved it.
That triage decision is the subject of this post. For anyone running production AI infrastructure, it is also evidence that the LLMjacker class is past opportunism and learning to prioritize what they steal.
The bait
The persona was an Open WebUI deployment. Its /.env returned five secrets: two live Canarytokens (Thinkst’s free service that mints credentials whose use fires a webhook to the registrant) and three uninstrumented decoys. The two live tokens fired on different signals. The AWS-key trap fires when the key is used against any AWS API; the DNS-token, embedded in a DATABASE_URL whose hostname is a Canarytokens subdomain, fires on a single resolution attempt anywhere on the internet. Each fires on its own webhook, so when one fires and the other stays silent, the silence is operator choice, not a gap in our pipeline.
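For shape, this is what the lure returns. Everything below is an invented illustration, not the literal bait; the real token values and decoy names are held back:

```
# illustrative /.env, placeholder values throughout
OPENAI_API_KEY=sk-xxxx                    # decoy, uninstrumented
ANTHROPIC_API_KEY=sk-ant-xxxx             # decoy, uninstrumented
SMTP_PASSWORD=xxxx                        # decoy, uninstrumented
AWS_ACCESS_KEY_ID=AKIAXXXXXXXXXXXXXXXX    # live Canarytoken: fires on first use against any AWS API
AWS_SECRET_ACCESS_KEY=xxxx
DATABASE_URL=postgres://app:xxxx@xxxxxxxx.canarytokens.com:5432/app   # live DNS token: fires on a single resolution
```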
This kind of separated, per-credential instrumentation is the analytical floor for what follows. Without it, all we would know is “credentials were stolen.” With it, we can read which one was tested and infer why.
The harvest
The harvest looked like browser traffic. From 68.183.103.79 (DigitalOcean NYC3), five HTTP GETs against the persona in 1.27 seconds:
| Path | Source port | Client |
|---|---|---|
| GET / | 42530 | Chrome 133 / Linux |
| GET /api/config | 42530 | Firefox 135 / macOS |
| GET /.env | 42538 | Chrome 135 / Win10 |
| GET /api/v1/auths | 42546 | Chrome 135 / Win10 |
| GET /static/favicon.png | 42530 | Chrome 134 / macOS |
Four user-agents, three TCP connections, sub-second total. The /.env and /api/v1/auths GETs left 0.6 ms apart on different source ports. Different sockets, not pipelining. Sub-millisecond inter-arrival across separate connections is what asyncio.gather-style concurrent dispatch produces; the client fires multiple requests in parallel and waits for all responses, rather than walking them one by one.
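The operator’s harvester is unidentified; the sketch below only reproduces the timing shape we observed, with placeholder paths, truncated UA strings, and a documentation-range target. One pooled httpx client plus asyncio.gather yields exactly this signature: a few reused connections, requests dispatched in parallel, sub-millisecond inter-arrival.

```python
import asyncio
import random

import httpx

PATHS = ["/", "/api/config", "/.env", "/api/v1/auths", "/static/favicon.png"]
UAS = [  # rotated per request, as in the observed harvest; strings truncated
    "Mozilla/5.0 (X11; Linux x86_64) ... Chrome/133.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:135.0) ... Firefox/135.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ... Chrome/135.0.0.0 Safari/537.36",
]

async def harvest(base_url: str) -> dict[str, str]:
    # One pooled client, all requests fired at once: responses land with
    # sub-millisecond inter-arrival across a handful of TCP connections.
    async with httpx.AsyncClient(base_url=base_url, timeout=5.0) as client:
        responses = await asyncio.gather(
            *(client.get(p, headers={"User-Agent": random.choice(UAS)}) for p in PATHS),
            return_exceptions=True,
        )
    return {p: r.text for p, r in zip(PATHS, responses) if isinstance(r, httpx.Response)}

# asyncio.run(harvest("http://192.0.2.10:8080"))  # placeholder target
```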
Order matters as much as timing. A typical credential-sweeper hits /.env and stops. This operator read /api/config first, where the persona returns its application metadata: MCP server URLs, the Ollama endpoint, the vLLM endpoint. They profiled before they took.
Eight days earlier, a different DigitalOcean droplet (45.55.35.84) walked the same four core paths against the same persona with the same UA rotation.
The fire chain
146 seconds after GET /.env, the first Canarytokens hit landed. Same source IP. User-agent python-httpx/0.28.1. Event sts:GetCallerIdentity. The standard alive-check.
Then five hours and four minutes of nothing.
At 23:46:09 UTC the rest of the playbook fired in twenty seconds:
23:46:09.435 iam:GetUser
23:46:09.584 ec2:DescribeRegions
23:46:09.851 iam:GetAccountSummary
23:46:09.985 s3:ListBuckets UA: [python-httpx/0.28.1]
23:46:10.049 iam:ListAttachedUserPolicies
23:46:10.188 bedrock:ListFoundationModels
23:46:10.253 iam:SimulatePrincipalPolicy
23:46:10.453 sts:GetCallerIdentity (re-check)
23:46:10.728 bedrock:ListFoundationModels (region 2)
23:46:14.175 bedrock:ListFoundationModels (region 3)
23:46:15.498 iam:ListUsers
23:46:24.769 bedrock:ListFoundationModels (region 4)
23:46:29.683 ce:GetCostAndUsage

The first nine calls covered six AWS services in 1.293 seconds. That cadence (sub-300 ms across mixed services) is concurrent dispatch again, not a sequential loop. The tail then walks two more Bedrock regions sequentially; ListFoundationModels is region-scoped, so four calls means four regions covered. The closing call is GetCostAndUsage, consistent with checking how much hidden Bedrock spend the account can absorb before a billing alarm trips.
The five-hour gap is human. The 20-second burst is not.
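Defenders can reproduce the enumeration shape against their own accounts to exercise detections. A minimal boto3 sketch of the region-scoped Bedrock probe; the region list is illustrative, and this is our reconstruction, not the operator’s tool:

```python
import boto3
from botocore.exceptions import ClientError

# ListFoundationModels answers for one region at a time, so mapping Bedrock
# availability requires one call per region: four calls, four regions.
for region in ["us-east-1", "us-west-2", "eu-central-1", "ap-southeast-1"]:
    try:
        models = boto3.client("bedrock", region_name=region).list_foundation_models()
        print(f"{region}: {len(models['modelSummaries'])} foundation models visible")
    except ClientError as err:
        print(f"{region}: {err.response['Error']['Code']}")
```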
The burst has one anomaly. Five hits logged [python-httpx/0.28.1] with literal brackets; the other nine logged the bare string. A serialization artifact would bracket all fourteen or none. One service-specific bracket pattern suggests the operator’s tool routes S3 calls through a different code path that wraps the UA literally. Verify your data source preserves UA bytes verbatim before treating this as an indicator.
What they refused to take
The operator tested the AWS key. They did not test the database connection string. They did not resolve its hostname.
The bar to fire the DNS canary is one nslookup. Across a 30-day window we saw zero hits on it from any source. The operator never cleared the bar.
This is the structural finding, and it is the point of negative-space analysis: an operator’s choices include the artifacts they decline to touch, and those non-actions are evidence too. CloudTrail tells you what someone did with stolen credentials. A multi-credential honeypot with separate instrumentation per token tells you what they chose not to do. The silence on the DNS-token is data.
The same shape repeats across the operation. Three sessions, thirteen HTTP events, fourteen AWS API calls, eight days. The operator never POSTed a single byte. They GET’d /api/generate on our Ollama persona, normally a POST endpoint, with no body. Every AWS call was enumeration. No bedrock-runtime:InvokeModel calls in our window. They probe and validate. They do not consume.
Two readings fit the data. Either the validated AKIA is the product (monetization downstream via resale), or this is recon before a follow-up we have not seen. We cannot decide with what we have.
Operator profile
This was an automated harvest-to-validate pipeline. 146 seconds from /.env GET to first AWS API call. Two stacks: a browser-emulation harvester with multi-profile UA rotation, and a Python validator using httpx with a custom AWS Sigv4 signer. Concurrent dispatch on both halves (0.6 ms during the harvest, sub-300 ms during the burst).
The validator UA is python-httpx/0.28.1. AWS SDKs emit Boto3/x.y.z Botocore/x.y.z strings and never python-httpx. The choice is consistent with deliberate evasion of the UA detections defenders adopted after Sysdig’s 2024 disclosure. It could equally be a developer reaching for a generic HTTP client without thinking about its fingerprint. Either reading fits.
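To make the fingerprint concrete: httpx stamps python-httpx/<version> on every request unless the caller overrides it, and SigV4-signing a raw request outside the SDK takes a few lines. A sketch of the pattern our telemetry is consistent with; botocore serves purely as a signing helper here, and the operator’s actual signer is unknown:

```python
import httpx
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest
from botocore.credentials import Credentials

def check_key(access_key: str, secret_key: str) -> str:
    # Sign a raw sts:GetCallerIdentity call, then send it with httpx.
    # CloudTrail records userAgent as python-httpx/<version>, never Boto3/...
    req = AWSRequest(
        method="POST",
        url="https://sts.us-east-1.amazonaws.com/",
        data="Action=GetCallerIdentity&Version=2011-06-15",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    SigV4Auth(Credentials(access_key, secret_key), "sts", "us-east-1").add_auth(req)
    return httpx.post(req.url, content=req.data, headers=dict(req.headers)).text
```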
What we can rule out is random infrastructure. The operator reused the same droplet for harvest and validation. The JA4H pool stayed stable across eight days and two droplets. HTTP-layer client fingerprints survive cloud IP rotation; ASN matching does not.
A note on language. Observed means we have it in our data. Consistent with means our data fits the claim but does not prove it. We do not write captured, caught, or attacker. The operator did not breach anything we own; they validated bait we exposed.
How we linked the droplets
Both source IPs sit in DigitalOcean. ASN alone is too weak; DigitalOcean carries heavy mixed traffic. Three application-layer signals link the droplets, each sufficient on its own. Stacked, the attribution survives cloud IP rotation that ASN matching cannot.
JA4H pool. The harvester emitted six unique JA4H values across the three sessions:
ge11nn06enus_c20ff315eaaf_c13731277bf9_000000000000 (all 3 sessions)
ge11nn06enus_c20ff315eaaf_a0c470af84f7_000000000000 (2 of 3 sessions)
ge11nn05enus_399c78672a06_d5709276753a_000000000000 (2 of 3 sessions)
ge11nn06enus_c20ff315eaaf_5a05af784c86_000000000000 (45.55.35.84 only)
ge11nn05enus_399c78672a06_22fb6d9382c5_000000000000 (68.183.103.79 only)
ge11nn05enus_399c78672a06_4e0ee0cfd4ce_000000000000 (single session)

Across our 90-day corpus, those six values appear on exactly two IPs. Both belong to this operator.
Header order. The harvester sends headers in host, accept-encoding, connection, user-agent, accept, accept-language order. Browsers do not. Across the same 90-day corpus, this exact order appears on the same two IPs and nowhere else.
Path sequence. The walk /, /api/config, /api/v1/auths, /.env (single source, all four within five seconds) appears on the two harvest sessions and nowhere else. The /api/v1/auths path alone has four total hits across our corpus. Half are this operator.
Combine all three and same-operator attribution is defensible without the IPs. The next droplet from this operator should land the same fingerprints, unless the tooling itself changes.
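In pipeline terms the stacked check is small. A sketch against a generic session record; the field names are whatever your edge telemetry emits, not a fixed schema:

```python
from dataclasses import dataclass

OPERATOR_JA4H = {
    "ge11nn06enus_c20ff315eaaf_c13731277bf9_000000000000",
    "ge11nn06enus_c20ff315eaaf_a0c470af84f7_000000000000",
    "ge11nn05enus_399c78672a06_d5709276753a_000000000000",
    "ge11nn06enus_c20ff315eaaf_5a05af784c86_000000000000",
    "ge11nn05enus_399c78672a06_22fb6d9382c5_000000000000",
    "ge11nn05enus_399c78672a06_4e0ee0cfd4ce_000000000000",
}
OPERATOR_HEADER_ORDER = ["host", "accept-encoding", "connection",
                         "user-agent", "accept", "accept-language"]
OPERATOR_WALK = ["/", "/api/config", "/api/v1/auths", "/.env"]

@dataclass
class Session:               # hypothetical record shape
    ja4h: set[str]           # unique JA4H values seen in the session
    header_order: list[str]  # lowercased header names, wire order
    paths: list[str]         # request paths, arrival order

def operator_signals(s: Session) -> int:
    """Count matching signals; each alone was sufficient in our corpus."""
    hits = int(bool(s.ja4h & OPERATOR_JA4H))
    hits += int(s.header_order[:6] == OPERATOR_HEADER_ORDER)
    walk = iter(s.paths)     # ordered-subsequence check for the four-path walk
    hits += int(all(p in walk for p in OPERATOR_WALK))
    return hits
```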
Where this sits
The burst shape matches the validator behavior Sysdig described in their 2024 LLMjacking disclosure; what changed is the UA. Sysdig’s actor used Boto3+Botocore; ours uses python-httpx/0.28.1. That fits the anomaly framework Permiso documents (AKIA with Mozilla UA, AKIA with no UA touching Bedrock). Permiso also notes that more selective LLMjackers avoid GetCallerIdentity because defenders monitor for it. Ours uses it twice in the burst.
Wiz documented JINX-2401, an operator running similar enumeration through Proton VPN with a Python+aiohttp validator. JINX-2401’s InvokeModel attempts were blocked by an SCP, and they used whatever IAM access they had with no observable selectivity. Ours differs in three ways: direct DigitalOcean infrastructure rather than VPN, python-httpx rather than aiohttp, and per-credential selectivity inside the harvested secrets file.
Pillar Security’s Operation Bizarre Bazaar describes a three-stage supply chain selling stolen LLM compute through the silver.inc marketplace: separate operators for harvest, validation, and resale. Our finding sits at a different layer: one operator choosing which credentials in a single harvested file to test and which to skip. Compatible models. A specialized harvester selling validated keys downstream would behave like ours did.
What this means for AI-infrastructure operators
For organizations running production AI/ML stacks, the operator class above shifts three priorities to the front.
A reachable .env is a compromised AWS account. AI-stack deployments routinely hold AWS keys in .env for S3 or Bedrock access. If the file is reachable from the internet, the keys are exfiltrated, full stop. Move them to runtime-injected stores (Parameter Store, Secrets Manager, Vault). The cost is a pipeline change. The benefit is removing the most common single-step compromise path for AI credentials.
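A minimal sketch of the replacement pattern, assuming Secrets Manager; the secret name is a placeholder:

```python
import json
import boto3

def load_inference_secrets(secret_id: str = "prod/open-webui/keys") -> dict:
    # Fetched at process start over the instance/task role. Nothing
    # credential-shaped lives in .env or on disk, so a reachable .env
    # leaks configuration, not keys.
    resp = boto3.client("secretsmanager").get_secret_value(SecretId=secret_id)
    return json.loads(resp["SecretString"])
```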
Scope your Bedrock IAM tightly. A key with bedrock:InvokeModel on every model in every region is the most valuable AI-targeted asset in an AWS account; it lets a thief produce inference at scale where spend is least anomalous. Restrict by model ARN, by region, by principal. Use SCPs to deny CreateFoundationModelAgreement on accounts that do not need it. The JINX-2401 writeup is the existence proof of that control class working in the wild; Wiz’s tracked operator was stopped cold by it.
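One concrete shape of that control; the model ARN, principal, and region below are placeholders to translate into your own IaC:

```python
import json
import boto3

# Allow inference against exactly one model in exactly one region. Every other
# model/region pair, and all Bedrock management actions, fall through to the
# implicit deny for this principal.
scoped = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
        "Resource": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-haiku-20241022-v1:0",
    }],
}

boto3.client("iam").put_user_policy(
    UserName="inference-svc",            # placeholder principal
    PolicyName="bedrock-single-model",
    PolicyDocument=json.dumps(scoped),
)
```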
Hunt the indicators below, knowing which ones rot first. A Sigma rule on the UA or the IPs would be obsolete the day this post indexes. One incident is not enough to defend correlation thresholds. The right artifact is an IOC table with explicit half-life, tuned to your stack.
| Indicator | Value | Half-life | Notes |
|---|---|---|---|
| Source IP | 45.55.35.84 (DigitalOcean NYC3) | days | rotated within eight days last time |
| Source IP | 68.183.103.79 (DigitalOcean NYC3) | days | same operator, second droplet |
| Validator user-agent | python-httpx/0.28.1 (bare) | weeks, until disclosed | hunt against STS, IAM, EC2, S3, Bedrock, Cost Explorer in CloudTrail; expect FPs from legitimate Sigv4-signing automation, filter known service roles |
| Validator user-agent | [python-httpx/0.28.1] (bracketed) | weeks | observed only on s3:ListBuckets in our window; provenance not fully verified, treat as hypothesis-grade |
| Header order | host, accept-encoding, connection, user-agent, accept, accept-language | months | atypical browser ordering, survives UA changes |
| JA4H pool (base) | ge11nn06enus_c20ff315eaaf_c13731277bf9_000000000000 | months | present in all three sessions |
| JA4H pool (other) | ge11nn06enus_c20ff315eaaf_a0c470af84f7_000000000000, ge11nn05enus_399c78672a06_d5709276753a_000000000000 | months | two of six total values; collect with JA4+ tooling at the edge |
Read the table top to bottom; each row is harder to evade than the one above. CloudTrail-only shops get the UA, where the rot starts. Edge or WAF telemetry that captures HTTP header order is stronger. JA4H at the AI-application ingress survives both UA and IP rotation. That is the layer this operator class has not yet learned to vary.
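For CloudTrail-only shops, the UA rows are huntable today. A sketch over the lookup_events API, which only reaches back 90 days; use an Athena table over your trail for longer lookback. The substring match covers both the bare and bracketed forms, and expect the false positives the table warns about:

```python
import json
import boto3

SUSPECT_EVENTS = ["GetCallerIdentity", "ListFoundationModels", "GetCostAndUsage"]

ct = boto3.client("cloudtrail")
for name in SUSPECT_EVENTS:
    pages = ct.get_paginator("lookup_events").paginate(
        LookupAttributes=[{"AttributeKey": "EventName", "AttributeValue": name}]
    )
    for page in pages:
        for ev in page["Events"]:
            detail = json.loads(ev["CloudTrailEvent"])
            ua = detail.get("userAgent", "")
            if "python-httpx" in ua:  # matches bare and [bracketed] forms
                print(detail["eventTime"], detail["eventName"],
                      detail.get("sourceIPAddress"), ua)
```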
The recommendations above target AWS-resident workloads. For operators of GPU compute and model-serving infrastructure outside AWS, the threat shape is the same but the controls move: per-tenant inference rate limits, authorization scoped to specific model ARNs (or your equivalent), and anomaly detection on inference volume against unfamiliar models. Tenant-credential compromise is your tenants’ problem on paper, but their telemetry against your models is yours to instrument.
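As one sketch of the first of those controls, a per-tenant token bucket in front of the inference path; the rate and burst numbers are placeholders to tune per tenant tier:

```python
import time
from collections import defaultdict

class TenantRateLimiter:
    """Token bucket per tenant credential: caps sustained inference request rate."""

    def __init__(self, rate_per_s: float = 5.0, burst: int = 20):
        self.rate, self.burst = rate_per_s, burst
        self._state: dict[str, tuple[float, float]] = defaultdict(
            lambda: (float(burst), time.monotonic())
        )

    def allow(self, tenant: str) -> bool:
        tokens, last = self._state[tenant]
        now = time.monotonic()
        tokens = min(self.burst, tokens + (now - last) * self.rate)  # refill
        allowed = tokens >= 1.0
        self._state[tenant] = (tokens - 1.0 if allowed else tokens, now)
        return allowed
```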
If you run an AI-infrastructure honeypot or hunt the same operator class, the JA4H values above are the longest-lived comparable we have. We hold additional artifacts (the harvested AKIA, full HTTP request bodies, Canarytokens audit trail) under hold-back and will share with verified peers on request.
What we don’t know
The operator stopped at enumeration. No bedrock-runtime:InvokeModel calls appeared in our window.
We do not know whether they returned later from infrastructure we do not see. We do not know their broader target list; three sessions is too few. We do not know their monetization channel; the published research documents both internal consumption and resale, and we cannot distinguish.
We do not know who they are. The tool is unidentified. Six authenticated GitHub code searches for python-httpx, Sigv4, ListFoundationModels, and GetCallerIdentity together turned up zero matches. Private fork, independent reimplementation, or something on a closed channel. We cannot say.
Acknowledgments
Permiso, Sysdig, Wiz, Mitigant, Entro, and Pillar Security published the LLMjacking research that made this analysis tractable. Our honeypot runs a fork of Beelzebub by Beelzebub.AI, extended for AI-targeted telemetry. Canarytokens by Thinkst provided the credential instrumentation.