Most LLMjacking research begins at the victim’s bill: a CloudTrail log read after the money is gone, the theft reconstructed from its aftermath. We held both ends of the wire instead, the proxy that leaked the key and the canary that watched it land on AWS, so we could follow each credential on one clock, from the second it surfaced to its first call on someone else’s account.

Key findings

Found and live on Bedrock in minutes. From a key surfacing in our response to a live call against real AWS: under six minutes for one operator, three for another.
Validation, not a smash-and-grab. The operators ran variations of one careful routine: confirm the key works, in one case check the account’s threat detection before touching a model, inventory what it can reach, then stop. None settled in while the key stayed live.
All roads led to Bedrock. Of 52 addresses served the key, four turned it on real AWS, and every one reached for Bedrock; three issued InvokeModel against an account that was not theirs.
Two doors, the same room. The four came in through the LiteLLM admin surface. A fifth harvested an equivalent key straight from the proxy’s MCP server, then ran the identical play. Different initial access, one destination.

LLM gateways like LiteLLM put an organization’s model access, and the credentials that pay for it, behind one service. Steal a key and you run models on the victim’s Bedrock, Azure, or Vertex account, on their bill. The technique is LLMjacking, named by Sysdig in 2024 and documented since by Permiso, Wiz, and others. What none of that work could see is the theft itself: by the time a stolen key surfaces in the victim’s CloudTrail, it was taken somewhere no one was watching. So we built somewhere to watch.

The bait: deception engineering

Everything an operator touched ran on Beelzebub, our private research fork. We impersonated a LiteLLM proxy, the open-source gateway that fronts OpenAI, Anthropic, and AWS Bedrock for a growing number of deployments and is a recurring target in the wild. Four engineering decisions made the capture possible.

Fidelity, because a honeypot that reads as a honeypot catches nothing. We pulled the genuine LiteLLM container (ghcr.io/berriai/litellm) and captured its exact wire responses: status lines, header order, and the JSON shape of success and error bodies alike. We check our emulation against them continuously so it cannot drift. Probe the admin surface for tells (malformed errors, wrong header casing, off-by-one status codes) and you find none. It answers exactly as the real one does.
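One way to picture that continuous check is a comparison of each emulated response against its captured reference on the three axes named above: status, header order and casing, and JSON shape. The sketch below is illustrative only. `WireResponse`, `json_shape`, and `drift` are hypothetical names, the sample 401 bodies are invented for the example, and the real harness may work differently.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class WireResponse:
    status: int
    headers: tuple  # ordered (name, value) pairs, original casing preserved
    body: str       # raw JSON text


def json_shape(value):
    """Reduce a JSON value to its structure: keys and type names, no values."""
    if isinstance(value, dict):
        return {key: json_shape(item) for key, item in value.items()}
    if isinstance(value, list):
        return [json_shape(item) for item in value[:1]]
    return type(value).__name__


def drift(reference: WireResponse, emulated: WireResponse) -> list:
    """Return human-readable differences; an empty list means no drift."""
    problems = []
    if reference.status != emulated.status:
        problems.append(f"status {emulated.status} != {reference.status}")
    ref_names = [name for name, _ in reference.headers]
    emu_names = [name for name, _ in emulated.headers]
    if ref_names != emu_names:
        problems.append(f"header order/casing {emu_names} != {ref_names}")
    if json_shape(json.loads(reference.body)) != json_shape(json.loads(emulated.body)):
        problems.append("JSON body shape differs")
    return problems


# Invented sample data: same status and body shape, different header casing.
reference = WireResponse(
    status=401,
    headers=(("content-type", "application/json"), ("content-length", "52")),
    body='{"error": {"message": "auth failed", "code": "401"}}',
)
emulated = WireResponse(
    status=401,
    headers=(("Content-Type", "application/json"), ("content-length", "47")),
    body='{"error": {"message": "no key", "code": "401"}}',
)
print(drift(reference, emulated))
```

Run against the sample pair, the check reports the header casing mismatch and nothing else, because the status and the body shape agree even though the message text differs.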

A coherent fiction. Behind the proxy sits a mid-size data company: engineering teams, an on-call rotation, a CI service account, a spend ledger running into six figures. Pull the key list and you see a deploy-pipeline service key with a quarter-million-dollar budget, not a row of obvious decoys. The coherence is the point. Cross-reference one response against another and the story holds: hostnames in the config reappear in an incident report, in error messages, in the spend ledger.

One credential, deliberately over-exposed. A correctly configured LiteLLM masks its upstream provider secrets; ours surfaced a live-looking AWS access key through the admin surface, on the highest-value account. Stock LiteLLM does not hand out usable cloud keys, but a honeypot has to look misconfigured to be worth an attacker’s time, and a masked secret catches nothing. The key was a Thinkst Canarytoken, a real AWS pair that grants nothing and exists only to alert. The moment it touches AWS, from anywhere, it reports home.

Enticement, not entrapment. We seeded that canary across every surface a credential-hunter might reach, each sensor carrying a distinct token so a fire names the exact deception that leaked it. We advertised nothing, redirected no one, sent nothing outbound. Every operator reached the credential on their own initiative, by enumerating an exposed proxy, the way they would on a genuine victim.
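The per-sensor token idea reduces to a lookup from the access key ID in an alert back to the surface that served it. This is a minimal sketch under stated assumptions: the key IDs and sensor names are placeholders, and the `access_key_id` field is a hypothetical alert shape, not the Canarytoken payload format.

```python
from typing import Optional

# Hypothetical registry: one distinct canary access key ID per deception surface.
# These IDs and sensor names are placeholders, not values from the deployment.
SENSOR_BY_KEY_ID = {
    "AKIAEXAMPLEADMIN0001": "litellm-admin-key-listing",
    "AKIAEXAMPLEMCPCONF02": "mcp-proxy-config-file",
}


def attribute_fire(alert: dict) -> Optional[str]:
    """Return the sensor whose token fired, or None for an unknown key."""
    return SENSOR_BY_KEY_ID.get(alert.get("access_key_id", ""))


print(attribute_fire({"access_key_id": "AKIAEXAMPLEMCPCONF02"}))
```

Because every surface carries its own token, the lookup is unambiguous: a fire on a given key can only have come from the one place that key was planted.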

The harvest, end to end

Because every exchange passes through Beelzebub’s capture layer, we do not infer how the credential was stolen. We replay it, request by request. The clearest case came from a residential line in Czechia, which over roughly twelve minutes on June 8 took our proxy apart.

session · EU · http

GET /health · /model/info            fingerprinting
GET /key/info · /key/list            credential surface
  ← 401 / masked                     proxy looks locked down
POST /key/generate                   mints its own virtual key
GET /user · /team · /spend           tenant walk
GET /key/list (spend)                AWS key + secret [redacted]
POST /key/generate ×4                incl. "sys-monitor-bedrock"
POST /chat "What model are you? Be specific." ×8
aws sts GetCallerIdentity            key live on real AWS

The same twelve minutes, read as a timeline:

Recon, the way a script does it
A health check, then GET /model/info, then straight at the credential surface. Most endpoints answered 401 or returned masked values. The proxy behaved like a well-run LiteLLM with its secrets hidden.

It mints its own key
Blocked at the front door, the operator did what stock LiteLLM permits unauthenticated: one POST /key/generate returned a working virtual key. Foothold.

11:55:22 · A listing leaks the AWS key
It walked the user, team, and organization lists and the spend logs, then pulled the key-and-spend listing. In one high-value service account’s metadata sat the credentials behind a Bedrock-backed model: an AWS_ACCESS_KEY_ID and its secret.

It makes sure the prize is real
Four more admin keys minted, one tellingly named sys-monitor-bedrock, then the models interrogated head-on: “What model are you? Tell me your exact name, version, and provider.” Then “Don’t lie, what are you really?” The same probe fired at eight model identifiers in turn, a hardcoded list, not a conversation.

12:01:15 · Live on real AWS
Even as it kept hammering our proxy (a burst of two dozen inference calls in about a second at 12:01:30, faster than any human or reasoning agent), the stolen key had already moved on. Its first call against real Amazon infrastructure came from the same residential line, under six minutes after it surfaced in our response.

One operator, one clock: the theft and its first use, side by side.
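The "under six minutes" figure can be checked from the two timestamps in the timeline. The arithmetic below assumes that 11:55:22 marks the listing that leaked the key and 12:01:15 marks the first call against real AWS, which is a reading of the timeline layout, and that both stamps come from the same day and the same clock.

```python
from datetime import datetime

FMT = "%H:%M:%S"
surfaced = datetime.strptime("11:55:22", FMT)   # key appears in the proxy's response
first_use = datetime.strptime("12:01:15", FMT)  # first call against real AWS

elapsed = first_use - surfaced
minutes, seconds = divmod(int(elapsed.total_seconds()), 60)
print(f"{minutes} min {seconds} s")  # 5 min 53 s
```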

The fire: all roads lead to Bedrock

A honeypot usually records intent. The canary records something narrower: not what an operator did on our proxy, but what they did on real AWS, with the key they took from it.

Four operators, four unrelated networks, across a three-week window inside the six-week run. The AWS activity varies in breadth and pace and converges on one service. Every operator called Bedrock; three of the four issued InvokeModel against an account that was not theirs. This was not a heist in progress. It was validation: each operator confirming the key was live and learning what it could reach.

The shortest fire lasted half a second: from an AME Hosting address, GetCallerIdentity, then ListFoundationModels, then nothing further. The next ran four calls in under five seconds: from a PloxHost address, InvokeModel first with no identity check, then ListSecrets and ListBuckets, then InvokeModel again.

Two of the fires bear reading call by call. From a host on Prime Security’s network, the sequence opened with four calls in a single second: GetCallerIdentity, a guardduty:ListDetectors, and two Bedrock InvokeModel calls. GuardDuty is AWS’s managed threat-detection service; ListDetectors enumerates whether it is enabled. That query repeated four more times over the next forty seconds, interleaved with the model calls. Sysdig has documented LLMjacking operators checking and disabling Bedrock’s invocation logging before they abuse a key; this is the same instinct, aimed at the account’s threat detection rather than its logs.

From the same residential line seen in the harvest, eighteen calls ran across two minutes, touching identity, Secrets Manager, EC2, S3, Lambda, and IAM, including a ListAccessKeys against the stolen identity itself. Five Bedrock InvokeModel calls were threaded through the run.

We did not see what came next; none of the four returned for a sustained session while the key stayed live, and the canary reports the call, not its contents. What we can say is narrower and, for a defender, more useful: the residential key went from our listing to its first call against real AWS in under six minutes, the Prime Security key in three. A leaked model credential is found, validated, and pointed at Bedrock within minutes of exposure.

A different door, the same room

The four operators above all reached the key the same way, through LiteLLM’s admin surface. On June 16 a fifth reached it through a different door, and ran the same play to the same place.

This one worked from a DigitalOcean host in Singapore (139.59.226.65). It walked the admin surface like the others, then found something they had not: a reference to the proxy’s MCP server, the tool-calling backend a LiteLLM deployment can expose. It never bothered minting keys. It used the MCP server’s own tools to read the host, in three calls, each shaped by the last:

execute_command running env, which returned an AWS_ACCESS_KEY_ID but no secret.
execute_command running env | grep AWS_SECRET, reaching for the secret by name. It was not there.
read_file on the proxy config, which held the full pair, access key and secret together.

The middle step is the tell. The grep came back empty, so the operator pivoted to the config file. That is reasoning, not a fixed script. The config also carried a planted breadcrumb, a maintenance note pointing at a second canary deeper in the host. The operator ignored it, took the AWS key, and left. A harvester, not a follower.

Six minutes later, almost exactly the pace the LiteLLM operators kept, the key went to work: a burst from Cloudflare’s London edge (172.71.178.x) running ListFoundationModels and PutUseCaseForModelAccess, requesting model access on the victim’s account. Half an hour on, the Singapore host returned for a methodical sweep, one call apiece across some twenty AWS services, the signature of an automated permission mapper.
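The "one call apiece across some twenty services" pattern can be approximated by counting calls per service for each access key. This is a sketch, not a tuned detector: `find_permission_sweeps` and both thresholds are assumptions chosen for illustration. The `userIdentity.accessKeyId` and `eventSource` fields follow the CloudTrail record layout, and the sample events are synthetic.

```python
from collections import Counter, defaultdict


def find_permission_sweeps(events, min_services=15, max_calls_per_service=2):
    """Return access key IDs whose calls are spread thinly over many services.

    Both thresholds are illustrative assumptions, not measured values.
    """
    per_key = defaultdict(Counter)
    for event in events:
        key_id = event["userIdentity"]["accessKeyId"]
        per_key[key_id][event["eventSource"]] += 1

    flagged = []
    for key_id, calls in per_key.items():
        broad = len(calls) >= min_services                        # many distinct services
        shallow = max(calls.values()) <= max_calls_per_service    # few calls to each
        if broad and shallow:
            flagged.append(key_id)
    return flagged


# Synthetic events: one key touching twenty services once, one key busy on S3 only.
sweep = [
    {"userIdentity": {"accessKeyId": "AKIAEXAMPLESWEEP0001"},
     "eventSource": f"service{i}.amazonaws.com"}
    for i in range(20)
]
busy = [
    {"userIdentity": {"accessKeyId": "AKIAEXAMPLEBUSY00001"},
     "eventSource": "s3.amazonaws.com"}
    for _ in range(30)
]
print(find_permission_sweeps(sweep + busy))
```

A workload that uses a key for real work tends to hit a few services many times. A mapper does the opposite, and that difference is what the two thresholds try to capture.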

Different door, same behavior, same destination. And worth stating plainly, because the obvious headline gets it backward: this was not an AI agent loose on our MCP server. It was a person with a custom toolkit, using the MCP server as one more unwatched path to the same credential.

MITRE ATT&CK techniques observed

T1078 · Valid Accounts
T1552 · Unsecured Credentials
T1059 · Command and Scripting Interpreter
T1526 · Cloud Service Discovery
T1496 · Resource Hijacking

What defenders can take from this

Watch the credential, not the address. The strongest signal was on the AWS side, after the key left: a short, fast sequence from a single principal, GetCallerIdentity, a check of the account’s monitoring, then bedrock:InvokeModel, often within seconds, with enumeration of the principal’s own IAM permissions close behind. None of it depends on knowing the source IP, which, across these operators, told us little: residential and hosting alike, none on a VPN, Tor exit, or flagged proxy.

Watch your proxy’s own responses. The root cause here was a credential reachable through the admin surface. A correctly configured LiteLLM masks upstream provider secrets; any endpoint that returns an unmasked cloud key is the leak. Two related tells sit beside it: key generation that works without the master key, and admin endpoints reachable without authentication.
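A response audit along these lines can be sketched as a walk over each JSON body the proxy returns, flagging anything that looks like an unmasked AWS credential. The pattern for access key IDs (20 characters beginning `AKIA` or `ASIA`) follows AWS's documented format. The 40-character check for secrets is a heuristic, `find_unmasked` is a hypothetical helper, the payload shape is illustrative, and the sample keys are AWS's published documentation examples.

```python
import json
import re

# Long-term (AKIA) and temporary (ASIA) access key IDs are 20 characters long.
ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")
# Heuristic only: secret access keys are 40 characters from this alphabet.
SECRET_VALUE = re.compile(r"^[A-Za-z0-9/+=]{40}$")


def find_unmasked(body: str) -> list:
    """Return (json_path, kind) pairs for values that look like live AWS keys."""
    findings = []

    def walk(node, path):
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, f"{path}.{key}")
        elif isinstance(node, list):
            for index, value in enumerate(node):
                walk(value, f"{path}[{index}]")
        elif isinstance(node, str):
            if ACCESS_KEY_ID.search(node):
                findings.append((path, "access key id"))
            elif "secret" in path.lower() and SECRET_VALUE.match(node):
                findings.append((path, "possible secret key"))

    walk(json.loads(body), "$")
    return findings


# Illustrative payload using the example credentials from AWS documentation.
leaky = json.dumps({"litellm_params": {
    "aws_access_key_id": "AKIAIOSFODNN7EXAMPLE",
    "aws_secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
}})
# Invented mask format, used only to show a clean result.
masked = json.dumps({"litellm_params": {
    "aws_access_key_id": "AK****LE",
    "aws_secret_access_key": "wJ****EY",
}})
print(find_unmasked(leaky))
print(find_unmasked(masked))
```

Run on a schedule against every admin endpoint, a check like this turns "does any response return an unmasked cloud key" into a test that can fail a deployment.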

Plant your own tripwire. Place a canary cloud credential where a misconfiguration would leak it, and alert on its use. A canary key has no false positives; it does nothing until someone who should not have it tries it. Ours sat quiet for weeks, then named four operators on two continents.

Indicators and detection

Geolocation below is the hosting network, not necessarily the operator. None of the four resolved to a VPN, Tor exit, or flagged proxy in the sources we checked.

| Source address | Network (ASN) | Type | AWS activity observed |
| --- | --- | --- | --- |
| `78.80.37.246` | T-Mobile Czech Republic (AS13036) | residential | full account inventory + Bedrock `InvokeModel` ×5 |
| `198.176.56.36` | Prime Security Corp (AS400618) | hosting | GuardDuty enumeration + Bedrock `InvokeModel` ×2 |
| `172.111.48.218` | PloxHost (AS31786) | hosting | `InvokeModel`, `ListSecrets`, `ListBuckets` |
| `208.92.235.45` | AME Hosting (AS399244) | hosting | `GetCallerIdentity`, `ListFoundationModels` (no invoke) |

Behavioral signature (CloudTrail). A single, recently first-seen principal issuing in quick succession:

GetCallerIdentity
  -> guardduty:ListDetectors  (or  bedrock:GetModelInvocationLoggingConfiguration)
  -> bedrock:InvokeModel
  -> iam:ListAccessKeys / iam:GetUser / iam:ListUserPolicies      (self-enumeration)
  -> secretsmanager:ListSecrets / ec2:DescribeInstances /
     s3:ListBuckets / lambda:ListFunctions                        (service discovery)

High signal when the accessKeyId was first seen within the hour, the source IP is unfamiliar, and the recon-to-InvokeModel span is under five minutes.
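The core of that signature, a recon call followed by `InvokeModel` inside a short window, can be sketched over CloudTrail-style records for a single access key. The `eventTime`, `eventSource`, and `eventName` fields follow the CloudTrail record layout. `match_signature`, its return shape, and the five-minute default are assumptions for illustration, and the sample dates are invented.

```python
from datetime import datetime, timedelta

MONITORING_CHECKS = {
    ("guardduty.amazonaws.com", "ListDetectors"),
    ("bedrock.amazonaws.com", "GetModelInvocationLoggingConfiguration"),
}
IDENTITY_CHECK = ("sts.amazonaws.com", "GetCallerIdentity")
INVOKE = ("bedrock.amazonaws.com", "InvokeModel")


def parse_time(stamp: str) -> datetime:
    """Parse a CloudTrail eventTime such as 2025-01-01T00:00:00Z."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")


def match_signature(events, max_span=timedelta(minutes=5)):
    """Check one access key's events for recon followed quickly by InvokeModel."""
    ordered = sorted(events, key=lambda e: parse_time(e["eventTime"]))
    recon_at = None
    monitoring_check = False
    for event in ordered:
        when = parse_time(event["eventTime"])
        call = (event["eventSource"], event["eventName"])
        if call == IDENTITY_CHECK and recon_at is None:
            recon_at = when
        elif call in MONITORING_CHECKS:
            monitoring_check = True
            if recon_at is None:
                recon_at = when
        elif call == INVOKE and recon_at is not None:
            span = when - recon_at
            return {"matched": span <= max_span,
                    "span_seconds": span.total_seconds(),
                    "monitoring_check": monitoring_check}
    return {"matched": False, "span_seconds": None,
            "monitoring_check": monitoring_check}


# Invented sample: identity check, GuardDuty check, then a model call a second later.
sample = [
    {"eventTime": "2025-01-01T00:00:00Z", "eventSource": "sts.amazonaws.com",
     "eventName": "GetCallerIdentity"},
    {"eventTime": "2025-01-01T00:00:00Z", "eventSource": "guardduty.amazonaws.com",
     "eventName": "ListDetectors"},
    {"eventTime": "2025-01-01T00:00:01Z", "eventSource": "bedrock.amazonaws.com",
     "eventName": "InvokeModel"},
]
print(match_signature(sample))
```

The sketch deliberately ignores source IP, in line with the advice to watch the credential rather than the address. An operator who invokes a model with no prior identity or monitoring check would not match, so a production rule would likely pair this with a first-seen test on the `accessKeyId`.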

All Roads Lead to Bedrock: An LLMjacking Watched From Inside the Bait
