Popular AI Sandbox Has a Backdoor—Since August

Written by Derek Fisher | Fri | Mar 20, 2026 | 1:23 PM Z

Those of us in cybersecurity should be familiar with sandbox environments where we can detonate and review malware in a minimal risk container. Similarly, a managed sandbox environment for AI allows you to run code, process data, and call tools all from a contained and controlled environment.

A prime example of a use case is presenting a chatbot with data and asking it to evaluate the data and return some analysis. An LLM behind the chatbot will not likely respond accurately, but an AI agent can create and execute Python to analyze a CSV, query a database, or run statistical models and return the analysis. In one of these sandbox environments, they can do that without accessing your infrastructure.

Here are some of the other benefits of a managed sandbox environment.

Containment of unintended side effects

AI agents, especially code-executing ones, can produce outputs that interact with the real world, like writing files, making network calls, and modifying state, but a sandbox draws a hard line around what the agent can touch. This means a bug in the generated code or a bad prompt doesn't cascade into your infrastructure.

Isolation of untrusted code

When an AI agent generates and then executes code, that code is fundamentally untrusted. It was written by a model trained on the internet, possibly manipulated through prompt injection, and hasn't been audited by a human. Sandboxing treats it the same way you'd treat code from an unknown external source. You can run it, but you run it in a box.

Reproducible, ephemeral execution

Managed sandboxes are typically ephemeral and short-lived. Each execution starts clean, which prevents one agent's session from contaminating another's and makes behavior more predictable and auditable.

Abstracting infrastructure responsibility

The "managed" part means the cloud provider handles the low-level mechanics of isolation such as the containerization, the resource limits, and the kernel boundaries. The customer gets a safe execution surface without having to build and maintain it themselves.

Bottom line: AI execution in a managed sandbox means you reduce the ability for the AI to affect other systems. Well, in theory at least. More on that in a bit.

Not every sandbox is open play

One of these AI execution sandboxes is the AWS Bedrock AgentCore Code Interpreter, available since August of 2025. It is a fully-managed service that enables AI agents to securely execute code in isolated sandbox environments, designed so that agentic workloads cannot access external systems. It allows for three network modes: Sandbox, VPC, and Public. The promise of the Code Interpreter goes beyond data analysis and code execution. Take the instance of an LLM reviewing a dataset for anomalies. Using LLM inference means you'll likely get results that will be imprecise or even hallucinated. However, if an agent can create Python code to parse the data and return results, you're more prone to get better and more accurate results.

Engineering teams use AI agents in these sandboxes to run Python, JavaScript, and TypeScript, perform complex data analysis, generate visualizations, analyze financial and operational data, and execute mathematical computations without compromising system security.

This all sounds great, so what's the problem?

Well, from a security standpoint, the piece that matters most for teams is that Code Interpreter supports running AWS CLI commands directly within the sandbox using an SDK and API, using IAM-based access controls and fine-grained permissions. This is what makes it useful for engineering workflows but also why the default role permissions are so problematic.

Research from BeyondTrust found that The AgentCore Starter Toolkit—AWS's open source quick start for getting Code Interpreter up and running—ships with a default IAM role that grants full S3 read access, full DynamoDB access, and unrestricted Secrets Manager access. That's not a misconfiguration a developer introduced, that's the out-of-the-box posture AWS documented and published (features that AWS stated are by design). The tyranny of the default strikes again!

No internet access doesn't always mean none

Getting the Code Interpreter to, you know, interpret code was not difficult for the BeyondTrust team. This meant getting a chatbot, and the agents it relies on, to execute code of the researcher's choosing through a prompt injection, supply-chain attack, or getting the chatbot to generate code that was influenced by the researcher. For example:

Once the code execution is achieved, the researchers move on to the next phase. And, stop me if you've heard this, but it's always DNS.

What was found in the BeyondTrust research is that the Code Interpreter could be persuaded to interact with C2 (command and control) channels and exfiltrate data through S3 buckets all through DNS A and AAAA record queries. For the data exfiltration, base64 encoded data was embedded in DNS subdomain queries. The researchers showed that they could run AWS CLI commands using the Code Interpreter's attached IAM credentials. This allowed them to list S3 buckets, pull files containing customer PII, API credentials, or financial records, and send that data encoded into DNS subdomain lookups to a DNS server controlled by the researchers.

While helpful, the researchers needed a method for controlling the Code Interpreter remotely. Enter the C2 ability through DNS. The researchers we able to send commands through DNS A record responses. Each octet in the response was encoded base64 command chunks as explained by the BeyondTrust writeup:

The Code Interpreter polls the attacker's DNS server for these chunked commands, reconstructs and executes them, then returns the output via DNS subdomain queries. Circle complete. There is now a fully bidirectional, persistent communication channel hidden entirely within traffic that looks like routine DNS traffic.

These channels allow for the bypass of any network isolation through DNS, and makes it difficult for defenders to block without crippling the operation of their sandboxed environment. Perhaps more frightening is the fact that more sophisticated DNS C2 implementations could establish a fully interactive shell, not just one-off commands.

Defense-in-depth for DNS

All is not lost, and there are practical steps that can be taken to limit the risk if you are using Code Interpreter. BeyondTrust recommends the following:

Inventory your AgentCore Code Interpreter instances, their network modes, and their privileges.
If you're using Sandbox Mode and assumed it provided complete network isolation, it does not. DNS resolution is enabled by design, which means DNS-based data exfiltration is possible. Migrate sensitive workloads to VPC only mode.
Scan code for prompt injection vulnerabilities to reduce risk of attackers manipulating code that is sent to the code interpreter.
Use Guardrails on the input as an additional safeguard.
Prefer newer models that have built-in safeguards to limit outright prompt injection.

But it's worth noting that you can take your defensive posture a few steps further. Specifically for DNS, consider:

Deploying a Route53 Resolver DNS Firewall to configure an allow-list of known-good domains. This list should be short. Additionally, you can alert on high-frequency DNS queries to single domains.
Make sure you monitor DNS query volume and entropy. A query for aGVsbG8gd29ybGQ.attacker.com looks nothing like api.github.com. Look to baseline normal DNS query patterns and alert on deviations.

Lastly, harden the Code Interpreter. Since the default IAM role provided with Code Interpreter has full S3 read, full DynamoDB, and unrestricted Secrets Manager access by default. This means the blast radius is equal to the IAM role. That's a problem, and one that AWS says is working as intended.

For users of the Code Interpreter, take matters into your own hands and consider auditing and replacing the default Starter Toolkit IAM role with inline policies scoped to specific S3 paths and ARNs only. Enforce least privilege as a hard requirement, not a best practice. Lastly, make sure to enable CloudTrail for all API calls made by the Code Interpreter's IAM role and alert on calls to services outside expected scope.

The path forward

Like all things AI, we're on the cutting edge of a lot of this technology, and we're only in the early stages of understanding the attack surface AI technology presents. From prompt injection to autonomous agents to poisoned models to the insecure platforms AI operates in, there is no doubt that we are going to continue to see novel (and even not so novel) ways of pushing the boundaries of security with these new systems.

This appeared originally on Substack here.

View full post