Sandbox Best Practices
Run agent file and command workflows inside a bounded workspace.
Use @anvia/sandbox when an agent needs to execute commands, write files, inspect generated output, or perform multi-step file workflows without touching the host project directly.
A sandbox is a boundary, not a permission system by itself. Your app still owns which files are staged, which tools are exposed, which commands are allowed, and how outputs are returned to users.
Default Shape
| Boundary | Practice |
|---|---|
| workspace | create one short-lived session per user task |
| files | stage only the files needed for the task |
| commands | pass structured command and args; avoid shell strings |
| network | keep network disabled unless the workflow requires it |
| limits | set timeout, memory, CPU, and output limits |
| cleanup | always destroy sessions in finally |
| audit | trace command names, exit codes, and selected files |
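For the audit row, a minimal sketch of a traced exec wrapper follows. `tracedExec`, `AuditRecord`, and the `sink` callback are hypothetical names, and the interfaces only assume the `exec` call and the result fields (`exitCode`, `stdout`, `stderr`, `timedOut`) that appear in this guide. It records metadata only, never output or argument values.

```typescript
// Assumed shapes, mirroring the fields used in this guide's examples.
interface ExecRequest {
  command: string;
  args?: string[];
  timeoutMs?: number;
}

interface ExecResult {
  exitCode: number;
  stdout: string;
  stderr: string;
  timedOut?: boolean;
}

// Minimal structural type: anything with an exec method of this shape.
interface ExecSession {
  exec(request: ExecRequest): Promise<ExecResult>;
}

// Hypothetical audit record; adapt the fields to your tracing system.
interface AuditRecord {
  command: string;
  argCount: number;
  exitCode: number;
  timedOut: boolean;
  durationMs: number;
}

// Runs a command and reports metadata to the sink once the command returns.
async function tracedExec(
  session: ExecSession,
  request: ExecRequest,
  sink: (record: AuditRecord) => void,
): Promise<ExecResult> {
  const startedAt = Date.now();
  const result = await session.exec(request);
  sink({
    command: request.command,
    argCount: request.args?.length ?? 0,
    exitCode: result.exitCode,
    timedOut: result.timedOut === true,
    durationMs: Date.now() - startedAt,
  });
  return result;
}
```

If `exec` itself throws, this sketch records nothing; whether to trace that case separately is a design choice for your app.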
Create Sessions Per Task
Keep sandbox lifetime scoped to one request, job, or approval window.
```typescript
import { DockerSandbox } from "@anvia/sandbox";

export async function runCodeCheck(source: string) {
  const sandbox = new DockerSandbox({
    image: "node:22-bookworm",
    limits: {
      timeoutMs: 30_000,
      maxOutputBytes: 64_000,
      memoryMb: 512,
      cpus: 1,
    },
  });

  const session = await sandbox.createSession({
    manifest: {
      files: {
        "index.js": source,
      },
    },
  });

  try {
    return await session.exec({
      command: "node",
      args: ["index.js"],
    });
  } finally {
    await session.destroy();
  }
}
```

Avoid long-lived shared sessions unless your product explicitly needs a persistent workspace model. Shared sessions make cleanup, permissions, and audit harder.
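If several workflows repeat the same create, run, destroy sequence, it can be factored into a small helper. The sketch below is one possible shape, not part of the library: `withSession` and both interfaces are hypothetical, and they assume only that the sandbox offers `createSession()` and the session offers `destroy()`, as in the example above.

```typescript
// Minimal structural types; real sandbox and session objects are assumed
// to provide at least these two methods.
interface DisposableSession {
  destroy(): Promise<void>;
}

interface SessionFactory<S extends DisposableSession, O> {
  createSession(options: O): Promise<S>;
}

// Creates a session, runs the task, and destroys the session afterwards,
// whether the task resolved or threw.
async function withSession<S extends DisposableSession, O, T>(
  sandbox: SessionFactory<S, O>,
  options: O,
  task: (session: S) => Promise<T>,
): Promise<T> {
  const session = await sandbox.createSession(options);
  try {
    return await task(session);
  } finally {
    await session.destroy();
  }
}
```

Because the session only exists inside the callback, it cannot outlive the task by accident.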
Stage Inputs Explicitly
Do not bind-mount the whole host repository into a sandbox for untrusted work. Copy in only the files the task needs.
```typescript
const session = await sandbox.createSession({
  manifest: {
    files: {
      "package.json": JSON.stringify(packageJson, null, 2),
      "src/task.ts": taskSource,
      "README.md": readme,
    },
  },
});
```

This keeps host secrets, unrelated source files, local credentials, and build artifacts out of the container.
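Explicit staging can also be enforced in code rather than by convention. The following is a minimal sketch under two assumptions: that manifest files are a plain record of relative path to content, as in the example above, and that your app keeps its own list of paths permitted for a task. `buildManifestFiles` is a hypothetical helper name.

```typescript
// Builds the files map for a manifest from an explicit list of allowed paths.
// Anything not listed is left out, even if it is available on the host side.
function buildManifestFiles(
  available: Record<string, string>,
  allowedPaths: readonly string[],
): Record<string, string> {
  const files: Record<string, string> = {};
  for (const path of allowedPaths) {
    const segments = path.split("/");
    // Reject backslashes and any empty, "." or ".." segment. The empty-segment
    // check also covers leading and trailing slashes.
    if (
      path.includes("\\") ||
      segments.some((s) => s === "" || s === "." || s === "..")
    ) {
      throw new Error(`Refusing to stage unsafe path: ${path}`);
    }
    if (!Object.prototype.hasOwnProperty.call(available, path)) {
      throw new Error(`Missing task file: ${path}`);
    }
    files[path] = available[path] as string;
  }
  return files;
}
```

The result can be passed as `manifest.files` when creating the session.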
Expose Narrow Tools
If an agent only needs file operations, do not expose command execution.
```typescript
import { createSandboxTools } from "@anvia/sandbox";

const tools = createSandboxTools(session, {
  include: ["read_file", "write_file", "list_files"],
});
```

Expose exec_command only when command execution is part of the product workflow. For higher-risk flows, put command execution behind approval or wrap it in your own tool that validates the command allowlist.
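One way to sketch the allowlist check inside such a wrapper: validate the structured request before it reaches the session. Everything here is hypothetical (`assertAllowedCommand`, `CommandAllowlist`, and the example entries); it assumes only the `command` and `args` fields used by `session.exec` in this guide.

```typescript
interface CommandRequest {
  command: string;
  args: string[];
}

// Hypothetical allowlist: command name -> permitted first argument.
type CommandAllowlist = Record<string, readonly string[]>;

const defaultAllowlist: CommandAllowlist = {
  npm: ["test"],
  node: ["index.js"],
};

// Returns the request unchanged if it is allowed; throws before anything runs otherwise.
function assertAllowedCommand(
  request: CommandRequest,
  allowed: CommandAllowlist = defaultAllowlist,
): CommandRequest {
  const permitted = Object.prototype.hasOwnProperty.call(allowed, request.command)
    ? allowed[request.command]
    : undefined;
  if (permitted === undefined) {
    throw new Error(`Command not allowed: ${request.command}`);
  }
  const first = request.args[0];
  if (first === undefined || !permitted.includes(first)) {
    throw new Error(
      `Argument not allowed for ${request.command}: ${first ?? "(none)"}`,
    );
  }
  return request;
}
```

A custom tool would call `assertAllowedCommand` first and pass the returned request to `session.exec`. Checking only the first argument is a simplification; stricter flows may need to validate every argument.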
Validate Commands
Prefer structured command execution over shell strings:
```typescript
await session.exec({
  command: "npm",
  args: ["test", "--", "--runInBand"],
  timeoutMs: 60_000,
});
```

Avoid:
```typescript
await session.exec({
  command: "sh",
  args: ["-c", userGeneratedCommand],
});
```

Shell execution is harder to inspect, quote, approve, and audit. Use it only for trusted scripts that your app controls.
Handle Outputs as Data
Treat sandbox command failures as product states to handle, not as unexpected errors. Inspect exit code, timeout state, and truncated output before sending results back to the model or user.
```typescript
const result = await session.exec({
  command: "npm",
  args: ["test"],
  timeoutMs: 60_000,
});

if (result.timedOut) {
  return { status: "timeout", summary: "The test command exceeded 60 seconds." };
}

if (result.exitCode !== 0) {
  return {
    status: "failed",
    stdout: result.stdout,
    stderr: result.stderr,
  };
}

return { status: "passed", stdout: result.stdout };
```

Keep large logs out of prompt context. Summarize or truncate before returning them to the model.
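As a rough illustration of truncating before a log reaches the model, the helper below keeps the head and tail of the text and notes how much was dropped. `truncateForPrompt` and its default budget are hypothetical; pick limits that match your prompt size.

```typescript
// Keeps the start and end of a log and marks how much was removed in between.
// The marker is added on top of maxChars, so the result is slightly longer
// than the budget. Lengths are counted in UTF-16 code units.
function truncateForPrompt(text: string, maxChars = 4_000): string {
  if (text.length <= maxChars) return text;
  const headLength = Math.ceil(maxChars / 2);
  const tailLength = Math.floor(maxChars / 2);
  const omitted = text.length - headLength - tailLength;
  return (
    text.slice(0, headLength) +
    `\n... [${omitted} characters omitted] ...\n` +
    text.slice(text.length - tailLength)
  );
}
```

Keeping the tail matters for test runners and compilers, which often print the failure summary last.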
Testing
Test your app's sandbox workflow by creating a real session, staging representative files, and asserting the command or tool result. See Sandbox Testing for app-level test and GitHub Actions examples.
Checklist
- Create sessions per task or request.
- Stage only task-specific files.
- Keep network disabled by default.
- Use structured command and args.
- Set explicit limits for timeout, memory, CPU, and output size.
- Destroy sessions in finally.
- Expose only the sandbox tools the agent needs.
- Trace command names, exit codes, timeout state, and output truncation.
