feat: Sandboxing for tool execution #2040

Merged
merged 56 commits into from
Nov 22, 2024

@mattzh72 mattzh72 commented Nov 14, 2024

Description

Enable sandboxing for tool execution. Previously, tools ran in the same runtime environment and thread as the agent control loop, so a buggy or untrusted tool could block, crash, or otherwise interfere with the agent itself. This PR introduces the ability to run tools in a sandboxed environment:

  • We currently support two kinds of sandboxes: E2B and "local".
  • The E2B sandbox requires an E2B API key, and users can currently configure the template_id and timeout of the sandbox.
  • The local sandbox relies on a local directory, and runs Python code inside that directory using the dependencies installed in the current environment.
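As a rough illustration of the "local" idea described above, the following sketch runs a snippet of Python in a child process whose working directory is the sandbox directory, reusing the current interpreter so the tool sees the currently installed dependencies. The helper name `run_in_local_sandbox` and its parameters are hypothetical and are not taken from this PR; the actual implementation may differ.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_local_sandbox(code: str, sandbox_dir: str, timeout: int = 30) -> str:
    """Run `code` in a child Python process rooted at `sandbox_dir`.

    Hypothetical helper for illustration only, not the PR's implementation.
    """
    sandbox = Path(sandbox_dir).resolve()
    sandbox.mkdir(parents=True, exist_ok=True)

    # Write the tool source to a temporary script inside the sandbox directory.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".py", dir=str(sandbox), delete=False
    ) as f:
        f.write(code)
        script_path = f.name

    try:
        # sys.executable reuses the current interpreter, so the tool sees the
        # dependencies installed in the current environment.
        result = subprocess.run(
            [sys.executable, script_path],
            cwd=str(sandbox),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    finally:
        Path(script_path).unlink(missing_ok=True)

    if result.returncode != 0:
        raise RuntimeError(f"Tool failed: {result.stderr.strip()}")
    return result.stdout
```

Note that a child process like this isolates the tool from the agent's thread and memory, but it is not a security boundary: the code still runs with the same user permissions on the same machine.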

We also add the ability for users to manage sandbox configurations:

  • Create configurations for each kind of sandbox (more configuration options to come in the future):
from typing import Optional

from pydantic import BaseModel, Field


class LocalSandboxConfig(BaseModel):
    venv_name: str = Field("venv", description="Name of the virtual environment.")
    sandbox_dir: str = Field(..., description="Directory for the sandbox environment.")


class E2BSandboxConfig(BaseModel):
    timeout: int = Field(5 * 60, description="Time limit for the sandbox (in seconds).")
    template_id: Optional[str] = Field(None, description="The E2B template id (docker image).")
  • Add environment variables per sandbox configuration
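To make the per-sandbox environment variables concrete, here is a minimal sketch of one way they could be applied: layer the sandbox's variables over a base environment and hand the result only to the tool's process. The helper `build_sandbox_env` and the variable name `WEATHER_API_KEY` are made up for this example and do not come from the PR.

```python
import os
import subprocess
import sys
from typing import Dict, Optional


def build_sandbox_env(
    sandbox_env_vars: Dict[str, str],
    base_env: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Layer per-sandbox variables over a base environment (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(sandbox_env_vars)  # per-sandbox values win on conflict
    return env


# Expose a variable to the tool's process without touching the agent's own
# os.environ.
env = build_sandbox_env({"WEATHER_API_KEY": "dummy-key"})
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['WEATHER_API_KEY'])"],
    env=env,
    capture_output=True,
    text=True,
)
print(result.stdout.strip())
```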

There are also some optimizations around E2B: sandboxes are not killed immediately, so they can be reused (saving spin-up time), and a sandbox is only refreshed when we detect that the user has changed either the config or the environment variables for that sandbox.
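The reuse-and-refresh behaviour can be sketched as a small cache keyed by a fingerprint of the config and environment variables. This description does not say how changes are detected, so the hashing approach below is an assumption, and `fingerprint`, `SandboxCache`, `create`, and `kill` are hypothetical names standing in for whatever the real code and the E2B SDK provide.

```python
import hashlib
import json
from typing import Any, Callable, Dict, Optional, Tuple


def fingerprint(config: Dict[str, Any], env_vars: Dict[str, str]) -> str:
    """Stable hash of everything that should force a fresh sandbox."""
    payload = json.dumps({"config": config, "env": env_vars}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SandboxCache:
    """Keeps one live sandbox and recreates it only when the fingerprint changes."""

    def __init__(
        self,
        create: Callable[[Dict[str, Any], Dict[str, str]], Any],
        kill: Callable[[Any], None],
    ) -> None:
        self._create = create  # placeholder for the real sandbox start call
        self._kill = kill      # placeholder for the real sandbox stop call
        self._current: Optional[Tuple[str, Any]] = None

    def get(self, config: Dict[str, Any], env_vars: Dict[str, str]) -> Any:
        key = fingerprint(config, env_vars)
        if self._current is not None and self._current[0] == key:
            return self._current[1]  # unchanged: reuse and skip spin-up
        if self._current is not None:
            self._kill(self._current[1])  # config or env changed: discard
        sandbox = self._create(config, env_vars)
        self._current = (key, sandbox)
        return sandbox
```

With this shape, repeated tool calls under the same config and environment variables hit the same sandbox, and any change to either one triggers exactly one teardown and one new spin-up.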

In a separate PR (merged into this one), we also add the ability to modify agent state via tools by serializing the agent state and passing it back and forth between the sandbox and the running Python thread.
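A minimal sketch of that round trip, assuming the state is pickled and base64-encoded (the serialization format is not stated here, so this is an assumption) and using a child Python process as a stand-in for the sandbox. `TOOL_SCRIPT` and `run_tool_with_state` are hypothetical names, and the agent state is represented as a plain dict for illustration.

```python
import base64
import pickle
import subprocess
import sys

# Stand-in for a tool running inside the sandbox: it reads the serialized
# state from stdin, mutates it, and writes the updated state to stdout.
TOOL_SCRIPT = """
import base64, pickle, sys
state = pickle.loads(base64.b64decode(sys.stdin.read()))
state["memory"].append("user likes tea")
sys.stdout.write(base64.b64encode(pickle.dumps(state)).decode("ascii"))
"""


def run_tool_with_state(agent_state: dict) -> dict:
    """Send a copy of the state to the tool process and return the updated copy."""
    encoded = base64.b64encode(pickle.dumps(agent_state)).decode("ascii")
    result = subprocess.run(
        [sys.executable, "-c", TOOL_SCRIPT],
        input=encoded,
        capture_output=True,
        text=True,
        check=True,
    )
    return pickle.loads(base64.b64decode(result.stdout))


state = {"name": "agent-1", "memory": []}
new_state = run_tool_with_state(state)
print(new_state["memory"])
```

The tool only ever sees a copy, so the caller decides whether to adopt the returned state. One caveat with this particular format: unpickling data can execute arbitrary code, so loading a pickle produced by untrusted tool code weakens the isolation the sandbox is meant to provide.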

Testing

  • A suite of unit tests that cover the client functionality (both local and REST)
  • A suite of live integration tests with E2B covering happy paths and edge cases, such as the config changing in the middle of execution and environments that differ from the local one
  • Manual testing on the dev portal

Contributions

Thanks to @carenthomas for contributing, this PR is based off her initial PR!

@mattzh72 mattzh72 merged commit ae083fc into main Nov 22, 2024
30 of 31 checks passed