Cloud Deployments

Rollouts of containerized agentic tasks can be slow. The time goes to container startup and teardown overhead, waiting on LLM API calls, and waiting on command execution. Because of this, horizontal scaling becomes the only viable way to accelerate experimentation, so we recommend using a cloud sandbox provider like Daytona.

Using a cloud sandbox provider shifts command execution to the cloud, making trials I/O-bound rather than compute-bound on your local machine. Since each trial spends most of its time waiting rather than computing, you can typically run far more trials in parallel than you have CPU cores.
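
To illustrate why I/O-bound work can be parallelized well beyond the CPU count, here is a minimal Python sketch. It is not how Harbor schedules trials; `run_trial` and `run_all` are hypothetical names, and the sleep stands in for time spent waiting on a remote sandbox or an LLM API.

```python
import asyncio
import os
import time


async def run_trial(trial_id: int, limit: asyncio.Semaphore) -> int:
    """Hypothetical stand-in for one trial that spends its time waiting."""
    async with limit:
        # Simulates waiting on a remote sandbox or an LLM API response.
        await asyncio.sleep(0.1)
        return trial_id


async def run_all(n_trials: int, n_parallel: int) -> list[int]:
    # The semaphore caps how many trials are in flight at once.
    limit = asyncio.Semaphore(n_parallel)
    return await asyncio.gather(*(run_trial(i, limit) for i in range(n_trials)))


def main() -> tuple[list[int], float]:
    start = time.perf_counter()
    results = asyncio.run(run_all(n_trials=100, n_parallel=100))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = main()
    print(f"{len(results)} trials finished in {elapsed:.2f}s on {os.cpu_count()} CPUs")
```

Run one after another, these 100 simulated trials would take about 10 seconds. Run concurrently, they finish in roughly the time of a single trial, regardless of how many cores the local machine has, because the local process is only waiting.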

Using a cloud sandbox provider

There are many cloud sandbox providers to choose from. We recommend Daytona because we have found it to be the most flexible. Other good options are Modal and E2B.

harbor run -d "<dataset@version>" \
  -m "<model>" \
  -a "<agent>" \
  -e daytona \
  -n "<n-parallel-trials>"

We run up to 100 trials in parallel on a MacBook Pro with 14 cores.

Limitations

None of the cloud sandbox providers we have tried support multi-container environments. Until one does, or unless you are willing to implement multi-container support from scratch using Kubernetes (please submit a PR), you won't be able to run multi-container tasks in the cloud.

However, the Docker environment still supports multi-container tasks. Just make sure to include an environment/docker-compose.yaml file in your task definition.
