Harbor Cookbook

March 27, 2026

Introducing the Harbor cookbook: recipes for building Harbor tasks and optimization loops

If you've built a Harbor task before, you've probably spent time figuring out the same things everyone else does: multi-container setups, simulated users, adding MCP tools, implementing computer use environments, and more.

We built Harbor to make agent evals simple, which is why we’re releasing the Harbor Cookbook: a collection of realistic, ready-to-run examples that show how to build evals and optimize agents with Harbor.

What’s inside

We recommend giving your coding agent the recipe closest to what you're building as context, then adapting from there.

Recipe                 What it does
simple-task            Minimal single-container task
multi-container        Docker Compose task where the agent interacts with a locally hosted REST API
mcp-tools              Giving the agent custom tools via a locally hosted FastMCP server
skills                 Recipes for including skills in a Harbor task
multi-reward           Multiple independent verifiers each producing their own score
simulated-user         Agent discovers requirements by talking to a simulated user
computer-use-ubuntu    Computer use reference implementation on an Ubuntu virtual desktop
computer-use-windows   Computer use reference implementation on a remote Windows desktop (Daytona)
dns-blacklisting       Network-level hostname blacklisting with exact, wildcard, and regex rules
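
To give a feel for what "exact, wildcard, and regex rules" can mean in the dns-blacklisting recipe, here is a minimal Python sketch of hostname matching against the three rule types. It is not the recipe's implementation: the `HostnameBlacklist` class, its field names, and the example hostnames are all hypothetical, and the wildcard semantics (shell-style globbing via `fnmatch`) are an assumption.

```python
import fnmatch
import re
from dataclasses import dataclass, field


@dataclass
class HostnameBlacklist:
    """Hypothetical rule set, for illustration only (not the cookbook's code).

    All rules are expected to be written in lowercase.
    """
    exact: set = field(default_factory=set)        # e.g. {"tracker.example.com"}
    wildcards: list = field(default_factory=list)  # e.g. ["*.ads.example.net"]
    regexes: list = field(default_factory=list)    # compiled re.Pattern objects

    def is_blocked(self, hostname: str) -> bool:
        # Normalize: DNS names are case-insensitive and may carry a trailing dot.
        host = hostname.rstrip(".").lower()
        if host in self.exact:
            return True
        # Assumption: wildcards use glob semantics, so "*" may span several labels.
        if any(fnmatch.fnmatchcase(host, pattern) for pattern in self.wildcards):
            return True
        # Regex rules must match the whole hostname, not just a substring.
        return any(rx.fullmatch(host) is not None for rx in self.regexes)


blacklist = HostnameBlacklist(
    exact={"tracker.example.com"},
    wildcards=["*.ads.example.net"],
    regexes=[re.compile(r"cdn\d+\.example\.org")],
)
print(blacklist.is_blocked("Tracker.Example.com."))  # exact rule, after normalization
print(blacklist.is_blocked("eu.ads.example.net"))    # wildcard rule
print(blacklist.is_blocked("example.com"))           # matches no rule
```

Checking exact rules first keeps the common case a single set lookup; the wildcard and regex rules are only scanned when no exact rule matches.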

Beyond evals: optimizing agents

Harbor tasks produce a reward, so the same datasets you use for evals can also serve as training environments. The cookbook includes two recipes that demonstrate this. The first pairs Harbor with GEPA to optimize an agent harness on MedAgentBench. The second is the Harbor integration contributed by Thinking Machines, which uses Harbor tasks as RL environments through the Tinker SDK.
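
As a rough illustration of why a reward-producing task doubles as an optimization signal, the sketch below runs a plain best-of-N selection over candidate prompts. This is neither GEPA nor RL, and it does not call Harbor: `best_candidate` and `fake_reward` are hypothetical names, and `fake_reward` is a stand-in for whatever actually runs a task and returns its reward.

```python
from typing import Callable, Sequence, Tuple


def best_candidate(
    candidates: Sequence[str],
    evaluate: Callable[[str], float],
    trials: int = 3,
) -> Tuple[str, float]:
    """Return the candidate with the highest mean reward over `trials` runs."""
    if not candidates:
        raise ValueError("need at least one candidate")
    best, best_score = candidates[0], float("-inf")
    for candidate in candidates:
        # Average several runs, since agent rollouts are usually noisy.
        score = sum(evaluate(candidate) for _ in range(trials)) / trials
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


def fake_reward(prompt: str) -> float:
    # Stand-in evaluator (assumption): a real loop would run the task here
    # and return the reward reported by its verifier.
    return 1.0 if "step by step" in prompt else 0.0


prompts = ["Solve the task.", "Solve the task step by step."]
print(best_candidate(prompts, fake_reward))
```

More capable optimizers replace the fixed candidate list with proposals generated from earlier results, but the interface stays the same: a function from a candidate to a scalar reward.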

We’re actively developing the Harbor framework so that it integrates more easily into optimization loops, and we welcome feedback on which examples to build next and how to improve Harbor.

Get started

Clone the cookbook and then run:

uv tool install harbor
harbor run -p harbor_cookbook/recipes/simple-task -a "<agent>" -m "<model>"
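
If you want to try several recipes in a row, a small Python helper can assemble the same command for each one. This is only a sketch: it uses just the `-p`, `-a`, and `-m` flags shown above, the `harbor_run_command` helper is hypothetical, and it assumes the other recipes live next to `simple-task` under `harbor_cookbook/recipes/`.

```python
import shlex
from typing import List


def harbor_run_command(recipe_path: str, agent: str, model: str) -> List[str]:
    """Build the argv for `harbor run` using only the flags shown above."""
    return ["harbor", "run", "-p", recipe_path, "-a", agent, "-m", model]


# Assumption: every recipe sits at harbor_cookbook/recipes/<name>.
for name in ["simple-task", "multi-container"]:
    argv = harbor_run_command(f"harbor_cookbook/recipes/{name}", "<agent>", "<model>")
    # Print a shell-quoted command; pass `argv` to subprocess.run to execute it.
    print(shlex.join(argv))
```

Replace `<agent>` and `<model>` with real values before running anything.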

We welcome community contributions and will keep adding examples of interesting use cases for Harbor. Our goal is to make this the starting point for all things Harbor.


The Harbor Team