stop zipping your job results.
harbor: evaluate agents in sandboxed environments
harbor is a framework for specifying sandboxed agent tasks for evaluation and optimization
uv tool install harbor
from the makers of terminal-bench