Artifact Collection
Collecting files from the sandbox after a trial completes
Harbor can automatically collect files from the sandbox environment after each trial completes. This is useful for preserving model outputs, logs, generated files, evidence held by sidecar services, or any other byproducts of the agent's work.
Convention directory (zero configuration)
Any files written to /logs/artifacts/ inside the sandbox are collected automatically with no configuration needed. For Docker environments, this directory is volume-mounted directly to the host. For remote environments (Daytona, Modal, E2B, Tensorlake, etc.), files are downloaded after the trial finishes.
For example, if your task's test script or agent writes files to /logs/artifacts/:
# Inside the sandbox
echo "result" > /logs/artifacts/output.txt
cp model.pt /logs/artifacts/model.pt

These files will appear in the trial output directory at <trial_dir>/artifacts/logs/artifacts/.
Config-driven artifact collection
To collect files from arbitrary paths in the sandbox (not just /logs/artifacts/), add an artifacts field to your job configuration or task.toml.
Simple form
List the paths you want to collect. Each path is mirrored under the trial's artifacts/ base directory (the service is not part of the host path).
artifacts:
- /app/hello.txt
- /workspace/output.csv
- /data/results

This saves the files to <trial_dir>/artifacts/app/hello.txt, .../artifacts/workspace/output.csv, and .../artifacts/data/results/.
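The mirroring rule can be sketched in a few lines of Python. This is only an illustration of the path mapping described above; mirrored_host_path is a hypothetical helper, not part of Harbor's API.

```python
from pathlib import Path, PurePosixPath


def mirrored_host_path(trial_dir: Path, source: str) -> Path:
    """Map an absolute sandbox path to its mirrored spot under artifacts/.

    Hypothetical helper for illustration only; not Harbor's implementation.
    """
    src = PurePosixPath(source)
    if not src.is_absolute():
        raise ValueError(f"source must be an absolute path: {source!r}")
    # parts[0] is the root "/"; the remaining parts nest under artifacts/.
    return trial_dir.joinpath("artifacts", *src.parts[1:])
```

Note that the service name never enters the computation, which is why two services exporting the same source path end up at the same host location.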
Object form
Use the object form to control where files are saved within the artifacts/ directory, or to collect files from a Docker Compose sidecar service.
artifacts:
# Place at an explicit (relative) destination on the host
- source: /app/hello.txt
  destination: workspace/hello.txt
# Collect from a compose sidecar service instead of the main container
- source: /var/log/api/requests.log
  service: api

| Field | Meaning |
|---|---|
| source | Absolute path inside the container to collect. Also where the file re-materializes inside a separate verifier environment ("no translation"). |
| destination | Optional host-side relative path under <trial_dir>/artifacts/. Never affects verifier-side placement. Must be relative, must not contain .., and must not shadow the reserved manifest.json. |
| service | Optional Docker Compose service to collect from. Defaults to main (the agent's container). Requires a compose-capable provider. |
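The three constraints on destination can be checked as in the sketch below. validate_destination is a hypothetical name, and treating any destination whose first component is manifest.json as "shadowing" is an assumption about how the rule is applied.

```python
from pathlib import PurePosixPath

# Reserved entry at the top of <trial_dir>/artifacts/ (from the table above).
RESERVED_NAMES = {"manifest.json"}


def validate_destination(destination: str) -> PurePosixPath:
    """Return the destination as a path, or raise ValueError if it is invalid.

    Hypothetical helper for illustration only; not Harbor's implementation.
    """
    dest = PurePosixPath(destination)
    if dest.is_absolute():
        raise ValueError("destination must be relative")
    if not dest.parts:
        raise ValueError("destination must not be empty")
    if ".." in dest.parts:
        raise ValueError("destination must not contain '..'")
    # Assumption: a reserved name as the first component counts as shadowing.
    if dest.parts[0] in RESERVED_NAMES:
        raise ValueError("destination must not shadow manifest.json")
    return dest
```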
Collecting evidence from sidecar services
Multi-container tasks often hold their score signal inside a sidecar — a database the agent wrote to, a server that logged the agent's requests. Sidecar artifacts let the verifier read that evidence even in separate verifier mode, where all containers are torn down before verification.
artifacts = [
{ source = "/var/log/api/requests.log", service = "api" },
{ source = "/tmp/db-dump.sql", service = "postgres" },
]
# Snapshot runtime state into files before teardown
[[verifier.collect]]
service = "postgres"
command = "pg_dump -U postgres app > /tmp/db-dump.sql"
timeout_sec = 60.0

Because sidecar evidence is pulled directly from the sidecar's filesystem, over a channel the agent's container cannot write to, it is tamper-resistant: in separate verifier mode, Harbor stops the main container before collecting sidecar evidence, so leftover agent processes cannot interfere.
See the task documentation for the full sidecar workflow, and examples/tasks/sidecar-artifacts for a working example.
How collection works
Artifact collection runs after the agent finishes. It is best-effort -- failures to collect an artifact will never cause the trial to fail (the failure is recorded in the manifest instead).
The collection process:
- Main collect hooks: [[verifier.collect]] entries targeting main run while the agent container is still up.
- Main artifacts: The convention directory and main-targeted config paths are collected.
- Main stop (separate verifier mode only): The main service is stopped so agent processes cannot interfere with sidecar evidence.
- Sidecar collect hooks: [[verifier.collect]] entries targeting sidecars run.
- Sidecar artifacts: Sidecar-targeted paths are pulled from each service's filesystem.
- Manifest: A manifest.json file is written to the artifacts directory listing what was collected, from which service, and whether collection succeeded.
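The ordering and the best-effort behavior can be sketched as follows. The phase names, the run_phases function, and the "failed" status string are all assumptions made for illustration. The returned log stands in for the manifest written in the last step.

```python
from typing import Callable

# Phase order as listed above; the identifiers themselves are hypothetical.
PHASES = [
    "main_collect_hooks",
    "main_artifacts",
    "main_stop",  # separate verifier mode only
    "sidecar_collect_hooks",
    "sidecar_artifacts",
]


def run_phases(
    handlers: dict[str, Callable[[], None]], separate_verifier: bool
) -> list[dict]:
    """Run each phase in order, recording failures instead of raising."""
    log = []
    for phase in PHASES:
        if phase == "main_stop" and not separate_verifier:
            continue
        try:
            handlers[phase]()
            log.append({"phase": phase, "status": "ok"})
        except Exception as exc:  # best-effort: a failure never propagates
            log.append({"phase": phase, "status": "failed", "error": str(exc)})
    return log
```

Recording per phase is a simplification. Per the description above, the real manifest records status per artifact.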
Output structure
After collection, the trial directory contains:
<trial_dir>/
├── artifacts/ # One flat base dir shared by all services
│ ├── manifest.json # Collection manifest
│ ├── logs/artifacts/ # The convention directory (from main)
│ │ └── output.txt
│ ├── app/hello.txt # Config-driven artifact (source-mirrored)
│ ├── var/log/api/requests.log # Sidecar artifact (from the `api` service)
│ └── workspace/hello.txt # Config-driven artifact with a destination
├── agent/
├── verifier/
├── config.json
└── result.json

All services' artifacts share this one base dir, keyed only by their source path. If two services export the same path they collide on the host; collection keeps the first claimant and logs a warning (it never overwrites), recording the skipped entry with status: "skipped" in the manifest.
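A minimal sketch of the first-claimant rule, assuming each entry already carries its resolved destination. claim_destinations and the entry shape are hypothetical; only the "ok" and "skipped" status values come from the manifest description.

```python
import logging

logger = logging.getLogger("artifact_collection_sketch")


def claim_destinations(entries: list[dict]) -> list[dict]:
    """Mark each entry "ok" or "skipped"; the first claimant of a path wins."""
    claimed: dict[str, str | None] = {}  # destination -> service that claimed it
    results = []
    for entry in entries:
        dest = entry["destination"]
        if dest in claimed:
            # Never overwrite: warn and record the later entry as skipped.
            logger.warning(
                "destination %s already claimed by %s; skipping %s",
                dest, claimed[dest], entry.get("service"),
            )
            status = "skipped"
        else:
            claimed[dest] = entry.get("service")
            status = "ok"
        results.append({**entry, "status": status})
    return results
```

If two services must export the same source path, giving one of them an explicit destination in the object form avoids the collision.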
The manifest tracks each artifact's source, destination, originating service, type (file or directory), and whether collection succeeded:
[
{
"source": "/logs/artifacts",
"destination": "artifacts/logs/artifacts",
"type": "directory",
"status": "ok",
"service": null
},
{
"source": "/var/log/api/requests.log",
"destination": "artifacts/var/log/api/requests.log",
"type": "file",
"status": "ok",
"service": "api"
}
]
Viewing artifacts
Artifacts are viewable in the Harbor results viewer. Run harbor view and navigate to a trial to see collected artifacts under the Artifacts tab.
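Outside the viewer, the manifest can also be inspected with a short script. The sketch below assumes the manifest layout shown earlier (a JSON array of entries with a status field); uncollected_entries is a hypothetical helper name.

```python
import json
from pathlib import Path


def uncollected_entries(trial_dir: Path) -> list[dict]:
    """Return manifest entries whose status is anything other than "ok"."""
    manifest_path = trial_dir / "artifacts" / "manifest.json"
    entries = json.loads(manifest_path.read_text(encoding="utf-8"))
    return [entry for entry in entries if entry.get("status") != "ok"]
```

Because collection is best-effort, a check like this is one way to notice a missing artifact that did not fail the trial.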
Environment support
Artifact collection works across all environment types. Sidecar artifacts and collect hooks additionally require a Docker Compose-capable provider:
| Environment | Convention directory | Config-driven paths | Sidecar artifacts & collect hooks |
|---|---|---|---|
| Docker | Volume-mounted (no download needed) | Downloaded after trial | Supported |
| Daytona | Downloaded after trial | Downloaded after trial | Supported (compose tasks) |
| Modal | Downloaded after trial | Downloaded after trial | Supported (compose tasks) |
| E2B | Downloaded after trial | Downloaded after trial | Not supported (no compose) |
| Tensorlake | Downloaded after trial | Downloaded after trial | Not supported (no compose) |
Tasks that declare sidecar artifacts or collect hooks on a provider without compose support fail at trial start with a clear error.
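The start-of-trial check can be pictured roughly as below. The lowercase provider keys, the function name, and the error type are assumptions. The capability values mirror the table above, and the sketch reads the rule as "any collect hook, or any artifact targeting a non-main service, requires compose".

```python
# Compose capability per provider, mirroring the table above.
COMPOSE_CAPABLE = {
    "docker": True,
    "daytona": True,
    "modal": True,
    "e2b": False,
    "tensorlake": False,
}


def check_compose_requirements(
    provider: str, artifacts: list, collect_hooks: list[dict]
) -> None:
    """Raise ValueError if the task needs compose but the provider lacks it."""
    # Simple-form artifacts are plain strings and always target main.
    targets_sidecar = any(
        isinstance(item, dict) and (item.get("service") or "main") != "main"
        for item in artifacts
    )
    needs_compose = targets_sidecar or bool(collect_hooks)
    if needs_compose and not COMPOSE_CAPABLE.get(provider, False):
        raise ValueError(
            f"provider {provider!r} does not support Docker Compose, "
            "which sidecar artifacts and collect hooks require"
        )
```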