harbor
BixBench - A benchmark for evaluating AI agents on bioinformatics and computational biology tasks.
harbor run -d bixbench@1.5
harbor run -d bixbench@1.5 -t bix-8-q6
harbor run -d bixbench@1.5 -t bix-8-q7
harbor run -d bixbench@1.5 -t bix-9-q3
harbor run -d bixbench@1.5 -t bix-9-q4
harbor run -d bixbench@1.5 -t bix-9-q5
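
The per-task commands above differ only in the task ID, so they can be generated from a list. The following is a minimal sketch, not part of harbor itself: the `tasks` variable and the `print_cmds` function are hypothetical names introduced here, and the only harbor syntax used is the `run -d ... -t ...` form already shown above. The function only prints the commands and does not execute anything.

```shell
#!/bin/sh
# Task IDs copied from the commands listed above.
tasks="bix-8-q6 bix-8-q7 bix-9-q3 bix-9-q4 bix-9-q5"

# print_cmds (hypothetical helper): emit one `harbor run` command per task ID.
# It only prints; nothing is executed here.
print_cmds() {
  for t in $tasks; do
    printf 'harbor run -d bixbench@1.5 -t %s\n' "$t"
  done
}

print_cmds
```

To actually run the tasks one after another, the printed commands could be piped to a shell, for example `print_cmds | sh`, assuming `harbor` is installed and on the `PATH`.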