Agents API

Agent execution and management functionality for running LLM agents on biomedical tasks.

Agent execution module for BioML-bench. It provides functionality to run AI agents on biomedical tasks within the biomlbench framework.

AgentTask dataclass

AgentTask(
    run_id,
    seed,
    image,
    path_to_run_group,
    path_to_run,
    agent,
    task,
    container_config,
)

Represents a single agent-task execution.
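
As an illustration of how the fields above might fit together, here is a minimal stand-in dataclass. The class name `AgentTaskSketch`, every type annotation, and the example values are assumptions made for this sketch; only the field names and their order come from the signature above.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any


@dataclass
class AgentTaskSketch:
    """Hypothetical stand-in mirroring the documented AgentTask fields."""

    run_id: str              # assumed: unique identifier for this run
    seed: int                # assumed: seed index for this repetition
    image: str               # assumed: container image name
    path_to_run_group: Path  # assumed: directory shared by all runs in a group
    path_to_run: Path        # assumed: directory for this single run
    agent: Any               # assumed: agent definition object
    task: Any                # assumed: task definition object
    container_config: dict   # assumed: container options


# Placeholder values only; none of these names come from biomlbench.
run_group = Path("runs") / "example-group"
task = AgentTaskSketch(
    run_id="example-task_seed0",
    seed=0,
    image="example-agent:latest",
    path_to_run_group=run_group,
    path_to_run=run_group / "example-task_seed0",
    agent="example-agent",
    task="example-task",
    container_config={},
)
```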

check_agent_success async

check_agent_success(run_dir, run_logger)

Check if an agent run actually succeeded by examining submission files and logs.

This catches cases where agents fail internally but exit with success codes.
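
The kind of check described above could look roughly like the following sketch. It is not the library's implementation: the function name `check_success_sketch`, the `submission/*.csv` layout, the `agent.log` file name, and the error markers are all assumptions chosen for illustration.

```python
import asyncio
import logging
import tempfile
from pathlib import Path
from typing import Tuple

# Assumed layout and markers; not confirmed by the documentation above.
SUBMISSION_GLOB = "submission/*.csv"
LOG_NAME = "agent.log"
ERROR_MARKERS = ("Traceback (most recent call last)", "FATAL")


async def check_success_sketch(run_dir: Path, run_logger: logging.Logger) -> bool:
    """Return True only if a non-empty submission exists and the log looks clean."""
    submissions = [p for p in run_dir.glob(SUBMISSION_GLOB) if p.stat().st_size > 0]
    if not submissions:
        run_logger.warning("No non-empty submission file found in %s", run_dir)
        return False

    log_path = run_dir / LOG_NAME
    if log_path.exists():
        text = log_path.read_text(errors="replace")
        for marker in ERROR_MARKERS:
            if marker in text:
                run_logger.warning("Log contains error marker %r", marker)
                return False
    return True


def demo() -> Tuple[bool, bool]:
    """Check an empty run directory, then the same directory with a submission."""
    logger = logging.getLogger("sketch")
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = Path(tmp)
        empty = asyncio.run(check_success_sketch(run_dir, logger))
        (run_dir / "submission").mkdir()
        (run_dir / "submission" / "submission.csv").write_text("id,label\n1,0\n")
        filled = asyncio.run(check_success_sketch(run_dir, logger))
    return empty, filled
```

The point of checking artifacts rather than the exit code is that a container can exit with status 0 even when the agent inside it crashed before writing any output.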

worker async

worker(
    idx,
    queue,
    client,
    tasks_outputs,
    retain_container=False,
)

Worker function that processes agent tasks from the queue.
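
A queue-driven worker of this shape is commonly built on `asyncio.Queue`. The sketch below shows that general pattern under stated assumptions: `worker_sketch` and `run_pool` are hypothetical names, the `client` and `retain_container` parameters are omitted, and the actual container launch is replaced by a placeholder.

```python
import asyncio
from typing import List


async def worker_sketch(idx: int, queue: asyncio.Queue, tasks_outputs: dict) -> None:
    """Pull jobs off the queue until cancelled, recording one result per job."""
    while True:
        job = await queue.get()
        try:
            # Placeholder for launching a container and running the agent.
            await asyncio.sleep(0)
            tasks_outputs[job] = {"worker": idx, "success": True}
        except Exception as exc:  # keep the worker alive if one job fails
            tasks_outputs[job] = {"worker": idx, "success": False, "error": str(exc)}
        finally:
            queue.task_done()


async def run_pool(jobs: List[str], n_workers: int) -> dict:
    """Fill a queue with jobs and drain it with n_workers concurrent workers."""
    queue: asyncio.Queue = asyncio.Queue()
    for job in jobs:
        queue.put_nowait(job)
    outputs: dict = {}
    workers = [
        asyncio.create_task(worker_sketch(i, queue, outputs)) for i in range(n_workers)
    ]
    await queue.join()  # wait until every job has been marked done
    for w in workers:
        w.cancel()  # workers loop forever, so stop them explicitly
    await asyncio.gather(*workers, return_exceptions=True)
    return outputs


results = asyncio.run(run_pool(["task-a", "task-b", "task-c"], n_workers=2))
```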

create_task_list_file

create_task_list_file(task_id)

Create a temporary task list file for single task execution.
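
One plausible way to do this with the standard library is shown below. The name `create_task_list_file_sketch`, the one-ID-per-line format, the `.txt` suffix, and the `Path` return type are assumptions; the documentation above does not state what the real function returns.

```python
import tempfile
from pathlib import Path


def create_task_list_file_sketch(task_id: str) -> Path:
    """Write a one-line task list to a temporary file and return its path."""
    # delete=False keeps the file on disk after the handle closes;
    # the caller is responsible for removing it.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
        fh.write(task_id + "\n")
        return Path(fh.name)


# The task ID here is a placeholder, not a real benchmark task.
path = create_task_list_file_sketch("example-source/example-task")
contents = path.read_text()
path.unlink()  # clean up the temporary file
```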

get_task_ids

get_task_ids(task_id=None, task_list=None)

Get the list of task IDs from either a single task ID or a task list file.
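
The resolution logic might be sketched as follows. The precedence of `task_id` over `task_list`, the skipping of blank lines and `#` comments, and the `ValueError` when neither is given are all assumptions of this sketch, not documented behavior.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def get_task_ids_sketch(
    task_id: Optional[str] = None, task_list: Optional[str] = None
) -> List[str]:
    """Resolve task IDs from a single ID or from a one-ID-per-line file."""
    if task_id is not None:
        return [task_id]
    if task_list is not None:
        lines = Path(task_list).read_text().splitlines()
        # Assumed conventions: ignore blank lines and '#' comment lines.
        stripped = (ln.strip() for ln in lines)
        return [ln for ln in stripped if ln and not ln.startswith("#")]
    raise ValueError("Provide either task_id or task_list")


single = get_task_ids_sketch(task_id="example-task")

with tempfile.TemporaryDirectory() as tmp:
    list_path = Path(tmp) / "tasks.txt"
    list_path.write_text("# example list\ntask-a\n\ntask-b\n")
    from_file = get_task_ids_sketch(task_list=str(list_path))
```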

run_agent_async async

run_agent_async(
    agent_id,
    task_ids,
    n_workers=1,
    n_seeds=1,
    container_config_path=None,
    retain_container=False,
    data_dir=None,
    cpu_only=False,
    fast=False,
)

Run an agent on multiple tasks asynchronously.

Returns: Tuple[str, Path]: The run group ID and the path to the generated submission file.
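
To make the interaction of `task_ids` and `n_seeds` concrete, the sketch below assumes that each task is run once per seed index, so the number of runs is `len(task_ids) * n_seeds`. That expansion rule and the helper name `expand_runs` are assumptions; the documentation above does not spell out how seeds are assigned.

```python
from itertools import product
from typing import List, Tuple


def expand_runs(task_ids: List[str], n_seeds: int) -> List[Tuple[str, int]]:
    """Pair every task ID with every seed index in range(n_seeds)."""
    return list(product(task_ids, range(n_seeds)))


runs = expand_runs(["task-a", "task-b"], n_seeds=2)
```

Under this assumption, two tasks with `n_seeds=2` yield four runs, which `n_workers` workers would then process concurrently.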

run_agent

run_agent(args)

Main entry point for running agents from the CLI.

Args: args: Parsed command-line arguments

Returns: str: The run group ID for this execution
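
A synchronous CLI entry point that drives an async coroutine typically wraps it in `asyncio.run`. The sketch below shows that pattern with a stub in place of `run_agent_async`. The attribute names `args.agent` and `args.task_id`, the stub itself, and the returned placeholder values are hypothetical and do not reflect the real CLI.

```python
import argparse
import asyncio
from pathlib import Path
from typing import List, Tuple


async def run_agent_async_stub(agent_id: str, task_ids: List[str]) -> Tuple[str, Path]:
    """Stub standing in for run_agent_async; does no real work."""
    await asyncio.sleep(0)
    run_group = f"example-run-group_{agent_id}"
    # Placeholder path; the real submission file name is not documented here.
    return run_group, Path("runs") / run_group / "submission.jsonl"


def run_agent_sketch(args: argparse.Namespace) -> str:
    """Bridge from parsed CLI arguments to the async runner; return the run group ID."""
    # args.agent and args.task_id are hypothetical attribute names.
    run_group, _submission_path = asyncio.run(
        run_agent_async_stub(agent_id=args.agent, task_ids=[args.task_id])
    )
    return run_group


run_group_id = run_agent_sketch(
    argparse.Namespace(agent="dummy", task_id="example-task")
)
```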