Agents API
Agent execution and management functionality for running LLM agents on biomedical tasks.
Agent execution module for BioML-bench.
This module provides functionality to run AI agents on biomedical tasks within the biomlbench framework.
AgentTask (dataclass)
Represents a single agent-task execution.
check_agent_success (async)
Check if an agent run actually succeeded by examining submission files and logs.
This catches cases where agents fail internally but exit with success codes.
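As a rough illustration of this kind of post-run check, the sketch below inspects a run directory for a non-empty submission file and scans log files for error markers. It is not the biomlbench implementation: the directory layout (`submission/submission.csv`, `logs/*.log`), the marker strings, and the name `looks_successful` are all assumptions made for the example, and it is synchronous for simplicity whereas the documented function is async.

```python
from pathlib import Path
from typing import Tuple

# Hypothetical error markers; the real checker may look for different signals.
ERROR_MARKERS = ("Traceback (most recent call last)", "CUDA out of memory")


def looks_successful(run_dir: Path) -> Tuple[bool, str]:
    """Return (ok, reason) based on the files an agent run left behind."""
    submission = run_dir / "submission" / "submission.csv"  # assumed layout
    if not submission.is_file():
        return False, "no submission file"
    if submission.stat().st_size == 0:
        return False, "submission file is empty"

    logs_dir = run_dir / "logs"  # assumed layout
    if logs_dir.is_dir():
        for log_path in sorted(logs_dir.glob("*.log")):
            text = log_path.read_text(errors="replace")
            for marker in ERROR_MARKERS:
                if marker in text:
                    return False, f"{log_path.name} contains {marker!r}"
    return True, "ok"
```

The point of the design is that the process exit code is never consulted: a run only counts as successful if its artifacts say so.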
worker (async)
Worker function that processes agent tasks from the queue.
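The queue-and-worker pattern described here can be sketched with `asyncio.Queue`. This is a generic illustration, not the module's code: `ExampleTask` and its fields are a hypothetical stand-in for `AgentTask`, and `example_worker` and `run_all` are names invented for the example. The "agent run" is a placeholder `asyncio.sleep(0)`.

```python
import asyncio
from dataclasses import dataclass
from typing import List


@dataclass
class ExampleTask:
    """Hypothetical stand-in for AgentTask; the real fields may differ."""
    task_id: str
    seed: int


async def example_worker(queue: asyncio.Queue, done: List[str]) -> None:
    """Pull tasks off the queue until the worker is cancelled."""
    while True:
        task = await queue.get()
        try:
            await asyncio.sleep(0)  # placeholder for actually running the agent
            done.append(f"{task.task_id}:seed{task.seed}")
        finally:
            queue.task_done()  # always mark the item handled, even on error


async def run_all(task_ids: List[str], n_workers: int = 1, n_seeds: int = 1) -> List[str]:
    """Enqueue one task per (task_id, seed) pair and drain with n_workers workers."""
    queue: asyncio.Queue = asyncio.Queue()
    for task_id in task_ids:
        for seed in range(n_seeds):
            queue.put_nowait(ExampleTask(task_id, seed))

    done: List[str] = []
    workers = [asyncio.create_task(example_worker(queue, done)) for _ in range(n_workers)]
    await queue.join()  # wait until every queued task has been processed
    for w in workers:
        w.cancel()  # workers loop forever, so stop them explicitly
    await asyncio.gather(*workers, return_exceptions=True)
    return done
```

Calling `asyncio.run(run_all(["a", "b"], n_workers=2, n_seeds=2))` processes four tasks, one per task and seed combination.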
create_task_list_file
Create a temporary task list file for single task execution.
get_task_ids
Get the list of task IDs from either a single task ID or a task list file.
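A minimal sketch of how task IDs might be resolved from the two input styles follows. The function name `resolve_task_ids`, its parameters, and the file format (one ID per line, with blank lines and `#` comments skipped) are assumptions for illustration; the actual signature and format of `get_task_ids` are not documented on this page.

```python
from pathlib import Path
from typing import List, Optional


def resolve_task_ids(
    task_id: Optional[str] = None,
    task_list: Optional[Path] = None,
) -> List[str]:
    """Return task IDs from a single ID or from a task list file (assumed format)."""
    if task_id is not None and task_list is not None:
        # Assumed behaviour: treat the two inputs as mutually exclusive.
        raise ValueError("pass either a single task ID or a task list file, not both")
    if task_id is not None:
        return [task_id]
    if task_list is not None:
        ids = []
        for line in task_list.read_text().splitlines():
            line = line.strip()
            if line and not line.startswith("#"):  # skip blanks and comments
                ids.append(line)
        return ids
    raise ValueError("no task ID or task list file given")
```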
run_agent_async (async)
run_agent_async(
    agent_id,
    task_ids,
    n_workers=1,
    n_seeds=1,
    container_config_path=None,
    retain_container=False,
    data_dir=None,
    cpu_only=False,
    fast=False,
)
Run an agent on multiple tasks asynchronously.
Returns: Tuple[str, Path]: The run group ID and path to the generated submission file
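Because the function is a coroutine that returns a two-element tuple, callers need to await it and unpack the result. The sketch below shows that calling pattern against a stand-in, `fake_run_agent_async`, which copies the documented signature but does no real work; the agent ID, task IDs, and returned values are placeholders, not real biomlbench output. With biomlbench installed, the real function would be imported instead (its import path is not given on this page).

```python
import asyncio
from pathlib import Path
from typing import List, Optional, Tuple


async def fake_run_agent_async(
    agent_id: str,
    task_ids: List[str],
    n_workers: int = 1,
    n_seeds: int = 1,
    container_config_path: Optional[str] = None,
    retain_container: bool = False,
    data_dir: Optional[str] = None,
    cpu_only: bool = False,
    fast: bool = False,
) -> Tuple[str, Path]:
    """Stand-in with the documented signature; returns placeholder values."""
    await asyncio.sleep(0)  # the real function would launch agent runs here
    return "placeholder-run-group", Path("placeholder-submission-file")


async def main() -> Tuple[str, Path]:
    # Unpack the (run group ID, submission path) tuple described under Returns.
    run_group_id, submission_path = await fake_run_agent_async(
        agent_id="example-agent",                       # hypothetical agent ID
        task_ids=["example/task-a", "example/task-b"],  # hypothetical task IDs
        n_workers=2,
        n_seeds=3,
        cpu_only=True,
    )
    return run_group_id, submission_path
```

From synchronous code, the coroutine is driven with `asyncio.run(main())`.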
run_agent
Main entry point for running agents from the CLI.
Args: args: Parsed command line arguments
Returns: str: The run group ID for this execution