Adding Tasks¶
Guide for adding new biomedical benchmark tasks to BioML-bench.
Task Requirements¶
Every task needs:
- Pre-split data for training and testing
- Clear evaluation metrics
- A description of the task
Note: If you are adding a benchmark from a new database, you'll also need to add a new data source module in biomlbench/data_sources/. See the examples in biomlbench/data_sources/kaggle.py and biomlbench/data_sources/polaris.py.
Implementation Steps¶
- Create task directory structure
- Configure task metadata
- Implement data preparation
- Define evaluation logic
- Write task description
- Test and validate
Task directory structure¶
tasks/
├── my-source/                          # Data source folder (e.g., polarishub, manual)
│   └── my-biomedical-task/             # Task directory
│       ├── config.yaml                 # Task configuration
│       ├── description.md              # Task description
│       ├── prepare.py                  # Data preparation script
│       ├── leaderboard.csv             # Leaderboard with human performance baselines (optional but recommended)
│       └── grade.py                    # Evaluation logic
│
├── ~/.cache/biomlbench/data/my-source/my-biomedical-task/prepared/  # Generated by biomlbench prepare -t <task_id>
│   ├── dataset-0/                      # This folder must be created by prepare()
│   │   ├── public/                     # Public data
│   │   │   ├── train.<ext>             # Training data (e.g., train.csv, train.h5ad)
│   │   │   ├── test_features.<ext>     # Test features
│   │   │   └── sample_submission.<ext> # Example submission
│   │   └── private/                    # Private data
│   │       └── answers.<ext>           # Test set answers
│   └── dataset-1/                      # Multiple datasets allowed per task (e.g. for K-fold cross-validation)
│       └── public/                     # Same directory structure as dataset-0 above
...
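The layout above can be checked mechanically once data has been prepared. The following is a minimal sketch, not part of BioML-bench: `check_prepared_layout` is a hypothetical helper, and it assumes that every `dataset-N` folder should contain the four files shown in the tree, with any file extension.

```python
from pathlib import Path

# (file stem, subdirectory) pairs taken from the directory tree above.
EXPECTED_FILES = [
    ("train", "public"),
    ("test_features", "public"),
    ("sample_submission", "public"),
    ("answers", "private"),
]


def check_prepared_layout(prepared_dir: Path) -> list[str]:
    """Return a list of problems found under a task's prepared/ directory.

    Hypothetical helper for illustration; not part of BioML-bench.
    """
    problems = []
    dataset_dirs = sorted(p for p in prepared_dir.glob("dataset-*") if p.is_dir())
    if not dataset_dirs:
        problems.append(f"no dataset-N folders in {prepared_dir}")
    for dataset_dir in dataset_dirs:
        for stem, sub in EXPECTED_FILES:
            # Extensions vary by task (csv, h5ad, ...), so match on the stem only.
            if not list((dataset_dir / sub).glob(f"{stem}.*")):
                problems.append(f"missing {dataset_dir.name}/{sub}/{stem}.<ext>")
    return problems
```

An empty list means every dataset folder has the expected files; otherwise each entry names one missing file.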
Task Configuration (config.yaml)¶
id: my-source/my-biomedical-task
name: "My Biomedical Task"
task_type: drug_discovery  # or medical_imaging, protein_engineering
domain: pharmacokinetics  # specific biomedical domain
difficulty: medium  # easy, medium, hard
data_source:
  type: kaggle  # or polaris, custom
  competition_id: my-task
dataset:
  answers: my-source/my-biomedical-task/prepared/private/answers.csv
  sample_submission: my-source/my-biomedical-task/prepared/public/sample_submission.csv
grader:
  name: rmse
  grade_fn: biomlbench.tasks.my-source.my-biomedical-task.grade:grade
preparer: biomlbench.tasks.my-source.my-biomedical-task.prepare:prepare
biomedical_metadata:
  modality: "molecular_properties"
  organ_system: "liver"
  data_type: "regression"
  clinical_relevance: "drug_metabolism"
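A quick sanity check on a new config can catch typos before the task is run. The sketch below is an illustration under stated assumptions, not BioML-bench's own validation: `validate_config` is a hypothetical helper, and treating every top-level key from the example as required is a guess. It operates on the plain dict that a YAML parser (for example PyYAML's `yaml.safe_load`) would return.

```python
# Top-level keys that appear in the example config; treating all of them
# as required is an assumption made for this sketch.
REQUIRED_KEYS = {
    "id", "name", "task_type", "difficulty",
    "data_source", "dataset", "grader", "preparer",
}
ALLOWED_DIFFICULTY = {"easy", "medium", "hard"}


def validate_config(config: dict) -> list[str]:
    """Return human-readable problems found in a parsed config.yaml mapping."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - config.keys())]

    if "difficulty" in config and config["difficulty"] not in ALLOWED_DIFFICULTY:
        problems.append(f"difficulty must be one of {sorted(ALLOWED_DIFFICULTY)}")

    # Entry points are written as "dotted.module.path:function_name".
    grader = config.get("grader")
    entry_points = {
        "preparer": config.get("preparer"),
        "grader.grade_fn": grader.get("grade_fn") if isinstance(grader, dict) else None,
    }
    for field, value in entry_points.items():
        if not isinstance(value, str) or value.count(":") != 1:
            problems.append(f"{field} should look like 'module.path:function'")
    return problems
```

Returning a list of problems rather than raising on the first one lets a task author fix everything in a single pass.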
Data Preparation (prepare.py)¶
See the example in biomlbench/tasks/polarishub/tdcommons-caco2-wang/prepare.py.
from pathlib import Path

import pandas as pd


def prepare(raw: Path, public: Path, private: Path) -> None:
    """
    Prepares the task data into public/private directories.

    Args:
        raw: Directory with the raw data
        public: Directory for public data (training examples and inputs for test examples)
        private: Directory for private data (answers for test examples)
    """
    # Download and process raw data
    # Create train.<ext>, test_features.<ext>, sample_submission.<ext>
    # Generate private answers.<ext>
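To make the skeleton above concrete, here is one possible `prepare` for a simple CSV regression task. It is a sketch under assumptions, not the project's reference implementation: it assumes the raw directory already holds pre-split `train.csv` and `test.csv` files with hypothetical `id`, `smiles`, and `label` columns, and it writes directly into the `public` and `private` directories it is given.

```python
from pathlib import Path

import pandas as pd


def prepare(raw: Path, public: Path, private: Path) -> None:
    """Copy a pre-split CSV task into public and private files (illustrative)."""
    # Assumption: raw/train.csv and raw/test.csv exist with the columns
    # `id`, `smiles`, and `label`. These names are hypothetical.
    train = pd.read_csv(raw / "train.csv")
    test = pd.read_csv(raw / "test.csv")

    public.mkdir(parents=True, exist_ok=True)
    private.mkdir(parents=True, exist_ok=True)

    train.to_csv(public / "train.csv", index=False)

    # Test features must not leak the label.
    test.drop(columns=["label"]).to_csv(public / "test_features.csv", index=False)

    # The sample submission has the expected columns, filled with a
    # placeholder prediction (the mean training label).
    sample = test[["id"]].assign(label=train["label"].mean())
    sample.to_csv(public / "sample_submission.csv", index=False)

    # Ground-truth answers stay in the private directory.
    test[["id", "label"]].to_csv(private / "answers.csv", index=False)
```

Keeping the sample submission and the answers in the same column layout means a grader can treat any submission exactly like the sample.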
Evaluation Logic (grade.py)¶
See the example in biomlbench/tasks/polarishub/tdcommons-caco2-wang/grade.py.
import numpy as np
import pandas as pd


def grade(submission: pd.DataFrame, answers: pd.DataFrame) -> float:
    """Calculate task-specific metric."""
    y_true = answers['label'].values
    y_pred = submission['label'].values
    # Implement domain-specific metric
    return np.sqrt(np.mean((y_true - y_pred) ** 2))
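The example above compares rows by position, so it silently gives a wrong score if a submission is shuffled or incomplete. A more defensive variant might align rows on an identifier first. This is a sketch only: the `id` column is an assumption and does not appear in the original example.

```python
import numpy as np
import pandas as pd


def grade(submission: pd.DataFrame, answers: pd.DataFrame) -> float:
    """RMSE after aligning predictions to answers on a shared `id` column."""
    # Assumption: both frames carry an `id` column alongside `label`.
    if submission["id"].duplicated().any():
        raise ValueError("submission contains duplicate ids")

    # A left join keeps every answer row; unmatched ids get NaN predictions.
    merged = answers.merge(submission, on="id", how="left", suffixes=("_true", "_pred"))
    if merged["label_pred"].isna().any():
        raise ValueError("submission is missing predictions for some ids")

    errors = merged["label_true"].to_numpy() - merged["label_pred"].to_numpy()
    return float(np.sqrt(np.mean(errors ** 2)))
```

Raising on duplicate or missing ids makes malformed submissions fail loudly instead of producing a misleading score.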