Curious Now


SCALE: Scalable Conditional Atlas-Level Endpoint transport for virtual cell perturbation prediction


Key takeaway

Researchers developed a new tool that can better predict how cells react to changes, which could aid in developing new medicines and treatments. The tool overcomes key bottlenecks to make large-scale cell modeling more feasible.

Quick Explainer

SCALE is a scalable foundation model for virtual cell perturbation prediction. It approaches this task by formulating it as a conditional transport problem between observed control and perturbed cell populations, rather than attempting to reconstruct unobserved trajectories. SCALE uses a hierarchical set-aware encoder to learn stable, biologically grounded latent representations, and couples this with a conditional flow-based transport backbone that predicts perturbed endpoints under paired supervision. This novel endpoint-aligned formulation, combined with a scalable BioNeMo-based training and inference framework, enables efficient atlas-scale virtual cell modeling and meaningful gains on biologically grounded evaluation metrics.

Deep Dive

Technical Deep Dive: SCALE for Scalable Conditional Atlas-Level Endpoint Transport in Virtual Cell Perturbation Prediction

Overview

SCALE is a large-scale foundation model for virtual cell perturbation prediction. It addresses key limitations in existing methods by:

  • Formulating perturbation prediction as conditional transport between observed control and perturbed cellular populations, rather than reconstructing unobserved trajectories
  • Implementing a hierarchical set-aware encoder to learn stable, biologically grounded latent representations
  • Coupling this with a conditional flow-based transport backbone that predicts perturbed endpoints under paired endpoint supervision
  • Developing a scalable BioNeMo-based training and inference framework to enable atlas-scale virtual cell modeling

The core contributions are:

  1. A scalable virtual cell infrastructure with 12.51× speedup over prior SOTA
  2. An endpoint-aligned conditional transport formulation for perturbation prediction
  3. Meaningful gains on biologically grounded metrics like PDCorr and DE Overlap

Problem & Context

Virtual cell models aim to enable in silico experimentation by predicting how cells respond to genetic, chemical, or cytokine perturbations from single-cell measurements. However, large-scale perturbation prediction remains constrained by:

  • Inefficient training and inference pipelines
  • Unstable modeling in high-dimensional sparse expression space
  • Evaluation protocols that overemphasize reconstruction-like accuracy over biological fidelity

SCALE addresses these limitations through a specialized large-scale foundation model.

Methodology

Formulation

SCALE frames perturbation prediction as conditional transport between observed control and perturbed cellular populations, rather than attempting to reconstruct an unobserved continuous trajectory.

Given a control cell set X_0 and a perturbed cell set X_1, the model learns the conditional law p(X_1 | X_0, C), where C denotes observed conditions such as cell type, perturbation, and batch.
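To make the data setup concrete, here is a minimal NumPy sketch of one paired training sample as the formulation above describes it: a control population, a perturbed population, and the condition labels C. All shapes, names, and count distributions are illustrative assumptions, not the paper's actual data loader.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical paired sample: a control population X0 and a perturbed
# population X1 observed under the same condition labels C.
n_cells, n_genes = 64, 2000  # set size and gene dimension (illustrative)
X0 = rng.poisson(1.0, size=(n_cells, n_genes)).astype(float)  # control counts
X1 = rng.poisson(1.2, size=(n_cells, n_genes)).astype(float)  # perturbed counts

# Conditions C: cell type, perturbation identity, and batch, encoded as indices.
C = {"cell_type": 3, "perturbation": 17, "batch": 0}

# The learning target is the conditional law p(X1 | X0, C): given the observed
# control set and conditions, predict the perturbed endpoint population.
batch = {"X0": X0, "X1": X1, "C": C}
```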

Hierarchical Set-Aware Encoder

SCALE uses a two-level hierarchical encoder:

  1. A gene-wise encoder f_gene that captures within-cell transcriptional dependencies
  2. A DeepSets aggregation layer that models across-cell population structure in a permutation-equivariant manner

This allows the model to preserve both intra-cellular and inter-cellular information in a set-aware latent representation.
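A minimal NumPy sketch of the two-level idea, assuming a shared per-cell MLP as the gene-wise encoder and mean pooling as the DeepSets aggregation (the paper's actual layers and dimensions are not specified here, so all parameters are toy stand-ins). Broadcasting the pooled population summary back to each cell keeps the per-cell outputs permutation-equivariant, which the final check verifies:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

n_cells, n_genes, d = 32, 200, 16
X = rng.normal(size=(n_cells, n_genes))            # one cell population

# Gene-wise encoder f_gene: a shared MLP applied to each cell's expression.
W1 = rng.normal(scale=0.1, size=(n_genes, d))
W2 = rng.normal(scale=0.1, size=(d, d))
H = relu(relu(X @ W1) @ W2)                        # (n_cells, d) per-cell latents

# DeepSets aggregation: symmetric pooling over cells gives a population
# summary; concatenating it back to each cell mixes intra- and
# inter-cellular information.
pooled = H.mean(axis=0, keepdims=True)             # (1, d) population context
Z = np.concatenate([H, np.repeat(pooled, n_cells, axis=0)], axis=1)

# Equivariance check: shuffling the input cells permutes Z rows identically.
perm = rng.permutation(n_cells)
H_p = relu(relu(X[perm] @ W1) @ W2)
pooled_p = H_p.mean(axis=0, keepdims=True)
Z_p = np.concatenate([H_p, np.repeat(pooled_p, n_cells, axis=0)], axis=1)
assert np.allclose(Z[perm], Z_p)
```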

Conditional Latent Flow Transport

The model learns a conditional transport function g_θ(Z_0, t, C) → Z_1 that maps the control latent population Z_0 to the perturbed latent population Z_1 under observed conditions C. This is trained with an endpoint reconstruction objective, matching the paired supervision available in the data.
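One common way to train such a transport function is flow matching between paired endpoints: sample an interpolation time t, linearly interpolate between the paired latents, and regress a velocity field onto the straight-line displacement Z_1 − Z_0. The sketch below illustrates that generic recipe with a toy linear model in NumPy; SCALE's actual objective, architecture, and condition encoding may differ, so treat every name here as hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cells, d = 32, 8
Z0 = rng.normal(size=(n_cells, d))    # control latents
Z1 = Z0 + 0.5                         # paired perturbed latents (toy shift)

# Conditional flow matching with paired endpoints: interpolate, then regress
# a velocity field g_theta(z_t, t, C) onto the target Z1 - Z0.  Here g_theta
# is a toy linear map and C is a fixed condition embedding.
c_emb = np.ones(4)
t = rng.uniform(size=(n_cells, 1))
Zt = (1 - t) * Z0 + t * Z1

inp = np.concatenate([Zt, t, np.tile(c_emb, (n_cells, 1))], axis=1)
W = rng.normal(scale=0.1, size=(inp.shape[1], d))  # toy parameters
v_pred = inp @ W
loss = np.mean((v_pred - (Z1 - Z0)) ** 2)          # endpoint-aligned objective

# At inference, integrate dz/dt = g_theta(z, t, C) from Z0 with Euler steps.
z = Z0.copy()
dt = 0.1
for step in np.linspace(0.0, 1.0, 10, endpoint=False):
    tt = np.full((n_cells, 1), step)
    v = np.concatenate([z, tt, np.tile(c_emb, (n_cells, 1))], axis=1) @ W
    z = z + dt * v
```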

Scalable BioNeMo Infrastructure

SCALE is implemented within a BioNeMo-based framework for high-throughput data processing, distributed execution, and efficient inference. This enables a 12.51× speedup in training throughput over the prior SOTA pipeline.

Data & Experimental Setup

SCALE is evaluated on three representative single-cell perturbation benchmarks:

  • PBMC: Cytokine perturbations on peripheral blood mononuclear cells
  • Tahoe-100M: Chemical perturbations across 50 cancer cell lines
  • Replogle-Nadig: Genetic perturbations in 4 human cell lines

Datasets are preprocessed to the top 2,000 highly variable genes, with held-out test sets designed to assess out-of-context generalization.
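The highly-variable-gene step can be sketched as follows. This is a simplified variance-based stand-in (normalize, log-transform, keep the top-k genes by variance); real single-cell pipelines often use dispersion-based selection as in scanpy, and the paper's exact preprocessing is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(500, 5000)).astype(float)  # cells x genes (toy)

# Illustrative HVG selection: library-size normalize, log1p, and keep the
# top-k genes ranked by variance across cells.
lib = counts.sum(axis=1, keepdims=True)
logx = np.log1p(counts / lib * 1e4)
var = logx.var(axis=0)

k = 2000
hvg_idx = np.argsort(var)[-k:]     # indices of the k most variable genes
X_hvg = logx[:, hvg_idx]           # preprocessed matrix fed to the model
```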

Results

SCALE achieves state-of-the-art performance on most evaluation metrics:

| Model | MSE ↓ | MAE ↓ | PDCorr ↑ | DEOver ↑ | DEPrec ↑ | LFCSpear ↑ | DirAgr ↑ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SCALE (Tahoe) | 0.0002 | 0.006 | 0.953 | 0.806 | 0.765 | 0.876 | 0.949 |
| SCALE (PBMC) | 0.0320 | 0.118 | 0.979 | 0.810 | 0.831 | 0.516 | 0.682 |
| SCALE (Replogle) | 0.0009 | 0.072 | 0.909 | 0.601 | 0.345 | 0.871 | 0.979 |

The key findings are:

  • SCALE exhibits a clear decoupling between global expression error (MSE/MAE) and biologically meaningful metrics like PDCorr and DE Overlap.
  • The BioNeMo-based infrastructure delivers 12.51× speedup in training throughput over the prior SOTA pipeline.
  • Ablations identify critical design choices, including endpoint-oriented supervision, adaptive condition encoding, and Gaussian-control mixed initialization.
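To make the biologically grounded metrics above concrete, here is an illustrative NumPy approximation of a perturbation-delta correlation (PDCorr-style) and a top-k differential-expression overlap (DEOver-style). The exact Cell-Eval definitions may differ in detail, so this is a conceptual sketch, not the benchmark's implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
n_genes = 1000
ctrl_mean = rng.normal(size=n_genes)                          # control means
true_pert = ctrl_mean + rng.normal(scale=0.5, size=n_genes)   # observed endpoint
pred_pert = true_pert + rng.normal(scale=0.2, size=n_genes)   # toy prediction

# PDCorr-style: Pearson correlation of the predicted vs observed mean
# expression shifts relative to control.
d_true = true_pert - ctrl_mean
d_pred = pred_pert - ctrl_mean
pdcorr = np.corrcoef(d_true, d_pred)[0, 1]

# DEOver-style: fraction of the top-k genes by |shift| that prediction and
# ground truth agree on.
k = 100
top_true = set(np.argsort(np.abs(d_true))[-k:])
top_pred = set(np.argsort(np.abs(d_pred))[-k:])
de_overlap = len(top_true & top_pred) / k
```

A model can score well on global MSE while ranking the wrong genes as differentially expressed, which is exactly the decoupling the findings above describe.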

Interpretation

SCALE's performance gains arise not just from model scaling, but from the co-design of scalable infrastructure, stable transport modeling, and biologically grounded evaluation.

The decoupling between expression error and biological metrics suggests that optimizing strictly for reconstruction can miss crucial perturbation signals. SCALE's endpoint-oriented formulation avoids this by directly learning the transition between observed control and perturbed populations.

The scalability improvements from the BioNeMo framework enable efficient atlas-scale virtual cell modeling, reducing training time from over 20 minutes per epoch to under 4 minutes.

The ablations highlight that progress in this domain requires careful attention to condition encoding, optimization geometry, and the alignment between training and evaluation. Naive choices like mean pooling or masking-based regularization can severely degrade performance.

Limitations & Uncertainties

While SCALE represents a step forward, several challenges remain:

  • Benchmark comparisons are still only partially aligned, as different papers use subtly different preprocessing and evaluation protocols.
  • The current flow-based formulation may be vulnerable to shortcut solutions that optimize endpoint metrics without learning robust perturbation transport.
  • Cell-Eval metrics, while more biologically grounded than reconstruction error, still exhibit sensitivity to implementation details in the evaluator itself.

What Comes Next

Future work should focus on further improving the evaluation and comparison framework for virtual cell models:

  • Establish strict protocol parity, including identical preprocessing, splits, and evaluator code, to enable reliable progress claims.
  • Explore alternative perturbation transport formulations beyond the current flow paradigm, to better capture causal structure beyond shortcut geometry.
  • Work toward a more standardized and transparent Cell-Eval pipeline, with better control over hidden choices in the evaluation implementation.

Overall, SCALE demonstrates that progress in virtual cell modeling requires not only more expressive generative objectives, but the co-design of scalable systems, stable transport modeling, and biologically faithful evaluation.
