CellFlux: Simulating Cellular Morphology Changes via Flow Matching

ICML 2025

\(^1\)Stanford University \(^2\)Tsinghua University \(^3\)MIT
\(^\star\)Equal contribution

Abstract

Building a virtual cell capable of accurately simulating cellular behaviors in silico has long been a dream in computational biology. We introduce CellFlux, an image-generative model that simulates cellular morphology changes induced by chemical and genetic perturbations using flow matching. Unlike prior methods, CellFlux models distribution-wise transformations from unperturbed to perturbed cell states, effectively distinguishing actual perturbation effects from experimental artifacts such as batch effects—a major challenge in biological data. Evaluated on chemical (BBBC021), genetic (RxRx1), and combined perturbation (JUMP) datasets, CellFlux generates biologically meaningful cell images that faithfully capture perturbation-specific morphological changes, achieving a 35% improvement in FID scores and a 12% increase in mode-of-action prediction accuracy over existing methods. Additionally, CellFlux enables continuous interpolation between cellular states, providing a potential tool for studying perturbation dynamics. These capabilities mark a significant step toward realizing virtual cell modeling for biomedical research.


🔮 CellFlux Overview

Overview of CellFlux.

(a) Objective. CellFlux aims to predict, in silico, the changes in cell morphology induced by chemical or genetic perturbations. In the example shown, the perturbation reduces nuclear size.

(b) Data. The dataset includes images from high-content screening experiments, where chemical or genetic perturbations are applied to target wells, alongside control wells without perturbations. Control wells provide prior information to contrast with target images, enabling the identification of true perturbation effects (e.g., reduced nucleus size) while calibrating non-perturbation artifacts such as batch effects—systematic biases unrelated to the perturbation (e.g., variations in color intensity).

(c) Problem formulation. We formulate the task as a distribution-to-distribution problem (many-to-many mapping), where the source distribution consists of control images, and the target distribution contains perturbed images within the same batch.

(d) Flow matching. CellFlux employs flow matching, a state-of-the-art generative approach for distribution-to-distribution problems. It trains a neural network to approximate a velocity field; solving the ordinary differential equation (ODE) defined by that field continuously transforms the source distribution into the target distribution.
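
As a compact restatement of panel (d), the transformation can be written as an ODE over a notional time variable. The notation below follows common flow matching conventions and is an assumption here, in particular the symbols \(v_\theta\) for the learned velocity field and \(c\) for the perturbation condition:

\[
\frac{\mathrm{d}x_t}{\mathrm{d}t} = v_\theta(x_t, t, c), \qquad x_0 \sim p_0, \quad t \in [0, 1]
\]

Under this reading, \(p_0\) is the distribution of control images, and the field is trained so that the endpoint \(x_1\) follows the distribution \(p_1\) of perturbed images from the same batch.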

(e) Results. CellFlux significantly outperforms baselines in image generation quality, achieving lower Fréchet Inception Distance (FID) and higher classification accuracy for mode-of-action (MoA) predictions.
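
For readers unfamiliar with the metric, the standard definition of FID compares Gaussian fits to Inception features of real and generated images. The symbols below are the usual ones and are not taken from this page; the exact feature extractor and evaluation protocol used for CellFlux are not stated here:

\[
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
\]

Here \(\mu_r, \Sigma_r\) and \(\mu_g, \Sigma_g\) are the feature mean and covariance for real and generated images respectively. Lower values indicate that the two feature distributions are closer.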



💡 Key Innovation: Distribution-to-Distribution Transformation

CellFlux's key innovation lies in formulating cellular morphology prediction as a distribution-to-distribution learning problem. Traditional methods often ignore control cells or treat perturbation prediction as a simple image-to-image translation. In contrast, CellFlux recognizes that:

  • Control cells provide crucial context: They serve as a reference to distinguish true perturbation effects from experimental artifacts like batch effects
  • Batch effects are systematic biases: Variations in experimental conditions across different runs introduce consistent biases unrelated to the perturbation itself
  • Distribution-wise modeling is more robust: By learning transformations between entire distributions rather than individual images, CellFlux captures the inherent variability in biological systems

This approach enables CellFlux to generate more accurate and biologically meaningful cellular responses while maintaining robustness across diverse experimental conditions.



🛠️ CellFlux Methods

CellFlux Algorithm Overview. Our method leverages flow matching to learn continuous transformations between cellular states.

Training Process

During training, CellFlux learns a velocity field by fitting trajectories between control cell images (x₀ ~ p₀) and perturbed cell images (x₁ ~ p₁). At each training step, an intermediate state xₜ = (1 − t)·x₀ + t·x₁ is sampled along the linear interpolation between a control image and a perturbed image, and the network minimizes the difference between its predicted velocity and the true velocity of that straight path, x₁ − x₀.
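
The training step described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the released implementation: `flow_matching_loss` and `velocity_fn` are hypothetical names, the network is replaced by a plain Python callable, the squared-error form of the loss is the common flow matching choice and is assumed here, and the toy "images" are random arrays.

```python
import numpy as np

def flow_matching_loss(velocity_fn, x0, x1, rng):
    """Monte Carlo estimate of a flow matching loss for one batch.

    x0: batch of control images, x1: batch of perturbed images (same shape).
    velocity_fn(x_t, t) is a hypothetical stand-in for the neural network.
    """
    batch = x0.shape[0]
    # One time value per sample, shaped to broadcast over image dimensions.
    t = rng.uniform(0.0, 1.0, size=(batch,) + (1,) * (x0.ndim - 1))
    x_t = (1.0 - t) * x0 + t * x1   # point on the straight path from x0 to x1
    target = x1 - x0                # velocity of that straight path
    pred = velocity_fn(x_t, t)
    return float(np.mean((pred - target) ** 2))

rng = np.random.default_rng(0)
x0 = rng.normal(size=(8, 3, 16, 16))   # toy "control" batch
x1 = x0 + 2.0                          # toy "perturbed" batch: a constant shift

# A field that predicts the true shift has (near) zero loss; a zero field does not.
perfect = lambda x_t, t: np.full_like(x_t, 2.0)
zero = lambda x_t, t: np.zeros_like(x_t)
loss_perfect = flow_matching_loss(perfect, x0, x1, rng)
loss_zero = flow_matching_loss(zero, x0, x1, rng)
```

In a real training loop the callable would be a conditioned network and the loss would be minimized with a gradient-based optimizer; both are omitted here.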

Inference Process

At inference, the trained velocity field guides the transformation of control cell states into perturbed states: starting from a control image, the ordinary differential equation is integrated numerically in a sequence of small steps.
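
The inference procedure can be illustrated with a fixed-step Euler integrator. The choice of Euler, the step count, and the names `euler_sample` and `velocity_fn` are assumptions made for this sketch; the page does not specify which numerical solver CellFlux uses.

```python
import numpy as np

def euler_sample(velocity_fn, x0, num_steps=50):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed-step Euler.

    velocity_fn(x, t) is a hypothetical stand-in for the trained network.
    """
    x = np.array(x0, dtype=float)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t)   # one Euler step along the field
    return x

# Toy velocity field: a constant shift of 2 per unit time.
shift = lambda x, t: np.full_like(x, 2.0)
x0 = np.zeros((4, 3, 8, 8))              # toy "control" batch
x1_hat = euler_sample(shift, x0, num_steps=50)
```

More steps or a higher-order solver trade extra network evaluations for lower integration error.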

Key Technical Components

  • Conditional Flow Matching: Extends flow matching to handle perturbation conditions
  • Classifier-Free Guidance: Improves generation fidelity by interpolating between conditional and unconditional predictions
  • Noise Augmentation: Prevents overfitting and encourages smooth velocity fields
  • U-Net Architecture: Captures multi-scale features for accurate cellular morphology modeling
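
Two of the components listed above, classifier-free guidance and noise augmentation, reduce to one-line array operations. The sketch below is illustrative only: the guidance convention (scale 1 recovers the conditional prediction), the choice to add Gaussian noise to the control images, and the names `guided_velocity`, `augment_with_noise`, `guidance_scale`, and `sigma` are all assumptions, not details confirmed by this page.

```python
import numpy as np

def guided_velocity(v_cond, v_uncond, guidance_scale):
    """Blend conditional and unconditional velocity predictions.

    Assumed convention: scale 0 returns v_uncond, scale 1 returns v_cond,
    and scale > 1 extrapolates past the conditional prediction.
    """
    return v_uncond + guidance_scale * (v_cond - v_uncond)

def augment_with_noise(x0, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma (assumed form)."""
    return x0 + sigma * rng.standard_normal(x0.shape)

rng = np.random.default_rng(0)
v_c = np.ones((2, 3, 4, 4))    # toy conditional prediction
v_u = np.zeros((2, 3, 4, 4))   # toy unconditional prediction
v_g = guided_velocity(v_c, v_u, guidance_scale=1.5)
x0_noisy = augment_with_noise(np.zeros((2, 3, 4, 4)), sigma=0.1, rng=rng)
```

In practice both predictions would come from the same network, evaluated once with the perturbation condition and once with it dropped.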

Try our CellFlux methods with: bash example.sh



💎 CellFlux Capabilities

State-of-the-Art Performance

Across multiple cellular imaging datasets, CellFlux improves both image generation quality and biological relevance, achieving:

  • 35% improvement in FID scores compared to existing methods
  • 12% increase in mode-of-action prediction accuracy
  • Superior handling of batch effects through distribution-wise modeling
  • Strong out-of-distribution generalization to unseen perturbations

Novel Capabilities: Continuous Interpolation

Beyond generating endpoint images, CellFlux supports capabilities that move the field closer to a virtual cell:

  • Smooth State Transitions: Enables visualization of cellular morphology changes over time
  • Bidirectional Transformations: Can model both forward perturbation and recovery dynamics
  • Batch Effect Calibration: Distinguishes true biological effects from experimental artifacts
  • Trajectory Modeling: Provides insights into the dynamics of cellular responses
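
The interpolation and bidirectional items above can be pictured as keeping every intermediate state of the ODE integration and, for the reverse direction, integrating from t = 1 back to t = 0. The sketch below uses a fixed-step Euler scheme and a toy constant velocity field; `integrate` and `velocity_fn` are hypothetical names, and the solver is an assumption rather than the method's documented choice.

```python
import numpy as np

def integrate(velocity_fn, x_start, t_start, t_end, num_steps=20):
    """Euler integration from t_start to t_end, returning all visited states.

    With t_start > t_end the step size is negative, so the same field is
    followed backward. velocity_fn(x, t) stands in for the trained network.
    """
    x = np.array(x_start, dtype=float)
    dt = (t_end - t_start) / num_steps
    states = [x.copy()]
    for i in range(num_steps):
        t = t_start + i * dt
        x = x + dt * velocity_fn(x, t)
        states.append(x.copy())
    return np.stack(states)   # shape: (num_steps + 1, *x_start.shape)

shift = lambda x, t: np.full_like(x, 2.0)   # toy field: constant shift
control = np.zeros((3, 8, 8))               # toy "control" image

forward = integrate(shift, control, 0.0, 1.0, num_steps=4)        # control -> perturbed
backward = integrate(shift, forward[-1], 1.0, 0.0, num_steps=4)   # perturbed -> control
```

Each slice `forward[k]` is an intermediate image along the path, which is what a continuous interpolation between the two states would display.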

These capabilities offer a way to study how cells respond to perturbations and could provide useful tools for drug discovery and personalized therapy development.


BibTeX

@inproceedings{CellFlux,
  title={CellFlux: Simulating Cellular Morphology Changes via Flow Matching},
  author={Zhang, Yuhui and Su, Yuchang and Wang, Chenyu and Li, Tianhong and Wefers, Zoe and Nirschl, Jeffrey and Burgess, James and Ding, Daisy and Lozano, Alejandro and Lundberg, Emma and Yeung-Levy, Serena},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2025}
}