CellFlux

Simulating Cellular Morphology Changes via Flow Matching

  • 🔮 Method Overview: Predict changes in cell morphology induced by chemical or gene perturbations through distribution-to-distribution modeling using flow matching.
  • 💡 Key Innovation: Formulate cellular morphology prediction as a distribution-to-distribution learning problem, distinguishing true perturbation effects from experimental artifacts.
  • 🛠️ Algorithm Details: Employ flow matching to learn continuous transformations with conditional flow matching, classifier-free guidance, and U-Net architecture.
  • 💎 Results & Capabilities: Achieve 35% improvement in FID scores and 12% increase in prediction accuracy with novel trajectory modeling capabilities.

CellFlux Overview

Building a virtual cell capable of accurately simulating cellular behaviors in silico has long been a dream in computational biology. We introduce CellFlux, an image-generative model that simulates cellular morphology changes induced by chemical and genetic perturbations using flow matching. Unlike prior methods, CellFlux models distribution-wise transformations from unperturbed to perturbed cell states, effectively distinguishing actual perturbation effects from experimental artifacts such as batch effects—a major challenge in biological data.

Evaluated on chemical (BBBC021), genetic (RxRx1), and combined perturbation (JUMP) datasets, CellFlux generates biologically meaningful cell images that faithfully capture perturbation-specific morphological changes, achieving a 35% improvement in FID scores and a 12% increase in mode-of-action prediction accuracy over existing methods. Additionally, CellFlux enables continuous interpolation between cellular states, providing a potential tool for studying perturbation dynamics. These capabilities mark a significant step toward realizing virtual cell modeling for biomedical research.

The open-source implementation enables the community to build on this work and accelerate progress in the field.


🔮 Method Overview

Our method aims to predict, in silico, the changes in cell morphology induced by chemical or genetic perturbations. It addresses the fundamental challenge of distinguishing true biological effects from experimental artifacts by modeling the transformation from one distribution of cells to another.

Figure 1: Method overview. (a) Objective: Predict morphological changes from perturbations. (b) Data: High-content screening with control and perturbed wells. (c) Problem formulation: Distribution-to-distribution transformation. (d) Flow matching: Learning velocity fields for continuous transformation. (e) Results: Superior performance in generation quality and biological accuracy.

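The key components of CellFlux are described in the sections below: the distribution-to-distribution formulation, the flow matching algorithm, and the resulting performance and capabilities.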

💡 Key Innovation: Distribution-to-Distribution Transformation

Our key innovation lies in formulating cellular morphology prediction as a distribution-to-distribution learning problem. Traditional methods often ignore control cells or treat perturbation prediction as a simple image-to-image translation. In contrast, our approach recognizes that:

  • Control cells provide crucial context: They serve as a reference to distinguish true perturbation effects from experimental artifacts like batch effects
  • Batch effects are systematic biases: Variations in experimental conditions across different runs introduce consistent biases unrelated to the perturbation itself
  • Distribution-wise modeling is more robust: By learning transformations between entire distributions rather than individual images, our method captures the inherent variability in biological systems

This approach enables our method to generate more accurate and biologically meaningful cellular responses while maintaining robustness across diverse experimental conditions.
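To make the role of control cells concrete, here is a minimal sketch of how source and target samples might be paired for distribution-to-distribution training. It assumes that each perturbed image is paired with a control image drawn from the same experimental batch, so that batch-specific variation is shared by both ends of the transformation. The record keys ("image", "batch", "perturbation") and the function name are hypothetical and are not taken from the CellFlux code.

```python
import random
from collections import defaultdict


def sample_batch_matched_pairs(records, num_pairs, seed=0):
    """Draw (control, perturbed) pairs that share an experimental batch.

    `records` is a list of dicts with hypothetical keys "image", "batch",
    and "perturbation" (None marks a control well).
    """
    rng = random.Random(seed)
    controls = defaultdict(list)
    perturbed = []
    for rec in records:
        if rec["perturbation"] is None:
            controls[rec["batch"]].append(rec)
        else:
            perturbed.append(rec)

    # Only perturbed images whose batch also contains controls can be paired.
    usable = [rec for rec in perturbed if controls.get(rec["batch"])]
    if not usable:
        raise ValueError("no batch contains both control and perturbed images")

    pairs = []
    for _ in range(num_pairs):
        target = rng.choice(usable)
        # Assumption: the control is sampled at random from the same batch.
        source = rng.choice(controls[target["batch"]])
        pairs.append((source, target))
    return pairs


# Toy records; the strings stand in for image arrays.
records = [
    {"image": "ctrl_a", "batch": "plate1", "perturbation": None},
    {"image": "ctrl_b", "batch": "plate2", "perturbation": None},
    {"image": "drug_a", "batch": "plate1", "perturbation": "compound_X"},
    {"image": "drug_b", "batch": "plate2", "perturbation": "compound_X"},
]
pairs = sample_batch_matched_pairs(records, num_pairs=8)
```

Because the pairing is random within a batch, the model sees a distribution of controls mapped to a distribution of perturbed cells rather than a fixed image-to-image correspondence.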

🛠️ Algorithm Details

Our method leverages flow matching to learn continuous transformations between cellular states. The algorithm is designed to handle the complexity of biological data while maintaining computational efficiency.

Figure 2: Algorithm overview. Our method leverages flow matching to learn continuous transformations between cellular states.

Training Process

During training, our model learns a velocity field by fitting trajectories between control cell images (x₀ ~ p₀) and perturbed cell images (x₁ ~ p₁). At each training step, intermediate states are sampled along linear interpolations, and the network minimizes the difference between predicted and true velocities.
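The training step can be sketched as follows. With a linear interpolation, the intermediate state is x_t = (1 - t) x₀ + t x₁ and the true velocity along that path is x₁ - x₀. The sketch assumes a mean squared error between predicted and true velocities and a uniformly sampled t; neither choice is stated above, and `predict_velocity` is a hypothetical stand-in for the network. Images are represented as flat lists of floats to keep the example dependency-free.

```python
import random


def interpolate(x0, x1, t):
    """Point at time t on the straight line from x0 (control) to x1 (perturbed)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the linear path; it is constant in t and equals x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]


def flow_matching_loss(predict_velocity, x0, x1, condition, rng):
    """Squared-error loss for one (control, perturbed) pair at a random time."""
    t = rng.random()  # assumption: t is drawn uniformly from [0, 1)
    x_t = interpolate(x0, x1, t)
    v_true = target_velocity(x0, x1)
    v_pred = predict_velocity(x_t, t, condition)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(v_true)


# Toy "images" with three pixels each.
x0 = [0.0, 1.0, 2.0]
x1 = [1.0, 3.0, 2.0]
rng = random.Random(0)


def perfect_model(x_t, t, condition):
    return [1.0, 2.0, 0.0]  # exactly x1 - x0


def zero_model(x_t, t, condition):
    return [0.0, 0.0, 0.0]


loss_perfect = flow_matching_loss(perfect_model, x0, x1, "compound_X", rng)
loss_zero = flow_matching_loss(zero_model, x0, x1, "compound_X", rng)
```

A model that predicts the exact displacement reaches zero loss, while a model that predicts no motion is penalized by the mean squared displacement.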

Inference Process

At inference, the trained velocity field guides the transformation of control cell states into perturbed states by solving an ordinary differential equation iteratively using numerical integration steps.
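As an illustration, the simplest numerical integrator for this ODE is the forward Euler method, sketched below. The choice of Euler steps and the step count are assumptions; the text above only says that numerical integration steps are used. `predict_velocity` is again a hypothetical stand-in for the trained network.

```python
def euler_sample(predict_velocity, x0, condition, num_steps=50):
    """Integrate dx/dt = v(x, t, condition) from t = 0 to t = 1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        v = predict_velocity(x, t, condition)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def constant_field(x, t, condition):
    """Toy velocity field that moves every sample by (1, 2, 0) over unit time."""
    return [1.0, 2.0, 0.0]


control = [0.0, 1.0, 2.0]
generated = euler_sample(constant_field, control, "compound_X", num_steps=10)
```

With a constant velocity field, Euler integration is exact up to floating-point rounding, so `generated` lands on the control image shifted by the velocity. A learned field varies with x and t, in which case more steps or a higher-order solver trade compute for accuracy.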

Key Technical Components include:

  • Conditional Flow Matching: Extends flow matching to handle perturbation conditions
  • Classifier-Free Guidance: Improves generation fidelity through conditional/unconditional interpolation
  • Noise Augmentation: Prevents overfitting and encourages smooth velocity fields
  • U-Net Architecture: Captures multi-scale features for accurate cellular morphology modeling
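Two of these components are small enough to sketch directly. The guidance formula below is one common parameterization of classifier-free guidance, where the guided velocity extrapolates from the unconditional prediction toward the conditional one. The exact formula, the guidance scale, the condition-dropout probability, and the choice to add Gaussian noise to the control image are all assumptions made for illustration, not details confirmed above.

```python
import random


def guided_velocity(v_cond, v_uncond, guidance_scale):
    """Combine conditional and unconditional velocity predictions.

    A scale of 0 gives the unconditional prediction, 1 gives the conditional
    one, and values above 1 push further in the conditional direction.
    """
    return [u + guidance_scale * (c - u) for c, u in zip(v_cond, v_uncond)]


def maybe_drop_condition(condition, drop_prob, rng):
    """During training, replace the condition with None at random so that a
    single network learns both conditional and unconditional velocities."""
    return None if rng.random() < drop_prob else condition


def add_noise(x0, sigma, rng):
    """Noise augmentation: perturb the source image with Gaussian noise."""
    return [xi + rng.gauss(0.0, sigma) for xi in x0]


rng = random.Random(0)
v = guided_velocity(v_cond=[2.0, 0.0], v_uncond=[1.0, 0.0], guidance_scale=2.0)
noisy_control = add_noise([0.0, 1.0, 2.0], sigma=0.1, rng=rng)
```

At inference, the network would be evaluated twice per step, once with the perturbation label and once with the label dropped, and the two outputs combined with `guided_velocity`.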

💎 Results & Capabilities

State-of-the-Art Performance

Our method achieves superior performance across multiple cellular imaging datasets, demonstrating significant improvements in both image generation quality and biological relevance.

Figure 3: State-of-the-art performance across multiple datasets. Our method consistently outperforms baseline methods.

Figure 4: Detailed performance metrics showing our method's superiority in various biological tasks.

Our results demonstrate significant improvements:

  • 35% improvement in FID scores compared to existing methods
  • 12% increase in mode-of-action prediction accuracy
  • Strong out-of-distribution generalization to unseen perturbations
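For readers unfamiliar with FID: it is the Fréchet distance between two Gaussians fitted to feature embeddings of real and generated images, and lower values are better. The sketch below shows the one-dimensional special case, where the distance reduces to the squared difference of means plus the squared difference of standard deviations. It is a toy illustration only; real FID uses high-dimensional Inception features and full covariance matrices, and the feature values here are made up.

```python
from statistics import fmean, pstdev


def frechet_distance_1d(features_a, features_b):
    """Frechet distance between 1-D Gaussians fitted to two feature samples."""
    mean_a, mean_b = fmean(features_a), fmean(features_b)
    std_a, std_b = pstdev(features_a), pstdev(features_b)
    return (mean_a - mean_b) ** 2 + (std_a - std_b) ** 2


# Made-up scalar features standing in for image embeddings.
real_features = [0.0, 2.0]
generated_features = [1.0, 3.0]
distance = frechet_distance_1d(real_features, generated_features)
```

Here the two samples have the same spread but means that differ by one, so the distance comes entirely from the mean term.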

Novel Capabilities

Our method unlocks unprecedented capabilities that advance the field toward a true virtual cell, enabling researchers to study cellular dynamics in ways previously impossible.

Figure 5: Novel capabilities including batch effects calibration and modeling bidirectional cellular morphological change trajectories.

Key capabilities include:

  • Trajectory Modeling: Provides insights into the dynamics of cellular responses
  • Bidirectional Transformations: Can model both forward perturbation and recovery dynamics
  • Batch Effect Calibration: Distinguishes true biological effects from experimental artifacts
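Trajectory modeling and bidirectional transformation both follow from the ODE view: intermediate states of the integration form a trajectory, and integrating from t = 1 back to t = 0 runs the transformation in reverse. The sketch below assumes Euler integration and a hypothetical `predict_velocity` function. Treating reverse integration as a model of recovery dynamics is an interpretation for illustration, not a claim about how CellFlux implements it.

```python
def integrate_trajectory(predict_velocity, x_start, condition,
                         t_start, t_end, num_steps=20):
    """Euler-integrate the velocity field and return every intermediate state.

    Passing t_start > t_end makes the step size negative, which integrates
    the same field backward in time.
    """
    x = list(x_start)
    dt = (t_end - t_start) / num_steps
    trajectory = [list(x)]
    for step in range(num_steps):
        t = t_start + step * dt
        v = predict_velocity(x, t, condition)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        trajectory.append(list(x))
    return trajectory


def toy_field(x, t, condition):
    """Toy constant velocity field used only to exercise the integrator."""
    return [1.0, -2.0]


# Forward: control state to perturbed state, keeping intermediate frames.
forward = integrate_trajectory(toy_field, [0.0, 0.0], "compound_X", 0.0, 1.0, num_steps=4)
# Backward: start from the perturbed state and integrate back to t = 0.
backward = integrate_trajectory(toy_field, forward[-1], "compound_X", 1.0, 0.0, num_steps=4)
```

Each entry of `forward` is an interpolated state between the control and perturbed morphology, and `backward` retraces the same path to the starting point.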

These capabilities significantly advance our understanding of cellular behavior and provide powerful tools for drug discovery and personalized therapy development.

Conclusion

Our method represents a significant advancement in computational biology, providing researchers with powerful tools for virtual cell modeling, drug discovery, and understanding cellular dynamics. The distribution-to-distribution approach addresses fundamental challenges in biological data analysis, while flow matching enables unprecedented capabilities in cellular morphology simulation. We hope our work will strengthen the computational biology community and accelerate progress toward truly predictive virtual cell models.

BibTeX

@inproceedings{CellFlux,
  title={{CellFlux: Simulating Cellular Morphology Changes via Flow Matching}},
  author={Zhang, Yuhui and Su, Yuchang and Wang, Chenyu and Li, Tianhong and Wefers, Zoe and Nirschl, Jeffrey and Burgess, James and Ding, Daisy and Lozano, Alejandro and Lundberg, Emma and Yeung-Levy, Serena},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2025}
}