Applied ML
Deep Generative Models for Cosmic Structures
GANs, normalizing flows, and diffusion models as fast surrogates for N-body simulations: halo catalogs, cosmological field generation, and simulation-based inference of cosmological parameters.
Prerequisites
Why This Matters
Cosmological inference compares observed large-scale structure to predictions from N-body simulations. A single high-resolution simulation of a gigaparsec-scale box costs hundreds of thousands to millions of CPU-hours. Bayesian parameter inference requires sampling the cosmological parameter space (Ωm, σ8, h, ns, neutrino masses) at thousands of points. The simulation budget is the bottleneck for next-generation surveys (Rubin LSST, Euclid, DESI, SKA).
Deep generative models attack this from two directions. Emulators learn the mapping from cosmological parameters to summary statistics (matter power spectrum, halo mass function) orders of magnitude faster than running the simulation. Field-level generative models go further: they produce full 3D density fields or 2D weak-lensing maps statistically indistinguishable from simulation outputs. Both feed into simulation-based inference, where the likelihood is replaced by samples from a learned conditional distribution.
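As a concrete example of the summary statistic an emulator targets, the spherically averaged matter power spectrum can be measured from a gridded density field with a short FFT routine. This is a minimal NumPy sketch with an illustrative normalisation and linear binning, not the pipeline of any paper cited here:

```python
import numpy as np

def power_spectrum(delta, box_size=1.0, n_bins=12):
    """Spherically averaged power spectrum of a 3D overdensity field.

    delta: (n, n, n) real array; box_size in the same units as 1/k.
    Normalisation: P(k) = V |delta_k|^2 / n^6, so unit-variance white
    noise gives P(k) roughly equal to the cell volume.
    """
    n = delta.shape[0]
    delta_k = np.fft.fftn(delta)
    power = np.abs(delta_k) ** 2 * box_size**3 / n**6

    k1d = 2 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kmag = np.sqrt(k1d[:, None, None]**2 + k1d[None, :, None]**2
                   + k1d[None, None, :]**2)

    mask = kmag.ravel() > 0                    # drop the k = 0 (mean) mode
    k_flat, p_flat = kmag.ravel()[mask], power.ravel()[mask]
    edges = np.linspace(k_flat.min(), k_flat.max(), n_bins + 1)
    idx = np.clip(np.digitize(k_flat, edges) - 1, 0, n_bins - 1)
    pk = np.bincount(idx, weights=p_flat, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    k_centres = 0.5 * (edges[1:] + edges[:-1])
    return k_centres, pk / np.maximum(counts, 1)
```

An emulator then regresses this P(k) vector on the cosmological parameters; production codes use logarithmic binning and correct for mass-assignment windows, which are omitted here.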
The science payoff is the ability to use all of the data, not just the two-point function. Higher-order statistics, peak counts, and field-level likelihoods carry information about non-Gaussian structure that the power spectrum throws away.
Core Ideas
GANs for halo catalogs and weak-lensing maps. Mustafa et al. (2019, Computational Astrophysics and Cosmology 6; arXiv 1706.02390) trained a DCGAN on weak-lensing convergence maps from N-body simulations. Generated maps matched the power spectrum, peak counts, and Minkowski functionals of the training data within a few percent across the angular scales probed. Subsequent work extended to 3D density fields and conditional generation on cosmological parameters.
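Peak counts, one of the validation statistics above, reduce to counting local maxima of the convergence map above a set of thresholds. A minimal NumPy sketch; the 8-neighbour peak definition is one common convention, not necessarily the one used in the papers cited here:

```python
import numpy as np

def peak_counts(kappa, thresholds):
    """Peak counts for a 2D convergence map: a pixel is a peak when it
    strictly exceeds all 8 neighbours (map edges are excluded).
    Returns the number of peaks above each threshold."""
    c = kappa[1:-1, 1:-1]
    neighbours = (kappa[:-2, :-2], kappa[:-2, 1:-1], kappa[:-2, 2:],
                  kappa[1:-1, :-2],                   kappa[1:-1, 2:],
                  kappa[2:, :-2],   kappa[2:, 1:-1],  kappa[2:, 2:])
    is_peak = np.logical_and.reduce([c > s for s in neighbours])
    heights = c[is_peak]
    return np.array([(heights > t).sum() for t in thresholds])
```

Comparing this histogram between generated and held-out simulated maps probes non-Gaussian information that the power spectrum misses.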
Normalizing flows for cosmological fields. CosmoFlow (Mathuriya et al. 2018, arXiv 1808.04728) used 3D CNN regression for parameter prediction; later work used flow-based models (Rouhiainen, Münchmeyer 2022, arXiv 2206.05014; Dai, Seljak 2022, PNAS 119) to learn tractable densities over field configurations. Flows give an exact, tractable likelihood for the learned surrogate model — not an exact reconstruction of the true simulator-induced likelihood. SBI pipelines exploit this tractability for fast posterior evaluation, but the surrogate must still be calibrated against the simulator (e.g., via simulation-based calibration or coverage tests; see below) before its likelihoods can be trusted as a stand-in for the true likelihood.
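The tractable likelihood comes from the change-of-variables formula: for an invertible map x → z with standard-normal base density, log p(x) = log N(z; 0, I) + log|det ∂z/∂x|. An affine coupling layer makes the Jacobian triangular, so the log-determinant is just a sum. A toy NumPy sketch, with a hypothetical linear conditioner standing in for the neural network:

```python
import numpy as np

def conditioner(x1, w, b):
    """Hypothetical stand-in for a neural net: a linear map from the
    untouched half x1 to a bounded log-scale s and a shift t."""
    s = np.tanh(x1 @ w[0] + b[0])
    t = x1 @ w[1] + b[1]
    return s, t

def forward(x, w, b):
    """Normalising direction x -> z for one affine coupling layer.
    The Jacobian is triangular, so log|det J| = sum(s) exactly."""
    d = x.shape[-1] // 2
    x1, x2 = x[..., :d], x[..., d:]
    s, t = conditioner(x1, w, b)
    z = np.concatenate([x1, x2 * np.exp(s) + t], axis=-1)
    return z, s.sum(axis=-1)

def inverse(z, w, b):
    """Exact inverse: the conditioner only sees the untouched half."""
    d = z.shape[-1] // 2
    z1, z2 = z[..., :d], z[..., d:]
    s, t = conditioner(z1, w, b)
    return np.concatenate([z1, (z2 - t) * np.exp(-s)], axis=-1)

def log_prob(x, w, b):
    """Exact density under the flow via change of variables."""
    z, logdet = forward(x, w, b)
    dim = x.shape[-1]
    return -0.5 * (z ** 2).sum(axis=-1) - 0.5 * dim * np.log(2 * np.pi) + logdet
```

Real flow models stack many such layers with permutations between them; the exactness of `log_prob` is what SBI pipelines exploit, and it is exact only for the surrogate, as noted above.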
Diffusion models as the new default. Score-based and diffusion models now match or beat GANs on cosmological field generation (Mudur, Finkbeiner 2022, arXiv 2211.12444; Park et al. 2023). Training is more stable, mode coverage is better, and conditional generation on cosmological parameters is straightforward. The cost is sampling speed, partially addressed by flow matching and consistency models.
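Part of why training is stable: the forward (noising) half of a diffusion model has a closed form, so any noised sample x_t can be drawn directly from x_0 without simulating the chain. A toy NumPy sketch with an illustrative linear schedule (the numbers are not from any paper cited above); for Gaussian data the score the network must learn is analytic, which is handy for sanity checks:

```python
import numpy as np

# Illustrative variance-preserving linear noise schedule.
betas = np.linspace(1e-4, 0.02, 1000)
alpha_bar = np.cumprod(1.0 - betas)

def diffuse(x0, t, rng):
    """Closed-form sample from q(x_t | x_0):
    x_t = sqrt(abar_t) x_0 + sqrt(1 - abar_t) eps, eps ~ N(0, I).
    Score networks are trained to predict eps from (x_t, t)."""
    eps = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1 - alpha_bar[t]) * eps, eps

def gaussian_score(x_t, t, s2=1.0):
    """For Gaussian data x_0 ~ N(0, s2 I) the marginal score is exact:
    grad log q(x_t) = -x_t / (abar_t * s2 + 1 - abar_t). A trained
    network approximates this for non-Gaussian cosmological fields."""
    return -x_t / (alpha_bar[t] * s2 + 1.0 - alpha_bar[t])
```

Sampling runs this process in reverse, one (or, with consistency models, few) score evaluations per step, which is where the speed cost mentioned above comes from.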
Simulation-based inference (SBI). The Cranmer-Brehmer-Louppe review (2020, PNAS 117; arXiv 1911.01429) frames the broader program: when the simulator defines an implicit likelihood, train a conditional density estimator on simulator runs and use it in place of the intractable likelihood or posterior. Methods include neural posterior estimation (NPE), neural likelihood estimation (NLE), and neural ratio estimation (NRE). SimBIG (Hahn et al. 2023, arXiv 2211.00723) applied SBI to BOSS galaxy clustering and recovered cosmological parameter constraints using field-level information beyond the standard two-point analysis.
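The amortization idea behind NPE shows up already in a toy linear-Gaussian model: the optimal posterior mean is linear in the data, so plain least squares over simulator runs recovers it. NPE replaces this regression with a neural conditional density estimator on problems with no closed form. All numbers below are illustrative:

```python
import numpy as np

# Toy "simulator": theta ~ N(0, 1) prior, x | theta ~ N(theta, sigma^2).
# The exact posterior mean is x / (1 + sigma^2).
rng = np.random.default_rng(1)
sigma = 0.5
theta = rng.normal(size=50_000)              # draws from the prior
x = theta + sigma * rng.normal(size=50_000)  # one simulator run per draw

# "Training": least-squares regression of theta on x over simulator runs.
slope = (x @ theta) / (x @ x)

def posterior_mean(x_obs):
    """Amortized estimate: evaluable instantly for any new observation,
    with no further simulator calls."""
    return slope * x_obs
```

The payoff is the same as in SimBIG: all the simulation cost is paid up front during training, and inference on the observed data is then nearly free.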
Common Confusions
A GAN that matches the power spectrum has not learned cosmology
Matching low-order summary statistics is necessary but not sufficient. A generator can reproduce the power spectrum while failing on peak counts, bispectrum, or filament topology. Validation must include statistics orthogonal to the training objective. The community standard is to check power spectrum, bispectrum, peak counts, and Minkowski functionals independently, plus parameter inference posteriors against ground truth.
SBI posteriors require coverage tests
A neural posterior estimator can be sharp and wrong. Standard practice now demands simulation-based calibration (SBC) or expected coverage probability diagnostics (Hermans et al. 2022) on a held-out set of simulator runs. A posterior that fails coverage is unsafe to publish regardless of how tight the credible intervals look.
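SBC works by drawing parameters from the prior, simulating data, and ranking the true parameter among posterior draws; a calibrated posterior yields uniform ranks, while a tilted or U-shaped rank histogram flags bias or over/under-confidence. A minimal NumPy sketch on a conjugate toy problem where the exact posterior is known:

```python
import numpy as np

def sbc_ranks(n_trials=2000, n_post=99, seed=0):
    """Simulation-based calibration for a conjugate toy problem:
    theta ~ N(0, 1), x | theta ~ N(theta, 1), exact posterior N(x/2, 1/2).
    For a calibrated posterior the rank of the true theta among n_post
    posterior draws is uniform on {0, ..., n_post}."""
    rng = np.random.default_rng(seed)
    ranks = np.empty(n_trials, dtype=int)
    for i in range(n_trials):
        theta = rng.normal()                  # draw from the prior
        x = theta + rng.normal()              # run the "simulator"
        post = rng.normal(x / 2, np.sqrt(0.5), size=n_post)
        ranks[i] = int((post < theta).sum())  # rank of truth among draws
    return ranks
```

In a real pipeline the exact posterior draw is replaced by samples from the trained neural posterior, and any deviation from rank uniformity on held-out simulator runs is grounds to retrain before publishing.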
References
- Mustafa et al., CosmoGAN: creating high-fidelity weak lensing convergence maps using Generative Adversarial Networks (Computational Astrophysics and Cosmology 6, 2019; arXiv 1706.02390).
- Mathuriya et al., CosmoFlow: Using Deep Learning to Learn the Universe at Scale (SC18; arXiv 1808.04728).
- Cranmer, Brehmer, Louppe, The frontier of simulation-based inference (PNAS 117, 2020; arXiv 1911.01429). Canonical review of SBI methods.
- Hahn et al., SimBIG: A Forward Modeling Approach To Analyzing Galaxy Clustering (JCAP 04, 2023; arXiv 2211.00723).
- Hermans et al., A Trust Crisis In Simulation-Based Inference? Your Posterior Approximations Can Be Unfaithful (TMLR 2022; arXiv 2110.06581). Coverage tests for SBI.
- Mudur and Finkbeiner, Can denoising diffusion probabilistic models generate realistic astrophysical fields? (NeurIPS ML4PS workshop 2022; arXiv 2211.12444).
Related Topics
Last reviewed: April 26, 2026
Required prerequisites
- Generative Adversarial Networks (layer 3 · tier 2)
- Normalizing Flows (layer 3 · tier 3)
Derived topics
- Score Matching (layer 3 · tier 1)
- Diffusion Models (layer 4 · tier 1)