Applied ML
Deep Generative Models for Cosmic Structures
GANs, normalizing flows, and diffusion models as fast surrogates for N-body simulations: halo catalogs, cosmological field generation, and simulation-based inference of cosmological parameters.
Prerequisites
Why This Matters
Cosmological inference compares observed large-scale structure to predictions from N-body simulations. A single high-resolution simulation of a gigaparsec-scale box costs hundreds of thousands to millions of CPU-hours. Bayesian parameter inference requires sampling the cosmological parameter space (Ωm, σ8, h, ns, neutrino masses) at thousands of points. The simulation budget is the bottleneck for next-generation surveys (Rubin LSST, Euclid, DESI, SKA).
Deep generative models attack this from two directions. Emulators learn the mapping from cosmological parameters to summary statistics (matter power spectrum, halo mass function) orders of magnitude faster than running the simulation. Field-level generative models go further: they produce full 3D density fields or 2D weak-lensing maps statistically indistinguishable from simulation outputs. Both feed into simulation-based inference, where the likelihood is replaced by samples from a learned conditional distribution.
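As a concrete example of the summary statistic an emulator targets, the spherically averaged matter power spectrum can be measured from a gridded density field with a short FFT routine. This is a minimal NumPy sketch with an illustrative normalisation and linear binning, not the pipeline of any paper cited here:

```python
import numpy as np

def power_spectrum(delta, box_size=1.0, n_bins=12):
    """Spherically averaged power spectrum of a 3D overdensity field.

    delta: (n, n, n) real array; box_size in the same units as 1/k.
    Normalisation: P(k) = V |delta_k|^2 / n^6, so unit-variance white
    noise gives P(k) roughly equal to the cell volume.
    """
    n = delta.shape[0]
    delta_k = np.fft.fftn(delta)
    power = np.abs(delta_k) ** 2 * box_size**3 / n**6

    k1d = 2 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kmag = np.sqrt(k1d[:, None, None]**2 + k1d[None, :, None]**2
                   + k1d[None, None, :]**2)

    mask = kmag.ravel() > 0                    # drop the k = 0 (mean) mode
    k_flat, p_flat = kmag.ravel()[mask], power.ravel()[mask]
    edges = np.linspace(k_flat.min(), k_flat.max(), n_bins + 1)
    idx = np.clip(np.digitize(k_flat, edges) - 1, 0, n_bins - 1)
    pk = np.bincount(idx, weights=p_flat, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    k_centres = 0.5 * (edges[1:] + edges[:-1])
    return k_centres, pk / np.maximum(counts, 1)
```

An emulator then regresses this P(k) vector on the cosmological parameters; production codes use logarithmic binning and correct for mass-assignment windows, which are omitted here.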
The science payoff is the ability to use all of the data, not just the two-point function. Higher-order statistics, peak counts, and field-level likelihoods carry information about non-Gaussian structure that the power spectrum throws away.
Core Ideas
GANs for halo catalogs and weak-lensing maps. Mustafa et al. (2019, Computational Astrophysics and Cosmology 6; arXiv 1706.02390) trained a DCGAN on weak-lensing convergence maps from N-body simulations. Generated maps matched the power spectrum, peak counts, and Minkowski functionals of the training data within a few percent across the angular scales probed. Subsequent work extended to 3D density fields and conditional generation on cosmological parameters.
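Peak counts, one of the validation statistics above, reduce to counting local maxima of the convergence map above a set of thresholds. A minimal NumPy sketch; the 8-neighbour peak definition is one common convention, not necessarily the one used in the papers cited here:

```python
import numpy as np

def peak_counts(kappa, thresholds):
    """Peak counts for a 2D convergence map: a pixel is a peak when it
    strictly exceeds all 8 neighbours (map edges are excluded).
    Returns the number of peaks above each threshold."""
    c = kappa[1:-1, 1:-1]
    neighbours = (kappa[:-2, :-2], kappa[:-2, 1:-1], kappa[:-2, 2:],
                  kappa[1:-1, :-2],                   kappa[1:-1, 2:],
                  kappa[2:, :-2],   kappa[2:, 1:-1],  kappa[2:, 2:])
    is_peak = np.logical_and.reduce([c > s for s in neighbours])
    heights = c[is_peak]
    return np.array([(heights > t).sum() for t in thresholds])
```

Comparing this histogram between generated and held-out simulated maps probes non-Gaussian information that the power spectrum misses.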
Normalizing flows for cosmological fields. CosmoFlow (Mathuriya et al. 2018, arXiv 1808.04728) used 3D CNN regression for parameter prediction; later work used flow-based models (Rouhiainen, Münchmeyer 2022, arXiv 2206.05014; Dai, Seljak 2022, PNAS 119) to learn tractable densities over field configurations. Flows give an exact, tractable likelihood for the learned surrogate model — not an exact reconstruction of the true simulator-induced likelihood. SBI pipelines exploit this tractability for fast posterior evaluation, but the surrogate must still be calibrated against the simulator (e.g., via simulation-based calibration or coverage tests; see below) before its likelihoods can be trusted as a stand-in for the true likelihood.
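The tractable likelihood comes from the change-of-variables formula: for an invertible map x → z with standard-normal base density, log p(x) = log N(z; 0, I) + log|det ∂z/∂x|. An affine coupling layer makes the Jacobian triangular, so the log-determinant is just a sum. A toy NumPy sketch, with a hypothetical linear conditioner standing in for the neural network:

```python
import numpy as np

def conditioner(x1, w, b):
    """Hypothetical stand-in for a neural net: a linear map from the
    untouched half x1 to a bounded log-scale s and a shift t."""
    s = np.tanh(x1 @ w[0] + b[0])
    t = x1 @ w[1] + b[1]
    return s, t

def forward(x, w, b):
    """Normalising direction x -> z for one affine coupling layer.
    The Jacobian is triangular, so log|det J| = sum(s) exactly."""
    d = x.shape[-1] // 2
    x1, x2 = x[..., :d], x[..., d:]
    s, t = conditioner(x1, w, b)
    z = np.concatenate([x1, x2 * np.exp(s) + t], axis=-1)
    return z, s.sum(axis=-1)

def inverse(z, w, b):
    """Exact inverse: the conditioner only sees the untouched half."""
    d = z.shape[-1] // 2
    z1, z2 = z[..., :d], z[..., d:]
    s, t = conditioner(z1, w, b)
    return np.concatenate([z1, (z2 - t) * np.exp(-s)], axis=-1)

def log_prob(x, w, b):
    """Exact density under the flow via change of variables."""
    z, logdet = forward(x, w, b)
    dim = x.shape[-1]
    return -0.5 * (z ** 2).sum(axis=-1) - 0.5 * dim * np.log(2 * np.pi) + logdet
```

Real flow models stack many such layers with permutations between them; the exactness of `log_prob` is what SBI pipelines exploit, and it is exact only for the surrogate, as noted above.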
Diffusion models as the new default. Score-based and diffusion models now match or beat GANs on cosmological field generation (Mudur, Finkbeiner 2022, arXiv 2211.12444; Park et al. 2023). Training is more stable, mode coverage is better, and conditional generation on cosmological parameters is straightforward. The cost is sampling speed, partially addressed by flow matching and consistency models.
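Part of why training is stable: the forward (noising) half of a diffusion model has a closed form, so any noised sample x_t can be drawn directly from x_0 without simulating the chain. A toy NumPy sketch with an illustrative linear schedule (the numbers are not from any paper cited above); for Gaussian data the score the network must learn is analytic, which is handy for sanity checks:

```python
import numpy as np

# Illustrative variance-preserving linear noise schedule.
betas = np.linspace(1e-4, 0.02, 1000)
alpha_bar = np.cumprod(1.0 - betas)

def diffuse(x0, t, rng):
    """Closed-form sample from q(x_t | x_0):
    x_t = sqrt(abar_t) x_0 + sqrt(1 - abar_t) eps, eps ~ N(0, I).
    Score networks are trained to predict eps from (x_t, t)."""
    eps = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1 - alpha_bar[t]) * eps, eps

def gaussian_score(x_t, t, s2=1.0):
    """For Gaussian data x_0 ~ N(0, s2 I) the marginal score is exact:
    grad log q(x_t) = -x_t / (abar_t * s2 + 1 - abar_t). A trained
    network approximates this for non-Gaussian cosmological fields."""
    return -x_t / (alpha_bar[t] * s2 + 1.0 - alpha_bar[t])
```

Sampling runs this process in reverse, one (or, with consistency models, few) score evaluations per step, which is where the speed cost mentioned above comes from.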
Simulation-based inference (SBI). The Cranmer-Brehmer-Louppe review (2020, PNAS 117; arXiv 1911.01429) frames the broader program: when the simulator defines an implicit likelihood, train a conditional density estimator on simulator runs and use it in place of the intractable likelihood or posterior. Methods include neural posterior estimation (NPE), neural likelihood estimation (NLE), and neural ratio estimation (NRE). SimBIG (Hahn et al. 2023, arXiv 2211.00723) applied SBI to BOSS galaxy clustering and recovered cosmological parameter constraints using field-level information beyond the standard two-point analysis.
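The amortization idea behind NPE shows up already in a toy linear-Gaussian model: the optimal posterior mean is linear in the data, so plain least squares over simulator runs recovers it. NPE replaces this regression with a neural conditional density estimator on problems with no closed form. All numbers below are illustrative:

```python
import numpy as np

# Toy "simulator": theta ~ N(0, 1) prior, x | theta ~ N(theta, sigma^2).
# The exact posterior mean is x / (1 + sigma^2).
rng = np.random.default_rng(1)
sigma = 0.5
theta = rng.normal(size=50_000)              # draws from the prior
x = theta + sigma * rng.normal(size=50_000)  # one simulator run per draw

# "Training": least-squares regression of theta on x over simulator runs.
slope = (x @ theta) / (x @ x)

def posterior_mean(x_obs):
    """Amortized estimate: evaluable instantly for any new observation,
    with no further simulator calls."""
    return slope * x_obs
```

The payoff is the same as in SimBIG: all the simulation cost is paid up front during training, and inference on the observed data is then nearly free.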
Common Confusions
A GAN that matches the power spectrum has not learned cosmology
Matching low-order summary statistics is necessary but not sufficient. A generator can reproduce the power spectrum while failing on peak counts, bispectrum, or filament topology. Validation must include statistics orthogonal to the training objective. The community standard is to check power spectrum, bispectrum, peak counts, and Minkowski functionals independently, plus parameter inference posteriors against ground truth.
SBI posteriors require coverage tests
A neural posterior estimator can be sharp and wrong. Standard practice now demands simulation-based calibration (SBC) or expected coverage probability diagnostics (Hermans et al. 2022) on a held-out set of simulator runs. A posterior that fails coverage is unsafe to publish regardless of how tight the credible intervals look.
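SBC works by drawing parameters from the prior, simulating data, and ranking the true parameter among posterior draws; a calibrated posterior yields uniform ranks, while a tilted or U-shaped rank histogram flags bias or over/under-confidence. A minimal NumPy sketch on a conjugate toy problem where the exact posterior is known:

```python
import numpy as np

def sbc_ranks(n_trials=2000, n_post=99, seed=0):
    """Simulation-based calibration for a conjugate toy problem:
    theta ~ N(0, 1), x | theta ~ N(theta, 1), exact posterior N(x/2, 1/2).
    For a calibrated posterior the rank of the true theta among n_post
    posterior draws is uniform on {0, ..., n_post}."""
    rng = np.random.default_rng(seed)
    ranks = np.empty(n_trials, dtype=int)
    for i in range(n_trials):
        theta = rng.normal()                  # draw from the prior
        x = theta + rng.normal()              # run the "simulator"
        post = rng.normal(x / 2, np.sqrt(0.5), size=n_post)
        ranks[i] = int((post < theta).sum())  # rank of truth among draws
    return ranks
```

In a real pipeline the exact posterior draw is replaced by samples from the trained neural posterior, and any deviation from rank uniformity on held-out simulator runs is grounds to retrain before publishing.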
References
- Mustafa et al., CosmoGAN: creating high-fidelity weak lensing convergence maps using Generative Adversarial Networks (Computational Astrophysics and Cosmology 6, 2019; arXiv 1706.02390).
- Mathuriya et al., CosmoFlow: Using Deep Learning to Learn the Universe at Scale (SC18; arXiv 1808.04728).
- Cranmer, Brehmer, Louppe, The frontier of simulation-based inference (PNAS 117, 2020; arXiv 1911.01429). Canonical review of SBI methods.
- Hahn et al., SimBIG: A Forward Modeling Approach To Analyzing Galaxy Clustering (JCAP 04, 2023; arXiv 2211.00723).
- Hermans et al., A Trust Crisis In Simulation-Based Inference? Your Posterior Approximations Can Be Unfaithful (TMLR 2022; arXiv 2110.06581). Coverage tests for SBI.
- Mudur and Finkbeiner, Can denoising diffusion probabilistic models generate realistic astrophysical fields? (NeurIPS ML4PS workshop 2022; arXiv 2211.12444).
Related Topics
Last reviewed: April 26, 2026
Required prerequisites
- Generative Adversarial Networks (layer 3 · tier 2)
- Normalizing Flows (layer 3 · tier 3)
Derived topics
- Score Matching (layer 3 · tier 1)
- Diffusion Models (layer 4 · tier 1)