Prerequisite chain
Prerequisites for Reward Models and Verifiers
Topics you need before working through Reward Models and Verifiers. Direct prerequisites are listed first; transitive prerequisites (the chain reachable through them) follow.
Direct prerequisites (4)
- RLHF and Alignment (layer 4, tier 2)
- Post-Training Overview (layer 5, tier 2)
- Reasoning Data Curation (layer 5, tier 2)
- Test-Time Compute and Search (layer 5, tier 2)
Reachable through the chain (391)
These topics are not directly cited as prerequisites but are reached transitively by following the chain upward. Working through the direct prerequisites pulls these in.
- Policy Gradient Theorem (layer 3, tier 1)
- Markov Decision Processes (layer 2, tier 1)
- Convex Optimization Basics (layer 1, tier 1)
- Differentiation in Rⁿ (layer 0A, tier 1)
- Sets, Functions, and Relations (layer 0A, tier 1)
- Basic Logic and Proof Techniques (layer 0A, tier 2)
- Vectors, Matrices, and Linear Maps (layer 0A, tier 1)
- Continuity in Rⁿ (layer 0A, tier 1)
- Metric Spaces, Convergence, and Completeness (layer 0A, tier 1)
- Matrix Operations and Properties (layer 0A, tier 1)
- Linear Independence (layer 0A, tier 1)
- Common Inequalities (layer 0A, tier 1)
- Common Probability Distributions (layer 0A, tier 1)
- Exponential Function Properties (layer 0A, tier 1)
- Integration and Change of Variables (layer 0A, tier 2)
- Measure-Theoretic Probability (layer 0B, tier 1)
- Cardinality and Countability (layer 0A, tier 2)
- Kolmogorov Probability Axioms (layer 0A, tier 1)
- Random Variables (layer 0A, tier 1)
- Zermelo-Fraenkel Set Theory (layer 0A, tier 2)
- Dynamic Programming (layer 0A, tier 1)
- Graph Algorithms Essentials (layer 0A, tier 2)
- Greedy Algorithms (layer 0A, tier 2)
- GraphSLAM and Factor Graphs (layer 3, tier 2)
- Inverse and Implicit Function Theorem (layer 0A, tier 2)
- The Jacobian Matrix (layer 0A, tier 1)
- Positive Semidefinite Matrices (layer 0A, tier 1)
- Eigenvalues and Eigenvectors (layer 0A, tier 1)
- Inner Product Spaces and Orthogonality (layer 0A, tier 1)
- Matrix Norms (layer 0A, tier 1)
- Submodular Optimization (layer 3, tier 3)
- Taylor Expansion (layer 0A, tier 1)
- The Hessian Matrix (layer 0A, tier 1)
- Vector Calculus Chain Rule (layer 0A, tier 1)
- Concentration Inequalities (layer 1, tier 1)
- Expectation, Variance, Covariance, and Moments (layer 0A, tier 1)
- Joint, Marginal, and Conditional Distributions (layer 0A, tier 1)
- Triangular Distribution (layer 0A, tier 2)
- Central Limit Theorem (layer 0B, tier 1)
- Law of Large Numbers (layer 0B, tier 1)
- Borel-Cantelli Lemmas (layer 0B, tier 1)
- Modes of Convergence of Random Variables (layer 0B, tier 1)
- Characteristic Functions (layer 1, tier 1)
- Moment Generating Functions (layer 0A, tier 2)
- Martingale Theory (layer 0B, tier 2)
- Radon-Nikodym and Conditional Expectation (layer 0B, tier 1)
- Skewness, Kurtosis, and Higher Moments (layer 1, tier 1)
- Bayesian State Estimation (layer 2, tier 2)
- Bayesian Estimation (layer 0B, tier 2)
- Maximum Likelihood Estimation: Theory, Information Identity, and Asymptotic Efficiency (layer 0B, tier 1)
- KL Divergence (layer 1, tier 1)
- Information Theory Foundations (layer 0B, tier 2)
- Distance Metrics Compared (layer 1, tier 2)
- Non-Euclidean and Hyperbolic Geometry (layer 1, tier 2)
- Total Variation Distance (layer 1, tier 1)
- Method of Moments (layer 0B, tier 2)
- Shrinkage Estimation and the James-Stein Estimator: Inadmissibility, SURE, and Brown's Characterization (layer 0B, tier 1)
- Cramér-Rao Bound: Information Inequality, Achievability, and Sharper Variants (layer 0B, tier 1)
- Fisher Information: Curvature, KL Geometry, and the Natural Gradient (layer 0B, tier 1)
- Basu's Theorem (layer 0B, tier 3)
- Sufficient Statistics and Exponential Families (layer 0B, tier 2)
- Minimax Lower Bounds: Le Cam, Fano, Assouad, and the Reduction to Testing (layer 3, tier 1)
- Empirical Processes and Chaining (layer 3, tier 2)
- Rademacher Complexity (layer 3, tier 1)
- Empirical Risk Minimization (layer 2, tier 1)
- High-Dimensional Probability (Vershynin) (layer 2, tier 1)
- Cramér-Wold Theorem (layer 1, tier 2)
- Loss Functions Catalog (layer 1, tier 1)
- Logistic Regression (layer 1, tier 1)
- Data Preprocessing and Feature Engineering (layer 1, tier 1)
- Linear Regression (layer 1, tier 1)
- The Elements of Statistical Learning (Hastie, Tibshirani, Friedman) (layer 0B, tier 1)
- Naive Bayes (layer 1, tier 2)
- Robust Statistics and M-Estimators (layer 3, tier 2)
- Minimax and Saddle Points (layer 2, tier 2)
- Convex Duality (layer 2, tier 1)
- Subgradients and Subdifferentials (layer 1, tier 1)
- Winsorization (layer 1, tier 3)
- Order Statistics (layer 1, tier 2)
- Sequences and Series of Functions (layer 0A, tier 2)
- Understanding Machine Learning (Shalev-Shwartz, Ben-David) (layer 1, tier 1)
- VC Dimension (layer 2, tier 1)
- Counting and Combinatorics (layer 0A, tier 2)
- Hypothesis Classes and Function Spaces (layer 2, tier 1)
- PAC Learning Framework (layer 1, tier 1)
- Uniform Convergence (layer 2, tier 1)
- Adaptive Learning Is Not IID (layer 3, tier 2)
- Bernstein Inequality (layer 2, tier 1)
- Bennett's Inequality (layer 2, tier 1)
- Chernoff Bounds (layer 1, tier 1)
- Hoeffding's Lemma (layer 1, tier 1)
- Realizability Assumption (layer 2, tier 1)
- Loss Functions (layer 1, tier 2)
- Slud's Inequality (layer 2, tier 2)
- Bias-Complexity Tradeoff (layer 2, tier 2)
- No-Free-Lunch Theorem (layer 2, tier 2)
- Glivenko-Cantelli Theorem (layer 2, tier 2)
- McDiarmid's Inequality (layer 3, tier 1)
- Sub-Gaussian Random Variables (layer 2, tier 1)
- Epsilon-Nets and Covering Numbers (layer 3, tier 1)
- Contraction Inequality (layer 3, tier 2)
- Sub-Exponential Random Variables (layer 2, tier 1)
- Chi-Squared Concentration (layer 2, tier 1)
- Symmetrization Inequality (layer 3, tier 1)
- Asymptotic Statistics: M-Estimators, Delta Method, LAN (layer 0B, tier 1)
- Measure Concentration and Geometric Functional Analysis (layer 3, tier 1)
- Stochastic Processes for ML (layer 2, tier 2)
- Gaussian Processes in Astronomy (layer 4, tier 3)
- Gaussian Processes for Machine Learning (layer 4, tier 3)
- Kernels and Reproducing Kernel Hilbert Spaces (layer 3, tier 2)
- Dimensionality Reduction Theory (layer 2, tier 2)
- Principal Component Analysis (layer 1, tier 1)
- Singular Value Decomposition (layer 0A, tier 1)
- Gram Matrices and Kernel Matrices (layer 1, tier 1)
- Matrix Multiplication Algorithms (layer 1, tier 2)
- The Kernel Trick (layer 2, tier 1)
- Support Vector Machines (layer 2, tier 1)
- Perceptron (layer 1, tier 2)
- Ridge Regression (layer 1, tier 1)
- Gauss-Markov Theorem (layer 2, tier 1)
- The Multivariate Normal Distribution (layer 0B, tier 1)
- Maximum A Posteriori (MAP) Estimation (layer 0B, tier 1)
- Bayesian Linear Regression (layer 2, tier 1)
- Conjugate Priors (layer 0B, tier 1)
- High-Dimensional Covariance Estimation (layer 3, tier 2)
- Matrix Concentration (layer 3, tier 1)
- Lasso Regression (layer 2, tier 1)
- NMF (Nonnegative Matrix Factorization) (layer 2, tier 3)
- Tensors and Tensor Operations (layer 0A, tier 1)
- Pandas and NumPy Fundamentals (layer 4, tier 3)
- Functional Analysis Core (layer 0B, tier 2)
- Hanson-Wright Inequality (layer 3, tier 2)
- Regularization Theory (layer 2, tier 2)
- Bias-Variance Tradeoff (layer 2, tier 2)
- Elastic Net (layer 2, tier 2)
- Generalized Additive Models (layer 2, tier 2)
- MARS (Multivariate Adaptive Regression Splines) (layer 2, tier 3)
- K-Nearest Neighbors (layer 1, tier 2)
- AdaBoost (layer 2, tier 2)
- Decision Trees and Ensembles (layer 2, tier 2)
- Gradient Boosting (layer 2, tier 1)
- Gradient Descent Variants (layer 1, tier 1)
- Cubist and Model Trees (layer 2, tier 3)
- Overfitting and Underfitting (layer 2, tier 1)
- XGBoost (layer 2, tier 2)
- Spectral Clustering (layer 2, tier 2)
- K-Means Clustering (layer 1, tier 1)
- Self-Organizing Maps (layer 2, tier 3)
- t-SNE and UMAP (layer 2, tier 2)
- PageRank Algorithm (layer 2, tier 2)
- SVM for RF Classification (layer 4, tier 3)
- Signals and Systems for ML (layer 1, tier 2)
- Time Series Forecasting Basics (layer 2, tier 2)
- Time Series Foundations (layer 2, tier 2)
- Gaussian Process Regression (layer 3, tier 2)
- Kernel Methods for Molecules (layer 4, tier 3)
- Kalman Filter (layer 2, tier 1)
- No-U-Turn Sampler and Neal's Funnel (layer 3, tier 2)
- Hamiltonian Monte Carlo (layer 3, tier 2)
- Metropolis-Hastings Algorithm (layer 2, tier 1)
- Markov Chain Monte Carlo (layer 2, tier 1)
- Markov Chains and Steady State (layer 1, tier 2)
- Monte Carlo Methods (layer 2, tier 1)
- Gibbs Sampling (layer 2, tier 1)
- Griddy Gibbs Sampling (layer 2, tier 3)
- Variance Reduction Techniques (layer 2, tier 2)
- Importance Sampling (layer 2, tier 1)
- Number Theory and Machine Learning (layer 4, tier 3)
- Differential Privacy (layer 3, tier 2)
- Federated Learning (layer 3, tier 2)
- Optimizer Theory: SGD, Adam, and Muon (layer 3, tier 1)
- Adam Optimizer (layer 2, tier 1)
- Stochastic Gradient Descent Convergence (layer 2, tier 1)
- Coordinate Descent (layer 2, tier 2)
- Mirror Descent and Frank-Wolfe (layer 3, tier 2)
- Online Convex Optimization (layer 3, tier 2)
- No-Regret Learning (layer 3, tier 2)
- Projected Gradient Descent (layer 2, tier 2)
- Proximal Gradient Methods (layer 2, tier 1)
- Quasi-Newton Methods (layer 2, tier 1)
- Newton's Method (layer 1, tier 1)
- Line Search Methods (layer 2, tier 2)
- Secant Method (layer 1, tier 2)
- Automatic Differentiation (layer 1, tier 1)
- Matrix Calculus (layer 1, tier 1)
- Information Geometry (layer 3, tier 3)
- Whitening and Decorrelation (layer 2, tier 2)
- Floating-Point Arithmetic (layer 0A, tier 1)
- Preconditioned Optimizers: Shampoo, K-FAC, and Natural Gradient (layer 3, tier 2)
- Conjugate Gradient Methods (layer 2, tier 2)
- Numerical Linear Algebra (layer 1, tier 2)
- Riemannian Optimization and Manifold Constraints (layer 3, tier 2)
- Equivariant Deep Learning (layer 4, tier 2)
- Convolutional Neural Networks (layer 3, tier 2)
- Feedforward Networks and Backpropagation (layer 2, tier 1)
- Activation Functions (layer 1, tier 1)
- Deep Learning (Goodfellow, Bengio, Courville) (layer 0B, tier 1)
- Fast Fourier Transform (layer 1, tier 2)
- Complex Numbers for Fourier (layer 0A, tier 2)
- Skip Connections and ResNets (layer 2, tier 1)
- Graph Neural Networks (layer 3, tier 2)
- Clustering for Gene Expression (layer 4, tier 3)
- Attention for Protein Structure: AlphaFold and Successors (layer 4, tier 3)
- Attention Mechanism Theory (layer 4, tier 2)
- Softmax and Numerical Stability (layer 1, tier 1)
- Linear Layer: Shapes, Bias, and Memory (layer 2, tier 1)
- Word Embeddings (layer 2, tier 2)
- Information Retrieval Foundations (layer 2, tier 1)
- Transformer Architecture (layer 4, tier 2)
- Attention Mechanisms History (layer 3, tier 2)
- Recurrent Neural Networks (layer 3, tier 2)
- Macroeconomic Time-Series Forecasting (layer 4, tier 3)
- Byte-Level Language Models (layer 4, tier 3)
- Tokenization and Information Theory (layer 4, tier 3)
- Distributional Semantics (layer 2, tier 2)
- NLP for Economic Text Analysis (layer 4, tier 3)
- Natural Language Processing Foundations (layer 2, tier 2)
- RNNs for Signal Sequences (layer 4, tier 3)
- Token Prediction and Language Modeling (layer 3, tier 2)
- Hyperbolic Embeddings for Graphs (layer 2, tier 2)
- Training Dynamics and Loss Landscapes (layer 4, tier 2)
- Stability and Optimization Dynamics (layer 2, tier 2)
- Peano Axioms (layer 0A, tier 2)
- Rejection Sampling (layer 1, tier 2)
- Squeezed Rejection Sampling (layer 2, tier 3)
- Burn-in and Convergence Diagnostics (layer 2, tier 2)
- Coupling Arguments and Mixing Time (layer 3, tier 3)
- MCMC for Markov Random Fields (layer 3, tier 3)
- Perfect Sampling (layer 3, tier 3)
- Slice Sampling (layer 2, tier 3)
- Multi-Armed Bandits Theory (layer 2, tier 2)
- Bayesian Optimization for Hyperparameters (layer 3, tier 2)
- Online Learning and Bandits (layer 3, tier 2)
- Test-Time Training and Adaptive Inference (layer 5, tier 2)
- Continuous Thought Machines (layer 5, tier 3)
- Neural ODEs and Continuous-Depth Networks (layer 4, tier 3)
- Classical ODEs: Existence, Stability, and Numerical Methods (layer 1, tier 1)
- Gradient Flow and Vanishing Gradients (layer 2, tier 1)
- Equilibrium and Implicit-Layer Models (layer 4, tier 2)
- Implicit Differentiation (layer 2, tier 2)
- Lyapunov-Based Machine Learning for Chaos (layer 4, tier 3)
- Nonlinear Dynamics and Chaos Fundamentals (layer 4, tier 3)
- Physics-Informed Neural Networks (layer 4, tier 2)
- Divergence, Curl, and Line Integrals (layer 0A, tier 2)
- Kolmogorov-Arnold Networks (KANs) (layer 4, tier 2)
- Universal Approximation Theorem (layer 2, tier 1)
- PDE Fundamentals for Machine Learning (layer 1, tier 2)
- Stochastic Differential Equations (layer 3, tier 2)
- Ito's Lemma (layer 3, tier 2)
- Stochastic Calculus for ML (layer 3, tier 3)
- Symbolic Regression and Equation Discovery (layer 4, tier 3)
- Sparse Recovery and Compressed Sensing (layer 4, tier 3)
- Q-Learning (layer 2, tier 1)
- Value Iteration and Policy Iteration (layer 2, tier 1)
- Bellman Equations (layer 2, tier 1)
- Stochastic Approximation Theory (layer 2, tier 2)
- Temporal Difference Learning (layer 2, tier 2)
- Actor-Critic Methods (layer 3, tier 2)
- Reward Systems and Reinforcement Learning Neuroscience (layer 4, tier 3)
- Fine-Tuning and Adaptation (layer 3, tier 1)
- Agentic RL and Tool Use (layer 5, tier 2)
- Offline Reinforcement Learning (layer 3, tier 2)
- Video World Models (layer 5, tier 2)
- World Models and Planning (layer 4, tier 2)
- The Era of Experience (layer 4, tier 1)
- The Bitter Lesson (layer 3, tier 1)
- History of Artificial Intelligence (layer 5, tier 2)
- Model-Based Reinforcement Learning (layer 3, tier 2)
- Deep RL for Control (layer 4, tier 3)
- Diffusion Models (layer 4, tier 1)
- Variational Autoencoders (layer 3, tier 1)
- Autoencoders (layer 2, tier 2)
- Boltzmann Machines and Hopfield Networks (layer 2, tier 3)
- EM Algorithm Variants (layer 3, tier 2)
- The EM Algorithm (layer 2, tier 1)
- Autoencoders for Low-Dimensional Dynamical Structures (layer 4, tier 3)
- Gaussian Mixture Models and EM (layer 2, tier 2)
- Score Matching (layer 3, tier 1)
- Fokker–Planck Equation (layer 3, tier 2)
- Feynman–Kac Formula (layer 3, tier 2)
- Deep Generative Models for Cosmic Structures (layer 4, tier 3)
- Generative Adversarial Networks (layer 3, tier 2)
- Normalizing Flows (layer 3, tier 3)
- Time Reversal of SDEs (layer 3, tier 2)
- CLIP, OpenCLIP, and SigLIP: Contrastive Language-Image Pretraining (layer 4, tier 1)
- Contrastive Learning (layer 3, tier 2)
- Vision Transformer Lineage: ViT, DeiT, Swin, MAE, DINOv2, SAM (layer 4, tier 1)
- CNNs for Medical Imaging (layer 4, tier 3)
- Object Detection and Segmentation (layer 3, tier 2)
- Florence and Vision Foundation Models (layer 5, tier 2)
- Self-Supervised Vision (layer 4, tier 2)
- CNNs for Signal Feature Extraction (layer 4, tier 3)
- Continuous Normalizing Flows (layer 3, tier 3)
- Adjoint Sensitivity Method (layer 3, tier 2)
- Energy-Based Models (layer 3, tier 3)
- Neural SDEs and the Diffusion Bridge (layer 4, tier 3)
- Langevin Dynamics (layer 3, tier 2)
- SGD as a Stochastic Differential Equation (layer 3, tier 2)
- Probability Flow ODE (layer 3, tier 2)
- BERT and the Pretrain-Finetune Paradigm (layer 4, tier 2)
- Policy Optimization: PPO and TRPO (layer 3, tier 2)
- DDPG: Deep Deterministic Policy Gradient (layer 3, tier 2)
- TD3: Twin Delayed Deep Deterministic Policy Gradient (layer 3, tier 2)
- Scaling Laws (layer 4, tier 1)
- Data Contamination and Evaluation (layer 5, tier 2)
- Hypothesis Testing for ML (layer 2, tier 2)
- Benford's Law (layer 1, tier 2)
- Confusion Matrix: MCC, Kappa, and Cost-Sensitive Evaluation (layer 1, tier 1)
- Evaluation Metrics and Properties (layer 2, tier 2)
- Neyman-Pearson and Hypothesis Testing Theory (layer 2, tier 2)
- Reproducibility and Experimental Rigor (layer 2, tier 2)
- Git and GitLab for ML Research (layer 4, tier 3)
- Python for ML Research (layer 4, tier 3)
- Weights and Biases for Experiment Tracking (layer 4, tier 3)
- Survival Analysis (layer 3, tier 2)
- Benchmarking Methodology (layer 3, tier 3)
- Model Collapse and Data Quality (layer 5, tier 2)
- Synthetic Data Generation (layer 3, tier 2)
- Distributed Training Theory (layer 5, tier 3)
- Parallel Processing Fundamentals (layer 5, tier 2)
- Broadcast Joins in Distributed Compute (layer 4, tier 3)
- Dask Parallel Python (layer 4, tier 3)
- Ray Distributed Python (layer 4, tier 3)
- Batch Size and Learning Dynamics (layer 2, tier 2)
- Kafka Streaming Platform (layer 4, tier 3)
- Running ML Workloads on GPUs (layer 4, tier 3)
- GPU Compute Model (layer 5, tier 2)
- ASML and Chip Manufacturing (layer 5, tier 3)
- Docker and Containers for ML (layer 4, tier 3)
- Kubernetes for ML Workloads (layer 4, tier 3)
- Modal: Serverless GPU Platform (layer 4, tier 3)
- Ineffable Intelligence (layer 4, tier 2)
- Reinforcement Learning from Human Feedback (layer 5, tier 1)
- Reinforcement Learning for Synthesis Planning (layer 4, tier 3)
- Reward Design and Reward Misspecification (layer 3, tier 1)
- Reinforcement Learning for Drug Discovery (layer 4, tier 3)
- AI Labs Landscape (layer 5, tier 2)
- Model Timeline (layer 5, tier 2)
- Inference Systems Overview (layer 5, tier 2)
- KV Cache (layer 5, tier 2)
- Attention Is All You Need (Paper) (layer 4, tier 1)
- Attention Variants and Efficiency (layer 4, tier 2)
- Efficient Transformers Survey (layer 4, tier 2)
- Speculative Decoding and Quantization (layer 5, tier 2)
- Megakernels (layer 5, tier 3)
- Fused Kernels (layer 5, tier 2)
- CUDA Programming Fundamentals (layer 4, tier 3)
- Flash Attention (layer 5, tier 2)
- Computer Architecture for ML (layer 2, tier 2)
- NVIDIA GPU Architectures (layer 5, tier 3)
- WebGPU for Machine Learning (layer 0B, tier 2)
- Multi-Token Prediction (layer 5, tier 2)
- Edge and On-Device ML (layer 5, tier 2)
- Model Compression and Pruning (layer 3, tier 2)
- Lazy vs Feature Learning (layer 4, tier 2)
- Neural Tangent Kernel: Lazy Training, Kernel Equivalence, μP, and the Limits of Width (layer 4, tier 1)
- Implicit Bias and Modern Generalization (layer 4, tier 1)
- Algorithmic Stability (layer 3, tier 1)
- Cross-Validation Theory (layer 2, tier 2)
- AIC and BIC (layer 2, tier 1)
- Kolmogorov Complexity and MDL (layer 2, tier 2)
- Class Imbalance and Resampling (layer 1, tier 2)
- Confusion Matrices and Classification Metrics (layer 1, tier 1)
- Multi-Class and Multi-Label Classification (layer 1, tier 2)
- Signal Detection Theory (layer 2, tier 2)
- Feature Importance and Interpretability (layer 2, tier 2)
- Exploratory Data Analysis (layer 1, tier 2)
- ML Project Lifecycle (layer 1, tier 2)
- Hardware for ML Practitioners (layer 1, tier 2)
- Train-Test Split and Data Leakage (layer 1, tier 1)
- Mechanistic Interpretability: Features, Circuits, and Causal Faithfulness (layer 4, tier 1)
- Residual Stream and Transformer Internals (layer 4, tier 2)
- Forgetting Transformer (FoX) (layer 4, tier 2)
- Sparse Attention and Long Context (layer 4, tier 2)
- Gemini and Google Models (layer 5, tier 2)
- Sparse Autoencoders for Interpretability: TopK, JumpReLU, Matryoshka, and Scaling (layer 4, tier 1)
- Truth Directions and Linear Probes (layer 4, tier 2)
- Model Evaluation Best Practices (layer 1, tier 1)
- Proper Scoring Rules (layer 2, tier 2)
- ROC Curve and AUC (layer 2, tier 2)
- Statistical Significance and Multiple Comparisons (layer 2, tier 2)
- PAC-Bayes Bounds (layer 3, tier 1)
- Sample Complexity Bounds (layer 2, tier 1)
- Information Bottleneck (layer 3, tier 3)
- Neural Network Optimization Landscape (layer 4, tier 2)
- Random Matrix Theory Overview (layer 4, tier 2)
- Mean Field Theory (layer 4, tier 2)
- AlphaProof and AI-Assisted Theorem Proving (layer 4, tier 1)
- Synthetic Data Distillation (layer 3, tier 2)
- Knowledge Distillation (layer 3, tier 2)
- Iterative Magnitude Pruning and the Lottery Ticket Hypothesis (layer 4, tier 2)
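How a list like the one above is derived can be sketched as a graph traversal: starting from a topic's direct prerequisites, follow prerequisite edges upward and collect everything reachable, then subtract the direct prerequisites to leave only the transitively reached topics. A minimal sketch, assuming the chain is available as a topic-to-direct-prerequisites mapping; the `PREREQS` data below is a small illustrative slice, not the full index:

```python
from collections import deque

# Hypothetical prerequisite edges: topic -> its direct prerequisites.
# Only a few topics from the chain are shown for illustration.
PREREQS = {
    "Reward Models and Verifiers": [
        "RLHF and Alignment",
        "Post-Training Overview",
        "Reasoning Data Curation",
        "Test-Time Compute and Search",
    ],
    "RLHF and Alignment": ["Policy Gradient Theorem"],
    "Policy Gradient Theorem": ["Markov Decision Processes"],
    "Markov Decision Processes": ["Convex Optimization Basics"],
}

def transitive_prereqs(topic):
    """Topics reachable by following the chain upward from `topic`,
    excluding those already cited as direct prerequisites."""
    direct = set(PREREQS.get(topic, []))
    seen, queue = set(), deque(direct)
    while queue:
        current = queue.popleft()
        for parent in PREREQS.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen - direct

# With the slice above, the transitive-only set is the three-topic chain
# Policy Gradient Theorem -> Markov Decision Processes -> Convex Optimization Basics.
print(transitive_prereqs("Reward Models and Verifiers"))
```

On the full index, the same traversal over all 4 direct prerequisites yields the 391 transitively reachable topics listed above.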