Gold diagnostic report

Check reference diagnostic sets before real learners use them.

A gold diagnostic set is a curated group of questions with known answers, skill tags, claim links, and expected routing behavior. We use these sets to test whether the adaptive system recommends the right review target before real learner data is involved.
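The shape of one such set can be sketched in a few lines. This is a minimal sketch, assuming plain records; the class and field names (`GoldItem`, `GoldSet`, `claim_links`, `expected_policy`) are illustrative, not the system's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class GoldItem:
    """One gold diagnostic question with its known-good annotations."""
    question_id: str
    answer: str                     # known correct answer
    skills: list[str]               # Q-matrix row: skills the question exercises
    claim_links: list[str] = field(default_factory=list)  # may be empty

@dataclass
class GoldSet:
    """A curated question set plus the routing behavior the evaluator expects."""
    set_id: str
    items: list[GoldItem]
    expected_policy: str            # e.g. "prerequisite_bottleneck"

    def claim_link_coverage(self) -> float:
        """Fraction of questions carrying at least one claim link."""
        if not self.items:
            return 0.0
        return sum(1 for it in self.items if it.claim_links) / len(self.items)
```

The coverage method is the quantity the per-lane "claim links" stat below reports.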

Not learner data: these are fixed evaluator questions, not private user answers or a model-training dataset.

Used as a calibration set: each set has expected skills and claims, so we can catch vague or wrong recommendations.

Tests the learning loop: the report checks coverage, answer quality, Q-matrix links, claim links, and simulated profiles.
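The coverage side of those checks can be illustrated with a small helper. A sketch only: the warning strings mirror this report's wording, but the 0.5 claim-link floor and the function shape are assumptions, since the report does not publish its thresholds:

```python
def lane_checks(questions: int, q_matrix_rows: int, claim_links: int,
                thin_floor: float = 0.5) -> list[str]:
    """Coverage checks in the style of this report's per-lane warnings.
    The real claim-link floor is not stated; 0.5 is an illustrative assumption."""
    warnings = []
    if q_matrix_rows < questions:
        # every question should have a Q-matrix row linking it to skills
        warnings.append(f"Q-matrix incomplete: {q_matrix_rows}/{questions} rows")
    if questions and claim_links / questions < thin_floor:
        warnings.append(
            f"Claim-link coverage is thin: {claim_links}/{questions} "
            "questions have claim links"
        )
    return warnings
```

Under this assumed floor, a lane like math-to-probability-readiness (8/30) is flagged thin, while training-and-estimation-readiness (10/20) passes clean.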

Lanes 14 · unique questions 288 · recommended policies 1

math-to-probability-readiness
Math to probability readiness · testable
Checks notation, matrix reasoning, random variables, Bayes, distributions, convergence, and concentration before ML theory.
Questions 30 · topics 23 · Q-matrix 30 · claim links 8
Recommended policy: prerequisite bottleneck · accuracy 76% · mastery 38% · prereq jumps 0.0
Claim-link coverage is thin: 8/30 questions have claim links.
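The recommended-policy line can be read as a decision rule over the simulated profile's stats. This is a hypothetical illustration, not the system's actual logic; the alternative policy names (`broad_review`, `advance`) are invented for contrast:

```python
def recommend_policy(accuracy: float, mastery: float, prereq_jumps: float) -> str:
    """Illustrative decision rule only; the report does not publish the real one.
    Simulated accuracy can look healthy while mastery stays low, which reads as
    a prerequisite bottleneck when the profile never jumps past prerequisites."""
    if mastery < 0.5 and prereq_jumps == 0.0:
        return "prerequisite_bottleneck"
    if accuracy < 0.6:
        return "broad_review"   # hypothetical alternative policy name
    return "advance"            # hypothetical alternative policy name
```

On this lane's stats (accuracy 0.756, mastery 0.38, prereq jumps 0.0), such a rule lands on prerequisite_bottleneck, matching the report.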

probability-to-statistics-readiness
Probability to statistics readiness · testable
Checks LLN versus CLT, convergence modes, standard errors, and finite-sample caveats before estimation and asymptotics.
Questions 20 · topics 13 · Q-matrix 20 · claim links 5
Recommended policy: prerequisite bottleneck · accuracy 82% · mastery 41% · prereq jumps 0.0
Claim-link coverage is thin: 5/20 questions have claim links.

finite-generalization-readiness
Finite generalization readiness · testable
Checks the sub-Gaussian/Hoeffding bridge plus ERM, VC, and uniform-convergence prerequisites.
Questions 18 · topics 11 · Q-matrix 18 · claim links 8
Recommended policy: prerequisite bottleneck · accuracy 70% · mastery 42% · prereq jumps 0.0
Overlapping gold items deduped: 2 repeated questions appeared across sets in this lane.
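The dedup behind that note can be sketched as a set union over the lane's member sets; the helper name and inputs here are illustrative:

```python
def dedupe_lane(question_ids_by_set: dict[str, list[str]]) -> tuple[list[str], int]:
    """Union question ids across a lane's sets, counting repeats once.
    Lane question totals in this report are post-dedup unique counts."""
    seen: set[str] = set()
    unique: list[str] = []
    repeats = 0
    for qids in question_ids_by_set.values():
        for qid in qids:
            if qid in seen:
                repeats += 1        # question already contributed by another set
            else:
                seen.add(qid)
                unique.append(qid)
    return unique, repeats
```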

neural-network-function-readiness
Neural network function readiness · testable
Checks exp/log identities, sigmoid/logit/tanh, softplus, safe softmax, and optimizer mechanics before neural-network training paths.
Questions 20 · topics 14 · Q-matrix 20 · claim links 5
Recommended policy: prerequisite bottleneck · accuracy 85% · mastery 42% · prereq jumps 0.0
Claim-link coverage is thin: 5/20 questions have claim links.

training-and-estimation-readiness
Training and estimation readiness · testable
Checks optimization, stochastic gradients, likelihood, Fisher information, and estimation assumptions.
Questions 20 · topics 17 · Q-matrix 20 · claim links 10
Recommended policy: prerequisite bottleneck · accuracy 82% · mastery 42% · prereq jumps 0.0
Ready for local test flight: coverage and quality checks are above the current gold-set floor.

information-theoretic-readiness
Information-theoretic readiness · testable
Checks surprisal, entropy, bit/nat units, bpb in language modeling, KL asymmetry, and the cross-entropy decomposition before classifier-training and LM-evaluation paths. Pairs IT foundations with the activation-function mechanics that the cross-entropy loss combines with.
Questions 20 · topics 9 · Q-matrix 20 · claim links 0
Recommended policy: prerequisite bottleneck · accuracy 92% · mastery 45% · prereq jumps 0.0
Claim-link coverage is thin: 0/20 questions have claim links.

gradient-optimization-readiness
Gradient optimization readiness · testable
Checks the differential calculus and geometry beneath gradient descent: directional derivatives, Jacobians, Hessians and symmetry, first- and second-order optimality conditions, Taylor quadratic models, eigenvalues as curvature, saddle classification, and the Newton step. Pairs the calculus foundations with optimizer mechanics so a learner who walks both knows what GD is walking on AND how stochastic gradients and step-size choices behave.
Questions 20 · topics 13 · Q-matrix 20 · claim links 5
Recommended policy: prerequisite bottleneck · accuracy 78% · mastery 42% · prereq jumps 0.0
Claim-link coverage is thin: 5/20 questions have claim links.

estimation-foundations-readiness
Estimation foundations readiness · testable
Checks the spine of estimator vocabulary and MLE properties: estimator-as-random-variable, bias and MSE decomposition, OLS as MLE under Gaussian noise, MLE as KL minimization (Goodfellow Ch 5.5), consistency, Gaussian-variance MLE bias-vs-consistency, asymptotic efficiency at the Cramér-Rao floor, and OLS as BLUE under Gauss-Markov. Pairs the depth set with the breadth statistics-estimation set so a learner who walks both has Bayesian/frequentist context AND clean property definitions.
Questions 20 · topics 10 · Q-matrix 20 · claim links 5
Recommended policy: prerequisite bottleneck · accuracy 75% · mastery 43% · prereq jumps 0.0
Claim-link coverage is thin: 5/20 questions have claim links.

convex-optimization-readiness
Convex optimization readiness · testable
Checks the constrained-optimization vocabulary: convex set vs convex function, Lipschitz vs L-smoothness, feasible set, projection onto a convex set, the descent lemma, Lagrangian, KKT conditions, weak vs strong duality with Slater, and projected gradient. Pairs the constrained-side set with the unconstrained-calculus set (gradient-based-optimization-foundations-v1) so a learner walks both halves of optimization: what GD walks on, and what changes when constraints arrive.
Questions 20 · topics 9 · Q-matrix 20 · claim links 0
Recommended policy: prerequisite bottleneck · accuracy 83% · mastery 43% · prereq jumps 0.0
Claim-link coverage is thin: 0/20 questions have claim links.

ml-fundamentals-readiness
ML fundamentals readiness · testable
Checks the pre-ML-theory vocabulary: supervised vs unsupervised, parametric vs non-parametric, train vs generalization error, bias-variance, k-means, KNN distance concentration, SVM max-margin, the kernel trick, decision-tree splitting, and overfitting diagnosis. Pairs the fundamentals set with the learning-theory set so a learner who walks both has the basic ML vocabulary AND the formal generalization machinery.
Questions 20 · topics 14 · Q-matrix 20 · claim links 3
Recommended policy: prerequisite bottleneck · accuracy 73% · mastery 43% · prereq jumps 0.0
Claim-link coverage is thin: 3/20 questions have claim links.

adaptive-diagnostic-readiness
Adaptive diagnostic readiness · testable
Checks whether learners can separate iid sampling from adaptive diagnostic feedback and use martingale-style assumptions for retry/checkpoint loops.
Questions 20 · topics 13 · Q-matrix 20 · claim links 11
Recommended policy: prerequisite bottleneck · accuracy 67% · mastery 40% · prereq jumps 0.0
Ready for local test flight: coverage and quality checks are above the current gold-set floor.

proof-vocabulary-readiness
Proof vocabulary readiness · testable
Checks the discrete-math vocabulary every learner needs before reading proofs in any branch of math: subset vs member, functions and relations, injective/surjective/bijective, implication truth tables, quantifier negation, contrapositive vs contradiction, induction structure, Cantor diagonalization, equivalence relations, power-set cardinality, and Russell's paradox. Pairs the sets-and-logic depth set with math-foundations so a learner gets both proof vocabulary AND the broader notation literacy for matrices, calculus, and inequalities.
Questions 20 · topics 13 · Q-matrix 20 · claim links 1
Recommended policy: prerequisite bottleneck · accuracy 85% · mastery 42% · prereq jumps 0.0
Claim-link coverage is thin: 1/20 questions have claim links.

calculus-readiness
Calculus and differentiation readiness · testable
Checks the multivariable-calculus prerequisites before optimization theory: derivative as a limit, continuity vs differentiability, partials, MVT, chain rule via Jacobians, the Fréchet derivative, implicit function theorem, extreme value theorem on compact sets, Lipschitz vs uniform vs pointwise continuity, and Schwarz commutativity. Pairs the depth set with the gradient-based-optimization set so a learner walks both the calculus and its application to GD/Newton.
Questions 20 · topics 9 · Q-matrix 20 · claim links 0
Recommended policy: prerequisite bottleneck · accuracy 78% · mastery 43% · prereq jumps 0.0
Claim-link coverage is thin: 0/20 questions have claim links.

measure-theory-readiness
Measure theory readiness · testable
Checks the measure-theory foundations underpinning modern probability: sigma-algebras, Borel sets, measurable functions, MCT/Fatou/DCT, almost-everywhere equality, Borel-Cantelli, Fubini, Lebesgue vs Riemann, and countable additivity. Pairs the measure-theory depth set with the probability-foundations breadth set so a learner promoting from undergrad probability into rigorous measure-theoretic probability has both sides.
Questions 20 · topics 11 · Q-matrix 20 · claim links 0
Recommended policy: prerequisite bottleneck · accuracy 90% · mastery 41% · prereq jumps 0.0
Claim-link coverage is thin: 0/20 questions have claim links.

Path recommendation checks

Synthetic profiles should route to concrete review targets.

These rows test whether missed answers point to specific claims or topics, not generic study advice.
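Routing a profile's misses to a concrete claim rather than generic advice can be sketched as a frequency vote over claim links. A minimal illustration; the real router also picks a next checkpoint, which this sketch omits, and the data shapes are assumptions:

```python
from collections import Counter

def route_review(missed: list[str],
                 claims_by_question: dict[str, list[str]]) -> str:
    """Turn a synthetic profile's missed questions into one concrete
    claim-level review target. Picks the claim implicated most often;
    falls back to topic-level advice when no miss carries a claim link."""
    counts = Counter(c for q in missed for c in claims_by_question.get(q, []))
    if not counts:
        return "No claim-linked evidence; fall back to topic-level review."
    claim, _ = counts.most_common(1)[0]
    return f"Review {claim}."
```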

basic-neural-network-from-scratch
Basic Neural Network From Scratch · passing
Checks whether the basic MLP path routes shape, update-rule, backprop, and generalization gaps to the right checkpoint.
Questions 39 · claims 14 · profiles 3
Shape ledger gap (linear layer and activation shapes): Review Vectors, Matrices, and Linear Maps. Next checkpoint: Linear Layer And Activation Shapes. Expected: Vectors, Matrices, and Linear Maps.
Optimizer update gap (loss optimization update rule): Review Critical Batch Size and Training Efficiency. Next checkpoint: Loss Optimization Update Rule. Expected: Critical Batch Size and Training Efficiency.
Generalization gap (generalization and estimation readiness): Review Bias-Variance Tradeoff. Next checkpoint: Generalization And Estimation Readiness. Expected: Bias-Variance Tradeoff, Consistency of the MLE (Wald).

training-and-estimation-readiness
Training and Estimation Readiness · passing
Checks whether optimizer/statistics gaps resolve to claim-level review targets rather than whole-topic advice.
Questions 20 · claims 10 · profiles 3
Batch-noise gap (claim review): Review Critical Batch Size and Training Efficiency. Expected: Critical Batch Size and Training Efficiency.
MLE regularity gap (claim review): Review Asymptotic Normality of the MLE. Expected: Asymptotic Normality of the MLE, Consistency of the MLE (Wald).
Fisher/Cramér-Rao gap (claim review): Review Hessian Form (corollary of Bartlett II). Expected: Hessian Form (corollary of Bartlett II), Cramér-Rao Lower Bound (Scalar).

Copy-safe report

Gold diagnostic simulation report
Generated: 2026-05-01
Lanes: 14
Questions counted across lanes: 288

lane | sets | questions | topics | Q-matrix | claim links | quality | recommended policy
--- | --- | ---: | ---: | ---: | ---: | ---: | ---
math-to-probability-readiness | math-foundations-v1, probability-foundations-v1, probability-concentration-foundations-v1 | 30 | 23 | 30 | 8 | 94.7 | prerequisite_bottleneck (0.756 accuracy)
probability-to-statistics-readiness | convergence-lln-clt-foundations-v1, statistics-estimation-v1 | 20 | 13 | 20 | 5 | 98.2 | prerequisite_bottleneck (0.817 accuracy)
finite-generalization-readiness | probability-concentration-bridge-v1, learning-theory-foundations-v1 | 18 | 11 | 18 | 8 | 97.0 | prerequisite_bottleneck (0.704 accuracy)
neural-network-function-readiness | activation-functions-foundations-v1, optimization-foundations-v1 | 20 | 14 | 20 | 5 | 98.5 | prerequisite_bottleneck (0.850 accuracy)
training-and-estimation-readiness | optimization-foundations-v1, statistics-estimation-v1 | 20 | 17 | 20 | 10 | 98.1 | prerequisite_bottleneck (0.817 accuracy)
information-theoretic-readiness | information-theory-foundations-v1, activation-functions-foundations-v1 | 20 | 9 | 20 | 0 | 99.5 | prerequisite_bottleneck (0.917 accuracy)
gradient-optimization-readiness | gradient-based-optimization-foundations-v1, optimization-foundations-v1 | 20 | 13 | 20 | 5 | 99.0 | prerequisite_bottleneck (0.783 accuracy)
estimation-foundations-readiness | estimator-and-mle-foundations-v1, statistics-estimation-v1 | 20 | 10 | 20 | 5 | 98.6 | prerequisite_bottleneck (0.750 accuracy)
convex-optimization-readiness | convex-optimization-foundations-v1, gradient-based-optimization-foundations-v1 | 20 | 9 | 20 | 0 | 100.0 | prerequisite_bottleneck (0.833 accuracy)
ml-fundamentals-readiness | ml-fundamentals-foundations-v1, learning-theory-foundations-v1 | 20 | 14 | 20 | 3 | 98.3 | prerequisite_bottleneck (0.733 accuracy)
adaptive-diagnostic-readiness | adaptive-learning-not-iid-v1, probability-concentration-bridge-v1 | 20 | 13 | 20 | 11 | 98.0 | prerequisite_bottleneck (0.667 accuracy)
proof-vocabulary-readiness | sets-and-logic-foundations-v1, math-foundations-v1 | 20 | 13 | 20 | 1 | 95.0 | prerequisite_bottleneck (0.850 accuracy)
calculus-readiness | calculus-rn-foundations-v1, gradient-based-optimization-foundations-v1 | 20 | 9 | 20 | 0 | 99.5 | prerequisite_bottleneck (0.783 accuracy)
measure-theory-readiness | measure-theory-foundations-v1, probability-foundations-v1 | 20 | 11 | 20 | 0 | 95.5 | prerequisite_bottleneck (0.900 accuracy)

Path recommendation checks
basic-neural-network-from-scratch: 3 synthetic profiles
- Shape ledger gap: Review Vectors, Matrices, and Linear Maps. Next checkpoint: Linear Layer And Activation Shapes.
- Optimizer update gap: Review Critical Batch Size and Training Efficiency. Next checkpoint: Loss Optimization Update Rule.
- Generalization gap: Review Bias-Variance Tradeoff. Next checkpoint: Generalization And Estimation Readiness.
training-and-estimation-readiness: 3 synthetic profiles
- Batch-noise gap: Review Critical Batch Size and Training Efficiency.
- MLE regularity gap: Review Asymptotic Normality of the MLE.
- Fisher/Cramér-Rao gap: Review Hessian Form (corollary of Bartlett II).