Unlock: Q-Learning
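Model-free, off-policy value learning: the Q-learning update rule, convergence under Robbins-Monro step-size conditions, and the deep Q-network (DQN) revolution, which combined neural-network function approximation with experience replay and brought the deadly triad (function approximation, bootstrapping, and off-policy learning) to the fore.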
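The update rule named above can be illustrated with a minimal tabular sketch. It assumes a dictionary-backed Q-table and hypothetical state and action labels ("s0", "s1", "left", "right"); the function name `q_learning_update` and its parameters are illustrative, not taken from any particular library. The step size `alpha` is fixed here for simplicity, whereas the Robbins-Monro conditions call for a decaying schedule whose sum diverges while the sum of its squares stays finite (for example, 1/t).

```python
from collections import defaultdict


def q_learning_update(Q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.99, done=False):
    """Apply one tabular Q-learning update in place and return the TD error."""
    # Off-policy target: bootstrap from the greedy (max) action in next_state,
    # regardless of which action the behaviour policy takes next.
    best_next = 0.0 if done else max(Q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - Q[(state, action)]
    # Move the current estimate a fraction alpha toward the target.
    Q[(state, action)] += alpha * td_error
    return td_error


# Hypothetical two-state, two-action example.
Q = defaultdict(float)          # unseen (state, action) pairs default to 0.0
actions = ["left", "right"]
Q[("s1", "right")] = 2.0        # pretend this value was learned earlier

td = q_learning_update(Q, "s0", "right", reward=1.0, next_state="s1",
                       actions=actions, alpha=0.5, gamma=0.9)
print(td, Q[("s0", "right")])
```

The target uses the maximum over next-state actions rather than the action actually taken, which is what makes the method off-policy; no transition model is needed, only sampled (state, action, reward, next state) tuples.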