Reward Design and Reward Misspecification
The hardest problem in RL: specifying what you want. Covers reward shaping, the potential-based shaping theorem, specification gaming, Goodhart's law in RL, and the bridge from classic RL to alignment.