Batch Normalization

Question 1 of 5
Why do transformer-style language models usually prefer LayerNorm or RMSNorm over BatchNorm?