Training Value Functions via Classification for Scalable Deep Reinforcement Learning: Study by Google DeepMind Researchers and Others
Marktechpost
MARCH 11, 2024
Value functions, implemented with neural networks, undergo training via mean squared error regression to align with bootstrapped target values. However, upscaling value-based RL methods utilizing regression for extensive networks, like high-capacity Transformers, has posed challenges.
Let's personalize your content