How Does Reinforcement Learning Affect Models?
I want to share some recent reflections on how reinforcement learning in post-training may be affecting language models. This seems important for two reasons. First, much of the serious risk from advanced AI systems may come from post-training rather than from pre-training alone. Second, reinforcement learning appears to be one of the main methods currently being scaled to make models more powerful, especially in reasoning-heavy domains. To understand what may be happening, w