Latent reasoning models might be a good thing?
Epistemic status: I think the main point of this post is probably (~80%) false, and there are probably more counterpoints I haven't thought of. I wrote the rest of the post as if my claims are true for ease of reading. I would appreciate it if you told me where my arguments are wrong!

Latent reasoning models (LRMs, popularized by Meta's Coconut paper, which was improved on a lot by CODI) do CoT thinking in the model's latent space by skipping the LM head that maps from d_model-vectors to a distribution over tokens.
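To make the mechanism concrete, here is a minimal numpy sketch of the difference between token-space CoT and latent CoT. All the weights and function names here are toy stand-ins of my own invention, not Coconut's or CODI's actual implementation; the point is only that the latent loop feeds the hidden state straight back in, skipping the LM-head discretization.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab = 8, 16

# Toy stand-ins for model components (my own, assumed for illustration).
W_step = rng.normal(size=(d_model, d_model)) * 0.1  # "transformer" step
W_head = rng.normal(size=(d_model, vocab))          # LM head
E = rng.normal(size=(vocab, d_model))               # token embedding table

def step(h):
    # One toy forward pass producing the next hidden state.
    return np.tanh(h @ W_step)

def token_cot(h, n):
    # Standard CoT: project to the vocab, pick a token, re-embed it.
    # Everything in h that isn't captured by the chosen token is lost.
    for _ in range(n):
        h = step(h)
        tok = int(np.argmax(h @ W_head))  # discretize through the LM head
        h = E[tok]
    return h

def latent_cot(h, n):
    # Latent CoT (Coconut-style): feed the continuous hidden state
    # back in directly, skipping the LM head entirely.
    for _ in range(n):
        h = step(h)
    return h

h0 = rng.normal(size=d_model)
print(latent_cot(h0, 4).shape)  # hidden state stays continuous: (8,)
```

The thing to notice is that `token_cot` forces an information bottleneck (one discrete token per step), while `latent_cot` never leaves the d_model-dimensional space.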