🧩 Philosophy 12h ago · Nathan Helm-Burger

Research Log: Monet/PEER sparse experts

LessWrong
I've been looking into the Monet/PEER sparse expert papers. I think there's a lot of potential in these ideas for interpretability-by-design.
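For context, the core PEER mechanism is product-key retrieval over a very large pool of single-neuron experts. A minimal brute-force sketch (my own shapes and names; the actual paper retrieves top-k per sub-key set first for efficiency rather than scoring all n² experts):

```python
import numpy as np

def peer_forward(x, U, V, K1, K2, top_k=4):
    """Minimal single-head PEER layer: product-key retrieval over N = n*n
    single-neuron experts. Expert i has down-projection U[i] and
    up-projection V[i]; sub-key sets K1, K2 each hold n half-width keys."""
    n = K1.shape[0]                                # sqrt of the expert count
    q1, q2 = x[: x.size // 2], x[x.size // 2 :]    # split the query in half
    s1, s2 = K1 @ q1, K2 @ q2                      # (n,) scores per sub-key set
    # Score of expert (i, j) is s1[i] + s2[j]; here we brute-force the
    # full n*n score grid and take the global top-k (flat index = i*n + j).
    scores = (s1[:, None] + s2[None, :]).ravel()   # (n*n,)
    idx = np.argpartition(scores, -top_k)[-top_k:]
    w = np.exp(scores[idx] - scores[idx].max())
    w /= w.sum()                                   # softmax over retrieved experts
    # Each expert is a single hidden neuron: v_i * relu(u_i . x).
    h = np.maximum(U[idx] @ x, 0.0)                # (top_k,) activations
    return (w * h) @ V[idx]                        # (d,) output
```

Because each expert is one neuron, every retrieved expert's contribution is individually inspectable, which is where the interpretability-by-design appeal comes from.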
Some of what I've done so far:


Quantization experiments: PEER can be losslessly distilled to int8, and distilled to int4 with only minor degradation. From int4, you can train PEER by adding a second int4 tensor that acts as a gradient accumulation buffer (allowing incremental steps between two adjacent int4 values), with some stochastic rounding on the accumulation.

