Trained steering vectors may work as activation oracles
Inspired by @Eriskii's recent finding that trained steering vectors can teach a base model to act as an assistant, I replaced the Activation Oracle paper's trained LoRA with a far smaller set of per-layer trained steering vectors and found surprisingly good eval results, far better than anticipated from the tiny param count.

- Trained per-layer steering vectors on Qwen3-8B as an activation oracle
- Standard activation injection mechanism with " ?" placeholders
- Collected activation ranges (full sequence
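To give a rough sense of what "tiny param count" can mean, here is a minimal back-of-the-envelope sketch. It assumes each steering vector is a single trainable vector of the model's hidden width, one per layer. The dimensions (36 layers, hidden size 4096) are assumed Qwen3-8B-like values that should be checked against the model config, and the LoRA setup (rank 16 on two square projections per layer) is purely hypothetical, not the configuration used in the Activation Oracle paper or in this post.

```python
from typing import List, Tuple


def steering_vector_params(num_layers: int, hidden_size: int) -> int:
    """One trainable vector of width hidden_size per layer."""
    return num_layers * hidden_size


def lora_params(num_layers: int, rank: int, shapes: List[Tuple[int, int]]) -> int:
    """LoRA adds rank * (d_in + d_out) parameters per adapted weight matrix."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes)
    return num_layers * per_layer


# Assumed Qwen3-8B-like dimensions (verify against the actual model config).
NUM_LAYERS = 36
HIDDEN = 4096

# Hypothetical LoRA setup for comparison only: rank 16 on two
# hidden x hidden projections in every layer.
RANK = 16
SHAPES = [(HIDDEN, HIDDEN), (HIDDEN, HIDDEN)]

sv_count = steering_vector_params(NUM_LAYERS, HIDDEN)
lora_count = lora_params(NUM_LAYERS, RANK, SHAPES)

print(f"steering vectors: {sv_count:,} params")
print(f"hypothetical LoRA: {lora_count:,} params")
print(f"ratio: {lora_count / sv_count:.0f}x")
```

Under these assumptions the steering vectors amount to well under a million parameters, while even a modest LoRA runs into the millions. The exact ratio depends entirely on the LoRA rank and target modules, which are not stated here.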