🧩 Philosophy 21h ago · Logan Riggs

Ambitious Mech Interp w/ Tensor-transformers on toy languages [Project Proposal]

Less Wrong

This is my project proposal for Pivotal. Apply as a mentee by May 3rd.

The field has accumulated a vocabulary of computational primitives (induction heads, skip-trigrams) through post-hoc analysis. We propose building a toy language from these known primitives to train tensor-transformers (see an early example in the last section).

This allows us to study fundamental problems (suppression & error correction, compositionality/circuits, dev-interp, etc.) with the odds stacked in our favor:

We know the