Ambitious Mech Interp w/ Tensor-transformers on toy languages [Project Proposal]
This is my project proposal for Pivotal. Apply as a mentee by May 3rd.

The field has accumulated a vocabulary of computational primitives (induction heads, skip-trigrams) through post-hoc analysis. We propose building a toy language from these known primitives to train tensor-transformers (see an early example in the last section).

This allows us to study fundamental problems (suppression & error correction, compositionality/circuits, dev-interp, etc.) with the odds stacked in our favor:

We know the