🧩 Philosophy 3d ago · Mitchell_Porter

Final research agenda #2: first sketch of a plan

Less Wrong
Back in mid-February, I posted "A research agenda for the final year", which poses a small set of basic questions. The idea is that if we can answer those questions correctly, then we might have a plan for the creation of a human-friendly superintelligent AI. Now I want to sketch what an answer (and its implementation) could look like. There are no proofs of anything here, just several exploratory hypotheses. They are meant to provide a concrete image of what to aim for, and are subject to revis

