💻 Technology Mar 31, 2026 · Angela Aristidou

AI benchmarks are broken. Here’s what we need instead.

MIT Technology Review

For decades, artificial intelligence has been evaluated through the question of whether machines outperform humans. From chess to advanced math, from coding to essay writing, the performance of AI models and applications is tested against that of individual humans completing tasks. 

This framing is seductive: An AI vs. human comparison on isolated problems with clear right or wrong answers is easy to standardize, compare, and optimize. It generates rankings and headlines. 

But th