💻 Technology Apr 15, 2026 · taryn.plumb@venturebeat.com (Taryn Plumb)

Frontier models are failing one in three production attempts — and getting harder to audit

VentureBeat
AI agents are now embedded in real enterprise workflows, and they're still failing roughly one in three attempts on structured benchmarks. That gap between capability and reliability is the defining operational challenge for IT leaders in 2026, according to Stanford HAI's ninth annual AI Index report. This uneven, unpredictable performance is what the AI Index calls the "jagged frontier," a term coined by AI researcher Ethan Mollick to describe the boundary where AI excels and then suddenly fails
