BridgeBench Shows Top AI Models at 10% Accuracy Despite Strong Reasoning
Casablanca – BridgeBench, a new benchmarking project focused on AI reasoning, has released a ranking that exposes a gap between how confidently models explain answers and how often those answers are correct.
The benchmark tests models on reasoning-heavy tasks and scores them across three metrics. Accuracy measures whether the final answer is correct. Evidence evaluates how well the model supports its reasoning with verifiable steps or sources. The overall score combines both, aiming to reward s
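As a rough illustration of how an accuracy score and an evidence score could be folded into a single overall number, here is a minimal sketch. The article does not state BridgeBench's actual formula, so the weighted average, the 0-100 scale, the `ModelResult` and `overall_score` names, and the sample numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ModelResult:
    """Per-model scores on an assumed 0-100 scale (hypothetical layout)."""
    name: str
    accuracy: float  # share of final answers that are correct
    evidence: float  # how well the reasoning is supported by verifiable steps


def overall_score(result: ModelResult, accuracy_weight: float = 0.5) -> float:
    """Blend accuracy and evidence with a weighted average.

    The weighting is an assumption, not BridgeBench's published formula.
    """
    if not 0.0 <= accuracy_weight <= 1.0:
        raise ValueError("accuracy_weight must be between 0 and 1")
    return accuracy_weight * result.accuracy + (1.0 - accuracy_weight) * result.evidence


# Illustrative numbers only, not actual BridgeBench results.
example = ModelResult(name="model-a", accuracy=10.0, evidence=80.0)
print(overall_score(example))  # 45.0
```

Under this assumed equal weighting, a model with well-supported reasoning but few correct answers still lands mid-table, which is the kind of gap the ranking is meant to surface.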