Prediction markets put the probability at 16%: AI model scores ≥ 90% on FrontierMath Benchmark before 2027. Plus: AI Frontier Models Should Be Subject to Government Cyber Testing, Microsoft Says.
A prediction market currently estimates a 16% probability that an AI model will score ≥ 90% on the FrontierMath Benchmark before 2027, with an 84% probability against that outcome. The FrontierMath Benchmark, designed to test advanced mathematical reasoning, has become a key indicator of progress toward artificial general intelligence (AGI). This market reflects growing skepticism about near-term breakthroughs, even as major AI labs push toward more capable systems. The low probability comes amid broader industry debate about the pace of AI advancement, with some experts warning that benchmarks may not fully capture the complexity of mathematical reasoning required for such high scores. [Forbes, May 03]
The significance of an AI model scoring ≥ 90% on the FrontierMath Benchmark extends beyond academic achievement, as it would signal a level of autonomous problem-solving that could reshape cybersecurity and insurance markets. On May 6, 2026, Microsoft publicly stated that frontier AI models should be subject to government cyber testing, arguing that early access to such models would improve national cyber defenses. This call for regulation follows Anthropic's April 7, 2026 announcement that it would not release its Claude Mythos model to the public, citing global cybersecurity concerns. According to industry analysts, that decision has disrupted the cyber insurance underwriting model. The potential for a model to achieve near-perfect scores on FrontierMath raises questions about whether such systems could autonomously identify and exploit vulnerabilities faster than human experts. [VitalLaw, May 06]
Looking ahead, the path to an AI model scoring ≥ 90% on the FrontierMath Benchmark before 2027 faces both technical and market headwinds. A May 7, 2026 PitchBook analysis highlighted an "AI quality gap," suggesting that the companies priced highest in AI may be worth the least by fundamentals as the narrative around frontier models fades. Meanwhile, Anthropic co-founder Jack Clark stated on May 4, 2026 that there is a 60% probability of recursive self-improvement (RSI) occurring before the end of 2028, a development that could accelerate benchmark performance. However, the current market probability of 16% for a ≥ 90% FrontierMath score before 2027 suggests that traders see significant hurdles, whether in data quality, computational cost, or the inherent difficulty of mathematical reasoning, that may push such a milestone past the market's deadline. [PitchBook, May 07]
Lower-volume market on Polymarket ($91K in volume). Expect wider spreads: enter with limit orders and be aware of slippage risk. Currently 16¢ YES.
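To illustrate why limit orders matter in a thin market, here is a minimal Python sketch. It assumes a YES share pays $1 if the event occurs, so a 16¢ price corresponds to the 16% implied probability quoted above. The order book levels and the `fill_limit_buy` helper are hypothetical and are not taken from Polymarket data or its API.

```python
def fill_limit_buy(asks, limit_price, size):
    """Fill a buy order against ask levels, never paying more than limit_price.

    asks: list of (price, shares_available) tuples (hypothetical book).
    Returns (filled_shares, average_fill_price); the average is None if nothing fills.
    """
    filled = 0.0
    cost = 0.0
    for price, available in sorted(asks):  # cheapest asks first
        if price > limit_price or filled >= size:
            break
        take = min(available, size - filled)
        filled += take
        cost += take * price
    average = cost / filled if filled else None
    return filled, average


# Hypothetical thin order book for YES shares: (price in dollars, shares offered).
asks = [(0.16, 500), (0.18, 300), (0.22, 1000)]
best_ask = min(price for price, _ in asks)

# A limit of $1.00 behaves like a market order: it fills fully but walks up the book.
market_like = fill_limit_buy(asks, limit_price=1.00, size=1000)

# A limit of $0.18 caps the price paid, at the cost of a partial fill.
limited = fill_limit_buy(asks, limit_price=0.18, size=1000)

# Slippage: how far the average fill price sits above the best quoted ask.
slippage = market_like[1] - best_ask

print("market-like fill:", market_like)
print("limit fill:", limited)
print("slippage per share:", round(slippage, 4))
```

In this made-up book, the uncapped order pays an average price above the best ask because it consumes the more expensive levels, while the limit order accepts a partial fill in exchange for a bounded price.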