Discussion about this post

James Ransom

Five hours after posting this, the results were in. GPT 5.4-mini came closest: it had a Burnham (Labour) victory with Kenyon (Reform) second, and it nicely captured the ~9k gap between them.

Grok, Gemini and DeepSeek were all off the mark, voting for a Reform landslide. The right-wing Restore Britain party came in third but barely featured in the simulation, highlighting a limitation of using LLMs with fixed knowledge cut-offs: the party was established in February 2026 and this was their first time standing a candidate for Westminster.

This experiment tells us more about how much the models themselves differ than anything to do with ‘predicting’ election results. Run enough simulations and one of them will eventually look like an eerily accurate forecast. But the divergence given identical prompts is pretty interesting.
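
To put a rough number on the "run enough simulations" point, here is a minimal sketch in Python. It assumes every run is independent and has the same small chance of landing close to the real result by luck alone. The 5% figure and the function name are made up for illustration, not measured from this experiment.

```python
def chance_of_lucky_hit(p_single: float, n_runs: int) -> float:
    """Probability that at least one of n_runs independent runs looks 'accurate'.

    p_single is a hypothetical per-run chance of a close match by luck alone.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_runs < 0:
        raise ValueError("n_runs must be non-negative")
    # Complement rule: 1 minus the chance that every single run misses.
    return 1.0 - (1.0 - p_single) ** n_runs


if __name__ == "__main__":
    # Illustrative only: 5% is an assumed figure, not an estimate for any model.
    for n in (1, 4, 20, 100):
        print(f"{n:>3} runs -> {chance_of_lucky_hit(0.05, n):.3f}")
```

Under those assumptions, the chance of at least one close-looking forecast climbs quickly with the number of runs, which is why a single good match says little on its own.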

In future, we want to run similar simulations but with frontier models. I did a 1% sample with Claude Haiku and Sonnet, but even these turned out to be too expensive for an experimental pilot (running this with Opus 4.8 or GPT 5.5 Pro would require a second mortgage).

Chart of results: https://user.fm/files/v2-9e37ac277d867945372e6ef84e26e4e9/actual_vs_models_chart.png

Press coverage: https://aboutmanchester.co.uk/most-ai-models-predicted-a-reform-win-in-makerfield-the-voters-delivered-a-labour-landslide/
