I Made 6 AI Models Predict the World Cup. They Almost All Agreed. (Results: June 24)

I had what I thought was a genuinely fun idea.

It’s the 2026 World Cup group stage. China has a football lottery product where you predict win, draw, or loss for 14 matches, all on a single ticket. I figured: why not run an experiment? Pit three Chinese AI models against three American ones. See who predicts better. Buy real tickets. Make it concrete.

The models: Kimi, Doubao, and Qianwen on the Chinese side. ChatGPT, Gemini, and Claude on the American side.

The stakes: ¥4 total. Two tickets, one per camp.

I expected disagreement. I got something more interesting instead.


The Setup

I gave every model the same prompt: 14 numbered World Cup matches, each with three prediction options, 3 (home win), 1 (draw), or 0 (away win). I asked for a one-line reason for each call.

[Image: Chinese football lottery period 26087, showing 14 World Cup 2026 group stage matches]

The matches included some obvious mismatches — France vs Iraq, Argentina vs Austria, Portugal vs Uzbekistan — and a few legitimately uncertain ones, like Netherlands vs Sweden and Norway vs Senegal.

I ran the Chinese models in Chinese. The American models in English. Same logic, same matches, different languages.

[Image: ChatGPT predicting 14 World Cup 2026 match results using the 3/1/0 format]
[Image: Doubao, a Chinese AI model, predicting 14 World Cup 2026 match results]

What Happened

Out of 14 matches, all six models agreed on 12 of them.

Not just “leaning the same direction.” Exactly the same prediction, same code, six for six. Germany beats Ivory Coast. Japan beats Tunisia. Spain beats Saudi Arabia. France beats Iraq by a lot. The list goes on.

My first reaction was mild disappointment. I’d imagined this neat story about cultural differences in AI training showing up in football analysis — maybe Chinese models would favor Asian underdogs, or American models would be more aggressive about picking upsets. Instead I got a spreadsheet that looked like six people copying each other’s homework.

But then I sat with it for a minute and thought: actually, this is the finding.


What Consensus Actually Means

When six completely different AI systems, trained by different teams in different countries on different data pipelines, all arrive at the same answer — that’s not them agreeing with each other. They can’t. They don’t talk to each other. What they’re all doing is reading the same underlying reality and reporting the same thing.

For 12 of these 14 matches, the gap between the teams is apparently large enough that any competent analysis — human or AI — reaches the same conclusion. France is not losing to Iraq. Argentina is not losing to Austria. The signal is just that clear.

The interesting question, then, isn’t “who agreed” — it’s “where did they disagree?” Because disagreement is where the actual uncertainty lives.


Where China AI and US AI Actually Disagreed

Out of 14 matches, the two camps split on exactly one: Norway vs Senegal (Match 9). And they split cleanly down national lines.

| Camp | Models | Match 1: Netherlands vs Sweden | Match 9: Norway vs Senegal |
| --- | --- | --- | --- |
| 🇨🇳 China AI | Kimi, Doubao, Qianwen | 3 (Dutch win) | 3 (Norway win) |
| 🇺🇸 US AI | ChatGPT, Gemini, Claude | 3 (Dutch win) | 1 (Draw) |

Match 9: Norway vs Senegal — this is where it got interesting. All three Chinese models called Norway to win. Haaland had already scored twice against Iraq in game one. Norway were sitting on 3 points, momentum high, Haaland at full throttle.

The US models weren’t buying it. ChatGPT, Gemini, and Claude all called a draw. Their argument: Senegal are physical, organized, and not a team that collapses under pressure. Norway rely heavily on one player. The group is tight. A draw was the safer call.
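Sorting matches into "everyone agreed" and "somebody disagreed" is easy to do in a few lines of Python. The following is a minimal sketch, not the method used for this experiment: the dictionary layout and the `split_by_consensus` name are made up for illustration. Only the Norway vs Senegal picks are taken from the results described here; the France vs Iraq row assumes France is the listed home side, so a French win is coded 3.

```python
from collections import Counter

# Codes as used on the ticket: 3 = home win, 1 = draw, 0 = away win.
# Norway vs Senegal picks are from the post; the France vs Iraq row is an
# assumption (France treated as the home side, all six models picking 3).
predictions = {
    "France vs Iraq": {
        "Kimi": 3, "Doubao": 3, "Qianwen": 3,
        "ChatGPT": 3, "Gemini": 3, "Claude": 3,
    },
    "Norway vs Senegal": {
        "Kimi": 3, "Doubao": 3, "Qianwen": 3,
        "ChatGPT": 1, "Gemini": 1, "Claude": 1,
    },
}


def split_by_consensus(preds):
    """Return (unanimous, contested) dictionaries keyed by match name."""
    unanimous, contested = {}, {}
    for match, picks in preds.items():
        tally = Counter(picks.values())  # how many models chose each code
        if len(tally) == 1:
            # Every model chose the same code: store that single code.
            unanimous[match] = next(iter(tally))
        else:
            # At least two different codes: store the full vote count.
            contested[match] = dict(tally)
    return unanimous, contested


unanimous, contested = split_by_consensus(predictions)
print(unanimous)  # {'France vs Iraq': 3}
print(contested)  # {'Norway vs Senegal': {3: 3, 1: 3}}
```

The contested dictionary is the part worth reading: a 3-3 split between "3" and "1" is exactly the kind of match where the uncertainty lives.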

Chinese AI backed the star. American AI backed the system.

Both camps agreed on Match 1 (Netherlands to beat Sweden), so the entire China vs US debate comes down to one question: Can Haaland’s Norway actually beat a Senegal team that held France to a… wait, no. France beat Senegal 3-1. So Senegal are already behind. That might actually favor the Chinese camp’s call.

We find out June 24.


What I Bought

I ended up buying two tickets — one for each camp.

[Image: Two World Cup football lottery tickets labeled China AI and US AI]

China AI ticket (Kimi + Doubao majority): Norway wins match 9, Dutch win match 1.

US AI ticket (ChatGPT + Gemini majority): Draw in match 9, Dutch win match 1.

The two tickets differ on a single match. ¥2 each. If Norway beats Senegal, the Chinese camp wins the internal debate. If it’s a draw, the American camp does.
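Turning three model outputs into one ticket per camp is a majority vote, and comparing the tickets is a simple diff. Here is a hedged Python sketch of that idea. The function names `camp_ticket` and `ticket_diff` are hypothetical, and the per-model picks for Match 1 are placeholders: the results above only give the camp-level call (3) for that match. The Match 9 picks are as reported.

```python
from collections import Counter

CAMPS = {
    "China AI": ["Kimi", "Doubao", "Qianwen"],
    "US AI": ["ChatGPT", "Gemini", "Claude"],
}

# Keys are match numbers. 3 = home win, 1 = draw, 0 = away win.
# Match 1 per-model picks are ASSUMED (only the camp-level call is known).
picks = {
    1: {"Kimi": 3, "Doubao": 3, "Qianwen": 3,
        "ChatGPT": 3, "Gemini": 3, "Claude": 3},
    9: {"Kimi": 3, "Doubao": 3, "Qianwen": 3,
        "ChatGPT": 1, "Gemini": 1, "Claude": 1},
}


def camp_ticket(preds, models):
    """Pick the most common code among `models` for each match.

    With three models a 1-1-1 tie is possible; most_common() then returns
    the first code encountered, so ties would need a manual decision.
    """
    ticket = {}
    for match, by_model in preds.items():
        votes = Counter(by_model[m] for m in models)
        ticket[match] = votes.most_common(1)[0][0]
    return ticket


def ticket_diff(a, b):
    """Return the matches on which two tickets disagree."""
    return [match for match in a if a[match] != b[match]]


china = camp_ticket(picks, CAMPS["China AI"])
us = camp_ticket(picks, CAMPS["US AI"])
print(china)                   # {1: 3, 9: 3}
print(us)                      # {1: 3, 9: 1}
print(ticket_diff(china, us))  # [9]

cost = 2 * len(CAMPS)  # ¥2 per ticket, one ticket per camp
print(cost)            # 4
```

The diff returning a single match number is the whole experiment in one line: everything else on the two tickets is identical.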

That’s the experiment now. Results on June 24.


Which camp would you trust — the Chinese models or the American ones? And do you think the 12 unanimous calls will hold up?


This is just Round 1. When the World Cup reaches the quarterfinals, I’ll run the same experiment again — but this time every remaining team will actually be good. No mismatches. No easy calls. That’s when China AI and US AI will have to make real decisions under real pressure.

Follow along if you want to see who gets it right. 👇

