Some of you may have already seen it, but today I want to tell you about the statistical model we have built to predict the World Cup: 100,000 simulations every day to forecast the tournament, match by match.
It has been running since Sunday and we can already share some curious details.
I speak in the plural because I write today's newsletter with Borja Andrino, the model's other father.
We start with the results and then talk about the method.
These are currently our favorites to win the tournament (updated every day here):
Brazil remains the first candidate, with a 26% chance of taking the trophy.
Readers of this newsletter are already trained to think in probabilities, but remember the paradox: Brazil is the team most likely to win, yet the model also warns that it is more likely (74%) that some other team takes the trophy.
Spain is third, fourth or fifth, with a 92% chance of being in the round of 16 and a 10% chance of winning the World Cup.
📉 1. How have the forecasts changed this week?
There have been many confirmations and two surprises.
This is clear in the following graph, which shows each team's chances of reaching the round of 16.
The favorites that won their first games (Brazil, France, the Netherlands, England, Spain and Portugal) have seen their chances rise, while the two teams that suffered upset defeats, Argentina and Germany, have fallen sharply.
Why doesn't Spain have a better chance than England?
Although our model believes that Spain has a slightly better team than England, it also thinks that Spain's group, with Germany and Japan, is more difficult.
🎯 2. How accurate have we been?
Only 19 games have been played so far, so it's hard to draw any firm conclusions about the model.
But, for now, its hit rate is in line with expectations.
While designing the model we correctly called 50-60% of matches, which is more or less what has happened so far.
The graph shows the 19 games with their results and our forecasts.
The biggest surprise?
Argentina's defeat against Saudi Arabia, which had only a 6% chance but happened anyway.
An interesting detail is that our model is updated with the results: it learns from them.
Saudi Arabia's victory made it adjust: it now thinks the Asian side is better than expected and the South Americans worse, to the point that if the match were replayed today, the model would put the probability of an Argentina defeat at 9% rather than 6%.
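To get a feel for how a rating "learns" from an upset, here is a minimal sketch of a standard Elo update. It is an illustration only, not our actual model: the two ratings and the K-factor are made-up values, and the Elo expected score counts a draw as half a win, so it is not the same thing as the defeat probability quoted above.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of team A against team B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, score_a, k=40.0):
    """Return both ratings after one match.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a defeat.
    k is an assumed K-factor, chosen here only for illustration.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical ratings for a strong favorite and a clear underdog.
favorite, underdog = 2100.0, 1650.0

before = expected_score(underdog, favorite)

# The favorite loses, so its score is 0.0.
favorite, underdog = update(favorite, underdog, score_a=0.0)

after = expected_score(underdog, favorite)

print(f"Underdog's expected score before the upset: {before:.3f}")
print(f"Underdog's expected score after the upset:  {after:.3f}")
```

The bigger the surprise, the bigger the adjustment: a result the rating already expected barely moves either team.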
A Twitter user has organized a small tournament of models, where you can compare ours with others.
At the moment we are sixth, ahead of some bookmakers, but most likely that is because we have had a bit of luck.
🔮 3. How does the model work?
What we do is measure the strength of each team with two good metrics, and then we simulate the entire tournament, game by game, thousands of times.
The model has three parts (which we detail in the methodology):
How strong is each team?
To capture this we use two metrics: their recent results (measured with an Elo ranking, a method that originated in chess and is now also used in the official FIFA ranking) and the quality of their players (measured by their value in euros, with data from the Transfermarkt website).
Who wins each match?
We have trained a model with thousands of matches to, given two teams and their strength metrics, estimate how likely each outcome is.
The model gives the probability of a win, a draw and a defeat, and even that of each scoreline.
For example, in a hypothetical match between Brazil and Saudi Arabia, the most likely scorelines are 2-0 and 3-0, at around 14% each.
And to predict the complete World Cup?
What we do is simulate it match by match.
We repeat this thousands of times, to have 100,000 possible World Cups, and thus be able to estimate the probability of each event.
If Brazil wins the tournament in 26,000 out of 100,000 simulations, it is because it has a 26% chance.
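The counting idea can be sketched in a few lines of Python. This is a toy version under loud assumptions: four hypothetical teams (A to D) with invented strengths, a two-round knockout, and a simple rule where each team's chance of winning a match is proportional to its strength. Our real model uses Elo and squad value and predicts full scorelines; only the "simulate many times and count" step is the same.

```python
import random
from collections import Counter

# Hypothetical strengths for four made-up teams (not the model's values).
STRENGTH = {"A": 4.0, "B": 2.5, "C": 2.0, "D": 1.0}


def play(team_a, team_b, rng):
    """Pick a winner; each team's chance is proportional to its strength."""
    p_a = STRENGTH[team_a] / (STRENGTH[team_a] + STRENGTH[team_b])
    return team_a if rng.random() < p_a else team_b


def simulate_tournament(rng):
    """Simulate two semifinals and a final, and return the champion."""
    finalist_1 = play("A", "D", rng)
    finalist_2 = play("B", "C", rng)
    return play(finalist_1, finalist_2, rng)


def title_probabilities(n_sims=100_000, seed=0):
    """Estimate each team's title chance as the share of simulations it wins."""
    rng = random.Random(seed)
    wins = Counter(simulate_tournament(rng) for _ in range(n_sims))
    return {team: wins[team] / n_sims for team in STRENGTH}


probs = title_probabilities()
for team, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{team}: {p:.1%}")
```

With 100,000 runs the estimates are stable to within a few tenths of a percentage point, which is why a figure like 26% can be read straight off the count.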
On our website you can even generate your own simulations.
Here in the video you have three.
It's a fantastic exercise to really feel the uncertainty surrounding the outcome of the tournament: we can give odds of 26% to Brazil and 3% to Japan—very different from flipping a coin—but at the same time it's true that we don't know who will win.
⏱️ 4. Does it matter that matches are so long?
If you're watching the World Cup, you'll know that a lot more time is being added to matches than usual.
Referees are adding 10, 12 and even up to 20 minutes.
As David Álvarez recounted, it is neither a surprise nor a coincidence: Pierluigi Collina, the president of the FIFA Referees Committee, had already announced that we should expect 100-minute matches.
But what effect does this have on the game?
The logical assumption is that it will help the best teams, because they will have more time to impose their quality.
Our model suggests the same.
We simply simulated a World Cup with matches lasting 100 minutes instead of 90, with these results: Brazil goes from winning 26% of the time to winning 27%.
The rest changes little.
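The intuition can be checked with a small sketch. It assumes, purely for illustration, that each team's goals follow independent Poisson distributions and that a longer match simply scales the expected goals by minutes/90; the goal rates below are invented, and this is not necessarily how our model handles match length.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number is lam."""
    return exp(-lam) * lam ** k / factorial(k)


def outcome_probs(rate_a, rate_b, minutes=90, max_goals=15):
    """Return (win, draw, loss) probabilities for team A.

    rate_a and rate_b are assumed expected goals per 90 minutes.
    Scorelines above max_goals are ignored; their probability is negligible.
    """
    scale = minutes / 90.0
    lam_a, lam_b = rate_a * scale, rate_b * scale
    win = draw = loss = 0.0
    for goals_a in range(max_goals + 1):
        for goals_b in range(max_goals + 1):
            p = poisson_pmf(goals_a, lam_a) * poisson_pmf(goals_b, lam_b)
            if goals_a > goals_b:
                win += p
            elif goals_a == goals_b:
                draw += p
            else:
                loss += p
    return win, draw, loss


# Hypothetical favorite (1.8 goals per 90 minutes) against an underdog (0.9).
normal = outcome_probs(1.8, 0.9, minutes=90)
longer = outcome_probs(1.8, 0.9, minutes=100)

print(f"Favorite wins in 90 minutes:  {normal[0]:.1%}")
print(f"Favorite wins in 100 minutes: {longer[0]:.1%}")
```

Under these assumptions the extra minutes shave a little off the probability of a draw and hand most of it to the favorite, which fits the small shift we see for Brazil.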
🙏 5. Help us
Many people are following the model's predictions, but the more of us there are, the better.
You can share the page with the updated predictions, in Spanish and English.
There you will find the methodology, the simulator and also predictions from other sources, such as Opta, Metaculus or the betting markets.
Forward this newsletter or, if you are not subscribed, sign up yourself.
It is an exclusive newsletter for EL PAÍS subscribers, but anyone can receive it for a trial month.
You can also follow me on Twitter, or write to me with tips or comments.