What do the polls say in Mexico? Sheinbaum has an 89% chance of winning

2024-03-17T05:19:10.027Z

Highlights: EL PAÍS has built a prediction model similar to the one it used in the 2018 elections and in some twenty other elections in different countries. The model works in three steps: (1) we start from the polling average, (2) we add a degree of uncertainty around it, and (3) we simulate the election 20,000 times to calculate the probability of each outcome. Our prediction gives Claudia Sheinbaum an 89% chance of victory, but her rival Xóchitl Gálvez retains a one-in-ten chance of an upset.

The first electoral prediction by EL PAÍS makes the ruling-party candidate the firm favorite, although the opposition's Xóchitl Gálvez retains a one-in-ten chance of a surprise win.

The EL PAÍS polling average places Claudia Sheinbaum as the favorite to win the presidency of Mexico, with close to 60% of the estimated vote, well ahead of Xóchitl Gálvez (36%) and Jorge Álvarez Máynez (5%).

Three months before the vote, Morena's candidate is the firm favorite.

But what are her chances of winning?

To answer that, we have built a prediction model similar to the one EL PAÍS used in the 2018 elections and in some twenty other elections in different countries.

As explained in the methodology, the model works in three steps: (1) we start from the polling average, (2) we add a degree of uncertainty around it, based on the historical accuracy of polls in Mexico, and (3) we simulate the election 20,000 times to calculate the probability of each outcome.

Our prediction gives Claudia Sheinbaum an 89% chance of victory, but her rival Xóchitl Gálvez retains a one-in-ten chance of an upset.

It is important to interpret these probabilities well.

Sheinbaum is the clear favorite, and her victory is ten times more likely than Gálvez's, but events with an 11% probability are not impossible.

A football statistic serves as a reference: Sheinbaum's defeat is more likely than seeing the first two penalties of a shootout missed.

This newspaper already published forecasts of this type in the presidential elections six years ago.

Our first prediction, in March 2018, gave López Obrador a 79% chance of winning; the last one, days before the election, raised his chances to 97%, anticipating what happened later.

Gálvez recovers ground

Since the autumn, the polls have narrowed the gap between Sheinbaum and Gálvez, from 32 points in December to 24 now.

On the one hand, Gálvez has clearly jumped forward, coinciding with the erosion of the Citizen Movement when it was announced that its nominee would be Jorge Álvarez Máynez and not Samuel García, who had seemed the favorite in November.

At the same time, since January Sheinbaum has declined slightly, dropping on average from 62% to 60% of voting intention.

It will be key to follow these trends over the coming weeks, although Sheinbaum's cushion is considerable.

The error of the polls

Models like ours turn polls into predictions by incorporating additional information: the historical accuracy of the polls.

How big are their errors?

How likely is it that they will miss by 5 or 15 points?

To find out, we have analyzed dozens of polls in Mexico and thousands from other countries.

The polls were good in the 2006 and 2012 Mexican presidential elections, but they deviated more in 2000 and 2018. Although six years ago they predicted López Obrador's victory, the truth is that they gave him six points less than what he obtained — and a six-point error, in other circumstances, can change the result.

In these four elections, the polls made an average error per candidate of 3.8 points in vote share, counting only candidates above 10%.

That is, deviations of 4 or 5 points were common and the margin of error (at 95%) was around 9 points.

The polls were better in the legislative elections of 2009, 2015 and 2021: there the average error was 2.1 points, which is a high precision, similar to that of polls in the United States or Spain.

However, applying a precautionary principle, we have used that first data point—3.8 points of error—as the basis for our prediction model.

Furthermore, our methodology also widens the uncertainty depending on the time until the vote.

Three months before the June vote, the margin of error (at 90%) is around 16 points for a candidate with around 50% of the vote, which is why Sheinbaum's chances hover around 90%.

Methodology

Predictions are produced by a statistical model based on surveys and their historical accuracy.

It is similar to those we have used in Spain in 2023 and twice in 2019, in Andalusia, Catalonia or Madrid, and also in Mexico six years ago, in France or the United Kingdom.
The model works in three steps: 1) aggregate and average the polls, 2) incorporate expected uncertainty, and 3) simulate 20,000 elections to calculate probabilities.

Step 1. Average the surveys.

Our average takes into account dozens of polls to improve its accuracy.
The data has been collected mostly by the website Oraculus.mx.
The average weights each poll according to two factors: the polling house (companies without a track record get less weight; those that do not file their data with the INE are excluded) and the date.
We want to give more weight to recent polls when calculating the average, so that on the final day only the latest poll published by each pollster matters.
For this, we assign weights to the polls according to an exponential decay law.
We also define an exclusion band that ignores polls more than 30 days old.
In addition, we penalize repeated polls by the same pollster.
When calculating the average on a given date, the closest poll from each house has a weight of one, but the rest of its polls are almost ignored.
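
The weighting described in this step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the EL PAÍS code: the 30-day exclusion band comes from the text, but the half-life of the decay, the penalty factor for older polls from the same house, the field names and the example polls are all invented for the sketch.

```python
MAX_AGE_DAYS = 30        # exclusion band stated in the article
HALF_LIFE_DAYS = 10.0    # assumption: the article does not give the decay rate
REPEAT_FACTOR = 0.05     # assumption: older polls from the same house are "almost ignored"


def poll_weights(polls):
    """Return one weight per poll: house weight x exponential decay x repeat penalty."""
    # Age of the most recent eligible poll from each polling house.
    newest_age = {}
    for poll in polls:
        if poll["age_days"] <= MAX_AGE_DAYS:
            house = poll["pollster"]
            newest_age[house] = min(poll["age_days"], newest_age.get(house, MAX_AGE_DAYS))

    weights = []
    for poll in polls:
        if poll["age_days"] > MAX_AGE_DAYS:
            weights.append(0.0)  # outside the exclusion band
            continue
        weight = poll["house_weight"] * 0.5 ** (poll["age_days"] / HALF_LIFE_DAYS)
        if poll["age_days"] > newest_age[poll["pollster"]]:
            weight *= REPEAT_FACTOR  # not the latest poll from this house
        weights.append(weight)
    return weights


def weighted_average(polls):
    """Weighted mean of each candidate's share across all eligible polls."""
    weights = poll_weights(polls)
    total = sum(weights)
    if total == 0:
        raise ValueError("no eligible polls inside the exclusion band")
    candidates = polls[0]["shares"]
    return {
        name: sum(w * p["shares"][name] for w, p in zip(weights, polls)) / total
        for name in candidates
    }


# Invented polls, for illustration only (not real data).
EXAMPLE_POLLS = [
    {"pollster": "House A", "age_days": 2, "house_weight": 1.0,
     "shares": {"X": 60.0, "Y": 35.0, "Z": 5.0}},
    {"pollster": "House A", "age_days": 12, "house_weight": 1.0,
     "shares": {"X": 64.0, "Y": 31.0, "Z": 5.0}},
    {"pollster": "House B", "age_days": 5, "house_weight": 0.5,
     "shares": {"X": 58.0, "Y": 37.0, "Z": 5.0}},
    {"pollster": "House C", "age_days": 45, "house_weight": 1.0,
     "shares": {"X": 70.0, "Y": 25.0, "Z": 5.0}},
]

weights = poll_weights(EXAMPLE_POLLS)
average = weighted_average(EXAMPLE_POLLS)
```

In this toy example the 45-day-old poll is dropped, the older House A poll barely counts, and the result is dominated by the latest poll from each house.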

Averages like ours can be viewed as a consensus estimate.

Instead of relying on a single pollster, they add the criteria and hypotheses of many.

Averages reduce noise, preventing trends from jumping up and down by chance.

And above all, they have been shown to improve accuracy.

Step 2. Incorporate survey uncertainty.

This is the most complicated and most important step.
We need to estimate the expected accuracy of polls in Mexico.
How large are common errors?
How likely are errors of 3, 5, or 15 points to occur?
To answer these questions, we studied dozens of polls in Mexico and thousands internationally.

Calibrate expected errors.

First I have estimated the error of the surveys in Mexico.

I have built a database with surveys from seven elections since 2000. The mean absolute error (MAE) of the survey averages in Mexico, by candidate or party, considering those with more than 10% of the vote, has been around 3.8 points in the presidential elections and 2.2 points in the legislative elections.

That is, deviations of four or five points were common and the margin of error (95%) was around nine points.
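
The link between a mean absolute error of 3.8 points and a margin of error of around nine points can be checked with a short calculation. The sketch below assumes normally distributed errors, which is a simplification (the model itself uses heavier-tailed distributions); the function name `margin_from_mae` is ours. For a normal distribution the standard deviation equals the MAE times the square root of π/2.

```python
import math
from statistics import NormalDist


def margin_from_mae(mae, level=0.95):
    """Margin of error implied by a mean absolute error, assuming normal errors."""
    sigma = mae * math.sqrt(math.pi / 2)       # MAE -> standard deviation
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided quantile (about 1.96 at 95%)
    return z * sigma


margin_95 = margin_from_mae(3.8)
print(round(margin_95, 1))  # 9.3
```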

As seven elections are not enough to draw strong conclusions, we also reviewed around twenty elections in other Latin American countries, where the MAE rose to 4.1 points.

In the end, following a principle of caution, I have decided that our model assumes an MAE of 3.8 points in Mexico.

Furthermore, this uncertainty is modulated taking into account two additional factors: the size of the candidate/party (because it is easier to estimate a party's vote if it is around 5% than if it is close to 50%) and the proximity of the elections (because the polls at the end are almost always more accurate).

To adjust this part of the model I have used the Jennings and Wlezien database, published in Nature, and analyzed the errors of 4,100 surveys in 241 elections in 19 Western countries.

Choice of distribution type.

To incorporate uncertainty into the vote for each candidate/party in each simulation I use a multivariate distribution.

I use Student's t distributions instead of normal ones because they have heavier tails (higher kurtosis), which makes very extreme events more likely.

The advantages of that hypothesis were explained by Nate Silver.
I have estimated the level of kurtosis with the previous database.
Then I define the covariance matrix of these distributions so that the sum of the votes does not exceed 100% (an idea from Chris Hanretty).
Finally, the amplitude of the covariance matrices must be scaled so that the resulting vote distributions have the expected MAE and standard deviation according to the calibration.
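
A minimal sketch of this idea, using only the Python standard library, might look as follows. It is a simplified stand-in and not the model's actual code. The errors follow a multivariate Student's t built from centered normal draws divided by a shared chi-square factor, so they always cancel out across candidates, and the scale is calibrated by simulation to reach the target MAE. The degrees of freedom (`DF = 5`) and the function names are assumptions, and the adjustment for candidate size is left out.

```python
import math
import random

DF = 5            # assumption: the article does not state the degrees of freedom
TARGET_MAE = 3.8  # MAE the article adopts for Mexico


def zero_sum_t_errors(rng, n, df, scale):
    """Heavy-tailed polling errors for n candidates that sum to zero."""
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(z) / n  # centering makes the errors cancel out
    # One chi-square draw shared by all candidates turns the normals into a multivariate t.
    shrink = math.sqrt(rng.gammavariate(df / 2.0, 2.0) / df)
    return [scale * (v - mean) / shrink for v in z]


def calibrate_scale(target_mae, n, df, draws=20_000, seed=1):
    """Find the scale at which the simulated mean absolute error hits target_mae."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        total += sum(abs(e) for e in zero_sum_t_errors(rng, n, df, 1.0)) / n
    return target_mae / (total / draws)  # MAE is proportional to the scale


scale = calibrate_scale(TARGET_MAE, n=3, df=DF)
errors = zero_sum_t_errors(random.Random(0), 3, DF, scale)
```

Because the errors sum to zero, adding them to the polling average leaves the total vote unchanged, which is one simple way of keeping the simulated shares from adding up to more than 100%.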

Step 3. Simulate.

The last step is to run the model 20,000 times.
Each iteration is a simulation of the elections with vote percentages that vary according to the distribution defined in the previous step.
The results in these simulations allow us to calculate the probabilities that each candidate has of being the most voted and achieving the presidency.
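
As an illustration, the simulation loop can be sketched like this. The vote shares are the polling average quoted in the article; the error model is a simplified zero-sum Student's t draw, and the `scale`, `df` and `seed` values are assumptions chosen for the example, so the output should not be expected to reproduce the published 89%.

```python
import math
import random

N_SIMS = 20_000  # number of simulated elections, as in the article
# Polling average quoted in the article (shares in percent).
AVERAGE = {"Sheinbaum": 60.0, "Gálvez": 36.0, "Álvarez Máynez": 5.0}


def simulate_once(rng, average, scale, df):
    """One simulated election: polling average plus heavy-tailed, zero-sum errors."""
    names = list(average)
    z = [rng.gauss(0.0, 1.0) for _ in names]
    mean = sum(z) / len(z)
    shrink = math.sqrt(rng.gammavariate(df / 2.0, 2.0) / df)
    # Shares are not clipped at zero; that does not affect who comes first here.
    return {name: average[name] + scale * (v - mean) / shrink
            for name, v in zip(names, z)}


def win_probabilities(average, scale, df, n_sims=N_SIMS, seed=2024):
    """Share of simulations in which each candidate gets the most votes."""
    rng = random.Random(seed)
    wins = dict.fromkeys(average, 0)
    for _ in range(n_sims):
        result = simulate_once(rng, average, scale, df)
        wins[max(result, key=result.get)] += 1
    return {name: count / n_sims for name, count in wins.items()}


# scale=8.0 and df=5 are illustrative assumptions, not the model's calibrated values.
probs = win_probabilities(AVERAGE, scale=8.0, df=5)
for name, p in probs.items():
    print(f"{name}: {p:.1%}")
```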

Why surveys?

This model is based entirely on surveys.
There is a perception that polls are not reliable, but the truth is that polls work.
Polls are rarely perfect, but there is no alternative that is proven to be better.


Source: elparis
