How Our Chess Tournament Predictions Work

The Details

Since the 2022 Grand Prix, I have been publishing predictions for chess tournaments with super grandmasters. I am interested in understanding who is likely to win, and what the probability is for each player. We can also answer other interesting questions, such as, “How likely is the Candidates Tournament to go to tiebreaks?” (22%!) So what goes into making these predictions? Three simple steps:

Predicting the outcome of individual games
Programming the tournament format
Simulating the tournament 10,000+ times

Predicting Individual Games

Every chess tournament is fundamentally the same - we take the results of many individual chess games and we calculate the total points scored for each player to determine the winner. So, the first step in predicting tournament outcomes is creating a good mechanism to predict individual games.

I built a machine learning model to predict the results of chess games given the Elo for the white player and the Elo for the black player. This works really well because the model captures important relationships, like white having an advantage over black and draws becoming more likely as Elo increases. Both of these relationships are critical to model correctly when predicting tournament outcomes. I wrote an article called Elo Ratings v. Machine Learning earlier in the year, in case you’d like some more details. The model's output shows draw probability rising as Elo increases; at the super GM level, the draw rate is about 55%.
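To make the input and output of this step concrete, here is a minimal sketch of a game predictor. It is not the machine learning model itself: it is a stand-in that assumes the classic Elo expected-score formula, a hypothetical rating bonus for white (`white_bonus`), and a fixed draw rate (`draw_rate`, defaulting to the roughly 55% super GM figure mentioned above). The function name and both parameters are illustrative assumptions.

```python
def game_probabilities(white_elo, black_elo, white_bonus=35.0, draw_rate=0.55):
    """Return (white_win, draw, black_win) probabilities for one game.

    Stand-in model, not the article's ML model:
    - white_bonus is a hypothetical Elo bonus for moving first
    - draw_rate is a fixed draw share (assumed, not fitted)
    """
    # Classic Elo expected score for white, after adding the first-move bonus.
    diff = white_elo + white_bonus - black_elo
    expected_white = 1.0 / (1.0 + 10 ** (-diff / 400.0))
    expected_black = 1.0 - expected_white

    # Expected score = P(win) + 0.5 * P(draw), so cap the draw share
    # so that neither win probability can go negative.
    draw = min(draw_rate, 2 * expected_white, 2 * expected_black)
    white_win = expected_white - draw / 2
    black_win = expected_black - draw / 2
    return white_win, draw, black_win


white_win, draw, black_win = game_probabilities(2799, 2787)
print(f"white {white_win:.3f}, draw {draw:.3f}, black {black_win:.3f}")
```

Unlike the real model, this stand-in keeps the draw rate constant instead of letting it grow with Elo, so its numbers will differ from the published predictions.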

Programming The Tournament Format

Once we can predict individual games, the next step is to write code to collect information about the players (i.e., names and Elos), the tournament schedule, and the tiebreak procedures. This is a relatively straightforward task, though each tournament usually has a slightly different format that requires its own code. For example, the Grand Prix used a pool-play system and then moved to a knockout, whereas the Candidates Tournament is a double round robin. Each tournament also has different tiebreak procedures that must be considered in the code.
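As a rough illustration of what "programming the format" means for a double round robin, the sketch below lists every game as a (white, black) pairing. The function name and the placeholder player names are assumptions for illustration; it does not assign games to rounds or implement any tiebreak rules.

```python
from itertools import permutations


def double_round_robin(players):
    """List every (white, black) pairing.

    Each pair of players meets twice, once with each color.
    """
    return list(permutations(players, 2))


# Placeholder names; a real run would use the actual field.
players = [f"Player {i}" for i in range(1, 9)]
games = double_round_robin(players)
print(len(games))  # 8 players * 7 opponents = 56 games
```

A pool-play or knockout event would need a different schedule function, which is why each format ends up with its own code.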

The code for the simulations is available on my GitHub, free for you to use and explore yourself! For example, here is the code to simulate the Candidates Tournament in Python.

Running A Lot Of Simulations

More code! In Python. On my GitHub for you to see.

Once we can simulate the results of a tournament, we want to run that simulation over and over to get a sense of the probability of each outcome. I usually run 50,000 (or more) simulations, which generally keeps the sampling error on each probability to a few tenths of a percentage point (assuming the model for individual games is accurate). That model is not perfect, but it serves as a good approximation.
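To get a feel for how much simulation noise remains after a given number of runs, the binomial standard error is a handy rule of thumb. The sketch below is a general statistical check, not something taken from the simulation code; the function name is an assumption.

```python
import math


def standard_error(p, n_simulations):
    """Standard error of a probability estimated from independent simulations."""
    return math.sqrt(p * (1.0 - p) / n_simulations)


# Noise on an outcome that happens in about a quarter of the simulations.
for n in (10_000, 50_000, 200_000):
    print(n, f"{standard_error(0.25, n):.4%}")
```

The noise shrinks with the square root of the number of runs, so quadrupling the simulations only halves it. This measures simulation noise only; it says nothing about errors in the underlying game model.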

For example, in round four of the 2022 Candidates Tournament when Ding Liren (white, 2799) faces off against Caruana (black, 2787), there is a 25.4% chance Ding Liren wins, a 57.1% chance of a draw, and a 17.5% chance Caruana wins, according to the model.
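Those three numbers are all a simulation needs for this game. As a minimal sketch (the function name and the seed are assumptions, not the code from the repository), one simulated game is a single weighted random draw:

```python
import random

# Probabilities quoted above for Ding Liren (white) vs. Caruana (black).
DING_WIN, DRAW, CARUANA_WIN = 0.254, 0.571, 0.175


def sample_game(p_white, p_draw, p_black, rng):
    """Return (white_points, black_points) for one simulated game."""
    outcome = rng.choices(
        ["white", "draw", "black"], weights=[p_white, p_draw, p_black]
    )[0]
    points = {"white": (1.0, 0.0), "draw": (0.5, 0.5), "black": (0.0, 1.0)}
    return points[outcome]


# Expected points for white: 0.254 + 0.5 * 0.571 = 0.5395
expected_white_points = DING_WIN + 0.5 * DRAW

rng = random.Random(2022)  # arbitrary seed so the run is repeatable
results = [sample_game(DING_WIN, DRAW, CARUANA_WIN, rng) for _ in range(10_000)]
draw_share = sum(1 for r in results if r == (0.5, 0.5)) / len(results)
print(f"expected white points: {expected_white_points:.4f}")
print(f"simulated draw share: {draw_share:.3f}")
```

Over many draws the simulated draw share should settle close to the 57.1% the model gives.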

Back to the simulations! Once we have our 50,000 simulated tournaments, we can inspect the results for interesting things. Some topics I’ve been curious about for the 2022 Candidates Tournament:

Most obviously, what is the probability of each player winning after each round?
What is the average winning score?
How often does the winner of the tournament lose any of their games with black?
What are the chances the event goes to tiebreaks?
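Questions like these come down to counting across the simulated tournaments. The sketch below ties the pieces together for a double round robin under loud assumptions: the game model is a placeholder (Elo expected score with a fixed 55% draw rate and a hypothetical 35-point bonus for white), the player names and ratings are made up, and a tie for first is split evenly rather than resolved by a tiebreak. It is not the code from the repository.

```python
import random
from collections import Counter
from itertools import permutations


def game_probabilities(white_elo, black_elo, white_bonus=35.0, draw_rate=0.55):
    # Placeholder model (assumed): Elo expected score with a fixed draw rate.
    diff = white_elo + white_bonus - black_elo
    expected = 1.0 / (1.0 + 10 ** (-diff / 400.0))
    draw = min(draw_rate, 2 * expected, 2 * (1 - expected))
    return expected - draw / 2, draw, (1 - expected) - draw / 2


def simulate_once(elos, rng):
    """Play one double round robin and return each player's total points."""
    points = dict.fromkeys(elos, 0.0)
    for white, black in permutations(elos, 2):
        weights = game_probabilities(elos[white], elos[black])
        outcome = rng.choices(("white", "draw", "black"), weights=weights)[0]
        if outcome == "white":
            points[white] += 1.0
        elif outcome == "black":
            points[black] += 1.0
        else:
            points[white] += 0.5
            points[black] += 0.5
    return points


def summarize(elos, n_simulations=2_000, seed=0):
    """Estimate win probabilities, average winning score, and tie frequency."""
    rng = random.Random(seed)
    wins = Counter()
    ties_for_first = 0
    total_winning_score = 0.0
    for _ in range(n_simulations):
        points = simulate_once(elos, rng)
        best = max(points.values())
        leaders = [name for name, score in points.items() if score == best]
        total_winning_score += best
        if len(leaders) > 1:
            ties_for_first += 1
        # Assumption: a shared first place is split evenly between the leaders.
        for name in leaders:
            wins[name] += 1.0 / len(leaders)
    return {
        "win_probability": {name: wins[name] / n_simulations for name in elos},
        "average_winning_score": total_winning_score / n_simulations,
        "tiebreak_probability": ties_for_first / n_simulations,
    }


# Made-up field: placeholder names and ratings, for illustration only.
elos = {f"Player {i}": 2750 + 7 * i for i in range(1, 9)}
summary = summarize(elos)
print(summary["average_winning_score"], summary["tiebreak_probability"])
```

To update predictions after each round, the same loop would start from the current standings and simulate only the remaining games. Tracking whether the winner lost a game with black would need `simulate_once` to record individual results as well as totals.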

All Models Are Wrong, Some Are Useful

No model attempting to predict the outcome of a future event is perfect. Anyone claiming otherwise is going to be selling you snake oil very soon! But, some models do a good job of capturing an approximation of the real world. As humans, we understand that Ding is more likely to win with white against Caruana than the other way around, and the model can help us approximate that difference.

As of round three, my predictions say that both Nepo and Caruana have a 25% chance to win the event. So if Firouzja wins, does that mean my model was wrong? No, as it stands today, I believe it’s approximately true that they each have a 25% chance of winning, but that doesn’t mean it will turn out that way - in fact, it also means there is a 75% chance that Caruana will not win the tournament!

So, I claim the model and my simulations are useful, but there are many ways I could improve them. If you have more suggestions to improve the simulations or model, please let me know on Twitter or Reddit.

Future improvements

How drawish each player is individually (ahem, Drawdjabov!)
Previous head-to-head results
How good each player is with white and black
Considering the tournament situation in individual game predictions (e.g., sometimes a player may need a win so they give up drawing chances for a small increase in winning chances)

Hopefully this helps you understand the simulations I publish, and please feel free to ask questions if you have them.