Chess2u

Ranking of evaluation functions

Adriaan de Groot showed that an ordinary player and a grandmaster calculate chess variations to much the same depth: in both cases a little more than 3 moves ahead. What differs is their ability to evaluate the position on the board.

This inspired me to test how much the quality of evaluation differs between chess engines. I fixed the analysis depth at 6 plies and ran matches between the engines. As a reference I took LC0 with the Maia 1900 network, which is trained on the games of players rated 1900 ELO. Here are the results:


No  Engine                                         ELO
 1  LC0 384x30-t60-4485 (sergio-V)                 2650
 2  LC0 hanse-69722-vf2 (lczero.org)               2625
 3  LC0 384x30-2021_0518_1740_16_793 (lczero.org)  2600
 4  LC0 192x15-2021_1016_0414_39_071 (lczero.org)  2500
 5  LC0 J13B.2-178 (jhorthos 320x24)               2450
 6  LC0 256x20-t40-1541 (sergio-V)                 2400
 7  LC0 LS15-20x256SE-jj-9-75000000 (leelenstein)  2350
 8  LC0 128x10-2021_0726_2120_38_663 (lczero.org)  2300
 9  LC0 11258-96x8-se-5 (dkappe)                   2150
10  LC0 11258-64x6-se (dkappe)                     2100
11  Rybka 2.3.2a                                   2000
12  LC0 Maia 1900 (maiachess.com)                  1900
13  Komodo 3                                       1750
14  Hiarcs 11.2                                    1700
15  Komodo 7                                       1650
16  Houdini 1.5a                                   1600
17  Komodo 1                                       1600
18  ProDeo 2.6 (Rebel)                             1500
19  Komodo 12                                      1500
20  Stockfish 4                                    1500
21  Stockfish 1                                    1450
22  Stockfish 14 NNUE                              1450
23  SlowChess 2.7                                  1450
24  Fire 8 NNUE                                    1350
25  Stockfish 7                                    1300
26  Stockfish 11                                   1250

The top places all go to LC0, and Stockfish is at the bottom. The strongest engine with a classical evaluation is Rybka.

In recent years, developers have focused on increasing the depth of analysis. The evaluation function has been simplified to speed it up, and as a result the accuracy of evaluation has been decreasing. NNUE brought some improvement, although the table shows that at this fixed depth it didn't help much.

The only exception is the LC0 project. The results show that the larger the network, the stronger the engine plays. Maybe an even higher ELO would be possible if someone trained a 520x40 net on the games of grandmasters rated above 2700. The unsupervised learning process for such a network would be very long. How about using ready-made data instead? Or games played by Stockfish?

On the other hand, Stockfish could be improved by using the evaluation function from LC0. Stockfish has the best search function. Combining the two (best evaluation function and best search function) could result in a gain of hundreds of ELO points. Of course, I'm not the first to think of this, but this test shows how big the difference in evaluation quality between LC0 and Stockfish is. It is worth thinking about how to reduce this gap.

Finally, a practical note. With LC0 and a suitable network, you can get a pretty good sparring partner. To reduce the playing strength further, you also need to reduce the depth of analysis:

LC0 Maia 1600, 5 plies: 1550 ELO
LC0 Maia 1300, 4 plies: 1200 ELO

Re: Ranking of evaluation functions

Really good information,
thank you, my dear chess friends.

best regards,
mi.

Re: Ranking of evaluation functions

Stockfish 14.1 - Stockfish 14 +23-7=10 TP = +147 ELO

Stockfish 14 - Rybka 2.3.2a +1-38=1 TP = -564 ELO
Stockfish 14.1 - Rybka 2.3.2a +5-31=4 TP = -269 ELO

Stockfish 14 - Houdini 1.5a +9-25=6 TP = -147 ELO
Stockfish 14.1 - Houdini 1.5a +19-15=6 TP = +35 ELO

Stockfish 14 - LC0 Maia 1900 +4-31=5 TP = -285 ELO
Stockfish 14.1 - LC0 Maia 1900 +15-16=9 TP = -9 ELO

Stockfish 14 - LC0 11258-64x6-se +3-34=3 TP = -359 ELO
Stockfish 14.1 - LC0 11258-64x6-se +4-31=5 TP = -285 ELO

Stockfish 14 - LC0 128x10-2021_0726_2120_38_663 +0-40=0 TP = -800 ELO
Stockfish 14.1 - LC0 128x10-2021_0726_2120_38_663 +2-35=3 TP = -407 ELO

Undoubtedly, great progress has been made.
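
For anyone who wants to check the TP figures: the post does not say how they were computed, but the values listed above are consistent with the standard logistic Elo formula applied to the match score, so that is the assumption in this small sketch. The function name `elo_diff` is only illustrative. A 0/40 score has no finite value under this formula, so the -800 shown for that match is presumably a cap applied by the tournament software.

```python
import math


def elo_diff(wins: int, losses: int, draws: int) -> float:
    """Elo difference implied by a match score (logistic model).

    Assumption: TP = 400 * log10(p / (1 - p)), where p is the fraction
    of points scored (a win counts 1, a draw 0.5).
    """
    games = wins + losses + draws
    if games == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / games
    # A 0% or 100% score has no finite Elo difference under this model.
    if score == 0.0:
        return -math.inf
    if score == 1.0:
        return math.inf
    return 400.0 * math.log10(score / (1.0 - score))


# Three of the match results quoted above, as (wins, losses, draws).
for w, l, d in [(23, 7, 10), (1, 38, 1), (19, 15, 6)]:
    print(f"+{w}-{l}={d}  TP = {elo_diff(w, l, d):+.0f} ELO")
```

With only 40 games per match, the error margins on these figures are wide, so small differences should not be over-interpreted.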
