Stefan Pohl Computer Chess

private website for chessengine-tests


LC0 / Neural Nets versus Stockfish testing

 

Playing conditions:

 

Hardware: i7-8750H 2.6GHz (Hexacore) Notebook, RTX 2060 GPU, Windows 10 64bit, 16GB RAM

Speed:  Stockfish (running on 11 hyperthreading-threads, Intel Turbo-Mode off): 9000 kn/s, Lc0 (with old 32930 20x256 net): 16000 n/s in starting position.

Hash / NN Cache: 4096 MB Hash for Stockfish / 5000000 NN-Cachesize for Lc0

GUI: Cutechess-cli (the GUI ends the game when a 5-piece endgame is on the board)

Tablebases: None for engines, 5 Syzygy for cutechess-cli

Openings: NBSC Advanced Armageddon Noomen 3-moves (250 openings).  Learn more about Advanced Armageddon in the "NBSC Armageddon openings"- section and download the NBSC-Armageddon package right here

Ponder, Large Memory Pages & learning: Off

Thinking time: Lc0 2'+1'' and Stockfish 3'+1.5'' (which means a perfect Leela-Ratio of 1.0). Average game duration: 8 minutes; one 500-game testrun takes around 2.5 days.
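
As a rough cross-check of the Leela-Ratio mentioned above, here is a minimal sketch. It assumes the commonly quoted definition, which is not stated on this page: Lc0 nodes divided by Stockfish nodes, multiplied by 875 (roughly the AlphaZero vs. Stockfish 8 speed relation of 80 kn/s to 70000 kn/s). The function name and the scaling by thinking time are assumptions made for illustration.

```python
def leela_ratio(lc0_nps, sf_nps, lc0_time=1.0, sf_time=1.0, reference=875.0):
    """Leela-Ratio under the ASSUMED definition: (Lc0 nodes / Stockfish nodes) * 875.

    lc0_time and sf_time only matter as a ratio, so base times in minutes work.
    """
    nodes_ratio = (lc0_nps * lc0_time) / (sf_nps * sf_time)
    return nodes_ratio * reference


# Speeds from the "Speed" line: Lc0 16000 n/s, Stockfish 9000 kn/s.
# Times from the "Thinking time" line: 2 min vs. 3 min
# (the increments 1 s vs. 1.5 s have the same 2:3 relation).
ratio = leela_ratio(16_000, 9_000_000, lc0_time=2.0, sf_time=3.0)
print(f"Leela-Ratio: {ratio:.2f}")
```

Under these assumptions the result comes out close to 1.0, which fits the time handicap chosen for the testruns.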

 

Each Lc0 / Neural Net plays 500 games vs. Stockfish with my new NBSC Advanced Armageddon openings. After the testrun is finished, all games are rescored with my armageddonize_advanced-tool. This means:

Win for white = 1 point for white
Draw = 1 point for black
Win for black = 2 points for black 
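
The three scoring rules above can be written down as a small function. This is only an illustrative sketch of the scoring, not the armageddonize_advanced-tool itself; the function names are hypothetical.

```python
def armageddon_points(result):
    """Return (white_points, black_points) for one PGN result string."""
    if result == "1-0":       # win for White = 1 point for White
        return (1, 0)
    if result == "1/2-1/2":   # draw = 1 point for Black
        return (0, 1)
    if result == "0-1":       # win for Black = 2 points for Black
        return (0, 2)
    raise ValueError(f"unfinished or unknown result: {result!r}")


def score_games(results):
    """Sum the Advanced Armageddon points over a list of result strings."""
    white_total = black_total = 0
    for result in results:
        white, black = armageddon_points(result)
        white_total += white
        black_total += black
    return white_total, black_total


# One white win, one draw, one black win.
print(score_games(["1-0", "1/2-1/2", "0-1"]))
```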

 

Learn more about my new NBSC Advanced Armageddon openings and the advanced scoring system in the "NBSC Armageddon openings"- section.

Learn more about Lc0 (getting started in a GUI, links to net downloads, FAQs, development information and the Leela-Blog) here

 

 

Latest update: 2020/07/05 Lc0 0.25.1 sv-1810. Next testrun Lc0 0.25.1 LS 15 with Kayra4 parameter setting.

 

Download all played games (non-armageddonized) here

 

 

500 NBSC-Advanced-Armageddon games each testrun (= a win for Black is 2 points for Black and a draw is a 1-point win for Black) vs. Stockfish 200418 (SPCC-Elo: 3568, Contempt set to 0; around +14 Elo stronger than Stockfish 11, SPCC-Elo: 3554).

The error bar of each result is +/- 20 Elo. But note that my NBSC-Armageddon openings spread the Elo results around 2.25x wider than classical openings do(!), so with classical openings you would need an error bar of +/- 9 Elo for the same statistical quality of the results (= the rankings of the Lc0 nets here). And for an error bar of +/- 9 Elo, you need around 3000 games, not 500, which means 6x more games (and 6x more PC time)!!
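
The order of magnitude of this estimate can be checked with a rough sketch. It assumes that the error bar shrinks with the square root of the number of games, which is a standard statistical approximation and not a formula taken from this page; the function name is hypothetical.

```python
def games_needed(games_now, errorbar_now, errorbar_target):
    """Games needed for a smaller error bar, assuming error ~ 1 / sqrt(games)."""
    return games_now * (errorbar_now / errorbar_target) ** 2


spread = 2.25                     # Elo spread factor quoted above
target_errorbar = 20 / spread     # about +/- 9 Elo
needed = games_needed(500, 20, target_errorbar)
print(f"target error bar: +/- {target_errorbar:.1f} Elo")
print(f"games needed: about {needed:.0f} ({needed / 500:.1f}x more)")
```

This simple rule gives a factor of about 5, in the same ballpark as the roughly 6x quoted above.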

Learn more about that revolution in computer chess in the "NBSC Armageddon openings"- section of my website.

 

#  Engine (net size)                : Elo  Games (wins, draws, losses), Score
1  Lc0 0.24.1 LS 14.3 (20x256)      : 3644 513 (+311,=  0,-202), 60.6 %
2  Lc0 0.25.1 LS 15 (20x256)        : 3643 512 (+310,=  0,-202), 60.5 %
3  Lc0 0.24.1 LS 14.2 (20x256)      : 3633 520 (+308,=  0,-212), 59.2 %
4  Lc0 0.25.1 3972_20k_tcec (30x384): 3617 514 (+293,=  0,-221), 57.0 %
5  Lc0 0.25.1 sv-1810 (20x256)      : 3599 514 (+280,=  0,-234), 54.5 %
6  Lc0 0.25.1 t60-4175_mlh (30x384) : 3594 516 (+277,=  0,-239), 53.7 %
7  Lc0 0.25.1 t60-4175 (30x384)     : 3592 515 (+275,=  0,-240), 53.4 %
8  Lc0 0.25.1 t60-4082 (30x384)     : 3589 510 (+270,=  0,-240), 52.9 %
9  Lc0 0.25.1 t40-1541 (20x256)     : 3583 516 (+269,=  0,-247), 52.1 %
10 Lc0 0.25.1 t60-3010 (30x384)     : 3582 514 (+267,=  0,-247), 51.9 %
** Stockfish 200418 *************** : 3568 SPCC-Elo *******************
11 Allie 0.6 LS 14.3 (20x256)       : 3558 519 (+252,=  0,-267), 48.6 %
12 Lc0 0.25.1 42850 (20x256)        : 3556 522 (+252,=  0,-270), 48.3 %
13 Lc0 0.25.1 63651 (24x320)        : 3554 517 (+248,=  0,-269), 48.0 %
14 Lc0 0.25.1 63851 (24x320)        : 3552 518 (+247,=  0,-271), 47.7 %
14 Lc0 0.25.1 702820 (10x128)       : 3552 518 (+247,=  0,-271), 47.7 %
15 Lc0 0.25.1 t60-3972 (30x384)     : 3550 514 (+244,=  0,-270), 47.5 %
16 Fat Fritz 1.1 (20x256)           : 3530 523 (+233,=  0,-290), 44.6 %
17 Lc0 0.25.1 63305 (24x320)        : 3530 512 (+228,=  0,-284), 44.5 %
18 Lc0 0.25.1 32930 (20x256)        : 3483 515 (+196,=  0,-319), 38.1 %
19 Lc0 0.25.1 714646 (19x256)       : 3479 516 (+194,=  0,-322), 37.6 %
20 Lc0 0.25.1 714435 (19x256)       : 3463 517 (+183,=  0,-334), 35.4 %
21 Lc0 0.25.1 11260 (20x256)        : 3408 521 (+149,=  0,-372), 28.6 %
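
The Elo values in this list can be roughly reproduced from the win/loss counts. The sketch below assumes the standard logistic Elo formula with Stockfish 200418 fixed at 3568; it is not ORDO's actual calculation, and the function name is hypothetical.

```python
import math


def elo_from_score(wins, losses, anchor=3568):
    """Performance Elo vs. a single opponent with a fixed rating (logistic model)."""
    score = wins / (wins + losses)
    return anchor + 400 * math.log10(score / (1 - score))


# First and last entry of the list above.
print(round(elo_from_score(311, 202)))   # Lc0 0.24.1 LS 14.3
print(round(elo_from_score(149, 372)))   # Lc0 0.25.1 11260
```

Both values land within about 1 to 2 Elo of the numbers in the list.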

 

Note that the number of games is slightly too high, because the (rare) wins
of Black are doubled in the PGN file that is given to ORDO, because of
Advanced Armageddon scoring (= a win for Black is 2 points for Black).
Doubling these games is the only way to make
ORDO count a win for Black as 2 points...
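
The doubling trick can be sketched as follows. This is only a guess at the idea, not the actual armageddonize_advanced-tool: it assumes that every game in the PGN text starts with an [Event tag and carries a [Result tag, that draws are rewritten as wins for Black, and that original wins for Black are written twice.

```python
import re


def armageddonize(pgn_text):
    """Rescore PGN text: draws become 0-1, original black wins are written twice."""
    # Split in front of every [Event tag (assumed to start each game).
    games = [g.strip() for g in re.split(r'(?=\[Event )', pgn_text) if g.strip()]
    rescored = []
    for game in games:
        match = re.search(r'\[Result "([^"]+)"\]', game)
        if match is None:
            continue  # skip anything without a Result tag
        result = match.group(1)
        if result == "1/2-1/2":
            # Draw = 1 point for Black: rewrite the tag and the final result token.
            game = game.replace('[Result "1/2-1/2"]', '[Result "0-1"]')
            game = re.sub(r'1/2-1/2\s*$', '0-1', game)
            rescored.append(game)
        elif result == "0-1":
            # Win for Black = 2 points: write the game twice.
            rescored.extend([game, game])
        else:
            rescored.append(game)
    return "\n\n".join(rescored) + "\n"
```

With such a file, a rating tool that only knows 1-0, 0-1 and 1/2-1/2 still sees every point for Black as one win for Black, which is why the game counts in the list are slightly above 500.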

 


Games        : 11356 (finished)

White Wins   : 5729 (50.4 %)
Black Wins   : 5627 (49.6 %)
Draws        : 0 (0.0 %)

 

Note that this is not a rating list, but only a performance test of Lc0 with different NNs versus Stockfish, because Lc0 vs. Stockfish is definitely the most interesting head-to-head competition of NN vs. AB engines. For a real rating list including Lc0 running on an RTX GPU (with a valid Leela-Ratio of 1.0), please visit Andreas Strangmueller's excellent website. Just click here