LC0 / Neural Nets versus Stockfish testing
Playing conditions:
Hardware: i7-8750H 2.6GHz (Hexacore) Notebook, RTX 2060 GPU, Windows 10 64bit, 16GB RAM
Speed: Stockfish (running on 11 hyperthreading threads, Intel Turbo mode off): 9000 kn/s; Lc0 (with the old 32930 20x256 net): 16000 n/s in the starting position. Since version 0.26.3, Lc0 uses CUDA 11.1, which makes it around 37% faster. To keep the Leela-Ratio at 1.0, I slowed down the GPU with the Afterburner tool.
Hash / NN Cache: 4096 MB Hash for Stockfish / 5000000 NN-Cachesize for Lc0
GUI: Cutechess-cli (the GUI ends the game when a 5-piece endgame is on the board)
Tablebases: None for engines, 5 Syzygy for cutechess-cli
Openings: NBSC Advanced Armageddon Noomen 3-moves (250 openings). Learn more about Advanced Armageddon in the "NBSC Armageddon openings" section and download the NBSC-Armageddon package right here
Ponder, Large Memory Pages & learning: Off
Thinking time: Lc0 2'+1'' and Stockfish 3'+1.5'' (which gives a Leela-Ratio of 1.0). Average game duration: 8 minutes; one 500-game test run takes around 2.5 days.
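For readers who want to set up a comparable match, the conditions above could be passed to cutechess-cli roughly as follows. This is only a sketch and not the command line actually used for these tests: the engine binary names, the net file name, the opening file name, the tablebase directory and the output file are hypothetical placeholders, and playing each of the 250 openings twice with reversed colors is an assumption.

```python
def build_cutechess_args(rounds=250):
    """Return an argument list for cutechess-cli.

    All file names and paths below are hypothetical placeholders.
    """
    lc0 = [
        "-engine", "name=Lc0", "cmd=lc0",
        "tc=2:00+1",                     # 2 minutes + 1 second per move
        "option.WeightsFile=net.pb.gz",  # placeholder net file
        "option.NNCacheSize=5000000",
    ]
    stockfish = [
        "-engine", "name=Stockfish", "cmd=stockfish",
        "tc=3:00+1.5",                   # 3 minutes + 1.5 seconds per move
        "option.Threads=11",
        "option.Hash=4096",              # UCI Hash is given in MB
        "option.Contempt=0",
    ]
    common = [
        "-each", "proto=uci",
        "-openings", "file=openings.pgn", "format=pgn",
        # Tablebase adjudication by the GUI only; the engines get no tablebases.
        "-tb", "syzygy", "-tbpieces", "5",
        # Assumption: every opening is played twice with colors reversed.
        "-rounds", str(rounds), "-games", "2", "-repeat",
        "-pgnout", "games.pgn",
    ]
    return ["cutechess-cli"] + lc0 + stockfish + common


if __name__ == "__main__":
    print(" ".join(build_cutechess_args()))
```

Pondering is off unless it is requested explicitly, so no extra option is needed for it here.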
Each Lc0 / neural net plays 500 games vs. Stockfish with my new NBSC Advanced Armageddon openings. After the test run is finished, all games are rescored with my armageddonize_advanced tool. This means:
Win for white = 1 point for white
Draw = 1 point for black
Win for black = 2 points for black
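The scoring rule above can be written down in a few lines. This is a minimal sketch for illustration only, not the armageddonize_advanced tool itself; the function names are hypothetical, and the game results are assumed to be standard PGN result strings.

```python
# Points per game under the Advanced Armageddon rule described above,
# given as (white_points, black_points).
ARMAGEDDON_POINTS = {
    "1-0": (1, 0),      # win for White = 1 point for White
    "1/2-1/2": (0, 1),  # draw = 1 point for Black
    "0-1": (0, 2),      # win for Black = 2 points for Black
}


def armageddon_points(result):
    """Map one PGN result string to (white_points, black_points)."""
    return ARMAGEDDON_POINTS[result]


def rescore(results):
    """Sum the Armageddon points over a list of PGN result strings."""
    white = sum(armageddon_points(r)[0] for r in results)
    black = sum(armageddon_points(r)[1] for r in results)
    return white, black


if __name__ == "__main__":
    # One White win, one draw, one Black win.
    print(rescore(["1-0", "1/2-1/2", "0-1"]))
```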
Learn more about my new NBSC Advanced Armageddon openings and the advanced scoring system in the "NBSC Armageddon openings" section.
Learn more about Lc0 (getting started in a GUI, links to net downloads, FAQs, development information and the Leela blog) here
Latest update: 2021/03/01: Lc0 0.27.0 Bad Gyal 9XXL
Download all played games (non-armageddonized) here
Each test run consists of 500 NBSC Advanced Armageddon games (a win for Black is 2 points for Black, and a draw is a 1-point win for Black) vs. Stockfish 200418 (SPCC-Elo: 3568, Contempt set to 0; around 14 Elo stronger than Stockfish 11, SPCC-Elo: 3554).
The error bar of each result is +/- 20 Elo. Note, however, that using my NBSC Armageddon openings spreads the Elo results around 2.25x wider than using classical openings for testing, so with classical openings you would need an error bar of +/- 9 Elo for the same statistical quality of the results (= the rankings of the Lc0 nets here). And for an error bar of +/- 9 Elo you need around 3000 games, not 500, which means 6x more games (and 6x more PC time)!
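The arithmetic behind this comparison can be checked with a rough sketch. It assumes the textbook rule that an error bar shrinks with the square root of the number of games, which is not necessarily how the figures above were derived; the function names are hypothetical.

```python
import math


def equivalent_error(error, spread):
    """Error bar needed with classical openings to match a result
    whose Elo differences are spread `spread` times wider."""
    return error / spread


def games_needed(base_games, base_error, target_error):
    """Games needed to shrink the error bar from base_error to target_error,
    assuming the error bar scales with 1 / sqrt(number of games)."""
    return math.ceil(base_games * (base_error / target_error) ** 2)


if __name__ == "__main__":
    print(round(equivalent_error(20, 2.25), 1))  # about 8.9, i.e. roughly +/- 9 Elo
    print(games_needed(500, 20, 9))              # 2470 under this simple model
```

Under this simple model the result is roughly 2500 games, the same order of magnitude as the figure of around 3000 games quoted above.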
Learn more about that revolution in computer chess in the "NBSC Armageddon openings" section of my website.
# PLAYER : RATING ERROR PLAYED W L (%)
1 Lc0 0.27.0 67741 (30x384) : 3733 23 513 369 144 71.9
2 Lc0 0.26.3 66680 (30x384) : 3724 23 519 368 151 70.9
3 Lc0 0.26.3 67336 (30x384) : 3723 23 520 368 152 70.8
4 Lc0 0.26.3 67326 (30x384) : 3715 23 516 360 156 69.8
5 Lc0 0.26.3 67574 (30x384) : 3710 24 520 360 160 69.2
6 Lc0 0.26.3 66888 (30x384) : 3710 23 516 357 159 69.2
7 Lc0 0.26.3 67438 (30x384) : 3707 21 521 359 162 68.9
8 Lc0 0.26.3 J96-28 (30x384) : 3707 23 517 356 161 68.9
9 Lc0 0.26.3 67692 (30x384) : 3705 23 517 355 162 68.7
10 Ceres 0.88 66680 (30x384) : 3702 23 514 351 163 68.3
11 Lc0 0.26.3 J94-100 (30x384) : 3699 23 517 351 166 67.9
12 Lc0 0.26.3 J94-80 (30x384) : 3696 22 519 350 169 67.4
13 Lc0 0.26.3 J92-260 (30x384) : 3689 23 515 343 172 66.6
14 Lc0 0.26.3 66988 (30x384) : 3682 22 519 341 178 65.7
15 Lc0 0.26.3 J92-330 (30x384) : 3682 22 519 341 178 65.7
16 Ceres 0.80 66680 (30x384) : 3677 22 515 335 180 65.0
17 Lc0 0.26.3 J98.1-16 (30x384) : 3671 23 510 328 182 64.3
18 Ceres 0.87 66680 (30x384) : 3668 23 518 331 187 63.9
19 Lc0 0.26.3 65981 (24x320) : 3668 23 515 329 186 63.9
20 Lc0 0.26.3 67211 (30x384) : 3667 21 514 328 186 63.8
21 Lc0 1483dev J94-100 (SuFi 20) : 3666 20 515 328 187 63.7
22 Lc0 0.26.3 J92-300 (30x384) : 3665 22 513 326 187 63.5
23 Lc0 0.26.3 66309 (24x320) : 3662 21 514 324 190 63.0
24 Lc0 0.26.2 J92-130 (30x384) : 3655 21 521 324 197 62.2
25 Lc0 0.26.3 66511 (24x320) : 3650 23 514 316 198 61.5
26 Lc0 0.26.3 65536 (24x320) : 3648 22 514 315 199 61.3
27 Lc0 0.26.3 65732 (24x320) : 3648 21 514 315 199 61.3
28 Lc0 0.24.1 LS 14.3 (20x256) : 3644 23 513 311 202 60.6
29 Lc0 0.25.1 LS 15 (20x256) : 3643 22 512 310 202 60.5
30 Lc0 0.26.3 PhStein 1.2 (20x256) : 3642 23 512 309 203 60.4
31 Lc0 0.26.3 65411 (24x320) : 3641 20 519 313 206 60.3
32 Lc0 0.26.2 J92-160 (30x384) : 3635 21 511 304 207 59.5
33 Lc0 0.26.2 T60B.7-105 (24x320) : 3634 20 519 308 211 59.3
34 Lc0 0.24.1 LS 14.2 (20x256) : 3633 21 520 308 212 59.2
35 Lc0 0.25.1 LS 15 Kayra4 : 3624 22 513 297 216 57.9
36 Lc0 0.26.1 t60-4619 (30x384) : 3622 21 522 301 221 57.7
37 Lc0 0.26.2 J92-205 (30x384) : 3618 20 511 292 219 57.1
38 Lc0 0.25.1 3972_20k_tcec (30x384) : 3617 21 514 293 221 57.0
39 Lc0 0.26.2 65100 (24x320) : 3616 21 512 291 221 56.8
40 Lc0 0.26.1 J92-100 (30x384) : 3609 21 510 285 225 55.9
41 Lc0 0.26.1 t60-4585 (30x384) : 3605 21 514 284 230 55.3
42 Lc0 0.25.1 sv-1810 (20x256) : 3599 22 514 280 234 54.5
43 Lc0 0.25.1 t60-4175_mlh (30x384) : 3594 21 516 277 239 53.7
44 Lc0 0.25.1 t60-4175 (30x384) : 3592 21 515 275 240 53.4
45 Lc0 0.25.1 t60-4082 (30x384) : 3589 21 510 270 240 52.9
46 Lc0 0.26.3 J104.1-30 (10x128) : 3587 21 516 272 244 52.7
47 Lc0 0.26.0 J90-40 (30x384) : 3587 22 511 269 242 52.6
48 Lc0 0.26.3 PhStein 1.1 (20x256) : 3585 21 513 269 244 52.4
49 Lc0 0.26.2 PhoenixStein (20x256) : 3585 20 525 275 250 52.4
50 Lc0 0.25.1 t40-1541 (20x256) : 3583 21 516 269 247 52.1
51 Allie 0.7 LS 14.3 : 3582 21 512 266 246 52.0
52 Lc0 0.25.1 t60-3010 (30x384) : 3582 20 514 267 247 51.9
53 Ceres 0.80 J104.1-30 (10x128) : 3580 22 511 264 247 51.7
54 Lc0 0.26.1 64623 (24x320) : 3576 22 520 266 254 51.2
55 Stockfish 200418 bmi2 : 3568 2 42331 19422 22909 45.9
56 Lc0 0.26.1 64208 (24x320) : 3565 22 510 253 257 49.6
57 Allie 0.6 LS 14.3 (20x256) : 3558 22 519 252 267 48.6
58 Lc0 0.25.1 42850 (20x256) : 3556 21 522 252 270 48.3
59 Lc0 0.26.3 SV-5300 (10x128) : 3554 21 518 249 269 48.1
60 Lc0 0.25.1 63651 (24x320) : 3554 21 517 248 269 48.0
61 Lc0 0.26.3 Tinker_6430 (10x128) : 3552 22 513 245 268 47.8
62 Lc0 0.25.1 702820 (10x128) : 3552 22 518 247 271 47.7
63 Lc0 0.25.1 63851 (24x320) : 3552 22 518 247 271 47.7
64 Lc0 0.25.1 t60-3972 (30x384) : 3550 22 514 244 270 47.5
65 Lc0 0.26.2 722641 (10x128) : 3546 21 518 243 275 46.9
66 Lc0 0.26.0 703810 (10x128) : 3545 20 507 237 270 46.7
67 Lc0 0.26.1 722052 (10x128) : 3543 21 512 238 274 46.5
68 Lc0 0.26.3 730372 (14x128) : 3536 22 510 232 278 45.5
69 Fat Fritz 1.1 (20x256) : 3530 21 523 233 290 44.6
70 Lc0 0.25.1 63305 (24x320) : 3530 21 512 228 284 44.5
71 Lc0 0.26.3 730937 (14x128) : 3529 22 515 229 286 44.5
72 Lc0 0.26.3 730262 (14x128) : 3528 22 515 228 287 44.3
73 Lc0 0.26.3 730517 (14x128) : 3525 21 520 228 292 43.8
74 Lc0 0.26.1 721051 (10x128) : 3491 21 516 202 314 39.1
75 Lc0 0.25.1 32930 (20x256) : 3483 21 515 196 319 38.1
76 Lc0 0.25.1 714646 (19x256) : 3479 21 516 194 322 37.6
77 Lc0 0.26.3 730164 (14x128) : 3475 22 513 190 323 37.0
78 Lc0 0.25.1 714435 (19x256) : 3463 23 517 183 334 35.4
79 Lc0 0.26.1 715842 (19x256) : 3431 24 528 166 362 31.4
80 Lc0 0.25.1 11260 (20x256) : 3408 23 521 149 372 28.6
81 Lc0 0.26.3 Bad Gyal 9XL (20x128) : 3367 25 528 127 401 24.1
82 Lc0 0.26.3 Bad Gyal 9PXL (20x128) : 3348 24 524 116 408 22.1
83 Lc0 0.27.0 Bad Gyal 9XXL (40x128) : 3309 28 538 100 438 18.6
Games : 42331 (finished)
White Wins : 21696 (51.3 %)
Black Wins : 20635 (48.7 %)
Draws : 0 (0.0 %)
Note that this is not a rating list, but only a performance test of Lc0 with different NNs versus Stockfish, because Lc0 vs. Stockfish is definitely the most interesting head-to-head competition between NN and AB engines. For a real rating list including Lc0 running on an RTX GPU (with a valid Leela-Ratio of 1.0), please visit Andreas Strangmueller's excellent website. Just click here
Stockfish vs Lc0 longtime testing ("SuFi for the poor")
Each test run consists of 300 games with 150 Noomen lowdraw openings (selected openings from TCEC superfinals) and a thinking time of 5'+3'' (Lc0) / 7.5'+4.5'' (Stockfish). This thinking time gives a Leela-Ratio of 1.0 on the PC hardware used: i7-8750H 2.6GHz notebook (hexacore, Turbo Boost off), RTX 2060 GPU. Average game duration: 20 minutes. Speed: Stockfish (running on 11 hyperthreading threads, Intel Turbo mode off): 9000 kn/s; Lc0 (with the old 32930 20x256 net) on the RTX 2060 mobile: 16000 n/s in the starting position.
Hash / NN Cache: 4096 MB Hash for Stockfish / 10000000 NN-Cachesize for Lc0
GUI: Cutechess-cli (the GUI ends the game when a 5-piece endgame is on the board)
Tablebases: None for engines, 5 Syzygy for cutechess-cli
Openings: 150 Noomen lowdraws openings (J. Noomen selected non-drawish openings out of his TCEC superfinal openings of previous TCEC seasons). Download here
Ponder, Large Memory Pages & learning: Off
Thinking time: Lc0 5'+3'' and Stockfish 7.5'+4.5'' (which gives a Leela-Ratio of 1.0). Average game duration: 20 minutes.
Download all played games here
Latest update: 2021/03/06 Stockfish 210226 vs Lc0 0.27.0 67741
See some short and spectacular wins of this match directly here on the website in the "View SF vs Lc0 games" section!
Stockfish 210226 bmi2 vs Lc0 0.27.0 67741 : 300 (+ 75,=205,- 20), 59.2 % (+65 Elo)
Stockfish 201225 bmi2 vs Lc0 0.26.3 66680 : 300 (+ 60,=223,- 17), 57.2 % (+50 Elo)
Stockfish 201022 bmi2 vs Lc0 0.26.3 J92-260 : 300 (+ 75,=207,- 18), 59.5 % (+67 Elo)
Stockfish 200928 bmi2 vs Lc0 0.26.3rc2 J92-190: 300 (+ 68,=215,- 17), 58.5 % (+60 Elo)
Stockfish 12 bmi2 vs Lc0 0.26.2 J92-130: 300 (+ 74,=203,- 23), 58.5 % (+60 Elo)
SF 200823 82215d0fd0df vs Lc0 0.26.1 t60-4619: 300 (+ 85,=199,- 16), 61.5 % (+82 Elo)
SF 200810 112bb1c8cdb5 vs Lc0 0.26.1 LS 15: 300 (+ 78,=196,- 26), 58.7 % (+62 Elo)
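The score percentages and Elo differences in the match lines above can be approximated from the win/draw/loss counts. The sketch below uses the standard logistic Elo formula; which rating tool produced the listed numbers is not stated, so this is an assumption, and the function names are hypothetical.

```python
import math


def score_fraction(wins, draws, losses):
    """Classical score: a win counts 1 point, a draw half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def elo_difference(score):
    """Elo difference implied by a score fraction strictly between 0 and 1
    (standard logistic model)."""
    return 400.0 * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    # Stockfish 210226 vs Lc0 0.27.0 67741: +75, =205, -20
    s = score_fraction(75, 205, 20)
    print(f"{100 * s:.1f} %  {elo_difference(s):+.0f} Elo")
```

For the first match line (+75, =205, -20) this gives a score of about 59.2 % and a difference of about +64 Elo, close to the +65 Elo listed. Deviations of around one Elo point are to be expected if the list was produced with a different rating tool.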