LC0 / Neural Nets versus Stockfish testing
Playing conditions:
Hardware: i7-8750H 2.6GHz (Hexacore) Notebook, RTX 2060 GPU, Windows 10 64bit, 16GB RAM
Speed: Stockfish (running on 11 hyperthreading threads, Intel Turbo mode off): 9000 kn/s; Lc0 (with the old 32930 20x256 net): 16000 n/s in the starting position. Since version 0.26.3, Lc0 uses CUDA 11.1, which makes it around 37% faster. To keep the Leela-Ratio at 1.0, I slowed down the GPU with the Afterburner tool.
Hash / NN Cache: 4096 MB Hash for Stockfish / 5000000 NN-Cachesize for Lc0
GUI: Cutechess-cli (GUI ends game, when a 5-piece endgame is on the board)
Tablebases: None for engines, 5 Syzygy for cutechess-cli
Openings: NBSC Advanced Armageddon Noomen 3-moves (250 openings). Learn more about Advanced Armageddon in the "NBSC Armageddon openings"- section and download the NBSC-Armageddon package right here
Ponder, Large Memory Pages & learning: Off
Thinking time: Lc0 2'+1'' and Stockfish 3'+1.5'' (which gives a perfect Leela-Ratio of 1.0). Average game duration: 8 minutes; one 500-game testrun takes around 2.5 days.
Each Lc0 / Neural Net plays 500 games vs. Stockfish with my new NBSC Advanced Armageddon openings. After the testrun is finished, all games are rescored with my armageddonize_advanced tool. This means:
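The page does not define the Leela-Ratio. The sketch below assumes a commonly cited definition, which the author does not state here: the Lc0/Stockfish node-speed ratio scaled by 875, the rough AlphaZero vs. Stockfish speed ratio (about 80 kn/s vs. 70000 kn/s). Folding the time odds in as a simple multiplier on Stockfish's speed is also an assumption, and the function and parameter names are hypothetical.

```python
def leela_ratio(lc0_nps, sf_nps, sf_time_factor=1.0, reference=875.0):
    """Estimate the Leela-Ratio under the ASSUMED definition
    (lc0_nps / sf_nps) * 875, adjusted for time odds.

    lc0_nps        -- Lc0 speed in nodes per second
    sf_nps         -- Stockfish speed in nodes per second
    sf_time_factor -- how much more thinking time Stockfish gets than Lc0
    reference      -- assumed AlphaZero-era speed ratio (70000 kn/s / 80 kn/s)
    """
    if lc0_nps <= 0 or sf_nps <= 0 or sf_time_factor <= 0:
        raise ValueError("speeds and time factor must be positive")
    return (lc0_nps * reference) / (sf_nps * sf_time_factor)


if __name__ == "__main__":
    # Speeds quoted on this page: Lc0 16000 n/s, Stockfish 9000 kn/s.
    # Time control 2'+1'' vs 3'+1.5'' means Stockfish gets 1.5x the time.
    ratio = leela_ratio(16_000, 9_000_000, sf_time_factor=1.5)
    print(f"Estimated Leela-Ratio: {ratio:.2f}")
```

With the speeds and time controls listed here, this estimate comes out close to 1.0, which is consistent with the stated target, but it should be read as a plausibility check and not as the author's own calculation.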
Win for white = 1 point for white
Draw = 1 point for black
Win for black = 2 points for black
Learn more about my new NBSC Advanced Armageddon openings and the advanced scoring system in the "NBSC Armageddon openings"- section.
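The scoring rule above can be sketched in a few lines. This is only an illustration of the three point values listed; it is not the armageddonize_advanced tool itself, and the function names and the use of PGN-style result strings are assumptions.

```python
from typing import Iterable, Tuple

# Points as (white_points, black_points) per PGN result string,
# following the three rules listed above.
ARMAGEDDON_POINTS = {
    "1-0": (1, 0),      # win for White = 1 point for White
    "1/2-1/2": (0, 1),  # draw = 1 point for Black
    "0-1": (0, 2),      # win for Black = 2 points for Black
}


def armageddon_points(result: str) -> Tuple[int, int]:
    """Return (white_points, black_points) for one game result."""
    try:
        return ARMAGEDDON_POINTS[result]
    except KeyError:
        raise ValueError(f"unsupported result: {result!r}")


def tally(results: Iterable[str]) -> Tuple[int, int]:
    """Sum the Armageddon points over a list of game results."""
    white_total = 0
    black_total = 0
    for result in results:
        white, black = armageddon_points(result)
        white_total += white
        black_total += black
    return white_total, black_total


if __name__ == "__main__":
    # One white win, one draw, one black win.
    print(tally(["1-0", "1/2-1/2", "0-1"]))
```

Unfinished games (PGN result "*") are rejected here on purpose; how the real tool handles them is not described on this page.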
Learn more about Lc0 (getting started in a GUI, links to net-downloads, FAQs, development-informations and the Leela-Blog) here
Latest update: 2021/04/17: Lc0 0.27.0 PhoenixStein 1.7
Download all played games (non-armageddonized) here
500 NBSC-Advanced-Armageddon games per testrun (a win for Black is 2 points for Black and a draw is a 1-point win for Black) vs. Stockfish 200418 (SPCC-Elo: 3568, Contempt set to 0), which is around +14 Elo stronger than Stockfish 11 (SPCC-Elo: 3554).
The error bar of each result is +/- 20 Elo. Note, however, that my NBSC-Armageddon openings spread the Elo results around 2.25x wider than classical openings do(!). With classical openings, you would therefore need an error bar of +/- 9 Elo for the same statistical quality of the results (= the ranking of the Lc0 nets here). And for an error bar of +/- 9 Elo, you need around 3000 games, not 500, which means 6x more games (and 6x more PC time)!
Learn more about that revolution in computer chess in the "NBSC Armageddon openings" section of my website.
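As a rough cross-check of the game counts quoted above, the sketch below assumes that the error bar shrinks with the square root of the number of games. That is a common rule of thumb, not something stated on this page, and the function name is hypothetical.

```python
import math


def games_needed(current_games, current_error, target_error):
    """Estimate how many games are needed to reach target_error,
    ASSUMING the error bar is proportional to 1 / sqrt(games)."""
    if current_games <= 0 or current_error <= 0 or target_error <= 0:
        raise ValueError("all arguments must be positive")
    return math.ceil(current_games * (current_error / target_error) ** 2)


if __name__ == "__main__":
    # 500 games give +/- 20 Elo; how many for +/- 9 Elo?
    print(games_needed(500, 20, 9))
```

Under this simple assumption the estimate is roughly 2500 games, i.e. about 5x as many. The figures of around 3000 games and 6x quoted above are the author's own estimate; either way, the saving in testing time is of the same order.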
# PLAYER : RATING ERROR PLAYED W L (%)
1 Lc0 0.27.0 68501 (30x384) : 3740 24 510 371 139 72.7
2 Lc0 0.27.0 68002 (30x384) : 3735 25 514 371 143 72.2
3 Lc0 0.27.0 67741 (30x384) : 3733 22 513 369 144 71.9
4 Lc0 0.27.0 68171 (30x384) : 3727 24 522 372 150 71.3
5 Lc0 0.26.3 66680 (30x384) : 3724 23 519 368 151 70.9
6 Lc0 0.26.3 67336 (30x384) : 3723 23 520 368 152 70.8
7 Lc0 0.27.0 68287 (30x384) : 3716 25 513 359 154 70.0
8 Lc0 0.26.3 67326 (30x384) : 3715 23 516 360 156 69.8
9 Lc0 0.26.3 67574 (30x384) : 3710 23 520 360 160 69.2
10 Lc0 0.26.3 66888 (30x384) : 3710 23 516 357 159 69.2
11 Lc0 0.26.3 67438 (30x384) : 3707 24 521 359 162 68.9
12 Lc0 0.26.3 J96-28 (30x384) : 3707 25 517 356 161 68.9
13 Lc0 0.26.3 67692 (30x384) : 3705 22 517 355 162 68.7
14 Ceres 0.88 66680 (30x384) : 3702 23 514 351 163 68.3
15 Lc0 0.26.3 J94-100 (30x384) : 3699 24 517 351 166 67.9
16 Lc0 0.26.3 J94-80 (30x384) : 3696 22 519 350 169 67.4
17 Lc0 0.26.3 J92-260 (30x384) : 3689 23 515 343 172 66.6
18 Lc0 0.26.3 66988 (30x384) : 3682 22 519 341 178 65.7
19 Lc0 0.26.3 J92-330 (30x384) : 3682 22 519 341 178 65.7
20 Ceres 0.80 66680 (30x384) : 3677 24 515 335 180 65.0
21 Ceres 0.89 67741 (30x384) : 3674 23 519 336 183 64.7
22 Lc0 0.26.3 J98.1-16 (30x384) : 3671 23 510 328 182 64.3
23 Ceres 0.87 66680 (30x384) : 3668 21 518 331 187 63.9
24 Lc0 0.26.3 65981 (24x320) : 3668 23 515 329 186 63.9
25 Lc0 0.26.3 67211 (30x384) : 3667 22 514 328 186 63.8
26 Lc0 1483dev J94-100 (SuFi 20) : 3666 22 515 328 187 63.7
27 Lc0 0.26.3 J92-300 (30x384) : 3665 22 513 326 187 63.5
28 Lc0 0.26.3 66309 (24x320) : 3662 22 514 324 190 63.0
29 Lc0 0.26.2 J92-130 (30x384) : 3655 22 521 324 197 62.2
30 Lc0 0.26.3 66511 (24x320) : 3650 22 514 316 198 61.5
31 Lc0 0.26.3 65536 (24x320) : 3648 22 514 315 199 61.3
32 Lc0 0.26.3 65732 (24x320) : 3648 21 514 315 199 61.3
33 Lc0 0.24.1 LS 14.3 (20x256) : 3644 22 513 311 202 60.6
34 Lc0 0.25.1 LS 15 (20x256) : 3643 22 512 310 202 60.5
35 Lc0 0.26.3 PhStein 1.2 (20x256) : 3642 21 512 309 203 60.4
36 Lc0 0.26.3 65411 (24x320) : 3641 21 519 313 206 60.3
37 Lc0 0.26.2 J92-160 (30x384) : 3635 22 511 304 207 59.5
38 Lc0 0.26.2 T60B.7-105 (24x320) : 3634 22 519 308 211 59.3
39 Lc0 0.24.1 LS 14.2 (20x256) : 3633 21 520 308 212 59.2
40 Lc0 0.25.1 LS 15 Kayra4 : 3624 22 513 297 216 57.9
41 Lc0 0.26.1 t60-4619 (30x384) : 3622 22 522 301 221 57.7
42 Lc0 0.26.2 J92-205 (30x384) : 3618 21 511 292 219 57.1
43 Lc0 0.25.1 3972_20k_tcec (30x384) : 3617 21 514 293 221 57.0
44 Lc0 0.26.2 65100 (24x320) : 3616 22 512 291 221 56.8
45 Lc0 0.26.1 J92-100 (30x384) : 3609 23 510 285 225 55.9
46 Lc0 0.27.0 PhStein 1.7 (20x256) : 3609 21 512 286 226 55.9
47 Lc0 0.27.0 PhStein 1.41 (20x2 : 3605 22 512 283 229 55.3
48 Lc0 0.26.1 t60-4585 (30x384) : 3605 20 514 284 230 55.3
49 Lc0 0.25.1 sv-1810 (20x256) : 3599 21 514 280 234 54.5
50 Lc0 0.25.1 t60-4175_mlh (30x384) : 3594 21 516 277 239 53.7
51 Lc0 0.25.1 t60-4175 (30x384) : 3592 21 515 275 240 53.4
52 Lc0 0.25.1 t60-4082 (30x384) : 3589 20 510 270 240 52.9
53 Lc0 0.26.3 J104.1-30 (10x128) : 3587 22 516 272 244 52.7
54 Lc0 0.26.0 J90-40 (30x384) : 3587 21 511 269 242 52.6
55 Lc0 0.26.3 PhStein 1.1 (20x256) : 3585 21 513 269 244 52.4
56 Lc0 0.26.2 PhoenixStein (20x256) : 3585 21 525 275 250 52.4
57 Lc0 0.25.1 t40-1541 (20x256) : 3583 22 516 269 247 52.1
58 Allie 0.7 LS 14.3 : 3582 22 512 266 246 52.0
59 Lc0 0.25.1 t60-3010 (30x384) : 3582 21 514 267 247 51.9
60 Ceres 0.80 J104.1-30 (10x128) : 3580 22 511 264 247 51.7
61 Lc0 0.26.1 64623 (24x320) : 3576 21 520 266 254 51.2
62 Stockfish 200418 bmi2 : 3568 2 45933 20646 25287 44.9
63 Lc0 0.26.1 64208 (24x320) : 3565 21 510 253 257 49.6
64 Allie 0.6 LS 14.3 (20x256) : 3558 20 519 252 267 48.6
65 Lc0 0.25.1 42850 (20x256) : 3556 21 522 252 270 48.3
66 Lc0 0.26.3 SV-5300 (10x128) : 3554 21 518 249 269 48.1
67 Lc0 0.25.1 63651 (24x320) : 3554 21 517 248 269 48.0
68 Lc0 0.26.3 Tinker_6430 (10x128) : 3552 21 513 245 268 47.8
69 Lc0 0.25.1 63851 (24x320) : 3552 21 518 247 271 47.7
70 Lc0 0.25.1 702820 (10x128) : 3552 22 518 247 271 47.7
71 Lc0 0.25.1 t60-3972 (30x384) : 3550 21 514 244 270 47.5
72 Lc0 0.26.2 722641 (10x128) : 3546 22 518 243 275 46.9
73 Lc0 0.26.0 703810 (10x128) : 3545 21 507 237 270 46.7
74 Lc0 0.26.1 722052 (10x128) : 3543 21 512 238 274 46.5
75 Lc0 0.26.3 730372 (14x128) : 3536 21 510 232 278 45.5
76 Fat Fritz 1.1 (20x256) : 3530 22 523 233 290 44.6
77 Lc0 0.25.1 63305 (24x320) : 3530 21 512 228 284 44.5
78 Lc0 0.26.3 730937 (14x128) : 3529 22 515 229 286 44.5
79 Lc0 0.26.3 730262 (14x128) : 3528 22 515 228 287 44.3
80 Lc0 0.26.3 730517 (14x128) : 3525 21 520 228 292 43.8
81 Lc0 0.26.1 721051 (10x128) : 3491 21 516 202 314 39.1
82 Lc0 0.25.1 32930 (20x256) : 3483 22 515 196 319 38.1
83 Lc0 0.25.1 714646 (19x256) : 3479 22 516 194 322 37.6
84 Lc0 0.26.3 730164 (14x128) : 3475 22 513 190 323 37.0
85 Lc0 0.25.1 714435 (19x256) : 3463 22 517 183 334 35.4
86 Lc0 0.26.1 715842 (19x256) : 3431 22 528 166 362 31.4
87 Lc0 0.25.1 11260 (20x256) : 3408 23 521 149 372 28.6
88 Lc0 0.26.3 Bad Gyal 9XL (20x128) : 3367 24 528 127 401 24.1
89 Lc0 0.26.3 Bad Gyal 9PXL (20x128) : 3348 25 524 116 408 22.1
90 Lc0 0.27.0 Bad Gyal 9XXL (40x128) : 3309 27 538 100 438 18.6
Games : 45933 (finished)
White Wins : 23579 (51.3 %)
Black Wins : 22354 (48.7 %)
Draws : 0 (0.0 %)
Note that this is not a rating list, only a performance test of Lc0 with different NNs versus Stockfish, because Lc0 vs. Stockfish is definitely the most interesting head-to-head competition of NN vs. AB engines. For a real rating list that includes Lc0 running on an RTX GPU (with a valid Leela-Ratio of 1.0), please visit Andreas Strangmueller's excellent website. Just click here
Stockfish vs Lc0 longtime testing ("SuFi for the poor")
Each testrun consists of 300 games with 150 Noomen lowdraw openings (selected openings from TCEC superfinals) and a thinking time of 5'+3'' (Lc0) / 7.5'+4.5'' (Stockfish). This thinking time gives a perfect Leela-Ratio of 1.0 on the PC hardware used: i7-8750H 2.6GHz (Hexacore, TurboBoost mode off) Notebook, RTX 2060 GPU. Average game duration: 20 minutes. Speed: Stockfish (running on 11 hyperthreading threads, Intel Turbo mode off): 9000 kn/s; Lc0 (with the old 32930 20x256 net) on the RTX 2060 mobile: 16000 n/s in the starting position.
Hash / NN Cache: 4096 MB Hash for Stockfish / 10000000 NN-Cachesize for Lc0
GUI: Cutechess-cli (GUI ends game, when a 5-piece endgame is on the board)
Tablebases: None for engines, 5 Syzygy for cutechess-cli
Openings: 150 Noomen lowdraws openings (J. Noomen selected non-drawish openings out of his TCEC superfinal openings of previous TCEC seasons). Download here
Ponder, Large Memory Pages & learning: Off
Thinking time: Lc0 5'+3'' and Stockfish 7.5'+4.5'' (means a perfect Leela-Ratio of 1.0). Average game-duration: 20 minutes.
Download all played games here
Latest update: 2021/04/11 Stockfish 210406 vs Lc0 0.27.0 68002
See some short and spectacular wins of this match directly here on the website in the "View SF vs Lc0 games"- section!
Stockfish 210406 bmi2 vs Lc0 0.27.0 68002 : 300 (+ 60,=219,- 21), 56.5 % (+45 Elo)
Stockfish 210226 bmi2 vs Lc0 0.27.0 67741 : 300 (+ 75,=205,- 20), 59.2 % (+65 Elo)
Stockfish 201225 bmi2 vs Lc0 0.26.3 66680 : 300 (+ 60,=223,- 17), 57.2 % (+50 Elo)
Stockfish 201022 bmi2 vs Lc0 0.26.3 J92-260 : 300 (+ 75,=207,- 18), 59.5 % (+67 Elo)
Stockfish 200928 bmi2 vs Lc0 0.26.3rc2 J92-190: 300 (+ 68,=215,- 17), 58.5 % (+60 Elo)
Stockfish 12 bmi2 vs Lc0 0.26.2 J92-130: 300 (+ 74,=203,- 23), 58.5 % (+60 Elo)
SF 200823 82215d0fd0df vs Lc0 0.26.1 t60-4619: 300 (+ 85,=199,- 16), 61.5 % (+82 Elo)
SF 200810 112bb1c8cdb5 vs Lc0 0.26.1 LS 15: 300 (+ 78,=196,- 26), 58.7 % (+62 Elo)