Stefan Pohl Computer Chess

Home of famous UHO openings and EAS Ratinglist


Lc0 or other GPU-Neural Nets versus Stockfish 15.1 testing

 

Playing conditions:

 

Hardware: Ryzen 7 6800H 2.6GHz Notebook, RTX 3060 GPU, Windows 11 64bit, 32GB RAM

Cuda version installed: Cuda 11.7

Speed: Stockfish 15.1 plays with 14 threads (= 7 cores) and reaches 10 MN/s in the middlegame. Lc0's minibatchsize parameter is set to the best value for each net size, as determined by Lc0's own benchmark (backendbench --clippy).

Hash: 2 GB Hash for Stockfish 15.1 / 8192 RamLimitMb for Lc0

GUI: Cutechess-cli (the GUI ends the game when a 5-piece endgame is on the board)

Tablebases: None for engines, 5 Syzygy for cutechess-cli

Openings: UHO_2022_6mvs_+120_+129.pgn. Download my UHO 2022 openings here

Ponder, Large Memory Pages & learning: Off

Thinking time: 2min+2sec for Lc0 and 1min+1sec for Stockfish 15.1. I measured nps on my system and compared the values with those of the TCEC: my CPU is far too fast relative to Lc0 running on my RTX 3060 GPU, so it makes sense to give Stockfish only 50% of Lc0's thinking time. This compensates for the fast CPU and for the fact that in the TCEC Lc0 benefits from fast hardware and long thinking time (both favor Lc0, not Stockfish).

One testrun takes nearly 5 days. Average game duration: 6min 45sec
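As a quick plausibility check, the duration of a testrun can be estimated from the average game duration. The sketch below assumes that the 1000 games of a testrun are played one after another (the page does not state this explicitly); the function and constant names are only illustrative.

```python
# Rough estimate of one testrun's wall-clock time.
# Assumption (not stated on the page): games are played sequentially.
GAMES_PER_TESTRUN = 1000
AVG_GAME_SECONDS = 6 * 60 + 45  # 6min 45sec
SECONDS_PER_DAY = 86_400


def testrun_days(games: int = GAMES_PER_TESTRUN,
                 avg_seconds: int = AVG_GAME_SECONDS) -> float:
    """Return the estimated duration of a testrun in days."""
    return games * avg_seconds / SECONDS_PER_DAY


print(f"{testrun_days():.2f} days")  # prints "4.69 days"
```

Under this assumption the estimate is a little below 5 days, which fits the figure given above.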

 

Each Lc0 / Neural Net plays 1000 games vs. Stockfish 15.1 with my UHO 2022 openings

 

Learn more about Lc0 (getting started in a GUI, links to net downloads, FAQs, development information and the Leela-Blog) here

 

Latest update: 2023/09/08: Lc0 0.31dev TCEC 25 (binary and net currently playing in the TCEC 25 Premier Division)

 

Download all played games (including the games of the old test-setups): here

     Program                              Elo    +    -  Games    Score   Av.Op. Draws

   1 Stockfish 15.1 avx2                :    0    4    4 13000    59.2%    -66   49.4%
   2 Lc0 0.31dev TCEC 25                :  -22   16   16  1000    46.9%      0   52.3%
   3 Lc0 0.30dev T1-4000 (15x768)       :  -39   15   15  1000    44.5%      0   49.8%
   4 Lc0 0.30dev 811107 (19x512)        :  -41   15   15  1000    44.1%      0   46.1%
   5 Lc0 0.30dev TCEC 24                :  -42   14   14  1000    44.1%      0   51.0%
   6 Lc0 0.30rc1 T1-4000 (15x768)       :  -44   15   15  1000    43.7%      0   49.8%
   7 Lc0 0.30dev T1-30875 (15x768)      :  -45   15   15  1000    43.5%      0   47.5%
   8 Lc0 0.30dev BT2-4510 (15x768)      :  -45   15   15  1000    43.5%      0   47.5%
   9 Lc0 0.30rc2 814174 (15x768)        :  -80   15   15  1000    38.8%      0   51.0%
  10 Lc0 0.30dev 813207 (15x768)        :  -84   15   15  1000    38.3%      0   49.6%
  11 Lc0 0.30dev TCEC 20                :  -90   15   15  1000    37.5%      0   50.5%
  12 Lc0 0.30dev T1-2432500 (10x256)    :  -94   16   16  1000    36.9%      0   47.2%
  13 Lc0 0.30dev TCEC 22                :  -95   15   15  1000    36.8%      0   49.4%
  14 Lc0 0.30dev TCEC 18                : -133   16   16  1000    31.9%      0   50.5%

 

Games        : 13000 (finished)

White Wins   : 6525 (50.2 %)
Black Wins   : 53 (0.4 %)
Draws        : 6422 (49.4 %)
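The Elo values in the table above are close to what the standard logistic Elo formula gives for each score percentage against a single opponent. The following is only a minimal sketch of that formula, not the rating program used to build the list, so small deviations of an Elo point or two from the table are expected (the percentages are rounded, and rating tools differ in detail). The function name is hypothetical.

```python
import math


def elo_from_score(score: float) -> float:
    """Convert a score fraction (0 < score < 1) into an Elo difference
    using the standard logistic formula. Negative means weaker than
    the opponent."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Score of Lc0 0.31dev TCEC 25 in the table above: 46.9%
print(round(elo_from_score(0.469)))  # prints -22
```

A score of exactly 50% maps to an Elo difference of 0, and the mapping becomes steeper the further the score moves away from 50%.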

 

Below is the gamebase, recalculated with my Gamepairs Rescorer Batch-Tool. It implements the idea of Vondele (Stockfish maintainer): "Thinking uniquely in game pairs makes sense with the biased openings used these days. While pentanomial makes sense it is a bit complicated so we could simplify and score game pairs only (not games) as W-L-D (a traditional score of 2-0, or 1.5-0.5 is just a W)."

   # PLAYER                             :  RATING  ERROR  PLAYED     W     D    L   (%)  CFS(%)
   1 Stockfish 15.1 avx2                :       0   ----    6500  2951  2980  569  68.3     100
   2 Lc0 0.31dev TCEC 25                :     -44     22     500    85   267  148  43.7      99
   3 Lc0 0.30dev T1-4000 (15x768)       :     -79     22     500    62   265  173  38.9      59
   4 Lc0 0.30dev 811107 (19x512)        :     -83     22     500    53   278  169  38.4      61
   5 Lc0 0.30dev TCEC 24                :     -87     23     500    56   266  178  37.8      58
   6 Lc0 0.30rc1 T1-4000 (15x768)       :     -90     22     500    62   250  188  37.4      56
   7 Lc0 0.30dev T1-30875 (15x768)      :     -93     23     500    60   251  189  37.1      54
   8 Lc0 0.30dev BT2-4510 (15x768)      :     -94     21     500    60   249  191  36.9     100
   9 Lc0 0.30rc2 814174 (15x768)        :    -168     24     500    28   221  251  27.7      67
  10 Lc0 0.30dev 813207 (15x768)        :    -176     26     500    21   226  253  26.8      78
  11 Lc0 0.30dev TCEC 20                :    -190     24     500    25   203  272  25.3      75
  12 Lc0 0.30dev T1-2432500 (10x256)    :    -202     25     500    20   200  280  24.0      58
  13 Lc0 0.30dev TCEC 22                :    -206     26     500    25   186  289  23.6     100
  14 Lc0 0.30dev TCEC 18                :    -315     32     500    12   118  370  14.2     ---


------------------------------------------------------------------- 
--- Number of all Gamepairs          : 6500 
--- Number of drawn Gamepairs overall: 2980 (= 45.85%) 
--- Number of 1:1 drawn Gamepairs    : 1516 (= 23.32%) 
--- Number of 2-draws drawn Gamepairs: 1464 (= 22.52%) 
------------------------------------------------------------------- 
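The game-pair scoring described above can be sketched in a few lines. This is not the Gamepairs Rescorer Tool itself, only an illustration of the rule quoted from Vondele: a pair (both games from the same opening, with colors reversed) counts as a win if an engine scores more than 1 point out of 2, as a loss if it scores less than 1, and as a draw at exactly 1:1. All function names are hypothetical.

```python
from collections import Counter


def score_pair(points_game1: float, points_game2: float) -> str:
    """Classify one game pair from the viewpoint of a single engine.
    Each argument is that engine's result in one game: 1, 0.5 or 0."""
    total = points_game1 + points_game2
    if total > 1.0:
        return "W"  # 2-0 or 1.5-0.5
    if total < 1.0:
        return "L"  # 0-2 or 0.5-1.5
    return "D"      # 1:1, either two draws or one win each


def rescore(pairs) -> Counter:
    """Count W/D/L over an iterable of (game1, game2) results."""
    return Counter(score_pair(a, b) for a, b in pairs)


def pair_score_percent(wins: int, draws: int, losses: int) -> float:
    """Score percentage over game pairs, a drawn pair counting half."""
    return 100.0 * (wins + 0.5 * draws) / (wins + draws + losses)


# W/D/L of Lc0 0.31dev TCEC 25 in the table above
print(round(pair_score_percent(85, 267, 148), 1))  # prints 43.7
```

Note that a drawn pair covers two different cases (two drawn games, or one win for each side), which is why the summary above lists them separately.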

You can download my Gamepairs Rescorer Tool right here

 

Note that this is not a ratinglist, but only a performance test of Lc0 with different NNs versus Stockfish. For a real ratinglist including Lc0 running on an RTX GPU (with a valid Leela-Ratio of 1.0), please visit Andreas Strangmueller's excellent website. Just click here