Stefan Pohl Computer Chess

Private website for chess engine tests


Latest Website-News (2021/06/24): AB-testrun of Slow Chess 2.6 finished: +36 Elo compared to Slow Chess 2.54 and +57 Elo compared to Slow Chess 2.5.

 

Next AB-testrun: Minic 3.08. The testrun of Chiron 5 was aborted: the author of Chiron 5 told me that the release version is buggy and plays weaker, and that I should wait for the bugfix - that's what I will do.

Next NN-testrun: a long "SuFi for the poor" testrun of the TCEC S21 Division P engines: Stockfish 210619 vs Lc0 0.28.0-rc1 69146.

 

 

Stay tuned.


Stockfish testing

 

Playing conditions:

 

Hardware: Since 20/07/21, an AMD Ryzen 3900 12-core (24 threads) notebook with 32GB RAM. Now 20 games are played simultaneously (!), so from now on each testrun will have 6000 or 7000 games (instead of 5000 before) and will take only 2 days instead of the 6-7 days before! From now on, all engine binaries are popcount/avx2 compiles, of course, because bmi2-compiles are extremely slow on AMD. To keep the rating-list engine names consistent, the "bmi2" or "pext" extension in the engine name is still used for older engines - otherwise ORDO would not count all games played by such an engine as games of one and the same engine...

Speed (single thread, TurboBoost mode switched off, chess starting position): Stockfish: 1.3 million nodes/s, Komodo: 1.1 million nodes/s

Hash: 256MB per engine

GUI: Cutechess-cli (the GUI ends the game when a 5-piece endgame is reached on the board)

Tablebases: none for the engines, 5-piece Syzygy for cutechess-cli

Openings: HERT_500 testset (by Thomas Zipproth) (download the file in the "Download & Links" section or here)

Ponder, Large Memory Pages & learning: Off

Thinking time: 180''+1000ms (= 3'+1'') per game/engine (average game duration: around 7.5 minutes). One 7000-game testrun takes about 2 days. The version numbers of the Stockfish engines are the date of the latest patch included in the Stockfish source code, not the release date of the engine file, written as year, month, day (example: 200807 = August 7, 2020). The SF compile used is the AVX2 compile, which is the fastest on my AMD Ryzen CPU. SF binaries are taken from abrok.eu (except the official SF release versions, which are taken from the official Stockfish website).
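For illustration only, here is a minimal sketch of how conditions like these could be passed to cutechess-cli. It is not the exact command used for these tests; the engine paths, engine names, opening-file name/format and Syzygy path are placeholders, and the flags should be checked against your cutechess-cli version.

```python
import subprocess

# Sketch of a cutechess-cli invocation roughly matching the conditions above.
# All paths and file names are placeholders, not the real setup.
cmd = [
    "cutechess-cli",
    "-engine", "cmd=./stockfish_avx2", "name=Stockfish 210619 avx2",
    "-engine", "cmd=./slowchess_avx2", "name=Slow Chess 2.6 avx2",
    "-each", "proto=uci",
    "tc=180+1",                 # 180s base + 1s increment = 3'+1'' per engine
    "option.Hash=256",          # 256MB hash per engine (ponder is off by default)
    "-openings", "file=HERT_500.pgn", "format=pgn", "order=sequential",
    "-repeat",                  # play each opening with colors reversed
    "-rounds", "500", "-games", "2",   # 500 openings x 2 colors = 1000 games per pairing
    "-concurrency", "20",       # 20 games played simultaneously
    "-tb", "/path/to/syzygy", "-tbpieces", "5",  # 5-piece Syzygy adjudication by the GUI
    "-pgnout", "games.pgn",
]
subprocess.run(cmd, check=True)
```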

Download BrainFish (and the Cerebellum libraries) here

 

To avoid distortions in the Ordo Elo calculation, from now on only 3x Stockfish (the latest official release + the latest 2 dev-versions) and 1x BrainFish are stored in the gamebase (the games of all older engine versions are deleted every time a new version is tested). The older Elo results of Stockfish and BrainFish can still be seen in the Elo diagrams below. BrainFish always plays with the latest Cerebellum library, of course, because otherwise BrainFish = Stockfish.

 

Latest update: 2021/06/24: Slow Chess 2.6 (+36 Elo compared to Slow Chess 2.54 and +57 Elo compared to Slow Chess 2.5)

 

(Ordo calculation anchored to Stockfish 13 = 3723 Elo)
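Anchoring simply fixes the scale of the otherwise relative Ordo ratings: every rating is shifted by the same offset so that the anchor engine lands exactly on the chosen value. A tiny illustration (the "raw" numbers below are made up, not real Ordo output):

```python
# Illustration of anchoring a relative rating list to a fixed value.
raw = {
    "Stockfish 210619 avx2": 128.0,
    "Stockfish 13 210218": 100.0,   # the anchor engine
    "KomodoDragon 2.0 avx2": 28.0,
}

ANCHOR, ANCHOR_ELO = "Stockfish 13 210218", 3723
offset = ANCHOR_ELO - raw[ANCHOR]

anchored = {name: rating + offset for name, rating in raw.items()}
print(anchored)  # the anchor ends up at 3723, all others are shifted by the same offset
```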

 

See the individual statistics of engine-results here

Download the current gamebase here

     Program                    Elo    +    -   Games   Score   Av.Op.  Draws

   1 Stockfish 210615 avx2    : 3755    7    7  7000    75.4 %   3542   47.9 %
   2 Stockfish 210619 avx2    : 3751    7    7  7000    75.1 %   3542   48.7 %
   3 Stockfish 13 210218      : 3723    4    4 22000    74.4 %   3513   47.7 %
   4 SF Fat Fritz 2 avx2      : 3722    6    6  8000    73.7 %   3517   49.3 %
   5 CFish 12 avx2            : 3711    9    9  7000    84.6 %   3397   29.1 %
   6 Stockfish 12 200902      : 3689    4    4 25000    78.2 %   3442   39.4 %
   7 SF Fat Fritz 2 github    : 3682    7    7  7000    73.3 %   3488   48.5 %
   8 KomodoDragon 1.0 avx2    : 3651    4    4 21000    74.4 %   3443   43.0 %
   9 KomodoDragon 2.0 avx2    : 3651    5    5 14000    66.5 %   3520   52.8 %
  10 Fire 8.NN avx2           : 3609    5    5 11000    57.8 %   3549   57.9 %
  11 Stockfish 11 200118      : 3569    6    6  9000    66.7 %   3432   43.8 %
  12 KomodoDragon 2.0 MCTS    : 3568    6    6  7000    62.4 %   3478   53.4 %
  13 KomodoDragon 1.0 MCTS    : 3485    6    6  7000    57.6 %   3431   57.1 %
  14 Slow Chess 2.6 avx2      : 3482    7    7  7000    42.6 %   3541   57.7 %
  15 Ethereal 13 nnue avx2    : 3475    5    5 10000    37.8 %   3575   51.2 %
  16 Komodo 14.1 x64          : 3461    6    6  8000    56.3 %   3418   55.6 %
  17 Komodo 14 bmi2           : 3452    5    5 13000    52.6 %   3438   54.6 %
  18 Slow Chess 2.54 avx2     : 3446    5    5 11000    36.3 %   3560   50.0 %
  19 RubiChess 2.1 avx2       : 3446    4    4 17000    39.0 %   3536   53.8 %
  20 Houdini 6 pext           : 3444    3    3 40000    49.1 %   3456   50.7 %
  21 Nemorino 6.00 avx2       : 3443    3    3 36000    47.2 %   3470   52.6 %
  22 Fire 8.1 popc            : 3439    5    5 13000    43.1 %   3495   53.5 %
  23 Pedone 3.1 avx2          : 3427    4    4 14000    42.9 %   3484   54.0 %
  24 Slow Chess 2.5 avx2      : 3425    4    4 19000    41.7 %   3496   50.0 %
  25 Igel 3.0.5 popavx2       : 3414    6    6  9000    45.4 %   3454   56.3 %
  26 Nemorino 6.05 avx2       : 3411    6    6  7000    42.8 %   3472   51.0 %
  27 Ethereal 12.75 avx2      : 3404    3    3 25000    45.7 %   3444   52.1 %
  28 Ethereal 12.62 avx2      : 3399    6    6  8000    49.1 %   3411   54.6 %
  29 Igel 3.0.0 popavx2       : 3391    6    6  7000    36.2 %   3504   52.1 %
  30 Slow Chess 2.4 popc      : 3380    5    5 12000    43.7 %   3433   52.3 %
  31 Ethereal 12.50 popc      : 3367    6    6  7000    45.8 %   3406   55.2 %
  32 RubiChess 2.0 avx2       : 3360    6    6 11000    29.4 %   3539   44.4 %
  33 Slow Chess 2.3 popc      : 3352    5    5 12000    42.9 %   3409   54.0 %
  34 Ethereal 12.25 pext      : 3346    6    6  9000    36.3 %   3465   50.3 %
  35 Slow Chess 2.2 popc      : 3338    6    6  8000    33.6 %   3480   46.2 %
  36 Pedone 3 avx2            : 3335    6    6  8000    33.4 %   3471   46.7 %
  37 Igel 2.9.0 popavx2       : 3333    5    5 11000    34.5 %   3457   50.1 %
  38 RubiChess 1.9dev nnue    : 3326    6    6  8000    31.8 %   3477   45.6 %
  39 Xiphos 0.6 bmi2          : 3316    4    4 20000    37.4 %   3422   49.1 %
  40 Fire 7.1 popc            : 3313    4    4 17000    33.9 %   3447   48.9 %
  41 Gogobello 3 avx2         : 3305    6    6  7000    40.5 %   3375   53.3 %
  42 Booot 6.5 popc           : 3303    7    7  8000    33.1 %   3450   40.5 %
  43 rofChade 2.3 bmi2        : 3276    6    6  9000    38.6 %   3364   49.3 %
  44 Minic 3.07               : 3207    7    7  7000    29.0 %   3369   43.1 %
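As a rough plausibility check of the table columns, the standard logistic Elo formula relates an engine's score percentage against its average opposition (Av.Op.) to an Elo difference. Ordo's maximum-likelihood model (with draw-rate and white-advantage handling) gives somewhat different numbers, so treat this only as an approximation; the helper name below is just for illustration.

```python
import math

def elo_from_score(score: float, avg_opponent: float) -> float:
    """Approximate rating from score fraction vs. average opponent rating,
    using the standard logistic Elo model (not Ordo's actual model)."""
    return avg_opponent + 400.0 * math.log10(score / (1.0 - score))

# Example row: Slow Chess 2.6 avx2 scored 42.6 % against an average opposition of 3541.
print(round(elo_from_score(0.426, 3541)))  # ~3489, in the neighborhood of the listed 3482
```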

The version numbers of the engines (180622, for example) are the date of the latest patch included in the Stockfish source code, not the release date of the engine file. The asmFish engines, especially, are often released much later!
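A small sketch of how such a YYMMDD version number can be decoded (the helper name is just for illustration):

```python
from datetime import date

def decode_version(version: str) -> date:
    """Decode a YYMMDD engine version number, e.g. '200807' -> 2020-08-07."""
    return date(2000 + int(version[:2]), int(version[2:4]), int(version[4:6]))

print(decode_version("200807"))  # 2020-08-07 (August 7, 2020)
print(decode_version("180622"))  # 2018-06-22
```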

Some engines use an NNUE net based on the evals of other engines. I decided to test these engines, too. As far as I know, the following engines use NNUE nets based on the evals of other engines (if I missed an engine, please contact me):

Fire 8.N, Fire 8.NN, Nemorino 6.00, Gogobello 3 (using a Stockfish-based NNUE net)

Fat Fritz 2, Stockfish 210615 (using an Lc0-based NNUE net)

Below you will find a diagram of the progress of Stockfish in my tests since August 2020.

And below that diagram, the older diagrams.

 

You can save the diagrams (as JPG pictures in original size) on your PC by right-clicking them and choosing "save image"...

The Elo ratings of older Stockfish dev-versions in the Ordo calculation can differ a little from the Elo "dots" in the diagram: when the games of a new Stockfish dev-version become part of the Ordo calculation, they can change the Elo ratings of the opponent engines, and that in turn can change the Elo ratings of older Stockfish dev-versions (in the Ordo calculation / rating list, but not in the diagram, where each Elo "dot" is the rating of one Stockfish dev-version at the moment its testrun was finished).

 

 

 

 

 

 

