Stefan Pohl Computer Chess

Private website for chess engine tests


Latest website news (2020/10/19): NN-testrun of Lc0 0.26.3 730164 (first testrun of a T73 net) finished. Check the result and download the games in the "NN vs SF testing" section.

AB-testrun of Stockfish 201014 finished - no progress.

 

Stay tuned.


Stockfish testing

 

Playing conditions:

 

Hardware: Since 20/07/21, an AMD Ryzen 3900 12-core (24 threads) notebook with 32GB RAM. 20 games are now played simultaneously (!), so each testrun now has 6000 or 7000 games (instead of 5000 before) and takes only 2 days, not 6-7 days as before! All engine binaries are now popcount/avx2 compiles, of course, because bmi2 compiles are extremely slow on AMD. To keep the engine names in the rating list consistent, the "bmi2" or "pext" extension is still used in the names of older engines; otherwise ORDO would not count all games played by such an engine as games of one engine.

Speed: (single thread, TurboBoost mode switched off, chess starting position) Stockfish: 1.3 mn/s, Komodo: 1.1 mn/s (mn/s = million nodes per second)

Hash: 256MB per engine

GUI: Cutechess-cli (the GUI ends the game when a 5-piece endgame is on the board)

Tablebases: None for engines, 5 Syzygy for cutechess-cli

Openings: HERT_500 testset (by Thomas Zipproth) (download the file at the "Download & Links"-section or here)

Ponder, Large Memory Pages & learning: Off

Thinking time: 180''+1000ms (= 3'+1'') per game/engine (average game duration: around 7.5 minutes). One 7000-game testrun takes about 2 days.

The version numbers of the Stockfish engines are the date of the latest patch that was included in the Stockfish source code, written backwards (year, month, day), not the release date of the engine file (example: 200807 = August 7, 2020). Since Stockfish supports NNUE, the engine name is shortened to "SF" ("BF" = BrainFish), because the engine name has to include not only the date but also the name of the NNUE net, which makes the complete engine name very long. The SF compile used is the AVX2 compile, which is the fastest on my AMD Ryzen CPU.
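The "about 2 days" figure can be checked with a little arithmetic. The following is only a rough sketch: it takes the numbers stated in the playing conditions (7000 games, around 7.5 minutes per game, 20 games played simultaneously) and assumes that all 20 slots stay busy for the whole testrun. The function name is made up for this illustration.

```python
def estimate_testrun_days(games: int, avg_game_minutes: float, concurrency: int) -> float:
    """Rough wall-clock estimate in days.

    Assumption: all `concurrency` game slots are busy the whole time,
    so the total playing time is simply divided by the number of slots.
    """
    if games < 0 or avg_game_minutes < 0 or concurrency < 1:
        raise ValueError("games and duration must be >= 0, concurrency >= 1")
    total_minutes = games * avg_game_minutes / concurrency
    return total_minutes / (60 * 24)


if __name__ == "__main__":
    # Numbers from the playing conditions: 7000 games, ~7.5 min each, 20 at once.
    days = estimate_testrun_days(games=7000, avg_game_minutes=7.5, concurrency=20)
    print(f"Estimated testrun duration: {days:.2f} days")
```

With these inputs the estimate comes out a little under 2 days, which fits the stated duration.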

Download BrainFish (and the Cerebellum-Libraries) here

 

To avoid distortions in the Ordo Elo calculation, only 3x Stockfish (the latest official release + the latest 2 dev-versions) and 1x BrainFish are now stored in the gamebase (the games of all older engine versions are deleted every time a new version has been tested). Older Elo results of Stockfish and BrainFish can still be seen in the Elo diagrams below. BrainFish always plays with the latest Cerebellum-Libraries, of course, because otherwise BrainFish = Stockfish.

 

Latest update: 2020/10/18: Stockfish 201014 (-2 Elo to Stockfish 200928)

 

(Ordo-calculation fixed to Stockfish 12 = 3684 Elo)

 

See the individual statistics of engine-results here

See the ORDO-rating of the archive-gamebase since 2020 here

Download the current gamebase here

Download the archive-gamebase since 2020 here

 

     Program                      Elo    +    -   Games   Score   Av.Op.  Draws

   1 CFish 12 3xCerebellum      : 3722    9    9  7000    86.1 %   3384   27.3 %
   2 Stockfish 200928 avx2      : 3716    8    8  7000    82.8 %   3412   32.8 %
   3 Stockfish 201014 avx2      : 3714    8    8  7000    81.4 %   3433   35.8 %
   4 Stockfish 200921 avx2      : 3708    8    8  7000    82.6 %   3407   32.5 %
   5 CFish 12 avx2              : 3699    8    8  7000    84.6 %   3384   29.1 %
   6 Stockfish 12 200902        : 3684    5    5 18000    77.0 %   3441   41.2 %
   7 BrainFish-2 200724         : 3641    8    8  7000    84.1 %   3341   31.0 %
   8 SF 200910 miniNNue avx2    : 3612    7    7  7000    72.1 %   3433   43.2 %
   9 Stockfish 200731 popc      : 3597    8    8  7000    80.5 %   3341   36.2 %
  10 Stockfish 11 200118        : 3560    5    5 17000    69.5 %   3399   41.6 %
  11 Stockfish 10 181129        : 3520    6    6 15000    78.5 %   3284   37.7 %
  12 Stockfish 9 180201         : 3471    8    8  5000    74.9 %   3268   41.7 %
  13 Komodo 14 bmi2             : 3441    4    4 24000    46.7 %   3473   49.1 %
  14 Houdini 6 pext             : 3436    3    3 45000    56.5 %   3388   45.7 %
  15 Nemorino 6.00 avx2         : 3436    5    5 11000    49.8 %   3444   49.7 %
  16 Komodo 13.3 bmi2           : 3434    6    6  8000    62.8 %   3338   49.9 %
  17 Komodo 13.1 bmi2           : 3421    5    5 11000    62.0 %   3329   48.8 %
  18 Komodo 12.3 bmi2           : 3408    7    7  7000    62.7 %   3310   49.4 %
  19 Ethereal 12.75 avx2        : 3391    6    6 10000    43.0 %   3454   49.4 %
  20 Ethereal 12.62 avx2        : 3388    6    6  9000    45.3 %   3434   51.7 %
  21 Ethereal 12.50 popc        : 3352    6    6  9000    43.0 %   3419   52.0 %
  22 Slow Chess 2.3 popc        : 3342    4    4 17000    38.0 %   3450   46.9 %
  23 Komodo 14 MCTS             : 3336    7    7  5000    44.4 %   3381   53.4 %
  24 Ethereal 12.25 pext        : 3333    5    5 13000    33.5 %   3480   44.9 %
  25 Slow Chess 2.2 popc        : 3325    5    5 12000    31.3 %   3493   41.5 %
  26 Ethereal 12.00 pext        : 3313    6    6  9000    43.1 %   3366   50.8 %
  27 Igel 2.8.0 popavx2         : 3310    6    6  7000    38.3 %   3407   48.2 %
  28 Ethereal 11.75 pext        : 3304    5    5  9000    39.3 %   3388   53.2 %
  29 Xiphos 0.6 bmi2            : 3299    3    3 32000    33.7 %   3445   45.6 %
  30 Fire 7.1 popc              : 3296    3    3 42000    40.7 %   3381   48.9 %
  31 Xiphos 0.5.6 bmi2          : 3284    7    7  7000    41.2 %   3352   54.6 %
  32 Minic 2.51 nasc_nutr       : 3279    7    7  7000    31.2 %   3433   45.0 %
  33 Ethereal 11.53 pext        : 3276    7    7  7000    42.2 %   3338   53.4 %
  34 Komodo 12.3 MCTS           : 3271    7    7  7000    42.7 %   3329   46.3 %
  35 Ethereal 11.25 pext        : 3267    8    8  6000    38.4 %   3358   51.0 %
  36 rofChade 2.3 bmi2          : 3255    5    5 12000    31.9 %   3405   45.4 %
  37 Booot 6.4 popc             : 3240    7    7  6000    31.1 %   3390   46.5 %
  38 Schooner 2.2 popc          : 3237    7    7  6000    31.3 %   3386   50.3 %
  39 Laser 1.7 bmi2             : 3214    7    7  6000    30.8 %   3366   45.8 %
  40 Fizbo 2 bmi2               : 3209    8    8  5000    36.0 %   3321   39.0 %
  41 Fritz 17                   : 3208    8    8  6000    29.4 %   3372   44.2 %
  42 Shredder 13 x64            : 3205    8    8  6000    31.9 %   3354   42.6 %
  43 RubiChess 1.8 popc         : 3204    6    6  7000    32.0 %   3341   46.1 %
  44 Defenchess 2.2 popc        : 3201    8    8  5000    26.6 %   3390   41.8 %
  45 Booot 6.3.1 popc           : 3195    8    8  5000    34.0 %   3323   44.1 %
  46 Andscacs 0.95 popc         : 3163    8    8  5000    23.1 %   3386   35.4 %
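For readers who want to relate the "Elo", "Score" and "Av.Op." columns to each other, here is a small sketch of the standard logistic Elo formula. This is not how Ordo computes the list (Ordo fits all ratings together from the individual games), so the numbers will only roughly agree with the table. The function names are made up for this illustration.

```python
import math


def expected_score(elo: float, opponent_elo: float) -> float:
    """Expected score (0..1) against one opponent, standard logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((opponent_elo - elo) / 400.0))


def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a score fraction (inverse of expected_score)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    # Row 13 of the table: Komodo 14 bmi2, Elo 3441, Av.Op. 3473, Score 46.7 %.
    # Treating the average opponent as one single opponent is a simplification.
    approx = expected_score(3441, 3473)
    print(f"Expected score vs. average opponent: {approx * 100:.1f} %")
    print(f"Elo difference implied by 46.7 %: {elo_diff_from_score(0.467):+.0f}")
```

The small gap between the table value and this approximation is expected, because the score against many different opponents is not the same as the score against one opponent of average strength.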

The version numbers of the engines (180622, for example) are the date of the latest patch that was included in the Stockfish source code, not the release date of the engine file. The asmFish engines in particular are often released much later!
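As an illustration of the version-number scheme (year, month, day, two digits each), the following sketch turns such a number into a calendar date and pulls it out of an engine name from the rating list. The function names and the "six digits standing alone" rule are assumptions made for this example only.

```python
import re
from datetime import date, datetime
from typing import Optional


def patch_date(version: str) -> date:
    """Convert a version number like '200807' (yymmdd) into a date."""
    return datetime.strptime(version, "%y%m%d").date()


def extract_patch_date(engine_name: str) -> Optional[date]:
    """Return the patch date found in an engine name, or None.

    Assumption: the version number is the first group of exactly six
    digits standing alone in the name. Six-digit numbers that are not
    valid dates (an Lc0 net id, for example) give None.
    """
    match = re.search(r"\b(\d{6})\b", engine_name)
    if match is None:
        return None
    try:
        return patch_date(match.group(1))
    except ValueError:
        return None


if __name__ == "__main__":
    for name in ("Stockfish 201014 avx2", "Stockfish 12 200902", "Komodo 14 bmi2"):
        print(name, "->", extract_patch_date(name))
```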

Below is a diagram of the progress of Stockfish in my tests since August 2020, followed by the older diagrams.

 

You can save the diagrams (as JPG pictures in original size) on your PC by right-clicking them and choosing "save image".

The Elo ratings of older Stockfish dev-versions in the Ordo calculation can differ slightly from the Elo "dots" in the diagram. When the games of a new Stockfish dev-version become part of the Ordo calculation, they can change the Elo ratings of the opponent engines, and that in turn can change the Elo ratings of older Stockfish dev-versions. This affects the Ordo rating list only, not the diagram, where each Elo "dot" is the rating of one Stockfish dev-version at the moment its testrun was finished.

 

 

 

 

 

 

