Stefan Pohl Computer Chess

Private website for chess engine tests


Latest website news (2020/12/02): The NN testrun of Lc0 0.26.3 730517 has finished. See the result and download the games in the "NN vs SF testing" section. Next NN testrun: Lc0 0.26.3 J96-28.

 

Release of my new Unbalanced Human Openings. Learn more in the "Unbalanced Human Openings" section and download them right here.

 

At the moment, I am working on a new version V2.00 of the UHO openings. For this, all (nearly) 400000 end positions of the raw database are being re-evaluated by KomodoDragon 1.0 (15 seconds per position on a quad-core PC). V1.00 was evaluated by Komodo 14. Because KomodoDragon is around 200 Elo stronger and the NNUE net improves its positional understanding even further, the new evaluation promises much better and more valid results. It is not clear, however, whether this will also lead to better results of the UHO openings in testing. I hope so. If the testing results of UHO V2.00 are better than those of UHO V1.00 and no unexpected problems appear, UHO V2.00 will be released in Q1/2021. I need around 2 months for the evaluation of all end positions. After that, a lot of pre-tests are needed to find the eval interval that gives the best results, and then the final testruns have to be done... A lot of work!
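The estimate of around 2 months can be checked with simple arithmetic. The sketch below assumes that the positions are evaluated strictly one after another and rounds the count up to 400000. The function name and the `parallel_workers` parameter are illustrative only and are not taken from the actual evaluation setup.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def evaluation_days(positions, seconds_per_position, parallel_workers=1):
    """Wall-clock days needed to evaluate all positions.

    parallel_workers is a hypothetical knob: 1 means one position at a time.
    """
    total_seconds = positions * seconds_per_position / parallel_workers
    return total_seconds / SECONDS_PER_DAY

# Figures from the text: (nearly) 400000 positions, 15 seconds each.
days = evaluation_days(400_000, 15)
print(f"{days:.1f} days")  # 69.4 days
```

With one position at a time this comes to roughly 69 days, a little more than 2 months.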

 

Stay tuned.


Stockfish testing

 

Playing conditions:

 

Hardware: Since 2020/07/21, an AMD Ryzen 3900 12-core (24 threads) notebook with 32GB RAM. Now, 20 games are played simultaneously (!), so from now on each testrun will have 6000 or 7000 games (instead of 5000 before) and will take only 2 days, not 6-7 days as before! From now on, all engine binaries are popcount/avx2, of course, because bmi2 compiles are extremely slow on AMD. To keep the engine names in the rating list consistent, the "bmi2" or "pext" extension is still used in the names of older engines. Otherwise, ORDO would not count all games played by such an engine as games of one and the same engine.

Speed: (single thread, TurboBoost mode switched off, chess starting position) Stockfish: 1.3 Mn/s, Komodo: 1.1 Mn/s (Mn/s = million nodes per second)

Hash: 256MB per engine

GUI: cutechess-cli (the GUI ends a game when a 5-piece endgame is on the board)

Tablebases: None for engines, 5-piece Syzygy for cutechess-cli

Openings: HERT_500 testset (by Thomas Zipproth) (download the file in the "Download & Links" section or here)

Ponder, Large Memory Pages & learning: Off

Thinking time: 180''+1000ms (= 3'+1'') per game/engine (average game duration: around 7.5 minutes). One 7000-game testrun takes about 2 days. The version numbers of the Stockfish engines are the date of the latest patch that was included in the Stockfish source code (not the release date of the engine file), written backwards as year, month, day (example: 200807 = August 7, 2020). The SF compile used is the AVX2 compile, which is the fastest on my AMD Ryzen CPU. SF binaries are taken from abrok.eu (except the official SF release versions, which are taken from the official Stockfish website).

Download BrainFish (and the Cerebellum libraries) here

 

To avoid distortions in the Ordo Elo calculation, from now on only 3x Stockfish (the latest official release + the latest 2 dev versions) and 1x BrainFish are stored in the gamebase (the games of all older engine versions are deleted every time a new version has been tested). Older Elo results of Stockfish and BrainFish can still be seen in the Elo diagrams below. BrainFish always plays with the latest Cerebellum libraries, of course, because otherwise BrainFish = Stockfish.

 

Latest update: 2020/11/29: Stockfish 201126 avx2 (-1 Elo to Stockfish 201115)

 

(Ordo-calculation fixed to Stockfish 12 = 3684 Elo)
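Elo ratings only define differences between engines, so fixing one engine to a chosen value amounts to shifting the whole list by a constant. The sketch below illustrates that idea only. It does not call Ordo, the function name is hypothetical, and the relative values are simply the differences to Stockfish 12 taken from the rating list.

```python
def anchor_ratings(ratings, anchor_name, anchor_elo):
    """Shift all ratings by one constant so the anchor gets anchor_elo.

    Rating differences between engines stay exactly the same.
    """
    offset = anchor_elo - ratings[anchor_name]
    return {name: elo + offset for name, elo in ratings.items()}

# Relative ratings (illustrative): differences to Stockfish 12 in the list.
relative = {
    "Stockfish 12 200902": 0,
    "KomodoDragon 1.0 avx2": -32,
    "Stockfish 11 200118": -120,
}

anchored = anchor_ratings(relative, "Stockfish 12 200902", 3684)
for name, elo in anchored.items():
    print(f"{name}: {elo}")
```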

 

See the individual statistics of engine-results here

Download the current gamebase here

 

     Program                      Elo    +    -   Games   Score   Av.Op.  Draws

   1 CFish 12 3xCerebellum      : 3726    8    8  7000    86.1 %   3389   27.3 %
   2 Stockfish 201108 avx2      : 3724    8    8  7000    81.6 %   3442   35.5 %
   3 Stockfish 201115 avx2      : 3723    8    8  7000    78.3 %   3473   41.9 %
   4 Stockfish 201126 avx2      : 3722    8    8  7000    78.1 %   3473   42.6 %
   5 CFish 12 avx2              : 3703    9    9  7000    84.6 %   3389   29.1 %
   6 Stockfish 12 200902        : 3684    4    4 23000    76.6 %   3448   41.9 %
   7 KomodoDragon 1.0 avx2      : 3652    7    7 10000    70.7 %   3471   46.0 %
   8 SF 200910 miniNNue avx2    : 3616    7    7  7000    72.1 %   3437   43.2 %
   9 Stockfish 200731 popc      : 3601    7    7  7000    80.5 %   3345   36.2 %
  10 Stockfish 11 200118        : 3564    5    5 17000    69.5 %   3403   41.6 %
  11 Stockfish 10 181129        : 3525    5    5 15000    78.5 %   3288   37.7 %
  12 KomodoDragon 1.0 MCTS      : 3479    6    6  7000    57.7 %   3424   56.9 %
  13 Stockfish 9 180201         : 3475    9    9  5000    74.9 %   3273   41.7 %
  14 Komodo 14.1 x64            : 3454    6    6  9000    51.9 %   3446   53.1 %
  15 Komodo 14 bmi2             : 3445    4    4 20000    52.1 %   3433   51.4 %
  16 Nemorino 6.00 avx2         : 3441    4    4 18000    45.8 %   3481   50.1 %
  17 Houdini 6 pext             : 3440    2    2 49000    56.3 %   3394   46.5 %
  18 Komodo 13.3 bmi2           : 3438    6    6  8000    62.8 %   3342   49.9 %
  19 Komodo 13.1 bmi2           : 3425    5    5 11000    62.0 %   3334   48.8 %
  20 Komodo 12.3 bmi2           : 3412    7    7  7000    62.7 %   3314   49.4 %
  21 Ethereal 12.75 avx2        : 3399    5    5 17000    39.8 %   3489   48.4 %
  22 Ethereal 12.62 avx2        : 3391    6    6  8000    49.1 %   3402   54.6 %
  23 Slow Chess 2.4 popc        : 3372    5    5 13000    34.2 %   3511   45.2 %
  24 Ethereal 12.50 popc        : 3356    6    6  8000    46.9 %   3386   55.5 %
  25 Slow Chess 2.3 popc        : 3345    4    4 15000    43.2 %   3402   52.4 %
  26 Komodo 14 MCTS             : 3340    7    7  5000    44.4 %   3385   53.4 %
  27 Ethereal 12.25 pext        : 3338    5    5 12000    35.2 %   3471   46.4 %
  28 Slow Chess 2.2 popc        : 3329    5    5 11000    32.9 %   3483   42.7 %
  29 RubiChess 1.9dev nnue      : 3320    6    6 10000    27.2 %   3522   40.0 %
  30 Ethereal 12.00 pext        : 3317    6    6  9000    43.1 %   3371   50.8 %
  31 Igel 2.8.0 popavx2         : 3315    6    6  8000    37.1 %   3419   48.6 %
  32 Ethereal 11.75 pext        : 3309    6    6  9000    39.3 %   3392   53.2 %
  33 Xiphos 0.6 bmi2            : 3304    3    3 33000    35.8 %   3427   48.0 %
  34 Fire 7.1 popc              : 3301    3    3 42000    42.0 %   3371   50.7 %
  35 Xiphos 0.5.6 bmi2          : 3288    7    7  7000    41.2 %   3356   54.6 %
  36 Minic 2.51 nasc_nutr       : 3284    6    6  7000    31.2 %   3437   45.0 %
  37 Ethereal 11.53 pext        : 3281    7    7  7000    42.2 %   3342   53.4 %
  38 Komodo 12.3 MCTS           : 3276    7    7  7000    42.7 %   3334   46.3 %
  39 Ethereal 11.25 pext        : 3271    7    7  6000    38.4 %   3362   51.0 %
  40 rofChade 2.3 bmi2          : 3258    5    5 11000    33.8 %   3388   47.5 %
  41 Booot 6.4 popc             : 3245    7    7  6000    31.1 %   3394   46.5 %
  42 Schooner 2.2 popc          : 3242    7    7  6000    31.3 %   3391   50.3 %
  43 Laser 1.7 bmi2             : 3218    8    8  6000    30.8 %   3371   45.8 %
  44 Fizbo 2 bmi2               : 3213    8    8  5000    36.0 %   3325   39.0 %
  45 Fritz 17                   : 3212    7    7  6000    29.4 %   3377   44.2 %
  46 Shredder 13 x64            : 3210    8    8  6000    31.9 %   3359   42.6 %
  47 RubiChess 1.8 popc         : 3208    6    6  7000    32.0 %   3345   46.1 %
  48 Defenchess 2.2 popc        : 3206    8    8  5000    26.6 %   3394   41.8 %
  49 Booot 6.3.1 popc           : 3200    9    9  5000    34.0 %   3328   44.1 %
  50 Andscacs 0.95 popc         : 3168    8    8  5000    23.1 %   3391   35.4 %
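The rating list above is plain text and can be read by a small script. The sketch below parses one row and compares the played score with the score that the standard logistic Elo formula predicts against the average opponent rating. This is only a rough plausibility check: Ordo fits the ratings over all individual games, so the two numbers are not expected to match exactly, least of all for engines whose opponents differ widely in strength. The names `Row`, `parse_row` and `expected_score` are illustrative.

```python
import re
from typing import NamedTuple, Optional

class Row(NamedTuple):
    rank: int
    name: str
    elo: int
    plus: int
    minus: int
    games: int
    score_pct: float
    avg_opp: int
    draws_pct: float

# Rank, name, ':', Elo, +, -, games, score %, average opponent, draws %
ROW_RE = re.compile(
    r"^\s*(\d+)\s+(.+?)\s*:\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+([\d.]+)\s*%\s+(\d+)\s+([\d.]+)\s*%\s*$"
)

def parse_row(line: str) -> Optional[Row]:
    """Return the parsed row, or None for headers and other lines."""
    m = ROW_RE.match(line)
    if m is None:
        return None
    rank, name, elo, plus, minus, games, score, avg_opp, draws = m.groups()
    return Row(int(rank), name, int(elo), int(plus), int(minus),
               int(games), float(score), int(avg_opp), float(draws))

def expected_score(elo: float, opponent_elo: float) -> float:
    """Standard logistic Elo expectation (a fraction between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((opponent_elo - elo) / 400))

line = "  15 Komodo 14 bmi2             : 3445    4    4 20000    52.1 %   3433   51.4 %"
row = parse_row(line)
predicted = 100 * expected_score(row.elo, row.avg_opp)
print(f"{row.name}: {predicted:.1f} % predicted, {row.score_pct} % played")
```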

The version numbers of the engines (180622, for example) are the date of the latest patch that was included in the Stockfish source code, not the release date of the engine file. The asmFish engines in particular are often released much later!
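The six-digit version numbers can be turned back into dates mechanically. The sketch below assumes that the first standalone six-digit number in an engine name is such a year-month-day stamp, which holds for the Stockfish dev versions in the list but not for other six-digit numbers (net IDs, for example). The function name is illustrative.

```python
import re
from datetime import date, datetime
from typing import Optional

def patch_date(engine_name: str) -> Optional[date]:
    """Decode a YYMMDD stamp in an engine name, e.g. 'Stockfish 201126 avx2'.

    Returns None if the name has no standalone six-digit number or if
    that number is not a valid date. Only meaningful for names that
    really use this scheme.
    """
    match = re.search(r"\b(\d{6})\b", engine_name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%y%m%d").date()
    except ValueError:
        return None

print(patch_date("Stockfish 201126 avx2"))  # 2020-11-26
print(patch_date("Komodo 14 bmi2"))         # None
```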

Below you find a diagram of the progress of Stockfish in my tests since August 2020

And below that diagram, the older diagrams.

 

You can save the diagrams (as a JPG picture in original size) on your PC by right-clicking on them and choosing "save image"...

The Elo ratings of older Stockfish dev versions in the Ordo calculation can differ a little from the Elo "dots" in the diagram. When the games of a new Stockfish dev version become part of the Ordo calculation, they can change the Elo ratings of the opponent engines, and that in turn can change the Elo ratings of older Stockfish dev versions. This happens only in the Ordo calculation / rating list, not in the diagram, where each Elo "dot" is the rating of one Stockfish dev version at the moment when the testrun of that version was finished.

 

 

 

 

 

 

