Stefan Pohl Computer Chess

private website for chessengine-tests


Latest Website-News (2021/10/20): The regression testrun and the ratinglist testrun of Stockfish 211017 are finished. Ratinglist testrun: -4 Elo compared to Stockfish 211009. Regression testrun: +0.5 Elo compared to Stockfish 211009 and -0.3 Elo compared to Stockfish 211015.

Next NN-testrun: Lc0 0.28.0 754042 (final net of T75 learn-run)

Next ratinglist-testrun: Berserk 6

 

 

Release of an updated version (V1.6) of my AntiDraw Openings Collection. What's new? Added a 6th (completely new) openings concept, called WOMM ("W"hite "O"ne "M"ore "M"ove)... Read more in the "Anti Draw Openings" section or download the new V1.6 right here

 

Stay tuned.


Stockfish Regression testing (30000 games (20sec+200ms) vs Stockfish 14 210702)

Latest testrun:

Stockfish 211017:  (+2547,=26492,-961)= 52.6% = +18.5 Elo (-0.3 Elo to previous test)

Best testrun so far:

Stockfish 210827:  (+2621,=26421,-958)= 52.8% = +19.4 Elo (+3.4 Elo to previous best)
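
The percentages and Elo values in these result lines can be roughly reproduced from the win/draw/loss counts. The sketch below assumes the standard logistic Elo formula. The function name is only illustrative, and the figures on this page may come from a different tool (for example Ordo), so differences of about 0.1 Elo are to be expected.

```python
import math

def score_and_elo(wins, draws, losses):
    """Return (score fraction, Elo difference) for a match result.

    Assumption: logistic Elo model, elo = 400 * log10(s / (1 - s)).
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    elo = 400.0 * math.log10(score / (1.0 - score))
    return score, elo

# Latest testrun from this page: Stockfish 211017 vs Stockfish 14 210702
score, elo = score_and_elo(2547, 26492, 961)
print(f"{score:.1%} = {elo:+.1f} Elo")
```

With the numbers of the latest testrun, this lands close to the 52.6% and +18.5 Elo listed above.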

See all results, get more information and download the games: Click on the yellow link above...


Stockfish testing

 

Playing conditions:

 

Hardware: Since 20/07/21 AMD Ryzen 3900 12-core (24 threads) notebook with 32GB RAM. 

Speed: (singlethread, TurboBoost-mode switched off, chess starting position) Stockfish 11: 1.3 mn/s, Komodo 14: 1.1 mn/s

Hash: 256MB per engine

GUI: Cutechess-cli (the GUI ends the game when a 5-piece endgame is on the board)

Tablebases: None for engines, 5 Syzygy for cutechess-cli

Openings: HERT_500 testset (by Thomas Zipproth) (download the file at the "Download & Links"-section or here)

Ponder, Large Memory Pages & learning: Off

Thinking time: 180''+1000ms (= 3'+1'') per game/engine (average game duration: around 7.5 minutes). One 7000-game testrun takes about 2 days.

The version numbers of the Stockfish engines are the date of the latest patch included in the Stockfish source code (not the release date of the engine file), written backwards (year, month, day). Example: 200807 = August 7, 2020. The SF compile used is the AVX2 compile, which is the fastest on my AMD Ryzen CPU. SF binaries are taken from abrok.eu (except the official SF release versions, which are taken from the official Stockfish website).
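
The thinking-time figures above (7000 games, around 7.5 minutes per game, about 2 days per testrun) imply that many games run in parallel. The number of concurrent games is not stated on this page. The small calculation below only infers a rough value from the three quoted numbers, so treat the result as an estimate.

```python
# Figures quoted in the playing conditions
GAMES_PER_RUN = 7000
AVG_GAME_MINUTES = 7.5
RUN_DAYS = 2

# Total playing time if all games were played one after another
total_game_hours = GAMES_PER_RUN * AVG_GAME_MINUTES / 60

# Games that must run at the same time to finish in RUN_DAYS (inferred, not stated)
parallel_games = total_game_hours / (RUN_DAYS * 24)

print(f"{total_game_hours:.0f} game-hours, about {parallel_games:.0f} games in parallel")
```

A value of this size would fit on the 12-core (24 threads) machine listed under Hardware, with one thread per engine.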

Download BrainFish (and the Cerebellum-Libraries) here

 

To avoid distortions in the Ordo Elo calculation, from now on only 3x Stockfish (the latest official release + the latest 2 dev-versions) and 1x BrainFish are stored in the gamebase (the games of all older engine versions are deleted every time a new version has been tested). Older Elo results of Stockfish and BrainFish can still be seen in the Elo diagrams below. BrainFish always plays with the latest Cerebellum-Libraries, of course, because otherwise BrainFish = Stockfish.

 

Latest update: 2021/10/20: Stockfish 211017 (-4 Elo to Stockfish 211009)

 

(Ordo-calculation fixed to Stockfish 14 = 3757 Elo)
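
Fixing the calculation to one engine means that all ratings are shifted by the same constant, so that the anchor engine gets the chosen value and the Elo differences between engines stay the same. The sketch below only illustrates this idea and is not Ordo's actual code. The function name is made up, and the relative values are the Elo differences to Stockfish 14 taken from the rating list below.

```python
ANCHOR_NAME = "Stockfish 14 210702"
ANCHOR_ELO = 3757

# Elo differences relative to the anchor engine (taken from the rating list)
relative_elo = {
    "Stockfish 211009 avx2": 22,
    "Stockfish 211017 avx2": 18,
    "Stockfish 14 210702": 0,
}

def anchor_ratings(ratings, anchor_name, anchor_elo):
    """Shift all ratings by one constant so the anchor gets anchor_elo."""
    offset = anchor_elo - ratings[anchor_name]
    return {name: elo + offset for name, elo in ratings.items()}

anchored = anchor_ratings(relative_elo, ANCHOR_NAME, ANCHOR_ELO)
for name, elo in anchored.items():
    print(f"{name:24s} {elo}")
```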

 

See the individual statistics of engine-results here

Download the current gamebase here

Download the complete game-archive here

     Program                    Elo    +    -   Games   Score   Av.Op.  Draws

   1 Stockfish 211009 avx2    : 3779    8    8  7000    74.9 %   3578   49.3 %
   2 Stockfish 211017 avx2    : 3775    8    8  7000    74.5 %   3578   50.2 %
   3 Stockfish 14 210702      : 3757    7    7 10000    75.1 %   3554   48.8 %
   4 KomodoDragon 2.5 avx2    : 3731    7    7  9000    64.6 %   3616   61.3 %
   5 KomodoDragon 2.5 MCTS    : 3654    7    7  9000    55.1 %   3616   63.5 %
   6 Fire 8.NN avx2           : 3612    6    6 11000    50.9 %   3606   62.3 %
   7 Fire 8.NN MCTS avx2      : 3570    7    7  7000    47.5 %   3591   61.6 %
   8 Ethereal 13.25 nnue      : 3535    6    6 11000    38.7 %   3624   57.2 %
   9 Slow Chess 2.7 avx2      : 3527    6    6 10000    37.1 %   3630   58.0 %
  10 Koivisto 6.16 avx2       : 3500    6    6 11000    34.1 %   3627   54.1 %
  11 RubiChess 2.2 avx2       : 3486    5    5 13000    36.8 %   3593   51.5 %
  12 Revenge 1.0 avx2         : 3474    5    5 10000    44.8 %   3516   55.4 %
  13 Nemorino 6.00 avx2       : 3448    5    5 12000    56.5 %   3402   48.3 %
  14 Igel 3.0.5 popavx2       : 3407    6    6  9000    63.3 %   3308   52.5 %
  15 Seer 2.3.0 avx           : 3370    6    6 11000    56.1 %   3325   52.0 %
  16 Berserk 4.5.1 avx2       : 3349    5    5 12000    51.1 %   3341   46.0 %
  17 Arasan 23.0.1 avx2       : 3344    6    6  9000    52.9 %   3323   51.2 %
  18 Gogobello 3 avx2         : 3301    6    6 11000    48.9 %   3309   53.6 %
  19 Minic 3.16 znver3        : 3298    7    7  7000    49.6 %   3301   51.5 %
  20 Weiss 2.0 popc           : 3276    6    6 11000    45.1 %   3312   48.2 %
  21 Lc0 0.28.0 744706        : 3236    6    6  9000    41.6 %   3297   45.1 %
  22 Chiron 5 x64             : 3231    6    6 10000    38.3 %   3318   41.5 %
  23 Danasah 9.0 avx2         : 3230    7    7  7000    44.2 %   3272   48.9 %
  24 Clover 2.4 avx2          : 3216    7    7  9000    36.7 %   3316   43.3 %
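
The columns Elo, Av.Op. and Score are connected: the difference between an engine's Elo and the average Elo of its opponents gives an expected score. The sketch below assumes the standard logistic Elo model and uses the average opponent rating as a stand-in for the individual opponents. That is only an approximation, so the result can differ from the Score column by a percentage point or so, especially for engines far above or below their opponents.

```python
def expected_score(elo, avg_opponent_elo):
    """Expected score fraction under the logistic Elo model (an assumption)."""
    return 1.0 / (1.0 + 10.0 ** (-(elo - avg_opponent_elo) / 400.0))

# (name, Elo, Av.Op., Score in %) copied from the rating list above
rows = [
    ("Stockfish 211009 avx2", 3779, 3578, 74.9),
    ("Fire 8.NN avx2", 3612, 3606, 50.9),
    ("Lc0 0.28.0 744706", 3236, 3297, 41.6),
]

for name, elo, av_op, listed in rows:
    estimate = 100.0 * expected_score(elo, av_op)
    print(f"{name:24s} listed {listed:.1f} %, estimated {estimate:.1f} %")
```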

The version numbers (180622, for example) of the engines are the date of the latest patch included in the Stockfish source code, not the release date of the engine file. The asmFish engines in particular are often released much later!
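
Such a version number can be turned into a calendar date with a few lines of code. The sketch below assumes that every version number has exactly six digits in the order year, month, day, as described on this page. The function name is only illustrative.

```python
from datetime import date, datetime

def patch_date(version: str) -> date:
    """Convert a six-digit version number (YYMMDD) into a date."""
    return datetime.strptime(version, "%y%m%d").date()

for version in ("180622", "200807", "211017"):
    print(version, "->", patch_date(version).isoformat())
```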

Some engines use an nnue-net based on evals of other engines. I decided to test these engines, too. As far as I know, the following engines use nnue-nets based on evals of other engines (if I missed an engine, please contact me):

Fire 8.NN, Nemorino 6.00, Gogobello 3 (using Stockfish-based nnue nets)

Stockfish since 210615 (using Lc0-based nnue nets)

Below you find a diagram of the progress of Stockfish in my tests since August 2020

And below that diagram, the older diagrams.

 

You can save the diagrams (as a JPG picture in original size) on your PC with a right mouse click, then choose "save image"...

The Elo ratings of older Stockfish dev-versions in the Ordo calculation can differ a little from the Elo "dots" in the diagram. When the games of a new Stockfish dev-version become part of the Ordo calculation, they can change the Elo ratings of the opponent engines, and that in turn can change the Elo ratings of older Stockfish dev-versions. This affects only the Ordo calculation / ratinglist, not the diagram, where each Elo "dot" is the rating of one Stockfish dev-version at the moment its testrun was finished.

 

 

 

 

 

 

