Stefan Pohl Computer Chess

Private website for chess-engine tests


Latest Website-News (2021/09/22): AB-testrun of Koivisto 6.16 finished: +38 Elo compared to Koivisto 6. (6.16 is an unofficial version. Binaries by Ipman can be downloaded here; downloads from MEGA are at your own risk.)

Next NN-testrun: Longtime "SuFi for the poor" testrun of Stockfish 210915 vs. Lc0 0.28.0 610062

Next AB-testrun and regression-testrun: Stockfish 210921

 

Stay tuned.


Stockfish Regression testing (30000 games (20sec+200ms) vs Stockfish 14 210702)

Latest testrun:

Stockfish 210915:  (+2474,=26618,-908)= 52.6% = +18.3 Elo (+0.5 Elo to previous test)

Best testrun so far:

Stockfish 210827:  (+2621,=26421,-958)= 52.8% = +19.4 Elo (+3.4 Elo to previous best)

See all results, get more information and download the games via the yellow link above.
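The percentages and Elo differences in the result lines above can be approximated from the win/draw/loss counts. The following is a minimal sketch, assuming the standard logistic Elo formula; the function names are illustrative and not part of the test setup. The figures on this site may be computed differently (for example by Ordo), so deviations of a few tenths of an Elo point are to be expected.

```python
import math


def score_fraction(wins: int, draws: int, losses: int) -> float:
    """Return the score as a fraction: a win counts 1, a draw 0.5."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def elo_difference(score: float) -> float:
    """Convert a score fraction (0 < score < 1) into an Elo difference.

    Assumes the logistic model: score = 1 / (1 + 10 ** (-diff / 400)).
    """
    return 400.0 * math.log10(score / (1.0 - score))


# Latest regression testrun from the text: Stockfish 210915 (+2474,=26618,-908)
latest = score_fraction(2474, 26618, 908)
# Best regression testrun from the text: Stockfish 210827 (+2621,=26421,-958)
best = score_fraction(2621, 26421, 958)

for name, score in (("210915", latest), ("210827", best)):
    print(f"Stockfish {name}: {score:.1%}  {elo_difference(score):+.1f} Elo")
```

With the counts above, the score percentages match the listed 52.6% and 52.8%, and the Elo differences come out close to the listed +18.3 and +19.4.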


Stockfish testing

 

Playing conditions:

 

Hardware: Since 20/07/21 AMD Ryzen 3900 12-core (24 threads) notebook with 32GB RAM. 

Speed (single thread, TurboBoost mode switched off, chess starting position): Stockfish 11: 1.3 million nodes/s, Komodo 14: 1.1 million nodes/s

Hash: 256MB per engine

GUI: cutechess-cli (the GUI ends the game when a 5-piece endgame is on the board)

Tablebases: None for engines, 5-piece Syzygy for cutechess-cli

Openings: HERT_500 testset (by Thomas Zipproth) (download the file in the "Download & Links" section or here)

Ponder, Large Memory Pages & learning: Off

Thinking time: 180''+1000ms (= 3'+1'') per game/engine (average game duration: around 7.5 minutes). One 7000-game testrun takes about 2 days.

The version numbers of the Stockfish engines are the date of the latest patch included in the Stockfish source code, written backwards (year, month, day), not the release date of the engine file (example: 200807 = August 7, 2020). The SF compile used is the AVX2 compile, which is the fastest on my AMD Ryzen CPU. SF binaries are taken from abrok.eu (except the official SF release versions, which are taken from the official Stockfish website).
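The playing conditions listed above could be passed to cutechess-cli roughly as follows. This is only a sketch under stated assumptions, not the command actually used for these tests: all file names and paths are hypothetical placeholders, the opening file format (pgn) is assumed, and options such as the number of concurrent games or colour-reversed repeats are left out because they are not stated here. Ponder is off by default in cutechess-cli, so it needs no extra option.

```python
from typing import List


def build_cutechess_args(engine_a: str, engine_b: str, openings: str,
                         syzygy_dir: str, games: int) -> List[str]:
    """Assemble a cutechess-cli argument list for one engine pairing."""
    return [
        "cutechess-cli",
        "-engine", f"cmd={engine_a}",
        "-engine", f"cmd={engine_b}",
        # Settings shared by both engines: UCI protocol,
        # 180 s base time + 1 s increment, 256 MB hash.
        "-each", "proto=uci", "tc=180+1", "option.Hash=256",
        # Tablebases are given to cutechess-cli only (for adjudication),
        # limited to positions with 5 pieces or fewer.
        "-tb", syzygy_dir,
        "-tbpieces", "5",
        # Opening set; format=pgn is an assumption about the HERT_500 file.
        "-openings", f"file={openings}", "format=pgn",
        "-games", str(games),
        "-pgnout", "games.pgn",
    ]


# Hypothetical paths and game count, for illustration only.
args = build_cutechess_args("./stockfish", "./komodo", "HERT_500.pgn",
                            "/path/to/syzygy", 1000)
print(" ".join(args))
```

Building the argument list in one place keeps the settings identical for every pairing, which matters when results from many testruns end up in one rating list.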

Download BrainFish (and the Cerebellum-Libraries) here

 

To avoid distortions in the Ordo Elo calculation, from now on only 3x Stockfish (the latest official release + the latest 2 dev-versions) and 1x BrainFish are stored in the gamebase (the games of all older engine versions are deleted every time a new version has been tested). Older Elo results of Stockfish and BrainFish can still be seen in the Elo diagrams below. BrainFish always plays with the latest Cerebellum-Libraries, of course, because otherwise BrainFish = Stockfish.

 

Latest update: 2021/09/22: Koivisto 6.16 (+38 Elo to Koivisto 6)

 

(Ordo-calculation fixed to Stockfish 14 = 3757 Elo)

 

See the individual statistics of engine-results here

Download the current gamebase here

Download the complete game-archive here

     Program                    Elo    +    -   Games   Score   Av.Op.  Draws

   1 Stockfish 210915 avx2    : 3777    7    7  7000    74.8 %   3574   49.5 %
   2 Stockfish 210910 avx2    : 3773    7    7  7000    74.4 %   3574   50.3 %
   3 Stockfish 14 210702      : 3757    5    5 13000    73.4 %   3566   51.2 %
   4 KomodoDragon 2.0 avx2    : 3648    5    5 13000    61.2 %   3562   58.4 %
   5 Fire 8.NN avx2           : 3608    5    5 12000    54.2 %   3578   62.5 %
   6 KomodoDragon 2.0 MCTS    : 3575    6    6  8000    54.1 %   3547   62.8 %
   7 Fire 8.NN MCTS avx2      : 3573    6    6  8000    53.1 %   3552   63.3 %
   8 Ethereal 13.25 nnue      : 3531    5    5 12000    42.0 %   3595   58.9 %
   9 Slow Chess 2.7 avx2      : 3523    5    5 10000    40.5 %   3600   59.6 %
  10 Koivisto 6.16 avx2       : 3496    5    5  9000    39.6 %   3574   61.2 %
  11 RubiChess 2.2 avx2       : 3483    5    5 14000    39.3 %   3570   54.6 %
  12 Revenge 1.0 avx2         : 3472    5    5 15000    39.6 %   3557   52.0 %
  13 Koivisto 6 avx2          : 3458    4    4 14000    47.4 %   3477   52.8 %
  14 Nemorino 6.00 avx2       : 3443    4    4 15000    56.3 %   3398   47.4 %
  15 Igel 3.0.5 popavx2       : 3408    5    5 11000    58.0 %   3349   53.3 %
  16 Seer 2.3.0 avx           : 3366    5    5 12000    54.9 %   3330   52.0 %
  17 Berserk 4.5.1 avx2       : 3346    5    5 12000    51.7 %   3334   46.1 %
  18 Arasan 23.0.1 avx2       : 3341    6    6  8000    52.0 %   3326   51.0 %
  19 Gogobello 3 avx2         : 3303    5    5 12000    46.0 %   3333   53.3 %
  20 Weiss 2.0 popc           : 3274    5    5 10000    44.8 %   3312   48.6 %
  21 Minic 3.13 znver3        : 3250    6    6  7000    43.8 %   3295   50.2 %
  22 Lc0 0.28.0 744706        : 3234    6    6  7000    40.5 %   3304   44.7 %
  23 Chiron 5 x64             : 3222    5    5 12000    34.7 %   3340   39.8 %
  24 Clover 2.4 avx2          : 3210    5    5 10000    34.3 %   3330   42.2 %

The version numbers of the engines (180622, for example) are the date of the latest patch included in the Stockfish source code, not the release date of the engine file. The asmFish engines in particular are often released much later!
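The version numbers can be decoded mechanically. A minimal sketch, assuming every version number has exactly six digits in year, month, day order and that all two-digit years belong to the 2000s; the function name is illustrative only.

```python
from datetime import date, datetime


def version_to_date(version: str) -> date:
    """Parse a six-digit version number in yymmdd order into a date."""
    return datetime.strptime(version, "%y%m%d").date()


# 180622 and 200807 are the examples given in the text,
# 210915 is the latest Stockfish dev-version in the rating list.
for version in ("180622", "200807", "210915"):
    print(version, "->", version_to_date(version).isoformat())
```

Official releases such as "Stockfish 14 210702" carry the same kind of date suffix after the release number, so only the six-digit part should be passed to the function.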

Some engines use an nnue-net based on evals of other engines. I decided to test these engines, too. As far as I know, the following engines use nnue-nets based on evals of other engines (if I missed an engine, please contact me):

Fire 8.NN, Nemorino 6.00, Gogobello 3 (using Stockfish-based nnue nets)

Stockfish since 210615 (using Lc0-based nnue nets)

Below you find a diagram of the progress of Stockfish in my tests since August 2020.

Below that diagram are the older diagrams.

 

You can save the diagrams (as JPG pictures in original size) on your PC by right-clicking them and choosing "save image".

The Elo ratings of older Stockfish dev-versions in the Ordo calculation can differ slightly from the Elo "dots" in the diagram. When the games of a new Stockfish dev-version become part of the Ordo calculation, they can change the Elo ratings of the opponent engines, and that in turn can change the Elo ratings of older Stockfish dev-versions in the rating list. The diagram is not affected: each Elo "dot" shows the rating of one Stockfish dev-version at the moment the testrun of that dev-version was finished.

 

 

 

 

 

 

