Stefan Pohl Computer Chess

Private website for chess engine tests


Latest Website-News (2021/12/09): Ratinglist-testrun of Stash 32.0 finished. 

NN-testrun of Lc0 0.28.0 770578 finished (first testrun of a T77 net): see the result and download the games in the "NN vs SF testing" section.

 

Next ratinglist-testruns: Berserk 8 and Stockfish 211207 (plus regression-testrun)

Next NN-testrun: Lc0 0.28.0 780023 (the new T78 run trains large 40x512 nets - this is just a trial; if the result is too bad, I will abort this run...)

 

I received several requests for the download link of the super-large 40x512 experimental nets for Lc0. You can find the developer's GitHub site here

 

I made an autoplayer batch tool for automatically analyzing and playing games from one or more starting positions, for Germany's #1 chess YouTuber TheBigGreek. You can download it right here

If you speak German, I highly recommend his videos! Find him here

 

Stay tuned.


Stockfish Regression testing (20000 games (30sec+300ms) vs Stockfish 14.1 211028)

Latest testrun:

Stockfish 211129:  (+5365,=10582,-4053)= 53.3% = +23.0 Elo (+3.2 Elo to previous test)

Best testrun so far:

Stockfish 211129:  (+5365,=10582,-4053)= 53.3% = +23.0 Elo (+3.2 Elo to previous best)

See all results, get more information and download the games: Click on the yellow link above...
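
For reference, the Elo differences above follow from the score percentage via the usual logistic formula. A minimal sketch in Python (the numbers are taken from the testrun above; the quoted +23.0 Elo presumably comes from the score rounded to 53.3%, the exact score gives +22.8):

import math

def elo_diff(wins, draws, losses):
    # Convert a W/D/L result into a score fraction and an Elo difference.
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    return score, 400 * math.log10(score / (1 - score))

score, elo = elo_diff(5365, 10582, 4053)   # Stockfish 211129 vs Stockfish 14.1
print(f"{score:.1%} = {elo:+.1f} Elo")     # prints: 53.3% = +22.8 Elo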


Stockfish testing

 

Playing conditions:

 

Hardware: Since 2021/07/20, an AMD Ryzen 3900 12-core (24 threads) notebook with 32GB RAM.

Speed (single thread, TurboBoost mode switched off, chess starting position): Stockfish 11: 1.3 mn/s, Komodo 14: 1.1 mn/s (mn/s = million nodes per second)

Hash: 256MB per engine

GUI: cutechess-cli (the GUI ends the game when a 5-piece endgame is on the board)

Tablebases: none for the engines, 5-piece Syzygy for cutechess-cli

Openings: HERT_500 testset (by Thomas Zipproth) (download the file in the "Download & Links" section or here)

Ponder, Large Memory Pages & learning: Off

Thinking time: 180''+1000ms (= 3'+1'') per engine per game (average game duration: around 7.5 minutes). One 7000-game testrun takes about 2 days. The version numbers of the Stockfish engines are the date of the latest patch included in the Stockfish sourcecode, written as year, month, day (example: 200807 = August 7, 2020), not the release date of the engine file. The SF compile used is the AVX2 compile, which is the fastest on my AMD Ryzen CPU. SF binaries are taken from abrok.eu (except the official SF release versions, which are taken from the official Stockfish website). A sketch of a matching cutechess-cli call is shown below.
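
For readers who want to reproduce such a match themselves, the conditions above translate roughly into a cutechess-cli call like the one below. This is only a sketch under assumptions: the engine paths, the opening-file name, the number of rounds, the tablebase path and the Windows-style line continuation (^) are placeholders, not necessarily my actual setup.

cutechess-cli ^
 -engine cmd=stockfish_211129_avx2.exe name="Stockfish 211129 avx2" ^
 -engine cmd=dragon_2.5_avx2.exe name="KomodoDragon 2.5 avx2" ^
 -each proto=uci tc=180+1 option.Hash=256 ^
 -openings file=HERT_500.pgn format=pgn order=sequential -repeat -games 2 -rounds 250 ^
 -tb C:\syzygy5 ^
 -pgnout testrun.pgn

The -tb option lets cutechess-cli adjudicate games with the 5-piece Syzygy tablebases, while the engines themselves get no tablebases, as described above.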

Download BrainFish (and the Cerebellum libraries) here

 

To avoid distortions in the Ordo Elo calculation, from now on only 3x Stockfish (the latest official release plus the two latest dev versions) and 1x BrainFish are stored in the gamebase (the games of all older engine versions are deleted each time a new version has been tested). Older Elo results of Stockfish and BrainFish can still be seen in the Elo diagrams below. BrainFish always plays with the latest Cerebellum libraries, of course, because otherwise BrainFish = Stockfish.

 

Latest update: 2021/12/09: Stash 32.0 popc (first entry; non-NNUE engine)

 

(Ordo-calculation fixed to Stockfish 14.1 = 3780 Elo)
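
For anyone re-running the calculation on the downloaded gamebase: Ordo can anchor one player to a fixed rating. A hedged sketch of such a call (file names are placeholders; the exact set of options I use may differ, see the Ordo documentation):

ordo -p gamebase.pgn -A "Stockfish 14.1 211028" -a 3780 -s 1000 -o ratinglist.txt

Here -A names the anchor engine, -a fixes its rating (3780), -s runs simulations to estimate the +/- error margins, and -o writes the ratinglist to a text file.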

 

See the individual statistics of engine-results here

Download the current gamebase here

Download the complete game-archive here

     Program                    Elo    +    -   Games   Score   Av.Op.  Draws

   1 Stockfish 211129 avx2    : 3790    8    8  7000    74.7 %   3594   50.4 %
   2 Stockfish 14.1 211028    : 3780    7    7  8000    74.2 %   3588   51.0 %
   3 Stockfish 14 210702      : 3761    6    6 11000    75.2 %   3559   48.7 %
   4 KomodoDragon 2.5 avx2    : 3728    6    6 10000    63.7 %   3622   62.0 %
   5 KomodoDragon 2.5 MCTS    : 3653    6    6 11000    55.6 %   3613   64.6 %
   6 Fire 8.NN avx2           : 3611    6    6 11000    49.2 %   3618   62.5 %
   7 Stockfish final HCE      : 3577    6    6 12000    48.2 %   3592   57.6 %
   8 Fire 8.NN MCTS avx2      : 3571    6    6  9000    50.1 %   3572   62.2 %
   9 Slow Chess 2.8 avx2      : 3566    6    6 10000    43.4 %   3618   63.2 %
  10 Koivisto 7.5 avx2        : 3546    6    6 12000    42.2 %   3607   58.3 %
  11 Ethereal 13.25 nnue      : 3536    6    6 12000    38.2 %   3629   58.0 %
  12 Berserk 7 avx2           : 3514    5    5 11000    47.3 %   3538   54.6 %
  13 RubiChess 2.2 avx2       : 3486    5    5 12000    36.1 %   3596   54.5 %
  14 Revenge 1.0 avx2         : 3477    6    6 11000    40.3 %   3551   57.4 %
  15 Nemorino 6.00 avx2       : 3447    6    6 10000    51.8 %   3436   48.2 %
  16 Seer 2.4.0 avx2          : 3446    6    6  7000    48.7 %   3455   60.1 %
  17 Igel 3.0.5 popavx2       : 3408    6    6  9000    58.5 %   3343   54.7 %
  18 Arasan 23.2 avx2         : 3400    6    6  9000    59.0 %   3333   48.7 %
  19 Scorpio 3.0.14d cpu      : 3332    6    6 10000    53.3 %   3311   45.9 %
  20 Gogobello 3 avx2         : 3301    6    6  8000    51.0 %   3295   51.9 %
  21 Minic 3.17 znver3        : 3290    6    6  8000    47.1 %   3314   48.2 %
  22 Wasp 5.00 avx            : 3288    6    6  9000    53.1 %   3266   51.0 %
  23 Weiss 2.0 popc           : 3286    6    6 10000    48.5 %   3298   49.5 %
  24 Zahak 9.0 avx            : 3275    6    6  9000    52.4 %   3259   45.2 %
  25 Fritz 18 x64             : 3274    6    6 10000    53.2 %   3251   47.7 %
  26 Lc0 0.28.0 744706        : 3239    6    6 12000    44.1 %   3281   46.1 %
  27 Chiron 5 x64             : 3238    5    5 12000    42.4 %   3294   43.4 %
  28 Marvin 5.2 avx2          : 3232    6    6  9000    45.6 %   3264   48.6 %
  29 Danasah 9.0 avx2         : 3231    6    6 12000    45.1 %   3265   46.8 %
  30 Clover 2.4 avx2          : 3219    5    5 14000    40.5 %   3289   45.8 %
  31 Stash 32.0 popc          : 3211    7    7  7000    40.6 %   3280   44.0 %

The version numbers of the engines (180622, for example) are the date of the latest patch included in the Stockfish sourcecode, not the release date of the engine file. The asmFish engines in particular are often released much later! (Stockfish final HCE is Stockfish 200731, the latest version without a neural net, i.e. with HCE (Hand Crafted Evaluation). This engine is (and perhaps will stay forever?) the strongest HCE engine on the planet. IMHO this makes it very interesting for comparison.)

Some engines use an NNUE net based on evals of other engines. I decided to test these engines, too. As far as I know, the following engines use NNUE nets based on evals of other engines (if I missed an engine, please contact me):

Fire 8.NN, Nemorino 6.00, Gogobello 3 (using Stockfish-based nnue nets)

Stockfish since 210615 (using Lc0-based nnue nets)

Below you will find a diagram of the progress of Stockfish in my tests since August 2020.

And below that diagram, the older diagrams.

 

You can save the diagrams (as JPG pictures in original size) on your PC by right-clicking and then choosing "save image"...

The Elo ratings of older Stockfish dev versions in the Ordo calculation can differ slightly from the Elo "dots" in the diagram: when the results/games of a new Stockfish dev version become part of the Ordo calculation, they can change the Elo ratings of the opponent engines, and that in turn can change the Elo ratings of older Stockfish dev versions (in the Ordo calculation / ratinglist, but not in the diagram, where each Elo "dot" is the rating of one Stockfish dev version at the moment its testrun was finished).


