Stefan Pohl Computer Chess

private website for chess-engine tests


Latest Website-News (2022/01/15): Ratinglist-testrun of Minic 3.18 finished: +87 Elo to Minic 3.17

(The Rebel 14 testrun was aborted because Rebel 14 is too weak: clearly below 3200 Elo. My small ratinglist is for the top engines only, because otherwise there would not be enough time to do all the testruns of the Stockfish dev-versions. Sorry, Ed...)

NN-testrun of Lc0 0.28.2 T75_2400k (20x256) finished: See the result and download the games in the "NN vs SF testing"- section.

 

 

Because of the huge success of my Unbalanced Human Openings (UHO), I have started the development of UHO 2022 (so named because the MegaBase 2022 is used for building the new UHO openings-sets): the lines are filtered out of all games with both players rated 2300+ Elo (from the brand-new MegaBase 2022) and evaluated with the brand-new KomodoDragon 2.6 (8.5 secs thinking time for each endposition on a 12-core AMD Ryzen 3900 CPU). (UHO V3 was filtered out of MegaBase 2020 and evaluated with KomodoDragon 1.0.) There are 111789 different 6-move lines and 316318 different 8-move lines to evaluate, so this will take around 6 weeks... After the evaluation, there is some testing and editing to do, so the estimated release date is not before the end of February 2022, perhaps March 2022...
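The filtering step described above (keep only games where both players are rated 2300+) can be sketched in plain Python. This is a minimal stdlib sketch with a simplified PGN splitter, not the actual tooling used for the UHO sets; the sample games are invented for illustration:

```python
import re

# Matches PGN header tags like [WhiteElo "2450"]
TAG = re.compile(r'\[(\w+) "([^"]*)"\]')

def filter_min_elo(pgn_text, min_elo=2300):
    """Return the PGN games in which both players are rated >= min_elo.

    Simplified splitter: a tag line that appears after movetext starts
    a new game. Real PGN parsing (e.g. with python-chess) is more robust.
    """
    games, current, in_moves = [], [], False
    for line in pgn_text.splitlines():
        if line.startswith("["):
            if in_moves:  # tag line after movetext: a new game begins
                games.append(current)
                current, in_moves = [], False
            current.append(line)
        elif line.strip():
            in_moves = True
            current.append(line)
        else:
            current.append(line)
    if any(l.strip() for l in current):
        games.append(current)

    kept = []
    for g in games:
        tags = dict(TAG.findall("\n".join(g)))
        try:
            if (int(tags.get("WhiteElo", 0)) >= min_elo
                    and int(tags.get("BlackElo", 0)) >= min_elo):
                kept.append("\n".join(g).strip())
        except ValueError:  # non-numeric Elo tag: skip the game
            pass
    return kept

sample = '''[White "Player A"]
[Black "Player B"]
[WhiteElo "2450"]
[BlackElo "2380"]

1. e4 e5 *

[White "Player C"]
[Black "Player D"]
[WhiteElo "2100"]
[BlackElo "2500"]

1. d4 d5 *
'''

kept = filter_min_elo(sample)
print(len(kept))  # 1 - only the first game passes the 2300+ filter
```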

 

Stay tuned.


Stockfish Regression testing (20000 games (30sec+300ms) vs Stockfish 14.1 211028)

Latest testrun:

Stockfish 220108:  (+5997,=10287,-3716)= 55.7% = +40.1 Elo (+4.0 Elo to previous test)

Best testrun so far:

Stockfish 220108:  (+5997,=10287,-3716)= 55.7% = +40.1 Elo (+4.0 Elo to previous best)
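The Elo gain reported above follows from the score percentage via the standard logistic model (Ordo's exact figure of +40.1 can differ slightly, since Ordo rates all engines jointly). A quick sanity check in Python, using the W/D/L figures from the testrun above:

```python
import math

def score_to_elo(wins, draws, losses):
    """Convert a W/D/L result into a score fraction and an Elo
    difference via the logistic model: elo = -400 * log10(1/score - 1)."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    elo = -400 * math.log10(1 / score - 1)
    return score, elo

score, elo = score_to_elo(5997, 10287, 3716)
print(f"{score:.1%}  {elo:+.1f} Elo")  # 55.7%  +39.8 Elo
```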

See all results, get more information and download the games: Click on the yellow link above...


Stockfish testing

 

Playing conditions:

 

Hardware: Since 2021/07/20, an AMD Ryzen 3900 12-core (24 threads) notebook with 32GB RAM.

Speed (singlethread, TurboBoost mode switched off, chess starting position): Stockfish 11: 1.3 mn/s, Komodo 14: 1.1 mn/s

Hash: 256MB per engine

GUI: Cutechess-cli (the GUI ends the game when a 5-piece endgame is reached on the board)

Tablebases: None for engines, 5 Syzygy for cutechess-cli

Openings: HERT_500 testset (by Thomas Zipproth) (download the file at the "Download & Links"-section or here)

Ponder, Large Memory Pages & learning: Off

Thinking time: 180''+1000ms (= 3'+1'') per game/engine (average game duration: around 7.5 minutes). One 7000-games testrun takes about 2 days. The version numbers of the Stockfish engines are the date of the latest patch included in the Stockfish sourcecode, written backwards (year, month, day), not the release date of the engine file (example: 200807 = August 7, 2020). The Stockfish compile used is the AVX2 compile, which is the fastest on my AMD Ryzen CPU. SF binaries are taken from abrok.eu (except the official SF release versions, which are taken from the official Stockfish website).
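The playing conditions above map almost directly onto a cutechess-cli command line. A hedged sketch (engine paths and names, the opening-file format, and the Syzygy path are placeholders; the real testruns involve more pairings and rounds, plus whatever adjudication settings are in use):

```shell
# One pairing under roughly the conditions above:
# 3'+1'' time control, 256MB hash, HERT_500 openings played with
# both colors, 5-man Syzygy used by the GUI to end games early.
cutechess-cli \
  -engine cmd=./stockfish name="Stockfish 220108" \
  -engine cmd=./komododragon name="KomodoDragon 2.6" \
  -each proto=uci tc=180+1 option.Hash=256 \
  -openings file=HERT_500.pgn format=pgn order=sequential \
  -repeat -games 2 -rounds 250 \
  -tb /path/to/syzygy/5man \
  -pgnout games.pgn
```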

 

To avoid distortions in the Ordo Elo calculation, from now on only the latest official Stockfish release and the latest 2 dev-versions are included (the games of all older Stockfish versions are deleted each time a new version has been tested). The Elo results of older Stockfish versions can still be seen in the Elo diagrams below.

 

Latest update: 2022/01/15: Minic 3.18 (+87 Elo to Minic 3.17)

 

(Ordo-calculation fixed to Stockfish 14.1 = 3780 Elo)
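Fixing the calculation to one engine is done with Ordo's anchoring options: `-A` names the anchor player and `-a` the rating it is pinned to. A hedged sketch of the call (file names are placeholders; the engine name must match the PGN tags exactly):

```shell
# Rate the gamebase with Stockfish 14.1 anchored to 3780 Elo.
ordo -p games.pgn -a 3780 -A "Stockfish 14.1 211028" -o ratinglist.txt
```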

 

See the individual statistics of engine-results here

Download the current gamebase here

Download the complete game-archive here

     Program                    Elo    +    -   Games   Score   Av.Op.  Draws

   1 Stockfish 220108 avx2    : 3793    7    7  7000    72.6 %   3617   54.2 %
   2 Stockfish 211227 avx2    : 3792    8    8  7000    72.5 %   3617   54.5 %
   3 Stockfish 14.1 211028    : 3780    7    7  8000    72.2 %   3607   54.9 %
   4 Stockfish 14 210702      : 3762    7    7  9000    71.9 %   3591   54.9 %
   5 KomodoDragon 2.6 avx2    : 3746    6    6 11000    62.4 %   3651   66.0 %
   6 KomodoDragon 2.6 MCTS    : 3680    6    6 11000    54.0 %   3651   68.1 %
   7 Fire 8.NN avx2           : 3615    6    6 11000    44.2 %   3659   62.9 %
   8 Berserk 8.5 avx2         : 3596    6    6 11000    45.8 %   3628   55.7 %
   9 Stockfish final HCE      : 3574    6    6 10000    43.4 %   3625   57.9 %
  10 Fire 8.NN MCTS avx2      : 3573    6    6  9000    46.6 %   3600   64.8 %
  11 Slow Chess 2.8 avx2      : 3568    6    6 11000    39.5 %   3649   61.9 %
  12 Revenge 2.0 avx2         : 3568    6    6 12000    44.7 %   3610   62.5 %
  13 Koivisto 7.5 avx2        : 3543    6    6 13000    38.7 %   3633   55.9 %
  14 Ethereal 13.25 nnue      : 3536    6    6  9000    40.3 %   3611   62.5 %
  15 RubiChess 2021 avx2      : 3517    6    6  7000    48.8 %   3525   67.5 %
  16 Nemorino 6.00 avx2       : 3452    5    5  9000    51.3 %   3444   46.7 %
  17 Seer 2.4.0 avx2          : 3449    6    6  7000    44.8 %   3487   57.6 %
  18 Igel 3.0.5 popavx2       : 3414    5    5 10000    55.4 %   3372   54.7 %
  19 Arasan 23.2 avx2         : 3405    6    6 10000    57.2 %   3353   48.0 %
  20 Minic 3.18 znver3        : 3383    7    7  7000    62.8 %   3289   48.8 %
  21 Scorpio 3.0.14d cpu      : 3335    5    5 11000    52.3 %   3320   46.9 %
  22 Wasp 5.20 avx            : 3326    7    7  7000    55.0 %   3290   49.6 %
  23 Gogobello 3 avx2         : 3308    6    6 10000    49.9 %   3310   53.5 %
  24 Minic 3.17 znver3        : 3296    7    7  7000    50.9 %   3289   49.9 %
  25 Weiss 2.0 popc           : 3288    6    6 10000    47.7 %   3305   49.2 %
  26 Zahak 9.0 avx            : 3278    6    6  9000    51.7 %   3267   44.9 %
  27 Fritz 18 x64             : 3278    6    6 11000    51.2 %   3269   47.1 %
  28 Lc0 0.28.0 744706        : 3243    6    6 12000    43.2 %   3293   45.7 %
  29 Chiron 5 x64             : 3241    5    5 12000    42.0 %   3301   43.1 %
  30 Marvin 5.2 avx2          : 3234    6    6  9000    44.9 %   3271   47.8 %
  31 Danasah 9.0 avx2         : 3233    6    6 12000    43.9 %   3277   45.9 %
  32 Clover 2.4 avx2          : 3223    5    5 14000    39.7 %   3300   45.2 %
  33 Stash 32.0 popc          : 3215    7    7  7000    40.6 %   3284   44.0 %

The version-numbers of the engines (180622, for example) are the date of the latest patch included in the Stockfish sourcecode, not the release date of the engine file. Especially the asmFish engines are often released much later! (Stockfish final HCE is Stockfish 200731, the latest version without a neural net, using HCE (= Hand Crafted Evaluation) only. This engine is (and will perhaps remain forever?) the strongest HCE engine on the planet. IMHO this makes it very interesting for comparison.)
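These date-style version numbers are plain yymmdd strings, so they can be decoded (and, being fixed-width, even sorted chronologically as plain strings) mechanically. A small sketch:

```python
from datetime import datetime

def decode_version(v):
    """Decode a yymmdd dev-version number, e.g. '180622' -> 2018-06-22."""
    return datetime.strptime(v, "%y%m%d").date()

print(decode_version("180622"))  # 2018-06-22
print(decode_version("200807"))  # 2020-08-07 (August 7, 2020)

# Fixed-width yymmdd means lexicographic order == chronological order:
print(sorted(["220108", "180622", "200807"]))  # ['180622', '200807', '220108']
```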

Some engines use a nnue-net based on the evals of other engines. I decided to test these engines, too. As far as I know, the following engines use nnue-nets based on the evals of other engines (if I missed an engine, please contact me):

Fire 8.NN, Nemorino 6.00, Gogobello 3, Coiled 1.1 (using Stockfish-based nnue nets)

Stockfish since 210615 (using Lc0-based nnue nets)

Below you will find a diagram of the progress of Stockfish in my tests since August 2020.

And below that diagram, the older diagrams.

 

You can save the diagrams (as JPG pictures in original size) on your PC: click with the right mouse button and choose "save image"...

The Elo-ratings of older Stockfish dev-versions in the Ordo calculation can differ a little from the Elo "dots" in the diagram. The results/games of a new Stockfish dev-version, once it becomes part of the Ordo calculation, can change the Elo-ratings of the opponent engines, and that in turn can change the Elo-ratings of older Stockfish dev-versions in the Ordo calculation / ratinglist. The diagram is unaffected: there, each Elo "dot" is the rating of one Stockfish dev-version at the moment its testrun was finished.

