Stefan Pohl Computer Chess

private website for chessengine-tests


Latest Website-News (2022/05/13): NN-testrun of Lc0 0.29rc1 800815 finished. See the result and download the games in the "NN vs Dragon testing"- section.

 

Next ratinglist-testrun: Stash 33.0 & rofChade 3.0

Next NN-testrun: Lc0 0.29rc1 606511 (old (final) T60 24x320 net) for comparison

 

New website-section added: EAS-ratinglist... The Engines Aggressiveness Score of the engines of my SPCC-ratinglist. The EAS-ratinglist shows how aggressively the engines play. This is the world's first engine-ratinglist that measures not the strength of engines but their style of play! More information on the site.

 

Stay tuned.


Stockfish VLTC UHO Regression testing (2000 games (10min+3sec) vs Stockfish 15)

Latest testrun:

Stockfish 220504:  (+487,=985,-528)= 49.0% = -7 Elo (-4 Elo to previous test)

Best testrun so far:

Stockfish 220422:  (+476,=1029,-495)= 49.5% = -3 Elo (xxx Elo to previous best)

See all results, get more information and download the games: Click on the yellow link above...
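The percentages and Elo differences in the testruns above can be reproduced from the win/draw/loss counts. The following is a minimal sketch, assuming the standard logistic Elo formula (the site does not state which formula or tool produced these numbers); the function names are illustrative only.

```python
import math

def score_fraction(wins: int, draws: int, losses: int) -> float:
    """Fraction of points scored: a win counts 1, a draw 0.5."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games

def elo_difference(score: float) -> float:
    """Elo difference implied by a score fraction (logistic model, assumed)."""
    return 400.0 * math.log10(score / (1.0 - score))

# W/D/L counts taken from the two testruns listed above
latest = score_fraction(487, 985, 528)   # Stockfish 220504
best = score_fraction(476, 1029, 495)    # Stockfish 220422

print(f"Stockfish 220504: {latest:.1%} = {elo_difference(latest):+.0f} Elo")
print(f"Stockfish 220422: {best:.1%} = {elo_difference(best):+.0f} Elo")
```

Rounded to whole Elo points, this gives the -7 and -3 Elo shown for the two testruns.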


Stockfish testing

 

Playing conditions:

 

Hardware: Since 20/07/21 AMD Ryzen 3900 12-core (24 threads) notebook with 32GB RAM. 

Speed: (singlethread, TurboBoost-mode switched off, chess starting position) Stockfish 14.1: 750 kn/s (when 20 games are running simultaneously)

Hash: 256MB per engine

GUI: Cutechess-cli (the GUI ends the game when a 5-piece endgame is on the board)

Tablebases: None for engines, 5 Syzygy for cutechess-cli

Openings: HERT_500 testset (by Thomas Zipproth) (download the file at the "Download & Links"-section or here). Note that the HERT-set is not an Anti-Draw opening-set (such as UHO), but a classical, balanced opening-set.

Ponder, Large Memory Pages & learning: Off

Thinking time: 180sec+1000ms (= 3min+1sec) per game/engine (average game-duration: around 7.5 minutes). One 7000-games testrun takes about 2 days. The version-numbers of the Stockfish engines are the date of the latest patch that was included in the Stockfish sourcecode (not the release-date of the engine-file), written backwards (year, month, day; example: 200807 = August 7, 2020). The SF compile used is the AVX2-compile, which is the fastest on my AMD Ryzen CPU. SF binaries are taken from abrok.eu (except the official SF-release versions, which are taken from the official Stockfish website).
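The version-number scheme can be decoded mechanically. A minimal sketch, assuming a six-digit version-number in year-month-day order with a two-digit year in the 2000s; the helper name patch_date is hypothetical and not part of any engine or tool mentioned here.

```python
from datetime import date, datetime

def patch_date(version: str) -> date:
    """Turn a six-digit version-number (YYMMDD) into a calendar date."""
    # %y reads a two-digit year; values such as 20 or 22 map to 2020 or 2022
    return datetime.strptime(version, "%y%m%d").date()

print(patch_date("200807"))  # the example from the text: August 7, 2020
print(patch_date("220504"))  # Stockfish 220504 from the ratinglist
```

Sorting engines by this date, instead of by the raw version string, keeps the order correct even if a differently formatted version-number shows up.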

 

To avoid distortions in the Ordo Elo-calculation, from now on only the latest official Stockfish release and the latest 2 dev-versions are included (the games of all older engine-versions are deleted every time a new version has been tested). Stockfish's older Elo-results can still be seen in the Elo-diagrams below.

 

Latest update: 2022/05/11: Zahak 10 (+31 Elo to Zahak 9)

 

(Ordo-calculation fixed to Stockfish 15 = 3802 Elo)

 

See the individual statistics of engine-results here

See the Engines Aggressiveness Score Ratinglist here

Download the current gamebase here

Download the complete game-archive here

     Program                    Elo    +    -   Games   Score   Av.Op.  Draws

   1 Stockfish 220422 avx2    : 3804    8    8  7000    72.0%   3634   55.6%
   2 Stockfish 15 220418      : 3802    8    8  7000    71.8%   3634   56.1%
   3 Stockfish 220504 avx2    : 3801    8    8  7000    71.7%   3634   56.3%
   4 KomodoDragon 3 avx2      : 3763    7    7  9000    63.0%   3663   64.7%
   5 KomodoDragon 3 MCTS      : 3697    7    7  9000    54.6%   3663   67.6%
   6 Fire 8.NN avx2           : 3614    6    6 11000    45.4%   3651   63.2%
   7 Koivisto 8.0 avx2        : 3611    6    6 13000    46.7%   3638   63.3%
   8 Berserk 8.5 avx2         : 3598    7    7  9000    41.9%   3661   59.3%
   9 Stockfish final HCE      : 3582    7    7  9000    47.7%   3600   61.1%
  10 Fire 8.NN MCTS avx2      : 3582    7    7  8000    45.9%   3613   70.2%
  11 Ethereal 13.50 nnue      : 3577    6    6 11000    42.2%   3639   64.8%
  12 Slow Chess 2.83 avx2     : 3575    6    6 12000    39.3%   3659   62.8%
  13 Revenge 2.0 avx2         : 3572    6    6  8000    50.8%   3566   71.0%
  14 RubiChess 220223 avx2    : 3556    6    6  8000    52.3%   3539   69.7%
  15 Seer 2.5.0 avx2          : 3529    6    6 10000    44.4%   3569   67.9%
  16 Nemorino 6.00 avx2       : 3457    5    5 11000    54.8%   3422   53.1%
  17 Arasan 23.3 avx2         : 3447    6    6  8000    53.9%   3420   56.3%
  18 Igel 3.0.5 popavx2       : 3419    5    5 10000    53.5%   3393   56.5%
  19 Halogen 10.23.11 avx2    : 3418    6    6  8000    51.8%   3406   58.7%
  20 Clover 3.1 avx2          : 3404    6    6  9000    53.1%   3382   55.6%
  21 Tucano 10.00 avx2        : 3386    6    6  8000    55.4%   3348   59.6%
  22 Rebel 15 avx2            : 3383    6    6  8000    50.3%   3381   56.0%
  23 Minic 3.18 znver3        : 3383    6    6  9000    56.6%   3335   54.7%
  24 Wasp 5.50 avx            : 3376    6    6  9000    54.1%   3347   53.5%
  25 Fritz 18 nnue avx2       : 3364    6    6  9000    45.7%   3395   54.9%
  26 Coiled 1.1 avx2          : 3351    6    6  9000    52.1%   3335   56.8%
  27 Scorpio 3.0.14d cpu      : 3337    6    6 10000    52.4%   3319   49.8%
  28 Gogobello 3 avx2         : 3315    6    6  9000    48.0%   3329   54.7%
  29 Zahak 10.0 avx           : 3311    7    7  7000    44.4%   3351   50.3%
  30 Velvet 3.3.0 avx2        : 3311    6    6  8000    46.8%   3334   46.3%
  31 Combusken 2.0.0 amd64    : 3298    6    6  9000    53.3%   3275   47.0%
  32 Weiss 2.0 popc           : 3291    6    6  9000    46.5%   3317   47.2%
  33 Black Marlin 5.0 avx2    : 3285    6    6  9000    49.5%   3289   49.9%
  34 Zahak 9.0 avx            : 3280    7    7  8000    50.7%   3276   46.9%
  35 Chiron 5 x64             : 3247    6    6 10000    41.3%   3311   41.7%
  36 Lc0 0.28.0 744706        : 3246    6    6 11000    42.3%   3302   46.7%
  37 Danasah 9.0 avx2         : 3237    6    6 11000    43.3%   3285   46.2%
  38 Marvin 5.2 avx2          : 3235    6    6  9000    44.4%   3275   48.5%
  39 Stash 32.0 popc          : 3220    6    6  8000    40.0%   3292   43.1%

The version-numbers (180622 for example) of the engines are the date of the latest patch that was included in the Stockfish sourcecode, not the release-date of the engine-file. Especially the asmFish-engines are often released much later! (Stockfish final HCE is Stockfish 200731, the latest version without neural-net and with HCE (= Hand Crafted Evaluation). This engine is (and perhaps will stay forever?) the strongest HCE engine on the planet. IMHO this makes it very interesting for comparison.)

Some engines use a nnue-net based on the evals of other engines. I decided to test these engines, too. As far as I know, the following engines use nnue-nets based on evals of other engines (if I missed an engine, please contact me):

Fire 8.NN, Nemorino 6.00, Gogobello 3, Coiled 1.1 (using Stockfish-eval-based nnue nets or nets directly from Stockfish website). Stockfish since 210615 (using Lc0-based nnue nets). Halogen 10.23.11 using a Koivisto-eval-based net.

Some engine-testruns were aborted, because the engine is too weak (below 3200 SPCC-Elo): LittleGoliath 3.15.3

Below you find a diagram of the progress of Stockfish in my tests since April 2022

And below that diagram, the older diagrams.

 

You can save the diagrams (as a JPG-picture in original size) on your PC with a right mouse-click and then choosing "save image"...

The Elo-ratings of older Stockfish dev-versions in the Ordo-calculation can differ a little from the Elo-"dots" in the diagram. When the results/games of new Stockfish dev-versions become part of the Ordo-calculation, they can change the Elo-ratings of the opponent engines, and that in turn can change the Elo-ratings of older Stockfish dev-versions. This affects only the Ordo-calculation / ratinglist, not the diagram, where each Elo-"dot" is the rating of one Stockfish dev-version at the moment when its testrun was finished.

 

 

 

 

 

 

 

 

 

