Stefan Pohl Computer Chess

Home of famous UHO openings and EAS Ratinglist


Latest Website-News (2023/05/26): Testruns of Stockfish 230520 finished: -1 Elo in SPCC Ratinglist testrun and -2 Elo in UHO-Top10 Ratinglist testrun.

 

NN-testrun of Lc0 0.30dev BT2-4510000 finished. See the result and download the games on the "NN vs SF 15.1 testing" page.

 

Don't forget to take a look at my EAS-Ratinglist (the world's first engine-ratinglist that measures not the strength of engines, but the engines' style of play).

 

 

Stay tuned.


SPCC Top Engines Ratinglist (+ regular testing of Stockfish Dev-versions)

 

Playing conditions:

 

Hardware: Since 20/07/21 AMD Ryzen 3900 12-core (24 threads) notebook with 32GB RAM. 

Speed: (singlethread, TurboBoost-mode switched off, chess starting position) Stockfish 14.1: 750 kn/s (when 20 games are running simultaneously)

Hash: 256MB per engine

GUI: Cutechess-cli (the GUI ends a game when a 5-piece endgame is on the board; all other games are played until mate or a draw by the rules of chess (3-fold repetition, 50-move rule, stalemate, insufficient material))

Tablebases: None for engines, 5-piece Syzygy for cutechess-cli

Openings: HERT_500 testset (by Thomas Zipproth) (download the file at the "Download & Links"-section or here). Note that the HERT-set is not an Anti-Draw opening-set (such as UHO), but a classical, balanced opening-set.

Ponder, Large Memory Pages & learning: Off

Thinking time: 3min+1sec per game/engine (average game-duration: 7 min 45 sec). One 7000-games testrun takes about 2 days.

The version-numbers of the Stockfish engines are the date of the latest patch that was included in the Stockfish sourcecode, written backwards (year, month, day), not the release-date of the engine-file (example: 200807 = August 7, 2020). The SF compile used is the AVX2-compile, which is the fastest on my AMD Ryzen CPU. SF binaries are taken from abrok.eu (except the official SF-release versions, which are taken from the official Stockfish website).

 

To avoid distortions in the Ordo Elo-calculation, from now on only the latest official Stockfish release and the latest 2 dev-versions are included in the ratinglist. The games of all older engine-versions are deleted each time a new version has been tested. Stockfish's older Elo-results can still be seen in the Elo-diagrams below.

 

Latest update: 2023/05/26: Stockfish 230520 (-1 Elo to Stockfish 230505)

 

(Ordo-calculation fixed to Stockfish 15.1 = 3807 Elo)

 

See the individual statistics of engine-results here

See the Engines Aggressiveness Score Ratinglist here

Download the current gamebase here

Download the complete game-archive here

See the full SPCC-Ratinglist and full EAS-Ratinglist (without Stockfish dev-versions) from 2020 until today here

(calculating the EAS-Ratings of the full list takes considerable effort and will be done only from time to time, not after each test)

 

(best Stockfish Elo so far: Stockfish 230325: 3817 SPCC-Elo, latest full release (Stockfish 15) had 3802 SPCC-Elo)

     Program                    Elo    +    -  Games    Score   Av.Op. Draws

   1 Stockfish 230505 avx2    : 3813    7    7  7000    66.5%   3691   66.5%
   2 Stockfish 230520 avx2    : 3812    7    7  7000    66.5%   3691   66.5%
   3 Stockfish 15.1 221204    : 3807    8    8  7000    65.8%   3691   67.7%
   4 KomodoDragon 3.2 avx2    : 3777    6    6 10000    60.8%   3697   70.7%
   5 KomodoDragon 3.2 MCTS    : 3715    6    6 10000    52.5%   3697   73.8%
   6 Berserk 11 avx2          : 3704    6    6 14000    52.5%   3686   74.0%
   7 Ethereal 14.00 nnue      : 3679    6    6 14000    48.9%   3688   74.3%
   8 Koivisto 9.2 avx2        : 3664    6    6 15000    47.9%   3680   71.1%
   9 RubiChess 230410 avx2    : 3653    6    6 15000    46.3%   3681   71.3%
  10 Revenge 3.0 avx2         : 3642    5    5 16000    46.2%   3671   71.4%
  11 Rebel 16.1               : 3619    5    5 13000    50.9%   3612   76.0%
  12 Fire 8.NN avx2           : 3615    6    6 12000    48.5%   3626   69.4%
  13 Clover 4.1 avx2          : 3605    6    6 12000    48.0%   3619   74.2%
  14 Igel 3.4.0 popavx2       : 3605    5    5 13000    49.5%   3608   74.4%
  15 Seer 2.6.0 avx2          : 3602    5    5 12000    48.9%   3610   74.0%
  16 Slow Chess 2.9 avx2      : 3583    6    6 11000    48.1%   3596   73.0%
  17 Fire 8.NN MCTS avx2      : 3582    5    5 10000    52.0%   3567   70.1%
  18 Stockfish final HCE      : 3578    5    5 12000    45.9%   3608   59.8%
  19 Uralochka 3.39d avx2     : 3557    5    5 12000    48.6%   3567   72.8%
  20 rofChade 3.0 avx2        : 3550    5    5 11000    45.4%   3583   70.3%
  21 Caissa 1.8 avx2          : 3548    5    5 11000    53.9%   3520   66.4%
  22 Minic 3.32 znver3        : 3531    5    5 10000    47.4%   3549   70.1%
  23 Viridithas 9.0.0 avx2    : 3516    5    5 11000    51.1%   3508   66.1%
  24 Wasp 6.50 avx            : 3498    6    6 10000    54.6%   3465   61.5%
  25 Velvet 5.2.0 avx2        : 3475    5    5 12000    55.8%   3433   59.8%
  26 PowerFritz 18 avx2       : 3474    5    5 10000    49.5%   3478   63.6%
  27 Booot 7.1 avx2           : 3472    6    6  9000    50.7%   3467   62.7%
  28 Arasan 23.5 avx2         : 3465    5    5 10000    44.7%   3503   60.6%
  29 Black Marlin 7.0 avx2    : 3460    6    6 10000    50.4%   3456   62.9%
  30 Nemorino 6.00 avx2       : 3453    5    5  9000    51.9%   3439   56.3%
  31 Smallbrain 7.0 avx2      : 3447    6    6 10000    46.7%   3471   63.0%
  32 Devre 4.0 avx2           : 3428    6    6  9000    47.7%   3444   66.6%
  33 BlackCore 6.0 avx2       : 3426    6    6  9000    52.0%   3412   59.6%
  34 Halogen 11.4 avx2        : 3414    5    5 10000    50.2%   3412   63.4%
  35 Marvin 6.1.0 avx2        : 3389    6    6  9000    46.3%   3415   60.6%
  36 Alexandria 3.5 avx2      : 3387    6    6 10000    44.4%   3427   60.0%
  37 Weiss 230309 popc        : 3384    6    6  9000    49.0%   3392   50.6%
  38 Tucano 10.00 avx2        : 3382    6    6 10000    50.0%   3382   59.2%
  39 Coiled 1.1 avx2          : 3352    6    6 10000    46.7%   3376   57.3%
  40 Frozenight 6.0.0 avx2    : 3351    6    6  9000    53.5%   3326   55.1%
  41 Komodo 14.1 aggress.     : 3348    6    6 10000    55.5%   3309   43.9%
  42 Zahak 10.0 avx           : 3324    6    6 10000    50.0%   3324   50.5%
  43 Gogobello 3 avx2         : 3310    7    7 11000    48.9%   3318   54.2%
  44 Counter 5.0 amd64        : 3307    6    6 11000    52.0%   3293   51.1%
  45 Lc0 0.29 dnll 791921     : 3295    7    7 10000    50.4%   3292   46.2%
  46 Expositor 2BR17 avx2     : 3295    6    6 11000    49.2%   3300   50.5%
  47 Combusken 2.0.0 amd64    : 3294    6    6 11000    49.7%   3296   47.7%
  48 Stash 34.0 popc          : 3292    7    7 10000    50.5%   3289   48.5%
  49 Mantissa 3.7.2 avx2      : 3290    6    6 11000    48.9%   3298   50.6%
  50 Pawn 1.0 x64             : 3287    6    6 11000    46.4%   3312   49.4%
  51 Chiron 5 x64             : 3248    7    7 10000    41.2%   3312   41.8%
  52 Danasah 9.0 avx2         : 3238    7    7 10000    41.7%   3297   46.5%
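As a rough plausibility check of the table, the "Score" and "Av.Op." columns can be turned into an approximate Elo with the standard logistic Elo formula. This is only a sketch: Ordo fits all results of all engines at the same time, so the numbers in the list will differ by a few Elo from this simple per-engine estimate. The function names are made up for this example.

```python
import math


def elo_diff_from_score(score):
    """Elo difference implied by a score fraction (0 < score < 1),
    using the standard logistic Elo model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def estimate_rating(avg_opponent, score):
    """Approximate rating from the average opponent rating and the score.
    This is NOT the Ordo calculation, only a quick sanity check."""
    return avg_opponent + elo_diff_from_score(score)


# Tucano 10.00: 50.0% against an average opponent of 3382
print(round(estimate_rating(3382, 0.500)))  # 3382

# Stockfish 230505: 66.5% against an average opponent of 3691
print(round(estimate_rating(3691, 0.665), 1))
```

For Stockfish 230505 the estimate lands within a few Elo of the 3813 shown in the list, which is what one would expect from such an approximation.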

The version-numbers (180622 for example) of the engines are the date of the latest patch that was included in the Stockfish sourcecode, not the release-date of the engine-file. The asmFish-engines in particular are often released much later! (Stockfish final HCE is Stockfish 200731, the latest version without neural-net and with HCE (= Hand Crafted Evaluation). This engine is (and perhaps will stay forever?) the strongest HCE engine on the planet. IMHO this makes it very interesting for comparison.)
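The date-style version-numbers can be decoded mechanically. A minimal sketch, assuming the number is always six digits in the order year, month, day (the function name is made up for this example):

```python
from datetime import date, datetime


def patch_date(version: str) -> date:
    """Convert a six-digit version-number (YYMMDD) into a date.

    Assumes two-digit years belong to the 2000s, which holds for
    all version-numbers mentioned on this page.
    """
    return datetime.strptime(version, "%y%m%d").date()


print(patch_date("200807"))  # 2020-08-07
print(patch_date("180622"))  # 2018-06-22
```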

Some engines use a nnue-net based on the evals of other engines. I decided to test these engines, too. As far as I know, the following engines use nnue-nets based on evals of other engines (if I missed an engine, please contact me):

Expositor 2BR17, Fire 8.NN, Nemorino 6.00, Gogobello 3 and Coiled 1.1 use Stockfish-eval-based nnue nets or nets taken directly from the Stockfish website. Stockfish since 210615 and Devre 4 use Lc0-based nnue nets. Halogen 11.4 and Alexandria 3.5 use a Koivisto-eval-based net.

Some engine-testruns were aborted, because the engine is too weak (weaker than Danasah 9): Tenax 0.8.0, Winter 2.0

Some engine-testruns were aborted, because the new version was clearly weaker than the engine-version already listed: Fire 230519, Drofa 4.0.0

Some engines could not be tested, because they do not work properly in cutechess-cli (with 20 games running simultaneously): LittleGoliath 3.16

Below you find a diagram of the progress of Stockfish in my tests since April 2022

And below that diagram, the older diagrams.

 

You can save the diagrams (as a JPG-picture in original size) on your PC with a right mouseclick and then choosing "save image"...

The Elo-ratings of older Stockfish dev-versions in the Ordo-calculation can differ slightly from the Elo-"dots" in the diagram. When the games of a new Stockfish dev-version become part of the Ordo-calculation, they can change the Elo-ratings of the opponent engines, and that in turn can change the Elo-ratings of older Stockfish dev-versions in the ratinglist. The diagram is not affected: each Elo-"dot" shows the rating of one Stockfish dev-version at the moment when the testrun of that dev-version was finished.

 

 

 

 

 

 

 

 

 

