Stefan Pohl Computer Chess

private website for chessengine-tests


Latest Website-News (2021/07/30): AB-testrun of Stockfish 210726 finished: +3 Elo to Stockfish 210713. 

Next NN-testrun: Ceres 0.91b 69722

Next AB-testruns: Regression testruns of Stockfish 210724 and Stockfish 210726, followed by next regular Stockfish-testrun (probably Stockfish 210729) and the corresponding regression-testrun.

 

I merged the files of the Noomen SuFi LowDraw openings v1 and the new v2, removed the lines with duplicate end positions and added an EPD file containing the end positions: 247 lines remain. Download the openings here

 

 

Development of huge Unbalanced Human Openings sets for the Stockfish Framework has started (they will be added to my Anti Draw Openings collection, which will then offer opening sets for everyone, even for the Stockfish Framework with its extreme testing conditions). Because this is a long-term project, I made a small website where you can see the progress. Just click here

 

Stay tuned.


NEW *** NEW *** NEW 

Stockfish Regression testing (30000 games (20sec+200ms) vs Stockfish 14 210702)

Latest testrun:

Stockfish 210713:  (+1221  =27656  -1123)= 50.2% = +1.1 Elo

Best testrun so far:

Stockfish 210703:  (+1304  =27559  -1137)= 50.3% = +2.0 Elo

See all results, get more information and download the games here
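The percentage and Elo figures in the regression results above can be approximated from the win/draw/loss counts. The following is a minimal sketch that assumes the standard logistic Elo formula; the function name is hypothetical, and the rating tool used for the published numbers may round or calculate slightly differently, so this need not reproduce every listed value to the last decimal.

```python
import math

def score_and_elo(wins, draws, losses):
    """Return (score fraction, Elo difference) for a W/D/L result.

    Assumes the logistic Elo model: elo = 400 * log10(s / (1 - s)).
    Not defined for a score of exactly 0 or 1.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    elo = 400.0 * math.log10(score / (1.0 - score))
    return score, elo

# Stockfish 210713 regression testrun: +1221 =27656 -1123
score, elo = score_and_elo(1221, 27656, 1123)
print(f"{score:.1%} = {elo:+.1f} Elo")  # 50.2% = +1.1 Elo
```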


Stockfish testing

 

Playing conditions:

 

Hardware: Since 20/07/21 AMD Ryzen 3900 12-core (24 threads) notebook with 32GB RAM. 20 games are now played simultaneously (!), so from now on each testrun will have 6000 or 7000 games (instead of 5000 before) and will take only 2 days, not 6-7 days as before! From now on, all engine binaries are popcount/avx2, of course, because bmi2 compiles are extremely slow on AMD. To keep the engine names in the rating list consistent, the "bmi2" or "pext" extension in the engine name is still in use for older engines; otherwise ORDO would not count all games played by such an engine as games of one engine.

Speed: (singlethread, TurboBoost-mode switched off, chess starting position) Stockfish: 1.3 mn/s, Komodo: 1.1 mn/s

Hash: 256MB per engine

GUI: Cutechess-cli (GUI ends game, when a 5-piece endgame is on the board)

Tablebases: None for engines, 5 Syzygy for cutechess-cli

Openings: HERT_500 testset (by Thomas Zipproth) (download the file at the "Download & Links"-section or here)

Ponder, Large Memory Pages & learning: Off

Thinking time: 180''+1000ms (= 3'+1'') per game/engine (average game duration: around 7.5 minutes). One 7000-game testrun takes about 2 days. The version numbers of the Stockfish engines are the date of the latest patch included in the Stockfish source code (not the release date of the engine file), written backwards (year, month, day); example: 200807 = August 7, 2020. The SF compile used is the AVX2 compile, which is the fastest on my AMD Ryzen CPU. SF binaries are taken from abrok.eu (except the official SF release versions, which are taken from the official Stockfish website).

Download BrainFish (and the Cerebellum-Libraries) here

 

To avoid distortions in the Ordo Elo calculation, from now on only 3x Stockfish (the latest official release + the latest 2 dev versions) and 1x BrainFish are stored in the gamebase (the games of all older engine versions are deleted every time a new version has been tested). Older Elo results of Stockfish and BrainFish can still be seen in the Elo diagrams below. BrainFish always plays with the latest Cerebellum-Libraries, of course, because otherwise BrainFish = Stockfish.

 

Latest update: 2021/07/30: Stockfish 210726  (+3 Elo to Stockfish 210713)

 

(Ordo-calculation fixed to Stockfish 14 = 3757 Elo)

 

See the individual statistics of engine-results here

Download the current gamebase here

     Program                    Elo    +    -   Games   Score   Av.Op.  Draws

   1 Stockfish 210726 avx2    : 3759    7    7  7000    73.9 %   3563   50.8 %
   2 Stockfish 14 210702      : 3757    6    6 12000    73.0 %   3564   52.1 %
   3 Stockfish 210713 avx2    : 3756    7    7  7000    73.7 %   3563   51.3 %
   4 Stockfish 13 210218      : 3724    4    4 22000    76.2 %   3499   45.0 %
   5 SF Fat Fritz 2 avx2      : 3723    7    7  8000    73.7 %   3519   49.3 %
   6 CFish 12 avx2            : 3712    8    8  7000    84.6 %   3398   29.1 %
   7 Stockfish 12 200902      : 3690    4    4 25000    78.2 %   3443   39.4 %
   8 SF Fat Fritz 2 github    : 3684    7    7  7000    73.3 %   3489   48.5 %
   9 KomodoDragon 1.0 avx2    : 3653    4    4 21000    74.4 %   3445   43.0 %
  10 KomodoDragon 2.0 avx2    : 3653    4    4 19000    66.8 %   3520   52.2 %
  11 Fire 8.NN avx2           : 3609    4    4 16000    59.1 %   3540   57.4 %
  12 Stockfish 11 200118      : 3570    6    6  9000    66.7 %   3433   43.8 %
  13 KomodoDragon 2.0 MCTS    : 3570    6    6  7000    62.4 %   3480   53.4 %
  14 KomodoDragon 1.0 MCTS    : 3487    6    6  7000    57.6 %   3433   57.1 %
  15 RubiChess 2.2 avx2       : 3486    5    5 12000    42.1 %   3552   53.6 %
  16 Slow Chess 2.6 avx2      : 3484    5    5 15000    42.5 %   3548   54.1 %
  17 Ethereal 13 nnue avx2    : 3479    5    5 16000    42.6 %   3542   52.1 %
  18 Revenge 1.0 avx2         : 3472    6    6 11000    38.8 %   3565   50.5 %
  19 Komodo 14.1 x64          : 3462    6    6  8000    56.3 %   3419   55.6 %
  20 Koivisto 6 avx2          : 3454    7    7  7000    36.3 %   3563   55.9 %
  21 Komodo 14 bmi2           : 3454    5    5 13000    52.6 %   3439   54.6 %
  22 RubiChess 2.1 avx2       : 3448    4    4 17000    41.6 %   3516   55.4 %
  23 Slow Chess 2.54 avx2     : 3448    6    6  9000    41.1 %   3519   54.6 %
  24 Houdini 6 pext           : 3445    3    3 40000    49.1 %   3457   50.7 %
  25 Nemorino 6.00 avx2       : 3444    3    3 41000    50.1 %   3447   52.5 %
  26 Fire 8.1 popc            : 3440    5    5 13000    43.1 %   3496   53.5 %
  27 Pedone 3.1 avx2          : 3428    4    4 17000    46.9 %   3453   52.4 %
  28 Slow Chess 2.5 avx2      : 3427    4    4 19000    41.7 %   3497   50.0 %
  29 Igel 3.0.5 popavx2       : 3413    5    5 12000    51.6 %   3405   53.0 %
  30 Nemorino 6.05 avx2       : 3412    7    7  7000    42.8 %   3474   51.0 %
  31 Ethereal 12.75 avx2      : 3406    4    4 25000    45.7 %   3445   52.1 %
  32 Ethereal 13.07 avx2      : 3402    7    7  7000    31.7 %   3549   48.8 %
  33 Ethereal 12.62 avx2      : 3401    6    6  8000    49.1 %   3412   54.6 %
  34 Igel 3.0.0 popavx2       : 3393    6    6  7000    36.2 %   3506   52.1 %
  35 Slow Chess 2.4 popc      : 3381    5    5 12000    43.7 %   3435   52.3 %
  36 Ethereal 12.50 popc      : 3369    6    6  7000    45.8 %   3408   55.2 %
  37 RubiChess 2.0 avx2       : 3361    6    6 11000    29.4 %   3540   44.4 %
  38 Slow Chess 2.3 popc      : 3354    5    5 12000    42.9 %   3411   54.0 %
  39 Ethereal 12.25 pext      : 3348    6    6  9000    36.3 %   3466   50.3 %
  40 Slow Chess 2.2 popc      : 3339    7    7  8000    33.6 %   3481   46.2 %
  41 Pedone 3 avx2            : 3336    6    6  8000    33.4 %   3473   46.7 %
  42 Berserk 4.5.1 avx2       : 3335    6    6  7000    35.9 %   3439   45.6 %
  43 Igel 2.9.0 popavx2       : 3334    5    5 11000    34.5 %   3458   50.1 %
  44 RubiChess 1.9dev nnue    : 3328    7    7  8000    31.8 %   3478   45.6 %
  45 Xiphos 0.6 bmi2          : 3317    4    4 22000    39.8 %   3404   48.9 %
  46 Fire 7.1 popc            : 3315    4    4 17000    33.9 %   3448   48.9 %
  47 Booot 6.5 popc           : 3306    5    5 10000    39.1 %   3404   42.3 %
  48 Gogobello 3 avx2         : 3297    5    5 10000    44.1 %   3340   51.6 %
  49 rofChade 2.3 bmi2        : 3279    5    5 11000    42.6 %   3338   48.8 %
  50 Minic 3.08 znver1        : 3220    7    7  7000    32.2 %   3355   44.5 %
  51 Chiron 5 x64             : 3216    7    7  7000    31.7 %   3355   40.6 %
  52 Minic 3.07               : 3209    7    7  7000    29.0 %   3370   43.1 %
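For readers who want to process the rating list above themselves, here is a small sketch that parses one row of the list into named fields. The regular expression and the field names are assumptions based only on the column layout shown here, not on any official Ordo output specification.

```python
import re

# One rating-list row: rank, name, ':', Elo, +, -, games, score %, avg. opp., draws %
ROW = re.compile(
    r"^\s*(?P<rank>\d+)\s+(?P<name>.+?)\s*:\s*(?P<elo>\d+)"
    r"\s+(?P<plus>\d+)\s+(?P<minus>\d+)\s+(?P<games>\d+)"
    r"\s+(?P<score>[\d.]+)\s*%\s+(?P<avg_opp>\d+)\s+(?P<draws>[\d.]+)\s*%\s*$"
)

def parse_row(line):
    """Parse one row of the list; return a dict, or None if it does not match."""
    m = ROW.match(line)
    if m is None:
        return None
    row = m.groupdict()
    for key in ("rank", "elo", "plus", "minus", "games", "avg_opp"):
        row[key] = int(row[key])
    for key in ("score", "draws"):
        row[key] = float(row[key])
    return row

line = "   1 Stockfish 210726 avx2    : 3759    7    7  7000    73.9 %   3563   50.8 %"
print(parse_row(line))
```

The header line and blank lines do not match the pattern, so `parse_row` returns `None` for them and they can simply be skipped.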

The version-numbers (180622 for example) of the engines are the date of the latest patch, which was included in the Stockfish sourcecode, not the release-date of the engine-file. Especially the asmFish-engines are often released much later!!
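The six-digit version numbers can be turned back into calendar dates. A minimal sketch, assuming the number is always in year-month-day order with a two-digit year and appears as a separate token in the engine name (both function names are hypothetical):

```python
import re
from datetime import datetime

def patch_date(version):
    """Convert a six-digit version number (YYMMDD) to a date."""
    return datetime.strptime(version, "%y%m%d").date()

def patch_date_from_name(engine_name):
    """Find a six-digit token in an engine name and convert it.

    Returns None if the name contains no such token.
    """
    m = re.search(r"\b(\d{6})\b", engine_name)
    return patch_date(m.group(1)) if m else None

print(patch_date("180622"))                         # 2018-06-22
print(patch_date_from_name("Stockfish 14 210702"))  # 2021-07-02
```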

Some engines use an nnue net based on the evals of other engines. I decided to test these engines, too. As far as I know, the following engines use nnue nets based on the evals of other engines (if I missed an engine, please contact me):

Fire 8.NN, Nemorino 6.00, Gogobello 3 (using Stockfish-based nnue nets)

Fat Fritz 2, Stockfish since 210615 (using Lc0-based nnue nets)

Below you find a diagram of the progress of Stockfish in my tests since August 2020

And below that diagram, the older diagrams.

 

You can save the diagrams on your PC (as a JPG picture in original size) with a right mouse click and then choosing "save image"...

The Elo ratings of older Stockfish dev versions in the Ordo calculation can differ a little from the Elo "dots" in the diagram. When the games of a new Stockfish dev version become part of the Ordo calculation, they can change the Elo ratings of the opponent engines, and that in turn can change the Elo ratings of older Stockfish dev versions in the rating list. The diagram is not affected: each Elo "dot" shows the rating of one Stockfish dev version at the moment its testrun was finished.

 

 

 

 

 

 

