Stefan Pohl Computer Chess

private website for chessengine-tests


Latest Website-News (2017/09/25): Testrun of asmBrainFish 170825 finished. asmBrainFish is asmFish 170825 using a Polyglot-version of the latest Cerebellum-Library. This Polyglot-version was not generated automatically by asmFish, but built manually by Thomas Zipproth himself, to avoid move-loops and draws out of the Library (the Polyglot-version that asmFish generates automatically does not prevent these problems and plays weaker!). Download Cerebellum in BrainFish-format and Polyglot-format: here

 

Results of Testruns of Houdini 6 and Fire 6.1 will follow after my holiday-break in the middle of October.

 

Little one-day Bullet-selfplay testrun of Fire 6.1 vs. Fire 5:

Games Completed = 1000 of 1000 (Avg game length = 198.760 sec)
Settings = Gauntlet/128MB/70000ms+700ms/M 400cp for 4 moves, D 130 moves
 1.  Fire 6.1 popc 571.5/1000    287-144-569  (tpm=1671.1 d=17.12 nps=1500607)
 2.  Fire 5 popc   428.5/1000    144-287-569  (tpm=1677.8 d=17.06 nps=1466165)
 

A score of 57.15% (571.5/1000) corresponds to about +50 Elo. So I believe that with longer thinking-time Fire 6.1 should be around +40 Elo stronger than Fire 5, which means Fire 6.1 could be a little bit stronger than Shredder 13. Because I need strong opponents for my Stockfish testruns and Fire 6.1 has lower similarity-values compared to Stockfish 7/8 than Andscacs (see the "Experiments"-site), I will use Fire 6.1 instead of Andscacs for my Stockfish testruns from now on.
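The step from a match score to an Elo difference can be sketched with the standard logistic Elo formula. This is only an illustration using the Fire 6.1 result above (287 wins, 144 losses, 569 draws); the function names are made up for this example and are not part of LittleBlitzerGUI or Ordo.

```python
import math


def score_fraction(wins, losses, draws):
    """Return the score as a fraction of all games (a draw counts as half a point)."""
    games = wins + losses + draws
    return (wins + 0.5 * draws) / games


def elo_difference(score):
    """Convert a score fraction (strictly between 0 and 1) into an Elo difference
    using the standard logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Fire 6.1 vs. Fire 5, numbers taken from the result list above
fire_score = score_fraction(wins=287, losses=144, draws=569)
fire_elo = elo_difference(fire_score)
print(f"score = {fire_score:.4f}, Elo difference = {fire_elo:+.1f}")
```

With these numbers the result is roughly +50 Elo, which matches the estimate above. The formula says nothing about the error margin of a 1000-games match.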

 

 

Long thinking-time tournament updated - first (but very early) result of Houdini 6.

 

I added an ArenaGUI-version of my SALC_V3_10moves book to the SALC-download package. So, from now on the SALC-book can be used with FritzGUI, ShredderGUI and ArenaGUI. For all other GUIs, you can construct the SALC_V3-book yourself, because all SALC-games used are stored in a PGN-file, which is part of the download-package, too. Download it in the "Download & Links"-section...

 

Stay tuned.


Stockfish testing

 

Playing conditions:

 

Hardware: i7-6700HQ 2.6GHz Notebook (Skylake CPU), Windows 10 64bit, 8GB RAM

Fritzmark: singlecore: 5.3 / 2521 (all engines run on one core only). Average meganodes/s displayed by LittleBlitzerGUI: Houdini: 2.6 mn/s, Stockfish: 2.2 mn/s, Komodo: 2.0 mn/s

Hash: 512MB per engine

GUI: LittleBlitzerGUI (draw at 130 moves, resign at 400cp (for 4 moves))

Tablebases: None

Openings: HERT testset (by Thomas Zipproth) (download the file at the "Download & Links"-section or here). I use a version of HERT in which the positions in the file are ordered differently. This makes no difference for the testing-results, so don't be confused when you download my gamebase-file and the game-sequence doesn't match the sequence of your HERT-set...

Ponder, Large Memory Pages & learning: Off

Thinking time: 180''+1000ms (= 3'+1'') per game/engine (average game-duration: around 7.5 minutes). One 5000-games testrun takes about 7 days. The version-numbers of the Stockfish-development engines are the release-date, written backwards (year, month, day; example: 170526 = May 26, 2017). I use BrainFish-compiles (bmi2) by Thomas Zipproth. Without the Cerebellum-Library, BrainFish is identical to Stockfish, and BrainFish-compiles are the fastest compiles of the Stockfish C++ code at the moment: around 10% faster than the abrok.eu-compiles and around 4% faster than the ultimaiq-compiles.
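The version-number scheme (release-date written as year, month, day with two digits each) can be decoded as in the following small sketch. The function name is made up for this example, and it assumes that all version-numbers fall in the years 2000 and later, which is how Python reads two-digit years such as 17.

```python
from datetime import date, datetime


def parse_dev_version(tag):
    """Decode a six-digit dev-version number (YYMMDD) into a calendar date."""
    # Assumption: the tag is exactly year, month, day with two digits each
    return datetime.strptime(tag, "%y%m%d").date()


released = parse_dev_version("170526")
print(released.strftime("%B %d, %Y"))
```

Because the year comes first, sorting the version-numbers as plain text also sorts the engines by release-date.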

Download BrainFish (and the Cerebellum-Library (in BrainFish-format and Polyglot-format)): here

 

Each Stockfish-version plays 1000 games against Komodo 11.2.2, Houdini 5, Shredder 13, Fizbo 1.9, Andscacs 0.91b. All engines are running with default-settings.
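The workload of one testrun can be checked with a little arithmetic. The sketch below only uses figures stated on this page (5 opponents, 1000 games each, around 7.5 minutes per game, about 7 days per testrun). The variable names are made up, and the number of games running at the same time is not stated on this page; it is merely what these figures imply.

```python
OPPONENTS = ["Komodo 11.2.2", "Houdini 5", "Shredder 13", "Fizbo 1.9", "Andscacs 0.91b"]
GAMES_PER_OPPONENT = 1000
AVG_GAME_MINUTES = 7.5   # average game-duration stated above
REPORTED_DAYS = 7        # duration of one testrun stated above

total_games = len(OPPONENTS) * GAMES_PER_OPPONENT

# Duration if the games were played strictly one after another
sequential_days = total_games * AVG_GAME_MINUTES / (60 * 24)

# Assumption: the gap to the reported 7 days comes from games played in parallel
implied_parallel_games = sequential_days / REPORTED_DAYS

print(f"{total_games} games, {sequential_days:.1f} days one after another, "
      f"about {implied_parallel_games:.1f} games in parallel")
```

Played one after another, the games would take about 26 days, so the reported 7 days imply that roughly four games run at the same time.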

To avoid distortions in the Ordo Elo-calculation, from now on only 2x Stockfish (the latest official release + the latest version), 1x asmFish and 1x BrainFish are stored in the gamebase (the games of all older engine-versions are deleted every time a new version has been tested). Older Elo-results of Stockfish, asmFish and BrainFish can still be seen in the Elo-diagrams below. BrainFish always plays with the latest Cerebellum-Library of course, because otherwise BrainFish = Stockfish.

 

Latest update: 2017/09/25: asmBrainFish 170825

 

(Ordo-calculation fixed to Stockfish 8 = 3396 Elo. This value was chosen so that Stockfish 170526 had the same Elo-result with the new testing conditions as it had with the old conditions. So, there is no "break" in the Elo-progress in the diagram below...)

 

See the individual statistics of engine-results here

Download the current gamebase here

Download the gamebase-archive (all played games with the HERT-set) here

 

     Program                      Elo    +    -   Games   Score   Av.Op.  Draws

   1 asmBrainFish 170825        : 3491    8    8  5000    78.4 %   3246   39.5 % (new)
   2 BrainFish 170826 bmi2      : 3467    7    7  5000    76.1 %   3246   41.8 %
   3 asmFish 170819 bmi2        : 3427    7    7  5000    72.1 %   3246   46.4 %
   4 Stockfish 170909 bmi2      : 3425    7    7  5000    71.9 %   3246   45.1 %
   5 Stockfish 8 161101 bmi2    : 3396    7    7  5000    68.7 %   3246   49.9 %
   6 Komodo 11.2.2 x64          : 3380    5    5  9000    54.6 %   3340   53.2 %
   7 Houdini 5 pext             : 3369    5    5  9000    53.1 %   3341   55.0 %
   8 Shredder 13 x64            : 3199    5    5  9000    30.1 %   3360   41.2 %
   9 Fizbo 1.9 bmi2             : 3176    6    6  9000    27.3 %   3362   35.6 %
  10 Andscacs 0.91b bmi2        : 3106    6    6  9000    19.8 %   3370   30.6 %

Below you find a diagram of the progress of Stockfish in my tests since the end of 2016

And below that diagram, the older diagrams.

 

You can save the diagrams (as a JPG-picture in original size) on your PC with a right mouseclick and then choosing "save image"...

The Elo-ratings of older Stockfish dev-versions in the Ordo-calculation can differ a little from the Elo-"dots" in the diagram. When the results/games of new Stockfish dev-versions become part of the Ordo-calculation, they can change the Elo-ratings of the opponent engines, and that can change the Elo-ratings of older Stockfish dev-versions. This happens only in the Ordo-calculation / ratinglist, not in the diagram, where each Elo-"dot" is the rating of one Stockfish dev-version at the moment when the testrun of that dev-version was finished.

