Stefan Pohl Computer Chess

private website for chessengine-tests


Latest Website-News (2017/11/18): Testrun of asmBrainFish 171107 (with Cerebellum-Library 171101) finished. Next testrun: Stockfish 171111. Result not before next Friday.

 

Long thinking-time tournament updated.

 

Stay tuned.

 

New download-content available: In the SALC_V3_10moves-package, you will find a new folder called "SALC_half_closed". It contains only SALC-lines (PGN & EPD) that have a white and a black pawn on at least one of the two center files (d-file or e-file). For further information, check out the ReadMe-file in the download-package. The idea is that in these positions, the probability of many early captures is much lower, so it should take more time (and moves) to reach drawish endgame positions. So, the probability of an interesting and long middlegame should get higher...(hopefully)...
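The half-closed criterion can be sketched as a small filter on the piece-placement field of an EPD/FEN line. This is only an illustration of the rule described above, not the script used to build the package; the function name `is_half_closed` is made up for this example, and it reads the rule as "the same center file holds both a white and a black pawn".

```python
def is_half_closed(epd: str) -> bool:
    """Return True if the d-file or the e-file holds both a white and a black pawn.

    Only the first EPD/FEN field (piece placement) is inspected.
    """
    board_field = epd.split()[0]           # ranks 8 down to 1, separated by "/"
    white_pawn_files, black_pawn_files = set(), set()
    for rank in board_field.split("/"):
        file_index = 0                     # 0 = a-file ... 7 = h-file
        for ch in rank:
            if ch.isdigit():
                file_index += int(ch)      # a digit is a run of empty squares
            else:
                if ch == "P":
                    white_pawn_files.add(file_index)
                elif ch == "p":
                    black_pawn_files.add(file_index)
                file_index += 1
    center_files = {3, 4}                  # d-file and e-file
    return bool(center_files & white_pawn_files & black_pawn_files)


# Start position: both center files still hold pawns of both colours.
print(is_half_closed("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -"))
# After 1.e4 d5 2.exd5 Qxd5: neither center file holds pawns of both colours.
print(is_half_closed("rnb1kbnr/ppp1pppp/8/3q4/8/8/PPPP1PPP/RNBQKBNR w KQkq -"))
```

The same check could be run over every line of an EPD file to keep only the half-closed positions.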

 

The testrun of SALC half-closed is finished. The result is really impressive and a huge step forward in my mission to save computer chess from draw-death. Take a look at the result on the "Experiments"-site.

 

Download the new SALC-package in the "Downloads & Links"-section or right here

 

Because the SALC half-closed result is so much better than "normal" SALC, I decided to start the development of a completely new SALC V4 (hc+) book (and opening-sets). Because half-closed SALC-positions are rarer than "normal" SALC-positions, I will use opening-lines 11 moves (22 plies) deep for the development, instead of 10 moves. The development will take some time. In the meantime, I strongly recommend using the SALC half-closed positions in the SALC V3 download-package: only 7053 positions, but SALC V4 (hc+) will definitely have many more positions!


Stockfish testing

 

Playing conditions:

 

Hardware: i7-6700HQ 2.6GHz Notebook (Skylake CPU), Windows 10 64bit, 8GB RAM

Fritzmark: singlecore: 5.3 / 2521 (all engines running on one core, only), average meganodes/s displayed by LittleBlitzerGUI: Houdini: 2.6 mn/s, Stockfish: 2.2 mn/s, Komodo: 2.0 mn/s

Hash: 512MB per engine

GUI: LittleBlitzerGUI (draw at 130 moves, resign at 400cp (for 4 moves))

Tablebases: None

Openings: HERT testset (by Thomas Zipproth) (download the file at the "Download & Links"-section or here). I use a version of HERT in which the positions are ordered differently. This makes no difference to the testing-results, so don't be confused if you download my gamebase-file and the game-sequence doesn't match the sequence of your HERT-set.

Ponder, Large Memory Pages & learning: Off

Thinking time: 180''+1000ms (= 3'+1'') per game/engine (average game-duration: around 7.5 minutes). One 5000-game testrun takes about 7 days.

The version-numbers of the Stockfish development engines are the release-date, written backwards (year, month, day) (example: 170526 = May 26, 2017). I use BrainFish-compiles (bmi2) by Thomas Zipproth. Without the Cerebellum-Library, BrainFish is identical to Stockfish, and the BrainFish-compiles are the fastest compiles of the Stockfish C++ code at the moment: around 10% faster than the abrok.eu-compiles and around 4% faster than the ultimaiq-compiles.
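As a small illustration of the version-number scheme, the six digits can be decoded back into a calendar date. This is a minimal sketch; the helper name `release_date` is hypothetical, and it assumes the version is a separate six-digit token in the engine name, as in the rating list on this page.

```python
from datetime import date, datetime


def release_date(engine_name: str) -> date:
    """Find a six-digit YYMMDD token in an engine name and return it as a date."""
    for token in engine_name.split():
        if len(token) == 6 and token.isdigit():
            # %y maps two-digit years 00-68 to 2000-2068
            return datetime.strptime(token, "%y%m%d").date()
    raise ValueError(f"no YYMMDD version found in {engine_name!r}")


print(release_date("Stockfish 170526"))      # the example from the text
print(release_date("asmBrainFish 171107"))
```

Names without a date token, such as "Houdini 6 pext", raise a `ValueError` in this sketch.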

Download BrainFish (and the Cerebellum-Library): here

 

Each Stockfish-version plays 1000 games against each of Komodo 11.2.2, Houdini 6, Fire 6.1, Shredder 13 and Fizbo 1.9 (5000 games in total). All engines are running with default-settings, except: Move Overhead is set to 300ms, if the engine offers that option.
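The numbers given in the playing conditions can be cross-checked with a little arithmetic. The sketch below uses only figures from this page (1000 games against each of 5 opponents, about 7.5 minutes per game, about 7 days per testrun). The resulting number of games running at the same time is an inference from those figures and is not stated anywhere on the page.

```python
def total_games(games_per_opponent: int, opponents: int) -> int:
    """Games in one testrun."""
    return games_per_opponent * opponents


def sequential_days(games: int, avg_game_minutes: float) -> float:
    """Days needed if the games were played strictly one after another."""
    return games * avg_game_minutes / (60 * 24)


games = total_games(1000, 5)
days_one_at_a_time = sequential_days(games, 7.5)
# Assumption: the reported 7 days are wall-clock time for the whole testrun.
implied_concurrency = days_one_at_a_time / 7

print(f"games per testrun:          {games}")
print(f"days if played sequentially: {days_one_at_a_time:.1f}")
print(f"implied simultaneous games:  {implied_concurrency:.1f}")
```

Played one at a time, the testrun would need about 26 days, so a 7-day run implies that several games run in parallel, each engine on its own core.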

To avoid distortions in the Ordo Elo-calculation, from now on only 2x Stockfish (the latest official release + the latest version), 1x asmFish and 1x BrainFish are stored in the gamebase (the games of all older engine-versions are deleted every time a new version has been tested). Older Elo-results of Stockfish, asmFish and BrainFish can still be seen in the Elo-diagrams below. BrainFish always plays with the latest Cerebellum-Library, of course, because otherwise BrainFish = Stockfish.

 

Latest update: 2017/11/18: asmBrainFish 171107

 

(Ordo-calculation fixed to Stockfish 8 = 3396 Elo. This value was chosen so that Stockfish 170526 had the same Elo-result with the new testing conditions as it had with the old conditions. So, there is no "break" in the Elo-progress in the diagram below...)

 

See the individual statistics of engine-results here

Download the current gamebase here

Download the archive (all played games with the HERT-set (80000 games)) here

 

     Program                    Elo    +    -   Games   Score   Av.Op.  Draws

   1 asmBrainFish 171107      : 3492    8    8  5000    75.5 %   3279   43.5 % (new)
   2 BrainFish 171104 bmi2    : 3479    7    7  5000    74.2 %   3279   44.6 %
   3 asmFish 171019 bmi2      : 3445    7    7  5000    70.6 %   3279   48.1 %
   4 Stockfish 171011 bmi2    : 3437    7    7  5000    69.8 %   3279   48.5 %
   5 Houdini 6 pext           : 3425    5    5  9000    58.3 %   3358   56.1 %
   6 Stockfish 8 161101       : 3396    7    7  5000    65.0 %   3279   53.7 %
   7 Komodo 11.2.2 x64        : 3380    5    5  9000    51.9 %   3363   54.3 %
   8 Fire 6.1 popc            : 3214    5    5  9000    29.1 %   3381   42.8 %
   9 Shredder 13 x64          : 3200    5    5  9000    27.4 %   3383   41.4 %
  10 Fizbo 1.9 bmi2           : 3178    6    6  9000    24.8 %   3385   35.4 %
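For readers who want to relate the Score and Av.Op. columns to the Elo column, here is a rough sketch using the standard logistic Elo formula. It is not how Ordo computes its ratings. It only estimates a performance rating from the average opponent rating and the score percentage, so it lands near the listed Elo values without reproducing them. The function names are made up for this example.

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Logistic Elo difference implied by a score fraction strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))


def performance(avg_opponent: float, score_percent: float) -> float:
    """Naive performance rating: average opponent rating plus the implied difference."""
    return avg_opponent + elo_diff_from_score(score_percent / 100.0)


# (name, listed Elo, Av.Op., Score %) copied from the rating list above
rows = [
    ("asmBrainFish 171107", 3492, 3279, 75.5),
    ("Stockfish 8 161101", 3396, 3279, 65.0),
    ("Komodo 11.2.2 x64", 3380, 3363, 51.9),
]
for name, listed, avg_op, score in rows:
    print(f"{name:22s} listed {listed}  naive estimate {performance(avg_op, score):.0f}")
```

The gap between the naive estimate and the listed value is expected, because the rating list is calculated from all individual games at once and not from one average opponent rating.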

Below you find a diagram of the progress of Stockfish in my tests since the end of 2016

And below that diagram, the older diagrams.

 

You can save the diagrams (as JPG-pictures in original size) on your PC with a right mouse-click and then choosing "save image"...

The Elo-ratings of older Stockfish dev-versions in the Ordo-calculation can differ a little from the Elo-"dots" in the diagram. When the games of a new Stockfish dev-version become part of the Ordo-calculation, they can change the Elo-ratings of the opponent engines, and that in turn can change the Elo-ratings of older Stockfish dev-versions. This happens only in the Ordo-calculation / ratinglist, not in the diagram, where each Elo-"dot" is the rating of one Stockfish dev-version at the moment its testrun was finished.

