Stefan Pohl Computer Chess

A private website for chess engine testing


Latest Website-News (2019/01/16): Testrun of Lc0 v0.20.1 N:32463 finished. This is the first testrun of Lc0 for my 3'+1'' mini-ratinglist. Because Lc0 needs the GPU, only one game can be played at a time, so it is not possible to play 5000 games (that would take me a month...). Instead, Lc0 played 1200 games (200 games vs. each of 6 top engines). Next testrun: Stockfish 190114. Results not before next Wednesday.

 

Long thinking-time tournament updated, with nice progress by Lc0 v0.20.1 Net 32463.

 

Update of my Drawkiller openings to v1.5: I added a small openings set with 500 positions / opening lines, which gave the lowest draw rate and the widest spread of Elo results of all Drawkiller openings! I recommend downloading the new Drawkiller files: here

 

My Drawkiller openings project is finished. Never before has any openings set given such low draw rates without crunching the engines' scores towards 50%; instead, it pushes the scores away from 50%. The Drawkiller Normal and Tournament sets nearly halve the draw rate compared to FEOBOS or the Stockfish Framework 8-move openings. I would never have expected that this was possible: the Drawkiller project is really a breakthrough into another dimension of computer chess. Learn more about Drawkiller openings in the "Drawkiller openings" section on this website and download them here

 

Stay tuned.

 


Stockfish testing

 

Playing conditions:

 

Hardware: i7-6700HQ 2.6GHz Notebook (Skylake CPU), Windows 10 64bit, 8GB RAM

Fritzmark: singlecore: 5.3 / 2521 (all engines running on one core only); average meganodes/s displayed by the LittleBlitzerGUI: Houdini: 2.6 mn/s, Stockfish: 2.2 mn/s, Komodo: 2.0 mn/s

GPU (used by LC Zero): Nvidia CUDA GeForce GTX 950M (4GB memory, graphics: 914 MHz, memory: 1.25 GHz). LC Zero CUDA calculates around 1020 rollouts/s in the starting position (measured with the "go infinite" command until depth 28) with net size 20x256

Leela-Ratio: 0.67 for my tournament (opponent engines running single-threaded). What is it? Look here
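As I understand it, the Leela-Ratio compares the Lc0-vs-opponent speed ratio on the test hardware with the corresponding ratio in the DeepMind AlphaZero vs. Stockfish match. A minimal sketch, assuming reference values of roughly 80 knps for AlphaZero and 70 Mnps for Stockfish (these constants are my assumption; the value of 0.67 above may be based on different reference numbers or nps measurements):

```python
# Hypothetical sketch of a Leela-Ratio calculation.
# The reference constants are assumed from the published AlphaZero vs.
# Stockfish match conditions; the definition used on this site may differ.

AZ_NPS = 80_000        # assumed AlphaZero nodes/s in the DeepMind match
SF_NPS = 70_000_000    # assumed Stockfish nodes/s in the DeepMind match

def leela_ratio(lc0_nps, opponent_nps):
    """Ratio of the tester's GPU/CPU speed balance to the AlphaZero match."""
    return (lc0_nps / opponent_nps) / (AZ_NPS / SF_NPS)

# Example with the numbers given above (1020 rollouts/s for Lc0,
# ~2.2 mn/s for single-core Stockfish):
print(round(leela_ratio(1020, 2_200_000), 2))
```

A ratio of 1.0 would mean Lc0 gets the same relative hardware budget as AlphaZero did; values below 1 mean it gets less.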

Hash: 512MB per engine

GUI: LittleBlitzerGUI (draw adjudication at 130 moves, resign at 400cp (for 4 moves))
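The adjudication rules can be sketched as follows; this is a minimal sketch of my reading of the settings, not LittleBlitzerGUI's actual implementation:

```python
# Sketch of the adjudication rules described above (assumed semantics;
# LittleBlitzerGUI's real logic may differ in details).

DRAW_MOVE_LIMIT = 130      # game adjudicated as a draw after 130 moves
RESIGN_THRESHOLD_CP = 400  # centipawn score triggering resignation...
RESIGN_MOVE_COUNT = 4      # ...if held for 4 consecutive moves

def adjudicate(move_number, recent_scores):
    """Return 'draw', 'resign', or None (game continues).

    recent_scores: the last evaluations in centipawns from the losing
    side's point of view (negative = losing).
    """
    if move_number >= DRAW_MOVE_LIMIT:
        return "draw"
    last = recent_scores[-RESIGN_MOVE_COUNT:]
    if len(last) == RESIGN_MOVE_COUNT and all(
        s <= -RESIGN_THRESHOLD_CP for s in last
    ):
        return "resign"
    return None
```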

Tablebases: None

Openings: HERT test set (by Thomas Zipproth) (download the file in the "Download & Links" section or here). (I use a version of HERT in which the positions in the file are ordered differently; this makes no difference for the testing results, so don't be confused if you download my gamebase file and the game sequence doesn't match the sequence of your HERT set...)

Ponder, Large Memory Pages & learning: Off

Thinking time: 180''+1000ms (= 3'+1'') per game/engine (average game duration: around 7.5 minutes). One 5000-game testrun takes about 7 days. The version numbers of the Stockfish engines are the date of the latest patch included in the Stockfish source code, written backwards (year, month, day), not the release date of the engine file (example: 170526 = May 26, 2017). Since July 2018 I use the abrok compiles of Stockfish again (http://abrok.eu/stockfish), because they are now much faster than before: only 1.3% slower than BrainFish compiles. So there is no reason anymore not to use these "official" development compiles.

Download BrainFish (and the Cerebellum-Library): here

 

Each Stockfish version plays 1000 games each versus Komodo 12.3, Houdini 6, Fire 7.1, Ethereal 11.12 and Komodo 12.3 MCTS. All engines run with default settings, except that Move Overhead is set to 500ms if an engine allows it.

To avoid distortions in the Ordo Elo calculation, from now on only 2x Stockfish (latest official release + latest dev version), 1x asmFish and 1x BrainFish are stored in the gamebase (the games of all older engine versions are deleted every time a new version is tested). The older Elo results of Stockfish, asmFish and BrainFish can still be seen in the Elo diagrams below. BrainFish always plays with the latest Cerebellum library, of course, because otherwise BrainFish = Stockfish.

 

Latest update: 2019/01/16: Lc0 v0.20.1 N:32463

 

(Ordo-calculation fixed to Stockfish 10 = 3508 Elo)

 

See the individual statistics of engine-results here

See the ORDO-rating of the archive-gamebase since 2019 here

Download the current gamebase here

Download the archive-gamebase since 2019 here

 

     Program                    Elo    +    -   Games   Score   Av.Op.  Draws

   1 BrainFish 180728 bmi2    : 3531    8    8  5000    77.8 %   3295   39.0 %
   2 Stockfish 190101 bmi2    : 3510    8    8  5000    73.9 %   3321   44.4 %
   3 Stockfish 10 181129      : 3508    7    7  7200    74.4 %   3312   43.0 %
   4 Stockfish 9 180201       : 3452    6    6  8000    74.9 %   3246   40.5 %
   5 Houdini 6 pext           : 3425    4    4 14200    63.3 %   3317   48.7 %
   6 Komodo 12 bmi2           : 3394    5    5 12000    62.1 %   3297   48.2 %
   7 Komodo 12.3 bmi2         : 3392    7    7  6200    53.1 %   3369   55.2 %
   8 Lc0 v0.20.1 N:32463      : 3296   14   14  1200    42.7 %   3352   43.7 % (new)
   9 Fire 7.1 popc            : 3281    4    4 14200    44.4 %   3327   49.5 %
  10 Ethereal 11.12 pext      : 3254    6    6  7200    32.4 %   3392   48.2 %
  11 Komodo 12.3 MCTS         : 3252    7    7  7200    32.1 %   3392   46.0 %
  12 Shredder 13 x64          : 3189    6    6  9000    31.7 %   3344   41.0 %
  13 Fizbo 2 bmi2             : 3188    6    6  9000    35.8 %   3307   37.7 %
  14 Booot 6.3.1 popc         : 3178    7    7  6000    32.1 %   3321   42.9 %
  15 Xiphos 0.4.6 bmi2        : 3169    8    8  5000    27.8 %   3348   39.3 %
  16 Andscacs 0.94 popc       : 3142    7    7  6000    28.1 %   3321   37.4 %
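A single line of the ratinglist can be roughly sanity-checked with the simple logistic performance formula, Elo ≈ Av.Op. + 400·log10(score/(1−score)). This is only an approximation of my own, not Ordo's method: Ordo does an iterative maximum-likelihood fit over all games and anchors the whole list to Stockfish 10 = 3508, so its values differ slightly.

```python
# Rough sanity check for one ratinglist line, assuming the simple
# logistic Elo model (Ordo's actual ML fit gives slightly different numbers).
import math

def simple_elo(score, avg_opponent):
    """Performance rating from score fraction and average opponent Elo."""
    return avg_opponent + 400 * math.log10(score / (1 - score))

# Stockfish 10's line above: 74.4 % against an average opponent of 3312
print(round(simple_elo(0.744, 3312)))  # close to, but not exactly, 3508
```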

The version numbers of the engines (180622, for example) are the date of the latest patch included in the Stockfish source code, not the release date of the engine file. Especially the asmFish engines are often released much later!
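The date scheme can be sketched as a small helper (a hypothetical snippet for illustration, not part of any testing tool):

```python
# The six digits of a dev version number are the patch date as YYMMDD.
from datetime import date

def version_to_date(version):
    """Convert a Stockfish dev version number like '170526' to a date."""
    yy, mm, dd = int(version[0:2]), int(version[2:4]), int(version[4:6])
    return date(2000 + yy, mm, dd)

print(version_to_date("180622"))  # 2018-06-22, i.e. June 22, 2018
```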

Below you will find a diagram of the progress of Stockfish in my tests since the end of 2018.

And below that diagram, the older diagrams.

 

You can save the diagrams (as JPG pictures in original size) on your PC with a right-click, then choose "save image"...

The Elo ratings of older Stockfish dev versions in the Ordo calculation can differ a little from the Elo "dots" in the diagram. When the games of a new Stockfish dev version become part of the Ordo calculation, they can change the Elo ratings of the opponent engines, which in turn can change the Elo ratings of older Stockfish dev versions in the Ordo ratinglist. In the diagram, by contrast, each Elo "dot" is the rating of a Stockfish dev version at the moment its testrun was finished.

 

 

 

