Stefan Pohl Computer Chess
private website for chessengine-tests

Latest Website-News (2020/01/15): AB-testrun of Slow Chess 2.5 avx2 finished: +50 Elo to Slow Chess 2.4 - nice progress! NN-testrun of Lc0 0.26.3 66988 net finished. See the result and download the games in the "NN vs SF testing"- section.
Next NN-testrun: Lc0 0.26.3 J94-100
Next AB-testrun: Stockfish 210111 avx2
I released the new V2.00 of my Unbalanced Human Openings. All of the 400000 raw-data endpositions were re-evaluated with KomodoDragon 1.0 (instead of Komodo 14 in UHO V1.0) and all UHO openings-sets and opening-books were rebuilt. Learn more in the "Unbalanced Human Openings"- section or download them right here
Stay tuned.

Stockfish testing
Playing conditions:
Hardware: Since 20/07/21, an AMD Ryzen 3900 12-core (24 threads) notebook with 32GB RAM. 20 games are now played simultaneously (!), so each testrun now has 6000 or 7000 games (instead of 5000 before) and takes only 2 days, not 6-7 days as before! All engine-binaries are now popcount/avx2, of course, because bmi2-compiles are extremely slow on AMD. To keep the rating-list engine-names consistent, the "bmi2"- or "pext"-extension in the engine-name is still in use for older engines; otherwise ORDO would not count all games played by that engine as games of one engine.
Speed: (singlethread, TurboBoost-mode switched off, chess starting position) Stockfish: 1.3 mn/s, Komodo: 1.1 mn/s
Hash: 256MB per engine
GUI: Cutechess-cli (GUI ends game, when a 5-piece endgame is on the board)
Tablebases: None for engines, 5 Syzygy for cutechess-cli
Openings: HERT_500 testset (by Thomas Zipproth) (download the file at the "Download & Links"-section or here)
Ponder, Large Memory Pages & learning: Off
Thinking time: 180''+1000ms (= 3'+1'') per game/engine (average game-duration: around 7.5 minutes). One 7000 games-testrun takes about 2 days.

The version-numbers of the Stockfish engines are the date of the latest patch which was included in the Stockfish sourcecode (not the release-date of the engine-file), written backwards (year, month, day). Example: 200807 = August 7, 2020. The used SF compile is the AVX2-compile, which is the fastest on my AMD Ryzen CPU. SF binaries are taken from abrok.eu (except the official SF-release versions, which are taken from the official Stockfish website).
Download BrainFish (and the Cerebellum-Libraries): here
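Two of the conventions above lend themselves to a quick check in code: decoding a dev-version number into a date, and estimating how long a testrun takes from the game count, the average game duration and the number of simultaneous games. The following is only a minimal sketch. The function names `parse_sf_version` and `estimated_run_hours` are made up for illustration, and the decoder assumes that every version number belongs to the years 2000-2099.

```python
from datetime import date


def parse_sf_version(tag: str) -> date:
    """Decode a six-digit YYMMDD version number such as '200807'."""
    if len(tag) != 6 or not tag.isdigit():
        raise ValueError(f"expected six digits (YYMMDD), got {tag!r}")
    # Assumption: two-digit years are in the 2000s, so '20' means 2020.
    year = 2000 + int(tag[0:2])
    month = int(tag[2:4])
    day = int(tag[4:6])
    # date() raises ValueError for impossible dates such as month 13.
    return date(year, month, day)


def estimated_run_hours(games: int, avg_game_minutes: float, concurrency: int) -> float:
    """Rough wall-clock hours if `concurrency` games always run in parallel."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    return games * avg_game_minutes / concurrency / 60.0


print(parse_sf_version("200807").isoformat())   # 2020-08-07
print(estimated_run_hours(7000, 7.5, 20))       # 43.75
```

With the numbers given above (7000 games, around 7.5 minutes per game, 20 games at once) the estimate is 43.75 hours, which fits the stated "about 2 days".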
To avoid distortions in the Ordo Elo-calculation, from now on only 3x Stockfish (latest official release + the latest 2 dev-versions) and 1x Brainfish are stored in the gamebase (the games of all older engine-versions are deleted every time a new version has been tested). Older Elo-results of Stockfish and BrainFish can still be seen in the Elo-diagrams below. BrainFish always plays with the latest Cerebellum-Libraries, of course, because otherwise BrainFish = Stockfish.
Latest update: 2020/01/13: Slow Chess 2.5 (+50 Elo to Slow Chess 2.4)
(Ordo-calculation fixed to Stockfish 12 = 3684 Elo)
See the individual statistics of engine-results here Download the current gamebase here
     Program                     Elo    +    -   Games    Score   Av.Op.   Draws
   1 CFish 12 3xCerebellum   :  3725    9    9    7000   86.1 %     3388   27.3 %

The version-numbers (180622 for example) of the engines are the date of the latest patch, which was included in the Stockfish sourcecode, not the release-date of the engine-file. Especially the asmFish-engines are often released much later!!

Below you find a diagram of the progress of Stockfish in my tests since August 2020. And below that diagram, the older diagrams.
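To get a feeling for how the Score and Av.Op. columns relate to the Elo column, the standard logistic Elo formula can be applied to a single row. This is only a rough sketch and not the method Ordo uses: Ordo computes all ratings jointly from all games, so its value will differ from this single-row estimate. The function name `elo_diff_from_score` is made up for illustration, and "Av.Op." is assumed to mean the average rating of the opponents.

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


# Row from the rating list: 86.1 % score against an average opponent of 3388.
diff = elo_diff_from_score(0.861)
print(round(diff))          # 317
print(round(3388 + diff))   # 3705
```

The single-row estimate of about 3705 is close to, but not the same as, the 3725 in the list, which is expected for such an approximation.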
You can save the diagrams (as a JPG-picture (in original size)) on your PC with a mouseclick (right button) and then choose "save image"...

The Elo-ratings of older Stockfish dev-versions in the Ordo-calculation can be a little different from the Elo-"dots" in the diagram. The results/games of new Stockfish dev-versions, once they become part of the Ordo-calculation, can change the Elo-ratings of the opponent engines, and that can change the Elo-ratings of older Stockfish dev-versions. This affects the Ordo-calculation / ratinglist, but not the diagram, where each Elo-"dot" is the rating of one Stockfish dev-version at the moment when the testrun of that Stockfish dev-version was finished.