Stefan Pohl Computer Chess

private website for chessengine-tests


NN MEA testing

 

The MEA tool by Ferdinand Mosca (programming) and Ed Schroeder (EPD sets) is a great tool that automatically works through EPD sets with multiple possible best moves (each with a different score) and sums up the score points of the moves found by the engine. Ed Schroeder built some huge EPD sets. I use 50000 of these EPD positions, calculated by Stockfish 11.
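To illustrate the idea of several scored best moves per position, here is a minimal Python sketch. It is not MEA's actual code. The "move=points" text layout, the function names and the rule "score rate = earned points / maximum possible points" are assumptions made only for this example.

```python
def parse_move_scores(field):
    """Parse a text like 'Nf3=10, d4=7' into {'Nf3': 10, 'd4': 7}.

    The 'move=points' layout is an assumption for this sketch.
    """
    scores = {}
    for item in field.split(","):
        move, _, points = item.strip().partition("=")
        if move and points:
            scores[move] = int(points)
    return scores


def score_rate(positions, engine_moves):
    """Sum the points of the moves the engine chose and divide by the maximum.

    positions    : list of dicts (move -> points), one per test position
    engine_moves : list of moves chosen by the engine, in the same order
    A move that is not listed for a position earns 0 points.
    """
    earned = 0
    maximum = 0
    for move_scores, played in zip(positions, engine_moves):
        earned += move_scores.get(played, 0)
        maximum += max(move_scores.values())
    return earned / maximum if maximum else 0.0


# Two hypothetical positions and the moves a hypothetical engine picked.
positions = [
    parse_move_scores("Nf3=10, d4=7"),
    parse_move_scores("e4=10, c4=5"),
]
engine_moves = ["d4", "e4"]  # 7 points + 10 points out of 20 possible
rate = score_rate(positions, engine_moves)
```

With these made-up positions the engine earns 17 of 20 possible points, so the score rate is 0.85.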

 

You find the MEA tool here

Download the huge 50000 positions epd-file, which I use for testing, here

Download a small textfile with a short manual and examples how to use MEA with lc0 here

 

MEA is really great for testing neural nets, because it is possible to let lc0 evaluate only the board position itself (1 node). For that, you have to set the MEA thinking time to 1ms and pass the option Slowmover=0 to lc0. If you do so, one test run takes only a few minutes (10x128 nets) up to 90 minutes (30x384 nets), even though the CPU version of lc0 is used. So you do not need a modern GPU in your machine...

There are 2 disadvantages. First, you cannot compare nets of different sizes, because bigger nets are better than smaller nets when only the board position is evaluated and no MCTS is done. Second, MEA only gives you a result in score points, not an Elo number.

 

But, of course, it is also possible to use MEA with thinking time and lc0 running on a modern GPU. Then a test run takes longer, but the results of different net sizes are comparable. However, on a fast GPU lc0 (and Stockfish on a fast CPU) is too strong for MEA testing IMHO. With 5 seconds per position, Stockfish scored more than 0.870 (87%), which seems too much to me, especially considering the further progress the engines will make in the future. So I cannot recommend using MEA in that way. But for testing NNs with 1 node per position, MEA is just perfect.

 

 

Test results of neural nets with 1 node per position. For each test run, 50000 positions of the EPD file were calculated.

Latest update: 20/09/17 (New results of both training-runs and J92-170 net)

 

Netsize 10x128 results here (results of T72-nets (training run 2))

Netsize 19x256 and 20x256 results here

Netsize 24x320 results here (results of T60-nets (training run 1))

Netsize 30x384 results here

 

 

Best nets, tested so far here with MEA:

Netsize 10x128 best net overall: 703810 (best net of training run 2: 721894)

Netsize 19x256 and 20x256 best net overall: LS 15

Netsize 24x320 best net overall: T60B.7-135 (best net of training run 1: 64991)

Netsize 30x384 best net overall: J92-160

 

 

I made some comparisons with lc0 results on discord and in the FGRL rating list. It seems that an increase in score rate of +0.001 corresponds to around +2 Elo... but only for nets of the same size. Never compare MEA results of different net sizes when only 1 node per test position was calculated, because that is a huge advantage for bigger nets!

Example: On discord, a 1000-game test run between LS 14.3 and the t40-1541 net (both 20x256 size!) gave +25 Elo for LS 14.3. In my MEA 1-node tests, LS 14.3 has a score rate of 0.657 and t40-1541 has a score rate of 0.645.

The difference is 0.012, i.e. 12 steps of 0.001. That means: 12 * 2 = +24 Elo, which is close to the +25 Elo measured in the games.
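The rule of thumb above can be written as a tiny Python helper. This is only a sketch of the arithmetic: the function and constant names are made up for this example, and the factor of 2 Elo per 0.001 is the rough estimate from the comparison above, not an exact conversion. It only makes sense for nets of the same size.

```python
# Rule of thumb from the text: +0.001 score rate is around +2 Elo
# (only valid for nets of the same size, tested with 1 node per position).
ELO_PER_MILLI = 2.0


def estimate_elo_gain(score_a, score_b, elo_per_milli=ELO_PER_MILLI):
    """Rough Elo difference (net A minus net B) from two MEA score rates."""
    return (score_a - score_b) * 1000.0 * elo_per_milli


# The example from the text: LS 14.3 (0.657) against t40-1541 (0.645).
gain = estimate_elo_gain(0.657, 0.645)
print(f"{gain:+.0f} Elo")  # prints "+24 Elo"
```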