LimitStrength tournament

Alexander Schmidt
Posts: 30
Joined: Wed Jun 09, 2010 3:14 pm

LimitStrength tournament

Post by Alexander Schmidt » Wed Dec 12, 2012 1:38 pm

I ran some tournaments for engines that support the UCI_LimitStrength feature.

Maybe this is helpful for authors who want to tune their ELO values.

All engines played with the ELO 2000 setting.

I added some engines with known SSDF ratings (old ratings, which are the ones best calibrated to human ratings):

Roma 68020 ~ ELO 2030
Dallas 68000 ~ ELO 1971
Resurrection Fruit 2.1 ~ ELO 2445
Fruit 2.1 is a modification by me which will probably not be released.

Code: Select all

------------------------------------------------------------------------------------------------------------------
 1: Amyan 1.72                     2000  35,5 / 36   XXXX 1111 1111 1111 1111 1111 1111 11=1 1111 1111   2715 +198
 2: Chiron 1.5                     2000  30,0 / 36   0000 XXXX 1=11 11== 1111 1111 1111 111= 1111 1111   2311 +140
 3: Resurrection Fruit 2.1 203 MHz 2445  25,0 / 36   0000 0=00 XXXX ==1= 1=1= 111= 1111 1111 1111 =111   2236 -72
 4: Delfi Trainer 5.4              2000  23,0 / 36   0000 00== ==0= XXXX 1100 ==11 1111 1111 1=11 1111   2140 +72
 5: HIARCS 13.2                    2000  19,5 / 36   0000 0000 0=0= 0011 XXXX 11== 0111 1111 101= 1111   2067 +36
 6: GreKo 9.7                      1999  17,5 / 36   0000 0000 000= ==00 00== XXXX 1111 1=11 1111 =111   2031 +18
 7: Mephisto Roma 32Bit            2030   9,5 / 36   0000 0000 0000 0000 1000 0000 XXXX 101= 11=1 01=1   1854 -79
 8: SlowChess 2.96                 2000   7,5 / 36   00=0 000= 0000 0000 0000 0=00 010= XXXX ==11 001=   1808 -83
 9: Fruit 2.1                      2000   7,0 / 36   0000 0000 0000 0=00 010= 0000 00=0 ==00 XXXX =111   1787 -90
10: Mephisto Dallas 16Bit         1971   5,5 / 36   0000 0000 =000 0000 0000 =000 10=0 110= =000 XXXX   1742 -90
------------------------------------------------------------------------------------------------------------------
180 games: +82 =27 -71
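The score column can be checked against the result strings in each row. A minimal sketch, assuming the usual cross-table notation (`1` = win, `=` = draw, `0` = loss, `X` = the engine's own cell); the function name `row_score` is made up for this illustration:

```python
def row_score(results: str) -> float:
    """Sum one cross-table row: '1' = win, '=' = draw, '0' = loss.

    Spaces and the 'X' placeholders (the engine's own cell) are ignored.
    """
    points = {"1": 1.0, "=": 0.5, "0": 0.0}
    return sum(points[c] for c in results if c in points)


# Result strings copied from the first two rows of the table above.
amyan = "XXXX 1111 1111 1111 1111 1111 1111 11=1 1111 1111"
chiron = "0000 XXXX 1=11 11== 1111 1111 1111 111= 1111 1111"

print(row_score(amyan), row_score(chiron))
```

Both totals agree with the 35,5 and 30,0 listed in the table.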
Bugs:

Arasan stops playing with ELO 2000 at long time controls. It looks like the nps decreases way too much.

GreKo can't play under Arena with ELO 2000 because of an inaccuracy in the UCI protocol. GreKo's default value is 2000, and Arena doesn't send a value if it equals the default.

Code: Select all

2012-12-12 13:30:28,005<--1:uciok
2012-12-12 13:30:28,006-->1:setoption name Hash value 64
2012-12-12 13:30:28,006-->1:setoption name UCI_LimitStrength value true
2012-12-12 13:30:28,006-->1:setoption name UCI_Elo value 1999
2012-12-12 13:30:28,010-->1:isready
2012-12-12 13:30:28,166<--1:readyok

2012-12-12 13:30:45,126<--1:uciok
2012-12-12 13:30:45,130-->1:setoption name Hash value 64
2012-12-12 13:30:45,136-->1:setoption name UCI_LimitStrength value true
2012-12-12 13:30:45,180-->1:isready
2012-12-12 13:30:45,251<--1:readyok
I tested only engines which play at the same strength on different hardware (by limited nodes) and don't move immediately.
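The difference between the two handshakes in the log above can be made explicit with a small script. This is only a sketch: it assumes the Arena debug-log layout shown above (timestamp, `-->` for GUI-to-engine lines, engine number, colon), and the helper name `options_per_handshake` is invented here.

```python
import re

# Matches GUI-to-engine "setoption" lines; the value part is optional
# because "button" options carry no value.
SETOPTION = re.compile(r"-->\d+:setoption name (.+?)(?: value (.*))?$")


def options_per_handshake(log_lines):
    """Return one {option: value} dict per handshake (started by 'uciok')."""
    sessions = []
    for raw in log_lines:
        line = raw.strip()
        if line.endswith("uciok"):
            sessions.append({})
        match = SETOPTION.search(line)
        if match and sessions:
            sessions[-1][match.group(1)] = match.group(2)
    return sessions


log = """\
2012-12-12 13:30:28,005<--1:uciok
2012-12-12 13:30:28,006-->1:setoption name Hash value 64
2012-12-12 13:30:28,006-->1:setoption name UCI_LimitStrength value true
2012-12-12 13:30:28,006-->1:setoption name UCI_Elo value 1999
2012-12-12 13:30:28,010-->1:isready
2012-12-12 13:30:45,126<--1:uciok
2012-12-12 13:30:45,130-->1:setoption name Hash value 64
2012-12-12 13:30:45,136-->1:setoption name UCI_LimitStrength value true
2012-12-12 13:30:45,180-->1:isready
""".splitlines()

sessions = options_per_handshake(log)
for number, sent in enumerate(sessions, start=1):
    print(number, sent)
```

The first handshake (ELO set to 1999) contains a `UCI_Elo` entry; the second one (ELO left at the default 2000) does not.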

Richard Allbert
Posts: 15
Joined: Sat Jul 17, 2010 6:10 pm
Contact:

Re: LimitStrength tournament

Post by Richard Allbert » Sun Dec 16, 2012 7:08 pm

Hi Alex

An interesting idea!

Are you planning to do more of this, perhaps over other strength settings?

I've recently started playing chess again. I'm a weak player, and I found that the adjusted strength settings of engines varied a lot; it worked better to use the weaker engines from the 5th division of the WBEC.

It would be nice to eventually have an accuracy estimate for strength settings for the engines that support this.

I guess a lot of testing time is involved, though.

Ciao

Richard

Alexander Schmidt
Posts: 30
Joined: Wed Jun 09, 2010 3:14 pm

Re: LimitStrength tournament

Post by Alexander Schmidt » Wed Dec 19, 2012 9:57 pm

Sorry for the delayed answer, I missed your posting... :oops:
Richard Allbert wrote:
It would be nice to eventually have an accuracy estimate for strength settings for the engines that support this.
Yes, I'd like to have this too :)

I have been dealing with this topic for a while and I don't think it is possible to get accurate values. It would need tests with all ELO settings, different time controls, and lots of engines. It took me months to tune SlowChess to play at the correct strength in all kinds of situations.

As a rough estimate you can take this older list of mine:

40moves/15min

Code: Select all

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

 1 Amyan ELO 2000                 : 2727    0 105    88    97.7 %   2127    4.5 %
 2 Chiron ELO 2000                : 2382   86  81    88    78.4 %   2158    9.1 %
 3 Delfi Trainer ELO 2000         : 2266   69  68    88    63.6 %   2169   18.2 %
 5 Hiarcs ELO 2000                : 2213   67  67    88    55.7 %   2174   18.2 %
10 Mephisto Roma 32 Bit           : 2030   72  74    88    28.4 %   2190   15.9 %
11 StockFish ELO 2000             : 2011   76  79    88    26.1 %   2192   11.4 %
12 Rybka ELO 2000                 : 1960   61  59   160    81.2 %   1705   15.0 %
13 Shredder ELO 2000              : 1878   40  40   248    47.4 %   1896   16.5 %
14 Ufim ELO 2000                  : 1758   51  51   160    50.3 %   1756   10.6 %
15 Junior ELO 2000                : 1711   50  51   160    41.9 %   1768   15.0 %
16 DeepSjeng ELO 2000             : 1519   67  70   160    15.3 %   1816    9.4 %
and 40moves/120min:

Code: Select all

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

 1 Amyan ELO 2000                 : 2547  106  98    96    90.1 %   2164    9.4 %
 2 Chiron ELO 2000                : 2470   84  80    96    84.9 %   2170   15.6 %
 4 Hiarcs ELO 2000                : 2297   66  64    96    65.6 %   2185   20.8 %
 6 Delfi Trainer ELO 2000         : 2251   64  63    96    58.9 %   2188   19.8 %
 7 StockFish ELO 2000             : 2203   65  65    96    51.6 %   2192   15.6 %
11 Mephisto Roma 32 Bit           : 2030   73  76    96    26.6 %   2207    9.4 %
12 Rybka ELO 2000                 : 1978   61  59   160    81.2 %   1723   15.0 %
13 Mephisto MM V                  : 1889   93 100    96    13.0 %   2219    7.3 %
14 Shredder ELO 2000              : 1863   41  41   256    40.8 %   1928   12.1 %
15 Ufim ELO 2000                  : 1776   51  51   160    50.3 %   1774   10.6 %
16 Junior ELO 2000                : 1729   50  51   160    41.9 %   1786   15.0 %
17 DeepSjeng ELO 2000             : 1537   67  70   160    15.3 %   1834    9.4 %
Both lists are calibrated to the Mephisto Roma with 2030 SSDF ELO.
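For readers who want to relate the Score and Av.Op. columns to the Elo column: several rows are reproduced to within a point by the standard logistic Elo performance formula. The sketch below assumes that formula; which rating tool produced the lists is not stated here, and extreme scores (such as Amyan's 97.7 %) do not match it, so treat this as an illustration rather than the method actually used. The function name is made up.

```python
import math


def performance_rating(score_percent: float, avg_opponent: float) -> float:
    """Logistic Elo performance: opponent average plus 400*log10(p / (1 - p))."""
    p = score_percent / 100.0
    if not 0.0 < p < 1.0:
        raise ValueError("score must be strictly between 0 % and 100 %")
    return avg_opponent + 400.0 * math.log10(p / (1.0 - p))


# Score and Av.Op. taken from the 40moves/15min list above.
chiron = performance_rating(78.4, 2158)     # listed Elo: 2382
delfi = performance_rating(63.6, 2169)      # listed Elo: 2266
stockfish = performance_rating(26.1, 2192)  # listed Elo: 2011

print(round(chiron), round(delfi), round(stockfish))
```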

lucasart
Posts: 201
Joined: Mon Dec 17, 2012 1:09 pm
Contact:

Re: LimitStrength tournament

Post by lucasart » Sat Dec 22, 2012 3:14 am

Alexander Schmidt wrote: GreKo can't play under Arena with ELO 2000 because of an inaccuracy in the UCI protocol. GreKo's default value is 2000, and Arena doesn't send a value if it equals the default.

Code: Select all

2012-12-12 13:30:28,005<--1:uciok
2012-12-12 13:30:28,006-->1:setoption name Hash value 64
2012-12-12 13:30:28,006-->1:setoption name UCI_LimitStrength value true
2012-12-12 13:30:28,006-->1:setoption name UCI_Elo value 1999
2012-12-12 13:30:28,010-->1:isready
2012-12-12 13:30:28,166<--1:readyok

2012-12-12 13:30:45,126<--1:uciok
2012-12-12 13:30:45,130-->1:setoption name Hash value 64
2012-12-12 13:30:45,136-->1:setoption name UCI_LimitStrength value true
2012-12-12 13:30:45,180-->1:isready
2012-12-12 13:30:45,251<--1:readyok
Arena is right. So long as

Code: Select all

setoption name UCI_LimitStrength value true
is sent, you don't need to specify a value if you want to use the default value (but you can still specify it).
In the same logic, you don't need

Code: Select all

setoption name Hash value 32
if the engine's default Hash value is already 32, for example.

So the bug is in GreKo. It is not a bug of the UCI protocol, nor a bug in Arena.
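On the engine side, this means every option should start out at its advertised default, so that a missing setoption command changes nothing. A minimal sketch of that idea, not GreKo's actual code: the class `EngineOptions` and its methods are hypothetical, the Hash default of 32 is just the example value used above, and the UCI_Elo default of 2000 is the one reported for GreKo.

```python
class EngineOptions:
    """Holds UCI option values, pre-filled with the engine's defaults."""

    # Hypothetical defaults for this illustration.
    DEFAULTS = {"Hash": "32", "UCI_LimitStrength": "false", "UCI_Elo": "2000"}

    def __init__(self):
        # Start from the defaults so options the GUI never sends stay valid.
        self.values = dict(self.DEFAULTS)

    def handle(self, command: str) -> None:
        """Apply one 'setoption name <id> [value <x>]' command."""
        tokens = command.split()
        if tokens[:2] != ["setoption", "name"]:
            return
        rest = tokens[2:]
        if "value" not in rest:
            return  # button-type option: nothing to store
        split = rest.index("value")
        name = " ".join(rest[:split])
        value = " ".join(rest[split + 1:])
        if name in self.values:
            self.values[name] = value

    def effective_elo(self):
        """Return the Elo limit, or None when strength limiting is off."""
        if self.values["UCI_LimitStrength"].lower() != "true":
            return None
        return int(self.values["UCI_Elo"])


# Replay the second handshake from the log: no UCI_Elo command is sent.
opts = EngineOptions()
opts.handle("setoption name Hash value 64")
opts.handle("setoption name UCI_LimitStrength value true")
print(opts.effective_elo())
```

With the defaults in place, the engine plays at 2000 even though Arena never sent `UCI_Elo`.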
"Talk is cheap. Show me the code." -- Linus Torvalds.

Alexander Schmidt
Posts: 30
Joined: Wed Jun 09, 2010 3:14 pm

Re: LimitStrength tournament

Post by Alexander Schmidt » Sat Dec 22, 2012 7:49 am

lucasart wrote:So the bug is in GreKo. It is not a bug of the UCI protocol, nor a bug in Arena.
Basically I agree, but you can interpret the UCI protocol either way:
UCI protocol wrote:* setoption name <id> [value <x>]
this is sent to the engine when the user wants to change the internal parameters
of the engine. For the "button" type no value is needed.
One string will be sent for each parameter and this will only be sent when the engine is waiting.
"for each parameter" can be understood as "every value will be sent", regardless of whether it has been changed or not.

I found a lot of engines having problems with that, especially the first UCI engines.

lucasart
Posts: 201
Joined: Mon Dec 17, 2012 1:09 pm
Contact:

Re: LimitStrength tournament

Post by lucasart » Sat Dec 22, 2012 8:00 am

Alexander Schmidt wrote:
lucasart wrote:So the bug is in GreKo. It is not a bug of the UCI protocol, nor a bug in Arena.
Basically I agree, but you can interpret the UCI protocol either way:
UCI protocol wrote:* setoption name <id> [value <x>]
this is sent to the engine when the user wants to change the internal parameters
of the engine. For the "button" type no value is needed.
One string will be sent for each parameter and this will only be sent when the engine is waiting.
"for each parameter" can be understood as "every value will be sent", regardless of whether it has been changed or not.

I found a lot of engines having problems with that, especially the first UCI engines.
Perhaps the UCI protocol definition doesn't clarify that point enough, but it's completely obvious. The very definition of a "default value" is that it's the value you use "by default". So if you don't specify a setoption command, then you are using default values...
"Talk is cheap. Show me the code." -- Linus Torvalds.

Richard Allbert
Posts: 15
Joined: Sat Jul 17, 2010 6:10 pm
Contact:

Re: LimitStrength tournament

Post by Richard Allbert » Tue Dec 25, 2012 9:08 pm

No problem, so did I!

Thanks for the list, there is a lot of variation in there.

Merry Christmas
Richard
