One in a million (or more)

Whatever else you want to talk about. Forum rules still apply.
BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am


Post by BB+ » Wed Oct 05, 2011 6:46 am

Since the closest analogue is to junk science, I preferred this as the proper subforum. :lol:

The use of "headline" large numbers by the ICGA Panel Report has recently caused a kerfuffle. I cannot find the "headline" in question, though the ICGA Report does twice indicate among its 14 pages:
The overlap between Rybka 1.0 beta and Fruit 2.1 is 74%, which is 7.5 standard deviations away, with a less than 1 in 10,000,000 chance of this happening “randomly” assuming a normal distribution.
Rybka versions from 1.0 through 2.3.2a were found to have nearly identical evaluation functions to Fruit 2.1. For the majority of the evaluation features, the same things were measured. The likelihood of this happening by chance is approximately 1 in 10,000,000.
In fact, EVAL_COMP noted that 6 sigma (the measured difference between Fruit 2.1 and Rybka 2.3.2a) was already 1 in 100 million [in fact, in a one-sided comparison it is already 1 in 1000 million], and for 7.5 sigma the odds are longer still. So MarkL toned it down a bit, it seems, perhaps in congruence with my footnote:
I personally do not place much value on the exact numbers, but this at least gives a general indication of the likelihood of the Fruit 2.1 match of evaluation features with the Rybkas.
The question of whether a normal distribution should be used is also one of possible discontent (again mentioned in EVAL_COMP), though again I note that the difference between 1 in a million and 1 in a billion was not too relevant in the case at hand, particularly with the necessarily subjective aspect of the comparison.
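
For anyone who wants to check the sigma-to-odds conversion themselves, here is a minimal sketch using only the Python standard library. It assumes a standard normal distribution (the very assumption questioned above), and the function names are mine, not anything from EVAL_COMP or the ICGA Report.

```python
import math

def one_sided_tail(z: float) -> float:
    """P(Z >= z) for a standard normal Z (upper tail only)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def two_sided_tail(z: float) -> float:
    """P(|Z| >= z) for a standard normal Z (both tails)."""
    return math.erfc(z / math.sqrt(2))

# The two sigma values mentioned in the post.
for z in (6.0, 7.5):
    p1 = one_sided_tail(z)
    p2 = two_sided_tail(z)
    print(f"{z} sigma: one-sided 1 in {1 / p1:.3g}, two-sided 1 in {1 / p2:.3g}")
```

Under that assumption the one-sided tail at 6 sigma comes out on the order of 1 in a billion, and 7.5 sigma is many orders of magnitude rarer, so "less than 1 in 10,000,000" is a very conservative way of putting it.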

Also, the elimination of weak engines (most particularly Faile, which is enough of an outlier to perceptibly disturb the distribution -- the 27 control datapoints have mean 31.3 and deviation 5.6 with it, and the remaining 20 have mean 33.4 and deviation 4.5 upon its exclusion) and the commensurate inclusion of common (TSCP, GnuChess) engines and more-modern stronger [still open-source] engines seems to corroborate the unlikelihood of the Rybka/Fruit overlap.
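
To see how much the choice of control set moves things, here is a rough sketch of the standard-score arithmetic. It assumes the 74% overlap quoted from the ICGA Report is on the same percentage scale as the control means and deviations above, which is my assumption and not something either document states; the helper name is hypothetical.

```python
def z_score(value: float, mean: float, deviation: float) -> float:
    """How many standard deviations `value` lies above `mean`."""
    return (value - mean) / deviation

OVERLAP = 74.0  # Rybka 1.0 beta vs Fruit 2.1, as quoted from the ICGA Report

# (mean, deviation) pairs taken from the post; treating them as percentages
# comparable to OVERLAP is an assumption.
controls = {
    "27 datapoints, Faile included": (31.3, 5.6),
    "20 datapoints, Faile excluded": (33.4, 4.5),
}

for label, (mean, deviation) in controls.items():
    print(f"{label}: {z_score(OVERLAP, mean, deviation):.2f} sigma")
```

The point is only that tightening the control set raises the mean a little but shrinks the deviation more, so the overlap ends up further out in sigma terms, not closer in.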

kingliveson
Posts: 1388
Joined: Thu Jun 10, 2010 1:22 am
Real Name: Franklin Titus
Location: 28°32'1"N 81°22'33"W

Re: One in a million (or more)

Post by kingliveson » Wed Oct 05, 2011 7:52 pm

Forget Rybka -- I could use your strong knowledge in improving another program.
PAWN : Knight >> Bishop >> Rook >>Queen
