The Evidence against Rybka

BB+ · Post by **BB+** » Fri Oct 07, 2011 4:41 am

Rebel wrote:But as you are used to criticizing RF postings here, I was wondering why you have not (yet?) addressed the so-called Fruitication issue.

If my understanding is correct, you and CW are arguing against using Fruit 2.1 as a template when reconstructing equivalent C code to the Rybka executable [whether it should be "equivalent" in the functional or semantic sense is a decision for a particular situation]. However, this use of a "template" is a standard procedure when comparing a specific source to an executable (or programs more generally). As others have noted, the main "sanity check" is to ensure that the source template is sufficiently idiosyncratic for an accidental match to be unlikely. This can usually be quantified (if desired), via something like the amount of code [under a suitable metric] that needs to be changed from the source to produce what is in the executable, and then comparing that to other specimens.

E.g., the Rybka 1.0 Beta PST can be derived exactly from the Fruit 2.1 code with 2 added lines, 4 deleted lines, and 18 changes of constants ("tuning parameters" and/or scaling). I am unaware of a non-Fruit-based program (particularly from ~2005) for which the necessitated change-set is so small. As a specific example of this "sanity check", one can note that generating the PST for Fruit 1.0 appears to require (many) more modifications to the Fruit 2.1 code than Rybka 1.0 Beta does.

Thus, the exact "Rybka source code" [whether it exists or not] is largely irrelevant to determining whether a given Rybka executable is "substantially similar" to Fruit 2.1 and/or whether Rybka is "original". One doesn't elude "copying" via changing/re-typing variable names, moving code blocks around, changing iteration into recursion, tweaking constants, etc., any more than one could say the same concerning a literary work that changed some phrasing, added/subtracted jargon words, and permuted paragraphs, with a few extra plot twists to boot. The opinion of VR in this matter seems similar (in a case where he states that there were "extensive" changes). I strongly suspect the Strelka source code looks "quite different" from the corresponding Rybka source code, yet indeed a substantially similar result was obtained.

Rebel · Post by **Rebel** » Fri Oct 07, 2011 11:55 am

Wylie, I understand the process of RE and your review was fine. Let's discuss the current 2 examples further:

Example-ONE

Code: Select all

0x401b86: add    $0x79,%esi                        if so, add 121 in opening
0x401b89: test   %rdx,%rax                          and if file is same as bK
0x401b8c: je     0x401baf
0x401b8e: add    $0x355,%esi                         add 853 more in opening

Since Fruit's pawn value is 100 and Rybka's is 3200, it is already hard enough to identify score semantics.

Code: Select all

Fruit vs Rybka
static const int RookSemiKingFileOpening = 10;         // 121
static const int RookKingFileOpening = 20;             // 853 (974-121)

For a fair comparison (100 vs 3200) that would mean:

Code: Select all

Fruit vs Rybka
static const int RookSemiKingFileOpening = 10;         // Rybka = 3   (12100/3200 = 3)   (a factor of 3 less)
static const int RookKingFileOpening = 20;             // Rybka = 26  (85300/3200 = 26)  (looks similar) (OK)

A factor of 3 less does not look like tuning Fruit or deliberate obfuscation.
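
The rescaling used in the comparison above can be sketched in C. This is only an illustration of the arithmetic in the post: the function name `rybka_to_fruit_scale` is made up here, and the only inputs taken from the thread are the two pawn values (100 and 3200) and the raw constants read from the disassembly.

```c
#include <assert.h>

/* Pawn values quoted in the thread: Fruit uses 100, Rybka uses 3200. */
enum { FRUIT_PAWN = 100, RYBKA_PAWN = 3200 };

/* Rescale a raw Rybka evaluation constant to Fruit's pawn = 100 scale.
   Integer division truncates, which matches the rounded figures in the
   post. The function name is hypothetical. */
int rybka_to_fruit_scale(int rybka_value)
{
    assert(rybka_value >= 0);
    return rybka_value * FRUIT_PAWN / RYBKA_PAWN;
}
```

With this, `rybka_to_fruit_scale(121)` gives 3 and `rybka_to_fruit_scale(853)` gives 26, the two figures set against Fruit's 10 and 20.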

Example-TWO

Code: Select all

Fruit:
static const int RookSemiOpenFileOpening = 10;         // Rybka=2   (6400/3200)  (Rybka factor 5 less)
static const int RookSemiOpenFileEndgame = 10;         // Rybka=8   (25600/3200) (similar) (OK)
static const int RookOpenFileOpening = 20;             // Rybka=32  (103500/3200)(similar) (OK)
static const int RookOpenFileEndgame = 20;             // Rybka=13  (42800/3200) (similar) (OK)

1. A factor of 5 less does not look like tuning Fruit or deliberate obfuscation.

2. What I see is Fruit (10/10) (20/20) middlegame / endgame -> equal scores VERSUS Rybka (2/8) (32/13), making an obvious difference between the middlegame and the endgame phase, as it should, BTW. Remember Vas is an IM, he knows.

3. I see 2 different programs doing some BASIC Rook evaluation stuff every good chess program has, or should have if you want to seriously compete.

4. On top of that, RE can't prove the existence of the 2 FRUIT variables RookOpenFileEndgame and RookSemiOpenFileEndgame in RYBKA, because of the compiler's capability to merge 2 or more constants.
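
Point 4 can be illustrated with a small sketch. Everything in it is hypothetical: the names `BONUS_A` and `BONUS_B` and the values 10 and 20 are placeholders, not code from either engine. One function applies two named constants one after the other, the other applies a single constant that already holds their sum. An optimizing compiler is free to fold the first form into the second, so the executable alone need not show how many named constants the source had.

```c
#include <assert.h>

/* Hypothetical tuning constants; names and values are placeholders. */
enum { BONUS_A = 10, BONUS_B = 20 };

/* The merged form below is only valid if the sum really is 30. */
static_assert(BONUS_A + BONUS_B == 30, "merged constant must equal the sum");

/* Source form 1: two named constants applied one after the other. */
int score_two_constants(int score)
{
    score += BONUS_A;
    score += BONUS_B;
    return score;
}

/* Source form 2: a single constant that already holds the sum. */
int score_merged_constant(int score)
{
    return score + 30;
}
```

Both functions return the same value for any input that does not overflow, which is why the two source forms can end up indistinguishable after compilation.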

Back to example-ONE...

Code: Select all

0x401b86: add    $0x79,%esi                        if so, add 121 in opening
0x401b89: test   %rdx,%rax                          and if file is same as bK
0x401b8c: je     0x401baf
0x401b8e: add    $0x355,%esi                         add 853 more in opening

Basically, what you have for semantics is "add -> test -> je -> add". Why should that mean you are dealing with ROOK evaluation, when that same pattern exists dozens and dozens of times in my own engine?

I would say that one must rely on a bigger semantic scope (the code before, the code after), meaning the order of evaluation becomes an issue?
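
For readers who do not read x86-64 assembly, the "add -> test -> je -> add" pattern discussed above corresponds to a nested conditional in C. The sketch below is a hypothetical reconstruction, not code from Rybka or Fruit. The values 0x79 (121) and 0x355 (853) come from the quoted disassembly. The parameter `outer_condition` stands in for whatever test precedes the first add, which the four quoted lines do not show. The names `file_mask` and `black_king` simply follow the annotation "if file is same as bK" and are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Immediates read from the quoted disassembly. */
enum { FIRST_ADD = 0x79, SECOND_ADD = 0x355 };

/* The thread gives the combined value as 974 (974 - 121 = 853). */
static_assert(FIRST_ADD + SECOND_ADD == 974, "values must match the thread");

/* Hypothetical reconstruction of the four quoted instructions.
   outer_condition: stand-in for the unseen test before the first add.
   file_mask, black_king: assumed meaning of the two tested registers. */
int opening_bonus(int outer_condition, uint64_t file_mask, uint64_t black_king)
{
    int opening = 0;
    if (outer_condition) {
        opening += FIRST_ADD;            /* add  $0x79,%esi            */
        if (file_mask & black_king)      /* test %rdx,%rax ; je skips  */
            opening += SECOND_ADD;       /* add  $0x355,%esi           */
    }
    return opening;
}
```

The four instructions fix only this shape and the two immediates. What the shape means for chess has to come from the surrounding code, which is the point under dispute.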

hyatt · Post by **hyatt** » Fri Oct 07, 2011 2:57 pm

The NUMBERS are irrelevant, if the thing being evaluated is done in the same way. Does changing a constant in this code:

if (!(file_mask[File(square)] & Pawns(enemy)) &&
    mask_pawn_isolated[square] & Pawns(side) &&
    !(pawn_attacks[side][square] & Pawns(enemy))) {
  attackers = 1;
  defenders = 0;
  for (sq = square;
      sq != File(square) + ((side) ? RANK7 << 3 : RANK2 << 3);
      sq += direction[side]) {
    if (SetMask(sq + direction[side]) & tree->all_pawns)
      break;
    defenders = PopCnt(pawn_attacks[enemy][sq] & p_moves[side]);
    attackers = PopCnt(pawn_attacks[side][sq] & Pawns(enemy));
    if (attackers)
      break;
  }
  if (attackers <= defenders) {
    if (!(mask_passed[side][sq + direction[side]] & Pawns(enemy))) {
      score_mg += passed_pawn_candidate[mg][side][rank];
      score_eg += passed_pawn_candidate[eg][side][rank];
    }
  }
}

Just altering the array "passed_pawn_candidate[eg/mg][color][rank]" values does not release you from copyright infringement if you copy that code. To believe it does shows that you do not understand the concept of literal vs non-literal copying. To state this KNOWING that non-literal copying is not allowed says something else entirely about you...

But here's the data for the above, just for fun:

int passed_pawn_candidate[2][2][8] = {
{{ 0, 0, 36, 16, 8, 4, 4, 0 }, /* [mg][black][rank] */
{ 0, 4, 4, 8, 16, 36, 0, 0 }}, /* [mg][white][rank] */
{{ 0, 0, 48, 32, 16, 9, 9, 0 }, /* [eg][black][rank] */
{ 0, 9, 9, 16, 32, 48, 0, 0 }} /* [eg][white][rank] */
};

32 numbers. Change none, one, or all, it doesn't matter. If you copy the executable code, you just violated copyright law AND ICGA rule 2.

And it REALLY is that simple. You are acting like a freshman. "aha, I can copy this program, change a couple of numbers, and that is no longer plagiarism and I won't get into trouble." No, you will get kicked out of the university for violating its academic misconduct policy.

If you want to debate the ICGA report, feel free to do so. But on a level that makes sense, not some kindergarten nonsense that doesn't hold water ANYWHERE...

For your last comment, nobody did that. As one example, do you REALLY think it would matter if a PAWN sits on a file in front of friendly pawns at g3 and attacks the black king's position at g8? I don't think so. Knight? Nope. Bishop? Nope. You are hung up on not knowing how to see the constant masks that are used. Once you UNDERSTAND what a piece of code is doing, it doesn't require a genius to figure out "this has to be rook scoring, what other piece would care if there is a pawn in front of it or not to allow a vertical attack on the king?" The queen? Possibly, until you look at the REST of the scoring to determine that there is nothing about diagonals. No diagonal mobility. And then after looking around, you find the queen scoring, which is done somewhere else. You REALLY think someone can't take a de-commented evaluation in C and understand it with some effort? If one can do that, one can do the same thing with asm, with MORE effort, if one is competent in asm and understands a bit about compiler optimization.

Also, show me ANY example in the report where you find a block of code that says:

0x401b86: add $0x79,%esi if so, add 121 in opening
0x401b89: test %rdx,%rax and if file is same as bK
0x401b8c: je 0x401baf
0x401b8e: add $0x355,%esi add 853 more in opening

and claims that it is rook scoring. There is a LOT more context to go with that code. 4 lines by themselves don't say much at all. But you left out the qualifying code prior to the above, where it was testing for friendly/enemy pawns on the same file... And the other piece scoring stuff that shows that other piece scoring was found elsewhere in the binary, as expected, leaving this stuff to be rooks only...

This is REALLY grasping at straws, when you need to be grasping for something much bigger, to even put a small dent in the report.

wgarvin · Post by **wgarvin** » Fri Oct 07, 2011 5:44 pm

In addition, I assume that Zach worked out which variables are which in the position structure (by examining various pieces of code that used them--not just ). If you know it's checking a rook bitboard (and not, say, a pawn bitboard) then obviously it's a rook feature and not a pawn feature.

Also, no one was suggesting that an engine containing just this one feature had necessarily copied it from Fruit. The problem is that there are dozens of features in Rybka 1.0 Beta's evaluation that either directly match the semantics of Fruit, or are very similar to the Fruit 2.1 version of that feature.

The similarity is much higher than between any other pair of X and Fruit, which is very strong evidence that one is derived from the other.

hyatt · Post by **hyatt** » Fri Oct 07, 2011 5:52 pm

wgarvin wrote:In addition, I assume that Zach worked out which variables are which in the position structure (by examining various pieces of code that used them--not just ). If you know it's checking a rook bitboard (and not, say, a pawn bitboard) then obviously it's a rook feature and not a pawn feature.

Also, no one was suggesting that an engine containing just this one feature had necessarily copied it from Fruit. The problem is that there are dozens of features in Rybka 1.0 Beta's evaluation that either directly match the semantics of Fruit, or are very similar to the Fruit 2.1 version of that feature.

The similarity is much higher than between any other pair of X and Fruit, which is very strong evidence that one is derived from the other.

Don't forget the key. This "feature" is not in the report. What Ed gave is a piece of a bigger piece of code that was analyzed in the report. I've already told him how to see the values. Run the thing under the debugger and let it get past the initialization, then just display the variables in question. He's not done bitboard code and probably doesn't want to learn how. But to those that have, the masks are quite obvious as to what they represent. It is not a giant step to then figure out how they are being used and why. I found it quite easy to find the places in memory where I found something like 00ff000000000000 and 000000000000ff00 before a move has been played. Pretty obvious those are the 2nd rank and 7th rank. Possibly pawns. Then play a pawn move and you can be sure. Ditto for 8100000000000000 and 0000000000000081, which are most likely rooks, but you can move one and then peek again to be sure. It is NOT that hard. To some of us, anyway...
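
The mask reading described above can be sketched in a few lines of C. This is a minimal illustration that assumes the common layout where bit 0 is a1 and bit 63 is h8. The thread does not state which layout Rybka uses, and the helper names `square_bit`, `rank_mask` and `move_piece` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bit 0 = a1, bit 7 = h1, bit 63 = h8.
   file and rank are zero-based (file 0 = a, rank 0 = first rank). */
uint64_t square_bit(int file, int rank)
{
    assert(file >= 0 && file < 8 && rank >= 0 && rank < 8);
    return (uint64_t)1 << (rank * 8 + file);
}

/* All eight squares of one rank. */
uint64_t rank_mask(int rank)
{
    assert(rank >= 0 && rank < 8);
    return (uint64_t)0xFF << (rank * 8);
}

/* Apply a quiet move to a piece bitboard: clear `from`, set `to`. */
uint64_t move_piece(uint64_t pieces, uint64_t from, uint64_t to)
{
    return (pieces & ~from) | to;
}
```

Under that layout, `rank_mask(1)` is 000000000000ff00 and `rank_mask(6)` is 00ff000000000000, while the two corner squares of the first and last ranks give 0000000000000081 and 8100000000000000. Moving the e2 pawn to e4 with `move_piece` turns the first mask into 000000001000ef00, which is the kind of change one would look for in the debugger after playing a move.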

I still have a hard time determining whether he really doesn't understand this stuff, or whether he is just practicing deception. Either way...

Rebel · Post by **Rebel** » Fri Oct 07, 2011 8:18 pm

wgarvin wrote:In addition, ...

Wylie, does this mean you won't address my points?

Just checking.

Also, no one was suggesting that an engine containing just this one feature had necessarily copied it from Fruit. The problem is that there are dozens of features in Rybka 1.0 Beta's evaluation that either directly match the semantics of Fruit, or are very similar to the Fruit 2.1 version of that feature.

While Bob has made it a major issue to defame me as a retarded chess programmer, trust me, Fruit's EVAL is nothing special, it's just good. Rebel has everything Fruit has and more AND already in the 80's. From the early accusations back in 2006 till now only the 0.0 still stands as strong evidence of COPYING. I always felt EVAL (even during the period I held the VIG view) to be the weakest part of the allegations. What you find in Fruit and Rybka are the basic ingredients of normal, general and publicly available knowledge. There is no copyright on knowledge.

Either you get your knowledge from chess books (which I was limited to in the 80's) or you get your knowledge from freely downloadable sources on the internet. There is no difference. Agree ?

Something happened in the 90's -------------> INTERNET.

Chess sources all over.

http://www.top-5000.nl/sources.htm

Modern programmers get their info from the internet, they don't have to reinvent the wheel as we (Bob, Chris, me etc.) were forced to. Modern programmers read chess sources as chess players read an instructive Max Euwe book. And both practice that knowledge, the chess player at the club, the chess programmer in his engine.

Agree ?

And there are just so many unburden evidences that point in a different direction. Did I mention the lack of history reductions in Rybka already?

hyatt · Post by **hyatt** » Fri Oct 07, 2011 10:06 pm

You STILL continue to ignore the semantic expression of an idea. You want to insist that different programmers will program the same basic idea in the same way. I have posted multiple examples showing this is completely false. You might find, in an eval that has DOZENS of terms, a COUPLE of terms that are done similarly. You won't find 1/2. Or 3/4. That's the key. What is "normal-appearing" to you is not normal to most of us. Otherwise there would be zero detection of software plagiarism possible...

wgarvin · Post by **wgarvin** » Fri Oct 07, 2011 11:12 pm

Ed, I will reply to your post in detail when I get home tonight and have a real computer to edit on. Typing on this iPhone is fine but scrolling an edit box is tragically bad, making it really hard to quote big posts and reply point for point. When I did it yesterday, I composed my reply on a work computer and e-mailed it to the phone, but then I had to correct every newline by hand with this lousy-scrolling edit box, sigh!

Rebel · Post by **Rebel** » Fri Oct 07, 2011 11:21 pm

Get that

Thanks.

orgfert · Post by **orgfert** » Fri Oct 07, 2011 11:28 pm

Rebel wrote:While Bob has made it a major issue to defame me as a retarded chess programmer, trust me, Fruit's EVAL is nothing special, it's just good. Rebel has everything Fruit has and more AND already in the 80's. From the early accusations back in 2006 till now only the 0.0 still stands as strong evidence of COPYING. I always felt EVAL (even during the period I held the VIG view) to be the weakest part of the allegations. What you find in Fruit and Rybka are the basic ingredients of normal, general and publicly available knowledge. There is no copyright on knowledge.

How knowledge is expressed is the issue. It's a little tiresome watching you ignore this point, talking instead as if it's only about ideas.

Rebel wrote:And there are just so many unburden evidences that point in a different direction. Did I mention the lack of history reductions in Rybka already?

Only if you ignore the salient issue. I suppose that's why you keep ignoring it.
