perft results off by 1 in 700 million; reasons?

ethoma · Post by **ethoma** » Wed Jan 02, 2013 2:42 am

I am working through some bugs in my chess engine. I am testing my engine's perft function on "r3k2r/Pppp1ppp/1b3nbN/nP6/BBP1P3/q4N2/Pp1P2PP/R2Q1RK1 w kq - 0 1".

The correct results (verified by me with stockfish) can be found here: http://chessprogramming.wikispaces.com/Perft+Results. It is under "Position 4".

perft 1-5 works just fine. perft 6 plies yields a result of 706045032 nodes, whereas it should yield 706045033 nodes.

I do not have any data for the capture/castle/ep counts, other than that they 'should' be working.

I know that you don't have my code, but given that it is off 1 in 700 million, can you speculate (with reason) on some special circumstance that only occurs once that is not functioning correctly in my engine?

Note: the issue only appears in the 6th ply, so it must never occur in plies 1-5.

Thanks. This is just a really odd problem, so I am looking for some suggestions so I can keep thinking and debugging.

P.S.: Is Stockfish's perft multithreaded? It is sooooooo fast.

lucasart · Post by **lucasart** » Wed Jan 02, 2013 1:02 pm

ethoma wrote:I am working through some bugs in my chess engine. I am testing my engine's perft function on "r3k2r/Pppp1ppp/1b3nbN/nP6/BBP1P3/q4N2/Pp1P2PP/R2Q1RK1 w kq - 0 1".

The correct results (verified by me with stockfish) can be found here: http://chessprogramming.wikispaces.com/Perft+Results. It is under "Position 4".

perft 1-5 works just fine. perft 6 plies yields a result of 706045032 nodes, whereas it should yield 706045033 nodes.

I do not have any data for the capture/castle/ep counts, other than that they 'should' be working.

I know that you don't have my code, but given that it is off 1 in 700 million, can you speculate (with reason) on some special circumstance that only occurs once that is not functioning correctly in my engine?

Note: the issue only appears in the 6th ply, so it must never occur in plies 1-5.

Thanks. This is just a really odd problem, so I am looking for some suggestions so I can keep thinking and debugging.

P.S.: Is Stockfish's perft multithreaded? It is sooooooo fast.

Just debug recursively. Print the perft(n-1) after each legal move. If perft(n) = sum(perft(n-1) afterr move#i, for all i) is wrong, then necessarly one of the perft(n-1) after a move is wrong. You now need to debug the problem at depth n-1, and so on. After at most n-1 iterations, you have solved the problem.

You may need to hack Stockfish a little to make it display the perft(n-1) details of a perft(n). Otherwise you can use my engine DiscoCheck 3.7.1, as follows (command line)

Code: Select all

$ ./discocheck_3.7.1 
> position fen r3k2r/Pppp1ppp/1b3nbN/nP6/BBP1P3/q4N2/Pp1P2PP/R2Q1RK1 w kq - 0 1
> perft 6
h1	149335005
Nd4	129579089
Rf2	123078367
Bc5	87986826
d4	124002076
c5	92063670
time(ms) = 10250
perft(6) = 706045033
> quit

For example, if after d4, your perft(5) is wrong, repeat the operation at depth 5, with the FEN that you get after playing d4, and so on.

ethoma · Post by **ethoma** » Wed Jan 02, 2013 8:05 pm

Thanks for the helpful debugging tip. I modified stockfish to print out the perft(MAXDEPTH-1) results, and went through and found the exact move path that generated the error.

It turns out my "attacks" function was malfunctioning (the function that determines if a square is attacked). I had misgenerated my pawn attacking squares list, and so the king was seen as in check by a pawn that was on the other side of the board. Luckily that's cleared up now. It works great!

EDIT: I am still wondering why Stockfish has about a factor of 2 or 3 times faster on their perft function than crafty or my engine. Also, the way stockfish's perft scales with node size (much faster on big searches than small searches) makes me think it is multithreaded, but the threading is hiding somewhere in their crazy code.

lucasart · Post by **lucasart** » Thu Jan 03, 2013 2:48 am

ethoma wrote:EDIT: I am still wondering why Stockfish has about a factor of 2 or 3 times faster on their perft function than crafty or my engine. Also, the way stockfish's perft scales with node size (much faster on big searches than small searches) makes me think it is multithreaded, but the threading is hiding somewhere in their crazy code.

There is no cheat in Stockfish's perft, as you can see:
- single threaded
- no perft hash table

Code: Select all

size_t Search::perft(Position& pos, Depth depth) {

  // At the last ply just return the number of legal moves (leaf nodes)
  if (depth == ONE_PLY)
      return MoveList<LEGAL>(pos).size();

  StateInfo st;
  size_t cnt = 0;
  CheckInfo ci(pos);

  for (MoveList<LEGAL> ml(pos); !ml.end(); ++ml)
  {
      pos.do_move(ml.move(), st, ci, pos.move_gives_check(ml.move(), ci));
      cnt += perft(pos, depth - ONE_PLY);
      pos.undo_move(ml.move());
  }

  return cnt;
}

zullil · Post by **zullil** » Thu Jan 03, 2013 11:35 am

ethoma wrote: EDIT: I am still wondering why Stockfish has about a factor of 2 or 3 times faster on their perft function than crafty or my engine. Also, the way stockfish's perft scales with node size (much faster on big searches than small searches) makes me think it is multithreaded, but the threading is hiding somewhere in their crazy code.

For a very fast perft, get qperft from http://home.hccnet.nl/h.g.muller/dwnldpage.html

Here's a run using a single core on a Core 2 Duo running at 2 GHz:

Code: Select all

 - - - - - - - - - - - -
 - - - - - - - - - - - -
 - - r n b q k b n r - -
 - - p p p p p p p p - -
 - - . . . . . . . . - -
 - - . . . . . . . . - -
 - - . . . . . . . . - -
 - - . . . . . . . . - -
 - - P P P P P P P P - -
 - - R N B Q K B N R - -
 - - - - - - - - - - - -
 - - - - - - - - - - - -

Quick Perft by H.G. Muller
Perft mode: No hashing, bulk counting in horizon nodes

perft( 1)=           20 ( 0.000 sec)
perft( 2)=          400 ( 0.000 sec)
perft( 3)=         8902 ( 0.000 sec)
perft( 4)=       197281 ( 0.003 sec)
perft( 5)=      4865609 ( 0.058 sec)
perft( 6)=    119060324 ( 1.438 sec)
perft( 7)=   3195901860 (38.021 sec)

Richard Allbert · Post by **Richard Allbert** » Sat Jan 05, 2013 3:24 pm

I feel your frustration!

I've been spending weeks (technically months now as it took me ages to understand) implementing magic bit board move generation, and getting the Perft results correct was frustrating.

I use the winboard engine Sharper for verification - it has a divide command straight from the console, and is very fast.

For your problem, there is really no alternative other than to use the node count at each move, and follow the bug that way. Eg if at depth 7 move 16 is out by 1, then do a depth 6 Perft from the position after the 16th move is made, and so on.

In my experience, however, when your are 1 node out in several billion, it will not just be a one off 'make move' bug, rather a combination of successive make and take functions causing an obscure problem.

Good luck!

lucasart · Post by **lucasart** » Sun Jan 06, 2013 4:57 am

Richard Allbert wrote: In my experience, however, when your are 1 node out in several billion, it will not just be a one off 'make move' bug, rather a combination of successive make and take functions causing an obscure problem.

Good luck!

Oh yes, when you start having path dependant bugs, you are screwed !

But, with method, you can fix them too. Here's what I do in DiscoCheck, to make sure it never happens:
- first of all, i never manipulate the board internal structure directly. i always use elementary functions like

Code: Select all

Board::clear_square(color, piece, square)
Board::set_square(color, piece, square)

- these functions assert() the correctness of every parameter, as well as verify occupancy problems. For example clear_square() will assert-squeel if square is empty, and set_square() will assert-squeel is square is already occupied.
- I also validate the zobrist key, by putting it on the stack (so no need to recalc on undo). And in debug mode, I recalculate it from scratch after every Board::play() and Board::undo(), and compare it to the one on the stack.

If you stick to these very defensive programming techniques (assert everything, recalc from scratch in debug mode anything that is computed dynamically to double check it, like zobrist, but also PST sums, material etc.). You should not have path dependant bugs.

Also, you should use a systematic unit test before commit. Patches are of two kinds:
- functional changes: you need to play games, and obtain a significant improvement, accounting for error bar. Otherwise, do not commit.
- non functional changes: just have a function that calculated a few positions at fixed depth, and count the nodes. a non functional commit (like code style massage, cleanup, removing useless stuff etc.) should give you the same node count. If it doesn't, you have either introduced a bug, or fixed one. You need to investigage the node count difference, until you manage to understand its cause. Otherwise no commit.

This may seem painful, but in the long run, it is far less painful than accumulating bugs, without knowing of their existance. You would be amazed at how an engine that plays ok-ish, can actually be full of bugs. The biggest source of elo gain for amateur engines, is often fixing bugs.

PS: And use git ! It has saved my life so many times, and forces me to be organized, rather than just modify code, and never remember where I came from, what I just did, and how to undo it etc.

Richard Allbert · Post by **Richard Allbert** » Thu Jan 10, 2013 12:12 am

Hi Lucas

Yes, you're completely right - this is where I really lack discipline. I have been using SVN, that has helped.

The main problem, as I'm sure you know the feeling, is that you sit down to program one change/idea, intending to check and double check everything with Asserts. Then, you should test it. But then, you think... hmm might as well add x,y,z also. and so on. Before you know it, 3 hours later in the small hours of the morning, you have "quick test features" spread everywhere from ideas that suddenly came to mind, nothing is testable / checked and chaos ensues!

This is what happens with me, anyway.

I'm just not disciplined with it all

pdigre · Post by **pdigre** » Sun Jul 06, 2014 1:04 pm

This FEN is excellent for finding bugs, "kiwipete" by Peter McKenzie.
I had a similar situation with the exact same FEN. I used "divide" and conquer and traced it down. To my amusement I found I were allowing castling with the enemy rook

OpenChess

OpenChess

perft results off by 1 in 700 million; reasons?

perft results off by 1 in 700 million; reasons?

Re: perft results off by 1 in 700 million; reasons?

Re: perft results off by 1 in 700 million; reasons?

Re: perft results off by 1 in 700 million; reasons?

Re: perft results off by 1 in 700 million; reasons?

Re: perft results off by 1 in 700 million; reasons?

Re: perft results off by 1 in 700 million; reasons?

Re: perft results off by 1 in 700 million; reasons?

Re: perft results off by 1 in 700 million; reasons?