accelerated PVS


Postby lucasart » Mon Dec 24, 2012 10:07 am

I'd like to start by pointing out that the chess programming wiki's pseudo-code for PVS is quite wrong:
Code:
int pvSearch( int alpha, int beta, int depth ) {
   if( depth == 0 ) return quiesce( alpha, beta );
   bool bSearchPv = true;
   for ( all moves)  {
      make
      if ( bSearchPv ) {
         score = -pvSearch(-beta, -alpha, depth - 1);
      } else {
         score = -pvSearch(-alpha-1, -alpha, depth - 1);
         if ( score > alpha ) // in fail-soft ... && score < beta ) is common
            score = -pvSearch(-beta, -alpha, depth - 1); // re-search
      }
      unmake
      if( score >= beta )
         return beta;   // fail-hard beta-cutoff
      if( score > alpha ) {
         alpha = score; // alpha acts like max in MiniMax
         bSearchPv = false;  // *1)
      }
   }
   return alpha; // fail-hard
}

This bSearchPv flag is rather clumsy; the PV-ness of a node should instead be passed down recursively through the search. For example, when beta = alpha+1 in this code, a move whose zero-window search fails high gets re-searched with exactly the same parameters.

Here's the code of my engine, as an example of a correct implementation in a real program. This is the part inside the move loop (the rest is obvious). In a real program you need to pass the depth down recursively (otherwise you segfault on a stack overflow), as well as a search stack (not necessary, but typical). You also need to cater for search reductions and for the re-search mechanism. Anyway, here's the code; I hope it is self-documenting:
Code:
// Play the move
B.play(ss->m);

// PVS
int score;
if (is_pv && first)
   // search full window
   score = -search(B, -beta, -alpha, new_depth, is_pv, ss+1);
else
{
   // zero window search (reduced)
   score = -search(B, -alpha-1, -alpha, new_depth - ss->reduction, false, ss+1);

   // doesn't fail low: re-search full window (reduced)
   if (is_pv && score > alpha)
      score = -search(B, -beta, -alpha, new_depth - ss->reduction, is_pv, ss+1);

   // fails high: verify the beta cutoff with a non reduced search
   if (score > alpha && ss->reduction)
      score = -search(B, -beta, -alpha, new_depth, is_pv, ss+1);
}

// Undo the move
B.undo();


Notice that:
- if we are in a PV node and we are searching move number 2, 3, 4, ..., we expect these moves to fail low, so we do a zero-window search at reduced depth first.
- at PV nodes, if such a move does not fail low, we do not restore the depth right away: we widen the window to (alpha, beta) first, and restore the depth afterwards (if the move still doesn't fail low). In theory this is redundant, since if the zero-window reduced search fails high, the full-window reduced search should fail high too. But in reality modern searches are plagued with inconsistencies, and you will often fail high on nonPVSearch(alpha, alpha+1) yet fail low on PVSearch(alpha, beta). So we resolve these potential instabilities first, before extending the search. This is what Fruit does, for example.

Now a more traditional implementation of PVS, which typically doesn't care about search instabilities:
Code:
if (is_pv && first)
   // search full window
   score = -search(B, -beta, -alpha, new_depth, is_pv, ss+1);
else
{
   // zero window search (reduced)
   score = -search(B, -alpha-1, -alpha, new_depth - ss->reduction, false, ss+1);

   // fails high: verify the beta cutoff with a non reduced search
   if (score > alpha && ss->reduction)
      score = -search(B, -beta, -alpha, new_depth, is_pv, ss+1);

   // doesn't fail low: re-search full window (reduced)
   if (is_pv && score > alpha)
      score = -search(B, -beta, -alpha, new_depth, is_pv, ss+1);
}


I wonder if other engine developers have experimented with both approaches. For example, Fruit uses the first version, and Glaurung/Stockfish use the second. I have experimented with both in DiscoCheck, and the first one (the Fruit one) was very measurably superior.

Experimental results, pros and cons, or other ideas, welcome.
"Talk is cheap. Show me the code." -- Linus Torvalds.
lucasart
Posts: 158
Joined: Mon Dec 17, 2012 1:09 pm

Re: accelerated PVS

Postby Rein Halbersma » Wed Dec 26, 2012 11:18 pm

Your analysis reminds me of the difference between PVS with aspiration windows at the root (let's abbreviate this as asp-PVS) and MTD(f). There's an old post on CCC where Fabien Letouzey explains this much better than I could, but here's my own summary.

Both algorithms start a new iteration with the null hypothesis that the score and best move from the previous iteration continue to be valid. The complication is that this hypothesis is the joint probability of two assumptions: score and best move *both* remain valid. This means that when a search of the best move with a small (e.g. null) aspiration window fails low, there can be two causes: the best move is no longer the best move, or the score is no longer the real score. This is where asp-PVS differs from MTD(f):
1) asp-PVS is move-based: after a fail-low on the first move, it continues to believe that the best move is still the best move, but with a lower score. It widens and/or lowers the aspiration window to resolve the exact score for the best move, and only then it starts searching other moves.
2) MTD(f) is score-based: after a fail-low on the first move, it continues to believe that the score is still the right score, but with another move. It starts searching other moves with the same null-windows to find the new best move, and only if they all fail low at the current null-window it starts to search with a lowered null-window.

BTW, I don't think there is any a priori reason why asp-PVS should be better or worse than MTD(f) (not counting the fact that with asp-PVS you actually get a PV!); eventually both algorithms arrive at the same score and best move. It's an empirical question which type of adjustment at the root corresponds to reality most often.

In your analysis, there is a similar phenomenon. When searching moves 2...N along a PV, your null hypothesis is that such moves would all fail low on a full-window && full-depth search. For efficiency reasons, you test with both a null-window and a reduced-depth. However, by doing so, you are using a joint probability of two assumptions and introducing opportunities for both false positives and false negatives. For the reduced-depth && null-window search, both Fruit and Stockfish accept all the false positives (i.e. if the move fails low, you go to the next move). Only potentially false negatives (i.e. fail highs) are investigated further. IIUC, this is where you state that they differ:
1) Fruit-like engines believe that the null-window was too optimistic and that some moves were left out by value-based pruning/reduction schemes (i.e. search instabilities introduced by e.g. null moves, futility). They do a full-window search at reduced-depth to hopefully get the move to fail low. If it does, this is again accepted (even though it could be a false positive!). If it still fails high, these engines still want to rule out a false negative, and they do the full-window && full-depth search and accept whatever answer comes out.
2) Stockfish-like engines believe that the reduced-depth was too optimistic for the side to move and missing some deep opponent tactics/threats. They do a null-window search at full-depth to hopefully get the move to fail low. If it does, this is again accepted (even though it could be a false positive!). If it still fails high, these engines still want to rule out a false negative and they do the full-window && full-depth search and accept whatever answer comes out.

Again, I don't think that there is any a priori reason why one scheme would be better than the other. How much ELO-difference did it make for you?
Rein Halbersma
 
Posts: 5
Joined: Mon Jun 14, 2010 9:40 am

Re: accelerated PVS

Postby lucasart » Thu Dec 27, 2012 5:08 am

Rein Halbersma wrote:[...] Again, I don't think that there is any a priori reason why one scheme would be better than the other. How much ELO-difference did it make for you?

Good summary. So basically there's no clear reason why one scheme is better or worse than the other. It's a question of experimentation: it's possible that the Fruit scheme works better than the Stockfish scheme for me, but perhaps introducing it in Stockfish would be a regression (although I'm not convinced of that; I think SF does it because Glaurung did it, and the Glaurung null-threat fail-low trick prevented the Fruit-style PVS from being used in SF).

I can't remember the exact results, but I think it was something like 55% over a 2000-game experiment. So very significant (LOS = 100.0%). However, the experiment was perhaps carried out at an excessively fast time control (I think it was 2"+0.02"). I'll try to redo it (with DiscoCheck) and see how it scales with time (2000-game comparisons at 1"+0.01", 2"+0.02", 4"+0.04", 8"+0.08", ...)

Re: accelerated PVS

Postby Rein Halbersma » Thu Dec 27, 2012 2:13 pm

What would be interesting is to collect a whole bunch of statistics in the following way. Make a class CollectStats with an array[depth][move number][search type] of chars, and then for every PV node's moves 2...N run up to 3 out of all 4 types of searches:
(1) null-window && reduced-depth
(2) null-window && full-depth
(3) full-window && reduced-depth
(4) full-window && full-depth

After each search, you increment the number of fail highs/fail lows at the current depth and move number. Perhaps you'll find some patterns that determine under which circumstances the Fruit-like scheme (1) -> (3) -> (4) is better than the Stockfish-like scheme (1) -> (2) -> (4). Of course, you need to be careful designing such a test, because the order of searches will mess with the hash table. Probably it's best to have 2 hash tables that both get updated during searches (1) and (4), and also during whichever of the two possible schemes is being tested.

Re: accelerated PVS

Postby lucasart » Fri Dec 28, 2012 1:20 am

lucasart wrote:I can't remember the exact results, but I think it was something like 55% on a 2000 games experiment. So very significant (LOS=100.0%). [...]

False alarm. Sorry!

I think I had made a mistake when implementing the "Glaurung-style PVS": have a look at the 3rd code snippet above. Can you find the mistake? ;)

I redid the test in more proper conditions (8000 games at 10"+0.1"), this time with the correct code for the Glaurung-style PVS, and the result was clear. The normal textbook implementation (depth first, window second) works best for DiscoCheck:
https://github.com/lucasart/chess/commit/903732a90f208f4f1c731f92554b96894e1cf21f

But still, depending on the engine, it may be that the Fruit-style PVS works best. It depends on how unstable your search is, and on how differently it treats PV and non-PV nodes. Something to experiment with.

For DiscoCheck, this conclusion could be the result of two things:
- my search is not too unstable, and doesn't draw too many distinctions between PV and non-PV nodes
- my search reductions are somewhat more aggressive than Fruit's, so re-searching at full depth first matters more than trying to resolve search instabilities at reduced depth first.

Re: accelerated PVS

Postby Rein Halbersma » Fri Dec 28, 2012 10:13 am

Yes, I had noticed the typo, but I thought it was in the pseudo-code only, not in your real code!

Re: accelerated PVS

Postby lucasart » Sun Dec 30, 2012 4:35 am

Rein Halbersma wrote:yes I had noticed the typo but I thought it was in the pseudo-code only, not in your real code!

My "production code" (Fruit-style) was correct, but my "test code" (Glaurung-style) was wrong. So that completely screwed up my test results and conclusion.

It was a copy/paste accident. Note to self: if you drink, don't code :lol:

Re: accelerated PVS

Postby hyatt » Wed Jan 02, 2013 11:05 pm

lucasart wrote:
Rein Halbersma wrote:[...] Again, I don't think that there is any a priori reason why one scheme would be better than the other. How much ELO-difference did it make for you?

Good summary. So basically, there's no clear reason why one scheme is best/worse than the other. [...] I can't remember the exact results, but I think it was something like 55% on a 2000 games experiment. So very significant (LOS=100.0%). [...]


mtd(f) has one difficult issue: re-searches. It only takes a very few of them to exceed the node count of an aspiration-window PVS search. If the eval is simple (material-only is really good in mtd(f), for example) then mtd seems to work. I spent several months, years ago, trying mtd(f) in Crafty. I still have the source sitting around somewhere, but the problem was always convergence: 3-4 re-searches and you are slower than PVS...
hyatt
 
Posts: 1009
Joined: Thu Jun 10, 2010 2:13 am
Location: University of Alabama at Birmingham

