Combining hashing with parallel search

rgoomes · Post by **rgoomes** » Sat Sep 23, 2017 11:15 pm

Just gained the motivation to do my first implementation of parallel search but first I tried to do parallel perft with hashing in order to test some ideas and the correctness of some code. In this test I have stumbled with one problem. Very rarely, when multiple threads try to store and lookup the hashtable at the same time somehow the hash entries became corrupted which sometimes gave wrong perft results (for this I made a test in which not only I stored the key in the hashtable but also the full chess position struct and very rarely I hit an assertion where the real key of the position didn't match the key stored in the hash entry).

I still cannot understand why this happens because when I lookup the hashtable I make a copy of the hash entry at index key % size to a local variable so as when I test if the keys match the other threads cannot change this local variable. Also, when storing I simply assign a full struct to the hash table like so: Hashtable[index] = {key, depth, ... }, which should be atomic right?

Nonetheless, I can think of at least two solutions. I can use a mutex so that no two threads can execute the store and lookup procedures simultaneously or each thread can have its own hashtable. In terms of performance the latter gave the better results, however it is still far slower than plain store and lookup.

I have looked into Stockfish code and other engines and none seem to use any of this solutions in their hashtable code, just plain store and lookup. How?

Any advices?

H.G.Muller · Post by **H.G.Muller** » Thu Sep 28, 2017 5:27 pm

Reading / storing of structs is not atomic, unless they happen to fit in a 64-bit machine word.

When hashing, you should be prepared to handle the situation where you sometimes get faulty information. Even with a 64-bit key this occasionally happens through key collisions. Often enough to make the engine unusable for long time controls if it would crash on the wrong info.

The point is that an engine can tolarate upto 0.1% of completely spoiled hash info when it is crash proof, without measurably affecting its move choice. This is why engines tend to ignore the problem. Stockfish only stores 16 bits signature in the hash entry, and thus gets one collision with arbitrarily wrong info once per 10,000 probes, or so. The few extra collisions it might get by corruption through simultaneous SMP writing completely drowns in this noise of key collisions.

For perft things are of course different, and a single probe error enywhere in the tree spoils the entire result. But top engines are not designed to be reliable perft calculators. Steven Edwards used a 128-bit key in his perft calculator.

Nevertheless, if you want to address the problem, Bob Hyatt's lockless hashing is a cheap solution: if the hash entry fits in two atomic words, you XOR the word that contains the key with the 'other' word, before storing it. After reading the entry on a probe, you do it again. If the other word wasn't the one that belonged to the key word because of SMP corruption, the two XOR operations do not cancel each other, and the key will be changed, and thus not recognized as a hash hit.

rgoomes · Post by **rgoomes** » Fri Sep 29, 2017 12:40 am

H.G.Muller wrote:Reading / storing of structs is not atomic, unless they happen to fit in a 64-bit machine word.

When hashing, you should be prepared to handle the situation where you sometimes get faulty information. Even with a 64-bit key this occasionally happens through key collisions. Often enough to make the engine unusable for long time controls if it would crash on the wrong info.

The point is that an engine can tolarate upto 0.1% of completely spoiled hash info when it is crash proof, without measurably affecting its move choice. This is why engines tend to ignore the problem. Stockfish only stores 16 bits signature in the hash entry, and thus gets one collision with arbitrarily wrong info once per 10,000 probes, or so. The few extra collisions it might get by corruption through simultaneous SMP writing completely drowns in this noise of key collisions.

For perft things are of course different, and a single probe error enywhere in the tree spoils the entire result. But top engines are not designed to be reliable perft calculators. Steven Edwards used a 128-bit key in his perft calculator.

Nevertheless, if you want to address the problem, Bob Hyatt's lockless hashing is a cheap solution: if the hash entry fits in two atomic words, you XOR the word that contains the key with the 'other' word, before storing it. After reading the entry on a probe, you do it again. If the other word wasn't the one that belonged to the key word because of SMP corruption, the two XOR operations do not cancel each other, and the key will be changed, and thus not recognized as a hash hit.

Yes, struct assignment is not atomic. I knew this, just forgot it. This explains the corrupted hash entries. Well, yes, for perft a single probe error will give wrong results, but perft only serves for testing purposes. As for the search I guess I'll just ignore this corruptions. Like you said, a very low percentage of key collisions and corruptions is tolerable.

Thank you for the insight

OpenChess

OpenChess

Combining hashing with parallel search

Combining hashing with parallel search

Re: Combining hashing with parallel search

Re: Combining hashing with parallel search