Page 1 of 2

Idea to virtually zero hash collisions

Posted: Tue Mar 15, 2016 8:14 am
by Peterpan
hi

In my struggle to correct and perfect my hash function for my perft,i have stumbled upon a discovery.Not sure if this is novel.
With only a 32 bit hashkey i was able to get this with only 1 hash collision,i know there should be many more with this amount of nodes.

The Starting HashKey = 12888282270356907191
Number of nodes depth (1) = 20
Number of nodes depth (2) = 400
Number of nodes depth (3) = 8902
Number of nodes depth (4) = 197281
Number of nodes depth (5) = 4865609
Number of nodes depth (6) = 119060324
Number of nodes depth (7) = 3195901860
Number of nodes depth (8) = 84998978956
Number of nodes depth (9) = 2439530234167
Number of nodes depth (10) = 69352859712417
Number of the hashnodes = 66691435746827
Number of Hash Collisions = 1

I used a function to check the position to see if the position is similar with the same hashkey to detect the collision.There was only 1 in this case.

The way i did it was as follows.The hashkey is dynamic,it changes,for example if i make a move and unmake,the hashkey changes,if i make the same move again,it's different.
For perft there is no problem with the hashnodes,the nodes are correct.And the amount of nodes detected in hash is even more than with my regular approach.The speed is also as fast if not faster.

I used a temp hashkey in my movegenerator.

In my movelist structure i have for example these 2...

struc movelist
{
.movelist rq 246
.hashkey rq 246
..
..

and i don't calculate the hashkeys in make/unmake,but in my move generator.I use a temp hashkey

if UseHash eq 1
mov rax,[my.hashkey]
mov [my.tempkey],rax
end if

and then later...

if UseHash eq 1
mov rcx,[my.tempkey]
mov [make.hashkey + r10 + r14],rcx
end if

mov qword [make.movelist + r10 + r14],rsi

and then in my make/unmake function i use it as follows... in make...

mov rax,[make.hashkey + rbx + r14]
mov [my.hashkey],rax

and in unmake...

mov [my.hashkey],0

For perft this works perfect,correct nodes,virtually no hash collisions,optimal hash detection,good speed.

Any comments how this could work in my chess engine,with not having the same hashkey everytime for the same position,is this usable in a real life chess engine,or just good for my perft function?
In each hashkey i would save the move anyway,so it should not be a problem ?!

Best Regards
Izak

Re: Idea to virtually zero hash collisions

Posted: Tue Mar 15, 2016 6:18 pm
by H.G.Muller
I don't get it. If the hash key is different for the same position, how can you find the information you stored for that position?

Re: Idea to virtually zero hash collisions

Posted: Tue Mar 15, 2016 6:26 pm
by hyatt
More important would be to report the number of "hash hits" which would likely appear to be "1" also. :)

Re: Idea to virtually zero hash collisions

Posted: Tue Mar 15, 2016 6:30 pm
by Peterpan
I do not get it either.It works,i don't understand how though... okay from initial start position,i made the hashkey 0 in this example.
So it starts with 0...
Then i move Knight to A3 and hashkey changes HashKey = 15549287946380739836
Then i unmake move and hashkey = 0 again,okay fine... but.
Lets say i move Knight to A3,and then i move black pawn to A5,the hashkey changes HashKey = 5594462511625213003
Okay still fine... but now i unmake black's last move,
and now the hashkey changes back to 0!
I move the black pawn once again to a5,and now the hashkey is different HashKey = 11126614585222664375

I could have started with the initial position and hashkey 12888282270356907191 as in my previous example,something similar happens,but i happen to have this example to post.

I do not understand how this works,and how there are so many hash hits.But this was the only way i could get correct nodes,with hash enabled,if i try to save the hashkey in make,and restore in unmake,hashnodes stays correct with make/unmake,but then in some positions perft gives incorrect nodes! but this way as in above example,there is virtually NO hash collisions and also correct hash nodes in all tested positions,even tested kiwipete last night up to depth 8,took more than 6 hours,and confirmed nodes with Stockfish,which took around 10 or more hours (of course stockfish does not use hash in perft)

So maybe you can help me solve this puzzle!,i am even willing to send you my source code,but it's assembler,and might be hard to read... but i do not wish to share it in public.

Many Thanks.

Best Regards
Izak

Re: Idea to virtually zero hash collisions

Posted: Tue Mar 15, 2016 8:21 pm
by H.G.Muller
You say you have no collisions. But it is more interesting to know whether you have any hits. If you never have any hits (i.e. hash keys the same) there is no possibility to have any collisions (same key but different position).

My first thought is that the situation where you do restore the key properly so that hashing actualy does something, some of the perfts are wrong because of collissions. These could be just the statistically expected number of collisions, or far more due to dependency in your Zobrist key set, or not properly accounting for e.p. rights or promotion piece in the key, etc. When you spoil the hash key so badly that youwon't ever get any hits, so that what you store in the hash key will never be used... Well, that would be a very good way of hiding how buggy the hashing really is.

Re: Idea to virtually zero hash collisions

Posted: Tue Mar 15, 2016 8:29 pm
by Peterpan
The number of hashhits are reported with hashnodes.Look at first post
And without hash it takes more than 2 times slower.With hash on,mine is almost 2 times faster than stockfish.
So i have no doubt that i have hash hits.

Like i said,i can send you my source... you can see for yourself.

Re: Idea to virtually zero hash collisions

Posted: Tue Mar 15, 2016 8:31 pm
by Peterpan
e.p and castling rights are working 100% although yes it took quite an effort to get them 100%.And promotion hashing as well.
I tested several positions with hash on,initial up to depth 10,correct nodes,good speed,kiwipete up to depth 8,correct nodes,good speed,promotion positions up to high depths,positions with ep and only king and pawns,some with rooks,some up to depth 12,all correct,speed correct,amount of hashnodes good,node count correct.The only problem is that i do not get any collisions,with depth 10 initial i got 1! Second problem is the one i described above.My question remains,can i use this hash function in my chess engine,in my hash i will store the move,etc.. so this can be usable,i mean if perft can use it and by some reason detect the correct nodes and have hash hits,it could be sufficient as a hash function in my engine? repitition detection might be a problem ? unless i see what move is in the hashkey's structure?

I guess time will tell...

Re: Idea to virtually zero hash collisions

Posted: Tue Mar 15, 2016 8:50 pm
by H.G.Muller
Peterpan wrote:The number of hashhits are reported with hashnodes.Look at first post
Well, I did not know what 'hashnodes' meant. But I don't think it can be the number of hits. There are almost as many 'hashnodes' as the perft(10) you calculate. How is that possible? What exactly do you hash? In perft it is pointless to hash the end leaves, as in the leaves you always calculate perft(0) = 1. Usually you don't even visit these 'nodes'. You just count the number of legal moves in the parent node. Ignoring transpositions, there would be perft(9) of those. I.e. about 25 times fewer than your 'hashnodes'.

And even if you would probe in the leaves, there are about 25 times as many 'hashnodes' amongst those as non-hashnodes (66e12 vs 69e12 - 66e12 = 3e12). So each node you store would then have to be hit 25 times afterwards. That is ridiculous. There won't be that many transpositions.

And the whole point of hashing is that it makes the tree smaller because you find the perft of entire sub-trees in the hash,so that you actually don't have to traverse them. So an perft tree with 69e12 nodes when you do not hash would have perhaps only 1e12 nodes actually visited. When you visit only 1e12nodes, how can there be 66e12 hash hits??? How many times do you probe the same hash entry in one node? A million times?

Something is very rotten here...

Re: Idea to virtually zero hash collisions

Posted: Tue Mar 15, 2016 8:54 pm
by Peterpan
yes correct,it's not the hits.the hashnodes is multiplied by the number of moves per turn.

I had a function to report the hits as well,but it is not present in this example.Although i know the more the hashnodes,the faster my engine runs...
and the less actual nodes my engine has to search,cause he skips them! when a known hashkey is detected,by the number of moves for that turn,or something like that.

No my move generation,for example sees the current move has 25 nodes for example....
Then when a hashkey is detected,it skips the 25 nodes,and those are the nodes that hashnodes report,the hit for that actual one is 1 not 25.
I am well aware of this!

Re: Idea to virtually zero hash collisions

Posted: Tue Mar 15, 2016 9:03 pm
by Peterpan
I can easily modify my code,to show the number of hits,it would be 25 times less +- as you say... depending on how many candidate legal moves the average is per move