Tragic that Martin Sedlak has discontinued Cheng

Code, algorithms, languages, construction...
Post Reply
User923005
Posts: 616
Joined: Thu May 19, 2011 1:35 am

Tragic that Martin Sedlak has discontinued Cheng

Post by User923005 » Fri Jan 31, 2014 11:06 pm

If you have not fetched the latest version, highly recommended. Comes with code.
His code is very beautifully written. In addition, it is also written in a robust way (it is quite remarkable for a body of code that large to emit so few compiler warnings).

Quite surprisingly, the function with the most time is cheng4::TransTable::probe at 7.51% of inclusive time. For most high level programs, the dominant time is spent in eval().

Code: Select all

// probe hash table
// returns scInvalid if probe failed
Score TransTable::probe( Signature sig, Ply ply, Depth depth, Score alpha, Score beta, Move &mv ) const
{
    mv = mcNone;
    size_t ei = ((size_t)sig & (size-1) & ~(buckets-1));
    const TransEntry *te = entries + ei;
    TransEntry lte;
    for ( uint i=0; i<buckets; i++, te++)
    {
        lte = *te; // ! The lion's share of the time is in this assignment, duplicating a copy of the hash entry.
        lte.bhash ^= lte.u.word2;
        if ( lte.bhash == sig )
        {
            mv = lte.u.s.move;
            if ( lte.u.s.depth < depth )
                return scInvalid;
            BoundType bt = (BoundType)(lte.u.s.bound & 3);
            Score score = ScorePack::unpackHash( lte.u.s.score, ply );
            switch ( bt )
            {
            case btExact:
                return score;
            case btUpper:
                if ( score <= alpha )
                    return score;
                break;
            case btLower:
                if ( score >= beta )
                    return score;
                break;
            default:
                return scInvalid;
            }
        }
    }
    return scInvalid;
}
I tried to speed it up by not duplicating the data, using the pointer directly, and using this instead:

Code: Select all

//        lte = *te; // The lion's share of the time is in this assignment, duplicating a copy of the hash entry.
//        lte.bhash ^= lte.u.word2;
        if ( (te->bhash ^ lte->u.word2) == sig )
but it was actually slightly slower.

I notice that his hash table is 4 levels deep (deeper than many others).

syzygy
Posts: 148
Joined: Sun Oct 16, 2011 4:21 pm

Re: Tragic that Martin Sedlak has discontinued Cheng

Post by syzygy » Mon Feb 03, 2014 7:54 pm

I think many engines use TT buckets with 4 entries (4 x 16 bytes = 1 cache line).

What seems strange to me is that the probe() function does not return if the score stored in hash is not sufficient for causing a cut-off. The code breaks out of the switch() statement, but not out of the for() loop.

The explanation is not that multiple bounds may be stored, because it does immediately return in case of insufficient depth. So this seems to be a bug.

This does not (fully) explain why the code would take much longer than TT probe implementations of other engines that use buckets of 4 entries. One thing that could be checked is whether buckets are properly aligned at 64-byte boundaries and whether entries really are 16 bytes.

It could also be that the profiler isn't very accurate.

User923005
Posts: 616
Joined: Thu May 19, 2011 1:35 am

Re: Tragic that Martin Sedlak has discontinued Cheng

Post by User923005 » Mon Feb 03, 2014 8:56 pm

I guess it is fairly rare when the hash signature matches and it is deep enough and it is an upper or lower bound that cannot change alpha or beta.
It is worth looking at, so I am trying a test with this:

Code: Select all

// probe hash table
// returns scInvalid if probe failed
Score TransTable::probe( Signature sig, Ply ply, Depth depth, Score alpha, Score beta, Move &mv ) const
{
    mv = mcNone;
    size_t ei = ((size_t)sig & (size-1) & ~(buckets-1));
    const TransEntry *te = entries + ei;
    TransEntry lte;
    for ( uint i=0; i<buckets; i++, te++)
    {
        lte = *te;
        lte.bhash ^= lte.u.word2;
        if ( lte.bhash == sig )
        {
            mv = lte.u.s.move;
            if ( lte.u.s.depth < depth )
                return scInvalid;
            BoundType bt = (BoundType)(lte.u.s.bound & 3);
            Score score = ScorePack::unpackHash( lte.u.s.score, ply );
            switch ( bt )
            {
            case btExact:
                return score;
            case btUpper:
                if ( score <= alpha )
                    return score;
                break;
            case btLower:
                if ( score >= beta )
                    return score;
                break;
            default:
                return scInvalid;
            }
            return scInvalid; // We *FOUND* the hash entry.  It was upper or lower.  Did not match criteria.
        }
    }
    return scInvalid;
}

User923005
Posts: 616
Joined: Thu May 19, 2011 1:35 am

Re: Tragic that Martin Sedlak has discontinued Cheng

Post by User923005 » Mon Feb 03, 2014 9:13 pm

No real change.
7.8% of the time in hash probe.

k_mar
Posts: 2
Joined: Sat Feb 15, 2014 9:44 am
Real Name: Martin Sedlak

Re: Tragic that Martin Sedlak has discontinued Cheng

Post by k_mar » Sat Feb 15, 2014 10:12 am

User923005 wrote:Quite surprisingly, the function with the most time is cheng4::TransTable::probe at 7.51% of inclusive time. For most high level programs, the dominant time is spent in eval().
Cheng spends around 25% of time in eval. Ronald is right, there's a bug in probe, but it has zero functional impact.

Post Reply