How does crafty's cluster work?

benstoker
Posts: 110
Joined: Thu Jun 10, 2010 7:32 pm
Real Name: Ben Stoker

How does crafty's cluster work?

Post by benstoker » Mon Jul 26, 2010 9:43 pm

I did some lazy searching on TalkChess and came up empty on the following. Out of curiosity, can someone describe the cluster Dr. Hyatt runs his tests on? What's the hardware? What OS? Linux? What about the cluster software: what is it, and how does it work? Can you assign one 3 GHz processor to 16 engines, or 16 processors to one engine? Do you log in to a shell account with pre-assigned cores available?

Also, how does Dr. Hyatt run these engine-engine games? What software tool does he use to make the engines talk to each other? Surely not xboard, since he must run these tests via a CLI terminal only - or maybe not.

How much RAM is on this cluster?

If all the processors are NOT on one chip, how can the threads communicate fast enough?

p.s. I want one. Can I get one at the Apple store?

hyatt
Posts: 1242
Joined: Thu Jun 10, 2010 2:13 am
Real Name: Bob Hyatt (Robert M. Hyatt)
Location: University of Alabama at Birmingham

Re: How does crafty's cluster work?

Post by hyatt » Tue Jul 27, 2010 12:26 am

We have two different clusters. One has 128 nodes, two CPUs per node, for a total of 256 CPUs. 8GB RAM, nodes connected with gigabit Ethernet and InfiniBand.

The other cluster has 70 nodes, 8 cores (dual quad-core) per node, 560 total cores. 12GB RAM/node, same kind of interconnections.

Both clusters have huge disk storage arrays. Both run 64-bit Linux, with both gcc and the Intel C++ compiler available (I use only the Intel compiler myself).

You run things by submitting shell scripts. You can submit one script for each CPU, one for each node, or one for a group of nodes (you specify this in the script).
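As a rough illustration of the per-CPU versus per-node choice, here is a minimal Python sketch that renders plain shell scripts for a batch of matches. Everything in it is an assumption made for the example: the `./referee` command and its `.cfg` arguments are placeholders, the function names are invented, and the scheduler's own submit command and directives are left out entirely because the post does not name them.

```python
from typing import List


def split_round_robin(commands: List[str], n_jobs: int) -> List[List[str]]:
    """Deal commands out to n_jobs job scripts, one at a time, like cards."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    return [commands[i::n_jobs] for i in range(n_jobs)]


def job_script(commands: List[str], slots: int) -> str:
    """Render a shell script that runs `commands` in batches of `slots`.

    Each batch is started in the background and `wait` blocks until the
    whole batch has finished, so at most `slots` commands run at once.
    """
    if slots < 1:
        raise ValueError("slots must be at least 1")
    lines = ["#!/bin/sh"]
    for start in range(0, len(commands), slots):
        for command in commands[start:start + slots]:
            lines.append(f"{command} &")
        lines.append("wait")
    return "\n".join(lines) + "\n"


# Hypothetical workload: 16 matches; "./referee" is a placeholder command.
matches = [f"./referee match{i:02d}.cfg" for i in range(16)]

# One script per node: 2 nodes with 8 cores each, so 8 matches run side by side.
per_node_scripts = [job_script(c, slots=8) for c in split_round_robin(matches, 2)]

# One script per CPU: 16 scripts, each running a single match.
per_cpu_scripts = [job_script(c, slots=1) for c in split_round_robin(matches, 16)]
```

Batching with `wait` is the simplest way to cap concurrency in a plain shell script; a real setup would more likely let the scheduler do the placement.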

I use a referee program I wrote that does essentially what xboard does in match mode, except that there is no GUI to show the game as it is played out. The referee can play two programs against each other, and can be told which positions out of an EPD file to use as the starting positions, and how many games per position to play (usually 2). It can also be told the time control to use. It plays the games, records the PGN, and then BayesElo eats all the PGN and gives a clear result.
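To make the scheduling part concrete, here is a small Python sketch of how a list of games could be built from an EPD file, a chosen range of positions, and a games-per-position count. This is not the referee's code, which is unreleased: the names `Game`, `epd_position` and `build_schedule` are invented for the example, and swapping colors on every other game from the same position is an assumption about what "2 games per position" means.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Game:
    position: str  # the four EPD position fields
    white: str
    black: str


def epd_position(line: str) -> str:
    """Return the board, side-to-move, castling and en-passant fields of an EPD record."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError(f"not an EPD record: {line!r}")
    return " ".join(fields[:4])


def build_schedule(
    epd_lines: Iterable[str],
    first: int,
    count: int,
    games_per_position: int,
    engine_a: str,
    engine_b: str,
) -> List[Game]:
    """List the games for `count` positions starting at index `first`.

    Assumption: colors are swapped on every other game from the same
    position, so with 2 games per position each engine plays both sides.
    """
    positions = [epd_position(line) for line in epd_lines if line.strip()]
    games = []
    for position in positions[first:first + count]:
        for game_number in range(games_per_position):
            if game_number % 2 == 0:
                games.append(Game(position, engine_a, engine_b))
            else:
                games.append(Game(position, engine_b, engine_a))
    return games


# Two sample EPD records (opcodes after the fourth field are ignored here).
sample_epd = [
    'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - id "start";',
    'rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 id "after e4";',
]
schedule = build_schedule(sample_epd, first=0, count=2, games_per_position=2,
                          engine_a="engine-A", engine_b="engine-B")
```

Playing each game, writing the PGN, and rating the results are separate steps that this sketch leaves out.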

I have an automated script that will play many versions, and even play the same version several times varying one parameter for each different match, used for tuning. The tests typically do not involve parallel search at all, although I have run some parallel search test matches on the 8 core per node cluster. But no distributed search...
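The "vary one parameter per match" idea can be sketched in a few lines of Python. The parameter names below (`param_a`, `param_b`) are placeholders and not real Crafty options, and `one_parameter_sweep` is an invented helper; how each configuration is turned into an engine command line is deliberately left out.

```python
from typing import Any, Dict, List, Sequence


def one_parameter_sweep(
    base: Dict[str, Any], name: str, values: Sequence[Any]
) -> List[Dict[str, Any]]:
    """Return one configuration per value, changing only `name` and copying the rest."""
    if name not in base:
        raise KeyError(f"unknown parameter: {name}")
    return [{**base, name: value} for value in values]


# Hypothetical baseline settings for the version under test.
base_settings = {"param_a": 10, "param_b": 2}

# Three matches against the same opponents, differing only in param_b.
runs = one_parameter_sweep(base_settings, "param_b", [1, 2, 3])

# A label per match keeps the PGN files and rating-list entries apart.
labels = ["-".join(f"{key}={value}" for key, value in sorted(run.items())) for run in runs]
```

Holding everything else fixed means any rating difference between the matches can be attributed to the one parameter that changed, within the error bars of the rating tool.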

Re: How does crafty's cluster work?

Post by benstoker » Tue Jul 27, 2010 1:17 am

Thanks. Is your referee program available to the public or open source?
Re: How does crafty's cluster work?

Post by hyatt » Tue Jul 27, 2010 6:20 pm

benstoker wrote:Thanks. Is your referee program available to the public or open source?
I have not released it as I am not looking for _another_ program to support. :) I may, at some point, since it has played many tens of millions of games so far...

Re: How does crafty's cluster work?

Post by benstoker » Thu Jul 29, 2010 4:42 pm

If you do release it, may I suggest a name for it --- "kudzu". I was in Birmingham the other day and noticed all the kudzu. It evokes the notion of multiple connections.

[Idle thought for the day]
Re: How does crafty's cluster work?

Post by hyatt » Thu Jul 29, 2010 7:47 pm

benstoker wrote:If you do release it, may I suggest a name for it --- "kudzu". I was in Birmingham the other day and noticed all the kudzu. It evokes the notion of multiple connections.

[Idle thought for the day]
We definitely have Kudzu all over the south. Stuff grows up to 6' (six feet!) per night with lots of rain and sunshine. If you pass thru, give me a call. I'm in the phone book...


BrianR
Posts: 17
Joined: Thu Jun 10, 2010 4:48 am

Re: How does crafty's cluster work?

Post by BrianR » Fri Jul 30, 2010 5:06 pm

Any update on cluster search for Crafty?

Re: How does crafty's cluster work?

Post by benstoker » Fri Jul 30, 2010 5:36 pm

I noted the nice temp also, that is, compared to the oppressive swelter of Austin. I did think about ringing in to see if I could come gawk at Chess Engine Cluster Central Station, but it was all biz - in and out and no time. Maybe next time.