GCC -Ofast

Posted: Thu Mar 31, 2011 2:05 pm
by kingliveson
According to released document:

-Ofast
Disregard strict standards compliance. ‘-Ofast’ enables all ‘-O3’ optimizations.
It also enables optimizations that are not valid for all standard compliant
programs. It turns on ‘-ffast-math’.

-ffast-math
Sets ‘-fno-math-errno’, ‘-funsafe-math-optimizations’, ‘-ffinite-math-only’,
‘-fno-rounding-math’, ‘-fno-signaling-nans’ and ‘-fcx-limited-range’.
This option causes the preprocessor macro __FAST_MATH__ to be defined.
Is there an advantage over -O3 for chess engines and what's wrong with the following combination?

-Ofast -fstrict-aliasing -fomit-frame-pointer -fno-exceptions -ffp-contract=fast \
-fkeep-static-consts -fmodulo-sched -fmodulo-sched-allow-regmoves -fgcse-las \
-fgcse-after-reload -fdce -fdse -fira-loop-pressure -fschedule-insns2 \
-fsched-pressure -fsched-spec-load -fsched2-use-superblocks \
-fipa-pta -fbranch-probabilities -fwhole-program -flto \
-fprofile-generate

Re: GCC -Ofast

Posted: Thu Mar 31, 2011 9:51 pm
by hyatt
It will vary by program, and by gcc version, and by processor type. Try the options and compare the NPS speeds. Choose the options that lead to the highest NPS and you can't go wrong...
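
A minimal sketch of that comparison, assuming each candidate build has already been run a few times on the same position to the same depth, with node count and wall time recorded. The build labels and figures are made-up placeholders, not measurements, and `nps` and `best_build` are hypothetical helper names.

```python
from statistics import median

def nps(nodes, seconds):
    """Nodes per second for one benchmark run."""
    return nodes / seconds

def best_build(runs):
    """Return (label, median NPS) for the build with the highest median NPS.

    runs maps a build label to a list of (nodes, seconds) tuples.
    """
    scores = {label: median(nps(n, s) for n, s in samples)
              for label, samples in runs.items()}
    label = max(scores, key=scores.get)
    return label, scores[label]

# Placeholder data only: same position, same depth, three runs per build.
runs = {
    "build-a": [(12_000_000, 10.0), (12_000_000, 10.2), (12_000_000, 9.9)],
    "build-b": [(12_000_000, 10.1), (12_000_000, 10.0), (12_000_000, 10.3)],
}

label, score = best_build(runs)
print(f"{label}: {score:.0f} nodes/s")
```

Taking the median of several runs is one way to damp timing noise; a single run per build is rarely enough to separate flags that differ by a percent or two.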

Re: GCC -Ofast

Posted: Sat Apr 02, 2011 4:08 am
by orgfert
hyatt wrote:It will vary by program, and by gcc version, and by processor type. Try the options and compare the NPS speeds. Choose the options that lead to the highest NPS and you can't go wrong...
Shouldn't we go by best time-to-ply?

Re: GCC -Ofast

Posted: Sat Apr 02, 2011 1:03 pm
by Bo Persson
kingliveson wrote:

-ffast-math
Sets ‘-fno-math-errno’, ‘-funsafe-math-optimizations’, ‘-ffinite-math-only’,
‘-fno-rounding-math’, ‘-fno-signaling-nans’ and ‘-fcx-limited-range’.
This option causes the preprocessor macro __FAST_MATH__ to be defined.
Those seem to be floating point options. Probably not useful for a chess program.

Re: GCC -Ofast

Posted: Sat Apr 02, 2011 1:29 pm
by kingliveson
hyatt wrote:It will vary by program, and by gcc version, and by processor type. Try the options and compare the NPS speeds. Choose the options that lead to the highest NPS and you can't go wrong...
I might get a chance to do some benchmarks later on today in combination with the other flags to see.
orgfert wrote:
hyatt wrote:It will vary by program, and by gcc version, and by processor type. Try the options and compare the NPS speeds. Choose the options that lead to the highest NPS and you can't go wrong...
Shouldn't we go by best time-to-ply?
Wouldn't best NPS for the program give better time-to-ply?
Bo Persson wrote:
kingliveson wrote:

-ffast-math
Sets ‘-fno-math-errno’, ‘-funsafe-math-optimizations’, ‘-ffinite-math-only’,
‘-fno-rounding-math’, ‘-fno-signaling-nans’ and ‘-fcx-limited-range’.
This option causes the preprocessor macro __FAST_MATH__ to be defined.
Those seem to be floating point options. Probably not useful for a chess program.
Aha, this is what I was looking for specifically regarding '-Ofast'.

Re: GCC -Ofast

Posted: Sun Apr 03, 2011 5:40 am
by hyatt
orgfert wrote:
hyatt wrote:It will vary by program, and by gcc version, and by processor type. Try the options and compare the NPS speeds. Choose the options that lead to the highest NPS and you can't go wrong...
Shouldn't we go by best time-to-ply?
That will be exactly the same measurement. Since you are not changing the search tree shape at all (assuming the program has no bugs where the optimizer can change the tree in odd ways when you have uninitialized data), time-to-ply and NPS are going to be exactly inversely proportional...

To reduce the time-to-ply by 10% you have to increase the NPS by about 11% (a factor of 1/0.9).
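
The relation can be checked with a little arithmetic. This is a sketch, assuming the node count needed to reach the target ply stays fixed, so that time-to-ply = nodes / NPS. The name `time_to_ply` and the figures are illustrative only.

```python
def time_to_ply(nodes, nps):
    """Seconds needed to search a fixed tree of `nodes` at `nps` nodes/second."""
    return nodes / nps

nodes = 9_000_000          # illustrative tree size, unchanged by compiler flags
base_nps = 1_000_000       # illustrative baseline speed

base_time = time_to_ply(nodes, base_nps)   # 9.0 s
target_time = 0.9 * base_time              # 10% less time
needed_nps = nodes / target_time           # speed required to hit that time

speedup = needed_nps / base_nps - 1
print(f"NPS must rise by {speedup:.1%} to cut time-to-ply by 10%")
```

Because time and NPS are inversely related, the two percentages are close for small changes but not identical: a 10% cut in time corresponds to roughly an 11% gain in NPS.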