Include paths are processed from left to right, so we use Prepend to
ensure that paths to bundled thirdparty files will have precedence over
system paths (e.g. `/usr/include` should have lowest priority).
Many contributors (me included) did not fully understand what CCFLAGS,
CXXFLAGS and CPPFLAGS refer to exactly, and were thus not using them
in the way they are intended to be.
As per the SCons manual: https://www.scons.org/doc/HTML/scons-user/apa.html
- CCFLAGS: General options that are passed to the C and C++ compilers.
- CFLAGS: General options that are passed to the C compiler (C only;
not C++).
- CXXFLAGS: General options that are passed to the C++ compiler. By
default, this includes the value of $CCFLAGS, so that setting
$CCFLAGS affects both C and C++ compilation.
- CPPFLAGS: User-specified C preprocessor options. These will be
included in any command that uses the C preprocessor, including not
just compilation of C and C++ source files [...], but also [...]
Fortran [...] and [...] assembly language source file[s].
TL;DR: Compiler options go to CCFLAGS, unless they must be restricted
to either C (CFLAGS) or C++ (CXXFLAGS). Preprocessor defines go to
CPPFLAGS.
Godot supports many different compilers and for production releases we
have to support 3 currently: GCC8, Clang6, and MSVC2017. These compilers
all do slightly different things with -ffast-math and it is causing
issues now. See #24841, #24540, #10758, #10070. And probably other
complaints about physics differences between release and release_debug
builds.
I've done some performance comparisons on Linux x86_64. All tests are
ran 20 times.
Bunnymark: (higher is better)
(bunnies) min max stdev average
fast-math 7332 7597 71 7432
this pr 7379 7779 108 7621 (102%)
FPBench (gdscript port http://fpbench.org/) (lower is better)
(ms)
fast-math 15441 16127 192 15764
this pr 15671 16855 326 16001 (99%)
Float_add (adding floats in a tight loop) (lower is better)
(sec)
fast-math 5.49 5.78 0.07 5.65
this pr 5.65 5.90 0.06 5.76 (98%)
Float_div (dividing floats in a tight loop) (lower is better)
(sec)
fast-math 11.70 12.36 0.18 11.99
this pr 11.92 12.32 0.12 12.12 (99%)
Float_mul (multiplying floats in a tight loop) (lower is better)
(sec)
fast-math 11.72 12.17 0.12 11.93
this pr 12.01 12.62 0.17 12.26 (97%)
I have also looked at FPS numbers for tps-demo, 3d platformer, 2d
platformer, and sponza and could not find any measurable difference.
I believe that given the issues and oft-reported (physics) glitches on
release builds I believe that the couple of percent of tight-loop
floating point performance regression is well worth it.
This fixes#24540 and fixes#24841
Apparently -ffast-math generates incorrect code with recent versions of
GCC and Clang. The manual page for GCC warns about this possibility.
In my tests it doesn't actually appear to be measurably slower in this
case, and this is used in a batch process so it seems safe to disable
this.
This fixes#10758 and fixes#10070