When using use_static_cpp we want to statically link with atomic as well
to make sure we don't incur any new runtime dependencies.
Scons doesn't quite support this so we do this little trick.
According to the LLVM documentation when using GNU's libstdc++ clang
will not automatically link with -latomic. This is necessary since we
merged c++11 atomics support.
This fixes linking using Clang on Linux
This makes it possibly to run Linux binaries compiled with udev support on
Linux systems which do not provide udev (typically systemd-less distros).
If udev is missing, we fall back to parsing `/dev/input` like when compiled
without udev support (`udev=no`).
Also adding some verbose debug statements to know which method we're using
when debugging Linux joypad issues.
The libudev so wrappers were generated on Mageia 8 with libudev 246.9 using
https://github.com/hpvb/dynload-wrapper:
```
./generate-wrapper.py --include /usr/include/libudev.h --sys-include '<libudev.h>' \
--soname libudev.so.1 --init-name libudev --omit-prefix gnu_ \
--output-header libudev-so_wrap.h --output-implementation libudev-so_wrap.c
```
(cherry picked from commits a10c259c1d
and e26a1f807b)
Edit: Updated to version 0.2 of dynload-wrapper to fix symbols clobbering as
done in #46143.
By generating stubs using https://github.com/hpvb/dynload-wrapper we
can dynamically load libpulse and libasound on systems where it is available.
Both are still a build-time requirement but no longer a run-time dependency.
For maintenance purposes the wrappers should not need to be re-generated
unless we want to bump pulse or asound to an incompatible version. It is
unlikely we will want to do this any time soon.
cherry-pick from 09f82fa6ea
This is meant for users making custom builds to match the options used on
optimized, official builds.
This enables, on the platforms which support them:
- `use_static_cpp=yes` (portable binaries for Linux and Windows)
- `use_lto=yes` (link time optimizations - note: requires a lot of RAM!)
- `debug_symbols=no` (no debug symbols, smaller binaries)
Also abort when using MSVC with `production=yes`, as:
- It cannot optimize the GDScript VM like GCC or Clang do, leading to
significant performance drops.
- Its LTO support is unreliable, at least used to trigger crashes last
we tried it extensively.
All options can still be overridden if specified, and the `dev=yes` option
was changed to also support overrides.
(cherry picked from commit db26871210)
This has been enabled for years in official binaries, and users making custom builds
may end up not enabling it unknowingly, so it's best if we default to the same as
what official builds do.
The original reason for having it opt-in was likely the addition of a dependency on
libudev, but that should be fairly ubiquitous by now.
(cherry picked from commit e8b69fccbe)
This enables `-static-libgcc -static-libstdc++` which help make custom Linux
builds more portable (official builds have been using this option for years).
For some obscure reason Ubuntu 18.04 i386 crashes when using the option for
i386 builds, so let's play it safe and enable for x86_64 only for now.
(cherry picked from commit 1ebd66daff)
Otherwise we can get situations where platform-specific opts with the same name
can override each other depending on the order at which platforms are parsed,
as was the case with `use_static_cpp` in Linux/Windows.
Fixes#44304.
This also has the added benefit that the `scons --help` output will now only
include the options which are relevant for the selected (or detected) platform.
(cherry picked from commit 0f84d8dc49)
`debug_symbols=yes` will now behave like `debug_symbols=full` did
before. The difference in compressed file sizes is not that large,
which means there isn't much point in having two different values.
This helps make the buildsystem easier to understand.
(cherry picked from commit ff1f0d2cb5)
Batching is mostly separated into a common template which can be used with multiple backends (GLES2 and GLES3 here). Only necessary specifics are in the backend files.
Batching is extended to cover more primitives.
Its last use was removed in Godot 3.0, so it no longer makes sense to define.
Also removed `D3D_DEBUG_INFO` for Windows as it's likely a left over from a
long time ago pre-opensourcing when Godot had some form of Direct3D 9 support?
(cherry picked from commit dcf902df85)
Configured for a max line length of 120 characters.
psf/black is very opinionated and purposely doesn't leave much room for
configuration. The output is mostly OK so that should be fine for us,
but some things worth noting:
- Manually wrapped strings will be reflowed, so by using a line length
of 120 for the sake of preserving readability for our long command
calls, it also means that some manually wrapped strings are back on
the same line and should be manually merged again.
- Code generators using string concatenation extensively look awful,
since black puts each operand on a single line. We need to refactor
these generators to use more pythonic string formatting, for which
many options are available (`%`, `format` or f-strings).
- CI checks and a pre-commit hook will be added to ensure that future
buildsystem changes are well-formatted.
(cherry picked from commit cd4e46ee65)
- Improve the SCsub to allow unbundling and remove unnecessary code.
- Move files around to match upstream source.
- Re-sync with upstream commit 308db73d0b3c2d1870cd3e465eaa283692a4cf23
to ensure we don't have local modifications.
- Doesn't actually build against current version 5.0.1 due to the lack
of the new ArmaturePopulate API that Gordon authored. We'll have to
wait for a public release with that API (5.1?) to enable unbundling.
(cherry picked from commit 9d8a9ea826)
Otherwise comparisons would fail for compiler versions above 10.
Also simplified code somewhat to avoid using subprocess too much
needlessly.
(cherry picked from commits c7dc5142b5
and df7ecfc4a7)
The basic point is as in 2.1 (appending the PCK into the executable), but this implementation also patches a dedicated section in the ELF/PE executable so that it matches the appended data perfectly.
The usage of integer types is simplified in existing code; namely, using plain `int` for small quantities.
It's the recommended way to set those, and is more portable
(automatically prepends -D for GCC/Clang and /D for MSVC).
We still use CPPFLAGS for some pre-processor flags which are not
defines.
The rationale for keeping those shared by default is that they're typical
dependencies found on any Linux system, and it saves compilation time and
binary size to link their dynamically.
But since official builds default to all-builtin, and Debian/Ubuntu still
don't have libpng16 (which we now require) readily available on all their
supported releases, it's simpler to bundle all the things.
This does not change the fact that those dependencies *can* be unbundled
on Linux, it's only the default option changing.
Wrapped libpng usage in a pair of functions under PNGDriverCommon,
which convert between Godot Image and png data.
Switched to libpng 1.6 simplified API for ease of maintenance.
Implemented ImageLoaderPNG and ResourceSaverPNG in terms of
PNGDriverCommon functions.
Travis, switched to builtin libpng (thus builtin freetype and zlib also)
so we can build on Xenial.
This updates our local copy to commit 5ec8339b6fc491e3f09a34a4516e82787f053fcc.
We need a recent master commit for some new features that we use in Godot
(see #25543 and #28909).
To avoid warnings generated by Bullet headers included in our own module,
we include those headers with -isystem on GCC and Clang.
Fixes#29503.
Include paths are processed from left to right, so we use Prepend to
ensure that paths to bundled thirdparty files will have precedence over
system paths (e.g. `/usr/include` should have lowest priority).
This is the same as #23542 (Fix binaries incorrectly detected as shared
libraries on some linux distros) but for Clang. It should be fine with
Clang 4 or higher.
This adds ThinLTO support when using Clang and the LLD Linker, it's
turned off by
default.
For now only support for Linux added as ThinLTO support on other
platforms may still be buggy.
Many contributors (me included) did not fully understand what CCFLAGS,
CXXFLAGS and CPPFLAGS refer to exactly, and were thus not using them
in the way they are intended to be.
As per the SCons manual: https://www.scons.org/doc/HTML/scons-user/apa.html
- CCFLAGS: General options that are passed to the C and C++ compilers.
- CFLAGS: General options that are passed to the C compiler (C only;
not C++).
- CXXFLAGS: General options that are passed to the C++ compiler. By
default, this includes the value of $CCFLAGS, so that setting
$CCFLAGS affects both C and C++ compilation.
- CPPFLAGS: User-specified C preprocessor options. These will be
included in any command that uses the C preprocessor, including not
just compilation of C and C++ source files [...], but also [...]
Fortran [...] and [...] assembly language source file[s].
TL;DR: Compiler options go to CCFLAGS, unless they must be restricted
to either C (CFLAGS) or C++ (CXXFLAGS). Preprocessor defines go to
CPPFLAGS.
Make the sanitizer names more explicit (use_ubsan, use_asan, use_lsan).
Comment has been adjusted to include GCC as supported compiler for these
and exclude -fno-omit-frame-pointer option (should not cause any
problems).
Godot supports many different compilers and for production releases we
have to support 3 currently: GCC8, Clang6, and MSVC2017. These compilers
all do slightly different things with -ffast-math and it is causing
issues now. See #24841, #24540, #10758, #10070. And probably other
complaints about physics differences between release and release_debug
builds.
I've done some performance comparisons on Linux x86_64. All tests are
ran 20 times.
Bunnymark: (higher is better)
(bunnies) min max stdev average
fast-math 7332 7597 71 7432
this pr 7379 7779 108 7621 (102%)
FPBench (gdscript port http://fpbench.org/) (lower is better)
(ms)
fast-math 15441 16127 192 15764
this pr 15671 16855 326 16001 (99%)
Float_add (adding floats in a tight loop) (lower is better)
(sec)
fast-math 5.49 5.78 0.07 5.65
this pr 5.65 5.90 0.06 5.76 (98%)
Float_div (dividing floats in a tight loop) (lower is better)
(sec)
fast-math 11.70 12.36 0.18 11.99
this pr 11.92 12.32 0.12 12.12 (99%)
Float_mul (multiplying floats in a tight loop) (lower is better)
(sec)
fast-math 11.72 12.17 0.12 11.93
this pr 12.01 12.62 0.17 12.26 (97%)
I have also looked at FPS numbers for tps-demo, 3d platformer, 2d
platformer, and sponza and could not find any measurable difference.
I believe that given the issues and oft-reported (physics) glitches on
release builds I believe that the couple of percent of tight-loop
floating point performance regression is well worth it.
This fixes#24540 and fixes#24841