It is not valid in C++ to store into shadow_matrix1[16] with shadow_matrix1[16 * j]
(for j > 0). Even though there's a valid space in a struct after shadow_matrix1.
Knowing that GCC performs aggressive optimizations that eventually lead
to a wrong code. Code has been changed into union where one can either
use shadow_matrix[4 * 16], or individual shadow_matrix1, shadow_matrix2, etc. GCC pragma
is not needed any longer.
(cherry picked from commit d9eb6a5b20)
Godot supports many different compilers and for production releases we
have to support 3 currently: GCC8, Clang6, and MSVC2017. These compilers
all do slightly different things with -ffast-math and it is causing
issues now. See #24841, #24540, #10758, #10070. And probably other
complaints about physics differences between release and release_debug
builds.
I've done some performance comparisons on Linux x86_64. All tests are
ran 20 times.
Bunnymark: (higher is better)
(bunnies) min max stdev average
fast-math 7332 7597 71 7432
this pr 7379 7779 108 7621 (102%)
FPBench (gdscript port http://fpbench.org/) (lower is better)
(ms)
fast-math 15441 16127 192 15764
this pr 15671 16855 326 16001 (99%)
Float_add (adding floats in a tight loop) (lower is better)
(sec)
fast-math 5.49 5.78 0.07 5.65
this pr 5.65 5.90 0.06 5.76 (98%)
Float_div (dividing floats in a tight loop) (lower is better)
(sec)
fast-math 11.70 12.36 0.18 11.99
this pr 11.92 12.32 0.12 12.12 (99%)
Float_mul (multiplying floats in a tight loop) (lower is better)
(sec)
fast-math 11.72 12.17 0.12 11.93
this pr 12.01 12.62 0.17 12.26 (97%)
I have also looked at FPS numbers for tps-demo, 3d platformer, 2d
platformer, and sponza and could not find any measurable difference.
I believe that given the issues and oft-reported (physics) glitches on
release builds I believe that the couple of percent of tight-loop
floating point performance regression is well worth it.
This fixes#24540 and fixes#24841
(cherry picked from commit e5b335d367)
Seemingly a typo, I did not check what exact impact it had, but
the x_ofs would likely have accumulated errors when using fonts
with varying char widths.
(cherry picked from commit 7cb5e005ee)
WebGL does not support MapBufferRange or UnmapBuffer.
Also used in non-ES platforms where an extra-copy is avoided.
(cherry picked from commit 92e7c8daf0)
Fix for incorrect types used in MeshDataTool for bones and weights.
If your mesh contains these memory accesses get OOB and might crash
the application
Closes#21713
(cherry picked from commit e50d56b4c6)
Made Debugger's Video Memory tab show correct resource paths.
The Icons are still missing but that is due to the get_icon(type, "EditorIcons") for type = "Texture" being missing. Adding that icon would fix it.
(cherry picked from commit 8f89e2b490)
The two POSIX style crash handlers (OSX and X11) now remove their signal
handlers when they are destroyed.
Additonally if they are called while no OS singleton is set, they will
simply abort(). This should not happen now that they remove themselves,
but if a future change seperates OS object and crash handler lifetimes,
this may be easier to report/debug than hanging on SIGSEGV.
(cherry picked from commit 653b832422)
Previous fix in e8e06b2 worked in most cases but not if you run e.g.
'godot -', where the '-' argument would mean that 'project_manager'
is false and yet that's what will be opened eventually.
(cherry picked from commits e8e06b2c9a
and c0df3b147e)
On X11 when we send an XResizeWindow request to the X server it is happy
to say it is done when the request has been handed over to the window
manager. The window manager itself may however take some time to
actually do the resize. Godot expects that a resize request is
immediate. To work around this issue we could implement the whole
_NET_WM_SYNC_REQUEST protocol. However this protocol does not fit very
well with the way we currently process X events and would when
implemented in the current framework still cause a 1 frame delay between
a resize request and the actual resize happening.
This fixes#21720
(cherry picked from commit 9a1deedb84)