This is needed because the final startup values for shapes may change between parenting and entering the scene tree, for instance when the collision shape belongs to an inherited scene.
Fixes #13835.
Using `misc/scripts/fix_headers.py` on all Godot files.
Some missing header guards were added, and the header inclusion order
was fixed in the Bullet module.
That change was born out of confusion regarding the meaning of "local" in #14569.
Affine transformations in Spatial simply correspond to affine operations of its Transform. Such operations take place in a coordinate system that is defined by the parent Spatial. When there is no parent, they correspond to operations in the global coordinate system.
This coordinate system, which is relative to the parent, has been referred to as the local coordinate system in the docs so far, but this sloppy language has apparently confused some users, making them think that the local coordinate system refers to the one whose axes are "painted" on the Spatial node itself.
To avoid such conceptual conflations and misunderstandings in the future, the parent-relative local system is now referred to as "parent-local", and the object-relative local system is called "object-local" in the docs.
This commit adds the functionality "requested" in #14569, not by changing how rotate/scale/translate work, but by adding new rotate_object_local, scale_object_local and translate_object_local functions. Also, for completeness, there is now global_scale.
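The difference between the two coordinate systems can be sketched with a tiny 2D stand-in. This is not Godot's Spatial or Transform code: `Xform2` and every function below are hypothetical names made up for illustration. The only point is that an operation in the parent-local system composes on the left of the node's transform, while an operation in the object-local system composes on the right.

```cpp
#include <cmath>

// Hypothetical minimal 2D affine transform: x' = basis * x + origin.
// A stand-in for illustration only, not Godot's Transform.
struct Xform2 {
    double m00 = 1.0, m01 = 0.0; // basis, first row
    double m10 = 0.0, m11 = 1.0; // basis, second row
    double ox = 0.0, oy = 0.0;   // origin, in parent coordinates
};

// Returns p * q, i.e. the transform that applies q first and then p.
Xform2 compose(const Xform2 &p, const Xform2 &q) {
    Xform2 r;
    r.m00 = p.m00 * q.m00 + p.m01 * q.m10;
    r.m01 = p.m00 * q.m01 + p.m01 * q.m11;
    r.m10 = p.m10 * q.m00 + p.m11 * q.m10;
    r.m11 = p.m10 * q.m01 + p.m11 * q.m11;
    r.ox = p.m00 * q.ox + p.m01 * q.oy + p.ox;
    r.oy = p.m10 * q.ox + p.m11 * q.oy + p.oy;
    return r;
}

Xform2 translation(double x, double y) {
    Xform2 t;
    t.ox = x;
    t.oy = y;
    return t;
}

Xform2 rotation(double angle) {
    Xform2 t;
    const double cs = std::cos(angle);
    const double sn = std::sin(angle);
    t.m00 = cs;
    t.m01 = -sn;
    t.m10 = sn;
    t.m11 = cs;
    return t;
}

// Parent-local: the offset is expressed along the parent's axes,
// so the translation is composed on the left.
Xform2 translate_in_parent_frame(const Xform2 &node, double x, double y) {
    return compose(translation(x, y), node);
}

// Object-local: the offset is expressed along the node's own (rotated)
// axes, so the translation is composed on the right.
Xform2 translate_in_object_frame(const Xform2 &node, double x, double y) {
    return compose(node, translation(x, y));
}
```

For a node rotated by 90 degrees, moving one unit along X in the parent-local system shifts the origin along the parent's X axis, while the same offset in the object-local system shifts it along the parent's Y axis, because that is where the node's own X axis now points.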
This commit also updates another part of the docs regarding the rotation property of Spatial, which also leads to confusion among some users.
The previous commit corrected the RNG behavior for the lightbaker but
also made it significantly slower on high core count systems. Because
the per-thread states sit physically close together in RAM, every call
for the next random number forces a cache synchronization across all
cores (false sharing).
This commit creates a temporary local copy of the RNG state before
entering the loop and saves it back to the global state when done. This
preserves the per-thread RNG state (and random number quality) while
significantly improving performance.
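A minimal sketch of the local-copy pattern, assuming one 32-bit state per thread and an xorshift32 step as a stand-in generator. The function names are hypothetical and this is not the lightmapper's actual code; it only shows where the shared vector is read and written.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One xorshift32 step (Marsaglia's 13/17/5 shifts). The state must be nonzero.
inline std::uint32_t xorshift32(std::uint32_t &state) {
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}

// Hypothetical worker body: sums `count` random numbers for one thread.
// `states` holds one RNG state per thread, adjacent in memory.
std::uint64_t sum_samples(std::vector<std::uint32_t> &states,
                          std::size_t thread_index, std::size_t count) {
    // Copy this thread's state into a local before the hot loop.
    std::uint32_t local = states[thread_index];
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < count; ++i) {
        // Only the local is modified here, so the loop never writes to
        // the cache line shared with the other threads' states.
        sum += xorshift32(local);
    }
    // Save the advanced state back once, after the loop.
    states[thread_index] = local;
    return sum;
}
```

Each worker would call `sum_samples` with its own thread index. The sequence a thread sees is unchanged; only the number of writes to the shared vector drops from one per sample to one per loop.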
On my 16 thread box it saves 3 minutes baking the Sponza scene, bringing
performance back in line with where it was before the various RNG fixes
were introduced, and slightly ahead of the first implementation.
In our previous attempts to fix the lightmapper we may have
inadvertently introduced the same issue we were trying to fix. It
appears that rand() will on some platforms introduce a mutex making it
slower and on others may have a per-thread state that would need to be
initialized with srand() on each thread. This slows down the lightbaking
further.
This sets up a separate rng state for each OpenMP thread by calling
rand() only in the single-threaded part of the code. We then keep a
vector of states.
I believe this solves our problems.
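Roughly, the idea looks like the sketch below. The names, the 32-bit state type and the xorshift32 step are assumptions made for illustration, not the code from this commit. What matters is that rand() is only called while a single thread is running, and that each thread afterwards touches only its own slot.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Build one RNG state per worker thread. Call this from single-threaded
// code only, so rand() is never used concurrently.
std::vector<std::uint32_t> make_thread_states(std::size_t thread_count,
                                              unsigned seed) {
    std::srand(seed);
    std::vector<std::uint32_t> states(thread_count);
    for (std::uint32_t &s : states) {
        // "| 1u" keeps the state nonzero, which xorshift-style
        // generators require.
        s = static_cast<std::uint32_t>(std::rand()) | 1u;
    }
    return states;
}

// Advance only the calling thread's state (xorshift32 as a stand-in
// generator) and return the new value.
std::uint32_t next_random(std::vector<std::uint32_t> &states,
                          std::size_t thread_index) {
    std::uint32_t x = states[thread_index];
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    states[thread_index] = x;
    return x;
}
```

Inside the parallel region each thread would pass its own index (for OpenMP, the value of `omp_get_thread_num()`), so no locking is needed around the generator.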
Due to memory constraints in other places in Godot it is unlikely that
anything higher than 1024 will actually work. When/if we improve memory
management for vectors we can increase this limit again.
Based on the prediction misses reported by perf, these seem to be the
lowest-hanging fruit for quick (albeit small) improvements. These are
based on:
* baking a complex lightmap
* running platformer 3d
* running goltorus
On higher threadcount systems this allows for better utilization. On my
16 thread box CPU use goes from 10 - 11 threads to a steady 15 threads
on the Sponza scene.
Baking time goes from ~10:00 to ~07:30 for me. On lower threadcount
systems I expect some improvement also but likely a little less.
This speeds up the lightmapper by about 10% with no visible impact. A
comparison is up here:
https://tmm.cx/nextcloud/s/Log1eAXen1dJzBz
AMD Ryzen 7 1700 Eight-Core Processor, Sponza scene

                  pcg32      xorshift
256/256/high      00:10:13   00:09:32
256/256/medium    00:02:50   00:02:34
256/256/low       00:01:11   00:01:05