List of changes:
- Modified bvh class to handle 2D and 3D as a template
- Changes in Rect2, Vector2, Vector3 interface to uniformize template
calls
- New option in Project Settings to enable BVH for 2D Physics (enabled
by default like in 3D)
This non-obvious behavior can take a while to discover and fix,
so it's important to mention it in the class reference.
(cherry picked from commit 554742312d)
Allows users to override default API usage, in order to get best performance on different platforms.
Also changes the default legacy flags to use STREAM rather than DYNAMIC.
Added note to say that custom shaders operate on VERTEX after the transform with software skinning rather than before (as is the case with hardware skinning).
Changes default ninepatch mode to preserve compatibility, and renames default mode to 'fixed'.
Also adds an editor restart to changing ninepatch mode and software skinning, which will be more user friendly.
More work is needed to make sure that those options actually solve users' issues, so we prefer to remove the options for 3.2.4 and revisit for a future release.
The rendering/quality/2d section of project settings is becoming considerably expanded in 3.2.4, and arguably was not the correct place for settings that were not really to do with quality.
3.2.4 is the last sensible opportunity we will have to move these settings, as the only existing one likely to break compatibility in a small way is `pixel_snap`, and given that the whole snapping area is being overhauled we can draw attention to the fact it has changed in the release notes.
Class reference is also updated and slightly improved.
`pixel_snap` is renamed to `gpu_pixel_snap` in the project settings and code to help differentiate from CPU side transform snapping.
Antialiasing is not supported for batched polys. Currently due to the fallback mechanism, skinned antialiased polys will be rendered without applying animation.
This PR simply treats such polys as if antialiasing had not been selected. The class reference is updated to reflect this.
This suppresses the blocky shadow appearance, bringing the shadow rendering
much closer to GLES3. This soft filter is more demanding as it requires
more lookups, but it makes PCF13 shadows more usable.
The soft PCF filter was adapted from three.js.
This can be used in server builds for journalctl compatibility.
(cherry picked from commit 341b9cf15a)
Fixes crash when exiting with --verbose with leaked resources
(cherry picked from commit 25c4dacb88)
This adds a new project setting (`physics/common/enable_pause_aware_picking`). It's disabled by default.
When enabled, it changes the way 2D & 3D physics picking behaves in relation to pause:
- When pause is set, every collision object that is hovered or captured (3D only) is released from that condition, getting the relevant mouse-exit callback., unless its pause mode makes it immune from pause.
- During the pause. picking only considers collision objects immune from pause, sending input events and enter/exit callbacks to them as expected.
- When pause is left, nothing happens. This is a big difference with the classic behavior, which at this point would process all the input events that have been queued against the current state of the 2D/3D world (in other words, checking them against the current position of the objects instead of those at the time of the events).
- Fix Embree runtime when using MinGW (patch by @RandomShaper).
- Fix baking of lightmaps on GridMaps.
- Fix some GLSL errors.
- Fix overflow in the number of shader variants (GLES2).
Complete rewrite of spatial partitioning using a bounding volume hierarchy rather than octree.
Switchable in project settings between using octree or BVH for rendering and physics.
It can be enabled in the Project Settings
(`rendering/quality/filters/use_debanding`). It's disabled
by default as it has a small performance impact and can make
PNG screenshots much larger (due to how dithering works).
As a result, it should be enabled only when banding is noticeable enough.
Since debanding requires a HDR viewport to work, it's only supported
in the GLES3 backend.
Another bug in the octree has been discovered which can cause flickering in rare circumstances : #42895
For safety until this is fixed properly this PR reverts the default state of the octree to match the old behaviour, which doesn't appear exhibit the bug (or at least not as readily).
Batching is mostly separated into a common template which can be used with multiple backends (GLES2 and GLES3 here). Only necessary specifics are in the backend files.
Batching is extended to cover more primitives.
Option in MeshInstance to enable software skinning, in order to test
against the current USE_SKELETON_SOFTWARE path which causes problems
with bad performance.
Co-authored-by: lawnjelly <lawnjelly@gmail.com>
Prevents adding new octants until a limiting number of elements have been added to the current octant. This enables balancing the benefits of brute force against the benefits of spatial partitioning. The limit can be set per octree.
Project settings are added for rendering octree to set the best balance per project depending on number of tests per frame / tick, and the amount of editing of the octree.
Fixes octants being leaked when removing elements.
Optimize octree with cached linear lists
Storing elements in octants using linked lists is efficient for housekeeping but very slow for testing. This optimization stores additional local_vectors with Element pointers and AABBs which are cached and only updated when a dirty flag is set on the octant.
This is selectable with 2 versions of Octree : Octree and Octree_CL, Octree being the old behaviour. At present the cached list version is only used for the visual server octree (rendering) as it has only been demonstrated to be faster there so far.
This uses slightly more memory (probably a few kb in most cases) but can be significantly faster during testing (culling etc).
Co-authored-by: Sergey Minakov <naithar@icloud.com>
- Use the `.log` file extension (recognized on Windows out of the box)
to better hint that generated files are logs. Some editors provide
dedicated syntax highlighting for those files.
- Use an underscore to separate the basename from the date and
the date from the time in log filenames. This makes the filename
easier to read.
- Keep only 5 log files by default to decrease disk usage in case
messages are spammed.
(cherry picked from commit 20af28ec06)
Scaling tilemaps can cause border artifacts around the edges of tiles. This has been traced to precision issues in the GPU. This PR adds an adjustment to allow a minor contraction of the UVs of rects in order to compensate for the incorrect classification of texels across the UV border.
As it now seems like we will soon have GLES3 batching working using the same intermediate layer as GLES2, it makes more sense to reuse the same batching settings for both renderers rather than duplicate project settings for GLES2 and GLES3.
Although 2D draws in painters order with strict ordering, in certain circumstances items can be reordered to increase batching / decrease state changes, without affecting the end result. This can be determined by an overlap test.
In situation with item:
A-B-A
providing the third item does not overlap the second, they can be reordered:
A-A-B
Items already contain an AABB which can be used for this overlap test.
1)
To utilise this, I have implemented item reordering (only for single rects for now), with the lookahead adjustable in project settings. This can increase performance in situations where items may not be grouped in the scene tree by texture. It can also be switched off (by setting lookahead to 0).
2)
This same trick can be used to help join items that are lit. Lit items previously would prevent joining completely, thus missing out on performance gains other than multi-command items such as tilemaps.
In this PR, lights are assigned as bits in a bitfield (up to 64, the optimization is disabled above this), and on each try_item (for joining), the bitfield for lights and shadows is constructed and compared with the previous items. If these match the 2 items can potentially be joined. However, this can only be done without changing the rendered result if an overlap test is successful.
This overlap test can be adjusted to join items up to a specific number of item references, selectable in project settings, or turned off.
3)
The legacy uniform single rect drawing routine seems to have been identified as the source of flicker, particularly on nvidia. However, it can also be up to 2x as fast. Because of the speed the batching contains a fallback where it can use the legacy single rect method, but I have now added a project setting to make this switchable. In most cases with batching it should not be necessary (as single rects are drawn less frequently) and thus the flickering can be totally avoided.
4)
This PR also fixes a color modulate bug when drawing light passes, in certain situations (particularly custom _draw routines with multiple rects).
5)
This PR also fixes#38291, a bug in the legacy renderer where light passes could draw rects in wrong position.
Added project setting to enable / disable print frame diagnostics every 10 seconds. This prints out a list of batches and info, which is useful to optimize games and identify performance problems.
2d rendering is currently bottlenecked by drawing primitives one at a time, limiting OpenGL efficiency. This PR batches primitives and renders in fewer drawcalls, resulting in significant performance improvements. This also speeds up text rendering.
This PR batches across canvas items as well as within items.
The code dynamically chooses between a vertex format with and without color, depending on the input data for a frame, in order to optimize throughput and maximize batch size. It also adds an option to use glScissor to reduce fillrate in light passes.
We already removed it from the online docs with #35132.
Currently it can only be "Built-In Types" (Variant types) or "Core"
(everything else), which is of limited use.
We might also want to consider dropping it from `ClassDB` altogether
in Godot 4.0.