Although 2D draws in painters order with strict ordering, in certain circumstances items can be reordered to increase batching / decrease state changes, without affecting the end result. This can be determined by an overlap test.
In situation with item:
A-B-A
providing the third item does not overlap the second, they can be reordered:
A-A-B
Items already contain an AABB which can be used for this overlap test.
1)
To utilise this, I have implemented item reordering (only for single rects for now), with the lookahead adjustable in project settings. This can increase performance in situations where items may not be grouped in the scene tree by texture. It can also be switched off (by setting lookahead to 0).
2)
This same trick can be used to help join items that are lit. Lit items previously would prevent joining completely, thus missing out on performance gains other than multi-command items such as tilemaps.
In this PR, lights are assigned as bits in a bitfield (up to 64, the optimization is disabled above this), and on each try_item (for joining), the bitfield for lights and shadows is constructed and compared with the previous items. If these match the 2 items can potentially be joined. However, this can only be done without changing the rendered result if an overlap test is successful.
This overlap test can be adjusted to join items up to a specific number of item references, selectable in project settings, or turned off.
3)
The legacy uniform single rect drawing routine seems to have been identified as the source of flicker, particularly on nvidia. However, it can also be up to 2x as fast. Because of the speed the batching contains a fallback where it can use the legacy single rect method, but I have now added a project setting to make this switchable. In most cases with batching it should not be necessary (as single rects are drawn less frequently) and thus the flickering can be totally avoided.
4)
This PR also fixes a color modulate bug when drawing light passes, in certain situations (particularly custom _draw routines with multiple rects).
5)
This PR also fixes#38291, a bug in the legacy renderer where light passes could draw rects in wrong position.
- Resurrect it for GL ES 2
- Apply roll over with `fmod()` instead of resetting it to 0
- Expose the setting from the `VisualServer`, since it does not belong in any specific rasterizer
This should make headless exporting work in projects using textures
in any format.
Error messages should no longer appear when running a project
that used image formats that were previously not present in the list.
(cherry picked from commit 3007c7e2a3)
When reading SCREEN_TEXTURE in a shader, this previously only worked succesfully for the first read of the screen, because state.canvas_texscreen_used was never getting reset. This PR resets state.canvas_texscreen_used at the beginning of each joined item, so that further screen reads can happen.
Joining items across z_indices can interfere with light culling for lights which only affect certain z ranges. This PR disables joining across z_indices when lights are present, except specifically for lights with both z_min set to the global minimum (-4096) and z_max set to the global maximum (4096).
In addition, the z_index is now stored on the joined_item for accurate light culling. The z_index is also displayed in frame diagnostics.
In rare circumstances default batches were being joined incorrectly, causing visual regressions. This logic has been fixed.
In addition slightly more output information has been added to frame diagnosis mode.
Batching across z_index layers was not preserving the batch_break flag, which determines whether to not join the previous item. This is fixed by storing the flag in RenderItemState and preserving it across canvas_render_items calls.
Added project setting to enable / disable print frame diagnostics every 10 seconds. This prints out a list of batches and info, which is useful to optimize games and identify performance problems.
This adds 2 new values (items and draw calls) to the performance monitor in a '2d' section, rather than reusing the 3d values in the 'raster' section.
This makes it far easier to optimize games to minimize drawcalls.
Defers sending 'transform' commands within a RasterizerCanvas::Item until they are needed for default batches. Instead locally caches the extra matrix and applies it using software transform, preventing unnecessary batch breaks.
The logic is relatively complex, and the whole 'extra matrix' of the legacy renderer in addition to the final_transform is not ideal. However this is required to accelerate some user drawing techniques, and later the lines in the IDE.
Extra functions canvas_render_items_begin and canvas_render_items_end are added to RasterizerCanvas, with noop stubs for non-GLES2 renderers. This enables batching to be spready over multiple z_indices, and multiple calls to canvas_render_items.
It does this by only performing item joining within canvas_render_items, and deferring rendering until canvas_render_items_end().
Determined that a large reason for the decrease in performance in unbatchable scenes was due to the new routine being analogous to the 'nvidia workaround' code, that is about half the speed. So this simply uses the old routine in the case of single unbatchable rects. Hopefully we will be able to remove the old path at a later stage.
Where the final_modulate color varies between render_items this can prevent batching. This PR solves this by baking final_modulate into the vertex colors, and setting the uniform 'final_modulate' to white, and allowing the joining of items that have different final_modulate values. The previous batching system can then cope with vertex color changes as normal.
2d rendering is currently bottlenecked by drawing primitives one at a time, limiting OpenGL efficiency. This PR batches primitives and renders in fewer drawcalls, resulting in significant performance improvements. This also speeds up text rendering.
This PR batches across canvas items as well as within items.
The code dynamically chooses between a vertex format with and without color, depending on the input data for a frame, in order to optimize throughput and maximize batch size. It also adds an option to use glScissor to reduce fillrate in light passes.
Namely, move the drive dropdown to just the left of the path text box and don't include the former
in the latter.
This improves the UX on Windows.
In the UNIX case, since its concept of drives is (ab)used to provide shortcuts to useful paths, its
dropdown is kept at the original location.
Pith bend message now has correct size (was 2 bytes instead of 3).
Recognized (but not implemented) 0xF? messages. SysEx messages will be reocognized as such, but their contents will be ignored.
Fixes#26637.
Fixes#19900.
The viewport_size returned by get_viewport_size was previously incorrect, being half the correct value. The function is renamed to get_viewport_half_extents, and now returns a Vector2.
Code which called this function has also been modified accordingly.
This PR also fixes shadow culling when using ortho cameras, because the correct input for CameraMatrix::set_orthogonal should be the full HEIGHT from get_viewport_half_extents, and not half the width.
It also fixes state.ubo_data.viewport_size in rasterizer_scene_gles3.cpp to be the width and the height of the viewport in pixels as stated in the documentation, rather than the current value which is half the viewport extents in worldspace, presumed to be a bug.
Reverts the following commits:
- c81ec6f26d40b70283958a4ef3e216fb32cbaf14:
"Exposes capture methods to AudioServer, variable renames for
consistency, added documentation."
- 47c558b98abf842910c780294314326662410cdf:
"Expose audio callbacks as signals."
- dabaa11b3c451e9b8f2cca7e563bd9ec51edb169:
"Fix to make sure the capture buffers are deallocated at shutdown.
Silences warnings."
Some documentation improvements were kept for pre-existing methods.
See rationale for reverting these changes in #30468.
All the calculations leading up to `mipLevel` are only relevant for
Panorama mode. Similarly, the `source_resolution` uniform is only
needed for that mode.
`ERROR: _display_error_with_code: CanvasShaderGLES3: Fragment Program Compilation Failed:
0:166(2): error: `return' with wrong type int, in function `map_ninepatch_axis' returning float` caused by #34704
Some cases were not handled properly for Polygon2D after making changes in common code to fix Line2D antialiasing. Added an option for drawing polygons to differentiate the two use cases.
Fixes#34568
Happy new year to the wonderful Godot community!
We're starting a new decade with a well-established, non-profit, free
and open source game engine, and tons of further improvements in the
pipeline from hundreds of contributors.
Godot will keep getting better, and we're looking forward to all the
games that the community will keep developing and releasing with it.
The new 'split_libmodules=yes' option is useful to work around linker
command line size limitations when linking a huge number of objects.
We're currently over 64k chars when linking libmodules.a on Windows
with MinGW, which triggers issues as seen in #30892.
Even on Linux, we can also reach linker command line size limitations
by adding more custom modules.
We force this option to True for MinGW on Windows, which fixes#30892.
Additional changes to lib splitting:
- Fix linking of the split module libs with interdependent symbols,
hacking our way into LINKCOM and SHLINKCOM to set the `--start-group`
and `--end-group` flags.
- Fix Python 3 compatibility in `methods.split_lib()`.
- Drop seemingly obsolete condition for 'msys' on 'posix'.
- Drop the unnecessary 'split_drivers' as the drivers lib is no longer
too big since we moved all thirdparty builds to modules.
Co-authored-by: Hein-Pieter van Braam-Stewart <hp@tmm.cx>
Polygon2D:
The property wasn't used anymore after switching from canvas_item_add_polygon() to canvas_item_add_triangle_array() for drawing.
Line2D:
Added the same property as for Polygon2D & fixed smooth line drawing to use indices correctly.
Fixes#26823
This changes the code path so that `glRenderBufferStorage*` always uses
values appropriate for renderbuffers and `glTexImage2D` never uses an
internalformat meant for buffers.
Fixes#33825.
As discussed in #32657, this can't be done here as lines can be used
with a canvas scale, and this breaks them.
A suggestion is to do the pixel shifting at matrix level instead.
Fixes#33393.
Fixes#33421.
When rendering to an external texture and MSAA was active (as happened
in the Oculus Mobile ARVR plugin) no MSAA was rendered as the correct
depth buffer and multisample texture target was not used.
This also fixes https://github.com/GodotVR/godot_oculus_mobile/issues/54
The misterious windows networking stack...
Using connect instead of WSAConnect causes socket error 10022 under
certain conditions.
See: https://github.com/godotengine/webrtc-native/ (issue 6)
Having to guess, code path for connect is different then WSAConnect with
NULL extra parameters.
The only reference about weird error with this code mentions something
called "Windows Filtering Platform" but windows internals are, as
always, obscure.
This might be something to try and report to Microsoft if anyone has the
time to spare with the likely outcome of being ignored.
While OpenGL ES 3.0 and WebGL 2.0 both support non power-of-2 (NPOT)
textures in their specification, the situation seems to be less clear
about *compressed* NPOT textures using repeat or mipmap flags.
At least Chrome on Linux doesn't seem to support this combination,
and a variety of mobile hardware have similar limitations.
As a workaround, we force decompressing such textures when running on
WebGL 2.0, at the cost of loading time and memory usage.
Fixes#33058.
OpenGL uses the diamond exit rule to rasterize lines. If we don't shift
the points down and to the right by 0.5, the line can sometimes miss a
pixel when it shouldn't. The final fragment of a line isn't drawn. By
drawing the lines clockwise, we can avoid a missing pixel in the rectangle.
See section 3.4.1 in the OpenGL 1.5 specification.
Fixes#32279
On Unix systems, file descriptors are usually shared among child
processes.
This means, that if we spawn a subprocess (or we fork) like we do in
the editor any open file descriptor will leak to the new process.
This PR sets the close-on-exec flag when opening a file, which causes
the file descriptor to not be shared with the child process.
On Unix systems, sockets are like file descriptors, and file descriptors
are usually shared among child processes.
This means, that if we spawn a subprocess (or we fork) like we do in the
editor, open file descriptors will leak to the new process.
This causes issue with sockets as they might remain open and bound
(listening) when the original process closes.
Fixes this error:
```
drivers\unix\ip_unix.cpp(155): error C2593: 'operator =' is ambiguous
.\core/ustring.h(177): note: could be 'void String::operator =(const CharType *)'
.\core/ustring.h(176): note: or 'void String::operator =(const char *)'
drivers\unix\ip_unix.cpp(155): note: while trying to match the argument list '(String, int)'
```
This fixes an issue that was fixed for gles3 in #31419 but not applied
to gles2. The fix consists of using a constant scale for cube_normal of -1.0
instead of -1000000. It results in broken panorama rendering on the
oculus quest (see https://github.com/GodotVR/godot_oculus_mobile/issues/29)
Although the backup USE_SKELETON_SOFTWARE skinning path is currently used when float texture is not supported, the default skinning path still fails when float texture is supported but GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS is 0, i.e. the device cannot read from texture during vertex shader. This PR adds the logic to activate the SKELETON_SOFTWARE path if either of these cases occur, preventing crashes on devices which have this combination of features.
In 2.1 and 3.0, light_vec could be modified for altering shadow_computations.
But it broke shadows when rotating light. shadow_vec would do the same, but without breaking
shadows in rotated lights if not used.
Add inverse light transformation to shadow vec, so it's not affected when rotating lights;
Added usage define for shadow vec.
For shadow vec working properly when rotating a light, it's needed to multiply it by light_matrix normalized. Added usage define in order to don't do that if shadow_vec not used.
Changed the behaviour of the Linear tonemapping operator to not clamp to [0, 1] range
in the case when KEEP_3D_LINEAR is defined. This allows to render values > 1.0 in
floating point texture targets (via Viewport) for further processing or saving high
dynamic range data into files. This only works when no color conversion is active.
The last remaining ERR_EXPLAIN call is in FreeType code and makes sense as is
(conditionally defines the error message).
There are a few ERR_EXPLAINC calls for C-strings where String is not included
which can stay as is to avoid adding additional _MSGC macros just for that.
Part of #31244.
Condensed some if and ERR statements. Added dots to end of error messages
Couldn't figure out EXPLAINC. These files gave me trouble: core/error_macros.h, core/io/file_access_buffered_fa.h (where is it?),
core/os/memory.cpp,
drivers/png/png_driver_common.cpp,
drivers/xaudio2/audio_driver_xaudio2.cpp (where is it?)
This makes height fog appear at the bottom of the scene
(instead of the top), which is generally the expected result.
This also tweaks the fog height setting hint to be more flexible.
This closes#30709.
On some file systems, like ext4 on Linux, readdir() gives enough
information to determine the entry type in order to avoid doing
a stat() system call.
Use this information and call stat() only if necessary: for file
systems that do not support this feature and for links.
On some file systems, like ext4 on Linux, readdir() gives enough
information to determine the entry type in order to avoid doing
a stat() system call.
Use this information and call stat() only if necessary.
This is a straight copy-paste of the code from
`drivers/gles3/rasterizer_canvas_gles3.cpp`. It is subject to the
same restrictions as the GLES3 implementation: it only works
on desktop platforms as they use OpenGL instead of OpenGL ES.
For clarity, assign-to-release idiom for PoolVector::Read/Write
replaced with a function call.
Existing uses replaced (or removed if already handled by scope)
When setting the default precision type for uniforms (before compiling
the shader) prevent boolean uniforms from having one set. Booleans can't
have a precision type and on some Android devices this caused a
compilation failure.
Fixes#30317
It's the recommended way to set those, and is more portable
(automatically prepends -D for GCC/Clang and /D for MSVC).
We still use CPPFLAGS for some pre-processor flags which are not
defines.
Allow getting interfaces names and assigned names.
On UWP this is not supported, and the function will return one interface
for each local address (with interface name the local address itself).
Wrapped libpng usage in a pair of functions under PNGDriverCommon,
which convert between Godot Image and png data.
Switched to libpng 1.6 simplified API for ease of maintenance.
Implemented ImageLoaderPNG and ResourceSaverPNG in terms of
PNGDriverCommon functions.
Travis, switched to builtin libpng (thus builtin freetype and zlib also)
so we can build on Xenial.
ResourceFormatLoader and ResourceFormatSaver are meant to be overridden
to add support for different formats in ResourceLoader and ResourceSaver.
Those should be exposed as they can be overridden in plugins.
On the other hand, all predefined subclasses of those two base classes
are only meant to register support for new file and resource types, but
should not and cannot be used directly from script, so they should not
be exposed.
Also unexposed ResourceImporterOGGVorbis (and thus its base class
ResourceImporter) which are editor-only.
Non-tools OpenGLES2 devices that use the USE_SKELETON_SOFTWARE path (i.e. do not support float texture) depend on surface->data being set containing the bone IDs and weights (rasterizer_scene_gles2.cpp, line 1456, RasterizerSceneGLES2::_setup_geometry). However currently if TOOLS_ENABLED is not defined, surface->data is not stored in main memory in rasterizer_storage_gles2.cpp. This causes a crash in rasterizer_scene_gles2.cpp when a rigged object comes into view.
This fix addresses the specific case of skinned objects when USE_SKELETON_SOFTWARE is active, and stores a copy of the bone data, as is done when TOOLS_ENABLED is defined. This fixes the crash by allowing the same mechanism as on desktop, without adding the memory overhead of storing all vertex data where not required.
Fixes#28298
Warnings raised by Emscripten 1.38.0 and MinGW64 5.0.4 / GCC 8.3.0.
JS can now build with `werror=yes warnings=extra`.
MinGW64 still has a few warnings to resolve with `warnings=extra`,
and only one with `warnings=all`.
Part of #29033 and #29801.
This is a new singleton where camera sources such as webcams or cameras on a mobile phone can register themselves with the Server.
Other parts of Godot can interact with this to obtain images from the camera as textures.
This work includes additions to the Visual Server to use this functionality to present the camera image in the background. This is specifically targetted at AR applications.
It's not necessary, but the vast majority of calls of error macros
do have an ending semicolon, so it's best to be consistent.
Most WARN_DEPRECATED calls did *not* have a semicolon, but there's
no reason for them to be treated differently.
This decreases the number of samples significantly, leading to a
notable performance increase with only a very slight loss in
visual quality.
This also tweaks the default SSAO settings to use 3×3 blurring,
which makes noise patterns much less visible.
The use of different default precision values (highp in vertex; mediump
in fragment) for uniform variables caused the shader program to not link properly on some android
devices/emulators.
If a non-imported texture resource file (e.g. DDS) gets updated the editor
doesn't reload it. The cause of the problem is two-fold:
First, the code of ImageTexture assumes that textures are always imported
from an image, but that's not the case for e.g. DDS. This change thus adds
code to issue a resource reload in case an image reload is not possible
(which is the case for non-imported texture resources).
Second, the code is filled with bogus calls to Image::get_image_data_size()
to determine the mipmap offset when that should be done using
Image::get_image_mipmap_offset(). Previous code literally passed the integer
mip level value to Image::get_image_data_size() where that actually expects
a boolean. Thus this part of the change might actually solve some other
issues as well.
To be pedantic, the texture_get_data() funciton of the rasterizer drivers is
still quite a mess, as it only ever returns the whole mipchain when
GLES_OVER_GL is set (practically only on desktop builds) but this change does
not attempt to resolve that.
Include paths are processed from left to right, so we use Prepend to
ensure that paths to bundled thirdparty files will have precedence over
system paths (e.g. `/usr/include` should have lowest priority).
Many contributors (me included) did not fully understand what CCFLAGS,
CXXFLAGS and CPPFLAGS refer to exactly, and were thus not using them
in the way they are intended to be.
As per the SCons manual: https://www.scons.org/doc/HTML/scons-user/apa.html
- CCFLAGS: General options that are passed to the C and C++ compilers.
- CFLAGS: General options that are passed to the C compiler (C only;
not C++).
- CXXFLAGS: General options that are passed to the C++ compiler. By
default, this includes the value of $CCFLAGS, so that setting
$CCFLAGS affects both C and C++ compilation.
- CPPFLAGS: User-specified C preprocessor options. These will be
included in any command that uses the C preprocessor, including not
just compilation of C and C++ source files [...], but also [...]
Fortran [...] and [...] assembly language source file[s].
TL;DR: Compiler options go to CCFLAGS, unless they must be restricted
to either C (CFLAGS) or C++ (CXXFLAGS). Preprocessor defines go to
CPPFLAGS.
The bake mode property of lights previously didn't affect GI probes.
This change makes the GI probe ignore lights that have their bake mode
set to disabled.
It seems to stay compatible with formatting done by clang-format 6.0 and 7.0,
so contributors can keep using those versions for now (they will not undo those
changes).
FBX support and MMD (pmx) support.
Normals, Albedo, Metallic, and Roughness through Arnold 5 Materials for Maya FBX.
Maya FBX Stingray PBS support.
Importing FBX static meshes work.
Importing FBX animations is a work in progress.
Supports FBX 4 bone influence animations.
Supports FBX blend shapes.
MMDs do not have an associated animation import yet.
Sponsored by IMVU Inc.
Example of the warning:
./core/script_language.h:198:7: warning: 'class ScriptCodeCompletionCache' has virtual functions and accessible non-virtual destructor [-Wnon-virtual-dtor]
- Cleaned up and improved the code determining when we need to use a depth
prepass (previously it wasn't executed in certain cases even if it was
needed)
- Added code to prepare and bind the depth texture even when no depth prepass
or MRTs (more precisely effect buffers) are used
Fixes#25870, #25535, and #25387.
In canvas.glsl and scene.glsl, we were using texel2DFetch from stdlib.glsl,
which uses texture2DLod. In both cases, the stdlib.glsl include came before
the define of texture2DLod.
Might fix issues for drivers that don't support GL_EXT_shader_texture_lod.
- Texture arrays and 3D textures weren't working previously due to an
incorrect number of calls to glTexImage3D with incorrect level parameters.
This change fixes that.
- Fixed the incorrect calculation of the byte size of layered textures.
- Added the layer count to the debugger info when viewing video memory usage.