With the octahedral compression, we had attributes of a size of 2 bytes
which potentially caused performance regressions on iOS/Mac
Now add padding to the normal/tangent buffer
For octahedral, normal will always be oct32 encoded
UNLESS tangent exists and is also compressed
then both will be oct16 encoded and packed into a vec4<GL_BYTE>
attribute
Initial octahedral compression incorrectly wrote tangent to the buffer
using an offset of 3 rather than 4, losing the sign of the tangent
vector needed for things like tangent space for texturing mapping
GLES3 renderer used remove_custom_define rather than set_conditional to
change back to the default conditional state the scene shader should be
in
Implement Octahedral Compression for normal/tangent vectors
*Oct32 for uncompressed vectors
*Oct16 for compressed vectors
Reduces vertex size for each attribute by
*Uncompressed: 12 bytes, vec4<float32> -> vec2<unorm16>
*Compressed: 2 bytes, vec4<unorm8> -> vec2<unorm8>
Binormal sign is encoded in the y coordinate of the encoded tangent
Added conversion functions to go from octahedral mapping to cartesian
for normal and tangent vectors
sprite_3d and soft_body meshes write to their vertex buffer memory
directly and need to convert their normals and tangents to the new oct
format before writing
Created a new mesh flag to specify whether a mesh is using octahedral
compression or not
Updated documentation to discuss new flag/defaults
Created shader flags to specify whether octahedral or cartesian vectors
are being used
Updated importers to use octahedral representation as the default format
for importing meshes
Updated ShaderGLES2 to support 64 bit version codes as we hit the limit
of the 32-bit integer that was previously used as a bitset to store
enabled/disabled flags
Implemented splitting of vertex positions and attributes in the vertex
buffer
Positions are sequential at the start of the buffer, followed by the
additional attributes which are interleaved
Made a project setting which enables/disabled the buffer formatting
throughout the project
Implemented in both GLES2 and GLES3
This improves performance particularly on tile-based GPUs as well as
cache performance for something like shadow mapping which only needs
position data
Updated Docs and Project Setting
This is an older, easier to implement variant of CAS as a pure
fragment shader. It doesn't support upscaling, but we won't make
use of it (at least for now).
The sharpening intensity can be adjusted on a per-Viewport basis.
For the root viewport, it can be adjusted in the Project Settings.
Since `textureLodOffset()` isn't available in GLES2, there is no
way to support contrast-adaptive sharpening in GLES2.
Add two new functions to the IP class that returns all addresses/aliases associated with a given address.
This is a cherry-pick merge from 010a3433df which was merged in 2.1, and has been updated to build with the latest code.
This merge adds two new methods IP.resolve_hostname_addresses and IP.get_resolve_item_addresses that returns a List of all addresses returned from the DNS request.
The error check was added for `FileAccessUnix` but it's not an error when both
`p_src` and `p_length` are zero.
Added correct error checks to all implementations to prevent the actual
erroneous case: `p_src` is nullptr but `p_length > 0` (risk of null pointer
indexing).
Fixes#33564.
(cherry picked from commit 01d5c463be)
This changes the types of a big number of variables.
General rules:
- Using `uint64_t` in general. We also considered `int64_t` but eventually
settled on keeping it unsigned, which is also closer to what one would expect
with `size_t`/`off_t`.
- We only keep `int64_t` for `seek_end` (takes a negative offset from the end)
and for the `Variant` bindings, since `Variant::INT` is `int64_t`. This means
we only need to guard against passing negative values in `core_bind.cpp`.
- Using `uint32_t` integers for concepts not needing such a huge range, like
pages, blocks, etc.
In addition:
- Improve usage of integer types in some related places; namely, `DirAccess`,
core binds.
Note:
- On Windows, `_ftelli64` reports invalid values when using 32-bit MinGW with
version < 8.0. This was an upstream bug fixed in 8.0. It breaks support for
big files on 32-bit Windows builds made with that toolchain. We might add a
workaround.
Fixes#44363.
Fixesgodotengine/godot-proposals#400.
Co-authored-by: Rémi Verschelde <rverschelde@gmail.com>
In the legacy renderer unrigged polys would display with no transform applied, whereas the software skinning didn't deal with these at all (outputted them with position zero). This PR simply copies the source to destination verts and replicates the legacy behaviour.
All my earlier test cases for software skinning had the polys parent transform to be identity. This works fine until you had cases where the user had moved the transform of the parent nodes of skinned polys.
This PR fixes this situation by taking into account the final (concatenated) transform of the polys RELATIVE to the skeleton base transform. It does this by applying the inverse skeleton base transform to the poly final transform.
Since we clone the environments to build thirdparty code, we don't get an
explicit dependency on the build objects produced by that environment.
So when we update thirdparty code, Godot code using it is not necessarily
rebuilt (I think it is for changed headers, but not for changed .c/.cpp files),
which can lead to an invalid compilation output (linking old Godot .o files
with a newer, potentially ABI breaking version of thirdparty code).
This was only seen as really problematic with bullet updates (leading to
crashes when rebuilding Godot after a bullet update without cleaning .o files),
but it's safer to fix it everywhere, even if it's a LOT of hacky boilerplate.
(cherry picked from commit c7b53c03ae)
We've been using standard C library functions `memcpy`/`memset` for these since
2016 with 67f65f6639.
There was still the possibility for third-party platform ports to override the
definitions with a custom header, but this doesn't seem useful anymore.
Backport of #48239.
The final_modulate was incorrectly being set in the uniform on light passes in GLES3 in situations where color was baked in the vertices. This was already correct in GLES2. This PR makes prevents setting final_modulate in this situation.
The translation to larger vertex formats was assuming that batches were rects, and not accounting that the num_commands had a different meaning for lines and polys, so the calculation for number of vertices to translate was incorrect in these cases.
Also prevents infinite loop if a single polygon has too many vertices to fit in the batch buffer.
When users create an invalid shader, the shader->valid flag is set to false. Batching previously assumes that shaders are valid, and this can result in primitives with invalid shader being joined, causing visual errors.
This PR prevents joining items that have invalid shaders.
Allows users to override default API usage, in order to get best performance on different platforms.
Also changes the default legacy flags to use STREAM rather than DYNAMIC.
When using modulate_fvf, final_modulate was still being applied on CPU in some circumstances, AS WELL as in the shader. This double application resulted in the wrong color.
This PR prevents CPU multiplication of final_modulate when modulate_fvf is in use.
It also applies an OR to the joined item flags with each item joined. This fixes a bug where a smaller FVF is used than required, resulting in incorrect colors.
In rare cases default batches could occur which were containing commands that were not owned by the first item referenced by the joined item. This had assumed to be the case, and would read the wrong command, or crash.
Instead for safety in this PR we now store a pointer to the parent item in default batches, and use this to determine the correct command list instead of assuming.
An earlier PR #46898 had flipped the rotation basis polarity. This turns out to also need a corresponding flip for the light angles for the lighting to make sense.
The editor under certain circumstances is passing invalid polys to the renderer. This should be fixed upstream but just in case this PR adds fault tolerance for invalid indices.
Trying to use the old `hardware_transform` flag to combine the new large_fvf has lead to several bugs. So here the logic is broken out into 2 separate components, single item and large_fvf.
The old `hardware_transform` name also no longer makes sense, as there are now 3 transform paths:
Software (CPU)
Hardware (uniform)
Hardware (attribute)
Large FVF which encodes the transform in a vertex attribute is triggered by reading from VERTEX in a custom shader. This means that the local vertex position must be available in the shader, so the only way to batch is to also pass the transform as an attribute.
The large FVF path already disabled CPU transform in the case of rects, but not in other primitives, which this PR fixes.
Note that large FVF is incompatible with 2d software skinning. So reading from VERTEX in a custom shader when using skinning will not work.
- Fix objects with no material being considered as fully transparent by the lightmapper.
- Added "environment_min_light" property: gives artistic control over the shadow color.
- Fixed "Custom Color" environment mode, it was ignored before.
- Added "interior" property to BakedLightmapData: controls whether dynamic capture objects receive environment light or not.
- Automatically update dynamic capture objects when the capture data changes (also works for "energy" which used to require object movement to trigger the update).
- Added "use_in_baked_light" property to GridMap: controls whether the GridMap will be included in BakedLightmap bakes.
- Set "flush zero" and "denormal zero" mode for SSE2 instructions in the Embree raycaster. According to Embree docs it should give a performance improvement.
As part of the improvements to batch more cases, batching can store final_modulate as an attribute in the vertex format rather than sending as a uniform. This allows draw calls with different final_modulate to be batched together.
However custom shader code was reading from only the final_modulate uniform, and not the attribute when it was in use. This was leading to visual errors.
This is tricky to solve, because we cannot use the same name for the attribute in the vertex and fragment shaders, because one is an attribute and one a varying, whereas a uniform is accessible anywhere. To get around this, a macro is used which can translate to the most appropriate variable depending on whether uniform or attribute or varying is required.
This is something that I missed from the initial implementation of large FVF. In large FVF the transform is sent per vertex in an attribute, and the vertex position is the original vertex position. This is so that the original vertex position can be read and modified in a custom shader.
This whole system is therefore incompatible with the legacy hardware transform method, whereby the transform is sent in a uniform. The shader already correctly ignores the uniform transform, but there are some parts of the CPU side logic that can be confused treating large FVF batches as if they were hardware transform.
This PR completes the logic by making the CPU treat large FVF as though it was software transform.
Slight technical hitch, the basis was reversed that was sent to the shader, so rotations were opposite. This PR reverses polarity of the basis to be correct.
There have been a couple of reports of pixel lines when using light scissoring. These seem to be an off by one error caused by either rounding or pixel snapping.
This PR adds a single pixel boost to light scissor rects to protect against this. This should make little difference to performance.
Although batching supported both ninepatch modes (fixed and scaling) when using ninepatch stretch mode, the ninepatch tiling modes (in GLES3) could only run through the shader.
The shader only supported one of the ninepatch modes. This PR uses the hack method of #if defined in the shader to prevent the use of a conditional. The define is set at startup according to the project setting.
GLES3 changes:
This commit makes it possible to disable 3D directional lights by using
the light's cull mask. It also automatically disables directionals when
the object has baked lighting and the light is set to "bake all".
GLES2 changes:
Added a check for the light cull mask, since it was previously ignored.
One of the new fvf types (FVF_MODULATED) allows batching custom shaders that use modulate. The only slight oversight is that there is a special define when MODULATE is used in a custom shader, called MODULATE_USED, that is checked, and if set it does NOT apply final_modulate as part of canvas.glsl.
This MODULATE_USED define wasn't checked when the new FVF was used and modulate was passed in an attribute.
This PR moves the application of the final_modulate into the #ifndef MODULATE_USED section.
The rendering/quality/2d section of project settings is becoming considerably expanded in 3.2.4, and arguably was not the correct place for settings that were not really to do with quality.
3.2.4 is the last sensible opportunity we will have to move these settings, as the only existing one likely to break compatibility in a small way is `pixel_snap`, and given that the whole snapping area is being overhauled we can draw attention to the fact it has changed in the release notes.
Class reference is also updated and slightly improved.
`pixel_snap` is renamed to `gpu_pixel_snap` in the project settings and code to help differentiate from CPU side transform snapping.
Antialiasing is not supported for batched polys. Currently due to the fallback mechanism, skinned antialiased polys will be rendered without applying animation.
This PR simply treats such polys as if antialiasing had not been selected. The class reference is updated to reflect this.
Due to multi pass approach to lighting in GLES2, in some situations the rendered result can look different if lights are presented in a different order.
The order (aside from directional lights) seems to be simply copied from the culling routine (octree or bvh) which is essentially arbitrary. While octree is usually consistent with order, bvh uses a trickle optimize which may result in lights occurring in different order from frame to frame.
This PR adds an extra layer of sorting on GLES2 lights in order to get some kind of order consistency.
These functions don't yet exist on ubuntu 14.04 so this leads to build
problems there. Omitting these symbols in the generated wrappers fixes
this. If we want to start using these symbols at a later date we should
just regenerate the wrapper.
(cherry picked from commit f42a7f9849)
This suppresses the blocky shadow appearance, bringing the shadow rendering
much closer to GLES3. This soft filter is more demanding as it requires
more lookups, but it makes PCF13 shadows more usable.
The soft PCF filter was adapted from three.js.
This #define's older inttypes to their newer versions and #includes
<stdint.h> in the generated files. This will help with older
glibc/compiler versions using headers generated on newer systems.
cherry picked from 5233d78f49
These are benign but worth fixing as it clears the log to find more important errors.
A common problem with the sanitizer is that enums are often used to represent bits (e.g. 1, 2, 4, 8 etc) but without specifying the enum type, the compiler is free to use unsigned or signed int. In this case it uses int, and when it performs bitwise operations on the int type, the sanitizer complains.
This is probably because a bitshift with negative signed value can give undefined behaviour - the sanitizer can't know ahead of time that you are using the enum for sensible bitflags.
- Based on C++11's `thread` and `thread_local`
- No more need to allocate-deallocate or check for null
- No pointer anymore, just a member variable
- Platform-specific implementations no longer needed (except for the few cases of non-portable functions)
- Simpler for `NO_THREADS`
- Thread ids are now the same across platforms (main is 1; others follow)
- Based on C++11's `mutex` and `condition_variable`
- No more need to allocate-deallocate or check for null
- No pointer anymore, just a member variable
- Platform-specific implementations no longer needed
- Simpler for `NO_THREADS`
- Based on C++11's `mutex`
- No more need to allocate-deallocate or check for null
- No pointer anymore, just a member variable
- Platform-specific implementations no longer needed
- Simpler for `NO_THREADS`
- `BinaryMutex` added for special cases as the non-recursive version
- `MutexLock` now takes a reference. At this point the cases of null `Mutex`es are rare. If you ever need that, just don't use `MutexLock`.
- `ScopedMutexLock` is dropped and replaced by `MutexLock`, because they were pretty much the same.
- Based on C++14's `shared_time_mutex`
- No more need to allocate-deallocate or check for null
- No pointer anymore, just a member variable
- Platform-specific implementations no longer needed
- Simpler for `NO_THREADS`
It appears that we can get a fun circle dependency on a shared object on
some system configurations causing issues with our 'fake' function
pointer names. This can lead to a crash.
The new wrapper generator renames all the symbols so this can't happen
anymore. See https://github.com/hpvb/dynload-wrapper/commit/704135e
Cherry-pick from 8d36b17343
By generating stubs using https://github.com/hpvb/dynload-wrapper we
can dynamically load libpulse and libasound on systems where it is available.
Both are still a build-time requirement but no longer a run-time dependency.
For maintenance purposes the wrappers should not need to be re-generated
unless we want to bump pulse or asound to an incompatible version. It is
unlikely we will want to do this any time soon.
cherry-pick from 09f82fa6ea
Valgrind reported two instances of reading uninitialized memory in the batching. They are both pretty benign (as evidenced by no bug reports) but wise to close these.
The first is that when changing batch from a default batch it reads the batch color which is not set (as it is not relevant for default batches). The segment of code is not necessary when it has already deemed a batch change necessary (which will occur from a default batch). In addition this means that the count of color changes will be more accurate, rather than having a possible random value in.
The second is that on initialization _set_texture_rect_mode is called before the state has been properly initialized (it is initialized at the beginning of each canvas_begin, but this occurs outside of that).
Reordering an item from after a copybackbuffer to before would result in the wrong thing being rendered into the backbuffer.
This PR puts in a check to prevent reordering over such a boundary.
Move definition of rendering/quality/filters/anisotropic_filter_level to
servers/visual_server.cpp, since both GLES2 and GLES3 now use it
rasterizer_storage_gles3.cpp: Remove a spurious variable write (the
value gets overwritten soon after)
- Fix Embree runtime when using MinGW (patch by @RandomShaper).
- Fix baking of lightmaps on GridMaps.
- Fix some GLSL errors.
- Fix overflow in the number of shader variants (GLES2).
Completely re-write the lightmap generation code:
- Follow the general lightmapper code structure from 4.0.
- Use proper path tracing to compute the global illumination.
- Use atlassing to merge all lightmaps into a single texture (done by @RandomShaper)
- Use OpenImageDenoiser to improve the generated lightmaps.
- Take into account alpha transparency in material textures.
- Allow baking environment lighting.
- Add bicubic lightmap filtering.
There is some minor compatibility breakage in some properties and methods
in BakedLightmap, but lightmaps generated in previous engine versions
should work fine out of the box.
The scene importer has been changed to generate `.unwrap_cache` files
next to the imported scene files. These files *SHOULD* be added to any
version control system as they guarantee there won't be differences when
re-importing the scene from other OSes or engine versions.
This work started as a Google Summer of Code project; Was later funded by IMVU for a good amount of progress;
Was then finished and polished by me on my free time.
Co-authored-by: Pedro J. Estébanez <pedrojrulez@gmail.com>
nanosleep returns 0 or -1 not the error code.
The error code "EINTR" (if encountered) is placed in errno, in which case nanosleep can be safely recalled with the remaining time.
This is required, so that nanosleep continues if the calling thread is interrupted by a signal.
See manpage nanosleep(2) for additional details.
(cherry picked from commit 1107c7f327)
Happy new year to the wonderful Godot community!
2020 has been a tough year for most of us personally, but a good year for
Godot development nonetheless with a huge amount of work done towards Godot
4.0 and great improvements backported to the long-lived 3.2 branch.
We've had close to 400 contributors to engine code this year, authoring near
7,000 commit! (And that's only for the `master` branch and for the engine code,
there's a lot more when counting docs, demos and other first-party repos.)
Here's to a great year 2021 for all Godot users 🎆
(cherry picked from commit b5334d14f7)