The default shadow material was used for depth rendering disregarding the cull mode of the original material. This commit adds a check so the default shadow material is only used when the original material has back-face culling.
Fixes a regression introduced by the color pass flags rework. The various rasterizer state structs were not being reset for each flag combination, which meant some state changes were wrongly applied to some flag combinations.
The validation layers were complaining that we use DEFAULT_RD_TEXTURE_WHITE (which is RGBA8) in places where it's sampled as a depth texture. This commit adds the new default texture DEFAULT_RD_TEXTURE_DEPTH and uses it where needed.
This commit removes a lot of enum values related to the color render pass in favor of a new flag-bases approach. This means instead of hard-coding all the possible option combinations into enums, we can write our logic by checking a bit-mask.
The changes in rendering_device_vulkan.cpp add support for unused attachments. That means RenderingDeviceVulkan::framebuffer_create() can take null RIDs in the attachments vector, which will result in VK_ATTACHMENT_UNUSED entries in the render pass.
This is used in this same PR to establish fixed locations for the color pass attachments (only color and separate specular so far, but TAA will add motion vectors as well). This way the attachment locations in the shader can stay the same regardless of which attachments are actually used.
Right now all the combinations of flags are generated, but we will need to add a way to limit the amount of combinations in the future.
3 options are available:
- Light and Sky (default)
- Light Only (new)
- Sky Only (equivalent to `use_in_sky_only = true`)
Co-authored by: clayjohn <claynjohn@gmail.com>
When given roughness is lower than 0.01, d value in original code will
be zero. This can make last return value as NAN because of
divide-by-zero. This is well addressed in issue #56373.
Modified code is referenced on D_GGX function of google/filament
(https://github.com/google/filament/blob/main/shaders/src/brdf.fs#L54-L79)
Signed-off-by: snowapril <sinjihng@gmail.com>
* Changed syntax usage for RD::Uniform to create faster with a single RID
* Converted render pass setup to use this in clustered renderer to test.
This is the first step into creating a proper uniform set cache system to simplify large parts of the codebase.
- Add 2D and 3D in timestamp names when needed to avoid ambiguity.
- Use present tense in all render timestamp names.
- Add a space after ">" (begin) and "<" (end) symbols.
- Remove redundant "End" in render timestamp names (indicated by "<").
This can be used to fade lights and their shadows in the distance,
similar to Decal nodes. This can bring significant performance
improvements, especially for lights with shadows enabled and when
using higher-than-default shadow quality settings.
While lights can be smoothly faded out over distance, shadows are
currently "all or nothing" since per-light shadow color is no longer
customizable in the Vulkan renderer. This may result in noticeable
pop-in when leaving the shadow cutoff distance, but depending on the
scene, it may not always be that noticeable.
* Adds optional vec4 USERDATA1 .. USERDATA6 to particles, allowing to store custom data.
* This data is allocated on demand, so shaders that do not use it do not cost more.
- Enable Read Sky Light to get proper outdoors lighting out of the box.
- Set bounce feedback to 0.5 by default to get a better quality result.
- Higher values may cause infinite feedback with bright surfaces.
- Increase the number of frames to converge to improve quality
at the cost of latency. Most scenes are fairly static after all.
- Use 75% Y scale by default as most scenes are not highly vertical.
- Reorder the Y scale enum to go from the lowest Y scale to the highest.
Also rename the "Disabled" setting to "100%" for clarity.
This improves rendering performance noticeably, especially when the
camera moves fast.
On a medium-sized test scene on a GTX 1080 in 2560×1440, going
from 6 to cascades saves 0.5 ms of frame time while looking visually
identical (as most of the scene fits within the 4 cascades).
16-bit shadow atlases are already the default in the project settings,
but low-level methods used 24-bit shadows by default.
This makes low-level methods more consistent with the default project
settings to avoid accidental performance issues when users change
the shadow size at run-time.
`smoothstep()` avoids the sudden transparency jump when entering or
leaving an object's alpha fade margin distance. This in turn helps
make opacity transitions less noticeable to the player, as it's
less likely to catch the player's eye.
* Erase shadow owner *before* setting it to RID().
* Add default texture in shadow atlas debug view to avoid error spam when no atlas is present.
* Fix typo.
A common source of errors is to call functions (such as round()) expecting them to work in place, but them actually being designed only to return the processed value. Not using the return value in this case in indicative of a bug, and can be flagged as a warning by using the [[nodiscard]] attribute.
This provides more flexibility between performance and quality
adjustments, especially when using SDFGI for small-scale levels
(which can be useful for procedurally generated scenes).
On the only platform where PVRTC is supported (iOS),
ETC2 generally supersedes PVRTC in every possible way. The increased
memory usage is not really a problem thanks to modern iOS' devices
processing power being higher than its Android counterparts.
Found via `codespell -q 3 -S ./thirdparty,*.po,./DONORS.md -L ackward,ang,ans,ba,beng,cas,childs,childrens,dof,doubleclick,expct,fave,findn,gird,hist,inh,inout,leapyear,lod,nd,numer,ois,ony,paket,ro,seeked,sinc,switchs,te,uint,varn,vew`
Applying overlay materials into multi-surface meshes currently
requires adding a next pass material to all the surfaces, which
might be cumbersome when the material is to be applied to a range
of different geometries. This also makes it not trivial to use
AnimationPlayer to control the material in case of visual effects.
The material_override property is not an option as it works
replacing the active material for the surfaces, not adding a new pass.
This commit adds the material_overlay property to GeometryInstance3D
(and therefore MeshInstance3D), having the same reach as
material_override (that is, all surfaces) but adding a new material
pass on top of the active materials, instead of replacing them.
Each file in Godot has had multiple contributors who co-authored it over the
years, and the information of who was the original person to create that file
is not very relevant, especially when used so inconsistently.
`git blame` is a much better way to know who initially authored or later
modified a given chunk of code, and most IDEs now have good integration to
show this information.
This can be used to distinguish between integrated, dedicated, virtual
and software-emulated GPUs. This in turn can be used to automatically
adjust graphics settings, or warn users about features that may run
slowly on their hardware.
This reduces visible banding in indirect lighting and reflections.
Sharp reflections now match more closely the original scene.
The downside of this change is that clipping may appear in reflections
in extremely bright scenes, but this should not be a concern in most
scenes.
The message about SpatialMaterial conversion was turned into a warning,
as it can potentially interfere with porting projects from Godot 3.x
(if there's a bug in the conversion code).
In scenes that have little to no overdraw, disabling the depth prepass
can give a small performance boost. Nonetheless, in most other scenarios,
the depth prepass should be left enabled as it improves performance
significantly.
On a GeForce GTX 1080 in 2002×1447 resolution, decreasing VoxelGI quality
from High to Low quality saves 1.2 ms of GPU time in a medium-sized
test scene. This only results in a minor drop in quality.
Sets `AlignOperands` to `DontAlign`.
`clang-format` developers seem to mostly care about space-based indentation and
every other version of clang-format breaks the bad mismatch of tabs and spaces
that it seems to use for operand alignment. So it's better without, so that it
respects our two-tabs `ContinuationIndentWidth`.
Fixes the SHADOW_CASTING_SETTING_OFF setting in
GeometryInstance3D and the "shadows_disabled" render
mode in spatial materials, which were not working
before.
The built-in ALPHA in spatial shaders comes pre-set with a per-instance
transparency value. Multiply by it if you want to keep it.
The transparency value of any given GeometryInstance3D is affected by:
- Its new "transparency" property.
- Its own visiblity range when the new "visibility_range_fade_mode"
property is set to "Self".
- Its parent visibility range when the parent's fade mode is
set to "Dependencies".
The "Self" mode will fade-out the instance when reaching the visibility
range limits, while the "Dependencies" mode will fade-in its
dependencies.
Per-instance transparency is only implemented in the forward clustered
renderer, support for mobile should be added in the future.
Co-authored-by: reduz <reduzio@gmail.com>
This can be used to improve 3D shadow rendering quality at little
performance cost. Unlike the existing Hard setting which is limited
to variable shadow blur only, it works with both fixed blur and
variable blur.
This property was intended to provide a way to have SSAO or VoxelGI
ambient occlusion with a color other than black. However, it was
dropped during the Vulkan renderer development due to the performance
overhead it caused when the feature wasn't used.
Add glTF2 uri decode for paths.
Add vertex custom apis.
Add scene importer api.
Change Color to float; add support for float-based custom channels in SurfaceTool and EditorSceneImporterMesh
Co-authored-by: darth negative hunter
<thenegativehunter2@users.noreply.github.com>
In the `master` branch, 16× MSAA caused the entire system to freeze
on NVIDIA GPUs. This is likely caused by graphics drivers not actually
implementing 16× MSAA, but combining 8× MSAA with 2× SSAA instead.
On top of that, modern shader complexity makes 16× MSAA very difficult
to use while keeping a good framerate. 8× MSAA is hard enough to use
as it is.
OmniLight3D:
* Fixed lack of precision in cube map mode by scaling the projection's
znear.
* Fixed aliasing issues by making the paraboloids use two square regions instead of two half
squares.
* Fixed shadowmap atlas bleeding by adding padding.
* Fixed sihadow blur's inconsistent radius and unclamped sampling.
SpotLight3D:
* Fixed lack of precision by scaling the projection's znear.
* Fixed normal biasing.
Both:
* Tweaked biasing to make sure it works out of the box in most situations.
When a shader error is printed about a built-in shader, the origin
of the shader will now be recognizable immediately by looking at
the top of the printed shader code.
* Make sure shaders are named, to aid in debug in case of failure
* SceneRenderRD was being wrongly initialized (virtual functions being called when derivative class not initialized).
* Fixed some bugs resulting on the above being corrected.
The "Use Parent Material" option now does something when enabled on a CanvasItem. As before, it's not just limited to a node's direct parent but can move up the tree until it finds a material.
Also corrected a typo in rendering_device_vulkan.h that didn't merit its own commit.
* Specialization constants used to disable anything not needed to draw
* Added softshadow and projector support on mobile.
This new approach ensures mobile shaders are smaller and more efficient, but relies on more pipeline versions compiled on demand.
As a result, random stalls can ocur like in Godot 3.x. These will be solved by using background compilation and fallbacks eventually (but needs to be tested first).
The project setting wasn't being used anywhere.
This also tweaks the property hints to denote that these properties
are only effective after a restart.
* Only apply final actions to attachments used in the last pass.
* Fixes to draw list final action (was using continue instead of read/drop).
* Profiling regions inside draw lists now properly throw errors.
* Ability to enable gpu profile printing from project settings. (used to debug).
* Simplified code a lot, bias based on normalized cascade size.
* Lets scale cascades, max distance, etc. without creating acne.
* Fixed normal biasing in directional shadows.
I removed normal biasing in both omni and spot shadows, since the technique can't be easily implemented there.
Will need to be replaced by something else.
* Added an extra stage before compiling shader, which is generating a binary blob.
* On Vulkan, this allows caching the SPIRV reflection information, which is expensive to parse.
* On other (future) RenderingDevices, it allows caching converted binary data, such as DXIL or MSL.
This PR makes the shader cache include the reflection information, hence editor startup times are significantly improved.
I tested this well and it appears to work, and I added a lot of consistency checks, but because it includes writing and reading binary information, rare bugs may pop up, so be aware.
There was not much of a choice for storing the reflection information, given shaders can be a lot, take a lot of space and take time to parse.
Found via `codespell -q 3 -S ./thirdparty,*.po,./DONORS.md -L ackward,ang,ans,ba,beng,cas,childs,childrens,dof,doubleclick,fave,findn,hist,inout,leapyear,lod,nd,numer,ois,ony,paket,seeked,sinc,switchs,te,uint`
* Shadow quality settings now specialization constant.
* Decal and light projector filters can be set.
* Changing those settings forces re-creation of the pipelines.
These changes should help improve performance related to shadow mapping, and allows improving performance by sacrificing decal and light projector quality.
* use valid format for framebuffer: VK_FORMAT_A2B10G10R10_UNORM_PACK32
* Unfortunately cant be used for compute.
* Mobile will need to do refprobe, sky, mipmapblurring using raster.
* Keep track of when projector, softshadow or directional sofshadow were enabled.
* Enable them via specializaton constant where it makes sense.
* Re-implements soft shadows.
* Re-implements light projectors.
* IF a texture was reimported (calling replace as an example), it would invalidate all materials using it, causing plenty of errors.
* Added the possibility to get a notification when a uniform set is erased.
* With this notification, materials can be queued for update properly.
* Implements the code to show the boot splash on load using RenderingDevice
* Does not work on X11 when maximized, some platform specific hack will be needed there.
* Fixed and redone the process to obtain render information from a viewport
* Some stats, such as material changes are too difficult to guess on Vulkan, were removed.
* Separated visible and shadow stats, which causes confusion.
* Texture, buffer and general video memory can be queried now.
* Fixed the performance metrics too.
The Optimized shadow depth range was removed in late 2020 in favor
of the Stable shadow depth range, but it still had a (broken) property
that allowed to enable it.
* Colors were imported as 16BPP (half float)
* Far most common use cases only require 8BPP
* If you need higher data precision, use a custom array, which are supported now.
**WARNING**: 3D Scenes imported in 4.0 no longer compatible with this new format. You need to re-import them (erase them from .godot/import)
* Removed entirely from RenderingServer.
* Replaced by ImmediateMesh resource.
* ImmediateMesh replaces ImmediateGeometry, but could use more optimization in the future.
* Sprite3D and AnimatedSprite3D work again, ported from Godot 3.x (though a lot of work was needed to adapt them to Godot 4).
* RootMotionView works again.
* Polygon3D editor works again.
* Ability to allocate empty objects in RID_Owner, so RID_PtrOwner is not needed in most cases.
* Improves cache usage, as objects are now allocated together
* Should improve performance in 2D rendering
* Added a function to ignore subsequent commands if they don't fall within the slice.
* This will be used by the new TileMap to properly provide animated tiles.
* This is the 3D counterpart to #49632
* Implemented a bit different as 3D works using instancing
After merged, both 2D and 3D classes will most likely be renamed in a separate PR to DisplayNotifier2D/3D.
* GIProbe is now VoxelGI
* BakedLightmap is now LightmapGI
As godot adds more ways to provide GI (as an example, SDFGI in 4.0), the different techniques (which have different pros/cons) need to be properly named to avoid confusion.
* Shader compilation is now cached. Subsequent loads take less than a millisecond.
* Improved game, editor and project manager startup time.
* Editor uses .godot/shader_cache to store shaders.
* Game uses user://shader_cache
* Project manager uses $config_dir/shader_cache
* Options to tweak shader caching in project settings.
* Editor path configuration moved from EditorSettings to new class, EditorPaths, so it can be available early on (before shaders are compiled).
* Reworked ShaderCompilerRD to ensure deterministic shader code creation (else shader may change and cache will be invalidated).
* Added shader compression with SMOLV: https://github.com/aras-p/smol-v
-Mesh2D now works
-MultiMesh2D now works
-Polygon2D now works
-Added hooks for processing 2D particles
-Skeleton2D now works
2D particles still not working, but stuff needed for it is now implemented.
Various fixes to UV2 unwrapping and the GPU lightmapper. Listed here for
context in case of git blame/bisect:
* Fix UV2 unwrapping on import, also cleaned up the unwrap cache code.
* Fix saving of RGBA images in EXR format.
* Fixes to the GPU lightmapper:
- Added padding between atlas elements, avoids bleeding.
- Remove old SDF generation code.
- Fix baked attenuation for Omni/Spot lights.
- Fix baking of material properties onto UV2 (wireframe was
wrongly used before).
- Disable statically baked lights for objects that have a
lightmap texture to avoid applying the same light twice.
- Fix lightmap pairing in RendererSceneCull.
- Fix UV2 array generated from `RenderingServer::mesh_surface_get_arrays()`.
- Port autoexposure fix for OIDN from 3.x.
- Save debug textures as EXR when using floating point format.
-Enable the trails and set the length in seconds
-Provide a mesh with a skeleton and a skin
-Or, alternatively use one of the built-in TubeTrailMesh/RibbonTrailMesh
-Works deterministically
-Fixed particle collisions (were broken)
-Not working in 2D yet (that will happen next)
We've been using standard C library functions `memcpy`/`memset` for these since
2016 with 67f65f6639.
There was still the possibility for third-party platform ports to override the
definitions with a custom header, but this doesn't seem useful anymore.
Added an occlusion culling system with support for static occluder meshes.
It can be enabled via `Project Settings > Rendering > Occlusion Culling > Use Occlusion Culling`.
Occluders are defined via the new `Occluder3D` resource and instanced using the new
`OccluderInstance3D` node. The occluders can also be automatically baked from a
scene using the built-in editor plugin.
* Particle shaders now have start() and process()
* Particle collision happens between them.
* The RESTART property is kept, so porting an old shader is still possible.
This fixes the problem of particle collisions not functioning on the first particle frame.
-Used a more consistent set of keywords for the shader
-Remove all harcoded entry points
-Re-wrote the GLSL shader parser, new system is more flexible. Allows any entry point organization.
-Entry point for sky shaders is now sky().
-Entry point for particle shaders is now process().
-Advanced Settings toggle also hides advanced properties when disabled
-Simplified Advanced Bar (errors were just plain redundant)
-Reorganized rendering quality settings.
-Reorganized miscelaneous settings for clean up.
-Rendering server now uses a split RID allocate/initialize internally, this allows generating RIDs immediately but initialization to happen later on the proper thread (as rendering APIs generally requiere to call on the right thread).
-RenderingServerWrapMT is no more, multithreading is done in RenderingServerDefault.
-Some functions like texture or mesh creation, when renderer supports it, can register and return immediately (so no waiting for server API to flush, and saving staging and command buffer memory).
-3D physics server changed to be made multithread friendly.
-Added PhysicsServer3DWrapMT to use 3D physics server from multiple threads.
-Disablet Bullet (too much effort to make multithread friendly, this needs to be fixed eventually).
Inverted the spotlight angle attenuation so a higher value results in
a dimmer light, this makes it more consistent with the distance
attenuation.
Also changed the way spotlighs are computed in SDFGI
and GIPorbes and GPU lightmapper, now it matches the falloff used in the scene rendering
code.
-Always use temporal reproject, it just loos way better than any other filter.
-By always using termporal reproject, the shadowmap reduction can be done away with, massively improving performance.
-Disadvantage of temporal reproject is update latency so..
-Made sure a gaussian filter runs in XY after fog, this allows to keep stability and lower latency.
-Added more finegrained control in RenderingDevice API
-Optimized barriers (use less ones for thee same)
-General optimizations
-Shadows render all together unbarriered
-GI can render together with shadows.
-SDFGI can render together with depth-preoass.
-General fixes
-Added GPU detection
-Removed sync to draw, now everything syncs to draw by default.
-Fixed many validation layer errors.
-Added support for VkImageViewUsageCreateInfo to fix validation layer warnings.
-Texture, buffer, raster and compute functions now all allow spcifying which barriers will be used.
-When importing, a vertex-only version of the mesh is created.
-This version is used when rendering shadows, and improves performance by reducing bandwidth
-It's automatic, but can optionally be used by users, in case they want to make special versions of geometry for shadow casting.
-All shadow rendering is done with raster now (no compute)
-All shadow rendering is done by rendering directly to the shadow atlas
-Improved how buffer clearing is done to optimize the above.
-Ability to set shadows as 16 bits.
-SDFGI direct light is done over many frames
-SDFGI Changed settings for rays/frame
-SDFGI Misc optimizations
-SDFGI Bug fix on probe scroll
-GIProbe was not working, got it to work again
-GIProbe dynamic objects were not working, fixed
-Added a half size GI option.
Happy new year to the wonderful Godot community!
2020 has been a tough year for most of us personally, but a good year for
Godot development nonetheless with a huge amount of work done towards Godot
4.0 and great improvements backported to the long-lived 3.2 branch.
We've had close to 400 contributors to engine code this year, authoring near
7,000 commit! (And that's only for the `master` branch and for the engine code,
there's a lot more when counting docs, demos and other first-party repos.)
Here's to a great year 2021 for all Godot users 🎆
Blendshapes without a skeleton already worked.
However, due to a faulty ERR_FAIL_COND, it was impossible to create a mesh with both bones and blendshapes.
This also fixes an assumption that all surfaces reference the same number of bones as surface 0.
-Much greater pairing/unpairing performance
-For now, using it for culling too, but this will change in a couple of days.
-Added a paged allocator, to efficiently alloc/free some types of objects.
Use double instead of int for the p_lod_distance_multiplier default value in
RendererSceneRenderForward::_render_shadow to match parent default value in
RendererSceneRenderRD::_render_shadow
-Happens on import by default for all models
-Just works (tm)
-Biasing can be later adjusted per node or per viewport (as well as globally)
-Disabled AABB.get_support test because its broken
We haven't had a proper implementation for COMPRESS_PVRTC2 (which is PVRTC1 2-bpp) in years,
so let's drop it instead of keeping a compress type which doesn't work.
The other enum values were renamed to clarify that our PVRTC2 and PVRTC4 are respectively
PVRTC1 2-bpp and PVRTC1 4-bpp. PVRTC2 2-bpp and 4-bpp are not implemented yet.