diff --git a/doc/classes/Environment.xml b/doc/classes/Environment.xml index c5b6a54f71a..9dd4ecc37bf 100644 --- a/doc/classes/Environment.xml +++ b/doc/classes/Environment.xml @@ -218,26 +218,29 @@ The screen-space ambient occlusion intensity on materials that have an AO texture defined. Values higher than [code]0[/code] will make the SSAO effect visible in areas darkened by AO textures. - - The screen-space ambient occlusion bias. This should be kept high enough to prevent "smooth" curves from being affected by ambient occlusion. - - - The screen-space ambient occlusion blur quality. See [enum SSAOBlur] for possible values. - - - The screen-space ambient occlusion edge sharpness. + + Sets the strength of the additional level of detail for the screen-space ambient occlusion effect. A high value makes the detail pass more prominent, but it may contribute to aliasing in your final image. - If [code]true[/code], the screen-space ambient occlusion effect is enabled. This darkens objects' corners and cavities to simulate ambient light not reaching the entire object as in real life. This works well for small, dynamic objects, but baked lighting or ambient occlusion textures will do a better job at displaying ambient occlusion on large static objects. This is a costly effect and should be disabled first when running into performance issues. + If [code]true[/code], the screen-space ambient occlusion effect is enabled. This darkens objects' corners and cavities to simulate ambient light not reaching the entire object as in real life. This works well for small, dynamic objects, but baked lighting or ambient occlusion textures will do a better job at displaying ambient occlusion on large static objects. Godot uses a form of SSAO called Adaptive Screen Space Ambient Occlusion which is itself a form of Horizon Based Ambient Occlusion. - - The primary screen-space ambient occlusion intensity. See also [member ssao_radius]. + + The threshold for considering whether a given point on a surface is occluded or not represented as an angle from the horizon mapped into the [code]0.0-1.0[/code] range. A value of [code]1.0[/code] results in no occlusion. + + + The primary screen-space ambient occlusion intensity. Acts as a multiplier for the screen-space ambient occlusion effect. A higher value results in darker occlusion. The screen-space ambient occlusion intensity in direct light. In real life, ambient occlusion only applies to indirect light, which means its effects can't be seen in direct light. Values higher than [code]0[/code] will make the SSAO effect visible in direct light. + + The distribution of occlusion. A higher value results in darker occlusion, similar to [member ssao_intensity], but with a sharper falloff. + - The primary screen-space ambient occlusion radius. + The distance at which objects can occlude each other when calculating screen-space ambient occlusion. Higher values will result in occlusion over a greater distance at the cost of performance and quality. + + + Sharpness refers to the amount that the screen-space ambient occlusion effect is allowed to blur over the edges of objects. Setting too high will result in aliasing around the edges of objects. Setting too low will make object edges appear blurry. The default exposure used for tonemapping. @@ -335,18 +338,6 @@ Mixes the glow with the underlying color to avoid increasing brightness as much while still maintaining a glow effect. - - No blur for the screen-space ambient occlusion effect (fastest). - - - 1×1 blur for the screen-space ambient occlusion effect. - - - 2×2 blur for the screen-space ambient occlusion effect. - - - 3×3 blur for the screen-space ambient occlusion effect. Increases the radius of the blur for a smoother look, but can result in checkerboard-like artifacts. - diff --git a/doc/classes/ProjectSettings.xml b/doc/classes/ProjectSettings.xml index d0f2a927cb4..e856d1ea9ca 100644 --- a/doc/classes/ProjectSettings.xml +++ b/doc/classes/ProjectSettings.xml @@ -1240,11 +1240,26 @@ Lower-end override for [member rendering/quality/shadows/soft_shadow_quality] on mobile devices, due to performance concerns or driver support. + + Quality target to use when [member rendering/quality/ssao/quality] is set to [code]ULTRA[/code]. A value of [code]0.0[/code] provides a quality and speed similar to [code]MEDIUM[/code] while a value of [code]1.0[/code] provides much higher quality than any of the other settings at the cost of performance. + + + Number of blur passes to use when computing screen-space ambient occlusion. A higher number will result in a smoother look, but will be slower to compute and will have less high-frequency detail. + + + Distance at which the screen-space ambient occlusion effect starts to fade out. Use this hide ambient occlusion at great distances. + + + Distance at which the screen-space ambient occlusion is fully faded out. Use this hide ambient occlusion at great distances. + If [code]true[/code], screen-space ambient occlusion will be rendered at half size and then upscaled before being added to the scene. This is significantly faster but may miss small details. - - Sets the quality of the screen-space ambient occlusion effect. Higher values take more samples and so will result in better quality, at the cost of performance. + + Lower-end override for [member rendering/quality/ssao/half_size] on mobile devices, due to performance concerns. + + + Sets the quality of the screen-space ambient occlusion effect. Higher values take more samples and so will result in better quality, at the cost of performance. Setting to [code]ULTRA[/code] will use the [member rendering/quality/ssao/adaptive_target] setting. Scales the depth over which the subsurface scattering effect is applied. A high value may allow light to scatter into a part of the mesh or another mesh that is close in screen space but far in depth. diff --git a/doc/classes/RenderingServer.xml b/doc/classes/RenderingServer.xml index 74eb6a17e5e..036b50f0edb 100644 --- a/doc/classes/RenderingServer.xml +++ b/doc/classes/RenderingServer.xml @@ -765,17 +765,20 @@ - + - + - + - + - + + + + Sets the variables to be used with the "screen space ambient occlusion" post-process effect. See [Environment] for more details. @@ -3499,29 +3502,20 @@ - - Disables the blur set for SSAO. Will make SSAO look noisier. - - - Perform a 1x1 blur on the SSAO output. - - - Performs a 2x2 blur on the SSAO output. - - - Performs a 3x3 blur on the SSAO output. Use this for smoothest SSAO. - - + Lowest quality of screen space ambient occlusion. - + + Low quality screen space ambient occlusion. + + Medium quality screen space ambient occlusion. - + High quality screen space ambient occlusion. - - Highest quality screen space ambient occlusion. + + Highest quality screen space ambient occlusion. Uses the adaptive setting which can be dynamically adjusted to smoothly balance performance and visual quality. diff --git a/drivers/dummy/rasterizer_dummy.h b/drivers/dummy/rasterizer_dummy.h index bc34d2df9aa..f674c365008 100644 --- a/drivers/dummy/rasterizer_dummy.h +++ b/drivers/dummy/rasterizer_dummy.h @@ -86,8 +86,8 @@ public: void environment_set_ssr(RID p_env, bool p_enable, int p_max_steps, float p_fade_int, float p_fade_out, float p_depth_tolerance) override {} void environment_set_ssr_roughness_quality(RS::EnvironmentSSRRoughnessQuality p_quality) override {} - void environment_set_ssao(RID p_env, bool p_enable, float p_radius, float p_intensity, float p_bias, float p_light_affect, float p_ao_channel_affect, RS::EnvironmentSSAOBlur p_blur, float p_bilateral_sharpness) override {} - void environment_set_ssao_quality(RS::EnvironmentSSAOQuality p_quality, bool p_half_size) override {} + void environment_set_ssao(RID p_env, bool p_enable, float p_radius, float p_intensity, float p_power, float p_detail, float p_horizon, float p_sharpness, float p_light_affect, float p_ao_channel_affect) override {} + void environment_set_ssao_quality(RS::EnvironmentSSAOQuality p_quality, bool p_half_size, float p_adaptive_target, int p_blur_passes, float p_fadeout_from, float p_fadeout_to) override {} void environment_set_sdfgi(RID p_env, bool p_enable, RS::EnvironmentSDFGICascades p_cascades, float p_min_cell_size, RS::EnvironmentSDFGIYScale p_y_scale, bool p_use_occlusion, bool p_use_multibounce, bool p_read_sky, float p_energy, float p_normal_bias, float p_probe_bias) override {} diff --git a/drivers/vulkan/rendering_device_vulkan.cpp b/drivers/vulkan/rendering_device_vulkan.cpp index c6d49de2748..03216c667eb 100644 --- a/drivers/vulkan/rendering_device_vulkan.cpp +++ b/drivers/vulkan/rendering_device_vulkan.cpp @@ -2095,15 +2095,26 @@ RID RenderingDeviceVulkan::texture_create_shared_from_slice(const TextureView &p ERR_FAIL_COND_V_MSG(p_slice_type == TEXTURE_SLICE_3D && src_texture->type != TEXTURE_TYPE_3D, RID(), "Can only create a 3D slice from a 3D texture"); + ERR_FAIL_COND_V_MSG(p_slice_type == TEXTURE_SLICE_2D_ARRAY && (src_texture->type != TEXTURE_TYPE_2D_ARRAY), RID(), + "Can only create an array slice from a 2D array mipmap"); + //create view ERR_FAIL_UNSIGNED_INDEX_V(p_mipmap, src_texture->mipmaps, RID()); ERR_FAIL_UNSIGNED_INDEX_V(p_layer, src_texture->layers, RID()); + int slice_layers = 1; + if (p_slice_type == TEXTURE_SLICE_2D_ARRAY) { + ERR_FAIL_COND_V_MSG(p_layer != 0, RID(), "layer must be 0 when obtaining a 2D array mipmap slice"); + slice_layers = src_texture->layers; + } else if (p_slice_type == TEXTURE_SLICE_CUBEMAP) { + slice_layers = 6; + } + Texture texture = *src_texture; get_image_format_required_size(texture.format, texture.width, texture.height, texture.depth, p_mipmap + 1, &texture.width, &texture.height); texture.mipmaps = 1; - texture.layers = p_slice_type == TEXTURE_SLICE_CUBEMAP ? 6 : 1; + texture.layers = slice_layers; texture.base_mipmap = p_mipmap; texture.base_layer = p_layer; @@ -2157,7 +2168,7 @@ RID RenderingDeviceVulkan::texture_create_shared_from_slice(const TextureView &p } image_view_create_info.subresourceRange.baseMipLevel = p_mipmap; image_view_create_info.subresourceRange.levelCount = 1; - image_view_create_info.subresourceRange.layerCount = p_slice_type == TEXTURE_SLICE_CUBEMAP ? 6 : 1; + image_view_create_info.subresourceRange.layerCount = slice_layers; image_view_create_info.subresourceRange.baseArrayLayer = p_layer; if (texture.usage_flags & TEXTURE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT) { diff --git a/editor/editor_node.cpp b/editor/editor_node.cpp index a8dc14427d0..970b61af0c4 100644 --- a/editor/editor_node.cpp +++ b/editor/editor_node.cpp @@ -485,7 +485,7 @@ void EditorNode::_notification(int p_what) { RS::DOFBlurQuality dof_quality = RS::DOFBlurQuality(int(GLOBAL_GET("rendering/quality/depth_of_field/depth_of_field_bokeh_quality"))); bool dof_jitter = GLOBAL_GET("rendering/quality/depth_of_field/depth_of_field_use_jitter"); RS::get_singleton()->camera_effects_set_dof_blur_quality(dof_quality, dof_jitter); - RS::get_singleton()->environment_set_ssao_quality(RS::EnvironmentSSAOQuality(int(GLOBAL_GET("rendering/quality/ssao/quality"))), GLOBAL_GET("rendering/quality/ssao/half_size")); + RS::get_singleton()->environment_set_ssao_quality(RS::EnvironmentSSAOQuality(int(GLOBAL_GET("rendering/quality/ssao/quality"))), GLOBAL_GET("rendering/quality/ssao/half_size"), GLOBAL_GET("rendering/quality/ssao/adaptive_target"), GLOBAL_GET("rendering/quality/ssao/blur_passes"), GLOBAL_GET("rendering/quality/ssao/fadeout_from"), GLOBAL_GET("rendering/quality/ssao/fadeout_to")); RS::get_singleton()->screen_space_roughness_limiter_set_active(GLOBAL_GET("rendering/quality/screen_filters/screen_space_roughness_limiter_enabled"), GLOBAL_GET("rendering/quality/screen_filters/screen_space_roughness_limiter_amount"), GLOBAL_GET("rendering/quality/screen_filters/screen_space_roughness_limiter_limit")); bool glow_bicubic = int(GLOBAL_GET("rendering/quality/glow/upscale_mode")) > 0; RS::get_singleton()->environment_glow_set_use_bicubic_upscale(glow_bicubic); diff --git a/scene/resources/environment.cpp b/scene/resources/environment.cpp index 1eb78a3679c..32840dd3739 100644 --- a/scene/resources/environment.cpp +++ b/scene/resources/environment.cpp @@ -368,13 +368,40 @@ float Environment::get_ssao_intensity() const { return ssao_intensity; } -void Environment::set_ssao_bias(float p_bias) { - ssao_bias = p_bias; +void Environment::set_ssao_power(float p_power) { + ssao_power = p_power; _update_ssao(); } -float Environment::get_ssao_bias() const { - return ssao_bias; +float Environment::get_ssao_power() const { + return ssao_power; +} + +void Environment::set_ssao_detail(float p_detail) { + ssao_detail = p_detail; + _update_ssao(); +} + +float Environment::get_ssao_detail() const { + return ssao_detail; +} + +void Environment::set_ssao_horizon(float p_horizon) { + ssao_horizon = p_horizon; + _update_ssao(); +} + +float Environment::get_ssao_horizon() const { + return ssao_horizon; +} + +void Environment::set_ssao_sharpness(float p_sharpness) { + ssao_sharpness = p_sharpness; + _update_ssao(); +} + +float Environment::get_ssao_sharpness() const { + return ssao_sharpness; } void Environment::set_ssao_direct_light_affect(float p_direct_light_affect) { @@ -395,35 +422,18 @@ float Environment::get_ssao_ao_channel_affect() const { return ssao_ao_channel_affect; } -void Environment::set_ssao_blur(SSAOBlur p_blur) { - ssao_blur = p_blur; - _update_ssao(); -} - -Environment::SSAOBlur Environment::get_ssao_blur() const { - return ssao_blur; -} - -void Environment::set_ssao_edge_sharpness(float p_edge_sharpness) { - ssao_edge_sharpness = p_edge_sharpness; - _update_ssao(); -} - -float Environment::get_ssao_edge_sharpness() const { - return ssao_edge_sharpness; -} - void Environment::_update_ssao() { RS::get_singleton()->environment_set_ssao( environment, ssao_enabled, ssao_radius, ssao_intensity, - ssao_bias, + ssao_power, + ssao_detail, + ssao_horizon, + ssao_sharpness, ssao_direct_light_affect, - ssao_ao_channel_affect, - RS::EnvironmentSSAOBlur(ssao_blur), - ssao_edge_sharpness); + ssao_ao_channel_affect); } // SDFGI @@ -1150,26 +1160,29 @@ void Environment::_bind_methods() { ClassDB::bind_method(D_METHOD("get_ssao_radius"), &Environment::get_ssao_radius); ClassDB::bind_method(D_METHOD("set_ssao_intensity", "intensity"), &Environment::set_ssao_intensity); ClassDB::bind_method(D_METHOD("get_ssao_intensity"), &Environment::get_ssao_intensity); - ClassDB::bind_method(D_METHOD("set_ssao_bias", "bias"), &Environment::set_ssao_bias); - ClassDB::bind_method(D_METHOD("get_ssao_bias"), &Environment::get_ssao_bias); + ClassDB::bind_method(D_METHOD("set_ssao_power", "power"), &Environment::set_ssao_power); + ClassDB::bind_method(D_METHOD("get_ssao_power"), &Environment::get_ssao_power); + ClassDB::bind_method(D_METHOD("set_ssao_detail", "detail"), &Environment::set_ssao_detail); + ClassDB::bind_method(D_METHOD("get_ssao_detail"), &Environment::get_ssao_detail); + ClassDB::bind_method(D_METHOD("set_ssao_horizon", "horizon"), &Environment::set_ssao_horizon); + ClassDB::bind_method(D_METHOD("get_ssao_horizon"), &Environment::get_ssao_horizon); + ClassDB::bind_method(D_METHOD("set_ssao_sharpness", "sharpness"), &Environment::set_ssao_sharpness); + ClassDB::bind_method(D_METHOD("get_ssao_sharpness"), &Environment::get_ssao_sharpness); ClassDB::bind_method(D_METHOD("set_ssao_direct_light_affect", "amount"), &Environment::set_ssao_direct_light_affect); ClassDB::bind_method(D_METHOD("get_ssao_direct_light_affect"), &Environment::get_ssao_direct_light_affect); ClassDB::bind_method(D_METHOD("set_ssao_ao_channel_affect", "amount"), &Environment::set_ssao_ao_channel_affect); ClassDB::bind_method(D_METHOD("get_ssao_ao_channel_affect"), &Environment::get_ssao_ao_channel_affect); - ClassDB::bind_method(D_METHOD("set_ssao_blur", "mode"), &Environment::set_ssao_blur); - ClassDB::bind_method(D_METHOD("get_ssao_blur"), &Environment::get_ssao_blur); - ClassDB::bind_method(D_METHOD("set_ssao_edge_sharpness", "edge_sharpness"), &Environment::set_ssao_edge_sharpness); - ClassDB::bind_method(D_METHOD("get_ssao_edge_sharpness"), &Environment::get_ssao_edge_sharpness); ADD_GROUP("SSAO", "ssao_"); ADD_PROPERTY(PropertyInfo(Variant::BOOL, "ssao_enabled"), "set_ssao_enabled", "is_ssao_enabled"); - ADD_PROPERTY(PropertyInfo(Variant::FLOAT, "ssao_radius", PROPERTY_HINT_RANGE, "0.1,128,0.01"), "set_ssao_radius", "get_ssao_radius"); - ADD_PROPERTY(PropertyInfo(Variant::FLOAT, "ssao_intensity", PROPERTY_HINT_RANGE, "0.0,128,0.01"), "set_ssao_intensity", "get_ssao_intensity"); - ADD_PROPERTY(PropertyInfo(Variant::FLOAT, "ssao_bias", PROPERTY_HINT_RANGE, "0.001,8,0.001"), "set_ssao_bias", "get_ssao_bias"); + ADD_PROPERTY(PropertyInfo(Variant::FLOAT, "ssao_radius", PROPERTY_HINT_RANGE, "0.01,16,0.01,or_greater"), "set_ssao_radius", "get_ssao_radius"); + ADD_PROPERTY(PropertyInfo(Variant::FLOAT, "ssao_intensity", PROPERTY_HINT_RANGE, "0,16,0.01,or_greater"), "set_ssao_intensity", "get_ssao_intensity"); + ADD_PROPERTY(PropertyInfo(Variant::FLOAT, "ssao_power", PROPERTY_HINT_EXP_EASING), "set_ssao_power", "get_ssao_power"); + ADD_PROPERTY(PropertyInfo(Variant::FLOAT, "ssao_detail", PROPERTY_HINT_RANGE, "0,5,0.01"), "set_ssao_detail", "get_ssao_detail"); + ADD_PROPERTY(PropertyInfo(Variant::FLOAT, "ssao_horizon", PROPERTY_HINT_RANGE, "0,1,0.01"), "set_ssao_horizon", "get_ssao_horizon"); + ADD_PROPERTY(PropertyInfo(Variant::FLOAT, "ssao_sharpness", PROPERTY_HINT_RANGE, "0,1,0.01"), "set_ssao_sharpness", "get_ssao_sharpness"); ADD_PROPERTY(PropertyInfo(Variant::FLOAT, "ssao_light_affect", PROPERTY_HINT_RANGE, "0.00,1,0.01"), "set_ssao_direct_light_affect", "get_ssao_direct_light_affect"); ADD_PROPERTY(PropertyInfo(Variant::FLOAT, "ssao_ao_channel_affect", PROPERTY_HINT_RANGE, "0.00,1,0.01"), "set_ssao_ao_channel_affect", "get_ssao_ao_channel_affect"); - ADD_PROPERTY(PropertyInfo(Variant::INT, "ssao_blur", PROPERTY_HINT_ENUM, "Disabled,1x1,2x2,3x3"), "set_ssao_blur", "get_ssao_blur"); - ADD_PROPERTY(PropertyInfo(Variant::FLOAT, "ssao_edge_sharpness", PROPERTY_HINT_RANGE, "0,32,0.01"), "set_ssao_edge_sharpness", "get_ssao_edge_sharpness"); // SDFGI @@ -1367,11 +1380,6 @@ void Environment::_bind_methods() { BIND_ENUM_CONSTANT(GLOW_BLEND_MODE_REPLACE); BIND_ENUM_CONSTANT(GLOW_BLEND_MODE_MIX); - BIND_ENUM_CONSTANT(SSAO_BLUR_DISABLED); - BIND_ENUM_CONSTANT(SSAO_BLUR_1x1); - BIND_ENUM_CONSTANT(SSAO_BLUR_2x2); - BIND_ENUM_CONSTANT(SSAO_BLUR_3x3); - BIND_ENUM_CONSTANT(SDFGI_CASCADES_4); BIND_ENUM_CONSTANT(SDFGI_CASCADES_6); BIND_ENUM_CONSTANT(SDFGI_CASCADES_8); diff --git a/scene/resources/environment.h b/scene/resources/environment.h index 106ba92bfee..a720f2cc473 100644 --- a/scene/resources/environment.h +++ b/scene/resources/environment.h @@ -70,13 +70,6 @@ public: TONE_MAPPER_ACES, }; - enum SSAOBlur { - SSAO_BLUR_DISABLED, - SSAO_BLUR_1x1, - SSAO_BLUR_2x2, - SSAO_BLUR_3x3, - }; - enum SDFGICascades { SDFGI_CASCADES_4, SDFGI_CASCADES_6, @@ -148,12 +141,13 @@ private: // SSAO bool ssao_enabled = false; float ssao_radius = 1.0; - float ssao_intensity = 1.0; - float ssao_bias = 0.01; + float ssao_intensity = 2.0; + float ssao_power = 1.5; + float ssao_detail = 0.5; + float ssao_horizon = 0.06; + float ssao_sharpness = 0.98; float ssao_direct_light_affect = 0.0; float ssao_ao_channel_affect = 0.0; - SSAOBlur ssao_blur = SSAO_BLUR_3x3; - float ssao_edge_sharpness = 4.0; void _update_ssao(); // SDFGI @@ -295,16 +289,18 @@ public: float get_ssao_radius() const; void set_ssao_intensity(float p_intensity); float get_ssao_intensity() const; - void set_ssao_bias(float p_bias); - float get_ssao_bias() const; + void set_ssao_power(float p_power); + float get_ssao_power() const; + void set_ssao_detail(float p_detail); + float get_ssao_detail() const; + void set_ssao_horizon(float p_horizon); + float get_ssao_horizon() const; + void set_ssao_sharpness(float p_sharpness); + float get_ssao_sharpness() const; void set_ssao_direct_light_affect(float p_direct_light_affect); float get_ssao_direct_light_affect() const; void set_ssao_ao_channel_affect(float p_ao_channel_affect); float get_ssao_ao_channel_affect() const; - void set_ssao_blur(SSAOBlur p_blur); - SSAOBlur get_ssao_blur() const; - void set_ssao_edge_sharpness(float p_edge_sharpness); - float get_ssao_edge_sharpness() const; // SDFGI void set_sdfgi_enabled(bool p_enabled); @@ -414,7 +410,6 @@ VARIANT_ENUM_CAST(Environment::BGMode) VARIANT_ENUM_CAST(Environment::AmbientSource) VARIANT_ENUM_CAST(Environment::ReflectionSource) VARIANT_ENUM_CAST(Environment::ToneMapper) -VARIANT_ENUM_CAST(Environment::SSAOBlur) VARIANT_ENUM_CAST(Environment::SDFGICascades) VARIANT_ENUM_CAST(Environment::SDFGIYScale) VARIANT_ENUM_CAST(Environment::GlowBlendMode) diff --git a/servers/rendering/renderer_rd/effects_rd.cpp b/servers/rendering/renderer_rd/effects_rd.cpp index cbb000cb8ce..b33255b54b2 100644 --- a/servers/rendering/renderer_rd/effects_rd.cpp +++ b/servers/rendering/renderer_rd/effects_rd.cpp @@ -31,6 +31,7 @@ #include "effects_rd.h" #include "core/config/project_settings.h" +#include "core/math/math_defs.h" #include "core/os/os.h" #include "thirdparty/misc/cubemap_coeffs.h" @@ -125,6 +126,33 @@ RID EffectsRD::_get_compute_uniform_set_from_texture(RID p_texture, bool p_use_m return uniform_set; } +RID EffectsRD::_get_compute_uniform_set_from_texture_and_sampler(RID p_texture, RID p_sampler) { + TextureSamplerPair tsp; + tsp.texture = p_texture; + tsp.sampler = p_sampler; + + if (texture_sampler_to_compute_uniform_set_cache.has(tsp)) { + RID uniform_set = texture_sampler_to_compute_uniform_set_cache[tsp]; + if (RD::get_singleton()->uniform_set_is_valid(uniform_set)) { + return uniform_set; + } + } + + Vector uniforms; + RD::Uniform u; + u.uniform_type = RD::UNIFORM_TYPE_SAMPLER_WITH_TEXTURE; + u.binding = 0; + u.ids.push_back(p_sampler); + u.ids.push_back(p_texture); + uniforms.push_back(u); + //any thing with the same configuration (one texture in binding 0 for set 0), is good + RID uniform_set = RD::get_singleton()->uniform_set_create(uniforms, ssao.blur_shader.version_get_shader(ssao.blur_shader_version, 0), 0); + + texture_sampler_to_compute_uniform_set_cache[tsp] = uniform_set; + + return uniform_set; +} + RID EffectsRD::_get_compute_uniform_set_from_texture_pair(RID p_texture1, RID p_texture2, bool p_use_mipmaps) { TexturePair tp; tp.texture1 = p_texture1; @@ -951,157 +979,345 @@ void EffectsRD::bokeh_dof(RID p_base_texture, RID p_depth_texture, const Size2i RD::get_singleton()->compute_list_end(); } -void EffectsRD::generate_ssao(RID p_depth_buffer, RID p_normal_buffer, const Size2i &p_depth_buffer_size, RID p_depth_mipmaps_texture, const Vector &depth_mipmaps, RID p_ao1, bool p_half_size, RID p_ao2, RID p_upscale_buffer, float p_intensity, float p_radius, float p_bias, const CameraMatrix &p_projection, RS::EnvironmentSSAOQuality p_quality, RS::EnvironmentSSAOBlur p_blur, float p_edge_sharpness) { - //minify first - ssao.minify_push_constant.orthogonal = p_projection.is_orthogonal(); - ssao.minify_push_constant.z_near = p_projection.get_z_near(); - ssao.minify_push_constant.z_far = p_projection.get_z_far(); - ssao.minify_push_constant.pixel_size[0] = 1.0 / p_depth_buffer_size.x; - ssao.minify_push_constant.pixel_size[1] = 1.0 / p_depth_buffer_size.y; - ssao.minify_push_constant.source_size[0] = p_depth_buffer_size.x; - ssao.minify_push_constant.source_size[1] = p_depth_buffer_size.y; +void EffectsRD::gather_ssao(RD::ComputeListID p_compute_list, const Vector p_ao_slices, const SSAOSettings &p_settings, bool p_adaptive_base_pass) { + RD::get_singleton()->compute_list_bind_uniform_set(p_compute_list, ssao.gather_uniform_set, 0); + if ((p_settings.quality == RS::ENV_SSAO_QUALITY_ULTRA) && !p_adaptive_base_pass) { + RD::get_singleton()->compute_list_bind_uniform_set(p_compute_list, ssao.importance_map_uniform_set, 1); + } + for (int i = 0; i < 4; i++) { + if ((p_settings.quality == RS::ENV_SSAO_QUALITY_VERY_LOW) && ((i == 1) || (i == 2))) { + continue; + } + + ssao.gather_push_constant.pass_coord_offset[0] = i % 2; + ssao.gather_push_constant.pass_coord_offset[1] = i / 2; + ssao.gather_push_constant.pass_uv_offset[0] = ((i % 2) - 0.0) / p_settings.screen_size.x; + ssao.gather_push_constant.pass_uv_offset[1] = ((i / 2) - 0.0) / p_settings.screen_size.y; + ssao.gather_push_constant.pass = i; + RD::get_singleton()->compute_list_bind_uniform_set(p_compute_list, _get_uniform_set_from_image(p_ao_slices[i]), 2); + RD::get_singleton()->compute_list_set_push_constant(p_compute_list, &ssao.gather_push_constant, sizeof(SSAOGatherPushConstant)); + + int x_groups = ((p_settings.screen_size.x >> (p_settings.half_size ? 2 : 1)) - 1) / 8 + 1; + int y_groups = ((p_settings.screen_size.y >> (p_settings.half_size ? 2 : 1)) - 1) / 8 + 1; + + RD::get_singleton()->compute_list_dispatch(p_compute_list, x_groups, y_groups, 1); + } + RD::get_singleton()->compute_list_add_barrier(p_compute_list); +} + +void EffectsRD::generate_ssao(RID p_depth_buffer, RID p_normal_buffer, RID p_depth_mipmaps_texture, const Vector &depth_mipmaps, RID p_ao, const Vector p_ao_slices, RID p_ao_pong, const Vector p_ao_pong_slices, RID p_upscale_buffer, RID p_importance_map, RID p_importance_map_pong, const CameraMatrix &p_projection, const SSAOSettings &p_settings, bool p_invalidate_uniform_sets) { RD::ComputeListID compute_list = RD::get_singleton()->compute_list_begin(); /* FIRST PASS */ - // Minify the depth buffer. - - for (int i = 0; i < depth_mipmaps.size(); i++) { - if (i == 0) { - RD::get_singleton()->compute_list_bind_compute_pipeline(compute_list, ssao.pipelines[SSAO_MINIFY_FIRST]); - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_depth_buffer), 0); - } else { - if (i == 1) { - RD::get_singleton()->compute_list_bind_compute_pipeline(compute_list, ssao.pipelines[SSAO_MINIFY_MIPMAP]); + // Downsample and deinterleave the depth buffer. + { + if (p_invalidate_uniform_sets) { + Vector uniforms; + { + RD::Uniform u; + u.uniform_type = RD::UNIFORM_TYPE_IMAGE; + u.binding = 0; + u.ids.push_back(depth_mipmaps[1]); + uniforms.push_back(u); } - - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_uniform_set_from_image(depth_mipmaps[i - 1]), 0); + { + RD::Uniform u; + u.uniform_type = RD::UNIFORM_TYPE_IMAGE; + u.binding = 1; + u.ids.push_back(depth_mipmaps[2]); + uniforms.push_back(u); + } + { + RD::Uniform u; + u.uniform_type = RD::UNIFORM_TYPE_IMAGE; + u.binding = 2; + u.ids.push_back(depth_mipmaps[3]); + uniforms.push_back(u); + } + ssao.downsample_uniform_set = RD::get_singleton()->uniform_set_create(uniforms, ssao.downsample_shader.version_get_shader(ssao.downsample_shader_version, 2), 2); } - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_uniform_set_from_image(depth_mipmaps[i]), 1); - RD::get_singleton()->compute_list_set_push_constant(compute_list, &ssao.minify_push_constant, sizeof(SSAOMinifyPushConstant)); - // shrink after set - ssao.minify_push_constant.source_size[0] = MAX(1, ssao.minify_push_constant.source_size[0] >> 1); - ssao.minify_push_constant.source_size[1] = MAX(1, ssao.minify_push_constant.source_size[1] >> 1); + float depth_linearize_mul = -p_projection.matrix[3][2]; + float depth_linearize_add = p_projection.matrix[2][2]; + if (depth_linearize_mul * depth_linearize_add < 0) { + depth_linearize_add = -depth_linearize_add; + } - int x_groups = (ssao.minify_push_constant.source_size[0] - 1) / 8 + 1; - int y_groups = (ssao.minify_push_constant.source_size[1] - 1) / 8 + 1; + ssao.downsample_push_constant.orthogonal = p_projection.is_orthogonal(); + ssao.downsample_push_constant.z_near = depth_linearize_mul; + ssao.downsample_push_constant.z_far = depth_linearize_add; + if (ssao.downsample_push_constant.orthogonal) { + ssao.downsample_push_constant.z_near = p_projection.get_z_near(); + ssao.downsample_push_constant.z_far = p_projection.get_z_far(); + } + ssao.downsample_push_constant.pixel_size[0] = 1.0 / p_settings.screen_size.x; + ssao.downsample_push_constant.pixel_size[1] = 1.0 / p_settings.screen_size.y; + ssao.downsample_push_constant.radius_sq = p_settings.radius * p_settings.radius; + + int downsample_pipeline = SSAO_DOWNSAMPLE; + if (p_settings.quality == RS::ENV_SSAO_QUALITY_VERY_LOW) { + downsample_pipeline = SSAO_DOWNSAMPLE_HALF; + } else if (p_settings.quality > RS::ENV_SSAO_QUALITY_MEDIUM) { + downsample_pipeline = SSAO_DOWNSAMPLE_MIPMAP; + } + + if (p_settings.half_size) { + downsample_pipeline++; + } + + RD::get_singleton()->compute_list_bind_compute_pipeline(compute_list, ssao.pipelines[downsample_pipeline]); + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_depth_buffer), 0); + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_uniform_set_from_image(depth_mipmaps[0]), 1); + if (p_settings.quality > RS::ENV_SSAO_QUALITY_MEDIUM) { + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, ssao.downsample_uniform_set, 2); + } + RD::get_singleton()->compute_list_set_push_constant(compute_list, &ssao.downsample_push_constant, sizeof(SSAODownsamplePushConstant)); + + int x_groups = (MAX(1, p_settings.screen_size.x >> (p_settings.half_size ? 2 : 1)) - 1) / 8 + 1; + int y_groups = (MAX(1, p_settings.screen_size.y >> (p_settings.half_size ? 2 : 1)) - 1) / 8 + 1; RD::get_singleton()->compute_list_dispatch(compute_list, x_groups, y_groups, 1); RD::get_singleton()->compute_list_add_barrier(compute_list); } /* SECOND PASS */ - // Gather samples + // Sample SSAO + { + ssao.gather_push_constant.screen_size[0] = p_settings.screen_size.x; + ssao.gather_push_constant.screen_size[1] = p_settings.screen_size.y; - RD::get_singleton()->compute_list_bind_compute_pipeline(compute_list, ssao.pipelines[(SSAO_GATHER_LOW + p_quality) + (p_half_size ? 4 : 0)]); - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_depth_mipmaps_texture), 0); - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_uniform_set_from_image(p_ao1), 1); - if (!p_half_size) { - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_depth_buffer), 2); - } - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_normal_buffer), 3); + ssao.gather_push_constant.half_screen_pixel_size[0] = 1.0 / p_settings.half_screen_size.x; + ssao.gather_push_constant.half_screen_pixel_size[1] = 1.0 / p_settings.half_screen_size.y; + float tan_half_fov_x = 1.0 / p_projection.matrix[0][0]; + float tan_half_fov_y = 1.0 / p_projection.matrix[1][1]; + ssao.gather_push_constant.NDC_to_view_mul[0] = tan_half_fov_x * 2.0; + ssao.gather_push_constant.NDC_to_view_mul[1] = tan_half_fov_y * -2.0; + ssao.gather_push_constant.NDC_to_view_add[0] = tan_half_fov_x * -1.0; + ssao.gather_push_constant.NDC_to_view_add[1] = tan_half_fov_y; + ssao.gather_push_constant.is_orthogonal = p_projection.is_orthogonal(); - ssao.gather_push_constant.screen_size[0] = p_depth_buffer_size.x; - ssao.gather_push_constant.screen_size[1] = p_depth_buffer_size.y; - if (p_half_size) { - ssao.gather_push_constant.screen_size[0] >>= 1; - ssao.gather_push_constant.screen_size[1] >>= 1; - } - ssao.gather_push_constant.z_far = p_projection.get_z_far(); - ssao.gather_push_constant.z_near = p_projection.get_z_near(); - ssao.gather_push_constant.orthogonal = p_projection.is_orthogonal(); + ssao.gather_push_constant.half_screen_pixel_size_x025[0] = ssao.gather_push_constant.half_screen_pixel_size[0] * 0.25; + ssao.gather_push_constant.half_screen_pixel_size_x025[1] = ssao.gather_push_constant.half_screen_pixel_size[1] * 0.25; - ssao.gather_push_constant.proj_info[0] = -2.0f / (ssao.gather_push_constant.screen_size[0] * p_projection.matrix[0][0]); - ssao.gather_push_constant.proj_info[1] = -2.0f / (ssao.gather_push_constant.screen_size[1] * p_projection.matrix[1][1]); - ssao.gather_push_constant.proj_info[2] = (1.0f - p_projection.matrix[0][2]) / p_projection.matrix[0][0]; - ssao.gather_push_constant.proj_info[3] = (1.0f + p_projection.matrix[1][2]) / p_projection.matrix[1][1]; - //ssao.gather_push_constant.proj_info[2] = (1.0f - p_projection.matrix[0][2]) / p_projection.matrix[0][0]; - //ssao.gather_push_constant.proj_info[3] = -(1.0f + p_projection.matrix[1][2]) / p_projection.matrix[1][1]; + float radius_near_limit = (p_settings.radius * 1.2f); + if (p_settings.quality <= RS::ENV_SSAO_QUALITY_LOW) { + radius_near_limit *= 1.50f; - ssao.gather_push_constant.radius = p_radius; - - ssao.gather_push_constant.proj_scale = float(p_projection.get_pixels_per_meter(ssao.gather_push_constant.screen_size[0])); - ssao.gather_push_constant.bias = p_bias; - ssao.gather_push_constant.intensity_div_r6 = p_intensity / pow(p_radius, 6.0f); - - ssao.gather_push_constant.pixel_size[0] = 1.0 / p_depth_buffer_size.x; - ssao.gather_push_constant.pixel_size[1] = 1.0 / p_depth_buffer_size.y; - - RD::get_singleton()->compute_list_set_push_constant(compute_list, &ssao.gather_push_constant, sizeof(SSAOGatherPushConstant)); - - int x_groups = (ssao.gather_push_constant.screen_size[0] - 1) / 8 + 1; - int y_groups = (ssao.gather_push_constant.screen_size[1] - 1) / 8 + 1; - - RD::get_singleton()->compute_list_dispatch(compute_list, x_groups, y_groups, 1); - RD::get_singleton()->compute_list_add_barrier(compute_list); - - /* THIRD PASS */ - // Blur horizontal - - ssao.blur_push_constant.edge_sharpness = p_edge_sharpness; - ssao.blur_push_constant.filter_scale = p_blur; - ssao.blur_push_constant.screen_size[0] = ssao.gather_push_constant.screen_size[0]; - ssao.blur_push_constant.screen_size[1] = ssao.gather_push_constant.screen_size[1]; - ssao.blur_push_constant.z_far = p_projection.get_z_far(); - ssao.blur_push_constant.z_near = p_projection.get_z_near(); - ssao.blur_push_constant.orthogonal = p_projection.is_orthogonal(); - ssao.blur_push_constant.axis[0] = 1; - ssao.blur_push_constant.axis[1] = 0; - - if (p_blur != RS::ENV_SSAO_BLUR_DISABLED) { - RD::get_singleton()->compute_list_bind_compute_pipeline(compute_list, ssao.pipelines[p_half_size ? SSAO_BLUR_PASS_HALF : SSAO_BLUR_PASS]); - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_ao1), 0); - if (p_half_size) { - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_depth_mipmaps_texture), 1); - } else { - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_depth_buffer), 1); + if (p_settings.quality == RS::ENV_SSAO_QUALITY_VERY_LOW) { + ssao.gather_push_constant.radius *= 0.8f; + } + if (p_settings.half_size) { + ssao.gather_push_constant.radius *= 0.5f; + } } - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_uniform_set_from_image(p_ao2), 3); + radius_near_limit /= tan_half_fov_y; + ssao.gather_push_constant.radius = p_settings.radius; + ssao.gather_push_constant.intensity = p_settings.intensity; + ssao.gather_push_constant.shadow_power = p_settings.power; + ssao.gather_push_constant.shadow_clamp = 0.98; + ssao.gather_push_constant.fade_out_mul = -1.0 / (p_settings.fadeout_to - p_settings.fadeout_from); + ssao.gather_push_constant.fade_out_add = p_settings.fadeout_from / (p_settings.fadeout_to - p_settings.fadeout_from) + 1.0; + ssao.gather_push_constant.horizon_angle_threshold = p_settings.horizon; + ssao.gather_push_constant.inv_radius_near_limit = 1.0f / radius_near_limit; + ssao.gather_push_constant.neg_inv_radius = -1.0 / ssao.gather_push_constant.radius; - RD::get_singleton()->compute_list_set_push_constant(compute_list, &ssao.blur_push_constant, sizeof(SSAOBlurPushConstant)); + ssao.gather_push_constant.load_counter_avg_div = 9.0 / float((p_settings.quarter_size.x) * (p_settings.quarter_size.y) * 255); + ssao.gather_push_constant.adaptive_sample_limit = p_settings.adaptive_target; - RD::get_singleton()->compute_list_dispatch(compute_list, x_groups, y_groups, 1); - RD::get_singleton()->compute_list_add_barrier(compute_list); + ssao.gather_push_constant.detail_intensity = p_settings.detail; + ssao.gather_push_constant.quality = MAX(0, p_settings.quality - 1); + ssao.gather_push_constant.size_multiplier = p_settings.half_size ? 2 : 1; - /* THIRD PASS */ - // Blur vertical + if (p_invalidate_uniform_sets) { + Vector uniforms; + { + RD::Uniform u; + u.uniform_type = RD::UNIFORM_TYPE_SAMPLER_WITH_TEXTURE; + u.binding = 0; + u.ids.push_back(ssao.mirror_sampler); + u.ids.push_back(p_depth_mipmaps_texture); + uniforms.push_back(u); + } + { + RD::Uniform u; + u.uniform_type = RD::UNIFORM_TYPE_IMAGE; + u.binding = 1; + u.ids.push_back(p_normal_buffer); + uniforms.push_back(u); + } + { + RD::Uniform u; + u.uniform_type = RD::UNIFORM_TYPE_UNIFORM_BUFFER; + u.binding = 2; + u.ids.push_back(ssao.gather_constants_buffer); + uniforms.push_back(u); + } + ssao.gather_uniform_set = RD::get_singleton()->uniform_set_create(uniforms, ssao.gather_shader.version_get_shader(ssao.gather_shader_version, 0), 0); + } - ssao.blur_push_constant.axis[0] = 0; - ssao.blur_push_constant.axis[1] = 1; + if (p_invalidate_uniform_sets) { + Vector uniforms; + { + RD::Uniform u; + u.uniform_type = RD::UNIFORM_TYPE_IMAGE; + u.binding = 0; + u.ids.push_back(p_ao_pong); + uniforms.push_back(u); + } + { + RD::Uniform u; + u.uniform_type = RD::UNIFORM_TYPE_SAMPLER_WITH_TEXTURE; + u.binding = 1; + u.ids.push_back(default_sampler); + u.ids.push_back(p_importance_map); + uniforms.push_back(u); + } + { + RD::Uniform u; + u.uniform_type = RD::UNIFORM_TYPE_STORAGE_BUFFER; + u.binding = 2; + u.ids.push_back(ssao.importance_map_load_counter); + uniforms.push_back(u); + } + ssao.importance_map_uniform_set = RD::get_singleton()->uniform_set_create(uniforms, ssao.gather_shader.version_get_shader(ssao.gather_shader_version, 2), 1); + } - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_ao2), 0); - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_uniform_set_from_image(p_ao1), 3); + if (p_settings.quality == RS::ENV_SSAO_QUALITY_ULTRA) { + ssao.importance_map_push_constant.half_screen_pixel_size[0] = 1.0 / p_settings.half_screen_size.x; + ssao.importance_map_push_constant.half_screen_pixel_size[1] = 1.0 / p_settings.half_screen_size.y; + ssao.importance_map_push_constant.intensity = p_settings.intensity; + ssao.importance_map_push_constant.power = p_settings.power; + //base pass + RD::get_singleton()->compute_list_bind_compute_pipeline(compute_list, ssao.pipelines[SSAO_GATHER_BASE]); + gather_ssao(compute_list, p_ao_pong_slices, p_settings, true); + //generate importance map + int x_groups = (p_settings.quarter_size.x - 1) / 8 + 1; + int y_groups = (p_settings.quarter_size.y - 1) / 8 + 1; - RD::get_singleton()->compute_list_set_push_constant(compute_list, &ssao.blur_push_constant, sizeof(SSAOBlurPushConstant)); + RD::get_singleton()->compute_list_bind_compute_pipeline(compute_list, ssao.pipelines[SSAO_GENERATE_IMPORTANCE_MAP]); + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_ao_pong), 0); + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_uniform_set_from_image(p_importance_map), 1); + RD::get_singleton()->compute_list_set_push_constant(compute_list, &ssao.importance_map_push_constant, sizeof(SSAOImportanceMapPushConstant)); + RD::get_singleton()->compute_list_dispatch(compute_list, x_groups, y_groups, 1); + RD::get_singleton()->compute_list_add_barrier(compute_list); + //process importance map A + RD::get_singleton()->compute_list_bind_compute_pipeline(compute_list, ssao.pipelines[SSAO_PROCESS_IMPORTANCE_MAPA]); + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_importance_map), 0); + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_uniform_set_from_image(p_importance_map_pong), 1); + RD::get_singleton()->compute_list_set_push_constant(compute_list, &ssao.importance_map_push_constant, sizeof(SSAOImportanceMapPushConstant)); + RD::get_singleton()->compute_list_dispatch(compute_list, x_groups, y_groups, 1); + RD::get_singleton()->compute_list_add_barrier(compute_list); + //process Importance Map B + RD::get_singleton()->compute_list_bind_compute_pipeline(compute_list, ssao.pipelines[SSAO_PROCESS_IMPORTANCE_MAPB]); + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_importance_map_pong), 0); + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_uniform_set_from_image(p_importance_map), 1); + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, ssao.counter_uniform_set, 2); + RD::get_singleton()->compute_list_set_push_constant(compute_list, &ssao.importance_map_push_constant, sizeof(SSAOImportanceMapPushConstant)); + RD::get_singleton()->compute_list_dispatch(compute_list, x_groups, y_groups, 1); + RD::get_singleton()->compute_list_add_barrier(compute_list); - RD::get_singleton()->compute_list_dispatch(compute_list, x_groups, y_groups, 1); + RD::get_singleton()->compute_list_bind_compute_pipeline(compute_list, ssao.pipelines[SSAO_GATHER_ADAPTIVE]); + } else { + RD::get_singleton()->compute_list_bind_compute_pipeline(compute_list, ssao.pipelines[SSAO_GATHER]); + } + + gather_ssao(compute_list, p_ao_slices, p_settings, false); } - if (p_half_size) { //must upscale - /* FOURTH PASS */ - // upscale if half size - //back to full size - ssao.blur_push_constant.screen_size[0] = p_depth_buffer_size.x; - ssao.blur_push_constant.screen_size[1] = p_depth_buffer_size.y; + // /* THIRD PASS */ + // // Blur + // + { + ssao.blur_push_constant.edge_sharpness = 1.0 - p_settings.sharpness; + ssao.blur_push_constant.half_screen_pixel_size[0] = 1.0 / p_settings.half_screen_size.x; + ssao.blur_push_constant.half_screen_pixel_size[1] = 1.0 / p_settings.half_screen_size.y; - RD::get_singleton()->compute_list_add_barrier(compute_list); - RD::get_singleton()->compute_list_bind_compute_pipeline(compute_list, ssao.pipelines[SSAO_BLUR_UPSCALE]); + int blur_passes = p_settings.quality > RS::ENV_SSAO_QUALITY_VERY_LOW ? p_settings.blur_passes : 1; - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_ao1), 0); - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_uniform_set_from_image(p_upscale_buffer), 3); - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_depth_buffer), 1); - RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_depth_mipmaps_texture), 2); + for (int pass = 0; pass < blur_passes; pass++) { + int blur_pipeline = SSAO_BLUR_PASS; + if (p_settings.quality > RS::ENV_SSAO_QUALITY_VERY_LOW) { + if (pass < blur_passes - 2) { + blur_pipeline = SSAO_BLUR_PASS_WIDE; + } + blur_pipeline = SSAO_BLUR_PASS_SMART; + } - RD::get_singleton()->compute_list_set_push_constant(compute_list, &ssao.blur_push_constant, sizeof(SSAOBlurPushConstant)); //not used but set anyway + for (int i = 0; i < 4; i++) { + if ((p_settings.quality == RS::ENV_SSAO_QUALITY_VERY_LOW) && ((i == 1) || (i == 2))) { + continue; + } - x_groups = (p_depth_buffer_size.x - 1) / 8 + 1; - y_groups = (p_depth_buffer_size.y - 1) / 8 + 1; + RD::get_singleton()->compute_list_bind_compute_pipeline(compute_list, ssao.pipelines[blur_pipeline]); + if (pass % 2 == 0) { + if (p_settings.quality == RS::ENV_SSAO_QUALITY_VERY_LOW) { + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_ao_slices[i]), 0); + } else { + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture_and_sampler(p_ao_slices[i], ssao.mirror_sampler), 0); + } + + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_uniform_set_from_image(p_ao_pong_slices[i]), 1); + } else { + if (p_settings.quality == RS::ENV_SSAO_QUALITY_VERY_LOW) { + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_ao_pong_slices[i]), 0); + } else { + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture_and_sampler(p_ao_pong_slices[i], ssao.mirror_sampler), 0); + } + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_uniform_set_from_image(p_ao_slices[i]), 1); + } + RD::get_singleton()->compute_list_set_push_constant(compute_list, &ssao.blur_push_constant, sizeof(SSAOBlurPushConstant)); + + int x_groups = ((p_settings.screen_size.x >> (p_settings.half_size ? 2 : 1)) - 1) / 8 + 1; + int y_groups = ((p_settings.screen_size.y >> (p_settings.half_size ? 2 : 1)) - 1) / 8 + 1; + + RD::get_singleton()->compute_list_dispatch(compute_list, x_groups, y_groups, 1); + } + + if (p_settings.quality > RS::ENV_SSAO_QUALITY_VERY_LOW) { + RD::get_singleton()->compute_list_add_barrier(compute_list); + } + } + } + + /* FOURTH PASS */ + // Interleave buffers + // back to full size + { + ssao.interleave_push_constant.inv_sharpness = 1.0 - p_settings.sharpness; + ssao.interleave_push_constant.pixel_size[0] = 1.0 / p_settings.screen_size.x; + ssao.interleave_push_constant.pixel_size[1] = 1.0 / p_settings.screen_size.y; + ssao.interleave_push_constant.size_modifier = uint32_t(p_settings.half_size ? 4 : 2); + + int interleave_pipeline = SSAO_INTERLEAVE_HALF; + if (p_settings.quality == RS::ENV_SSAO_QUALITY_LOW) { + interleave_pipeline = SSAO_INTERLEAVE; + } else if (p_settings.quality >= RS::ENV_SSAO_QUALITY_MEDIUM) { + interleave_pipeline = SSAO_INTERLEAVE_SMART; + } + + RD::get_singleton()->compute_list_bind_compute_pipeline(compute_list, ssao.pipelines[interleave_pipeline]); + + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_uniform_set_from_image(p_upscale_buffer), 0); + if (p_settings.quality > RS::ENV_SSAO_QUALITY_VERY_LOW && p_settings.blur_passes % 2 == 0) { + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_ao), 1); + } else { + RD::get_singleton()->compute_list_bind_uniform_set(compute_list, _get_compute_uniform_set_from_texture(p_ao_pong), 1); + } + + RD::get_singleton()->compute_list_set_push_constant(compute_list, &ssao.interleave_push_constant, sizeof(SSAOInterleavePushConstant)); + + int x_groups = (p_settings.screen_size.x - 1) / 8 + 1; + int y_groups = (p_settings.screen_size.y - 1) / 8 + 1; RD::get_singleton()->compute_list_dispatch(compute_list, x_groups, y_groups, 1); + RD::get_singleton()->compute_list_add_barrier(compute_list); } RD::get_singleton()->compute_list_end(); + + int zero[1] = { 0 }; + RD::get_singleton()->buffer_update(ssao.importance_map_load_counter, 0, sizeof(uint32_t), &zero, false); } void EffectsRD::roughness_limit(RID p_source_normal, RID p_roughness, const Size2i &p_size, float p_curve) { @@ -1486,57 +1702,142 @@ EffectsRD::EffectsRD() { { // Initialize ssao + + RD::SamplerState sampler; + sampler.mag_filter = RD::SAMPLER_FILTER_NEAREST; + sampler.min_filter = RD::SAMPLER_FILTER_NEAREST; + sampler.mip_filter = RD::SAMPLER_FILTER_NEAREST; + sampler.repeat_u = RD::SAMPLER_REPEAT_MODE_MIRRORED_REPEAT; + sampler.repeat_v = RD::SAMPLER_REPEAT_MODE_MIRRORED_REPEAT; + sampler.repeat_w = RD::SAMPLER_REPEAT_MODE_MIRRORED_REPEAT; + sampler.max_lod = 4; + + ssao.mirror_sampler = RD::get_singleton()->sampler_create(sampler); + uint32_t pipeline = 0; { Vector ssao_modes; - ssao_modes.push_back("\n#define MINIFY_START\n"); ssao_modes.push_back("\n"); + ssao_modes.push_back("\n#define USE_HALF_SIZE\n"); + ssao_modes.push_back("\n#define GENERATE_MIPS\n"); + ssao_modes.push_back("\n#define GENERATE_MIPS\n#define USE_HALF_SIZE"); + ssao_modes.push_back("\n#define USE_HALF_BUFFERS\n"); + ssao_modes.push_back("\n#define USE_HALF_BUFFERS\n#define USE_HALF_SIZE"); - ssao.minify_shader.initialize(ssao_modes); + ssao.downsample_shader.initialize(ssao_modes); - ssao.minify_shader_version = ssao.minify_shader.version_create(); + ssao.downsample_shader_version = ssao.downsample_shader.version_create(); - for (int i = 0; i <= SSAO_MINIFY_MIPMAP; i++) { - ssao.pipelines[pipeline] = RD::get_singleton()->compute_pipeline_create(ssao.minify_shader.version_get_shader(ssao.minify_shader_version, i)); + for (int i = 0; i <= SSAO_DOWNSAMPLE_HALF_RES_HALF; i++) { + ssao.pipelines[pipeline] = RD::get_singleton()->compute_pipeline_create(ssao.downsample_shader.version_get_shader(ssao.downsample_shader_version, i)); pipeline++; } } { Vector ssao_modes; - ssao_modes.push_back("\n#define SSAO_QUALITY_LOW\n"); + ssao_modes.push_back("\n"); - ssao_modes.push_back("\n#define SSAO_QUALITY_HIGH\n"); - ssao_modes.push_back("\n#define SSAO_QUALITY_ULTRA\n"); - ssao_modes.push_back("\n#define SSAO_QUALITY_LOW\n#define USE_HALF_SIZE\n"); - ssao_modes.push_back("\n#define USE_HALF_SIZE\n"); - ssao_modes.push_back("\n#define SSAO_QUALITY_HIGH\n#define USE_HALF_SIZE\n"); - ssao_modes.push_back("\n#define SSAO_QUALITY_ULTRA\n#define USE_HALF_SIZE\n"); + ssao_modes.push_back("\n#define SSAO_BASE\n"); + ssao_modes.push_back("\n#define ADAPTIVE\n"); ssao.gather_shader.initialize(ssao_modes); ssao.gather_shader_version = ssao.gather_shader.version_create(); - for (int i = SSAO_GATHER_LOW; i <= SSAO_GATHER_ULTRA_HALF; i++) { - ssao.pipelines[pipeline] = RD::get_singleton()->compute_pipeline_create(ssao.gather_shader.version_get_shader(ssao.gather_shader_version, i - SSAO_GATHER_LOW)); + for (int i = SSAO_GATHER; i <= SSAO_GATHER_ADAPTIVE; i++) { + ssao.pipelines[pipeline] = RD::get_singleton()->compute_pipeline_create(ssao.gather_shader.version_get_shader(ssao.gather_shader_version, i - SSAO_GATHER)); pipeline++; } + + ssao.gather_constants_buffer = RD::get_singleton()->uniform_buffer_create(sizeof(SSAOGatherConstants)); + SSAOGatherConstants gather_constants; + + const int sub_pass_count = 5; + for (int pass = 0; pass < 4; pass++) { + for (int subPass = 0; subPass < sub_pass_count; subPass++) { + int a = pass; + int b = subPass; + + int spmap[5]{ 0, 1, 4, 3, 2 }; + b = spmap[subPass]; + + float ca, sa; + float angle0 = (float(a) + float(b) / float(sub_pass_count)) * Math_PI * 0.5f; + + ca = Math::cos(angle0); + sa = Math::sin(angle0); + + float scale = 1.0f + (a - 1.5f + (b - (sub_pass_count - 1.0f) * 0.5f) / float(sub_pass_count)) * 0.07f; + + gather_constants.rotation_matrices[pass * 20 + subPass * 4 + 0] = scale * ca; + gather_constants.rotation_matrices[pass * 20 + subPass * 4 + 1] = scale * -sa; + gather_constants.rotation_matrices[pass * 20 + subPass * 4 + 2] = -scale * sa; + gather_constants.rotation_matrices[pass * 20 + subPass * 4 + 3] = -scale * ca; + } + } + + RD::get_singleton()->buffer_update(ssao.gather_constants_buffer, 0, sizeof(SSAOGatherConstants), &gather_constants, false); } { Vector ssao_modes; - ssao_modes.push_back("\n#define MODE_FULL_SIZE\n"); - ssao_modes.push_back("\n"); - ssao_modes.push_back("\n#define MODE_UPSCALE\n"); + ssao_modes.push_back("\n#define GENERATE_MAP\n"); + ssao_modes.push_back("\n#define PROCESS_MAPA\n"); + ssao_modes.push_back("\n#define PROCESS_MAPB\n"); + + ssao.importance_map_shader.initialize(ssao_modes); + + ssao.importance_map_shader_version = ssao.importance_map_shader.version_create(); + + for (int i = SSAO_GENERATE_IMPORTANCE_MAP; i <= SSAO_PROCESS_IMPORTANCE_MAPB; i++) { + ssao.pipelines[pipeline] = RD::get_singleton()->compute_pipeline_create(ssao.importance_map_shader.version_get_shader(ssao.importance_map_shader_version, i - SSAO_GENERATE_IMPORTANCE_MAP)); + + pipeline++; + } + ssao.importance_map_load_counter = RD::get_singleton()->storage_buffer_create(sizeof(uint32_t)); + int zero[1] = { 0 }; + RD::get_singleton()->buffer_update(ssao.importance_map_load_counter, 0, sizeof(uint32_t), &zero, false); + + Vector uniforms; + { + RD::Uniform u; + u.uniform_type = RD::UNIFORM_TYPE_STORAGE_BUFFER; + u.binding = 0; + u.ids.push_back(ssao.importance_map_load_counter); + uniforms.push_back(u); + } + ssao.counter_uniform_set = RD::get_singleton()->uniform_set_create(uniforms, ssao.importance_map_shader.version_get_shader(ssao.importance_map_shader_version, 2), 2); + } + { + Vector ssao_modes; + ssao_modes.push_back("\n#define MODE_NON_SMART\n"); + ssao_modes.push_back("\n#define MODE_SMART\n"); + ssao_modes.push_back("\n#define MODE_WIDE\n"); ssao.blur_shader.initialize(ssao_modes); ssao.blur_shader_version = ssao.blur_shader.version_create(); - for (int i = SSAO_BLUR_PASS; i <= SSAO_BLUR_UPSCALE; i++) { + for (int i = SSAO_BLUR_PASS; i <= SSAO_BLUR_PASS_WIDE; i++) { ssao.pipelines[pipeline] = RD::get_singleton()->compute_pipeline_create(ssao.blur_shader.version_get_shader(ssao.blur_shader_version, i - SSAO_BLUR_PASS)); pipeline++; } } + { + Vector ssao_modes; + ssao_modes.push_back("\n#define MODE_NON_SMART\n"); + ssao_modes.push_back("\n#define MODE_SMART\n"); + ssao_modes.push_back("\n#define MODE_HALF\n"); + + ssao.interleave_shader.initialize(ssao_modes); + + ssao.interleave_shader_version = ssao.interleave_shader.version_create(); + for (int i = SSAO_INTERLEAVE; i <= SSAO_INTERLEAVE_HALF; i++) { + ssao.pipelines[pipeline] = RD::get_singleton()->compute_pipeline_create(ssao.interleave_shader.version_get_shader(ssao.interleave_shader_version, i - SSAO_INTERLEAVE)); + + pipeline++; + } + } ERR_FAIL_COND(pipeline != SSAO_MAX); } @@ -1777,6 +2078,10 @@ EffectsRD::~EffectsRD() { RD::get_singleton()->free(index_buffer); //array gets freed as dependency RD::get_singleton()->free(filter.coefficient_buffer); + RD::get_singleton()->free(ssao.mirror_sampler); + RD::get_singleton()->free(ssao.gather_constants_buffer); + RD::get_singleton()->free(ssao.importance_map_load_counter); + bokeh.shader.version_free(bokeh.shader_version); copy.shader.version_free(copy.shader_version); copy_to_fb.shader.version_free(copy_to_fb.shader_version); @@ -1791,7 +2096,9 @@ EffectsRD::~EffectsRD() { specular_merge.shader.version_free(specular_merge.shader_version); ssao.blur_shader.version_free(ssao.blur_shader_version); ssao.gather_shader.version_free(ssao.gather_shader_version); - ssao.minify_shader.version_free(ssao.minify_shader_version); + ssao.downsample_shader.version_free(ssao.downsample_shader_version); + ssao.interleave_shader.version_free(ssao.interleave_shader_version); + ssao.importance_map_shader.version_free(ssao.importance_map_shader_version); ssr.shader.version_free(ssr.shader_version); ssr_filter.shader.version_free(ssr_filter.shader_version); ssr_scale.shader.version_free(ssr_scale.shader_version); diff --git a/servers/rendering/renderer_rd/effects_rd.h b/servers/rendering/renderer_rd/effects_rd.h index 3afc111b9de..8731466dea8 100644 --- a/servers/rendering/renderer_rd/effects_rd.h +++ b/servers/rendering/renderer_rd/effects_rd.h @@ -51,7 +51,9 @@ #include "servers/rendering/renderer_rd/shaders/specular_merge.glsl.gen.h" #include "servers/rendering/renderer_rd/shaders/ssao.glsl.gen.h" #include "servers/rendering/renderer_rd/shaders/ssao_blur.glsl.gen.h" -#include "servers/rendering/renderer_rd/shaders/ssao_minify.glsl.gen.h" +#include "servers/rendering/renderer_rd/shaders/ssao_downsample.glsl.gen.h" +#include "servers/rendering/renderer_rd/shaders/ssao_importance_map.glsl.gen.h" +#include "servers/rendering/renderer_rd/shaders/ssao_interleave.glsl.gen.h" #include "servers/rendering/renderer_rd/shaders/subsurface_scattering.glsl.gen.h" #include "servers/rendering/renderer_rd/shaders/tonemap.glsl.gen.h" @@ -288,71 +290,121 @@ class EffectsRD { } bokeh; enum SSAOMode { - SSAO_MINIFY_FIRST, - SSAO_MINIFY_MIPMAP, - SSAO_GATHER_LOW, - SSAO_GATHER_MEDIUM, - SSAO_GATHER_HIGH, - SSAO_GATHER_ULTRA, - SSAO_GATHER_LOW_HALF, - SSAO_GATHER_MEDIUM_HALF, - SSAO_GATHER_HIGH_HALF, - SSAO_GATHER_ULTRA_HALF, + SSAO_DOWNSAMPLE, + SSAO_DOWNSAMPLE_HALF_RES, + SSAO_DOWNSAMPLE_MIPMAP, + SSAO_DOWNSAMPLE_MIPMAP_HALF_RES, + SSAO_DOWNSAMPLE_HALF, + SSAO_DOWNSAMPLE_HALF_RES_HALF, + SSAO_GATHER, + SSAO_GATHER_BASE, + SSAO_GATHER_ADAPTIVE, + SSAO_GENERATE_IMPORTANCE_MAP, + SSAO_PROCESS_IMPORTANCE_MAPA, + SSAO_PROCESS_IMPORTANCE_MAPB, SSAO_BLUR_PASS, - SSAO_BLUR_PASS_HALF, - SSAO_BLUR_UPSCALE, + SSAO_BLUR_PASS_SMART, + SSAO_BLUR_PASS_WIDE, + SSAO_INTERLEAVE, + SSAO_INTERLEAVE_SMART, + SSAO_INTERLEAVE_HALF, SSAO_MAX }; - struct SSAOMinifyPushConstant { + struct SSAODownsamplePushConstant { float pixel_size[2]; float z_far; float z_near; - int32_t source_size[2]; uint32_t orthogonal; - uint32_t pad; + float radius_sq; + uint32_t pad[2]; }; struct SSAOGatherPushConstant { int32_t screen_size[2]; - float z_far; - float z_near; + int pass; + int quality; + + float half_screen_pixel_size[2]; + int size_multiplier; + float detail_intensity; + + float NDC_to_view_mul[2]; + float NDC_to_view_add[2]; + + float pad[2]; + float half_screen_pixel_size_x025[2]; - uint32_t orthogonal; - float intensity_div_r6; float radius; - float bias; + float intensity; + float shadow_power; + float shadow_clamp; - float proj_info[4]; - float pixel_size[2]; - float proj_scale; - uint32_t pad; + float fade_out_mul; + float fade_out_add; + float horizon_angle_threshold; + float inv_radius_near_limit; + + bool is_orthogonal; + float neg_inv_radius; + float load_counter_avg_div; + float adaptive_sample_limit; + + int32_t pass_coord_offset[2]; + float pass_uv_offset[2]; + }; + + struct SSAOGatherConstants { + float rotation_matrices[80]; //5 vec4s * 4 + }; + + struct SSAOImportanceMapPushConstant { + float half_screen_pixel_size[2]; + float intensity; + float power; }; struct SSAOBlurPushConstant { float edge_sharpness; - int32_t filter_scale; - float z_far; - float z_near; - uint32_t orthogonal; - uint32_t pad[3]; - int32_t axis[2]; - int32_t screen_size[2]; + float pad; + float half_screen_pixel_size[2]; + }; + + struct SSAOInterleavePushConstant { + float inv_sharpness; + uint32_t size_modifier; + float pixel_size[2]; }; struct SSAO { - SSAOMinifyPushConstant minify_push_constant; - SsaoMinifyShaderRD minify_shader; - RID minify_shader_version; + SSAODownsamplePushConstant downsample_push_constant; + SsaoDownsampleShaderRD downsample_shader; + RID downsample_shader_version; + RID downsample_uniform_set; SSAOGatherPushConstant gather_push_constant; SsaoShaderRD gather_shader; RID gather_shader_version; + RID gather_uniform_set; + RID gather_constants_buffer; + bool gather_initialized = false; + + SSAOImportanceMapPushConstant importance_map_push_constant; + SsaoImportanceMapShaderRD importance_map_shader; + RID importance_map_shader_version; + RID importance_map_load_counter; + RID importance_map_uniform_set; + RID counter_uniform_set; SSAOBlurPushConstant blur_push_constant; SsaoBlurShaderRD blur_shader; RID blur_shader_version; + SSAOInterleavePushConstant interleave_push_constant; + SsaoInterleaveShaderRD interleave_shader; + RID interleave_shader_version; + + RID mirror_sampler; RID pipelines[SSAO_MAX]; } ssao; @@ -598,13 +650,27 @@ class EffectsRD { } }; + struct TextureSamplerPair { + RID texture; + RID sampler; + _FORCE_INLINE_ bool operator<(const TextureSamplerPair &p_pair) const { + if (texture == p_pair.texture) { + return sampler < p_pair.sampler; + } else { + return texture < p_pair.texture; + } + } + }; + Map texture_to_compute_uniform_set_cache; Map texture_pair_to_compute_uniform_set_cache; Map image_pair_to_compute_uniform_set_cache; + Map texture_sampler_to_compute_uniform_set_cache; RID _get_uniform_set_from_image(RID p_texture); RID _get_uniform_set_from_texture(RID p_texture, bool p_use_mipmaps = false); RID _get_compute_uniform_set_from_texture(RID p_texture, bool p_use_mipmaps = false); + RID _get_compute_uniform_set_from_texture_and_sampler(RID p_texture, RID p_sampler); RID _get_compute_uniform_set_from_texture_pair(RID p_texture, RID p_texture2, bool p_use_mipmaps = false); RID _get_compute_uniform_set_from_image_pair(RID p_texture, RID p_texture2); @@ -664,9 +730,30 @@ public: Vector2i texture_size; }; + struct SSAOSettings { + float radius = 1.0; + float intensity = 2.0; + float power = 1.5; + float detail = 0.5; + float horizon = 0.06; + float sharpness = 0.98; + + RS::EnvironmentSSAOQuality quality = RS::ENV_SSAO_QUALITY_MEDIUM; + bool half_size = false; + float adaptive_target = 0.5; + int blur_passes = 2; + float fadeout_from = 50.0; + float fadeout_to = 300.0; + + Size2i screen_size = Size2i(); + Size2i half_screen_size = Size2i(); + Size2i quarter_size = Size2i(); + }; + void tonemapper(RID p_source_color, RID p_dst_framebuffer, const TonemapSettings &p_settings); - void generate_ssao(RID p_depth_buffer, RID p_normal_buffer, const Size2i &p_depth_buffer_size, RID p_depth_mipmaps_texture, const Vector &depth_mipmaps, RID p_ao1, bool p_half_size, RID p_ao2, RID p_upscale_buffer, float p_intensity, float p_radius, float p_bias, const CameraMatrix &p_projection, RS::EnvironmentSSAOQuality p_quality, RS::EnvironmentSSAOBlur p_blur, float p_edge_sharpness); + void gather_ssao(RD::ComputeListID p_compute_list, const Vector p_ao_slices, const SSAOSettings &p_settings, bool p_adaptive_base_pass); + void generate_ssao(RID p_depth_buffer, RID p_normal_buffer, RID p_depth_mipmaps_texture, const Vector &depth_mipmaps, RID p_ao, const Vector p_ao_slices, RID p_ao_pong, const Vector p_ao_pong_slices, RID p_upscale_buffer, RID p_importance_map, RID p_importance_map_pong, const CameraMatrix &p_projection, const SSAOSettings &p_settings, bool p_invalidate_uniform_sets); void roughness_limit(RID p_source_normal, RID p_roughness, const Size2i &p_size, float p_curve); void cubemap_downsample(RID p_source_cubemap, RID p_dest_cubemap, const Size2i &p_size); diff --git a/servers/rendering/renderer_rd/renderer_scene_render_forward.cpp b/servers/rendering/renderer_rd/renderer_scene_render_forward.cpp index 93f4b7bc8e0..01dd2965c48 100644 --- a/servers/rendering/renderer_rd/renderer_scene_render_forward.cpp +++ b/servers/rendering/renderer_rd/renderer_scene_render_forward.cpp @@ -766,7 +766,7 @@ void RendererSceneRenderForward::_allocate_normal_roughness_texture(RenderBuffer tf.format = RD::DATA_FORMAT_R8G8B8A8_UNORM; tf.width = rb->width; tf.height = rb->height; - tf.usage_bits = RD::TEXTURE_USAGE_SAMPLING_BIT; + tf.usage_bits = RD::TEXTURE_USAGE_SAMPLING_BIT | RD::TEXTURE_USAGE_STORAGE_BIT; if (rb->msaa != RS::VIEWPORT_MSAA_DISABLED) { tf.usage_bits |= RD::TEXTURE_USAGE_CAN_COPY_TO_BIT | RD::TEXTURE_USAGE_STORAGE_BIT; @@ -782,7 +782,7 @@ void RendererSceneRenderForward::_allocate_normal_roughness_texture(RenderBuffer fb.push_back(rb->normal_roughness_buffer); rb->depth_normal_roughness_fb = RD::get_singleton()->framebuffer_create(fb); } else { - tf.usage_bits = RD::TEXTURE_USAGE_COLOR_ATTACHMENT_BIT | RD::TEXTURE_USAGE_CAN_COPY_FROM_BIT | RD::TEXTURE_USAGE_SAMPLING_BIT; + tf.usage_bits = RD::TEXTURE_USAGE_COLOR_ATTACHMENT_BIT | RD::TEXTURE_USAGE_CAN_COPY_FROM_BIT | RD::TEXTURE_USAGE_SAMPLING_BIT | RD::TEXTURE_USAGE_STORAGE_BIT; tf.samples = rb->texture_samples; rb->normal_roughness_buffer_msaa = RD::get_singleton()->texture_create(tf, RD::TextureView()); diff --git a/servers/rendering/renderer_rd/renderer_scene_render_rd.cpp b/servers/rendering/renderer_rd/renderer_scene_render_rd.cpp index 4543ced8dce..1d9983a28c0 100644 --- a/servers/rendering/renderer_rd/renderer_scene_render_rd.cpp +++ b/servers/rendering/renderer_rd/renderer_scene_render_rd.cpp @@ -3122,7 +3122,7 @@ RS::EnvironmentSSRRoughnessQuality RendererSceneRenderRD::environment_get_ssr_ro return ssr_roughness_quality; } -void RendererSceneRenderRD::environment_set_ssao(RID p_env, bool p_enable, float p_radius, float p_intensity, float p_bias, float p_light_affect, float p_ao_channel_affect, RS::EnvironmentSSAOBlur p_blur, float p_bilateral_sharpness) { +void RendererSceneRenderRD::environment_set_ssao(RID p_env, bool p_enable, float p_radius, float p_intensity, float p_power, float p_detail, float p_horizon, float p_sharpness, float p_light_affect, float p_ao_channel_affect) { Environment *env = environment_owner.getornull(p_env); ERR_FAIL_COND(!env); @@ -3133,15 +3133,21 @@ void RendererSceneRenderRD::environment_set_ssao(RID p_env, bool p_enable, float env->ssao_enabled = p_enable; env->ssao_radius = p_radius; env->ssao_intensity = p_intensity; - env->ssao_bias = p_bias; + env->ssao_power = p_power; + env->ssao_detail = p_detail; + env->ssao_horizon = p_horizon; + env->ssao_sharpness = p_sharpness; env->ssao_direct_light_affect = p_light_affect; env->ssao_ao_channel_affect = p_ao_channel_affect; - env->ssao_blur = p_blur; } -void RendererSceneRenderRD::environment_set_ssao_quality(RS::EnvironmentSSAOQuality p_quality, bool p_half_size) { +void RendererSceneRenderRD::environment_set_ssao_quality(RS::EnvironmentSSAOQuality p_quality, bool p_half_size, float p_adaptive_target, int p_blur_passes, float p_fadeout_from, float p_fadeout_to) { ssao_quality = p_quality; ssao_half_size = p_half_size; + ssao_adaptive_target = p_adaptive_target; + ssao_blur_passes = p_blur_passes; + ssao_fadeout_from = p_fadeout_from; + ssao_fadeout_to = p_fadeout_to; } bool RendererSceneRenderRD::environment_is_ssao_enabled(RID p_env) const { @@ -5081,21 +5087,24 @@ void RendererSceneRenderRD::_free_render_buffer_data(RenderBuffers *rb) { rb->luminance.current = RID(); } - if (rb->ssao.ao[0].is_valid()) { + if (rb->ssao.depth.is_valid()) { RD::get_singleton()->free(rb->ssao.depth); - RD::get_singleton()->free(rb->ssao.ao[0]); - if (rb->ssao.ao[1].is_valid()) { - RD::get_singleton()->free(rb->ssao.ao[1]); - } - if (rb->ssao.ao_full.is_valid()) { - RD::get_singleton()->free(rb->ssao.ao_full); - } + RD::get_singleton()->free(rb->ssao.ao_deinterleaved); + RD::get_singleton()->free(rb->ssao.ao_pong); + RD::get_singleton()->free(rb->ssao.ao_final); + + RD::get_singleton()->free(rb->ssao.importance_map[0]); + RD::get_singleton()->free(rb->ssao.importance_map[1]); rb->ssao.depth = RID(); - rb->ssao.ao[0] = RID(); - rb->ssao.ao[1] = RID(); - rb->ssao.ao_full = RID(); + rb->ssao.ao_deinterleaved = RID(); + rb->ssao.ao_pong = RID(); + rb->ssao.ao_final = RID(); + rb->ssao.importance_map[0] = RID(); + rb->ssao.importance_map[1] = RID(); rb->ssao.depth_slices.clear(); + rb->ssao.ao_deinterleaved_slices.clear(); + rb->ssao.ao_pong_slices.clear(); } if (rb->ssr.blur_radius[0].is_valid()) { @@ -5194,64 +5203,133 @@ void RendererSceneRenderRD::_process_ssao(RID p_render_buffers, RID p_environmen RENDER_TIMESTAMP("Process SSAO"); - if (rb->ssao.ao[0].is_valid() && rb->ssao.ao_full.is_valid() != ssao_half_size) { + //TODO clear when settings chenge to or from ultra + if (rb->ssao.ao_final.is_valid() && ssao_using_half_size != ssao_half_size) { RD::get_singleton()->free(rb->ssao.depth); - RD::get_singleton()->free(rb->ssao.ao[0]); - if (rb->ssao.ao[1].is_valid()) { - RD::get_singleton()->free(rb->ssao.ao[1]); - } - if (rb->ssao.ao_full.is_valid()) { - RD::get_singleton()->free(rb->ssao.ao_full); - } + RD::get_singleton()->free(rb->ssao.ao_deinterleaved); + RD::get_singleton()->free(rb->ssao.ao_pong); + RD::get_singleton()->free(rb->ssao.ao_final); + + RD::get_singleton()->free(rb->ssao.importance_map[0]); + RD::get_singleton()->free(rb->ssao.importance_map[1]); rb->ssao.depth = RID(); - rb->ssao.ao[0] = RID(); - rb->ssao.ao[1] = RID(); - rb->ssao.ao_full = RID(); + rb->ssao.ao_deinterleaved = RID(); + rb->ssao.ao_pong = RID(); + rb->ssao.ao_final = RID(); + rb->ssao.importance_map[0] = RID(); + rb->ssao.importance_map[1] = RID(); rb->ssao.depth_slices.clear(); + rb->ssao.ao_deinterleaved_slices.clear(); + rb->ssao.ao_pong_slices.clear(); } - if (!rb->ssao.ao[0].is_valid()) { + int buffer_width; + int buffer_height; + int half_width; + int half_height; + if (ssao_half_size) { + buffer_width = (rb->width + 3) / 4; + buffer_height = (rb->height + 3) / 4; + half_width = (rb->width + 7) / 8; + half_height = (rb->height + 7) / 8; + } else { + buffer_width = (rb->width + 1) / 2; + buffer_height = (rb->height + 1) / 2; + half_width = (rb->width + 3) / 4; + half_height = (rb->height + 3) / 4; + } + bool uniform_sets_are_invalid = false; + if (rb->ssao.depth.is_null()) { //allocate depth slices { RD::TextureFormat tf; - tf.format = RD::DATA_FORMAT_R32_SFLOAT; - tf.width = rb->width / 2; - tf.height = rb->height / 2; - tf.mipmaps = Image::get_image_required_mipmaps(tf.width, tf.height, Image::FORMAT_RF) + 1; + tf.format = RD::DATA_FORMAT_R16_SFLOAT; + tf.texture_type = RD::TEXTURE_TYPE_2D_ARRAY; + tf.width = buffer_width; + tf.height = buffer_height; + tf.mipmaps = 4; + tf.array_layers = 4; tf.usage_bits = RD::TEXTURE_USAGE_SAMPLING_BIT | RD::TEXTURE_USAGE_STORAGE_BIT; rb->ssao.depth = RD::get_singleton()->texture_create(tf, RD::TextureView()); for (uint32_t i = 0; i < tf.mipmaps; i++) { - RID slice = RD::get_singleton()->texture_create_shared_from_slice(RD::TextureView(), rb->ssao.depth, 0, i); + RID slice = RD::get_singleton()->texture_create_shared_from_slice(RD::TextureView(), rb->ssao.depth, 0, i, RD::TEXTURE_SLICE_2D_ARRAY); rb->ssao.depth_slices.push_back(slice); } } { RD::TextureFormat tf; - tf.format = RD::DATA_FORMAT_R8_UNORM; - tf.width = ssao_half_size ? rb->width / 2 : rb->width; - tf.height = ssao_half_size ? rb->height / 2 : rb->height; + tf.format = RD::DATA_FORMAT_R8G8_UNORM; + tf.texture_type = RD::TEXTURE_TYPE_2D_ARRAY; + tf.width = buffer_width; + tf.height = buffer_height; + tf.array_layers = 4; tf.usage_bits = RD::TEXTURE_USAGE_SAMPLING_BIT | RD::TEXTURE_USAGE_STORAGE_BIT; - rb->ssao.ao[0] = RD::get_singleton()->texture_create(tf, RD::TextureView()); - rb->ssao.ao[1] = RD::get_singleton()->texture_create(tf, RD::TextureView()); + rb->ssao.ao_deinterleaved = RD::get_singleton()->texture_create(tf, RD::TextureView()); + for (uint32_t i = 0; i < 4; i++) { + RID slice = RD::get_singleton()->texture_create_shared_from_slice(RD::TextureView(), rb->ssao.ao_deinterleaved, i, 0); + rb->ssao.ao_deinterleaved_slices.push_back(slice); + } } - if (ssao_half_size) { - //upsample texture + { + RD::TextureFormat tf; + tf.format = RD::DATA_FORMAT_R8G8_UNORM; + tf.texture_type = RD::TEXTURE_TYPE_2D_ARRAY; + tf.width = buffer_width; + tf.height = buffer_height; + tf.array_layers = 4; + tf.usage_bits = RD::TEXTURE_USAGE_SAMPLING_BIT | RD::TEXTURE_USAGE_STORAGE_BIT; + rb->ssao.ao_pong = RD::get_singleton()->texture_create(tf, RD::TextureView()); + for (uint32_t i = 0; i < 4; i++) { + RID slice = RD::get_singleton()->texture_create_shared_from_slice(RD::TextureView(), rb->ssao.ao_pong, i, 0); + rb->ssao.ao_pong_slices.push_back(slice); + } + } + + { + RD::TextureFormat tf; + tf.format = RD::DATA_FORMAT_R8_UNORM; + tf.width = half_width; + tf.height = half_height; + tf.usage_bits = RD::TEXTURE_USAGE_SAMPLING_BIT | RD::TEXTURE_USAGE_STORAGE_BIT; + rb->ssao.importance_map[0] = RD::get_singleton()->texture_create(tf, RD::TextureView()); + rb->ssao.importance_map[1] = RD::get_singleton()->texture_create(tf, RD::TextureView()); + } + { RD::TextureFormat tf; tf.format = RD::DATA_FORMAT_R8_UNORM; tf.width = rb->width; tf.height = rb->height; tf.usage_bits = RD::TEXTURE_USAGE_SAMPLING_BIT | RD::TEXTURE_USAGE_STORAGE_BIT; - rb->ssao.ao_full = RD::get_singleton()->texture_create(tf, RD::TextureView()); + rb->ssao.ao_final = RD::get_singleton()->texture_create(tf, RD::TextureView()); + _render_buffers_uniform_set_changed(p_render_buffers); } - - _render_buffers_uniform_set_changed(p_render_buffers); + ssao_using_half_size = ssao_half_size; + uniform_sets_are_invalid = true; } - storage->get_effects()->generate_ssao(rb->depth_texture, p_normal_buffer, Size2i(rb->width, rb->height), rb->ssao.depth, rb->ssao.depth_slices, rb->ssao.ao[0], rb->ssao.ao_full.is_valid(), rb->ssao.ao[1], rb->ssao.ao_full, env->ssao_intensity, env->ssao_radius, env->ssao_bias, p_projection, ssao_quality, env->ssao_blur, env->ssao_blur_edge_sharpness); + EffectsRD::SSAOSettings settings; + settings.radius = env->ssao_radius; + settings.intensity = env->ssao_intensity; + settings.power = env->ssao_power; + settings.detail = env->ssao_detail; + settings.horizon = env->ssao_horizon; + settings.sharpness = env->ssao_sharpness; + + settings.quality = ssao_quality; + settings.half_size = ssao_half_size; + settings.adaptive_target = ssao_adaptive_target; + settings.blur_passes = ssao_blur_passes; + settings.fadeout_from = ssao_fadeout_from; + settings.fadeout_to = ssao_fadeout_to; + settings.screen_size = Size2i(rb->width, rb->height); + settings.half_screen_size = Size2i(buffer_width, buffer_height); + settings.quarter_size = Size2i(half_width, half_height); + + storage->get_effects()->generate_ssao(rb->depth_texture, p_normal_buffer, rb->ssao.depth, rb->ssao.depth_slices, rb->ssao.ao_deinterleaved, rb->ssao.ao_deinterleaved_slices, rb->ssao.ao_pong, rb->ssao.ao_pong_slices, rb->ssao.ao_final, rb->ssao.importance_map[0], rb->ssao.importance_map[1], p_projection, settings, uniform_sets_are_invalid); } void RendererSceneRenderRD::_render_buffers_post_process_and_tonemap(RID p_render_buffers, RID p_environment, RID p_camera_effects, const CameraMatrix &p_projection) { @@ -5431,9 +5509,9 @@ void RendererSceneRenderRD::_render_buffers_debug_draw(RID p_render_buffers, RID } } - if (debug_draw == RS::VIEWPORT_DEBUG_DRAW_SSAO && rb->ssao.ao[0].is_valid()) { + if (debug_draw == RS::VIEWPORT_DEBUG_DRAW_SSAO && rb->ssao.ao_final.is_valid()) { Size2 rtsize = storage->render_target_get_size(rb->render_target); - RID ao_buf = rb->ssao.ao_full.is_valid() ? rb->ssao.ao_full : rb->ssao.ao[0]; + RID ao_buf = rb->ssao.ao_final; effects->copy_to_fb_rect(ao_buf, storage->render_target_get_rd_framebuffer(rb->render_target), Rect2(Vector2(), rtsize), false, true); } @@ -5621,7 +5699,7 @@ RID RendererSceneRenderRD::render_buffers_get_ao_texture(RID p_render_buffers) { RenderBuffers *rb = render_buffers_owner.getornull(p_render_buffers); ERR_FAIL_COND_V(!rb, RID()); - return rb->ssao.ao_full.is_valid() ? rb->ssao.ao_full : rb->ssao.ao[0]; + return rb->ssao.ao_final; } RID RendererSceneRenderRD::render_buffers_get_gi_probe_buffer(RID p_render_buffers) { @@ -8435,7 +8513,7 @@ RendererSceneRenderRD::RendererSceneRenderRD(RendererStorageRD *p_storage) { camera_effects_set_dof_blur_bokeh_shape(RS::DOFBokehShape(int(GLOBAL_GET("rendering/quality/depth_of_field/depth_of_field_bokeh_shape")))); camera_effects_set_dof_blur_quality(RS::DOFBlurQuality(int(GLOBAL_GET("rendering/quality/depth_of_field/depth_of_field_bokeh_quality"))), GLOBAL_GET("rendering/quality/depth_of_field/depth_of_field_use_jitter")); - environment_set_ssao_quality(RS::EnvironmentSSAOQuality(int(GLOBAL_GET("rendering/quality/ssao/quality"))), GLOBAL_GET("rendering/quality/ssao/half_size")); + environment_set_ssao_quality(RS::EnvironmentSSAOQuality(int(GLOBAL_GET("rendering/quality/ssao/quality"))), GLOBAL_GET("rendering/quality/ssao/half_size"), GLOBAL_GET("rendering/quality/ssao/adaptive_target"), GLOBAL_GET("rendering/quality/ssao/blur_passes"), GLOBAL_GET("rendering/quality/ssao/fadeout_from"), GLOBAL_GET("rendering/quality/ssao/fadeout_to")); screen_space_roughness_limiter = GLOBAL_GET("rendering/quality/screen_filters/screen_space_roughness_limiter_enabled"); screen_space_roughness_limiter_amount = GLOBAL_GET("rendering/quality/screen_filters/screen_space_roughness_limiter_amount"); screen_space_roughness_limiter_limit = GLOBAL_GET("rendering/quality/screen_filters/screen_space_roughness_limiter_limit"); diff --git a/servers/rendering/renderer_rd/renderer_scene_render_rd.h b/servers/rendering/renderer_rd/renderer_scene_render_rd.h index 3afe6e3f4ae..1b0ae61042c 100644 --- a/servers/rendering/renderer_rd/renderer_scene_render_rd.h +++ b/servers/rendering/renderer_rd/renderer_scene_render_rd.h @@ -737,13 +737,14 @@ private: /// SSAO bool ssao_enabled = false; - float ssao_radius = 1; - float ssao_intensity = 1; - float ssao_bias = 0.01; + float ssao_radius = 1.0; + float ssao_intensity = 2.0; + float ssao_power = 1.5; + float ssao_detail = 0.5; + float ssao_horizon = 0.06; + float ssao_sharpness = 0.98; float ssao_direct_light_affect = 0.0; float ssao_ao_channel_affect = 0.0; - float ssao_blur_edge_sharpness = 4.0; - RS::EnvironmentSSAOBlur ssao_blur = RS::ENV_SSAO_BLUR_3x3; /// SSR /// @@ -777,6 +778,12 @@ private: RS::EnvironmentSSAOQuality ssao_quality = RS::ENV_SSAO_QUALITY_MEDIUM; bool ssao_half_size = false; + bool ssao_using_half_size = false; + float ssao_adaptive_target = 0.5; + int ssao_blur_passes = 2; + float ssao_fadeout_from = 50.0; + float ssao_fadeout_to = 300.0; + bool glow_bicubic_upscale = false; bool glow_high_quality = false; RS::EnvironmentSSRRoughnessQuality ssr_roughness_quality = RS::ENV_SSR_ROUGNESS_QUALITY_LOW; @@ -861,8 +868,12 @@ private: struct SSAO { RID depth; Vector depth_slices; - RID ao[2]; - RID ao_full; //when using half-size + RID ao_deinterleaved; + Vector ao_deinterleaved_slices; + RID ao_pong; + Vector ao_pong_slices; + RID ao_final; + RID importance_map[2]; } ssao; struct SSR { @@ -1567,8 +1578,8 @@ public: virtual void environment_set_volumetric_fog_positional_shadow_shrink_size(int p_shrink_size); void environment_set_ssr(RID p_env, bool p_enable, int p_max_steps, float p_fade_int, float p_fade_out, float p_depth_tolerance); - void environment_set_ssao(RID p_env, bool p_enable, float p_radius, float p_intensity, float p_bias, float p_light_affect, float p_ao_channel_affect, RS::EnvironmentSSAOBlur p_blur, float p_bilateral_sharpness); - void environment_set_ssao_quality(RS::EnvironmentSSAOQuality p_quality, bool p_half_size); + void environment_set_ssao(RID p_env, bool p_enable, float p_radius, float p_intensity, float p_power, float p_detail, float p_horizon, float p_sharpness, float p_light_affect, float p_ao_channel_affect); + void environment_set_ssao_quality(RS::EnvironmentSSAOQuality p_quality, bool p_half_size, float p_adaptive_target, int p_blur_passes, float p_fadeout_from, float p_fadeout_to); bool environment_is_ssao_enabled(RID p_env) const; float environment_get_ssao_ao_affect(RID p_env) const; float environment_get_ssao_light_affect(RID p_env) const; diff --git a/servers/rendering/renderer_rd/shaders/SCsub b/servers/rendering/renderer_rd/shaders/SCsub index cb62882debb..deaa9668df8 100644 --- a/servers/rendering/renderer_rd/shaders/SCsub +++ b/servers/rendering/renderer_rd/shaders/SCsub @@ -21,8 +21,10 @@ if "RD_GLSL" in env["BUILDERS"]: env.RD_GLSL("luminance_reduce.glsl") env.RD_GLSL("bokeh_dof.glsl") env.RD_GLSL("ssao.glsl") - env.RD_GLSL("ssao_minify.glsl") + env.RD_GLSL("ssao_downsample.glsl") + env.RD_GLSL("ssao_importance_map.glsl") env.RD_GLSL("ssao_blur.glsl") + env.RD_GLSL("ssao_interleave.glsl") env.RD_GLSL("roughness_limiter.glsl") env.RD_GLSL("screen_space_reflection.glsl") env.RD_GLSL("screen_space_reflection_filter.glsl") diff --git a/servers/rendering/renderer_rd/shaders/ssao.glsl b/servers/rendering/renderer_rd/shaders/ssao.glsl index 346338181a7..f67965ab499 100644 --- a/servers/rendering/renderer_rd/shaders/ssao.glsl +++ b/servers/rendering/renderer_rd/shaders/ssao.glsl @@ -1,249 +1,491 @@ +/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// +// Copyright (c) 2016, Intel Corporation +// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated +// documentation files (the "Software"), to deal in the Software without restriction, including without limitation +// the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +// permit persons to whom the Software is furnished to do so, subject to the following conditions: +// The above copyright notice and this permission notice shall be included in all copies or substantial portions of +// the Software. +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO +// THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +// TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. +/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// +// File changes (yyyy-mm-dd) +// 2016-09-07: filip.strugar@intel.com: first commit +// 2020-12-05: clayjohn: convert to Vulkan and Godot +/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// + #[compute] #version 450 VERSION_DEFINES +#define SSAO_ADAPTIVE_TAP_BASE_COUNT 5 + +#define INTELSSAO_MAIN_DISK_SAMPLE_COUNT (32) +const vec4 sample_pattern[INTELSSAO_MAIN_DISK_SAMPLE_COUNT] = { + vec4(0.78488064, 0.56661671, 1.500000, -0.126083), vec4(0.26022232, -0.29575172, 1.500000, -1.064030), vec4(0.10459357, 0.08372527, 1.110000, -2.730563), vec4(-0.68286800, 0.04963045, 1.090000, -0.498827), + vec4(-0.13570161, -0.64190155, 1.250000, -0.532765), vec4(-0.26193795, -0.08205118, 0.670000, -1.783245), vec4(-0.61177456, 0.66664219, 0.710000, -0.044234), vec4(0.43675563, 0.25119025, 0.610000, -1.167283), + vec4(0.07884444, 0.86618668, 0.640000, -0.459002), vec4(-0.12790935, -0.29869005, 0.600000, -1.729424), vec4(-0.04031125, 0.02413622, 0.600000, -4.792042), vec4(0.16201244, -0.52851415, 0.790000, -1.067055), + vec4(-0.70991218, 0.47301072, 0.640000, -0.335236), vec4(0.03277707, -0.22349690, 0.600000, -1.982384), vec4(0.68921727, 0.36800742, 0.630000, -0.266718), vec4(0.29251814, 0.37775412, 0.610000, -1.422520), + vec4(-0.12224089, 0.96582592, 0.600000, -0.426142), vec4(0.11071457, -0.16131058, 0.600000, -2.165947), vec4(0.46562141, -0.59747696, 0.600000, -0.189760), vec4(-0.51548797, 0.11804193, 0.600000, -1.246800), + vec4(0.89141309, -0.42090443, 0.600000, 0.028192), vec4(-0.32402530, -0.01591529, 0.600000, -1.543018), vec4(0.60771245, 0.41635221, 0.600000, -0.605411), vec4(0.02379565, -0.08239821, 0.600000, -3.809046), + vec4(0.48951152, -0.23657045, 0.600000, -1.189011), vec4(-0.17611565, -0.81696892, 0.600000, -0.513724), vec4(-0.33930185, -0.20732205, 0.600000, -1.698047), vec4(-0.91974425, 0.05403209, 0.600000, 0.062246), + vec4(-0.15064627, -0.14949332, 0.600000, -1.896062), vec4(0.53180975, -0.35210401, 0.600000, -0.758838), vec4(0.41487166, 0.81442589, 0.600000, -0.505648), vec4(-0.24106961, -0.32721516, 0.600000, -1.665244) +}; + +// these values can be changed (up to SSAO_MAX_TAPS) with no changes required elsewhere; values for 4th and 5th preset are ignored but array needed to avoid compilation errors +// the actual number of texture samples is two times this value (each "tap" has two symmetrical depth texture samples) +const int num_taps[5] = { 3, 5, 12, 0, 0 }; + +#define SSAO_TILT_SAMPLES_ENABLE_AT_QUALITY_PRESET (99) // to disable simply set to 99 or similar +#define SSAO_TILT_SAMPLES_AMOUNT (0.4) +// +#define SSAO_HALOING_REDUCTION_ENABLE_AT_QUALITY_PRESET (1) // to disable simply set to 99 or similar +#define SSAO_HALOING_REDUCTION_AMOUNT (0.6) // values from 0.0 - 1.0, 1.0 means max weighting (will cause artifacts, 0.8 is more reasonable) +// +#define SSAO_NORMAL_BASED_EDGES_ENABLE_AT_QUALITY_PRESET (2) // to disable simply set to 99 or similar +#define SSAO_NORMAL_BASED_EDGES_DOT_THRESHOLD (0.5) // use 0-0.1 for super-sharp normal-based edges +// +#define SSAO_DETAIL_AO_ENABLE_AT_QUALITY_PRESET (1) // whether to use detail; to disable simply set to 99 or similar +// +#define SSAO_DEPTH_MIPS_ENABLE_AT_QUALITY_PRESET (2) // !!warning!! the MIP generation on the C++ side will be enabled on quality preset 2 regardless of this value, so if changing here, change the C++ side too +#define SSAO_DEPTH_MIPS_GLOBAL_OFFSET (-4.3) // best noise/quality/performance tradeoff, found empirically +// +// !!warning!! the edge handling is hard-coded to 'disabled' on quality level 0, and enabled above, on the C++ side; while toggling it here will work for +// testing purposes, it will not yield performance gains (or correct results) +#define SSAO_DEPTH_BASED_EDGES_ENABLE_AT_QUALITY_PRESET (1) +// +#define SSAO_REDUCE_RADIUS_NEAR_SCREEN_BORDER_ENABLE_AT_QUALITY_PRESET (1) + +#define SSAO_MAX_TAPS 32 +#define SSAO_MAX_REF_TAPS 512 +#define SSAO_ADAPTIVE_TAP_BASE_COUNT 5 +#define SSAO_ADAPTIVE_TAP_FLEXIBLE_COUNT (SSAO_MAX_TAPS - SSAO_ADAPTIVE_TAP_BASE_COUNT) +#define SSAO_DEPTH_MIP_LEVELS 4 + layout(local_size_x = 8, local_size_y = 8, local_size_z = 1) in; -#define TWO_PI 6.283185307179586476925286766559 +layout(set = 0, binding = 0) uniform sampler2DArray source_depth_mipmaps; +layout(rgba8, set = 0, binding = 1) uniform restrict readonly image2D source_normal; +layout(set = 0, binding = 2) uniform Constants { //get into a lower set + vec4 rotation_matrices[20]; +} +constants; -#ifdef SSAO_QUALITY_HIGH -#define NUM_SAMPLES (20) +#ifdef ADAPTIVE +layout(rg8, set = 1, binding = 0) uniform restrict readonly image2DArray source_ssao; +layout(set = 1, binding = 1) uniform sampler2D source_importance; +layout(set = 1, binding = 2, std430) buffer Counter { + uint sum; +} +counter; #endif -#ifdef SSAO_QUALITY_ULTRA -#define NUM_SAMPLES (48) -#endif - -#ifdef SSAO_QUALITY_LOW -#define NUM_SAMPLES (8) -#endif - -#if !defined(SSAO_QUALITY_LOW) && !defined(SSAO_QUALITY_HIGH) && !defined(SSAO_QUALITY_ULTRA) -#define NUM_SAMPLES (12) -#endif - -// If using depth mip levels, the log of the maximum pixel offset before we need to switch to a lower -// miplevel to maintain reasonable spatial locality in the cache -// If this number is too small (< 3), too many taps will land in the same pixel, and we'll get bad variance that manifests as flashing. -// If it is too high (> 5), we'll get bad performance because we're not using the MIP levels effectively -#define LOG_MAX_OFFSET (3) - -// This must be less than or equal to the MAX_MIP_LEVEL defined in SSAO.cpp -#define MAX_MIP_LEVEL (4) - -// This is the number of turns around the circle that the spiral pattern makes. This should be prime to prevent -// taps from lining up. This particular choice was tuned for NUM_SAMPLES == 9 - -const int ROTATIONS[] = int[]( - 1, 1, 2, 3, 2, 5, 2, 3, 2, - 3, 3, 5, 5, 3, 4, 7, 5, 5, 7, - 9, 8, 5, 5, 7, 7, 7, 8, 5, 8, - 11, 12, 7, 10, 13, 8, 11, 8, 7, 14, - 11, 11, 13, 12, 13, 19, 17, 13, 11, 18, - 19, 11, 11, 14, 17, 21, 15, 16, 17, 18, - 13, 17, 11, 17, 19, 18, 25, 18, 19, 19, - 29, 21, 19, 27, 31, 29, 21, 18, 17, 29, - 31, 31, 23, 18, 25, 26, 25, 23, 19, 34, - 19, 27, 21, 25, 39, 29, 17, 21, 27); - -//#define NUM_SPIRAL_TURNS (7) -const int NUM_SPIRAL_TURNS = ROTATIONS[NUM_SAMPLES - 1]; - -layout(set = 0, binding = 0) uniform sampler2D source_depth_mipmaps; -layout(r8, set = 1, binding = 0) uniform restrict writeonly image2D dest_image; - -#ifndef USE_HALF_SIZE -layout(set = 2, binding = 0) uniform sampler2D source_depth; -#endif - -layout(set = 3, binding = 0) uniform sampler2D source_normal; +layout(rg8, set = 2, binding = 0) uniform restrict writeonly image2D dest_image; +// This push_constant is full - 128 bytes - if you need to add more data, consider adding to the uniform buffer instead layout(push_constant, binding = 1, std430) uniform Params { ivec2 screen_size; - float z_far; - float z_near; + int pass; + int quality; + + vec2 half_screen_pixel_size; + int size_multiplier; + float detail_intensity; + + vec2 NDC_to_view_mul; + vec2 NDC_to_view_add; + + vec2 pad2; + vec2 half_screen_pixel_size_x025; - bool orthogonal; - float intensity_div_r6; float radius; - float bias; + float intensity; + float shadow_power; + float shadow_clamp; - vec4 proj_info; - vec2 pixel_size; - float proj_scale; - uint pad; + float fade_out_mul; + float fade_out_add; + float horizon_angle_threshold; + float inv_radius_near_limit; + + bool is_orthogonal; + float neg_inv_radius; + float load_counter_avg_div; + float adaptive_sample_limit; + + ivec2 pass_coord_offset; + vec2 pass_uv_offset; } params; -vec3 reconstructCSPosition(vec2 S, float z) { - if (params.orthogonal) { - return vec3((S.xy * params.proj_info.xy + params.proj_info.zw), z); +// packing/unpacking for edges; 2 bits per edge mean 4 gradient values (0, 0.33, 0.66, 1) for smoother transitions! +float pack_edges(vec4 p_edgesLRTB) { + p_edgesLRTB = round(clamp(p_edgesLRTB, 0.0, 1.0) * 3.05); + return dot(p_edgesLRTB, vec4(64.0 / 255.0, 16.0 / 255.0, 4.0 / 255.0, 1.0 / 255.0)); +} + +vec3 NDC_to_view_space(vec2 p_pos, float p_viewspace_depth) { + if (params.is_orthogonal) { + return vec3((params.NDC_to_view_mul * p_pos.xy + params.NDC_to_view_add), p_viewspace_depth); } else { - return vec3((S.xy * params.proj_info.xy + params.proj_info.zw) * z, z); + return vec3((params.NDC_to_view_mul * p_pos.xy + params.NDC_to_view_add) * p_viewspace_depth, p_viewspace_depth); } } -vec3 getPosition(ivec2 ssP) { - vec3 P; -#ifdef USE_HALF_SIZE - P.z = texelFetch(source_depth_mipmaps, ssP, 0).r; - P.z = -P.z; -#else - P.z = texelFetch(source_depth, ssP, 0).r; +// calculate effect radius and fit our screen sampling pattern inside it +void calculate_radius_parameters(const float p_pix_center_length, const vec2 p_pixel_size_at_center, out float r_lookup_radius, out float r_radius, out float r_fallof_sq) { + r_radius = params.radius; - P.z = P.z * 2.0 - 1.0; - if (params.orthogonal) { - P.z = ((P.z + (params.z_far + params.z_near) / (params.z_far - params.z_near)) * (params.z_far - params.z_near)) / 2.0; - } else { - P.z = 2.0 * params.z_near * params.z_far / (params.z_far + params.z_near - P.z * (params.z_far - params.z_near)); + // when too close, on-screen sampling disk will grow beyond screen size; limit this to avoid closeup temporal artifacts + const float too_close_limit = clamp(p_pix_center_length * params.inv_radius_near_limit, 0.0, 1.0) * 0.8 + 0.2; + + r_radius *= too_close_limit; + + // 0.85 is to reduce the radius to allow for more samples on a slope to still stay within influence + r_lookup_radius = (0.85 * r_radius) / p_pixel_size_at_center.x; + + // used to calculate falloff (both for AO samples and per-sample weights) + r_fallof_sq = -1.0 / (r_radius * r_radius); +} + +vec4 calculate_edges(const float p_center_z, const float p_left_z, const float p_right_z, const float p_top_z, const float p_bottom_z) { + // slope-sensitive depth-based edge detection + vec4 edgesLRTB = vec4(p_left_z, p_right_z, p_top_z, p_bottom_z) - p_center_z; + vec4 edgesLRTB_slope_adjusted = edgesLRTB + edgesLRTB.yxwz; + edgesLRTB = min(abs(edgesLRTB), abs(edgesLRTB_slope_adjusted)); + return clamp((1.3 - edgesLRTB / (p_center_z * 0.040)), 0.0, 1.0); +} + +vec3 decode_normal(vec3 p_encoded_normal) { + vec3 normal = p_encoded_normal * 2.0 - 1.0; + return normal; +} + +vec3 load_normal(ivec2 p_pos) { + vec3 encoded_normal = imageLoad(source_normal, p_pos).xyz; + encoded_normal.z = 1.0 - encoded_normal.z; + return decode_normal(encoded_normal); +} + +vec3 load_normal(ivec2 p_pos, ivec2 p_offset) { + vec3 encoded_normal = imageLoad(source_normal, p_pos + p_offset).xyz; + encoded_normal.z = 1.0 - encoded_normal.z; + return decode_normal(encoded_normal); +} + +// all vectors in viewspace +float calculate_pixel_obscurance(vec3 p_pixel_normal, vec3 p_hit_delta, float p_fallof_sq) { + float length_sq = dot(p_hit_delta, p_hit_delta); + float NdotD = dot(p_pixel_normal, p_hit_delta) / sqrt(length_sq); + + float falloff_mult = max(0.0, length_sq * p_fallof_sq + 1.0); + + return max(0, NdotD - params.horizon_angle_threshold) * falloff_mult; +} + +void SSAO_tap_inner(const int p_quality_level, inout float r_obscurance_sum, inout float r_weight_sum, const vec2 p_sampling_uv, const float p_mip_level, const vec3 p_pix_center_pos, vec3 p_pixel_normal, const float p_fallof_sq, const float p_weight_mod) { + // get depth at sample + float viewspace_sample_z = textureLod(source_depth_mipmaps, vec3(p_sampling_uv, params.pass), p_mip_level).x; + + // convert to viewspace + vec3 hit_pos = NDC_to_view_space(p_sampling_uv.xy, viewspace_sample_z).xyz; + vec3 hit_delta = hit_pos - p_pix_center_pos; + + float obscurance = calculate_pixel_obscurance(p_pixel_normal, hit_delta, p_fallof_sq); + float weight = 1.0; + + if (p_quality_level >= SSAO_HALOING_REDUCTION_ENABLE_AT_QUALITY_PRESET) { + float reduct = max(0, -hit_delta.z); + reduct = clamp(reduct * params.neg_inv_radius + 2.0, 0.0, 1.0); + weight = SSAO_HALOING_REDUCTION_AMOUNT * reduct + (1.0 - SSAO_HALOING_REDUCTION_AMOUNT); } - P.z = -P.z; -#endif - // Offset to pixel center - P = reconstructCSPosition(vec2(ssP) + vec2(0.5), P.z); - return P; + weight *= p_weight_mod; + r_obscurance_sum += obscurance * weight; + r_weight_sum += weight; } -/** Returns a unit vector and a screen-space radius for the tap on a unit disk (the caller should scale by the actual disk radius) */ -vec2 tapLocation(int sampleNumber, float spinAngle, out float ssR) { - // Radius relative to ssR - float alpha = (float(sampleNumber) + 0.5) * (1.0 / float(NUM_SAMPLES)); - float angle = alpha * (float(NUM_SPIRAL_TURNS) * 6.28) + spinAngle; +void SSAOTap(const int p_quality_level, inout float r_obscurance_sum, inout float r_weight_sum, const int p_tap_index, const mat2 p_rot_scale, const vec3 p_pix_center_pos, vec3 p_pixel_normal, const vec2 p_normalized_screen_pos, const float p_mip_offset, const float p_fallof_sq, float p_weight_mod, vec2 p_norm_xy, float p_norm_xy_length) { + vec2 sample_offset; + float sample_pow_2_len; - ssR = alpha; - return vec2(cos(angle), sin(angle)); + // patterns + { + vec4 new_sample = sample_pattern[p_tap_index]; + sample_offset = new_sample.xy * p_rot_scale; + sample_pow_2_len = new_sample.w; // precalculated, same as: sample_pow_2_len = log2( length( new_sample.xy ) ); + p_weight_mod *= new_sample.z; + } + + // snap to pixel center (more correct obscurance math, avoids artifacts) + sample_offset = round(sample_offset); + + // calculate MIP based on the sample distance from the centre, similar to as described + // in http://graphics.cs.williams.edu/papers/SAOHPG12/. + float mip_level = (p_quality_level < SSAO_DEPTH_MIPS_ENABLE_AT_QUALITY_PRESET) ? (0) : (sample_pow_2_len + p_mip_offset); + + vec2 sampling_uv = sample_offset * params.half_screen_pixel_size + p_normalized_screen_pos; + + SSAO_tap_inner(p_quality_level, r_obscurance_sum, r_weight_sum, sampling_uv, mip_level, p_pix_center_pos, p_pixel_normal, p_fallof_sq, p_weight_mod); + + // for the second tap, just use the mirrored offset + vec2 sample_offset_mirrored_uv = -sample_offset; + + // tilt the second set of samples so that the disk is effectively rotated by the normal + // effective at removing one set of artifacts, but too expensive for lower quality settings + if (p_quality_level >= SSAO_TILT_SAMPLES_ENABLE_AT_QUALITY_PRESET) { + float dot_norm = dot(sample_offset_mirrored_uv, p_norm_xy); + sample_offset_mirrored_uv -= dot_norm * p_norm_xy_length * p_norm_xy; + sample_offset_mirrored_uv = round(sample_offset_mirrored_uv); + } + + // snap to pixel center (more correct obscurance math, avoids artifacts) + vec2 sampling_mirrored_uv = sample_offset_mirrored_uv * params.half_screen_pixel_size + p_normalized_screen_pos; + + SSAO_tap_inner(p_quality_level, r_obscurance_sum, r_weight_sum, sampling_mirrored_uv, mip_level, p_pix_center_pos, p_pixel_normal, p_fallof_sq, p_weight_mod); } -/** Read the camera-space position of the point at screen-space pixel ssP + unitOffset * ssR. Assumes length(unitOffset) == 1 */ -vec3 getOffsetPosition(ivec2 ssP, float ssR) { - // Derivation: - // mipLevel = floor(log(ssR / MAX_OFFSET)); +// this function is designed to only work with half/half depth at the moment - there's a couple of hardcoded paths that expect pixel/texel size, so it will not work for full res +void generate_SSAO_shadows_internal(out float r_shadow_term, out vec4 r_edges, out float r_weight, const vec2 p_pos, int p_quality_level, bool p_adaptive_base) { + vec2 pos_rounded = trunc(p_pos); + uvec2 upos = uvec2(pos_rounded); - int mipLevel = clamp(int(floor(log2(ssR))) - LOG_MAX_OFFSET, 0, MAX_MIP_LEVEL); + const int number_of_taps = (p_adaptive_base) ? (SSAO_ADAPTIVE_TAP_BASE_COUNT) : (num_taps[p_quality_level]); + float pix_z, pix_left_z, pix_top_z, pix_right_z, pix_bottom_z; - vec3 P; + vec4 valuesUL = textureGather(source_depth_mipmaps, vec3(pos_rounded * params.half_screen_pixel_size, params.pass)); // g_ViewspaceDepthSource.GatherRed(g_PointMirrorSampler, pos_rounded * params.half_screen_pixel_size); + vec4 valuesBR = textureGather(source_depth_mipmaps, vec3((pos_rounded + vec2(1.0)) * params.half_screen_pixel_size, params.pass)); // g_ViewspaceDepthSource.GatherRed(g_PointMirrorSampler, pos_rounded * params.half_screen_pixel_size, ivec2(1, 1)); - // We need to divide by 2^mipLevel to read the appropriately scaled coordinate from a MIP-map. - // Manually clamp to the texture size because texelFetch bypasses the texture unit - ivec2 mipP = clamp(ssP >> mipLevel, ivec2(0), (params.screen_size >> mipLevel) - ivec2(1)); + // get this pixel's viewspace depth + pix_z = valuesUL.y; -#ifdef USE_HALF_SIZE - P.z = texelFetch(source_depth_mipmaps, mipP, mipLevel).r; - P.z = -P.z; -#else - if (mipLevel < 1) { - //read from depth buffer - P.z = texelFetch(source_depth, mipP, 0).r; - P.z = P.z * 2.0 - 1.0; - if (params.orthogonal) { - P.z = ((P.z + (params.z_far + params.z_near) / (params.z_far - params.z_near)) * (params.z_far - params.z_near)) / 2.0; - } else { - P.z = 2.0 * params.z_near * params.z_far / (params.z_far + params.z_near - P.z * (params.z_far - params.z_near)); + // get left right top bottom neighbouring pixels for edge detection (gets compiled out on quality_level == 0) + pix_left_z = valuesUL.x; + pix_top_z = valuesUL.z; + pix_right_z = valuesBR.z; + pix_bottom_z = valuesBR.x; + + vec2 normalized_screen_pos = pos_rounded * params.half_screen_pixel_size + params.half_screen_pixel_size_x025; + vec3 pix_center_pos = NDC_to_view_space(normalized_screen_pos, pix_z); + + // Load this pixel's viewspace normal + uvec2 full_res_coord = upos * 2 * params.size_multiplier + params.pass_coord_offset.xy; + vec3 pixel_normal = load_normal(ivec2(full_res_coord)); + + //const vec2 pixel_size_at_center = pix_center_pos.z * params.NDC_to_view_mul * params.half_screen_pixel_size; // optimized approximation of: + vec2 pixel_size_at_center = NDC_to_view_space(normalized_screen_pos.xy + params.half_screen_pixel_size * 0.5, pix_center_pos.z).xy - pix_center_pos.xy; + + float pixel_lookup_radius; + float fallof_sq; + + // calculate effect radius and fit our screen sampling pattern inside it + float viewspace_radius; + calculate_radius_parameters(length(pix_center_pos), pixel_size_at_center, pixel_lookup_radius, viewspace_radius, fallof_sq); + + // calculate samples rotation/scaling + mat2 rot_scale_matrix; + uint pseudo_random_index; + + { + vec4 rotation_scale; + // reduce effect radius near the screen edges slightly; ideally, one would render a larger depth buffer (5% on each side) instead + if (!p_adaptive_base && (p_quality_level >= SSAO_REDUCE_RADIUS_NEAR_SCREEN_BORDER_ENABLE_AT_QUALITY_PRESET)) { + float near_screen_border = min(min(normalized_screen_pos.x, 1.0 - normalized_screen_pos.x), min(normalized_screen_pos.y, 1.0 - normalized_screen_pos.y)); + near_screen_border = clamp(10.0 * near_screen_border + 0.6, 0.0, 1.0); + pixel_lookup_radius *= near_screen_border; } - P.z = -P.z; - } else { - //read from mipmaps - P.z = texelFetch(source_depth_mipmaps, mipP, mipLevel - 1).r; - P.z = -P.z; + // load & update pseudo-random rotation matrix + pseudo_random_index = uint(pos_rounded.y * 2 + pos_rounded.x) % 5; + rotation_scale = constants.rotation_matrices[params.pass * 5 + pseudo_random_index]; + rot_scale_matrix = mat2(rotation_scale.x * pixel_lookup_radius, rotation_scale.y * pixel_lookup_radius, rotation_scale.z * pixel_lookup_radius, rotation_scale.w * pixel_lookup_radius); + } + + // the main obscurance & sample weight storage + float obscurance_sum = 0.0; + float weight_sum = 0.0; + + // edge mask for between this and left/right/top/bottom neighbour pixels - not used in quality level 0 so initialize to "no edge" (1 is no edge, 0 is edge) + vec4 edgesLRTB = vec4(1.0, 1.0, 1.0, 1.0); + + // Move center pixel slightly towards camera to avoid imprecision artifacts due to using of 16bit depth buffer; a lot smaller offsets needed when using 32bit floats + pix_center_pos *= 0.9992; + + if (!p_adaptive_base && (p_quality_level >= SSAO_DEPTH_BASED_EDGES_ENABLE_AT_QUALITY_PRESET)) { + edgesLRTB = calculate_edges(pix_z, pix_left_z, pix_right_z, pix_top_z, pix_bottom_z); + } + + // adds a more high definition sharp effect, which gets blurred out (reuses left/right/top/bottom samples that we used for edge detection) + if (!p_adaptive_base && (p_quality_level >= SSAO_DETAIL_AO_ENABLE_AT_QUALITY_PRESET)) { + // disable in case of quality level 4 (reference) + if (p_quality_level != 4) { + //approximate neighbouring pixels positions (actually just deltas or "positions - pix_center_pos" ) + vec3 normalized_viewspace_dir = vec3(pix_center_pos.xy / pix_center_pos.zz, 1.0); + vec3 pixel_left_delta = vec3(-pixel_size_at_center.x, 0.0, 0.0) + normalized_viewspace_dir * (pix_left_z - pix_center_pos.z); + vec3 pixel_right_delta = vec3(+pixel_size_at_center.x, 0.0, 0.0) + normalized_viewspace_dir * (pix_right_z - pix_center_pos.z); + vec3 pixel_top_delta = vec3(0.0, -pixel_size_at_center.y, 0.0) + normalized_viewspace_dir * (pix_top_z - pix_center_pos.z); + vec3 pixel_bottom_delta = vec3(0.0, +pixel_size_at_center.y, 0.0) + normalized_viewspace_dir * (pix_bottom_z - pix_center_pos.z); + + const float range_reduction = 4.0f; // this is to avoid various artifacts + const float modified_fallof_sq = range_reduction * fallof_sq; + + vec4 additional_obscurance; + additional_obscurance.x = calculate_pixel_obscurance(pixel_normal, pixel_left_delta, modified_fallof_sq); + additional_obscurance.y = calculate_pixel_obscurance(pixel_normal, pixel_right_delta, modified_fallof_sq); + additional_obscurance.z = calculate_pixel_obscurance(pixel_normal, pixel_top_delta, modified_fallof_sq); + additional_obscurance.w = calculate_pixel_obscurance(pixel_normal, pixel_bottom_delta, modified_fallof_sq); + + obscurance_sum += params.detail_intensity * dot(additional_obscurance, edgesLRTB); + } + } + + // Sharp normals also create edges - but this adds to the cost as well + if (!p_adaptive_base && (p_quality_level >= SSAO_NORMAL_BASED_EDGES_ENABLE_AT_QUALITY_PRESET)) { + vec3 neighbour_normal_left = load_normal(ivec2(full_res_coord), ivec2(-2, 0)); + vec3 neighbour_normal_right = load_normal(ivec2(full_res_coord), ivec2(2, 0)); + vec3 neighbour_normal_top = load_normal(ivec2(full_res_coord), ivec2(0, -2)); + vec3 neighbour_normal_bottom = load_normal(ivec2(full_res_coord), ivec2(0, 2)); + + const float dot_threshold = SSAO_NORMAL_BASED_EDGES_DOT_THRESHOLD; + + vec4 normal_edgesLRTB; + normal_edgesLRTB.x = clamp((dot(pixel_normal, neighbour_normal_left) + dot_threshold), 0.0, 1.0); + normal_edgesLRTB.y = clamp((dot(pixel_normal, neighbour_normal_right) + dot_threshold), 0.0, 1.0); + normal_edgesLRTB.z = clamp((dot(pixel_normal, neighbour_normal_top) + dot_threshold), 0.0, 1.0); + normal_edgesLRTB.w = clamp((dot(pixel_normal, neighbour_normal_bottom) + dot_threshold), 0.0, 1.0); + + edgesLRTB *= normal_edgesLRTB; + } + + const float global_mip_offset = SSAO_DEPTH_MIPS_GLOBAL_OFFSET; + float mip_offset = (p_quality_level < SSAO_DEPTH_MIPS_ENABLE_AT_QUALITY_PRESET) ? (0) : (log2(pixel_lookup_radius) + global_mip_offset); + + // Used to tilt the second set of samples so that the disk is effectively rotated by the normal + // effective at removing one set of artifacts, but too expensive for lower quality settings + vec2 norm_xy = vec2(pixel_normal.x, pixel_normal.y); + float norm_xy_length = length(norm_xy); + norm_xy /= vec2(norm_xy_length, -norm_xy_length); + norm_xy_length *= SSAO_TILT_SAMPLES_AMOUNT; + + // standard, non-adaptive approach + if ((p_quality_level != 3) || p_adaptive_base) { + for (int i = 0; i < number_of_taps; i++) { + SSAOTap(p_quality_level, obscurance_sum, weight_sum, i, rot_scale_matrix, pix_center_pos, pixel_normal, normalized_screen_pos, mip_offset, fallof_sq, 1.0, norm_xy, norm_xy_length); + } + } +#ifdef ADAPTIVE + else { + // add new ones if needed + vec2 full_res_uv = normalized_screen_pos + params.pass_uv_offset.xy; + float importance = textureLod(source_importance, full_res_uv, 0.0).x; + + // this is to normalize SSAO_DETAIL_AO_AMOUNT across all pixel regardless of importance + obscurance_sum *= (SSAO_ADAPTIVE_TAP_BASE_COUNT / float(SSAO_MAX_TAPS)) + (importance * SSAO_ADAPTIVE_TAP_FLEXIBLE_COUNT / float(SSAO_MAX_TAPS)); + + // load existing base values + vec2 base_values = imageLoad(source_ssao, ivec3(upos, params.pass)).xy; + weight_sum += base_values.y * float(SSAO_ADAPTIVE_TAP_BASE_COUNT * 4.0); + obscurance_sum += (base_values.x) * weight_sum; + + // increase importance around edges + float edge_count = dot(1.0 - edgesLRTB, vec4(1.0, 1.0, 1.0, 1.0)); + + float avg_total_importance = float(counter.sum) * params.load_counter_avg_div; + + float importance_limiter = clamp(params.adaptive_sample_limit / avg_total_importance, 0.0, 1.0); + importance *= importance_limiter; + + float additional_sample_count = SSAO_ADAPTIVE_TAP_FLEXIBLE_COUNT * importance; + + const float blend_range = 3.0; + const float blend_range_inv = 1.0 / blend_range; + + additional_sample_count += 0.5; + uint additional_samples = uint(additional_sample_count); + uint additional_samples_to = min(SSAO_MAX_TAPS, additional_samples + SSAO_ADAPTIVE_TAP_BASE_COUNT); + + for (uint i = SSAO_ADAPTIVE_TAP_BASE_COUNT; i < additional_samples_to; i++) { + additional_sample_count -= 1.0f; + float weight_mod = clamp(additional_sample_count * blend_range_inv, 0.0, 1.0); + SSAOTap(p_quality_level, obscurance_sum, weight_sum, int(i), rot_scale_matrix, pix_center_pos, pixel_normal, normalized_screen_pos, mip_offset, fallof_sq, weight_mod, norm_xy, norm_xy_length); + } } #endif - // Offset to pixel center - P = reconstructCSPosition(vec2(ssP) + vec2(0.5), P.z); + // early out for adaptive base - just output weight (used for the next pass) + if (p_adaptive_base) { + float obscurance = obscurance_sum / weight_sum; - return P; -} - -/** Compute the occlusion due to sample with index \a i about the pixel at \a ssC that corresponds - to camera-space point \a C with unit normal \a n_C, using maximum screen-space sampling radius \a ssDiskRadius - - Note that units of H() in the HPG12 paper are meters, not - unitless. The whole falloff/sampling function is therefore - unitless. In this implementation, we factor out (9 / radius). - - Four versions of the falloff function are implemented below -*/ -float sampleAO(in ivec2 ssC, in vec3 C, in vec3 n_C, in float ssDiskRadius, in float p_radius, in int tapIndex, in float randomPatternRotationAngle) { - // Offset on the unit disk, spun for this pixel - float ssR; - vec2 unitOffset = tapLocation(tapIndex, randomPatternRotationAngle, ssR); - ssR *= ssDiskRadius; - - ivec2 ssP = ivec2(ssR * unitOffset) + ssC; - - if (any(lessThan(ssP, ivec2(0))) || any(greaterThanEqual(ssP, params.screen_size))) { - return 0.0; + r_shadow_term = obscurance; + r_edges = vec4(0.0); + r_weight = weight_sum; + return; } - // The occluding point in camera space - vec3 Q = getOffsetPosition(ssP, ssR); + // calculate weighted average + float obscurance = obscurance_sum / weight_sum; - vec3 v = Q - C; + // calculate fadeout (1 close, gradient, 0 far) + float fade_out = clamp(pix_center_pos.z * params.fade_out_mul + params.fade_out_add, 0.0, 1.0); - float vv = dot(v, v); - float vn = dot(v, n_C); + // Reduce the SSAO shadowing if we're on the edge to remove artifacts on edges (we don't care for the lower quality one) + if (!p_adaptive_base && (p_quality_level >= SSAO_DEPTH_BASED_EDGES_ENABLE_AT_QUALITY_PRESET)) { + // when there's more than 2 opposite edges, start fading out the occlusion to reduce aliasing artifacts + float edge_fadeout_factor = clamp((1.0 - edgesLRTB.x - edgesLRTB.y) * 0.35, 0.0, 1.0) + clamp((1.0 - edgesLRTB.z - edgesLRTB.w) * 0.35, 0.0, 1.0); - const float epsilon = 0.01; - float radius2 = p_radius * p_radius; + fade_out *= clamp(1.0 - edge_fadeout_factor, 0.0, 1.0); + } - // A: From the HPG12 paper - // Note large epsilon to avoid overdarkening within cracks - //return float(vv < radius2) * max((vn - bias) / (epsilon + vv), 0.0) * radius2 * 0.6; + // same as a bove, but a lot more conservative version + // fade_out *= clamp( dot( edgesLRTB, vec4( 0.9, 0.9, 0.9, 0.9 ) ) - 2.6 , 0.0, 1.0); - // B: Smoother transition to zero (lowers contrast, smoothing out corners). [Recommended] - float f = max(radius2 - vv, 0.0); - return f * f * f * max((vn - params.bias) / (epsilon + vv), 0.0); + // strength + obscurance = params.intensity * obscurance; - // C: Medium contrast (which looks better at high radii), no division. Note that the - // contribution still falls off with radius^2, but we've adjusted the rate in a way that is - // more computationally efficient and happens to be aesthetically pleasing. - // return 4.0 * max(1.0 - vv * invRadius2, 0.0) * max(vn - bias, 0.0); + // clamp + obscurance = min(obscurance, params.shadow_clamp); - // D: Low contrast, no division operation - // return 2.0 * float(vv < radius * radius) * max(vn - bias, 0.0); + // fadeout + obscurance *= fade_out; + + // conceptually switch to occlusion with the meaning being visibility (grows with visibility, occlusion == 1 implies full visibility), + // to be in line with what is more commonly used. + float occlusion = 1.0 - obscurance; + + // modify the gradient + // note: this cannot be moved to a later pass because of loss of precision after storing in the render target + occlusion = pow(clamp(occlusion, 0.0, 1.0), params.shadow_power); + + // outputs! + r_shadow_term = occlusion; // Our final 'occlusion' term (0 means fully occluded, 1 means fully lit) + r_edges = edgesLRTB; // These are used to prevent blurring across edges, 1 means no edge, 0 means edge, 0.5 means half way there, etc. + r_weight = weight_sum; } void main() { - // Pixel being shaded + float out_shadow_term; + float out_weight; + vec4 out_edges; ivec2 ssC = ivec2(gl_GlobalInvocationID.xy); if (any(greaterThanEqual(ssC, params.screen_size))) { //too large, do nothing return; } - // World space point being shaded - vec3 C = getPosition(ssC); + vec2 uv = vec2(gl_GlobalInvocationID) + vec2(0.5); +#ifdef SSAO_BASE + generate_SSAO_shadows_internal(out_shadow_term, out_edges, out_weight, uv, params.quality, true); -#ifdef USE_HALF_SIZE - vec3 n_C = texelFetch(source_normal, ssC << 1, 0).xyz * 2.0 - 1.0; + imageStore(dest_image, ivec2(gl_GlobalInvocationID.xy), vec4(out_shadow_term, out_weight / (float(SSAO_ADAPTIVE_TAP_BASE_COUNT) * 4.0), 0.0, 0.0)); #else - vec3 n_C = texelFetch(source_normal, ssC, 0).xyz * 2.0 - 1.0; + generate_SSAO_shadows_internal(out_shadow_term, out_edges, out_weight, uv, params.quality, false); // pass in quality levels + if (params.quality == 0) { + out_edges = vec4(1.0); + } + + imageStore(dest_image, ivec2(gl_GlobalInvocationID.xy), vec4(out_shadow_term, pack_edges(out_edges), 0.0, 0.0)); #endif - n_C = normalize(n_C); - n_C.y = -n_C.y; //because this code reads flipped - - // Hash function used in the HPG12 AlchemyAO paper - float randomPatternRotationAngle = mod(float((3 * ssC.x ^ ssC.y + ssC.x * ssC.y) * 10), TWO_PI); - - // Reconstruct normals from positions. These will lead to 1-pixel black lines - // at depth discontinuities, however the blur will wipe those out so they are not visible - // in the final image. - - // Choose the screen-space sample radius - // proportional to the projected area of the sphere - - float ssDiskRadius = -params.proj_scale * params.radius; - if (!params.orthogonal) { - ssDiskRadius = -params.proj_scale * params.radius / C.z; - } - float sum = 0.0; - for (int i = 0; i < NUM_SAMPLES; ++i) { - sum += sampleAO(ssC, C, n_C, ssDiskRadius, params.radius, i, randomPatternRotationAngle); - } - - float A = max(0.0, 1.0 - sum * params.intensity_div_r6 * (5.0 / float(NUM_SAMPLES))); - - imageStore(dest_image, ssC, vec4(A)); } diff --git a/servers/rendering/renderer_rd/shaders/ssao_blur.glsl b/servers/rendering/renderer_rd/shaders/ssao_blur.glsl index 3e63e3cb597..510a7770488 100644 --- a/servers/rendering/renderer_rd/shaders/ssao_blur.glsl +++ b/servers/rendering/renderer_rd/shaders/ssao_blur.glsl @@ -1,3 +1,22 @@ +/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// +// Copyright (c) 2016, Intel Corporation +// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated +// documentation files (the "Software"), to deal in the Software without restriction, including without limitation +// the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +// permit persons to whom the Software is furnished to do so, subject to the following conditions: +// The above copyright notice and this permission notice shall be included in all copies or substantial portions of +// the Software. +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO +// THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +// TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. +/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// +// File changes (yyyy-mm-dd) +// 2016-09-07: filip.strugar@intel.com: first commit +// 2020-12-05: clayjohn: convert to Vulkan and Godot +/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// + #[compute] #version 450 @@ -7,147 +26,129 @@ VERSION_DEFINES layout(local_size_x = 8, local_size_y = 8, local_size_z = 1) in; layout(set = 0, binding = 0) uniform sampler2D source_ssao; -layout(set = 1, binding = 0) uniform sampler2D source_depth; -#ifdef MODE_UPSCALE -layout(set = 2, binding = 0) uniform sampler2D source_depth_mipmaps; -#endif -layout(r8, set = 3, binding = 0) uniform restrict writeonly image2D dest_image; - -////////////////////////////////////////////////////////////////////////////////////////////// -// Tunable Parameters: +layout(rg8, set = 1, binding = 0) uniform restrict writeonly image2D dest_image; layout(push_constant, binding = 1, std430) uniform Params { - float edge_sharpness; /** Increase to make depth edges crisper. Decrease to reduce flicker. */ - int filter_scale; - float z_far; - float z_near; - bool orthogonal; - uint pad0; - uint pad1; - uint pad2; - ivec2 axis; /** (1, 0) or (0, 1) */ - ivec2 screen_size; + float edge_sharpness; + float pad; + vec2 half_screen_pixel_size; } params; -/** Filter radius in pixels. This will be multiplied by SCALE. */ -#define R (4) +vec4 unpack_edges(float p_packed_val) { + uint packed_val = uint(p_packed_val * 255.5); + vec4 edgesLRTB; + edgesLRTB.x = float((packed_val >> 6) & 0x03) / 3.0; + edgesLRTB.y = float((packed_val >> 4) & 0x03) / 3.0; + edgesLRTB.z = float((packed_val >> 2) & 0x03) / 3.0; + edgesLRTB.w = float((packed_val >> 0) & 0x03) / 3.0; -////////////////////////////////////////////////////////////////////////////////////////////// + return clamp(edgesLRTB + params.edge_sharpness, 0.0, 1.0); +} -// Gaussian coefficients -const float gaussian[R + 1] = - //float[](0.356642, 0.239400, 0.072410, 0.009869); - //float[](0.398943, 0.241971, 0.053991, 0.004432, 0.000134); // stddev = 1.0 - float[](0.153170, 0.144893, 0.122649, 0.092902, 0.062970); // stddev = 2.0 -//float[](0.111220, 0.107798, 0.098151, 0.083953, 0.067458, 0.050920, 0.036108); // stddev = 3.0 +void add_sample(float p_ssao_value, float p_edge_value, inout float r_sum, inout float r_sum_weight) { + float weight = p_edge_value; + + r_sum += (weight * p_ssao_value); + r_sum_weight += weight; +} + +#ifdef MODE_WIDE +vec2 sample_blurred_wide(vec2 p_coord) { + vec2 vC = textureLodOffset(source_ssao, vec2(p_coord), 0.0, ivec2(0, 0)).xy; + vec2 vL = textureLodOffset(source_ssao, vec2(p_coord), 0.0, ivec2(-2, 0)).xy; + vec2 vT = textureLodOffset(source_ssao, vec2(p_coord), 0.0, ivec2(0, -2)).xy; + vec2 vR = textureLodOffset(source_ssao, vec2(p_coord), 0.0, ivec2(2, 0)).xy; + vec2 vB = textureLodOffset(source_ssao, vec2(p_coord), 0.0, ivec2(0, 2)).xy; + + float packed_edges = vC.y; + vec4 edgesLRTB = unpack_edges(packed_edges); + edgesLRTB.x *= unpack_edges(vL.y).y; + edgesLRTB.z *= unpack_edges(vT.y).w; + edgesLRTB.y *= unpack_edges(vR.y).x; + edgesLRTB.w *= unpack_edges(vB.y).z; + + float ssao_value = vC.x; + float ssao_valueL = vL.x; + float ssao_valueT = vT.x; + float ssao_valueR = vR.x; + float ssao_valueB = vB.x; + + float sum_weight = 0.8f; + float sum = ssao_value * sum_weight; + + add_sample(ssao_valueL, edgesLRTB.x, sum, sum_weight); + add_sample(ssao_valueR, edgesLRTB.y, sum, sum_weight); + add_sample(ssao_valueT, edgesLRTB.z, sum, sum_weight); + add_sample(ssao_valueB, edgesLRTB.w, sum, sum_weight); + + float ssao_avg = sum / sum_weight; + + ssao_value = ssao_avg; + + return vec2(ssao_value, packed_edges); +} +#endif + +#ifdef MODE_SMART +vec2 sample_blurred(vec3 p_pos, vec2 p_coord) { + float packed_edges = texelFetch(source_ssao, ivec2(p_pos.xy), 0).y; + vec4 edgesLRTB = unpack_edges(packed_edges); + + vec4 valuesUL = textureGather(source_ssao, vec2(p_coord - params.half_screen_pixel_size * 0.5)); + vec4 valuesBR = textureGather(source_ssao, vec2(p_coord + params.half_screen_pixel_size * 0.5)); + + float ssao_value = valuesUL.y; + float ssao_valueL = valuesUL.x; + float ssao_valueT = valuesUL.z; + float ssao_valueR = valuesBR.z; + float ssao_valueB = valuesBR.x; + + float sum_weight = 0.5; + float sum = ssao_value * sum_weight; + + add_sample(ssao_valueL, edgesLRTB.x, sum, sum_weight); + add_sample(ssao_valueR, edgesLRTB.y, sum, sum_weight); + + add_sample(ssao_valueT, edgesLRTB.z, sum, sum_weight); + add_sample(ssao_valueB, edgesLRTB.w, sum, sum_weight); + + float ssao_avg = sum / sum_weight; + + ssao_value = ssao_avg; + + return vec2(ssao_value, packed_edges); +} +#endif void main() { // Pixel being shaded ivec2 ssC = ivec2(gl_GlobalInvocationID.xy); - if (any(greaterThanEqual(ssC, params.screen_size))) { //too large, do nothing - return; - } -#ifdef MODE_UPSCALE +#ifdef MODE_NON_SMART - //closest one should be the same pixel, but check nearby just in case - float depth = texelFetch(source_depth, ssC, 0).r; + vec2 halfPixel = params.half_screen_pixel_size * 0.5f; - depth = depth * 2.0 - 1.0; - if (params.orthogonal) { - depth = ((depth + (params.z_far + params.z_near) / (params.z_far - params.z_near)) * (params.z_far - params.z_near)) / 2.0; - } else { - depth = 2.0 * params.z_near * params.z_far / (params.z_far + params.z_near - depth * (params.z_far - params.z_near)); - } + vec2 uv = (vec2(gl_GlobalInvocationID.xy) + vec2(0.5, 0.5)) * params.half_screen_pixel_size; - vec2 pixel_size = 1.0 / vec2(params.screen_size); - vec2 closest_uv = vec2(ssC) * pixel_size + pixel_size * 0.5; - vec2 from_uv = closest_uv; - vec2 ps2 = pixel_size; // * 2.0; + vec2 centre = textureLod(source_ssao, vec2(uv), 0.0).xy; - float closest_depth = abs(textureLod(source_depth_mipmaps, closest_uv, 0.0).r - depth); + vec4 vals; + vals.x = textureLod(source_ssao, vec2(uv + vec2(-halfPixel.x * 3, -halfPixel.y)), 0.0).x; + vals.y = textureLod(source_ssao, vec2(uv + vec2(+halfPixel.x, -halfPixel.y * 3)), 0.0).x; + vals.z = textureLod(source_ssao, vec2(uv + vec2(-halfPixel.x, +halfPixel.y * 3)), 0.0).x; + vals.w = textureLod(source_ssao, vec2(uv + vec2(+halfPixel.x * 3, +halfPixel.y)), 0.0).x; - vec2 offsets[4] = vec2[](vec2(ps2.x, 0), vec2(-ps2.x, 0), vec2(0, ps2.y), vec2(0, -ps2.y)); - for (int i = 0; i < 4; i++) { - vec2 neighbour = from_uv + offsets[i]; - float neighbour_depth = abs(textureLod(source_depth_mipmaps, neighbour, 0.0).r - depth); - if (neighbour_depth < closest_depth) { - closest_uv = neighbour; - closest_depth = neighbour_depth; - } - } + vec2 sampled = vec2(dot(vals, vec4(0.2)) + centre.x * 0.2, centre.y); - float visibility = textureLod(source_ssao, closest_uv, 0.0).r; - imageStore(dest_image, ssC, vec4(visibility)); #else - - float depth = texelFetch(source_depth, ssC, 0).r; - -#ifdef MODE_FULL_SIZE - depth = depth * 2.0 - 1.0; - - if (params.orthogonal) { - depth = ((depth + (params.z_far + params.z_near) / (params.z_far - params.z_near)) * (params.z_far - params.z_near)) / 2.0; - } else { - depth = 2.0 * params.z_near * params.z_far / (params.z_far + params.z_near - depth * (params.z_far - params.z_near)); - } +#ifdef MODE_SMART + vec2 sampled = sample_blurred(vec3(gl_GlobalInvocationID), (vec2(gl_GlobalInvocationID.xy) + vec2(0.5, 0.5)) * params.half_screen_pixel_size); +#else // MODE_WIDE + vec2 sampled = sample_blurred_wide((vec2(gl_GlobalInvocationID.xy) + vec2(0.5, 0.5)) * params.half_screen_pixel_size); +#endif #endif - float depth_divide = 1.0 / params.z_far; - - //depth *= depth_divide; - - /* - if (depth > params.z_far * 0.999) { - discard; //skybox - } - */ - - float sum = texelFetch(source_ssao, ssC, 0).r; - - // Base weight for depth falloff. Increase this for more blurriness, - // decrease it for better edge discrimination - float BASE = gaussian[0]; - float totalWeight = BASE; - sum *= totalWeight; - - ivec2 clamp_limit = params.screen_size - ivec2(1); - - for (int r = -R; r <= R; ++r) { - // We already handled the zero case above. This loop should be unrolled and the static branch optimized out, - // so the IF statement has no runtime cost - if (r != 0) { - ivec2 ppos = ssC + params.axis * (r * params.filter_scale); - float value = texelFetch(source_ssao, clamp(ppos, ivec2(0), clamp_limit), 0).r; - ivec2 rpos = clamp(ppos, ivec2(0), clamp_limit); - - float temp_depth = texelFetch(source_depth, rpos, 0).r; -#ifdef MODE_FULL_SIZE - temp_depth = temp_depth * 2.0 - 1.0; - if (params.orthogonal) { - temp_depth = ((temp_depth + (params.z_far + params.z_near) / (params.z_far - params.z_near)) * (params.z_far - params.z_near)) / 2.0; - } else { - temp_depth = 2.0 * params.z_near * params.z_far / (params.z_far + params.z_near - temp_depth * (params.z_far - params.z_near)); - } - //temp_depth *= depth_divide; -#endif - // spatial domain: offset gaussian tap - float weight = 0.3 + gaussian[abs(r)]; - //weight *= max(0.0, dot(temp_normal, normal)); - - // range domain (the "bilateral" weight). As depth difference increases, decrease weight. - weight *= max(0.0, 1.0 - params.edge_sharpness * abs(temp_depth - depth)); - - sum += value * weight; - totalWeight += weight; - } - } - - const float epsilon = 0.0001; - float visibility = sum / (totalWeight + epsilon); - - imageStore(dest_image, ssC, vec4(visibility)); -#endif + imageStore(dest_image, ivec2(ssC), vec4(sampled, 0.0, 0.0)); } diff --git a/servers/rendering/renderer_rd/shaders/ssao_downsample.glsl b/servers/rendering/renderer_rd/shaders/ssao_downsample.glsl new file mode 100644 index 00000000000..cb2d31f70d0 --- /dev/null +++ b/servers/rendering/renderer_rd/shaders/ssao_downsample.glsl @@ -0,0 +1,206 @@ +/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// +// Copyright (c) 2016, Intel Corporation +// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated +// documentation files (the "Software"), to deal in the Software without restriction, including without limitation +// the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +// permit persons to whom the Software is furnished to do so, subject to the following conditions: +// The above copyright notice and this permission notice shall be included in all copies or substantial portions of +// the Software. +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO +// THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +// TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. +/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// +// File changes (yyyy-mm-dd) +// 2016-09-07: filip.strugar@intel.com: first commit +// 2020-12-05: clayjohn: convert to Vulkan and Godot +/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// + +#[compute] + +#version 450 + +VERSION_DEFINES + +layout(local_size_x = 8, local_size_y = 8, local_size_z = 1) in; + +layout(push_constant, binding = 1, std430) uniform Params { + vec2 pixel_size; + float z_far; + float z_near; + bool orthogonal; + float radius_sq; + uvec2 pad; +} +params; + +layout(set = 0, binding = 0) uniform sampler2D source_depth; + +layout(r16f, set = 1, binding = 0) uniform restrict writeonly image2DArray dest_image0; //rename +#ifdef GENERATE_MIPS +layout(r16f, set = 2, binding = 0) uniform restrict writeonly image2DArray dest_image1; +layout(r16f, set = 2, binding = 1) uniform restrict writeonly image2DArray dest_image2; +layout(r16f, set = 2, binding = 2) uniform restrict writeonly image2DArray dest_image3; +#endif + +vec4 screen_space_to_view_space_depth(vec4 p_depth) { + if (params.orthogonal) { + vec4 depth = p_depth * 2.0 - 1.0; + return ((depth + (params.z_far + params.z_near) / (params.z_far - params.z_near)) * (params.z_far - params.z_near)) / 2.0; + } + + float depth_linearize_mul = params.z_near; + float depth_linearize_add = params.z_far; + + // Optimised version of "-cameraClipNear / (cameraClipFar - projDepth * (cameraClipFar - cameraClipNear)) * cameraClipFar" + + // Set your depth_linearize_mul and depth_linearize_add to: + // depth_linearize_mul = ( cameraClipFar * cameraClipNear) / ( cameraClipFar - cameraClipNear ); + // depth_linearize_add = cameraClipFar / ( cameraClipFar - cameraClipNear ); + + return depth_linearize_mul / (depth_linearize_add - p_depth); +} + +float screen_space_to_view_space_depth(float p_depth) { + if (params.orthogonal) { + float depth = p_depth * 2.0 - 1.0; + return ((depth + (params.z_far + params.z_near) / (params.z_far - params.z_near)) * (params.z_far - params.z_near)) / (2.0 * params.z_far); + } + + float depth_linearize_mul = params.z_near; + float depth_linearize_add = params.z_far; + + return depth_linearize_mul / (depth_linearize_add - p_depth); +} + +#ifdef GENERATE_MIPS + +shared float depth_buffer[4][8][8]; + +float mip_smart_average(vec4 p_depths) { + float closest = min(min(p_depths.x, p_depths.y), min(p_depths.z, p_depths.w)); + float fallof_sq = -1.0f / params.radius_sq; + vec4 dists = p_depths - closest.xxxx; + vec4 weights = clamp(dists * dists * fallof_sq + 1.0, 0.0, 1.0); + return dot(weights, p_depths) / dot(weights, vec4(1.0, 1.0, 1.0, 1.0)); +} + +void prepare_depths_and_mips(vec4 p_samples, uvec2 p_output_coord, uvec2 p_gtid) { + p_samples = screen_space_to_view_space_depth(p_samples); + + depth_buffer[0][p_gtid.x][p_gtid.y] = p_samples.w; + depth_buffer[1][p_gtid.x][p_gtid.y] = p_samples.z; + depth_buffer[2][p_gtid.x][p_gtid.y] = p_samples.x; + depth_buffer[3][p_gtid.x][p_gtid.y] = p_samples.y; + + imageStore(dest_image0, ivec3(p_output_coord.x, p_output_coord.y, 0), vec4(p_samples.w)); + imageStore(dest_image0, ivec3(p_output_coord.x, p_output_coord.y, 1), vec4(p_samples.z)); + imageStore(dest_image0, ivec3(p_output_coord.x, p_output_coord.y, 2), vec4(p_samples.x)); + imageStore(dest_image0, ivec3(p_output_coord.x, p_output_coord.y, 3), vec4(p_samples.y)); + + uint depth_array_index = 2 * (p_gtid.y % 2) + (p_gtid.x % 2); + uvec2 depth_array_offset = ivec2(p_gtid.x % 2, p_gtid.y % 2); + ivec2 buffer_coord = ivec2(p_gtid) - ivec2(depth_array_offset); + + p_output_coord /= 2; + groupMemoryBarrier(); + barrier(); + + // if (still_alive) <-- all threads alive here + { + float sample_00 = depth_buffer[depth_array_index][buffer_coord.x + 0][buffer_coord.y + 0]; + float sample_01 = depth_buffer[depth_array_index][buffer_coord.x + 0][buffer_coord.y + 1]; + float sample_10 = depth_buffer[depth_array_index][buffer_coord.x + 1][buffer_coord.y + 0]; + float sample_11 = depth_buffer[depth_array_index][buffer_coord.x + 1][buffer_coord.y + 1]; + + float avg = mip_smart_average(vec4(sample_00, sample_01, sample_10, sample_11)); + imageStore(dest_image1, ivec3(p_output_coord.x, p_output_coord.y, depth_array_index), vec4(avg)); + depth_buffer[depth_array_index][buffer_coord.x][buffer_coord.y] = avg; + } + + bool still_alive = p_gtid.x % 4 == depth_array_offset.x && p_gtid.y % 4 == depth_array_offset.y; + + p_output_coord /= 2; + groupMemoryBarrier(); + barrier(); + + if (still_alive) { + float sample_00 = depth_buffer[depth_array_index][buffer_coord.x + 0][buffer_coord.y + 0]; + float sample_01 = depth_buffer[depth_array_index][buffer_coord.x + 0][buffer_coord.y + 2]; + float sample_10 = depth_buffer[depth_array_index][buffer_coord.x + 2][buffer_coord.y + 0]; + float sample_11 = depth_buffer[depth_array_index][buffer_coord.x + 2][buffer_coord.y + 2]; + + float avg = mip_smart_average(vec4(sample_00, sample_01, sample_10, sample_11)); + imageStore(dest_image2, ivec3(p_output_coord.x, p_output_coord.y, depth_array_index), vec4(avg)); + depth_buffer[depth_array_index][buffer_coord.x][buffer_coord.y] = avg; + } + + still_alive = p_gtid.x % 8 == depth_array_offset.x && depth_array_offset.y % 8 == depth_array_offset.y; + + p_output_coord /= 2; + groupMemoryBarrier(); + barrier(); + + if (still_alive) { + float sample_00 = depth_buffer[depth_array_index][buffer_coord.x + 0][buffer_coord.y + 0]; + float sample_01 = depth_buffer[depth_array_index][buffer_coord.x + 0][buffer_coord.y + 4]; + float sample_10 = depth_buffer[depth_array_index][buffer_coord.x + 4][buffer_coord.y + 0]; + float sample_11 = depth_buffer[depth_array_index][buffer_coord.x + 4][buffer_coord.y + 4]; + + float avg = mip_smart_average(vec4(sample_00, sample_01, sample_10, sample_11)); + imageStore(dest_image3, ivec3(p_output_coord.x, p_output_coord.y, depth_array_index), vec4(avg)); + } +} +#else +#ifndef USE_HALF_BUFFERS +void prepare_depths(vec4 p_samples, uvec2 p_tid) { + p_samples = screen_space_to_view_space_depth(p_samples); + + imageStore(dest_image0, ivec3(p_tid, 0), vec4(p_samples.w)); + imageStore(dest_image0, ivec3(p_tid, 1), vec4(p_samples.z)); + imageStore(dest_image0, ivec3(p_tid, 2), vec4(p_samples.x)); + imageStore(dest_image0, ivec3(p_tid, 3), vec4(p_samples.y)); +} +#endif +#endif + +void main() { +#ifdef USE_HALF_BUFFERS +#ifdef USE_HALF_SIZE + float sample_00 = texelFetch(source_depth, ivec2(4 * gl_GlobalInvocationID.x + 0, 4 * gl_GlobalInvocationID.y + 0), 0).x; + float sample_11 = texelFetch(source_depth, ivec2(4 * gl_GlobalInvocationID.x + 2, 4 * gl_GlobalInvocationID.y + 2), 0).x; +#else + float sample_00 = texelFetch(source_depth, ivec2(2 * gl_GlobalInvocationID.x + 0, 2 * gl_GlobalInvocationID.y + 0), 0).x; + float sample_11 = texelFetch(source_depth, ivec2(2 * gl_GlobalInvocationID.x + 1, 2 * gl_GlobalInvocationID.y + 1), 0).x; +#endif + sample_00 = screen_space_to_view_space_depth(sample_00); + sample_11 = screen_space_to_view_space_depth(sample_11); + + imageStore(dest_image0, ivec3(gl_GlobalInvocationID.xy, 0), vec4(sample_00)); + imageStore(dest_image0, ivec3(gl_GlobalInvocationID.xy, 3), vec4(sample_11)); +#else //!USE_HALF_BUFFERS +#ifdef USE_HALF_SIZE + ivec2 depth_buffer_coord = 4 * ivec2(gl_GlobalInvocationID.xy); + ivec2 output_coord = ivec2(gl_GlobalInvocationID); + + vec2 uv = (vec2(depth_buffer_coord) + 0.5f) * params.pixel_size; + vec4 samples; + samples.x = textureLodOffset(source_depth, uv, 0, ivec2(0, 2)).x; + samples.y = textureLodOffset(source_depth, uv, 0, ivec2(2, 2)).x; + samples.z = textureLodOffset(source_depth, uv, 0, ivec2(2, 0)).x; + samples.w = textureLodOffset(source_depth, uv, 0, ivec2(0, 0)).x; +#else + ivec2 depth_buffer_coord = 2 * ivec2(gl_GlobalInvocationID.xy); + ivec2 output_coord = ivec2(gl_GlobalInvocationID); + + vec2 uv = (vec2(depth_buffer_coord) + 0.5f) * params.pixel_size; + vec4 samples = textureGather(source_depth, uv); +#endif +#ifdef GENERATE_MIPS + prepare_depths_and_mips(samples, output_coord, gl_LocalInvocationID.xy); +#else + prepare_depths(samples, gl_GlobalInvocationID.xy); +#endif +#endif +} diff --git a/servers/rendering/renderer_rd/shaders/ssao_importance_map.glsl b/servers/rendering/renderer_rd/shaders/ssao_importance_map.glsl new file mode 100644 index 00000000000..6aa7624261b --- /dev/null +++ b/servers/rendering/renderer_rd/shaders/ssao_importance_map.glsl @@ -0,0 +1,126 @@ +/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// +// Copyright (c) 2016, Intel Corporation +// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated +// documentation files (the "Software"), to deal in the Software without restriction, including without limitation +// the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +// permit persons to whom the Software is furnished to do so, subject to the following conditions: +// The above copyright notice and this permission notice shall be included in all copies or substantial portions of +// the Software. +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO +// THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +// TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. +/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// +// File changes (yyyy-mm-dd) +// 2016-09-07: filip.strugar@intel.com: first commit +// 2020-12-05: clayjohn: convert to Vulkan and Godot +/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// + +#[compute] + +#version 450 + +VERSION_DEFINES + +layout(local_size_x = 8, local_size_y = 8, local_size_z = 1) in; + +#ifdef GENERATE_MAP +layout(set = 0, binding = 0) uniform sampler2DArray source_ssao; +#else +layout(set = 0, binding = 0) uniform sampler2D source_importance; +#endif +layout(r8, set = 1, binding = 0) uniform restrict writeonly image2D dest_image; + +#ifdef PROCESS_MAPB +layout(set = 2, binding = 0, std430) buffer Counter { + uint sum; +} +counter; +#endif + +layout(push_constant, binding = 1, std430) uniform Params { + vec2 half_screen_pixel_size; + float intensity; + float power; +} +params; + +void main() { + // Pixel being shaded + ivec2 ssC = ivec2(gl_GlobalInvocationID.xy); + +#ifdef GENERATE_MAP + // importance map stuff + uvec2 base_position = ssC * 2; + + vec2 base_uv = (vec2(base_position) + vec2(0.5f, 0.5f)) * params.half_screen_pixel_size; + + float avg = 0.0; + float minV = 1.0; + float maxV = 0.0; + for (int i = 0; i < 4; i++) { + vec4 vals = textureGather(source_ssao, vec3(base_uv, i)); + + // apply the same modifications that would have been applied in the main shader + vals = params.intensity * vals; + + vals = 1 - vals; + + vals = pow(clamp(vals, 0.0, 1.0), vec4(params.power)); + + avg += dot(vec4(vals.x, vals.y, vals.z, vals.w), vec4(1.0 / 16.0, 1.0 / 16.0, 1.0 / 16.0, 1.0 / 16.0)); + + maxV = max(maxV, max(max(vals.x, vals.y), max(vals.z, vals.w))); + minV = min(minV, min(min(vals.x, vals.y), min(vals.z, vals.w))); + } + + float min_max_diff = maxV - minV; + + imageStore(dest_image, ssC, vec4(pow(clamp(min_max_diff * 2.0, 0.0, 1.0), 0.8))); +#endif + +#ifdef PROCESS_MAPA + vec2 uv = (vec2(ssC) + 0.5f) * params.half_screen_pixel_size * 2.0; + + float centre = textureLod(source_importance, uv, 0.0).x; + + vec2 half_pixel = params.half_screen_pixel_size; + + vec4 vals; + vals.x = textureLod(source_importance, uv + vec2(-half_pixel.x * 3, -half_pixel.y), 0.0).x; + vals.y = textureLod(source_importance, uv + vec2(+half_pixel.x, -half_pixel.y * 3), 0.0).x; + vals.z = textureLod(source_importance, uv + vec2(+half_pixel.x * 3, +half_pixel.y), 0.0).x; + vals.w = textureLod(source_importance, uv + vec2(-half_pixel.x, +half_pixel.y * 3), 0.0).x; + + float avg = dot(vals, vec4(0.25, 0.25, 0.25, 0.25)); + + imageStore(dest_image, ssC, vec4(avg)); +#endif + +#ifdef PROCESS_MAPB + vec2 uv = (vec2(ssC) + 0.5f) * params.half_screen_pixel_size * 2.0; + + float centre = textureLod(source_importance, uv, 0.0).x; + + vec2 half_pixel = params.half_screen_pixel_size; + + vec4 vals; + vals.x = textureLod(source_importance, uv + vec2(-half_pixel.x, -half_pixel.y * 3), 0.0).x; + vals.y = textureLod(source_importance, uv + vec2(+half_pixel.x * 3, -half_pixel.y), 0.0).x; + vals.z = textureLod(source_importance, uv + vec2(+half_pixel.x, +half_pixel.y * 3), 0.0).x; + vals.w = textureLod(source_importance, uv + vec2(-half_pixel.x * 3, +half_pixel.y), 0.0).x; + + float avg = dot(vals, vec4(0.25, 0.25, 0.25, 0.25)); + + imageStore(dest_image, ssC, vec4(avg)); + + // sum the average; to avoid overflowing we assume max AO resolution is not bigger than 16384x16384; so quarter res (used here) will be 4096x4096, which leaves us with 8 bits per pixel + uint sum = uint(clamp(avg, 0.0, 1.0) * 255.0 + 0.5); + + // save every 9th to avoid InterlockedAdd congestion - since we're blurring, this is good enough; compensated by multiplying load_counter_avg_div by 9 + if (((ssC.x % 3) + (ssC.y % 3)) == 0) { + atomicAdd(counter.sum, sum); + } +#endif +} diff --git a/servers/rendering/renderer_rd/shaders/ssao_interleave.glsl b/servers/rendering/renderer_rd/shaders/ssao_interleave.glsl new file mode 100644 index 00000000000..4fdf334aa5f --- /dev/null +++ b/servers/rendering/renderer_rd/shaders/ssao_interleave.glsl @@ -0,0 +1,119 @@ +/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// +// Copyright (c) 2016, Intel Corporation +// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated +// documentation files (the "Software"), to deal in the Software without restriction, including without limitation +// the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +// permit persons to whom the Software is furnished to do so, subject to the following conditions: +// The above copyright notice and this permission notice shall be included in all copies or substantial portions of +// the Software. +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO +// THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +// TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +// SOFTWARE. +/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// +// File changes (yyyy-mm-dd) +// 2016-09-07: filip.strugar@intel.com: first commit +// 2020-12-05: clayjohn: convert to Vulkan and Godot +/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////// +#[compute] + +#version 450 + +VERSION_DEFINES + +layout(local_size_x = 8, local_size_y = 8, local_size_z = 1) in; + +layout(rgba8, set = 0, binding = 0) uniform restrict writeonly image2D dest_image; +layout(set = 1, binding = 0) uniform sampler2DArray source_texture; + +layout(push_constant, binding = 1, std430) uniform Params { + float inv_sharpness; + uint size_modifier; + vec2 pixel_size; +} +params; + +vec4 unpack_edges(float p_packed_val) { + uint packed_val = uint(p_packed_val * 255.5); + vec4 edgesLRTB; + edgesLRTB.x = float((packed_val >> 6) & 0x03) / 3.0; + edgesLRTB.y = float((packed_val >> 4) & 0x03) / 3.0; + edgesLRTB.z = float((packed_val >> 2) & 0x03) / 3.0; + edgesLRTB.w = float((packed_val >> 0) & 0x03) / 3.0; + + return clamp(edgesLRTB + params.inv_sharpness, 0.0, 1.0); +} + +void main() { + ivec2 ssC = ivec2(gl_GlobalInvocationID.xy); + if (any(greaterThanEqual(ssC, ivec2(1.0 / params.pixel_size)))) { //too large, do nothing + return; + } + +#ifdef MODE_SMART + float ao; + uvec2 pix_pos = uvec2(gl_GlobalInvocationID.xy); + vec2 uv = (gl_GlobalInvocationID.xy + vec2(0.5)) * params.pixel_size; + + // calculate index in the four deinterleaved source array texture + int mx = int(pix_pos.x % 2); + int my = int(pix_pos.y % 2); + int index_center = mx + my * 2; // center index + int index_horizontal = (1 - mx) + my * 2; // neighbouring, horizontal + int index_vertical = mx + (1 - my) * 2; // neighbouring, vertical + int index_diagonal = (1 - mx) + (1 - my) * 2; // diagonal + + vec2 center_val = texelFetch(source_texture, ivec3(pix_pos / uvec2(params.size_modifier), index_center), 0).xy; + + ao = center_val.x; + + vec4 edgesLRTB = unpack_edges(center_val.y); + + // convert index shifts to sampling offsets + float fmx = float(mx); + float fmy = float(my); + + // in case of an edge, push sampling offsets away from the edge (towards pixel center) + float fmxe = (edgesLRTB.y - edgesLRTB.x); + float fmye = (edgesLRTB.w - edgesLRTB.z); + + // calculate final sampling offsets and sample using bilinear filter + vec2 uv_horizontal = (gl_GlobalInvocationID.xy + vec2(0.5) + vec2(fmx + fmxe - 0.5, 0.5 - fmy)) * params.pixel_size; + float ao_horizontal = textureLod(source_texture, vec3(uv_horizontal, index_horizontal), 0.0).x; + vec2 uv_vertical = (gl_GlobalInvocationID.xy + vec2(0.5) + vec2(0.5 - fmx, fmy - 0.5 + fmye)) * params.pixel_size; + float ao_vertical = textureLod(source_texture, vec3(uv_vertical, index_vertical), 0.0).x; + vec2 uv_diagonal = (gl_GlobalInvocationID.xy + vec2(0.5) + vec2(fmx - 0.5 + fmxe, fmy - 0.5 + fmye)) * params.pixel_size; + float ao_diagonal = textureLod(source_texture, vec3(uv_diagonal, index_diagonal), 0.0).x; + + // reduce weight for samples near edge - if the edge is on both sides, weight goes to 0 + vec4 blendWeights; + blendWeights.x = 1.0; + blendWeights.y = (edgesLRTB.x + edgesLRTB.y) * 0.5; + blendWeights.z = (edgesLRTB.z + edgesLRTB.w) * 0.5; + blendWeights.w = (blendWeights.y + blendWeights.z) * 0.5; + + // calculate weighted average + float blendWeightsSum = dot(blendWeights, vec4(1.0, 1.0, 1.0, 1.0)); + ao = dot(vec4(ao, ao_horizontal, ao_vertical, ao_diagonal), blendWeights) / blendWeightsSum; + + imageStore(dest_image, ivec2(gl_GlobalInvocationID.xy), vec4(ao)); +#else // !MODE_SMART + + vec2 uv = (gl_GlobalInvocationID.xy + vec2(0.5)) * params.pixel_size; +#ifdef MODE_HALF + float a = textureLod(source_texture, vec3(uv, 0), 0.0).x; + float d = textureLod(source_texture, vec3(uv, 3), 0.0).x; + float avg = (a + d) * 0.5; + +#else + float a = textureLod(source_texture, vec3(uv, 0), 0.0).x; + float b = textureLod(source_texture, vec3(uv, 1), 0.0).x; + float c = textureLod(source_texture, vec3(uv, 2), 0.0).x; + float d = textureLod(source_texture, vec3(uv, 3), 0.0).x; + float avg = (a + b + c + d) * 0.25; + +#endif + imageStore(dest_image, ivec2(gl_GlobalInvocationID.xy), vec4(avg)); +#endif +} diff --git a/servers/rendering/renderer_rd/shaders/ssao_minify.glsl b/servers/rendering/renderer_rd/shaders/ssao_minify.glsl deleted file mode 100644 index 263fca386f6..00000000000 --- a/servers/rendering/renderer_rd/shaders/ssao_minify.glsl +++ /dev/null @@ -1,45 +0,0 @@ -#[compute] - -#version 450 - -VERSION_DEFINES - -layout(local_size_x = 8, local_size_y = 8, local_size_z = 1) in; - -layout(push_constant, binding = 1, std430) uniform Params { - vec2 pixel_size; - float z_far; - float z_near; - ivec2 source_size; - bool orthogonal; - uint pad; -} -params; - -#ifdef MINIFY_START -layout(set = 0, binding = 0) uniform sampler2D source_texture; -#else -layout(r32f, set = 0, binding = 0) uniform restrict readonly image2D source_image; -#endif -layout(r32f, set = 1, binding = 0) uniform restrict writeonly image2D dest_image; - -void main() { - ivec2 pos = ivec2(gl_GlobalInvocationID.xy); - - if (any(greaterThan(pos, params.source_size >> 1))) { //too large, do nothing - return; - } - -#ifdef MINIFY_START - float depth = texelFetch(source_texture, pos << 1, 0).r * 2.0 - 1.0; - if (params.orthogonal) { - depth = ((depth + (params.z_far + params.z_near) / (params.z_far - params.z_near)) * (params.z_far - params.z_near)) / 2.0; - } else { - depth = 2.0 * params.z_near * params.z_far / (params.z_far + params.z_near - depth * (params.z_far - params.z_near)); - } -#else - float depth = imageLoad(source_image, pos << 1).r; -#endif - - imageStore(dest_image, pos, vec4(depth)); -} diff --git a/servers/rendering/renderer_scene.h b/servers/rendering/renderer_scene.h index 56c38beaa3e..22af720ae7b 100644 --- a/servers/rendering/renderer_scene.h +++ b/servers/rendering/renderer_scene.h @@ -132,9 +132,9 @@ public: virtual void environment_set_ssr(RID p_env, bool p_enable, int p_max_steps, float p_fade_int, float p_fade_out, float p_depth_tolerance) = 0; virtual void environment_set_ssr_roughness_quality(RS::EnvironmentSSRRoughnessQuality p_quality) = 0; - virtual void environment_set_ssao(RID p_env, bool p_enable, float p_radius, float p_intensity, float p_bias, float p_light_affect, float p_ao_channel_affect, RS::EnvironmentSSAOBlur p_blur, float p_bilateral_sharpness) = 0; + virtual void environment_set_ssao(RID p_env, bool p_enable, float p_radius, float p_intensity, float p_power, float p_detail, float p_horizon, float p_sharpness, float p_light_affect, float p_ao_channel_affect) = 0; - virtual void environment_set_ssao_quality(RS::EnvironmentSSAOQuality p_quality, bool p_half_size) = 0; + virtual void environment_set_ssao_quality(RS::EnvironmentSSAOQuality p_quality, bool p_half_size, float p_adaptive_target, int p_blur_passes, float p_fadeout_from, float p_fadeout_to) = 0; virtual void environment_set_sdfgi(RID p_env, bool p_enable, RS::EnvironmentSDFGICascades p_cascades, float p_min_cell_size, RS::EnvironmentSDFGIYScale p_y_scale, bool p_use_occlusion, bool p_use_multibounce, bool p_read_sky, float p_energy, float p_normal_bias, float p_probe_bias) = 0; diff --git a/servers/rendering/renderer_scene_cull.h b/servers/rendering/renderer_scene_cull.h index 4d82d873cc4..051919314e5 100644 --- a/servers/rendering/renderer_scene_cull.h +++ b/servers/rendering/renderer_scene_cull.h @@ -516,8 +516,8 @@ public: PASS6(environment_set_ssr, RID, bool, int, float, float, float) PASS1(environment_set_ssr_roughness_quality, RS::EnvironmentSSRRoughnessQuality) - PASS9(environment_set_ssao, RID, bool, float, float, float, float, float, RS::EnvironmentSSAOBlur, float) - PASS2(environment_set_ssao_quality, RS::EnvironmentSSAOQuality, bool) + PASS10(environment_set_ssao, RID, bool, float, float, float, float, float, float, float, float) + PASS6(environment_set_ssao_quality, RS::EnvironmentSSAOQuality, bool, float, int, float, float) PASS11(environment_set_glow, RID, bool, Vector, float, float, float, float, RS::EnvironmentGlowBlendMode, float, float, float) PASS1(environment_glow_set_use_bicubic_upscale, bool) diff --git a/servers/rendering/renderer_scene_render.h b/servers/rendering/renderer_scene_render.h index e1f3ea9b6be..b6bdcab88f2 100644 --- a/servers/rendering/renderer_scene_render.h +++ b/servers/rendering/renderer_scene_render.h @@ -96,9 +96,9 @@ public: virtual void environment_set_ssr(RID p_env, bool p_enable, int p_max_steps, float p_fade_int, float p_fade_out, float p_depth_tolerance) = 0; virtual void environment_set_ssr_roughness_quality(RS::EnvironmentSSRRoughnessQuality p_quality) = 0; - virtual void environment_set_ssao(RID p_env, bool p_enable, float p_radius, float p_intensity, float p_bias, float p_light_affect, float p_ao_channel_affect, RS::EnvironmentSSAOBlur p_blur, float p_bilateral_sharpness) = 0; + virtual void environment_set_ssao(RID p_env, bool p_enable, float p_radius, float p_intensity, float p_power, float p_detail, float p_horizon, float p_sharpness, float p_light_affect, float p_ao_channel_affect) = 0; - virtual void environment_set_ssao_quality(RS::EnvironmentSSAOQuality p_quality, bool p_half_size) = 0; + virtual void environment_set_ssao_quality(RS::EnvironmentSSAOQuality p_quality, bool p_half_size, float p_adaptive_target, int p_blur_passes, float p_fadeout_from, float p_fadeout_to) = 0; virtual void environment_set_sdfgi(RID p_env, bool p_enable, RS::EnvironmentSDFGICascades p_cascades, float p_min_cell_size, RS::EnvironmentSDFGIYScale p_y_scale, bool p_use_occlusion, bool p_use_multibounce, bool p_read_sky, float p_energy, float p_normal_bias, float p_probe_bias) = 0; diff --git a/servers/rendering/rendering_device.h b/servers/rendering/rendering_device.h index 524d7749c98..b3d4e66f6ce 100644 --- a/servers/rendering/rendering_device.h +++ b/servers/rendering/rendering_device.h @@ -433,6 +433,7 @@ public: TEXTURE_SLICE_2D, TEXTURE_SLICE_CUBEMAP, TEXTURE_SLICE_3D, + TEXTURE_SLICE_2D_ARRAY, }; virtual RID texture_create_shared_from_slice(const TextureView &p_view, RID p_with_texture, uint32_t p_layer, uint32_t p_mipmap, TextureSliceType p_slice_type = TEXTURE_SLICE_2D) = 0; diff --git a/servers/rendering/rendering_server_default.h b/servers/rendering/rendering_server_default.h index dd1a41a4f77..11220dcf79a 100644 --- a/servers/rendering/rendering_server_default.h +++ b/servers/rendering/rendering_server_default.h @@ -590,8 +590,8 @@ public: BIND6(environment_set_ssr, RID, bool, int, float, float, float) BIND1(environment_set_ssr_roughness_quality, EnvironmentSSRRoughnessQuality) - BIND9(environment_set_ssao, RID, bool, float, float, float, float, float, EnvironmentSSAOBlur, float) - BIND2(environment_set_ssao_quality, EnvironmentSSAOQuality, bool) + BIND10(environment_set_ssao, RID, bool, float, float, float, float, float, float, float, float) + BIND6(environment_set_ssao_quality, EnvironmentSSAOQuality, bool, float, int, float, float) BIND11(environment_set_glow, RID, bool, Vector, float, float, float, float, EnvironmentGlowBlendMode, float, float, float) BIND1(environment_glow_set_use_bicubic_upscale, bool) diff --git a/servers/rendering/rendering_server_wrap_mt.h b/servers/rendering/rendering_server_wrap_mt.h index 20556779fed..ec71178daea 100644 --- a/servers/rendering/rendering_server_wrap_mt.h +++ b/servers/rendering/rendering_server_wrap_mt.h @@ -498,9 +498,9 @@ public: FUNC6(environment_set_ssr, RID, bool, int, float, float, float) FUNC1(environment_set_ssr_roughness_quality, EnvironmentSSRRoughnessQuality) - FUNC9(environment_set_ssao, RID, bool, float, float, float, float, float, EnvironmentSSAOBlur, float) + FUNC10(environment_set_ssao, RID, bool, float, float, float, float, float, float, float, float) - FUNC2(environment_set_ssao_quality, EnvironmentSSAOQuality, bool) + FUNC6(environment_set_ssao_quality, EnvironmentSSAOQuality, bool, float, int, float, float) FUNC11(environment_set_sdfgi, RID, bool, EnvironmentSDFGICascades, float, EnvironmentSDFGIYScale, bool, bool, bool, float, float, float) FUNC1(environment_set_sdfgi_ray_count, EnvironmentSDFGIRayCount) diff --git a/servers/rendering_server.cpp b/servers/rendering_server.cpp index b643b14040c..3eff92d916b 100644 --- a/servers/rendering_server.cpp +++ b/servers/rendering_server.cpp @@ -1668,7 +1668,7 @@ void RenderingServer::_bind_methods() { ClassDB::bind_method(D_METHOD("environment_set_tonemap", "env", "tone_mapper", "exposure", "white", "auto_exposure", "min_luminance", "max_luminance", "auto_exp_speed", "auto_exp_grey"), &RenderingServer::environment_set_tonemap); ClassDB::bind_method(D_METHOD("environment_set_adjustment", "env", "enable", "brightness", "contrast", "saturation", "use_1d_color_correction", "color_correction"), &RenderingServer::environment_set_adjustment); ClassDB::bind_method(D_METHOD("environment_set_ssr", "env", "enable", "max_steps", "fade_in", "fade_out", "depth_tolerance"), &RenderingServer::environment_set_ssr); - ClassDB::bind_method(D_METHOD("environment_set_ssao", "env", "enable", "radius", "intensity", "bias", "light_affect", "ao_channel_affect", "blur", "bilateral_sharpness"), &RenderingServer::environment_set_ssao); + ClassDB::bind_method(D_METHOD("environment_set_ssao", "env", "enable", "radius", "intensity", "power", "detail", "horizon", "sharpness", "light_affect", "ao_channel_affect"), &RenderingServer::environment_set_ssao); ClassDB::bind_method(D_METHOD("environment_set_fog", "env", "enable", "light_color", "light_energy", "sun_scatter", "density", "height", "height_density", "aerial_perspective"), &RenderingServer::environment_set_fog); ClassDB::bind_method(D_METHOD("scenario_create"), &RenderingServer::scenario_create); @@ -2038,11 +2038,7 @@ void RenderingServer::_bind_methods() { BIND_ENUM_CONSTANT(ENV_SSR_ROUGNESS_QUALITY_MEDIUM); BIND_ENUM_CONSTANT(ENV_SSR_ROUGNESS_QUALITY_HIGH); - BIND_ENUM_CONSTANT(ENV_SSAO_BLUR_DISABLED); - BIND_ENUM_CONSTANT(ENV_SSAO_BLUR_1x1); - BIND_ENUM_CONSTANT(ENV_SSAO_BLUR_2x2); - BIND_ENUM_CONSTANT(ENV_SSAO_BLUR_3x3); - + BIND_ENUM_CONSTANT(ENV_SSAO_QUALITY_VERY_LOW); BIND_ENUM_CONSTANT(ENV_SSAO_QUALITY_LOW); BIND_ENUM_CONSTANT(ENV_SSAO_QUALITY_MEDIUM); BIND_ENUM_CONSTANT(ENV_SSAO_QUALITY_HIGH); @@ -2309,9 +2305,18 @@ RenderingServer::RenderingServer() { ProjectSettings::get_singleton()->set_custom_property_info("rendering/quality/depth_of_field/depth_of_field_bokeh_quality", PropertyInfo(Variant::INT, "rendering/quality/depth_of_field/depth_of_field_bokeh_quality", PROPERTY_HINT_ENUM, "Very Low (Fastest),Low (Fast),Medium (Average),High (Slow)")); GLOBAL_DEF("rendering/quality/depth_of_field/depth_of_field_use_jitter", false); - GLOBAL_DEF("rendering/quality/ssao/quality", 1); - ProjectSettings::get_singleton()->set_custom_property_info("rendering/quality/ssao/quality", PropertyInfo(Variant::INT, "rendering/quality/ssao/quality", PROPERTY_HINT_ENUM, "Low (Fast),Medium (Average),High (Slow),Ultra (Slower)")); + GLOBAL_DEF("rendering/quality/ssao/quality", 2); + ProjectSettings::get_singleton()->set_custom_property_info("rendering/quality/ssao/quality", PropertyInfo(Variant::INT, "rendering/quality/ssao/quality", PROPERTY_HINT_ENUM, "Very Low (Fast),Low (Fast),Medium (Average),High (Slow),Ultra (Custom)")); GLOBAL_DEF("rendering/quality/ssao/half_size", false); + GLOBAL_DEF("rendering/quality/ssao/half_size.mobile", true); + GLOBAL_DEF("rendering/quality/ssao/adaptive_target", 0.5); + ProjectSettings::get_singleton()->set_custom_property_info("rendering/quality/ssao/adaptive_target", PropertyInfo(Variant::FLOAT, "rendering/quality/ssao/adaptive_target", PROPERTY_HINT_RANGE, "0.0,1.0,0.01")); + GLOBAL_DEF("rendering/quality/ssao/blur_passes", 2); + ProjectSettings::get_singleton()->set_custom_property_info("rendering/quality/ssao/blur_passes", PropertyInfo(Variant::INT, "rendering/quality/ssao/blur_passes", PROPERTY_HINT_RANGE, "0,6")); + GLOBAL_DEF("rendering/quality/ssao/fadeout_from", 50.0); + ProjectSettings::get_singleton()->set_custom_property_info("rendering/quality/ssao/fadeout_from", PropertyInfo(Variant::FLOAT, "rendering/quality/ssao/fadeout_from", PROPERTY_HINT_RANGE, "0.0,512,0.1,or_greater")); + GLOBAL_DEF("rendering/quality/ssao/fadeout_to", 300.0); + ProjectSettings::get_singleton()->set_custom_property_info("rendering/quality/ssao/fadeout_to", PropertyInfo(Variant::FLOAT, "rendering/quality/ssao/fadeout_to", PROPERTY_HINT_RANGE, "64,65536,0.1,or_greater")); GLOBAL_DEF("rendering/quality/screen_filters/screen_space_roughness_limiter_enabled", true); GLOBAL_DEF("rendering/quality/screen_filters/screen_space_roughness_limiter_amount", 0.25); diff --git a/servers/rendering_server.h b/servers/rendering_server.h index 6f7359b2768..4dc4f332fac 100644 --- a/servers/rendering_server.h +++ b/servers/rendering_server.h @@ -940,23 +940,17 @@ public: virtual void environment_set_ssr_roughness_quality(EnvironmentSSRRoughnessQuality p_quality) = 0; - enum EnvironmentSSAOBlur { - ENV_SSAO_BLUR_DISABLED, - ENV_SSAO_BLUR_1x1, - ENV_SSAO_BLUR_2x2, - ENV_SSAO_BLUR_3x3, - }; - - virtual void environment_set_ssao(RID p_env, bool p_enable, float p_radius, float p_intensity, float p_bias, float p_light_affect, float p_ao_channel_affect, EnvironmentSSAOBlur p_blur, float p_bilateral_sharpness) = 0; + virtual void environment_set_ssao(RID p_env, bool p_enable, float p_radius, float p_intensity, float p_power, float p_detail, float p_horizon, float p_sharpness, float p_light_affect, float p_ao_channel_affect) = 0; enum EnvironmentSSAOQuality { + ENV_SSAO_QUALITY_VERY_LOW, ENV_SSAO_QUALITY_LOW, ENV_SSAO_QUALITY_MEDIUM, ENV_SSAO_QUALITY_HIGH, ENV_SSAO_QUALITY_ULTRA, }; - virtual void environment_set_ssao_quality(EnvironmentSSAOQuality p_quality, bool p_half_size) = 0; + virtual void environment_set_ssao_quality(EnvironmentSSAOQuality p_quality, bool p_half_size, float p_adaptive_target, int p_blur_passes, float p_fadeout_from, float p_fadeout_to) = 0; enum EnvironmentSDFGICascades { ENV_SDFGI_CASCADES_4, @@ -1488,7 +1482,6 @@ VARIANT_ENUM_CAST(RenderingServer::EnvironmentReflectionSource); VARIANT_ENUM_CAST(RenderingServer::EnvironmentGlowBlendMode); VARIANT_ENUM_CAST(RenderingServer::EnvironmentToneMapper); VARIANT_ENUM_CAST(RenderingServer::EnvironmentSSRRoughnessQuality); -VARIANT_ENUM_CAST(RenderingServer::EnvironmentSSAOBlur); VARIANT_ENUM_CAST(RenderingServer::EnvironmentSSAOQuality); VARIANT_ENUM_CAST(RenderingServer::SubSurfaceScatteringQuality); VARIANT_ENUM_CAST(RenderingServer::DOFBlurQuality);