
Dissecting the Unreal Rendering System (06): UE5 Special, Part 2 (Lumen and Others)


6.5 Lumen

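6.5.1 Lumen Technical Features

Section 6.2.2.2 (Lumen global dynamic illumination) briefly introduced what Lumen offers: indirect lighting, sky lighting, emissive lighting, soft and hard shadows, reflections, and so on. This section looks at its technical features in more detail.

First, Lumen is a combination of several techniques rather than a single one. For example, Lumen defaults to software ray tracing against signed distance fields (SDF), but when hardware ray tracing is enabled it can reach higher quality on supported video cards.

The main techniques involved in Lumen are listed below.

6.5.1.1 Surface Cache

Lumen generates an automatic parameterization of the nearby scene surfaces, called the Surface Cache, which is used to quickly look up lighting at ray hit points in the scene. Lumen captures material properties for each mesh from multiple angles. These capture positions are called Cards and are generated offline, per mesh. The console command r.Lumen.Visualize.CardPlacement 1 visualizes the Lumen Cards:

Top: normal rendering. Bottom: Lumen Card visualization.

Nanite accelerates the mesh captures that keep the Surface Cache in sync with the triangle scene. High-polygon meshes in particular need Nanite for captures to be efficient.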

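Once the Surface Cache has been populated with material properties, Lumen computes direct and indirect lighting for these surface positions. The updates are amortized over multiple frames, which gives efficient support for many dynamic lights and multi-bounce global illumination.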
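To make "amortized over multiple frames" concrete, here is a minimal sketch of a budgeted round-robin update. It is not engine code: CacheEntry, AmortizedUpdater and the per-frame budget are hypothetical names chosen for illustration, and Lumen's real scheduling is considerably more involved.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one surface cache entry (not an engine type).
struct CacheEntry {
    int LastUpdatedFrame = -1;
};

// Refreshes at most Budget entries per call and resumes where the previous
// call stopped, so the whole cache is revisited over several frames.
class AmortizedUpdater {
public:
    explicit AmortizedUpdater(std::size_t InBudget) : Budget(InBudget) {}

    // Returns how many entries were refreshed this frame.
    std::size_t Tick(std::vector<CacheEntry>& Entries, int FrameNumber) {
        if (Entries.empty() || Budget == 0) {
            return 0;
        }
        Cursor %= Entries.size();  // stay in range if the cache shrank
        const std::size_t Count = Budget < Entries.size() ? Budget : Entries.size();
        for (std::size_t i = 0; i < Count; ++i) {
            Entries[Cursor].LastUpdatedFrame = FrameNumber;  // stands in for relighting
            Cursor = (Cursor + 1) % Entries.size();
        }
        return Count;
    }

private:
    std::size_t Budget;
    std::size_t Cursor = 0;
};
```

With 10 entries and a budget of 4, every entry is refreshed within three frames, and the cost per frame stays bounded no matter how large the cache grows.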

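Only meshes with simple interiors are supported. Walls, floors, and ceilings should each be a separate mesh rather than being merged into one large mesh.

6.5.1.2 Screen Tracing

Lumen traces rays against the screen first (called screen tracing or screen-space tracing), and falls back to a more reliable method if no hit is found or the ray passes behind a surface.

The downside of screen tracing is that it greatly limits the art-direction controls that apply only to indirect lighting, such as Indirect Lighting Scale and Emissive Boost.

Lumen's ray tracing starts with screen traces and only then uses the other, more expensive tracing options. If screen traces are disabled for GI and reflections, only the Lumen Scene is visible. Screen traces support any geometry type and help cover up mismatches between the Lumen Scene and the triangle scene.

Use r.Lumen.ScreenProbeGather.ScreenTraces 0|1 to turn screen traces off or on and compare the results:

Top: with Lumen screen traces enabled. Bottom: with Lumen screen traces disabled. The difference is most visible in reflections, followed by some of the indirect lighting.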
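The "screen first, then something more reliable" order can be sketched as an ordered chain of tracers in which the first one to resolve the ray wins. This is only an illustration of the control flow: Ray, Tracer and TraceWithFallback are hypothetical names, not Lumen types.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

struct Ray {
    float Origin[3];
    float Direction[3];
};

// A tracer writes the hit distance and returns true, or returns false when it
// cannot resolve the ray (no hit, or the ray went behind a surface).
using Tracer = std::function<bool(const Ray&, float&)>;

// Tries the tracers in order, cheapest first. Returns the index of the tracer
// that resolved the ray, or -1 if every tracer failed.
int TraceWithFallback(const Ray& R, const std::vector<Tracer>& Tracers, float& OutDistance) {
    for (std::size_t i = 0; i < Tracers.size(); ++i) {
        float Distance = 0.0f;
        if (Tracers[i](R, Distance)) {
            OutDistance = Distance;
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

In this picture the screen tracer sits at index 0 and the distance field or hardware tracers follow it. Disabling screen traces amounts to removing the first entry, which is why only the Lumen Scene remains visible in that case.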

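6.5.1.3 Lumen Ray Tracing

Lumen supports two ray tracing modes:

1. Software ray tracing. Runs on the widest range of hardware and platforms.

2. Hardware ray tracing. Requires video card and operating system support.

  • Software ray tracing

By default Lumen uses software ray tracing that relies on signed distance fields, which means it can run on hardware that supports SM5.

Generate Mesh Distance Fields must be enabled in the project settings. UE5 enables it by default.

The renderer merges the mesh distance fields into a Global Distance Field to accelerate tracing. By default, Lumen traces against each mesh's individual distance field for the first two meters of a ray for accuracy, and against the merged Global Distance Field for the rest of the ray. If a project needs precise control over Lumen's software ray tracing, the Software Ray Tracing Mode can be changed in the project settings:

Detail Tracing is the default tracing method. It uses the individual mesh distance fields to achieve high-quality GI (only for the first two meters; the Global Distance Field is used beyond that). Global Tracing traces only against the Global Distance Field, which is faster but loses some image quality.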
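The following is a minimal sphere-tracing sketch of the "detailed field near the ray origin, global field farther out" idea. It assumes distance fields are plain callables and expresses the range in the same units as the field. DistanceField, SphereTrace, the step limit and the hit epsilon are illustrative choices, not Lumen's actual implementation.

```cpp
#include <cmath>
#include <functional>

struct Vec3 {
    float X, Y, Z;
};

inline Vec3 Advance(const Vec3& Origin, const Vec3& Dir, float T) {
    return Vec3{Origin.X + Dir.X * T, Origin.Y + Dir.Y * T, Origin.Z + Dir.Z * T};
}

// A signed distance field: distance from a point to the nearest surface.
using DistanceField = std::function<float(const Vec3&)>;

// Sphere tracing: step along the ray by the sampled distance, which can never
// overshoot the nearest surface. The detailed field is sampled while the ray
// is within DetailRange of its origin, the coarser global field afterwards.
// Returns the hit distance, or a negative value on a miss.
float SphereTrace(const Vec3& Origin, const Vec3& Dir,
                  const DistanceField& DetailField,
                  const DistanceField& GlobalField,
                  float DetailRange, float MaxDistance) {
    const float HitEpsilon = 0.01f;
    float T = 0.0f;
    for (int Step = 0; Step < 256 && T < MaxDistance; ++Step) {
        const Vec3 P = Advance(Origin, Dir, T);
        const float Dist = (T < DetailRange) ? DetailField(P) : GlobalField(P);
        if (Dist < HitEpsilon) {
            return T;
        }
        T += Dist;
    }
    return -1.0f;
}
```

Passing the same field for both arguments corresponds to a single-field trace. Passing a blurrier global field for the second argument trades accuracy for speed past DetailRange, which is the trade-off the two tracing modes describe.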

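Mesh distance fields are streamed in and out dynamically as the camera moves through the world. They are packed into an atlas, whose statistics can be logged with the console command r.DistanceFields.LogAtlasStats 1:

Because the quality of Lumen's software ray tracing depends heavily on the mesh distance fields, paying attention to their quality improves Lumen's GI. The figure below shows the menu for displaying the mesh distance fields and the Global Distance Field:

The next two figures visualize the mesh distance fields and the Global Distance Field, respectively:

However, software ray tracing has a number of limitations, mainly:

  • Geometry limitations:

    • The Lumen Scene only supports static meshes, instanced static meshes, and Hierarchical Instanced Static Meshes.
    • Landscape geometry is not supported, so it does not bounce indirect light. Support is planned for the future.
  • Material limitations:

    • World Position Offset (WPO) is not supported.
    • Transparent objects are not supported, and Masked objects are treated as opaque.
    • Distance field data is built from the material properties of the static mesh asset, not from component overrides. This means changing the material at runtime does not affect Lumen's GI.
  • Workflow limitations:

    • Software ray tracing requires levels to be built from modular pieces. Walls, floors, and ceilings should be separate meshes. Large meshes (such as a mountain) are represented poorly and can cause self-occlusion artifacts.
    • Walls should be at least 10 cm thick to avoid light leaking.
    • The distance field resolution depends on the settings used when the static mesh was imported. If the compression is too aggressive, the resulting distance field data will be of low quality.
    • Distance fields cannot represent very thin objects.

That covers Lumen's software ray tracing. Next comes hardware ray tracing.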

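  • Hardware ray tracing

Hardware ray tracing supports a wider range of geometry types than software ray tracing; in particular, it supports tracing skinned meshes. It also scales better toward higher image quality: it intersects rays with the actual triangles and has the option to evaluate lighting at the ray hit point instead of reading the lower-quality Surface Cache.

However, hardware ray tracing has a high scene setup cost and currently does not scale to scenes with more than 100,000 instances. Dynamically deforming meshes (such as skinned meshes) also incur a large cost for updating the ray tracing acceleration structures every frame, and that cost is proportional to the number of skinned triangles.

For static meshes that use Nanite, hardware ray tracing can, for efficiency, only operate on the Proxy Mesh generated from the Nanite Proxy Triangle Percent setting in the Static Mesh Editor. These Proxy Meshes can be visualized by toggling the console command r.Nanite 0|1:

Top: the full-detail triangle mesh. Bottom: the corresponding Nanite Proxy Mesh.

Screen traces are used to cover up the mismatch between the full-detail triangle mesh rendered by Nanite and the Proxy Mesh that Lumen traces rays against. In some cases, however, the mismatch is too large to hide. The two images above show self-shadowing artifacts caused by a Proxy Triangle Percent value that is too small.

Lumen uses hardware ray tracing only when the following conditions are met:

  • Both Use Hardware Ray Tracing when available and Support Hardware Ray Tracing are enabled in the project settings.
  • The project runs on a supported operating system, RHI, and video card. Currently only the following platforms support hardware ray tracing:
    • Windows 10 with DirectX 12.
    • PlayStation 5.
    • Xbox Series S / X.
  • The video card must be an NVIDIA RTX-2000 series or newer, or an AMD RX 6000 series or newer.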

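6.5.1.4 Other Notes on Lumen

The Lumen Scene operates on the world around the camera rather than on the whole world, which makes large worlds and streaming possible. Lumen relies on Nanite's LOD and multi-view rasterization for fast scene captures to maintain the Surface Cache, with all operations throttled to avoid hitches. Lumen does not require Nanite to operate, but in scenes that have not enabled Nanite, Lumen's scene capture becomes very slow, especially when assets lack good LOD setups.

Lumen's Surface Cache covers up to 200 meters from the camera. Beyond that range, only screen traces are active for global illumination.

Lumen also has other limitations:

  • Lumen global illumination cannot be used together with lightmaps. In the future, Lumen's reflections should be extended to work with global illumination in lightmaps, which would further improve rendering quality.
  • Foliage is not yet well supported, because Lumen relies heavily on downsampled rendering and temporal filtering.
  • Lumen's Final Gather can add significant noise around moving objects. It is still under active development.
  • Transparent materials do not yet support Lumen reflections.
  • Transparent materials do not have high-quality dynamic GI.

Some of Lumen's debugging and visualization views:

Top: normal view. Middle: Lumen Scene visualization. Bottom: Lumen GI visualization.

Besides the visualization options shown above, Lumen has many other visualization console variables: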

r.Lumen.RadianceCache.Visualize    
r.Lumen.RadianceCache.VisualizeClipmapIndex
r.Lumen.RadianceCache.VisualizeProbeRadius
r.Lumen.RadianceCache.VisualizeRadiusScale

r.Lumen.ScreenProbeGather.VisualizeTraces    
r.Lumen.ScreenProbeGather.VisualizeTracesFreeze

r.Lumen.Visualize.CardInterpolateInfluenceRadius    
r.Lumen.Visualize.CardPlacement
r.Lumen.Visualize.CardPlacementDistance    
r.Lumen.Visualize.CardPlacementIndex    
r.Lumen.Visualize.CardPlacementOrientation    
r.Lumen.Visualize.ClipmapIndex    
r.Lumen.Visualize.ConeAngle    
r.Lumen.Visualize.ConeStepFactor
r.Lumen.Visualize.GridPixelSize    
r.Lumen.Visualize.HardwareRayTracing
r.Lumen.Visualize.HardwareRayTracing.DeferredMaterial    
r.Lumen.Visualize.HardwareRayTracing.DeferredMaterial.TileDimension
r.Lumen.Visualize.HardwareRayTracing.LightingMode
r.Lumen.Visualize.HardwareRayTracing.MaxTranslucentSkipCount
r.Lumen.Visualize.MaxMeshSDFTraceDistance
r.Lumen.Visualize.MaxTraceDistance    
r.Lumen.Visualize.MinTraceDistance    
r.Lumen.Visualize.Stats    
r.Lumen.Visualize.TraceMeshSDFs    
r.Lumen.Visualize.TraceRadianceCache
r.Lumen.Visualize.VoxelFaceIndex
r.Lumen.Visualize.Voxels
r.Lumen.Visualize.VoxelStepFactor    

ShowFlag.LumenGlobalIllumination
ShowFlag.LumenReflections
ShowFlag.VisualizeLumenIndirectDiffuse
ShowFlag.VisualizeLumenScene

There are many other control commands as well. A partial list:

r.Lumen.DiffuseIndirect.Allow
r.Lumen.DiffuseIndirect.CardInterpolateInfluenceRadius
r.Lumen.DiffuseIndirect.CardTraceEndDistanceFromCamera    

r.Lumen.DirectLighting    
r.Lumen.DirectLighting.BatchSize    
r.Lumen.DirectLighting.CardUpdateFrequencyScale    

r.Lumen.HardwareRayTracing
r.Lumen.HardwareRayTracing.PullbackBias
r.Lumen.IrradianceFieldGather
r.Lumen.IrradianceFieldGather.ClipmapDistributionBase
r.Lumen.IrradianceFieldGather.ClipmapWorldExtent

r.Lumen.MaxConeSteps
r.Lumen.MaxTraceDistance
r.Lumen.ProbeHierarchy
r.Lumen.ProbeHierarchy.AdditionalSpecularRayThreshold
r.Lumen.ProbeHierarchy.AntiTileAliasing

r.Lumen.RadianceCache.DownsampleDistanceFromCamera
r.Lumen.RadianceCache.ForceFullUpdate    
r.Lumen.RadianceCache.NumFramesToKeepCachedProbes    

r.Lumen.Radiosity    
r.Lumen.Radiosity.CardUpdateFrequencyScale    
r.Lumen.Radiosity.ComputeScatter    
r.Lumen.Radiosity.ConeAngleScale

r.Lumen.Reflections.Allow
r.Lumen.Reflections.DownsampleFactor    
r.Lumen.Reflections.GGXSamplingBias    
r.Lumen.Reflections.HardwareRayTracing
r.Lumen.Reflections.HardwareRayTracing.DeferredMaterial

r.Lumen.Reflections.HierarchicalScreenTraces.UncertainTraceRelativeDepthThreshold
r.Lumen.Reflections.MaxRayIntensity
r.Lumen.Reflections.MaxRoughnessToTrace    
r.Lumen.Reflections.RoughnessFadeLength    
r.Lumen.Reflections.ScreenSpaceReconstruction

r.Lumen.Reflections.ScreenTraces
r.Lumen.Reflections.Temporal
r.Lumen.Reflections.Temporal.DistanceThreshold
r.Lumen.Reflections.Temporal.HistoryWeight
r.Lumen.Reflections.TraceMeshSDFs

r.Lumen.ScreenProbeGather
r.Lumen.ScreenProbeGather.AdaptiveProbeAllocationFraction
r.Lumen.ScreenProbeGather.AdaptiveProbeMinDownsampleFactor
r.Lumen.ScreenProbeGather.DiffuseIntegralMethod
r.Lumen.ScreenProbeGather.DownsampleFactor
r.Lumen.ScreenProbeGather.FixedJitterIndex
r.Lumen.ScreenProbeGather.FullResolutionJitterWidth
r.Lumen.ScreenProbeGather.GatherNumMips
r.Lumen.ScreenProbeGather.GatherOctahedronResolutionScale
r.Lumen.ScreenProbeGather.HardwareRayTracing

r.Lumen.ScreenProbeGather.ImportanceSample.ProbeRadianceHistory
r.Lumen.ScreenProbeGather.MaxRayIntensity
r.Lumen.ScreenProbeGather.OctahedralSolidAngleTextureSize
r.Lumen.ScreenProbeGather.RadianceCache
r.Lumen.ScreenProbeGather.RadianceCache.ClipmapDistributionBase

r.Lumen.ScreenProbeGather.ReferenceMode
r.Lumen.ScreenProbeGather.ScreenSpaceBentNormal
r.Lumen.ScreenProbeGather.ScreenTraces
r.Lumen.ScreenProbeGather.ScreenTraces.HZBTraversal

r.Lumen.ScreenProbeGather.SpatialFilterHalfKernelSize
r.Lumen.ScreenProbeGather.SpatialFilterMaxRadianceHitAngle

r.Lumen.ScreenProbeGather.Temporal    
r.Lumen.ScreenProbeGather.Temporal.ClearHistoryEveryFrame    

r.Lumen.ScreenProbeGather.TraceMeshSDFs    
r.Lumen.ScreenProbeGather.TracingOctahedronResolution
r.Lumen.TraceMeshSDFs
r.Lumen.TraceMeshSDFs.Allow    
r.Lumen.TranslucencyVolume.ConeAngleScale    
r.Lumen.TranslucencyVolume.Enable    
r.Lumen.TranslucencyVolume.EndDistanceFromCamera    

r.LumenParallelBeginUpdate
r.LumenScene.CardAtlasAllocatorBinSize    
r.LumenScene.CardAtlasSize    
r.LumenScene.CardCameraDistanceTexelDensityScale
r.LumenScene.CardCaptureMargin

r.LumenScene.ClipmapResolution    
r.LumenScene.ClipmapWorldExtent    
r.LumenScene.ClipmapZResolutionDivisor    
r.LumenScene.DiffuseReflectivityOverride    
r.LumenScene.DistantScene
r.LumenScene.DistantScene.CardResolution    

r.LumenScene.FastCameraMode
r.LumenScene.GlobalDFClipmapExtent    
r.LumenScene.GlobalDFResolution    
r.LumenScene.HeightfieldSlopeThreshold    
r.LumenScene.MaxInstanceAddsPerFrame
r.LumenScene.MeshCardsCullFaces    
r.LumenScene.MeshCardsMaxLOD

r.LumenScene.NaniteMultiViewCapture    
r.LumenScene.NumClipmapLevels    
r.LumenScene.PrimitivesPerPacket
r.LumenScene.RecaptureEveryFrame    
r.LumenScene.Reset
r.LumenScene.UploadCardBufferEveryFrame    
r.LumenScene.VoxelLightingAverageObjectsPerVisBufferTile

r.SSGI.AllowStandaloneLumenProbeHierarchy
r.Water.SingleLayer.LumenReflections

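Lumen has well over a hundred related console variables, which gives a sense of how complex its rendering is.

6.5.2 Lumen Rendering Basics

This section covers the basic concepts and types related to Lumen.

6.5.2.1 FLumenCard

FLumenCard is the Card mentioned in the previous section and the basic building block of FLumenMeshCards.

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneData.h

// Lumen card type.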
class FLumenCard
{
public:
    FLumenCard();
    ~FLumenCard();

    // World-space bounding box.
    FBox WorldBounds;
    // Rotation (the card's axes in world space).
    FVector LocalToWorldRotationX;
    FVector LocalToWorldRotationY;
    FVector LocalToWorldRotationZ;
    // Position.
    FVector Origin;
    // Local-space extent.
    FVector LocalExtent;
    
    // Whether the card is visible.
    bool bVisible = false;
    // Whether the card belongs to the distant scene.
    bool bDistantScene = false;

    // Atlas allocation info.
    bool bAllocated = false;
    FIntPoint DesiredResolution;
    FIntRect AtlasAllocation;

    // Orientation.
    int32 Orientation = -1;
    // Index in the visible card index buffer.
    int32 IndexInVisibleCardIndexBuffer = -1;
    // Index within the card list of the owning FLumenMeshCards.
    int32 IndexInMeshCards = -1;
    // Index of the owning FLumenMeshCards.
    int32 MeshCardsIndex = -1;
    // Resolution scale.
    float ResolutionScale = 1.0f;

    // Initialization.
    void Initialize(float InResolutionScale, const FMatrix& LocalToWorld, const FLumenCardBuildData& CardBuildData, int32 InIndexInMeshCards, int32 InMeshCardsIndex);

    // Set the transform data.
    void SetTransform(const FMatrix& LocalToWorld, FVector CardLocalCenter, FVector CardLocalExtent, int32 InOrientation);
    void SetTransform(const FMatrix& LocalToWorld, const FVector& LocalOrigin, const FVector& CardToLocalRotationX, const FVector& CardToLocalRotationY, const FVector& CardToLocalRotationZ, const FVector& InLocalExtent);

    // Remove from the atlas (scene).
    void RemoveFromAtlas(FLumenSceneData& LumenSceneData);

    int32 GetNumTexels() const
    {
        return AtlasAllocation.Area();
    }

    inline FVector TransformWorldPositionToCardLocal(FVector WorldPosition) const
    {
        FVector Offset = WorldPosition - Origin;
        return FVector(Offset | LocalToWorldRotationX, Offset | LocalToWorldRotationY, Offset | LocalToWorldRotationZ);
    }

    inline FVector TransformCardLocalPositionToWorld(FVector CardPosition) const
    {
        return Origin + CardPosition.X * LocalToWorldRotationX + CardPosition.Y * LocalToWorldRotationY + CardPosition.Z * LocalToWorldRotationZ;
    }
};
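The two transform helpers at the end of FLumenCard are a change of basis: project the offset from Origin onto the three card axes to go from world to card space, and recombine the axes to go back. In UE, `|` on FVector is the dot product. The standalone sketch below reproduces the same math with a hypothetical Vec3 and CardFrame (not engine types), assuming the three axes are orthonormal.

```cpp
struct Vec3 {
    float X, Y, Z;
};

inline Vec3 operator+(const Vec3& A, const Vec3& B) { return Vec3{A.X + B.X, A.Y + B.Y, A.Z + B.Z}; }
inline Vec3 operator-(const Vec3& A, const Vec3& B) { return Vec3{A.X - B.X, A.Y - B.Y, A.Z - B.Z}; }
inline Vec3 operator*(const Vec3& A, float S) { return Vec3{A.X * S, A.Y * S, A.Z * S}; }
inline float Dot(const Vec3& A, const Vec3& B) { return A.X * B.X + A.Y * B.Y + A.Z * B.Z; }

// A card's frame: an origin plus three orthonormal axes given in world space.
struct CardFrame {
    Vec3 Origin;
    Vec3 AxisX, AxisY, AxisZ;

    // World -> card local: project the offset from the origin onto each axis.
    Vec3 WorldToLocal(const Vec3& WorldPosition) const {
        const Vec3 Offset = WorldPosition - Origin;
        return Vec3{Dot(Offset, AxisX), Dot(Offset, AxisY), Dot(Offset, AxisZ)};
    }

    // Card local -> world: recombine the axes, then translate by the origin.
    Vec3 LocalToWorld(const Vec3& CardPosition) const {
        return Origin + AxisX * CardPosition.X + AxisY * CardPosition.Y + AxisZ * CardPosition.Z;
    }
};
```

Because the axes are orthonormal, the projection is the exact inverse of the recombination, so no matrix inversion is needed and a round trip returns the original point.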

6.5.2.2 FLumenMeshCards

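FLumenMeshCards is the basic element for computing the Surface Cache and the basic unit that makes up the Lumen Scene. It stores FLumenCard information for up to 6 faces (orientations), and each orientation can hold 0 to N FLumenCards (as given by NumCardsPerOrientation).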

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenMeshCards.h

class FLumenMeshCards
{
public:
    // Initialization.
    void Initialize(
        const FMatrix& InLocalToWorld, 
        const FBox& InBounds,
        uint32 InFirstCardIndex,
        uint32 InNumCards,
        uint32 InNumCardsPerOrientation[6],
        uint32 InCardOffsetPerOrientation[6])
    {
        Bounds = InBounds;
        SetTransform(InLocalToWorld);
        FirstCardIndex = InFirstCardIndex;
        NumCards = InNumCards;

        for (uint32 OrientationIndex = 0; OrientationIndex < 6; ++OrientationIndex)
        {
            NumCardsPerOrientation[OrientationIndex] = InNumCardsPerOrientation[OrientationIndex];
            CardOffsetPerOrientation[OrientationIndex] = InCardOffsetPerOrientation[OrientationIndex];
        }
    }

    // Set the transform matrix.
    void SetTransform(const FMatrix& InLocalToWorld)
    {
        LocalToWorld = InLocalToWorld;
    }

    // Local-to-world matrix.
    FMatrix LocalToWorld;
    // Local bounding box.
    FBox Bounds;

    // Index of the first FLumenCard.
    uint32 FirstCardIndex = 0;
    // Number of FLumenCards.
    uint32 NumCards = 0;
    // Number of FLumenCards for each of the 6 orientations.
    uint32 NumCardsPerOrientation[6];
    // FLumenCard offset for each of the 6 orientations.
    uint32 CardOffsetPerOrientation[6];
};
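One plausible reading of these fields is that a mesh's cards occupy one contiguous range starting at FirstCardIndex, grouped by orientation, with CardOffsetPerOrientation holding the running (prefix) sum of NumCardsPerOrientation. The sketch below illustrates that layout. The prefix-sum relationship is an assumption made for illustration rather than something the listing confirms, and MeshCardsRange, MakeRange and CardsForOrientation are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical mirror of the count/offset fields of FLumenMeshCards.
struct MeshCardsRange {
    std::uint32_t FirstCardIndex = 0;
    std::uint32_t NumCards = 0;
    std::uint32_t NumCardsPerOrientation[6] = {};
    std::uint32_t CardOffsetPerOrientation[6] = {};
};

// Assumption: offsets are the exclusive prefix sum of the per-orientation counts.
MeshCardsRange MakeRange(std::uint32_t FirstCardIndex, const std::uint32_t (&Counts)[6]) {
    MeshCardsRange Range;
    Range.FirstCardIndex = FirstCardIndex;
    std::uint32_t Offset = 0;
    for (int Orientation = 0; Orientation < 6; ++Orientation) {
        Range.NumCardsPerOrientation[Orientation] = Counts[Orientation];
        Range.CardOffsetPerOrientation[Orientation] = Offset;
        Offset += Counts[Orientation];
    }
    Range.NumCards = Offset;
    return Range;
}

// Global card indices of all cards facing the given orientation.
std::vector<std::uint32_t> CardsForOrientation(const MeshCardsRange& Range, int Orientation) {
    std::vector<std::uint32_t> Indices;
    const std::uint32_t Base = Range.FirstCardIndex + Range.CardOffsetPerOrientation[Orientation];
    for (std::uint32_t i = 0; i < Range.NumCardsPerOrientation[Orientation]; ++i) {
        Indices.push_back(Base + i);
    }
    return Indices;
}
```

Under this layout, finding all cards that face a given direction is a single offset plus a count, which suits lookups that only need the cards facing a particular direction.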

6.5.2.3 FLumenSceneData

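FLumenSceneData is the scene representation that Lumen uses for global illumination. It is not built from Nanite's high-detail meshes. Instead it is a coarse scene whose basic elements are FLumenCard and FLumenMeshCards. Its definition and related types are as follows: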

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneData.h

// Lumen primitive instance.
class FLumenPrimitiveInstance
{
public:
    FBox WorldSpaceBoundingBox;
    // FLumenMeshCards index.
    int32 MeshCardsIndex;
    bool bValidMeshCards;
};

// Lumen primitive.
class FLumenPrimitive
{
public:
    // World-space bounding box.
    FBox WorldSpaceBoundingBox;
    // Largest card extent among the FLumenMeshCards of this primitive, used for early culling.
    float MaxCardExtent;

    // List of primitive instances.
    TArray<FLumenPrimitiveInstance, TInlineAllocator<1>> Instances;

    // The corresponding primitive in the real scene.
    FPrimitiveSceneInfo* Primitive = nullptr;

    // Whether the instances are merged.
    bool bMergedInstances = false;
    // Card resolution scale.
    float CardResolutionScale = 1.0f;
    // Number of FLumenMeshCards.
    int32 NumMeshCards = 0;

    // Mapping into LumenDFInstanceToDFObjectIndex.
    uint32 LumenDFInstanceOffset = UINT32_MAX;
    int32 LumenNumDFInstances = 0;

    // Get the FLumenMeshCards index (merged instances all share entry 0).
    int32 GetMeshCardsIndex(int32 InstanceIndex) const
    {
        if (bMergedInstances)
        {
            return Instances[0].MeshCardsIndex;
        }

        if (InstanceIndex < Instances.Num())
        {
            return Instances[InstanceIndex].MeshCardsIndex;
        }

        return -1;
    }
};

// Lumen scene data.
class FLumenSceneData
{
public:
    int32 Generation;

    // Buffers used to upload data to the GPU.
    FScatterUploadBuffer CardUploadBuffer;
    FScatterUploadBuffer UploadMeshCardsBuffer;
    FScatterUploadBuffer ByteBufferUploadBuffer;
    FScatterUploadBuffer UploadPrimitiveBuffer;

    FUniqueIndexList CardIndicesToUpdateInBuffer;
    FRWBufferStructured CardBuffer;

    TArray<FBox> PrimitiveModifiedBounds;

    // All Lumen primitives in the Lumen scene.
    TArray<FLumenPrimitive> LumenPrimitives;

    // FLumenMeshCards data.
    FUniqueIndexList MeshCardsIndicesToUpdateInBuffer;
    TSparseSpanArray<FLumenMeshCards> MeshCards;
    TSparseSpanArray<FLumenCard> Cards;
    TArray<int32, TInlineAllocator<8>> DistantCardIndices;
    FRWBufferStructured MeshCardsBuffer;
    FRWByteAddressBuffer DFObjectToMeshCardsIndexBuffer;

    // Mapping from primitive to LumenDFInstance.
    FUniqueIndexList PrimitivesToUpdate;
    FRWByteAddressBuffer PrimitiveToDFLumenInstanceOffsetBuffer;
    uint32 PrimitiveToLumenDFInstanceOffsetBufferSize = 0;

    // Mapping from LumenDFInstance to DFObjectIndex.
    FUniqueIndexList DFObjectIndicesToUpdateInBuffer;
    FUniqueIndexList LumenDFInstancesToUpdate;
    TSparseSpanArray<int32> LumenDFInstanceToDFObjectIndex;
    FRWByteAddressBuffer LumenDFInstanceToDFObjectIndexBuffer;
    uint32 LumenDFInstanceToDFObjectIndexBufferSize = 0;

    // Indices of the visible cards.
    TArray<int32> VisibleCardsIndices;
    TRefCountPtr<FRDGPooledBuffer> VisibleCardsIndexBuffer;

    // --- Data captured from the triangle scene ---
    TRefCountPtr<IPooledRenderTarget> AlbedoAtlas;
    TRefCountPtr<IPooledRenderTarget> NormalAtlas;
    TRefCountPtr<IPooledRenderTarget> EmissiveAtlas;

    // --- Generated data ---
    TRefCountPtr<IPooledRenderTarget> DepthAtlas;
    TRefCountPtr<IPooledRenderTarget> FinalLightingAtlas;
    TRefCountPtr<IPooledRenderTarget> IrradianceAtlas;
    TRefCountPtr<IPooledRenderTarget> IndirectIrradianceAtlas;
    TRefCountPtr<IPooledRenderTarget> RadiosityAtlas;
    TRefCountPtr<IPooledRenderTarget> OpacityAtlas;

    // Other data.
    bool bFinalLightingAtlasContentsValid;
    FIntPoint MaxAtlasSize;
    FBinnedTextureLayout AtlasAllocator;
    int32 NumCardTexels = 0;
    int32 NumMeshCardsToAddToSurfaceCache = 0;

    // Pending primitive add, update, and remove operations.
    bool bTrackAllPrimitives;
    TSet<FPrimitiveSceneInfo*> PendingAddOperations;
    TSet<FPrimitiveSceneInfo*> PendingUpdateOperations;
    TArray<FLumenPrimitiveRemoveInfo> PendingRemoveOperations;

    FLumenSceneData(EShaderPlatform ShaderPlatform, EWorldType::Type WorldType);
    ~FLumenSceneData();

    // Primitive add, update, and remove operations.
    void AddPrimitiveToUpdate(int32 PrimitiveIndex);
    void AddPrimitive(FPrimitiveSceneInfo* InPrimitive);
    void UpdatePrimitive(FPrimitiveSceneInfo* InPrimitive);
    void RemovePrimitive(FPrimitiveSceneInfo* InPrimitive, int32 PrimitiveIndex);

    // Add, update, and remove cards and FLumenMeshCards.
    void AddCardToVisibleCardList(int32 CardIndex);
    void RemoveCardFromVisibleCardList(int32 CardIndex);
    void AddMeshCards(int32 LumenPrimitiveIndex, int32 LumenInstanceIndex);
    void UpdateMeshCards(const FMatrix& LocalToWorld, int32 MeshCardsIndex, const FMeshCardsBuildData& MeshCardsBuildData);
    void RemoveMeshCards(FLumenPrimitive& LumenPrimitive, FLumenPrimitiveInstance& LumenPrimitiveInstance);

    bool HasPendingOperations() const
    {
        return PendingAddOperations.Num() > 0 || PendingUpdateOperations.Num() > 0 || PendingRemoveOperations.Num() > 0;
    }

    void UpdatePrimitiveToDistanceFieldInstanceMapping(FScene& Scene, FRHICommandListImmediate& RHICmdList);

private:
    // Add FLumenMeshCards from build data.
    int32 AddMeshCardsFromBuildData(const FMatrix& LocalToWorld, const FMeshCardsBuildData& MeshCardsBuildData, float ResolutionScale);
};

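As the listing shows, FLumenSceneData stores the FLumenMeshCards together with FLumenPrimitive and FLumenPrimitiveInstance, which are built on top of them. Each FLumenPrimitive refers to a number of FLumenMeshCards through its instances and keeps an FPrimitiveSceneInfo pointer that identifies which FPrimitiveSceneInfo in the real scene it is a coarse stand-in for.

6.5.3 Lumen Data Building

Before actual rendering begins, Lumen builds a good deal of data, including the Mesh Distance Fields, the Global Distance Field, and the mesh cards.

The first time a Lumen project is launched, a lot of data is built, including the mesh distance fields.

6.5.3.1 CardRepresentation

To build the mesh card representation, UE5 split out a dedicated MeshCardRepresentation module. Its core concepts and types are as follows: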

// Engine\Source\Runtime\Engine\Public\MeshCardRepresentation.h

// FLumenCard build data.
class FLumenCardBuildData
{
public:
    // Center and extent.
    FVector Center;
    FVector Extent;

    // Orientation order: -X, +X, -Y, +Y, -Z, +Z
    int32 Orientation;
    int32 LODLevel;

    // Reorder the extent components according to the orientation.
    static FVector TransformFaceExtent(FVector Extent, int32 Orientation)
    {
        if (Orientation / 2 == 2) // Orientation: -Z, +Z
        {
            return FVector(Extent.Y, Extent.X, Extent.Z);
        }
        else if (Orientation / 2 == 1) // Orientation: -Y, +Y
        {
            return FVector(Extent.Z, Extent.X, Extent.Y);
        }
        else // (Orientation / 2 == 0), Orientation: -X, +X
        {
            return FVector(Extent.Y, Extent.Z, Extent.X);
        }
    }
};

// FLumenMeshCards build data.
class FMeshCardsBuildData
{
public:
    FBox Bounds;
    int32 MaxLODLevel;
    // List of FLumenCard build data.
    TArray<FLumenCardBuildData> CardBuildData;

    (......)
};

// Unique id for each card representation data instance.
class FCardRepresentationDataId
{
public:
    uint32 Value = 0;

    bool IsValid() const
    {
        return Value != 0;
    }

    bool operator==(FCardRepresentationDataId B) const
    {
        return Value == B.Value;
    }

    friend uint32 GetTypeHash(FCardRepresentationDataId DataId)
    {
        return GetTypeHash(DataId.Value);
    }
};

// Payload and output data of the mesh card representation build.
class FCardRepresentationData : public FDeferredCleanupInterface
{
public:
    // Mesh card build data and id.
    FMeshCardsBuildData MeshCardsBuildData;
    FCardRepresentationDataId CardRepresentationDataId;

    (......)

#if WITH_EDITORONLY_DATA
    // Cache the card representation data.
    void CacheDerivedData(const FString& InDDCKey, const ITargetPlatform* TargetPlatform, UStaticMesh* Mesh, UStaticMesh* GenerateSource, bool bGenerateDistanceFieldAsIfTwoSided, FSourceMeshDataForDerivedDataTask* OptionalSourceMeshData);
#endif
};

// Build task worker.
class FAsyncCardRepresentationTaskWorker : public FNonAbandonableTask
{
public:
    (.....)
    
    void DoWork();

private:
    FAsyncCardRepresentationTask& Task;
};

// Data carrier for a build task.
class FAsyncCardRepresentationTask
{
public:
    bool bSuccess = false;

#if WITH_EDITOR
    TArray<FSignedDistanceFieldBuildMaterialData> MaterialBlendModes;
#endif

    FSourceMeshDataForDerivedDataTask SourceMeshData;
    bool bGenerateDistanceFieldAsIfTwoSided = false;
    UStaticMesh* StaticMesh = nullptr;
    UStaticMesh* GenerateSource = nullptr;
    FString DDCKey;
    FCardRepresentationData* GeneratedCardRepresentation;
    TUniquePtr<FAsyncTask<FAsyncCardRepresentationTaskWorker>> AsyncTask = nullptr;
};

// Manages the asynchronous building of mesh card representations.
class FCardRepresentationAsyncQueue : public FGCObject
{
public:
    // Add a new build task.
    ENGINE_API void AddTask(FAsyncCardRepresentationTask* Task);
    
    // Process the async tasks.
    ENGINE_API void ProcessAsyncTasks(bool bLimitExecutionTime = false);
    
    // Cancel builds.
    ENGINE_API void CancelBuild(UStaticMesh* StaticMesh);
    ENGINE_API void CancelAllOutstandingBuilds();

    // Block until builds complete.
    ENGINE_API void BlockUntilBuildComplete(UStaticMesh* StaticMesh, bool bWarnIfBlocked);
    ENGINE_API void BlockUntilAllBuildsComplete();

    (......)
};

// Global build queue.
extern ENGINE_API FCardRepresentationAsyncQueue* GCardRepresentationAsyncQueue;

extern ENGINE_API FString BuildCardRepresentationDerivedDataKey(const FString& InMeshKey);

extern ENGINE_API void BeginCacheMeshCardRepresentation(const ITargetPlatform* TargetPlatform, UStaticMesh* StaticMeshAsset, class FStaticMeshRenderData& RenderData, const FString& DistanceFieldKey, FSourceMeshDataForDerivedDataTask* OptionalSourceMeshData);
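The orientation encoding used by FLumenCardBuildData in the listing above packs an axis and a sign into one integer: `Orientation / 2` selects the axis and the parity selects the side. The sketch below spells that out with a hypothetical Vec3. The FaceNormal helper is an interpretation of the "-X, +X, -Y, +Y, -Z, +Z" comment, not an engine function, and TransformFaceExtent mirrors the component shuffle from the listing.

```cpp
struct Vec3 {
    float X, Y, Z;
};

// 0 = X axis, 1 = Y axis, 2 = Z axis.
inline int OrientationAxis(int Orientation) { return Orientation / 2; }

// Even orientations face the negative side of the axis, odd ones the positive side.
inline bool IsPositiveFacing(int Orientation) { return (Orientation % 2) == 1; }

// Assumed facing direction for an orientation, following the order
// -X, +X, -Y, +Y, -Z, +Z.
inline Vec3 FaceNormal(int Orientation) {
    const float Sign = IsPositiveFacing(Orientation) ? 1.0f : -1.0f;
    switch (OrientationAxis(Orientation)) {
        case 0: return Vec3{Sign, 0.0f, 0.0f};
        case 1: return Vec3{0.0f, Sign, 0.0f};
        default: return Vec3{0.0f, 0.0f, Sign};
    }
}

// Same shuffle as TransformFaceExtent in the listing: the last component of
// the result is always the extent along the facing axis.
inline Vec3 TransformFaceExtent(const Vec3& Extent, int Orientation) {
    switch (OrientationAxis(Orientation)) {
        case 2: return Vec3{Extent.Y, Extent.X, Extent.Z};
        case 1: return Vec3{Extent.Z, Extent.X, Extent.Y};
        default: return Vec3{Extent.Y, Extent.Z, Extent.X};
    }
}
```

After the shuffle, the first two components describe the card's footprint in its own plane and the third its depth along the facing direction, so code that works with a card never needs to branch on the axis again.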

6.5.3.2 GCardRepresentationAsyncQueue

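To build the data Lumen needs, UE5 declares two global queue variables: GCardRepresentationAsyncQueue and GDistanceFieldAsyncQueue. The former builds Lumen Card data and the latter builds distance field data. They are created and updated as follows: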

// Engine\Source\Runtime\Launch\Private\LaunchEngineLoop.cpp

int32 FEngineLoop::PreInitPreStartupScreen(const TCHAR* CmdLine)
{
    (......)
    
    if (!FPlatformProperties::RequiresCookedData())
    {
        (......)
        
        // Create the global async queues.
        GDistanceFieldAsyncQueue = new FDistanceFieldAsyncQueue();
        GCardRepresentationAsyncQueue = new FCardRepresentationAsyncQueue();

        (......)
    }
    
    (......)
}

void FEngineLoop::Tick()
{
    (......)
    
    // Update the global async queues every frame.
    if (GDistanceFieldAsyncQueue)
    {
        QUICK_SCOPE_CYCLE_COUNTER(STAT_FEngineLoop_Tick_GDistanceFieldAsyncQueue);
        GDistanceFieldAsyncQueue->ProcessAsyncTasks();
    }
    if (GCardRepresentationAsyncQueue)
    {
        QUICK_SCOPE_CYCLE_COUNTER(STAT_FEngineLoop_Tick_GCardRepresentationAsyncQueue);
        GCardRepresentationAsyncQueue->ProcessAsyncTasks();
    }
    
    (......)
}

Since GDistanceFieldAsyncQueue already existed in UE4, this section skips it and focuses on GCardRepresentationAsyncQueue.

As for when a card representation is added to the global build queue GCardRepresentationAsyncQueue, the answer is in MeshCardRepresentation.cpp:

FCardRepresentationAsyncQueue* GCardRepresentationAsyncQueue = NULL;

// Begin caching the mesh card representation.
void BeginCacheMeshCardRepresentation(const ITargetPlatform* TargetPlatform, UStaticMesh* StaticMeshAsset, FStaticMeshRenderData& RenderData, const FString& DistanceFieldKey, FSourceMeshDataForDerivedDataTask* OptionalSourceMeshData)
{
    static const auto CVarCards = IConsoleManager::Get().FindTConsoleVariableDataInt(TEXT("r.MeshCardRepresentation"));

    if (CVarCards->GetValueOnAnyThread() != 0)
    {
        FString Key = BuildCardRepresentationDerivedDataKey(DistanceFieldKey);
        if (RenderData.LODResources.IsValidIndex(0))
        {
            // Create the FCardRepresentationData instance.
            if (!RenderData.LODResources[0].CardRepresentationData)
            {
                RenderData.LODResources[0].CardRepresentationData = new FCardRepresentationData();
            }

            const FMeshBuildSettings& BuildSettings = StaticMeshAsset->GetSourceModel(0).BuildSettings;
            UStaticMesh* MeshToGenerateFrom = StaticMeshAsset;

            // Cache the FCardRepresentationData.
            RenderData.LODResources[0].CardRepresentationData->CacheDerivedData(Key, TargetPlatform, StaticMeshAsset, MeshToGenerateFrom, BuildSettings.bGenerateDistanceFieldAsIfTwoSided, OptionalSourceMeshData);
        }
    }
}

// Cache the FCardRepresentationData.
void FCardRepresentationData::CacheDerivedData(const FString& InDDCKey, const ITargetPlatform* TargetPlatform, UStaticMesh* Mesh, UStaticMesh* GenerateSource, bool bGenerateDistanceFieldAsIfTwoSided, FSourceMeshDataForDerivedDataTask* OptionalSourceMeshData)
{
    TArray<uint8> DerivedData;

    (......)
    {
        COOK_STAT(Timer.TrackCyclesOnly());
        
        // Create a new build task, FAsyncCardRepresentationTask.
        FAsyncCardRepresentationTask* NewTask = new FAsyncCardRepresentationTask;
        NewTask->DDCKey = InDDCKey;
        check(Mesh && GenerateSource);
        NewTask->StaticMesh = Mesh;
        NewTask->GenerateSource = GenerateSource;
        NewTask->GeneratedCardRepresentation = new FCardRepresentationData();
        NewTask->bGenerateDistanceFieldAsIfTwoSided = bGenerateDistanceFieldAsIfTwoSided;

        // Collect the material blend modes.
        for (int32 MaterialIndex = 0; MaterialIndex < Mesh->GetStaticMaterials().Num(); MaterialIndex++)
        {
            FSignedDistanceFieldBuildMaterialData MaterialData;
            // Default material blend mode
            MaterialData.BlendMode = BLEND_Opaque;
            MaterialData.bTwoSided = false;

            if (Mesh->GetStaticMaterials()[MaterialIndex].MaterialInterface)
            {
                MaterialData.BlendMode = Mesh->GetStaticMaterials()[MaterialIndex].MaterialInterface->GetBlendMode();
                MaterialData.bTwoSided = Mesh->GetStaticMaterials()[MaterialIndex].MaterialInterface->IsTwoSided();
            }

            NewTask->MaterialBlendModes.Add(MaterialData);
        }

        // Nanite overrides the source static mesh with a coarse representation, so the original data must be loaded before building the mesh SDF.
        if (OptionalSourceMeshData)
        {
            NewTask->SourceMeshData = *OptionalSourceMeshData;
        }
        // For Nanite-enabled meshes, build the vertex positions and triangle indices to use as source data.
        else if (Mesh->NaniteSettings.bEnabled)
        {
            IMeshBuilderModule& MeshBuilderModule = IMeshBuilderModule::GetForPlatform(TargetPlatform);
            if (!MeshBuilderModule.BuildMeshVertexPositions(Mesh, NewTask->SourceMeshData.TriangleIndices, NewTask->SourceMeshData.VertexPositions))
            {
                UE_LOG(LogStaticMesh, Error, TEXT("Failed to build static mesh. See previous line(s) for details."));
            }
        }

        // Add the task to the global queue GCardRepresentationAsyncQueue.
        GCardRepresentationAsyncQueue->AddTask(NewTask);
    }
}

6.5.3.3 GenerateCardRepresentationData

Following the call stack of FCardRepresentationAsyncQueue, it is easy to see that it eventually reaches FMeshUtilities::GenerateCardRepresentationData, which performs the actual mesh card building:

// Engine\Source\Developer\MeshUtilities\Private\MeshCardRepresentationUtilities.cpp

bool FMeshUtilities::GenerateCardRepresentationData(
    FString MeshName,
    const FSourceMeshDataForDerivedDataTask& SourceMeshData,
    const FStaticMeshLODResources& LODModel,
    class FQueuedThreadPool& ThreadPool,
    const TArray<FSignedDistanceFieldBuildMaterialData>& MaterialBlendModes,
    const FBoxSphereBounds& Bounds,
    const FDistanceFieldVolumeData* DistanceFieldVolumeData,
    bool bGenerateAsIfTwoSided,
    FCardRepresentationData& OutData)
{
    // Build the Embree scene.
    FEmbreeScene EmbreeScene;
    MeshRepresentation::SetupEmbreeScene(MeshName,
        SourceMeshData,
        LODModel,
        MaterialBlendModes,
        bGenerateAsIfTwoSided,
        EmbreeScene);

    if (!EmbreeScene.EmbreeScene)
    {
        return false;
    }

    // Set up the build context.
    FGenerateCardMeshContext Context(MeshName, EmbreeScene.EmbreeScene, EmbreeScene.EmbreeDevice, OutData);
    // Build the mesh cards.
    BuildMeshCards(DistanceFieldVolumeData ? DistanceFieldVolumeData->LocalSpaceMeshBounds : Bounds.GetBox(), Context, OutData);

    MeshRepresentation::DeleteEmbreeScene(EmbreeScene);
    
    (......)

    return true;
}

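As the code shows, mesh card building uses the third-party Embree library.

About Embree

Embree is an open-source library developed and maintained by Intel. It is a collection of high-performance ray tracing kernels that helps developers improve the performance of photorealistic rendering applications. Its features include advanced hair geometry, motion blur, dynamic scenes, and multi-level instancing:

Embree's implementation has the following characteristics:

  • The kernels are optimized for the latest Intel processors with support for SSE, AVX, AVX2, and AVX-512 instructions.
  • Supports runtime code selection to choose the traversal and build algorithms that best match the CPU's instruction set.
  • Supports applications written with the Intel SPMD Program Compiler (ISPC) by also providing an ISPC interface to the core ray tracing algorithms.
  • Contains algorithms optimized for incoherent workloads (such as Monte Carlo ray tracing algorithms) and for coherent workloads (such as primary visibility and hard shadow rays).

In short, Embree is a highly optimized CPU-based ray tracing library, and as used here it does not rely on GPU hardware acceleration. For that reason, the time Lumen takes to build mesh cards depends mainly on CPU performance.

The core build logic is in BuildMeshCards: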

void BuildMeshCards(const FBox& MeshBounds, const FGenerateCardMeshContext& Context, FCardRepresentationData& OutData)
{
    static const auto CVarMeshCardRepresentationMinSurface = IConsoleManager::Get().FindTConsoleVariableDataFloat(TEXT("r.MeshCardRepresentation.MinSurface"));
    const float MinSurfaceThreshold = CVarMeshCardRepresentationMinSurface->GetValueOnAnyThread();

    // Make sure the generated card bounds are not empty.
    const FVector MeshCardsBoundsCenter = MeshBounds.GetCenter();
    const FVector MeshCardsBoundsExtent = FVector::Max(MeshBounds.GetExtent() + 1.0f, FVector(5.0f));
    const FBox MeshCardsBounds(MeshCardsBoundsCenter - MeshCardsBoundsExtent, MeshCardsBoundsCenter + MeshCardsBoundsExtent);

    // Initialize part of the output data.
    OutData.MeshCardsBuildData.Bounds = MeshCardsBounds;
    OutData.MeshCardsBuildData.MaxLODLevel = 1;
    OutData.MeshCardsBuildData.CardBuildData.Reset();

    // Set up the sampling and voxel data.
    const float SamplesPerWorldUnit = 1.0f / 10.0f;
    const int32 MinSamplesPerAxis = 4;
    const int32 MaxSamplesPerAxis = 64;
    FIntVector VolumeSizeInVoxels;
    VolumeSizeInVoxels.X = FMath::Clamp<int32>(MeshCardsBounds.GetSize().X * SamplesPerWorldUnit, MinSamplesPerAxis, MaxSamplesPerAxis);
    VolumeSizeInVoxels.Y = FMath::Clamp<int32>(MeshCardsBounds.GetSize().Y * SamplesPerWorldUnit, MinSamplesPerAxis, MaxSamplesPerAxis);
    VolumeSizeInVoxels.Z = FMath::Clamp<int32>(MeshCardsBounds.GetSize().Z * SamplesPerWorldUnit, MinSamplesPerAxis, MaxSamplesPerAxis);

    // Size of a single voxel.
    const FVector VoxelExtent = MeshCardsBounds.GetSize() / FVector(VolumeSizeInVoxels);

    // Generate stratified random ray directions over the hemisphere (fixed seed).
    TArray<FVector4> RayDirectionsOverHemisphere;
    {
        FRandomStream RandomStream(0);
        MeshUtilities::GenerateStratifiedUniformHemisphereSamples(64, RandomStream, RayDirectionsOverHemisphere);
    }
    
    // Loop over the 6 orientations and generate card data for each.
    for (int32 Orientation = 0; Orientation < 6; ++Orientation)
    {
        // Initialize the heightfield and ray data.
        FIntPoint HeighfieldSize(0, 0);
        FVector RayDirection(0.0f, 0.0f, 0.0f);
        FVector RayOriginFrame = MeshCardsBounds.Min;
        FVector HeighfieldStepX(0.0f, 0.0f, 0.0f);
        FVector HeighfieldStepY(0.0f, 0.0f, 0.0f);
        float MaxRayT = 0.0f;
        int32 MeshSliceNum = 0;

        // Set up the heightfield and ray data for the orientation's axis.
        switch (Orientation / 2)
        {
            case 0: // Orientation: -X, +X
                MaxRayT = MeshCardsBounds.GetSize().X + 0.1f;
                MeshSliceNum = VolumeSizeInVoxels.X;
                HeighfieldSize.X = VolumeSizeInVoxels.Y;
                HeighfieldSize.Y = VolumeSizeInVoxels.Z;
                HeighfieldStepX = FVector(0.0f, MeshCardsBounds.GetSize().Y / HeighfieldSize.X, 0.0f);
                HeighfieldStepY = FVector(0.0f, 0.0f, MeshCardsBounds.GetSize().Z / HeighfieldSize.Y);
                break;

            case 1: // Orientation: -Y, +Y
                MaxRayT = MeshCardsBounds.GetSize().Y + 0.1f;
                MeshSliceNum = VolumeSizeInVoxels.Y;
                HeighfieldSize.X = VolumeSizeInVoxels.X;
                HeighfieldSize.Y = VolumeSizeInVoxels.Z;
                HeighfieldStepX = FVector(MeshCardsBounds.GetSize().X / HeighfieldSize.X, 0.0f, 0.0f);
                HeighfieldStepY = FVector(0.0f, 0.0f, MeshCardsBounds.GetSize().Z / HeighfieldSize.Y);
                break;

            case 2: // Orientation: -Z, +Z
                MaxRayT = MeshCardsBounds.GetSize().Z + 0.1f;
                MeshSliceNum = VolumeSizeInVoxels.Z;
                HeighfieldSize.X = VolumeSizeInVoxels.X;
                HeighfieldSize.Y = VolumeSizeInVoxels.Y;
                HeighfieldStepX = FVector(MeshCardsBounds.GetSize().X / HeighfieldSize.X, 0.0f, 0.0f);
                HeighfieldStepY = FVector(0.0f, MeshCardsBounds.GetSize().Y / HeighfieldSize.Y, 0.0f);
                break;
        }

        // Set the ray direction (and origin plane) according to the orientation.
        switch (Orientation)
        {
            case 0: 
                RayDirection.X = +1.0f; 
                break;

            case 1: 
                RayDirection.X = -1.0f; 
                RayOriginFrame.X = MeshCardsBounds.Max.X;
                break;

            case 2: 
                RayDirection.Y = +1.0f; 
                break;

            case 3: 
                RayDirection.Y = -1.0f; 
                RayOriginFrame.Y = MeshCardsBounds.Max.Y;
                break;

            case 4: 
                RayDirection.Z = +1.0f; 
                break;

            case 5: 
                RayDirection.Z = -1.0f; 
                RayOriginFrame.Z = MeshCardsBounds.Max.Z;
                break;

            default: 
                check(false);
        };

        TArray<TArray<FSurfacePoint, TInlineAllocator<16>>> HeightfieldLayers;
        HeightfieldLayers.SetNum(HeighfieldSize.X * HeighfieldSize.Y);

        // Fill in the surface point data.
        {
            TRACE_CPUPROFILER_EVENT_SCOPE(FillSurfacePoints);

            TArray<float> Heightfield;
            Heightfield.SetNum(HeighfieldSize.X * HeighfieldSize.Y);
            for (int32 HeighfieldY = 0; HeighfieldY < HeighfieldSize.Y; ++HeighfieldY)
            {
                for (int32 HeighfieldX = 0; HeighfieldX < HeighfieldSize.X; ++HeighfieldX)
                {
                    Heightfield[HeighfieldX + HeighfieldY * HeighfieldSize.X] = -1.0f;
                }
            }

            for (int32 HeighfieldY = 0; HeighfieldY < HeighfieldSize.Y; ++HeighfieldY)
            {
                for (int32 HeighfieldX = 0; HeighfieldX < HeighfieldSize.X; ++HeighfieldX)
                {
                    FVector RayOrigin = RayOriginFrame;
                    RayOrigin += (HeighfieldX + 0.5f) * HeighfieldStepX;
                    RayOrigin += (HeighfieldY + 0.5f) * HeighfieldStepY;

                    float StepTMin = 0.0f;

                    for (int32 StepIndex = 0; StepIndex < 64; ++StepIndex)
                    {
                        FEmbreeRay EmbreeRay;
                        EmbreeRay.ray.org_x = RayOrigin.X;
                        EmbreeRay.ray.org_y = RayOrigin.Y;
                        EmbreeRay.ray.org_z = RayOrigin.Z;
                        EmbreeRay.ray.dir_x = RayDirection.X;
                        EmbreeRay.ray.dir_y = RayDirection.Y;
                        EmbreeRay.ray.dir_z = RayDirection.Z;
                        EmbreeRay.ray.tnear = StepTMin;
                        EmbreeRay.ray.tfar = FLT_MAX;

                        FEmbreeIntersectionContext EmbreeContext;
                        rtcInitIntersectContext(&EmbreeContext);
                        rtcIntersect1(Context.FullMeshEmbreeScene, &EmbreeContext, &EmbreeRay);

                        if (EmbreeRay.hit.geomID != RTC_INVALID_GEOMETRY_ID && EmbreeRay.hit.primID != RTC_INVALID_GEOMETRY_ID)
                        {
                            const FVector SurfacePoint = RayOrigin + RayDirection * EmbreeRay.ray.tfar;
                            const FVector SurfaceNormal = EmbreeRay.GetHitNormal();

                            const float NdotD = FVector::DotProduct(RayDirection, SurfaceNormal);
                            const bool bPassCullTest = EmbreeContext.IsHitTwoSided() || NdotD <= 0.0f;
                            const bool bPassProjectionAngleTest = FMath::Abs(NdotD) >= FMath::Cos(75.0f * (PI / 180.0f));

                            const float MinDistanceBetweenPoints = (MaxRayT / MeshSliceNum);
                            const bool bPassDistanceToAnotherSurfaceTest = EmbreeRay.ray.tnear <= 0.0f || (EmbreeRay.ray.tfar - EmbreeRay.ray.tnear > MinDistanceBetweenPoints);

                            if (bPassCullTest && bPassProjectionAngleTest && bPassDistanceToAnotherSurfaceTest)
                            {
                                const bool bIsInsideMesh = IsSurfacePointInsideMesh(Context.FullMeshEmbreeScene, SurfacePoint, SurfaceNormal, RayDirectionsOverHemisphere);
                                if (!bIsInsideMesh)
                                {
                                    HeightfieldLayers[HeighfieldX + HeighfieldY * HeighfieldSize.X].Add(
                                        { EmbreeRay.ray.tnear, EmbreeRay.ray.tfar }
                                    );
                                }
                            }

                            StepTMin = EmbreeRay.ray.tfar + 0.01f;
                        }
                        else
                        {
                            break;
                        }
                    }
                }
            }
        }

        const int32 MinCardHits = FMath::Floor(HeighfieldSize.X * HeighfieldSize.Y * MinSurfaceThreshold);

        TArray<FPlacedCard, TInlineAllocator<16>> PlacedCards;
        int32 PlacedCardsHits = 0;

        // Place a default card.
        {
            FPlacedCard PlacedCard;
            PlacedCard.SliceMin = 0;
            PlacedCard.SliceMax = MeshSliceNum;
            PlacedCards.Add(PlacedCard);

            PlacedCardsHits = UpdatePlacedCards(PlacedCards, RayOriginFrame, RayDirection, HeighfieldStepX, HeighfieldStepY, HeighfieldSize, MeshSliceNum, MaxRayT, MinCardHits, VoxelExtent, HeightfieldLayers);

            if (PlacedCardsHits < MinCardHits)
            {
                PlacedCards.Reset();
            }
        }

        SerializePlacedCards(PlacedCards, /*LOD level*/ 0, Orientation, MinCardHits, MeshCardsBounds, OutData);

        // Try to place more cards by splitting the existing ones.
        for (uint32 CardPlacementIteration = 0; CardPlacementIteration < 4; ++CardPlacementIteration)
        {
            TArray<FPlacedCard, TInlineAllocator<16>> BestPlacedCards;
            int32 BestPlacedCardHits = PlacedCardsHits;

            for (int32 PlacedCardIndex = 0; PlacedCardIndex < PlacedCards.Num(); ++PlacedCardIndex)
            {
                const FPlacedCard& PlacedCard = PlacedCards[PlacedCardIndex];
                for (int32 SliceIndex = PlacedCard.SliceMin + 2; SliceIndex < PlacedCard.SliceMax; ++SliceIndex)
                {
                    TArray<FPlacedCard, TInlineAllocator<16>> TempPlacedCards(PlacedCards);

                    FPlacedCard NewPlacedCard;
                    NewPlacedCard.SliceMin = SliceIndex;
                    NewPlacedCard.SliceMax = PlacedCard.SliceMax;

                    TempPlacedCards[PlacedCardIndex].SliceMax = SliceIndex - 1;
                    TempPlacedCards.Insert(NewPlacedCard, PlacedCardIndex + 1);

                    const int32 NumHits = UpdatePlacedCards(TempPlacedCards, RayOriginFrame, RayDirection, HeighfieldStepX, HeighfieldStepY, HeighfieldSize, MeshSliceNum, MaxRayT, MinCardHits, VoxelExtent, HeightfieldLayers);

                    if (NumHits > BestPlacedCardHits)
                    {
                        BestPlacedCards = TempPlacedCards;
                        BestPlacedCardHits = NumHits;
                    }
                }
            }

            if (BestPlacedCardHits >= PlacedCardsHits + MinCardHits)
            {
                PlacedCards = BestPlacedCards;
                PlacedCardsHits = BestPlacedCardHits;
            }
        }

        SerializePlacedCards(PlacedCards, /*LOD level*/ 1, Orientation, MinCardHits, MeshCardsBounds, OutData);
    } // for (int32 Orientation = 0; Orientation < 6; ++Orientation)
}

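The code above shows that building the card data is accelerated with height field ray tracing (Height Field Ray Tracing), a technique that has existed for many years. Its core idea is to discretize the mesh into equally sized 3D voxels and then, according to the resolution, cast one ray from the camera position through each pixel and test it for intersection against the 3D voxels, which renders the silhouette of the height field. That silhouette is the boundary dividing the screen into the region covered by the height field and the region above it: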

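A silhouette obtained this way is visibly aliased. The paper Ray Tracing Height Fields offers several ways to reconstruct the surface data and reduce the aliasing: height field planes, linear approximation planes, triangles, and bilinear surfaces.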
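The listing above collects every surface layer along an axis-aligned ray by re-casting the ray from just past the previous hit (`StepTMin = tfar + 0.01f`) and keeping only hits that pass the cull test. The following is a minimal sketch of that stepping loop, assuming a hypothetical 1D scene of solid slabs in place of the Embree scene. `FToySlab`, `FToyHit`, `IntersectNearest`, and `CollectFrontFacingLayers` are illustrative names, not engine API, and the projection-angle, distance, and inside-mesh tests are left out.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical stand-in for the Embree scene: solid slabs along the ray axis.
struct FToySlab { float Enter; float Exit; };

struct FToyHit
{
    bool bValid = false;
    float T = 0.0f;
    bool bFrontFacing = false; // true when the surface normal opposes the ray direction
};

// Returns the nearest slab face at or beyond TNear.
// Entry faces count as front-facing, exit faces as back-facing.
FToyHit IntersectNearest(const std::vector<FToySlab>& Slabs, float TNear)
{
    FToyHit Best;
    Best.T = std::numeric_limits<float>::max();
    for (const FToySlab& Slab : Slabs)
    {
        if (Slab.Enter >= TNear && Slab.Enter < Best.T)
        {
            Best.bValid = true;
            Best.T = Slab.Enter;
            Best.bFrontFacing = true;
        }
        if (Slab.Exit >= TNear && Slab.Exit < Best.T)
        {
            Best.bValid = true;
            Best.T = Slab.Exit;
            Best.bFrontFacing = false;
        }
    }
    return Best;
}

// Collects the distance of every front-facing layer along the ray by
// restarting the ray slightly beyond each hit, as the listing does.
std::vector<float> CollectFrontFacingLayers(const std::vector<FToySlab>& Slabs, int MaxSteps = 64)
{
    std::vector<float> Layers;
    float StepTMin = 0.0f;
    for (int StepIndex = 0; StepIndex < MaxSteps; ++StepIndex)
    {
        const FToyHit Hit = IntersectNearest(Slabs, StepTMin);
        if (!Hit.bValid)
        {
            break; // nothing left along the ray
        }
        if (Hit.bFrontFacing)
        {
            Layers.push_back(Hit.T); // simplified cull test: keep front faces only
        }
        StepTMin = Hit.T + 0.01f; // step past the surface that was just hit
    }
    return Layers;
}
```

With two slabs spanning [1, 2] and [5, 6], the loop visits four faces and keeps the two entry faces. This is why one height field cell in the listing can hold several surface points rather than a single height.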

After the build steps above, mesh card data like the following is produced:

Top: the mesh rendered normally; bottom: visualization of the mesh card data.
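The card-splitting loop in the listing is a greedy search: try every split position of every placed card, keep the best candidate, and accept it only if it gains at least `MinCardHits` hits. A minimal sketch on a 1D toy follows. The scoring rule is an assumption, because the listing does not show `UpdatePlacedCards`: here each card captures only the first non-empty slice in its range. `FToyCard`, `FToyPlacement`, `ScoreCards`, and `PlaceCardsGreedy` are hypothetical names, and the toy uses half-open slice ranges rather than the engine's convention.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy card covering the half-open slice range [SliceMin, SliceMax).
struct FToyCard { int SliceMin; int SliceMax; };

struct FToyPlacement
{
    std::vector<FToyCard> Cards;
    int Hits = 0;
};

// Assumed scoring rule (not the engine's UpdatePlacedCards):
// each card captures only the first non-empty slice inside its range.
int ScoreCards(const std::vector<FToyCard>& Cards, const std::vector<int>& HitsPerSlice)
{
    int Total = 0;
    for (const FToyCard& Card : Cards)
    {
        for (int Slice = Card.SliceMin; Slice < Card.SliceMax; ++Slice)
        {
            if (HitsPerSlice[Slice] > 0)
            {
                Total += HitsPerSlice[Slice];
                break;
            }
        }
    }
    return Total;
}

// Greedy placement: start with one card over all slices, then repeatedly
// apply the single best split while it gains at least MinCardHits.
FToyPlacement PlaceCardsGreedy(const std::vector<int>& HitsPerSlice, int MinCardHits, int MaxIterations = 4)
{
    FToyPlacement Placed;
    Placed.Cards.push_back(FToyCard{0, static_cast<int>(HitsPerSlice.size())});
    Placed.Hits = ScoreCards(Placed.Cards, HitsPerSlice);

    for (int Iteration = 0; Iteration < MaxIterations; ++Iteration)
    {
        FToyPlacement Best = Placed;
        for (std::size_t CardIndex = 0; CardIndex < Placed.Cards.size(); ++CardIndex)
        {
            const FToyCard Card = Placed.Cards[CardIndex];
            for (int Split = Card.SliceMin + 1; Split < Card.SliceMax; ++Split)
            {
                // Split the card at this slice into [SliceMin, Split) and [Split, SliceMax).
                std::vector<FToyCard> Candidate = Placed.Cards;
                Candidate[CardIndex].SliceMax = Split;
                Candidate.insert(Candidate.begin() + CardIndex + 1, FToyCard{Split, Card.SliceMax});

                const int Hits = ScoreCards(Candidate, HitsPerSlice);
                if (Hits > Best.Hits)
                {
                    Best.Cards = Candidate;
                    Best.Hits = Hits;
                }
            }
        }
        // Accept the best split only when the gain justifies an extra card.
        if (Best.Hits >= Placed.Hits + MinCardHits)
        {
            Placed = Best;
        }
    }
    return Placed;
}
```

For the slice histogram `{10, 0, 0, 8, 0, 0}` and `MinCardHits = 4`, a single card sees only the first layer (10 hits), while one split exposes the second layer as well (18 hits). Raising `MinCardHits` to 10 rejects the split, because the gain of 8 no longer justifies another card.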

Mesh card data has LODs, and the LOD level is chosen according to the camera distance (see the video).

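In addition, the mesh distance field data built by UE5 has been improved: sparse storage raises its precision (left in the figure below), which is clearly better than UE4 (right).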

6.5.4 Lumen Rendering Flow

Lumen's main rendering flow still lives in FDeferredShadingSceneRenderer::Render:

void FDeferredShadingSceneRenderer::Render(FRDGBuilder& GraphBuilder)
{
    (......)
    
    bool bAnyLumenEnabled = false;
    if (!IsSimpleForwardShadingEnabled(ShaderPlatform))
    {
        (......)

        // Check whether any view has Lumen enabled.
        for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ViewIndex++)
        {
            FViewInfo& View = Views[ViewIndex];
            bAnyLumenEnabled = bAnyLumenEnabled 
                || GetViewPipelineState(View).DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen
                || GetViewPipelineState(View).ReflectionsMethod == EReflectionsMethod::Lumen;
        }

        (......)
    }
    
    (......)
    
    // PrePass.
    RenderPrePass(...);
    
    (......)
    
    // Update the Lumen scene.
    UpdateLumenScene(GraphBuilder);

    // If occlusion culling runs before the base pass, render Lumen scene lighting before RenderBasePass.
    // bOcclusionBeforeBasePass defaults to false.
    if (bOcclusionBeforeBasePass)
    {
        {
            LLM_SCOPE_BYTAG(Lumen);
            RenderLumenSceneLighting(GraphBuilder, Views[0]);
        }

        ComputeVolumetricFog(GraphBuilder);
    }
    
    (......)
    
    // BasePass.
    RenderBasePass(...);
    
    (......)
    
    // Lumen lighting after the base pass.
    if (!bOcclusionBeforeBasePass)
    {
        const bool bAfterBasePass = true;
        // Render shadows.
        AllocateVirtualShadowMaps(bAfterBasePass);
        RenderShadowDepthMaps(GraphBuilder, InstanceCullingManager);
        
        {
            LLM_SCOPE_BYTAG(Lumen);
            // Render Lumen scene lighting.
            RenderLumenSceneLighting(GraphBuilder, Views[0]);
        }

        AddServiceLocalQueuePass(GraphBuilder);
    }
    
    (......)
    
    // Render the Lumen visualization.
    RenderLumenSceneVisualization(GraphBuilder, SceneTextures);
    // Render diffuse indirect lighting and ambient occlusion.
    RenderDiffuseIndirectAndAmbientOcclusion(GraphBuilder, SceneTextures, LightingChannelsTexture, true);
    
    (......)
}

The red boxes below mark Lumen's execution steps in a RenderDoc frame capture:

Lumen's lighting has two main phases: updating the scene (UpdateLumenScene) and computing the scene lighting (RenderLumenSceneLighting).
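The ordering of these two phases relative to the base pass depends on `bOcclusionBeforeBasePass` in the excerpt above. The following is a minimal sketch of that ordering, modelling passes as plain strings. `BuildPassOrder` and `IndexOf` are hypothetical helpers that only mirror the control flow shown in the excerpt. They contain no engine code and omit everything the excerpt elides.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified, hypothetical model of the pass order in the Render() excerpt.
std::vector<std::string> BuildPassOrder(bool bOcclusionBeforeBasePass)
{
    std::vector<std::string> Passes;
    Passes.push_back("RenderPrePass");
    Passes.push_back("UpdateLumenScene");

    if (bOcclusionBeforeBasePass)
    {
        // Lumen scene lighting is rendered before the base pass.
        Passes.push_back("RenderLumenSceneLighting");
        Passes.push_back("ComputeVolumetricFog");
    }

    Passes.push_back("RenderBasePass");

    if (!bOcclusionBeforeBasePass)
    {
        // Default path: shadows first, then Lumen scene lighting.
        Passes.push_back("RenderShadowDepthMaps");
        Passes.push_back("RenderLumenSceneLighting");
    }

    Passes.push_back("RenderLumenSceneVisualization");
    Passes.push_back("RenderDiffuseIndirectAndAmbientOcclusion");
    return Passes;
}

// Returns the position of a pass in the list, or -1 when it is absent.
int IndexOf(const std::vector<std::string>& Passes, const std::string& Name)
{
    for (std::size_t Index = 0; Index < Passes.size(); ++Index)
    {
        if (Passes[Index] == Name)
        {
            return static_cast<int>(Index);
        }
    }
    return -1;
}
```

In both branches the scene update comes before the scene lighting. Only the position of the lighting relative to the base pass changes.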

6.5.5 Lumen Scene Update

6.5.5.1 UpdateLumenScene

The Lumen scene update is handled mainly by UpdateLumenScene:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneRendering.cpp

void FDeferredShadingSceneRenderer::UpdateLumenScene(FRDGBuilder& GraphBuilder)
{
    LLM_SCOPE_BYTAG(Lumen);

    FViewInfo& View = Views[0];
    const FPerViewPipelineState& ViewPipelineState = GetViewPipelineState(View);
    const bool bAnyLumenActive = ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen || ViewPipelineState.ReflectionsMethod == EReflectionsMethod::Lumen;

    if (bAnyLumenActive
        // Skip the scene update for non-main views (planar reflections, scene captures, reflection captures).
        && !View.bIsPlanarReflection 
        && !View.bIsSceneCapture
        && !View.bIsReflectionCapture
        && View.ViewState)
    {
        const double StartTime = FPlatformTime::Seconds();

        // Get the Lumen scene data and the cards to render.
        FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
        TArray<FCardRenderData, SceneRenderingAllocator>& CardsToRender = LumenCardRenderer.CardsToRender;

        RDG_EVENT_SCOPE(GraphBuilder, "UpdateLumenScene: %u card captures %.3fM texels", CardsToRender.Num(), LumenCardRenderer.NumCardTexelsToCapture / 1e6f);

        // Update the card scene buffer.
        UpdateCardSceneBuffer(GraphBuilder.RHICmdList, ViewFamily, Scene);

        // The Lumen primitive mapping buffer was updated, so the view uniform buffer must be recreated.
        Lumen::SetupViewUniformBufferParameters(Scene, *View.CachedViewUniformShaderParameters);
        View.ViewUniformBuffer = TUniformBufferRef<FViewUniformShaderParameters>::CreateUniformBufferImmediate(*View.CachedViewUniformShaderParameters, UniformBuffer_SingleFrame);
        
        LumenCardRenderer.CardIdsToRender.Empty(CardsToRender.Num());

        // Temporary depth buffer used for card capture.
        const FRDGTextureDesc DepthStencilAtlasDesc = FRDGTextureDesc::Create2D(LumenSceneData.MaxAtlasSize, PF_DepthStencil, FClearValueBinding::DepthZero, TexCreate_ShaderResource | TexCreate_DepthStencilTargetable | TexCreate_NoFastClear);
        FRDGTextureRef DepthStencilAtlasTexture = GraphBuilder.CreateTexture(DepthStencilAtlasDesc, TEXT("Lumen.DepthStencilAtlas"));

        if (CardsToRender.Num() > 0)
        {
            FRHIBuffer* PrimitiveIdVertexBuffer = nullptr;
            FInstanceCullingResult InstanceCullingResult;
            // Cull the cards; both GPU and non-GPU culling are supported.
#if GPUCULL_TODO
            if (Scene->GPUScene.IsEnabled())
            {
                int32 MaxInstances = 0;
                int32 VisibleMeshDrawCommandsNum = 0;
                int32 NewPassVisibleMeshDrawCommandsNum = 0;

                FInstanceCullingContext InstanceCullingContext(nullptr, TArrayView<const int32>(&View.GPUSceneViewId, 1));

                SetupGPUInstancedDraws(InstanceCullingContext, LumenCardRenderer.MeshDrawCommands, false, MaxInstances, VisibleMeshDrawCommandsNum, NewPassVisibleMeshDrawCommandsNum);
                // Not supposed to do any compaction here.
                ensure(VisibleMeshDrawCommandsNum == LumenCardRenderer.MeshDrawCommands.Num());

                InstanceCullingContext.BuildRenderingCommands(GraphBuilder, Scene->GPUScene, View.DynamicPrimitiveCollector.GetPrimitiveIdRange(), InstanceCullingResult);
            }
            else
#endif // GPUCULL_TODO
            {
                // Prepare primitive Id VB for rendering mesh draw commands.
                if (LumenCardRenderer.MeshDrawPrimitiveIds.Num() > 0)
                {
                    const uint32 PrimitiveIdBufferDataSize = LumenCardRenderer.MeshDrawPrimitiveIds.Num() * sizeof(int32);

                    FPrimitiveIdVertexBufferPoolEntry Entry = GPrimitiveIdVertexBufferPool.Allocate(PrimitiveIdBufferDataSize);
                    PrimitiveIdVertexBuffer = Entry.BufferRHI;

                    void* RESTRICT Data = RHILockBuffer(PrimitiveIdVertexBuffer, 0, PrimitiveIdBufferDataSize, RLM_WriteOnly);
                    FMemory::Memcpy(Data, LumenCardRenderer.MeshDrawPrimitiveIds.GetData(), PrimitiveIdBufferDataSize);
                    RHIUnlockBuffer(PrimitiveIdVertexBuffer);

                    GPrimitiveIdVertexBufferPool.ReturnToFreeList(Entry);
                }
            }
            FRDGTextureRef AlbedoAtlasTexture = GraphBuilder.RegisterExternalTexture(LumenSceneData.AlbedoAtlas);
            FRDGTextureRef NormalAtlasTexture = GraphBuilder.RegisterExternalTexture(LumenSceneData.NormalAtlas);
            FRDGTextureRef EmissiveAtlasTexture = GraphBuilder.RegisterExternalTexture(LumenSceneData.EmissiveAtlas);

            uint32 NumRects = 0;
            FRDGBufferRef RectMinMaxBuffer = nullptr;
            {
                // Upload card ids for the batched draws that operate on the cards to render.
                TArray<FUintVector4, SceneRenderingAllocator> RectMinMaxToRender;
                RectMinMaxToRender.Reserve(CardsToRender.Num());
                for (const FCardRenderData& CardRenderData : CardsToRender)
                {
                    FIntRect AtlasRect = CardRenderData.AtlasAllocation;

                    FUintVector4 Rect;
                    Rect.X = FMath::Max(AtlasRect.Min.X, 0);
                    Rect.Y = FMath::Max(AtlasRect.Min.Y, 0);
                    Rect.Z = FMath::Max(AtlasRect.Max.X, 0);
                    Rect.W = FMath::Max(AtlasRect.Max.Y, 0);
                    RectMinMaxToRender.Add(Rect);
                }

                NumRects = CardsToRender.Num();
                RectMinMaxBuffer = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateUploadDesc(sizeof(FUintVector4), FMath::RoundUpToPowerOfTwo(NumRects)), TEXT("Lumen.RectMinMaxBuffer"));

                FPixelShaderUtils::UploadRectMinMaxBuffer(GraphBuilder, RectMinMaxToRender, RectMinMaxBuffer);

                FRDGBufferSRVRef RectMinMaxBufferSRV = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(RectMinMaxBuffer, PF_R32G32B32A32_UINT));
                ClearLumenCards(GraphBuilder, View, AlbedoAtlasTexture, NormalAtlasTexture, EmissiveAtlasTexture, DepthStencilAtlasTexture, LumenSceneData.MaxAtlasSize, RectMinMaxBufferSRV, NumRects);
            }

            // Snapshot the view info.
            FViewInfo* SharedView = View.CreateSnapshot();
            {
                SharedView->DynamicPrimitiveCollector = FGPUScenePrimitiveCollector(&GetGPUSceneDynamicContext());
                SharedView->StereoPass = eSSP_FULL;
                SharedView->DrawDynamicFlags = EDrawDynamicFlags::ForceLowestLOD;

                // Don't do material texture mip biasing in proxy card rendering
                SharedView->MaterialTextureMipBias = 0;

                TRefCountPtr<IPooledRenderTarget> NullRef;
                FPlatformMemory::Memcpy(&SharedView->PrevViewInfo.HZB, &NullRef, sizeof(SharedView->PrevViewInfo.HZB));

                SharedView->CachedViewUniformShaderParameters = MakeUnique<FViewUniformShaderParameters>();
                SharedView->CachedViewUniformShaderParameters->PrimitiveSceneData = Scene->GPUScene.PrimitiveBuffer.SRV;
                SharedView->CachedViewUniformShaderParameters->InstanceSceneData = Scene->GPUScene.InstanceDataBuffer.SRV;
                SharedView->CachedViewUniformShaderParameters->LightmapSceneData = Scene->GPUScene.LightmapDataBuffer.SRV;
                SharedView->ViewUniformBuffer = TUniformBufferRef<FViewUniformShaderParameters>::CreateUniformBufferImmediate(*SharedView->CachedViewUniformShaderParameters, UniformBuffer_SingleFrame);
            }

            // Set up the scene texture uniform parameters.
            FLumenCardPassUniformParameters* PassUniformParameters = GraphBuilder.AllocParameters<FLumenCardPassUniformParameters>();
            SetupSceneTextureUniformParameters(GraphBuilder, Scene->GetFeatureLevel(), /*SceneTextureSetupMode*/ ESceneTextureSetupMode::None, PassUniformParameters->SceneTextures);

            // Capture the mesh cards.
            {
                FLumenCardPassParameters* PassParameters = GraphBuilder.AllocParameters<FLumenCardPassParameters>();
                PassParameters->View = Scene->UniformBuffers.LumenCardCaptureViewUniformBuffer;
                PassParameters->CardPass = GraphBuilder.CreateUniformBuffer(PassUniformParameters);
                PassParameters->RenderTargets[0] = FRenderTargetBinding(AlbedoAtlasTexture, ERenderTargetLoadAction::ELoad);
                PassParameters->RenderTargets[1] = FRenderTargetBinding(NormalAtlasTexture, ERenderTargetLoadAction::ELoad);
                PassParameters->RenderTargets[2] = FRenderTargetBinding(EmissiveAtlasTexture, ERenderTargetLoadAction::ELoad);
                PassParameters->RenderTargets.DepthStencil = FDepthStencilBinding(DepthStencilAtlasTexture, ERenderTargetLoadAction::ELoad, FExclusiveDepthStencil::DepthWrite_StencilNop);

                InstanceCullingResult.GetDrawParameters(PassParameters->InstanceCullingDrawParams);

                // Mesh card capture pass.
                GraphBuilder.AddPass(
                    RDG_EVENT_NAME("MeshCardCapture"),
                    PassParameters,
                    ERDGPassFlags::Raster,
                    [this, Scene = Scene, PrimitiveIdVertexBuffer, SharedView, &CardsToRender, PassParameters](FRHICommandList& RHICmdList)
                    {
                        QUICK_SCOPE_CYCLE_COUNTER(MeshPass);

                        // Prepare data and submit draw commands for every card to render.
                        for (FCardRenderData& CardRenderData : CardsToRender)
                        {
                            if (CardRenderData.NumMeshDrawCommands > 0)
                            {
                                FIntRect AtlasRect = CardRenderData.AtlasAllocation;
                                RHICmdList.SetViewport(AtlasRect.Min.X, AtlasRect.Min.Y, 0.0f, AtlasRect.Max.X, AtlasRect.Max.Y, 1.0f);

                                CardRenderData.PatchView(RHICmdList, Scene, SharedView);
                                Scene->UniformBuffers.LumenCardCaptureViewUniformBuffer.UpdateUniformBufferImmediate(*SharedView->CachedViewUniformShaderParameters);

                                FGraphicsMinimalPipelineStateSet GraphicsMinimalPipelineStateSet;
#if GPUCULL_TODO
                                if (Scene->GPUScene.IsEnabled())
                                {
                                    FRHIBuffer* DrawIndirectArgsBuffer = nullptr;
                                    FRHIBuffer* InstanceIdOffsetBuffer = nullptr;
                                    FInstanceCullingDrawParams& InstanceCullingDrawParams = PassParameters->InstanceCullingDrawParams;
                                    if (InstanceCullingDrawParams.DrawIndirectArgsBuffer != nullptr && InstanceCullingDrawParams.InstanceIdOffsetBuffer != nullptr)
                                    {
                                        DrawIndirectArgsBuffer = InstanceCullingDrawParams.DrawIndirectArgsBuffer->GetRHI();
                                        InstanceIdOffsetBuffer = InstanceCullingDrawParams.InstanceIdOffsetBuffer->GetRHI();
                                    }

                                    // With GPU culling, call the GPU-instanced submission path.
                                    SubmitGPUInstancedMeshDrawCommandsRange(
                                        LumenCardRenderer.MeshDrawCommands,
                                        GraphicsMinimalPipelineStateSet,
                                        CardRenderData.StartMeshDrawCommandIndex,
                                        CardRenderData.NumMeshDrawCommands,
                                        1,
                                        InstanceIdOffsetBuffer,
                                        DrawIndirectArgsBuffer,
                                        RHICmdList);
                                }
                                else
#endif // GPUCULL_TODO
                                {
                                    // Without GPU culling, call the regular draw submission path.
                                    SubmitMeshDrawCommandsRange(
                                        LumenCardRenderer.MeshDrawCommands,
                                        GraphicsMinimalPipelineStateSet,
                                        PrimitiveIdVertexBuffer,
                                        0,
                                        false,
                                        CardRenderData.StartMeshDrawCommandIndex,
                                        CardRenderData.NumMeshDrawCommands,
                                        1,
                                        RHICmdList);
                                }
                            }
                        }
                    }
                );
            }

            // Record the ids of the cards to render and check whether any Nanite meshes need rendering.
            bool bAnyNaniteMeshes = false;
            for (FCardRenderData& CardRenderData : CardsToRender)
            {
                bAnyNaniteMeshes = bAnyNaniteMeshes || CardRenderData.NaniteInstanceIds.Num() > 0 || CardRenderData.bDistantScene;
                LumenCardRenderer.CardIdsToRender.Add(CardRenderData.CardIndex);
            }

            // Render the Nanite meshes of the Lumen scene.
            if (UseNanite(ShaderPlatform) && ViewFamily.EngineShowFlags.NaniteMeshes && bAnyNaniteMeshes)
            {
                TRACE_CPUPROFILER_EVENT_SCOPE(NaniteMeshPass);
                QUICK_SCOPE_CYCLE_COUNTER(NaniteMeshPass);

                const FIntPoint DepthStencilAtlasSize = DepthStencilAtlasDesc.Extent;
                const FIntRect DepthAtlasRect = FIntRect(0, 0, DepthStencilAtlasSize.X, DepthStencilAtlasSize.Y);
                FRDGBufferSRVRef RectMinMaxBufferSRV = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(RectMinMaxBuffer, PF_R32G32B32A32_UINT));

                // Raster context.
                Nanite::FRasterContext RasterContext = Nanite::InitRasterContext(
                    GraphBuilder,
                    FeatureLevel,
                    DepthStencilAtlasSize,
                    Nanite::EOutputBufferMode::VisBuffer,
                    true,
                    RectMinMaxBufferSRV,
                    NumRects);

                const bool bUpdateStreaming = false;
                const bool bSupportsMultiplePasses = true;
                const bool bForceHWRaster = RasterContext.RasterScheduling == Nanite::ERasterScheduling::HardwareOnly;
                // Not the primary context (distinguishes this from Nanite's main pass).
                const bool bPrimaryContext = false;

                // Culling context.
                Nanite::FCullingContext CullingContext = Nanite::InitCullingContext(
                    GraphBuilder,
                    *Scene,
                    nullptr,
                    FIntRect(),
                    false,
                    bUpdateStreaming,
                    bSupportsMultiplePasses,
                    bForceHWRaster,
                    bPrimaryContext);

                // Multi-view rendering.
                if (GLumenSceneNaniteMultiViewCapture)
                {
                    const uint32 NumCardsToRender = CardsToRender.Num();

                    // The outer while loop splits the cards into batches so that no batch exceeds MAX_VIEWS_PER_CULL_RASTERIZE_PASS views.
                    uint32 NextCardIndex = 0;
                    while(NextCardIndex < NumCardsToRender)
                    {
                        TArray<Nanite::FPackedView, SceneRenderingAllocator> NaniteViews;
                        TArray<Nanite::FInstanceDraw, SceneRenderingAllocator> NaniteInstanceDraws;

                        // Build an FPackedViewParams for each card to render and add the resulting view to NaniteViews until the maximum view count is reached.
                        while(NextCardIndex < NumCardsToRender && NaniteViews.Num() < MAX_VIEWS_PER_CULL_RASTERIZE_PASS)
                        {
                            const FCardRenderData& CardRenderData = CardsToRender[NextCardIndex];

                            if(CardRenderData.NaniteInstanceIds.Num() > 0)
                            {
                                for(uint32 InstanceID : CardRenderData.NaniteInstanceIds)
                                {
                                    NaniteInstanceDraws.Add(Nanite::FInstanceDraw { InstanceID, (uint32)NaniteViews.Num() });
                                }

                                Nanite::FPackedViewParams Params;
                                Params.ViewMatrices = CardRenderData.ViewMatrices;
                                Params.PrevViewMatrices = CardRenderData.ViewMatrices;
                                Params.ViewRect = CardRenderData.AtlasAllocation;
                                Params.RasterContextSize = DepthStencilAtlasSize;
                                Params.LODScaleFactor = CardRenderData.NaniteLODScaleFactor;
                                NaniteViews.Add(Nanite::CreatePackedView(Params));
                            }

                            NextCardIndex++;
                        }

                        // Rasterize the cards.
                        if (NaniteInstanceDraws.Num() > 0)
                        {
                            RDG_EVENT_SCOPE(GraphBuilder, "Nanite::RasterizeLumenCards");

                            Nanite::FRasterState RasterState;
                            Nanite::CullRasterize(
                                GraphBuilder,
                                *Scene,
                                NaniteViews,
                                CullingContext,
                                RasterContext,
                                RasterState,
                                &NaniteInstanceDraws
                            );
                        }
                    }
                }
                else // Single-view rendering
                {
                    RDG_EVENT_SCOPE(GraphBuilder, "RenderLumenCardsWithNanite");

                    // Single-view rendering is brute force: iterate linearly over all cards to render, building one view and issuing one draw per card.
                    for(FCardRenderData& CardRenderData : CardsToRender)
                    {
                        if(CardRenderData.NaniteInstanceIds.Num() > 0)
                        {                        
                            TArray<Nanite::FInstanceDraw, SceneRenderingAllocator> NaniteInstanceDraws;
                            for( uint32 InstanceID : CardRenderData.NaniteInstanceIds )
                            {
                                NaniteInstanceDraws.Add( Nanite::FInstanceDraw { InstanceID, 0u } );
                            }
                        
                            CardRenderData.PatchView(GraphBuilder.RHICmdList, Scene, SharedView);
                            Nanite::FPackedView PackedView = Nanite::CreatePackedViewFromViewInfo(*SharedView, DepthStencilAtlasSize, 0);

                            Nanite::CullRasterize(
                                GraphBuilder,
                                *Scene,
                                { PackedView },
                                CullingContext,
                                RasterContext,
                                Nanite::FRasterState(),
                                &NaniteInstanceDraws
                            );
                        }
                    }
                }

                extern float GLumenDistantSceneMinInstanceBoundsRadius;

                // Render the whole scene for distant cards.
                for (FCardRenderData& CardRenderData : CardsToRender)
                {
                    // bDistantScene marks whether this is a distant card.
                    if (CardRenderData.bDistantScene)
                    {
                        Nanite::FRasterState RasterState;
                        RasterState.bNearClip = false;

                        CardRenderData.PatchView(GraphBuilder.RHICmdList, Scene, SharedView);
                        Nanite::FPackedView PackedView = Nanite::CreatePackedViewFromViewInfo(
                            *SharedView,
                            DepthStencilAtlasSize,
                            /*Flags*/ 0,
                            /*StreamingPriorityCategory*/ 0,
                            GLumenDistantSceneMinInstanceBoundsRadius,
                            Lumen::GetDistanceSceneNaniteLODScaleFactor());

                        Nanite::CullRasterize(
                            GraphBuilder,
                            *Scene,
                            { PackedView },
                            CullingContext,
                            RasterContext,
                            RasterState);
                    }
                }

                // Lumen mesh capture pass.
                Nanite::DrawLumenMeshCapturePass(
                    GraphBuilder,
                    *Scene,
                    SharedView,
                    CardsToRender,
                    CullingContext,
                    RasterContext,
                    PassUniformParameters,
                    RectMinMaxBufferSRV,
                    NumRects,
                    LumenSceneData.MaxAtlasSize,
                    AlbedoAtlasTexture,
                    NormalAtlasTexture,
                    EmissiveAtlasTexture,
                    DepthStencilAtlasTexture
                );
            }

            ConvertToExternalTexture(GraphBuilder, AlbedoAtlasTexture, LumenSceneData.AlbedoAtlas);
            ConvertToExternalTexture(GraphBuilder, NormalAtlasTexture, LumenSceneData.NormalAtlas);
            ConvertToExternalTexture(GraphBuilder, EmissiveAtlasTexture, LumenSceneData.EmissiveAtlas);
        }

        // Upload the card data.
        {
            QUICK_SCOPE_CYCLE_COUNTER(UploadCardIndexBuffers);

            // Upload the index buffer.
            {
                FRDGBufferRef CardIndexBuffer = GraphBuilder.CreateBuffer(
                    FRDGBufferDesc::CreateUploadDesc(sizeof(uint32), FMath::Max(LumenCardRenderer.CardIdsToRender.Num(), 1)),
                    TEXT("Lumen.CardsToRenderIndexBuffer"));

                FLumenCardIdUpload* PassParameters = GraphBuilder.AllocParameters<FLumenCardIdUpload>();
                PassParameters->CardIds = CardIndexBuffer;

                const uint32 CardIdBytes = LumenCardRenderer.CardIdsToRender.GetTypeSize() * LumenCardRenderer.CardIdsToRender.Num();
                const void* CardIdPtr = LumenCardRenderer.CardIdsToRender.GetData();

                GraphBuilder.AddPass(
                    RDG_EVENT_NAME("Upload CardsToRenderIndexBuffer NumIndices=%d", LumenCardRenderer.CardIdsToRender.Num()),
                    PassParameters,
                    ERDGPassFlags::Copy,
                    [PassParameters, CardIdBytes, CardIdPtr](FRHICommandListImmediate& RHICmdList)
                    {
                        if (CardIdBytes > 0)
                        {
                            void* DestCardIdPtr = RHILockBuffer(PassParameters->CardIds->GetRHI(), 0, CardIdBytes, RLM_WriteOnly);
                            FPlatformMemory::Memcpy(DestCardIdPtr, CardIdPtr, CardIdBytes);
                            RHIUnlockBuffer(PassParameters->CardIds->GetRHI());
                        }
                    });

                ConvertToExternalBuffer(GraphBuilder, CardIndexBuffer, LumenCardRenderer.CardsToRenderIndexBuffer);
            }

            // Upload the hash map buffer.
            {
                const uint32 NumHashMapUInt32 = FLumenCardRenderer::NumCardsToRenderHashMapBucketUInt32;
                const uint32 NumHashMapBytes = 4 * NumHashMapUInt32;
                const uint32 NumHashMapBuckets = 32 * NumHashMapUInt32;

                FRDGBufferRef CardHashMapBuffer = GraphBuilder.CreateBuffer(
                    FRDGBufferDesc::CreateUploadDesc(sizeof(uint32), NumHashMapUInt32),
                    TEXT("Lumen.CardsToRenderHashMapBuffer"));

                LumenCardRenderer.CardsToRenderHashMap.Init(0, NumHashMapBuckets);

                for (int32 CardIndex : LumenCardRenderer.CardIdsToRender)
                {
                    LumenCardRenderer.CardsToRenderHashMap[CardIndex % NumHashMapBuckets] = 1;
                }

                FLumenCardIdUpload* PassParameters = GraphBuilder.AllocParameters<FLumenCardIdUpload>();
                PassParameters->CardIds = CardHashMapBuffer;

                const void* HashMapDataPtr = LumenCardRenderer.CardsToRenderHashMap.GetData();

                GraphBuilder.AddPass(
                    RDG_EVENT_NAME("Upload CardsToRenderHashMapBuffer NumUInt32=%d", NumHashMapUInt32),
                    PassParameters,
                    ERDGPassFlags::Copy,
                    [PassParameters, NumHashMapBytes, HashMapDataPtr](FRHICommandListImmediate& RHICmdList)
                    {
                        if (NumHashMapBytes > 0)
                        {
                            void* DestCardIdPtr = RHILockBuffer(PassParameters->CardIds->GetRHI(), 0, NumHashMapBytes, RLM_WriteOnly);
                            FPlatformMemory::Memcpy(DestCardIdPtr, HashMapDataPtr, NumHashMapBytes);
                            RHIUnlockBuffer(PassParameters->CardIds->GetRHI());
                        }
                    });

                ConvertToExternalBuffer(GraphBuilder, CardHashMapBuffer, LumenCardRenderer.CardsToRenderHashMapBuffer);
            }

            // Upload the visible card index buffer.
            {
                FRDGBufferRef VisibleCardsIndexBuffer = GraphBuilder.CreateBuffer(
                    FRDGBufferDesc::CreateUploadDesc(sizeof(uint32), FMath::Max(LumenSceneData.VisibleCardsIndices.Num(), 1)),
                    TEXT("Lumen.VisibleCardsIndexBuffer"));

                FLumenCardIdUpload* PassParameters = GraphBuilder.AllocParameters<FLumenCardIdUpload>();
                PassParameters->CardIds = VisibleCardsIndexBuffer;

                const uint32 CardIdBytes = sizeof(uint32) * LumenSceneData.VisibleCardsIndices.Num();
                const void* CardIdPtr = LumenSceneData.VisibleCardsIndices.GetData();

                GraphBuilder.AddPass(
                    RDG_EVENT_NAME("Upload VisibleCardIndices NumIndices=%d", LumenSceneData.VisibleCardsIndices.Num()),
                    PassParameters,
                    ERDGPassFlags::Copy,
                    [PassParameters, CardIdBytes, CardIdPtr](FRHICommandListImmediate& RHICmdList)
                    {
                        if (CardIdBytes > 0)
                        {
                            void* DestCardIdPtr = RHILockBuffer(PassParameters->CardIds->GetRHI(), 0, CardIdBytes, RLM_WriteOnly);
                            FPlatformMemory::Memcpy(DestCardIdPtr, CardIdPtr, CardIdBytes);
                            RHIUnlockBuffer(PassParameters->CardIds->GetRHI());
                        }
                    });

                ConvertToExternalBuffer(GraphBuilder, VisibleCardsIndexBuffer, LumenSceneData.VisibleCardsIndexBuffer);
            }
        }

        // Prefilter the Lumen scene depth.
        if (LumenCardRenderer.CardIdsToRender.Num() > 0)
        {
            TRDGUniformBufferRef<FLumenCardScene> LumenCardSceneUniformBuffer;
            {
                FLumenCardScene* LumenCardSceneParameters = GraphBuilder.AllocParameters<FLumenCardScene>();
                SetupLumenCardSceneParameters(GraphBuilder, Scene, *LumenCardSceneParameters);
                LumenCardSceneUniformBuffer = GraphBuilder.CreateUniformBuffer(LumenCardSceneParameters);
            }

            PrefilterLumenSceneDepth(GraphBuilder, LumenCardSceneUniformBuffer, DepthStencilAtlasTexture, LumenCardRenderer.CardIdsToRender, View);
        }
    }

    FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
    LumenSceneData.CardIndicesToUpdateInBuffer.Reset();
    LumenSceneData.MeshCardsIndicesToUpdateInBuffer.Reset();
    LumenSceneData.DFObjectIndicesToUpdateInBuffer.Reset();
}

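The Lumen scene update consists mainly of the following steps: culling cards, uploading card IDs, caching the view and scene textures, capturing mesh cards, rasterizing the Lumen scene with cards as views, rendering distant cards, drawing the mesh capture, and uploading card data and visibility data.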

Since there are too many steps to cover each one in detail, this section focuses on the stages involved in capturing and rasterizing mesh cards.
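One of the upload steps in the listing above fills a bit array with `CardIndex % NumHashMapBuckets` before copying it to the GPU. The following is a minimal sketch of that idea, assuming the buffer acts as a one-bit-per-bucket membership filter; `CardBitFilter` and its methods are hypothetical names for illustration, and how the engine's shaders actually consume the buffer is not shown in the quoted source.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the bit array uploaded as CardsToRenderHashMapBuffer.
// Assumption: NumWords > 0, matching a fixed-size bucket count.
struct CardBitFilter
{
    std::vector<uint32_t> Words; // 32 buckets are packed into each 32-bit word

    explicit CardBitFilter(uint32_t NumWords) : Words(NumWords, 0u) {}

    uint32_t NumBuckets() const
    {
        return 32u * static_cast<uint32_t>(Words.size());
    }

    // Set the bit of the bucket that this card index maps to.
    void Insert(uint32_t CardIndex)
    {
        const uint32_t Bucket = CardIndex % NumBuckets();
        Words[Bucket / 32u] |= (1u << (Bucket % 32u));
    }

    // Never false for an inserted card, but can be true for a card that
    // was not inserted when two indices share a bucket.
    bool MayContain(uint32_t CardIndex) const
    {
        const uint32_t Bucket = CardIndex % NumBuckets();
        return ((Words[Bucket / 32u] >> (Bucket % 32u)) & 1u) != 0u;
    }
};
```

Because different card indices can share a bucket, a filter like this can only answer "possibly in the set"; an exact answer would still need the full ID list, which is presumably why the index buffer is uploaded alongside it.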

6.5.5.2 CardsToRender

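To explain the capture and rasterization stages, we first need to see how LumenCardRenderer.CardsToRender gets populated. The following works out which cards in the Lumen scene need to be captured and rendered; this is handled by BeginUpdateLumenSceneTasks in the InitView stage: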

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneRendering.cpp

void FDeferredShadingSceneRenderer::BeginUpdateLumenSceneTasks(FRDGBuilder& GraphBuilder)
{
    LLM_SCOPE_BYTAG(Lumen);

    const FViewInfo& MainView = Views[0];
    const bool bAnyLumenActive = ShouldRenderLumenDiffuseGI(Scene, MainView, true)
        || ShouldRenderLumenReflections(MainView, true);

    if (bAnyLumenActive
        && !ViewFamily.EngineShowFlags.HitProxies)
    {
        SCOPED_NAMED_EVENT(FDeferredShadingSceneRenderer_BeginUpdateLumenSceneTasks, FColor::Emerald);
        QUICK_SCOPE_CYCLE_COUNTER(BeginUpdateLumenSceneTasks);
        const double StartTime = FPlatformTime::Seconds();

        FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
        // Get the list of cards to render and reset it.
        TArray<FCardRenderData, SceneRenderingAllocator>& CardsToRender = LumenCardRenderer.CardsToRender;
        LumenCardRenderer.Reset();

        const int32 LocalLumenSceneGeneration = GLumenSceneGeneration;
        const bool bRecaptureLumenSceneOnce = LumenSceneData.Generation != LocalLumenSceneGeneration;
        LumenSceneData.Generation = LocalLumenSceneGeneration;
        const bool bReallocateAtlas = LumenSceneData.MaxAtlasSize != GetDesiredAtlasSize() 
            || (LumenSceneData.RadiosityAtlas && LumenSceneData.RadiosityAtlas->GetDesc().Extent != GetRadiosityAtlasSize(LumenSceneData.MaxAtlasSize))
            || GLumenSceneReset;

        if (GLumenSceneReset != 2)
        {
            GLumenSceneReset = 0;
        }

        LumenSceneData.NumMeshCardsToAddToSurfaceCache = 0;

        // Update dirty cards.
        UpdateDirtyCards(Scene, bReallocateAtlas, bRecaptureLumenSceneOnce);
        // Update the primitive information of the Lumen scene.
        UpdateLumenScenePrimitives(Scene);
        // Update the distant scene.
        UpdateDistantScene(Scene, Views[0]);

        const FVector LumenSceneCameraOrigin = GetLumenSceneViewOrigin(MainView, GetNumLumenVoxelClipmaps() - 1);
        const float MaxCardUpdateDistanceFromCamera = ComputeMaxCardUpdateDistanceFromCamera();

        // Reallocate the card atlas.
        if (bReallocateAtlas)
        {
            LumenSceneData.MaxAtlasSize = GetDesiredAtlasSize();
            // Everything should have been freed before the atlas is recreated.
            ensure(LumenSceneData.NumCardTexels == 0);

            LumenSceneData.AtlasAllocator = FBinnedTextureLayout(LumenSceneData.MaxAtlasSize, GLumenSceneCardAtlasAllocatorBinSize);
        }

        // Per-frame budget for card captures and captured card texels; unlimited when GLumenSceneRecaptureLumenSceneEveryFrame (console variable r.LumenScene.RecaptureEveryFrame) is nonzero.
        const int32 CardCapturesPerFrame = GLumenSceneRecaptureLumenSceneEveryFrame != 0 ? INT_MAX : GetMaxLumenSceneCardCapturesPerFrame();
        const int32 CardTexelsToCapturePerFrame = GLumenSceneRecaptureLumenSceneEveryFrame != 0 ? INT_MAX : GetLumenSceneCardResToCapturePerFrame() * GetLumenSceneCardResToCapturePerFrame();

        if (CardCapturesPerFrame > 0 && CardTexelsToCapturePerFrame > 0)
        {
            QUICK_SCOPE_CYCLE_COUNTER(FillCardsToRender);

            TArray<FLumenSurfaceCacheUpdatePacket, SceneRenderingAllocator> Packets;
            TArray<FMeshCardsAdd, SceneRenderingAllocator> MeshCardsAddsSortedByPriority;

            // Prepare the surface cache update.
            {
                TRACE_CPUPROFILER_EVENT_SCOPE(PrepareSurfaceCacheUpdate);

                const int32 NumPrimitivesPerPacket = FMath::Max(GLumenScenePrimitivesPerPacket, 1);
                const int32 NumPackets = FMath::DivideAndRoundUp(LumenSceneData.LumenPrimitives.Num(), NumPrimitivesPerPacket);

                CardsToRender.Reset(GetMaxLumenSceneCardCapturesPerFrame());
                Packets.Reserve(NumPackets);

                for (int32 PacketIndex = 0; PacketIndex < NumPackets; ++PacketIndex)
                {
                    Packets.Emplace(
                        LumenSceneData.LumenPrimitives,
                        LumenSceneData.MeshCards,
                        LumenSceneData.Cards,
                        LumenSceneCameraOrigin,
                        MaxCardUpdateDistanceFromCamera,
                        PacketIndex * NumPrimitivesPerPacket,
                        NumPrimitivesPerPacket);
                }
            }

            // Run the surface cache update preparation tasks.
            {
                TRACE_CPUPROFILER_EVENT_SCOPE(RunPrepareSurfaceCacheUpdate);
                const bool bExecuteInParallel = FApp::ShouldUseThreadingForPerformance();

                ParallelFor(Packets.Num(),
                    [&Packets](int32 Index)
                    {
                        Packets[Index].AnyThreadTask();
                    },
                    !bExecuteInParallel
                );
            }

            // Gather the results of the tasks above.
            {
                TRACE_CPUPROFILER_EVENT_SCOPE(PacketResults);

                const float CARD_DISTANCE_BUCKET_SIZE = 100.0f;
                uint32 NumMeshCardsAddsPerBucket[MAX_ADD_PRIMITIVE_PRIORITY + 1];

                for (int32 BucketIndex = 0; BucketIndex < UE_ARRAY_COUNT(NumMeshCardsAddsPerBucket); ++BucketIndex)
                {
                    NumMeshCardsAddsPerBucket[BucketIndex] = 0;
                }

                // Count how many cards fall into each bucket
                for (int32 PacketIndex = 0; PacketIndex < Packets.Num(); ++PacketIndex)
                {
                    const FLumenSurfaceCacheUpdatePacket& Packet = Packets[PacketIndex];
                    LumenSceneData.NumMeshCardsToAddToSurfaceCache += Packet.MeshCardsAdds.Num();

                    for (int32 CardIndex = 0; CardIndex < Packet.MeshCardsAdds.Num(); ++CardIndex)
                    {
                        const FMeshCardsAdd& MeshCardsAdd = Packet.MeshCardsAdds[CardIndex];
                        ++NumMeshCardsAddsPerBucket[MeshCardsAdd.Priority];
                    }
                }

                int32 NumMeshCardsInBucketsUpToMaxBucket = 0;
                int32 MaxBucketIndexToAdd = 0;

                // Select the first N buckets for allocation.
                for (int32 BucketIndex = 0; BucketIndex < UE_ARRAY_COUNT(NumMeshCardsAddsPerBucket); ++BucketIndex)
                {
                    NumMeshCardsInBucketsUpToMaxBucket += NumMeshCardsAddsPerBucket[BucketIndex];
                    MaxBucketIndexToAdd = BucketIndex;

                    if (NumMeshCardsInBucketsUpToMaxBucket > CardCapturesPerFrame)
                    {
                        break;
                    }
                }

                MeshCardsAddsSortedByPriority.Reserve(GetMaxLumenSceneCardCapturesPerFrame());

                // Copy the first N buckets into MeshCardsAddsSortedByPriority.
                for (int32 PacketIndex = 0; PacketIndex < Packets.Num(); ++PacketIndex)
                {
                    const FLumenSurfaceCacheUpdatePacket& Packet = Packets[PacketIndex];

                    for (int32 CardIndex = 0; CardIndex < Packet.MeshCardsAdds.Num() && MeshCardsAddsSortedByPriority.Num() < CardCapturesPerFrame; ++CardIndex)
                    {
                        const FMeshCardsAdd& MeshCardsAdd = Packet.MeshCardsAdds[CardIndex];

                        if (MeshCardsAdd.Priority <= MaxBucketIndexToAdd)
                        {
                            MeshCardsAddsSortedByPriority.Add(MeshCardsAdd);
                        }
                    }
                }

                // Remove all mesh cards that are no longer visible.
                for (int32 PacketIndex = 0; PacketIndex < Packets.Num(); ++PacketIndex)
                {
                    const FLumenSurfaceCacheUpdatePacket& Packet = Packets[PacketIndex];

                    for (int32 MeshCardsToRemoveIndex = 0; MeshCardsToRemoveIndex < Packet.MeshCardsRemoves.Num(); ++MeshCardsToRemoveIndex)
                    {
                        const FMeshCardsRemove& MeshCardsRemove = Packet.MeshCardsRemoves[MeshCardsToRemoveIndex];
                        FLumenPrimitive& LumenPrimitive = LumenSceneData.LumenPrimitives[MeshCardsRemove.LumenPrimitiveIndex];
                        FLumenPrimitiveInstance& LumenPrimitiveInstance = LumenPrimitive.Instances[MeshCardsRemove.LumenInstanceIndex];

                        LumenSceneData.RemoveMeshCards(LumenPrimitive, LumenPrimitiveInstance);
                    }
                }
            }

            // Allocate the distant scene.
            extern int32 GLumenUpdateDistantSceneCaptures;
            if (GLumenUpdateDistantSceneCaptures)
            {
                for (int32 DistantCardIndex : LumenSceneData.DistantCardIndices)
                {
                    FLumenCard& DistantCard = LumenSceneData.Cards[DistantCardIndex];

                    extern int32 GLumenDistantSceneCardResolution;
                    DistantCard.DesiredResolution = FIntPoint(GLumenDistantSceneCardResolution, GLumenDistantSceneCardResolution);

                    if (!DistantCard.bVisible)
                    {
                        LumenSceneData.AddCardToVisibleCardList(DistantCardIndex);
                        DistantCard.bVisible = true;
                    }

                    DistantCard.RemoveFromAtlas(LumenSceneData);
                    LumenSceneData.CardIndicesToUpdateInBuffer.Add(DistantCardIndex);

                    // Add to the CardsToRender list.
                    CardsToRender.Add(FCardRenderData(
                        DistantCard,
                        nullptr,
                        -1,
                        FeatureLevel,
                        DistantCardIndex));
                }
            }

            // Allocate new cards.
            for (int32 SortedCardIndex = 0; SortedCardIndex < MeshCardsAddsSortedByPriority.Num(); ++SortedCardIndex)
            {
                const FMeshCardsAdd& MeshCardsAdd = MeshCardsAddsSortedByPriority[SortedCardIndex];
                FLumenPrimitive& LumenPrimitive = LumenSceneData.LumenPrimitives[MeshCardsAdd.LumenPrimitiveIndex];
                FLumenPrimitiveInstance& LumenPrimitiveInstance = LumenPrimitive.Instances[MeshCardsAdd.LumenInstanceIndex];

                LumenSceneData.AddMeshCards(MeshCardsAdd.LumenPrimitiveIndex, MeshCardsAdd.LumenInstanceIndex);

                if (LumenPrimitiveInstance.MeshCardsIndex >= 0)
                {
                    // Get the mesh cards of the primitive instance.
                    const FLumenMeshCards& MeshCards = LumenSceneData.MeshCards[LumenPrimitiveInstance.MeshCardsIndex];

                    // Iterate over every card of the mesh cards and add the valid ones to the CardsToRender list.
                    for (uint32 CardIndex = MeshCards.FirstCardIndex; CardIndex < MeshCards.FirstCardIndex + MeshCards.NumCards; ++CardIndex)
                    {
                        FLumenCard& LumenCard = LumenSceneData.Cards[CardIndex];

                        // Allocate the card.
                        FCardAllocationOutput CardAllocation;
                        ComputeCardAllocation(LumenCard, LumenSceneCameraOrigin, MaxCardUpdateDistanceFromCamera, CardAllocation);

                        LumenCard.DesiredResolution = CardAllocation.TextureAllocationSize;

                        if (LumenCard.bVisible != CardAllocation.bVisible)
                        {
                            LumenCard.bVisible = CardAllocation.bVisible;
                            if (LumenCard.bVisible)
                            {
                                LumenSceneData.AddCardToVisibleCardList(CardIndex);
                            }
                            else
                            {
                                LumenCard.RemoveFromAtlas(LumenSceneData);
                                LumenSceneData.RemoveCardFromVisibleCardList(CardIndex);
                            }
                            LumenSceneData.CardIndicesToUpdateInBuffer.Add(CardIndex);
                        }

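                        // Add to CardsToRender only if the card is visible and both the width and height of its atlas allocation differ from the desired resolution.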
                        if (LumenCard.bVisible && LumenCard.AtlasAllocation.Width() != LumenCard.DesiredResolution.X && LumenCard.AtlasAllocation.Height() != LumenCard.DesiredResolution.Y)
                        {
                            LumenCard.RemoveFromAtlas(LumenSceneData);
                            LumenSceneData.CardIndicesToUpdateInBuffer.Add(CardIndex);

                            // Add to the CardsToRender list.
                            CardsToRender.Add(FCardRenderData(
                                LumenCard,
                                LumenPrimitive.Primitive,
                                LumenPrimitive.bMergedInstances ? -1 : MeshCardsAdd.LumenInstanceIndex,
                                FeatureLevel,
                                CardIndex));

                            LumenCardRenderer.NumCardTexelsToCapture += LumenCard.AtlasAllocation.Area();
                        }
                    } // for

                    // Stop once the card count or the card texel count reaches the per-frame budget.
                    if (CardsToRender.Num() >= CardCapturesPerFrame
                        || LumenCardRenderer.NumCardTexelsToCapture >= CardTexelsToCapturePerFrame)
                    {
                        break;
                    }
                }
            }
        }

        // Allocate and update the card atlases.
        AllocateOptionalCardAtlases(GraphBuilder, LumenSceneData, MainView, bReallocateAtlas);
        UpdateLumenCardAtlasAllocation(GraphBuilder, MainView, bReallocateAtlas, bRecaptureLumenSceneOnce);

        // Process the cards to render.
        if (CardsToRender.Num() > 0)
        {
            // Set up the mesh pass.
            {
                QUICK_SCOPE_CYCLE_COUNTER(MeshPassSetup);

                // Make sure all mesh rendering data is ready before rendering.
                {
                    QUICK_SCOPE_CYCLE_COUNTER(PrepareStaticMeshData);

                    // Set of unique primitives requiring static mesh update
                    TSet<FPrimitiveSceneInfo*> PrimitivesToUpdateStaticMeshes;

                    for (FCardRenderData& CardRenderData : CardsToRender)
                    {
                        FPrimitiveSceneInfo* PrimitiveSceneInfo = CardRenderData.PrimitiveSceneInfo;

                        if (PrimitiveSceneInfo && PrimitiveSceneInfo->Proxy->AffectsDynamicIndirectLighting())
                        {
                            if (PrimitiveSceneInfo->NeedsUniformBufferUpdate())
                            {
                                PrimitiveSceneInfo->UpdateUniformBuffer(GraphBuilder.RHICmdList);
                            }

                            if (PrimitiveSceneInfo->NeedsUpdateStaticMeshes())
                            {
                                PrimitivesToUpdateStaticMeshes.Add(PrimitiveSceneInfo);
                            }
                        }
                    }

                    if (PrimitivesToUpdateStaticMeshes.Num() > 0)
                    {
                        TArray<FPrimitiveSceneInfo*> UpdatedSceneInfos;
                        UpdatedSceneInfos.Reserve(PrimitivesToUpdateStaticMeshes.Num());
                        for (FPrimitiveSceneInfo* PrimitiveSceneInfo : PrimitivesToUpdateStaticMeshes)
                        {
                            UpdatedSceneInfos.Add(PrimitiveSceneInfo);
                        }

                        FPrimitiveSceneInfo::UpdateStaticMeshes(GraphBuilder.RHICmdList, Scene, UpdatedSceneInfos, true);
                    }
                }

                // Add the card capture draws.
                for (FCardRenderData& CardRenderData : CardsToRender)
                {
                    CardRenderData.StartMeshDrawCommandIndex = LumenCardRenderer.MeshDrawCommands.Num();
                    CardRenderData.NumMeshDrawCommands = 0;
                    int32 NumNanitePrimitives = 0;

                    const FLumenCard& Card = LumenSceneData.Cards[CardRenderData.CardIndex];
                    checkSlow(Card.bVisible && Card.bAllocated);

                    // Create or process the FVisibleMeshDrawCommand entries for the card.
                    AddCardCaptureDraws(Scene, 
                        GraphBuilder.RHICmdList, 
                        CardRenderData, 
                        LumenCardRenderer.MeshDrawCommands, 
                        LumenCardRenderer.MeshDrawPrimitiveIds);

                    CardRenderData.NumMeshDrawCommands = LumenCardRenderer.MeshDrawCommands.Num() - CardRenderData.StartMeshDrawCommandIndex;
                }
            }

            (.....)
        }
    }
}

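As the code shows, mesh cards are not updated every frame. A mesh card is added to the render list only when it is visible and its atlas allocation no longer matches the desired resolution. There is also a per-frame cap on the number of card captures and captured texels, which prevents too many cards from being updated and drawn in a single frame and becoming a performance bottleneck. The cap is lifted only when GLumenSceneRecaptureLumenSceneEveryFrame (console variable r.LumenScene.RecaptureEveryFrame) is enabled.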
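The bucket-based selection inside BeginUpdateLumenSceneTasks can be sketched in isolation. The following is a simplified illustration rather than engine code: `MeshCardsRequest` and `SelectByPriorityBuckets` are hypothetical names, and the sketch assumes that a lower priority value means a more urgent request, as the "first N buckets" wording in the listing suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for FMeshCardsAdd.
// Precondition: Priority lies in [0, MaxPriority].
struct MeshCardsRequest
{
    int PrimitiveIndex;
    int Priority;
};

std::vector<MeshCardsRequest> SelectByPriorityBuckets(
    const std::vector<MeshCardsRequest>& Requests,
    int MaxPriority,
    std::size_t Budget)
{
    // 1. Count how many requests fall into each priority bucket.
    std::vector<std::size_t> CountPerBucket(static_cast<std::size_t>(MaxPriority) + 1, 0);
    for (const MeshCardsRequest& Request : Requests)
    {
        ++CountPerBucket[static_cast<std::size_t>(Request.Priority)];
    }

    // 2. Walk the buckets in order and stop at the first one whose
    //    running total exceeds the budget. That bucket is still included.
    std::size_t Accumulated = 0;
    int MaxBucketToAdd = 0;
    for (int Bucket = 0; Bucket <= MaxPriority; ++Bucket)
    {
        Accumulated += CountPerBucket[static_cast<std::size_t>(Bucket)];
        MaxBucketToAdd = Bucket;
        if (Accumulated > Budget)
        {
            break;
        }
    }

    // 3. Copy requests from the chosen buckets, in input order,
    //    until the budget is used up.
    std::vector<MeshCardsRequest> Selected;
    for (const MeshCardsRequest& Request : Requests)
    {
        if (Selected.size() >= Budget)
        {
            break;
        }
        if (Request.Priority <= MaxBucketToAdd)
        {
            Selected.push_back(Request);
        }
    }
    return Selected;
}
```

The histogram makes this a single linear pass with no sort. The trade-off in this sketch is that the result is filtered rather than strictly ordered: requests from the cutoff bucket are copied in input order, so one of them can be taken ahead of a more urgent request that appears later in the input.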

6.5.5.3 MeshCardCapture

With the way mesh cards enter the render list covered, we can move on to the capture itself:

// Capture mesh cards.
{
    FLumenCardPassParameters* PassParameters = GraphBuilder.AllocParameters<FLumenCardPassParameters>();
    // Card view information.
    PassParameters->View = Scene->UniformBuffers.LumenCardCaptureViewUniformBuffer;
    PassParameters->CardPass = GraphBuilder.CreateUniformBuffer(PassUniformParameters);
    // The atlas has three render targets: base color, normal, and emissive.
    PassParameters->RenderTargets[0] = FRenderTargetBinding(AlbedoAtlasTexture, ERenderTargetLoadAction::ELoad);
    PassParameters->RenderTargets[1] = FRenderTargetBinding(NormalAtlasTexture, ERenderTargetLoadAction::ELoad);
    PassParameters->RenderTargets[2] = FRenderTargetBinding(EmissiveAtlasTexture, ERenderTargetLoadAction::ELoad);
    // Depth target buffer.
    PassParameters->RenderTargets.DepthStencil = FDepthStencilBinding(DepthStencilAtlasTexture, ERenderTargetLoadAction::ELoad, FExclusiveDepthStencil::DepthWrite_StencilNop);

    InstanceCullingResult.GetDrawParameters(PassParameters->InstanceCullingDrawParams);

    // Mesh card capture pass.
    GraphBuilder.AddPass(
        RDG_EVENT_NAME("MeshCardCapture"),
        PassParameters,
        ERDGPassFlags::Raster,
        [this, Scene = Scene, PrimitiveIdVertexBuffer, SharedView, &CardsToRender, PassParameters](FRHICommandList& RHICmdList)
        {
            QUICK_SCOPE_CYCLE_COUNTER(MeshPass);

            // Prepare the data for every card to render and submit its draw commands.
            for (FCardRenderData& CardRenderData : CardsToRender)
            {
                if (CardRenderData.NumMeshDrawCommands > 0)
                {
                    FIntRect AtlasRect = CardRenderData.AtlasAllocation;
                    // Set the viewport.
                    RHICmdList.SetViewport(AtlasRect.Min.X, AtlasRect.Min.Y, 0.0f, AtlasRect.Max.X, AtlasRect.Max.Y, 1.0f);

                    // Patch the view data.
                    CardRenderData.PatchView(RHICmdList, Scene, SharedView);
                    Scene->UniformBuffers.LumenCardCaptureViewUniformBuffer.UpdateUniformBufferImmediate(*SharedView->CachedViewUniformShaderParameters);

                    FGraphicsMinimalPipelineStateSet GraphicsMinimalPipelineStateSet;
                #if GPUCULL_TODO
                    if (Scene->GPUScene.IsEnabled())
                    {
                        FRHIBuffer* DrawIndirectArgsBuffer = nullptr;
                        FRHIBuffer* InstanceIdOffsetBuffer = nullptr;
                        FInstanceCullingDrawParams& InstanceCullingDrawParams = PassParameters->InstanceCullingDrawParams;
                        if (InstanceCullingDrawParams.DrawIndirectArgsBuffer != nullptr && InstanceCullingDrawParams.InstanceIdOffsetBuffer != nullptr)
                        {
                            DrawIndirectArgsBuffer = InstanceCullingDrawParams.DrawIndirectArgsBuffer->GetRHI();
                            InstanceIdOffsetBuffer = InstanceCullingDrawParams.InstanceIdOffsetBuffer->GetRHI();
                        }

                        // GPU culling goes through the GPU-instanced submit interface.
                        SubmitGPUInstancedMeshDrawCommandsRange(
                            LumenCardRenderer.MeshDrawCommands,
                            GraphicsMinimalPipelineStateSet,
                            CardRenderData.StartMeshDrawCommandIndex,
                            CardRenderData.NumMeshDrawCommands,
                            1,
                            InstanceIdOffsetBuffer,
                            DrawIndirectArgsBuffer,
                            RHICmdList);
                    }
                #endif // GPUCULL_TODO
                    (......)
                }
            }
        }
    );
}

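In the card drawing stage, each mesh card is rendered at low resolution from a different direction to obtain a projection of the mesh surface attributes, and these projected attributes are stored in texture atlases. Unlike the traditional rendering pipeline, only three attributes of the Nanite meshes inside the card view are rasterized here: base color, normal, and emissive (see the figure below).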

Mesh attribute atlases projected onto the mesh cards during the card capture stage. Top: base color atlas; bottom: normal atlas.

Below are the VS and PS used for capturing mesh cards:

// Engine\Shaders\Private\Lumen\LumenCardVertexShader.usf

struct FLumenCardInterpolantsVSToPS
{
};

struct FLumenCardVSToPS
{
    FVertexFactoryInterpolantsVSToPS FactoryInterpolants;
    FLumenCardInterpolantsVSToPS PassInterpolants;
    float4 Position : SV_POSITION;
};

// Main entry point of the mesh card VS.
void Main(
    FVertexFactoryInput Input,
    OPTIONAL_VertexID
    out FLumenCardVSToPS Output
    )
{    
    uint EyeIndex = 0;
    ResolvedView = ResolveView();

    FVertexFactoryIntermediates VFIntermediates = GetVertexFactoryIntermediates(Input);
    float4 WorldPositionExcludingWPO = VertexFactoryGetWorldPosition(Input, VFIntermediates);
    float4 WorldPosition = WorldPositionExcludingWPO;
    float4 ClipSpacePosition;

    float3x3 TangentToLocal = VertexFactoryGetTangentToLocal(Input, VFIntermediates);    
    FMaterialVertexParameters VertexParameters = GetMaterialVertexParameters(Input, VFIntermediates, WorldPosition.xyz, TangentToLocal);

    ISOLATE
    {
        // Apply the material's world position offset.
        WorldPosition.xyz += GetMaterialWorldPositionOffset(VertexParameters);
        // Get the rasterized world position.
        float4 RasterizedWorldPosition = VertexFactoryGetRasterizedWorldPosition(Input, VFIntermediates, WorldPosition);
        // Transform the position into clip space.
        ClipSpacePosition = INVARIANT(mul(RasterizedWorldPosition, ResolvedView.TranslatedWorldToClip));
        Output.Position = INVARIANT(ClipSpacePosition);
    }

    bool bClampToNearPlane = false;// GetPrimitiveData(Input.PrimitiveId).ObjectWorldPositionAndRadius.w < .5f * max();

    if (bClampToNearPlane && Output.Position.z < 0)
    {
        Output.Position.z = 0.01f;
        Output.Position.w = 1.0f;
    }

    Output.FactoryInterpolants = VertexFactoryGetInterpolantsVSToPS(Input, VFIntermediates, VertexParameters);
}


// Engine\Shaders\Private\Lumen\LumenCardPixelShader.usf

struct FLumenCardInterpolantsVSToPS
{
};

// Main entry point of the mesh card PS.
void Main(
    FVertexFactoryInterpolantsVSToPS Interpolants,
    FLumenCardInterpolantsVSToPS PassInterpolants,
    in INPUT_POSITION_QUALIFIERS float4 SvPosition : SV_Position        // after all interpolators
    OPTIONAL_IsFrontFace,
    out float4 OutTarget0 : SV_Target0,
    out float4 OutTarget1 : SV_Target1,
    out float4 OutTarget2 : SV_Target2)
{
    ResolvedView = ResolveView();

    // Get the basic material parameters.
    FMaterialPixelParameters MaterialParameters = GetMaterialPixelParameters(Interpolants, SvPosition);
    FPixelMaterialInputs PixelMaterialInputs;
    
    // Compute the additional material parameters.
    {
        float4 ScreenPosition = SvPositionToResolvedScreenPosition(SvPosition);
        float3 TranslatedWorldPosition = SvPositionToResolvedTranslatedWorld(SvPosition);
        CalcMaterialParametersEx(MaterialParameters, PixelMaterialInputs, SvPosition, ScreenPosition, bIsFrontFace, TranslatedWorldPosition, TranslatedWorldPosition);
    }

    // Apply material coverage and clipping.
    GetMaterialCoverageAndClipping(MaterialParameters, PixelMaterialInputs);

    float3 BaseColor = GetMaterialBaseColor(PixelMaterialInputs);
    float  Metallic = GetMaterialMetallic(PixelMaterialInputs);
    float  Specular = GetMaterialSpecular(PixelMaterialInputs);

    float Roughness = GetMaterialRoughness(PixelMaterialInputs);
    float Opacity = GetMaterialOpacity(PixelMaterialInputs);

    float3 DiffuseColor = BaseColor - BaseColor * Metallic;
    float3 SpecularColor = lerp(0.08 * Specular.xxx, BaseColor, Metallic.xxx);

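    // Approximate the environment BRDF for a fully rough surface (modifies DiffuseColor and SpecularColor in place).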
    EnvBRDFApproxFullyRough(DiffuseColor, SpecularColor);

    // Store base color (sqrt-encoded diffuse plus opacity), normal, and emissive.
    //@todo DynamicGI better encoding for low precision, hemispherical normal encoding
    OutTarget0 = float4(sqrt(DiffuseColor), Opacity);
    OutTarget1 = float4(MaterialParameters.WorldNormal * .5f + .5f, 0);
    OutTarget2 = float4(GetMaterialEmissive(PixelMaterialInputs), 0);
}

The VS takes a cuboid in local space as input and outputs a cuboid in clip space:

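After the PS finishes, the data is stored at the corresponding locations in the three RT atlases for base color, normal, and emissive. Note that the VS and PS logic here is far simpler than that of the traditional BasePass, which is one of the important optimizations that allow Lumen to run in real time.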
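The two encodings written by the pixel shader, `sqrt(DiffuseColor)` and `WorldNormal * .5f + .5f`, can be illustrated on the CPU. This is a minimal sketch: the decode functions and the 8-bit quantization are assumptions added for illustration, since the quoted source shows neither the atlas formats nor the code that reads the atlases back.

```cpp
#include <cassert>
#include <cmath>

struct Float3 { float X, Y, Z; };

// Mirrors OutTarget0: storing the square root gives dark values more
// of the available precision in a low bit depth target.
inline Float3 EncodeAlbedo(Float3 Diffuse)
{
    return { std::sqrt(Diffuse.X), std::sqrt(Diffuse.Y), std::sqrt(Diffuse.Z) };
}

// Assumed inverse of EncodeAlbedo (not shown in the quoted source).
inline Float3 DecodeAlbedo(Float3 Encoded)
{
    return { Encoded.X * Encoded.X, Encoded.Y * Encoded.Y, Encoded.Z * Encoded.Z };
}

// Mirrors OutTarget1: remap each component from [-1, 1] to [0, 1].
inline Float3 EncodeNormal(Float3 Normal)
{
    return { Normal.X * 0.5f + 0.5f, Normal.Y * 0.5f + 0.5f, Normal.Z * 0.5f + 0.5f };
}

// Assumed inverse of EncodeNormal (not shown in the quoted source).
inline Float3 DecodeNormal(Float3 Encoded)
{
    return { Encoded.X * 2.0f - 1.0f, Encoded.Y * 2.0f - 1.0f, Encoded.Z * 2.0f - 1.0f };
}

// Simulate storage in an 8-bit unorm channel (an assumption for the demo).
inline float Quantize8(float Value)
{
    const float Clamped = std::fmin(std::fmax(Value, 0.0f), 1.0f);
    return std::round(Clamped * 255.0f) / 255.0f;
}
```

For a dark value such as 0.01, quantizing the square root and squaring it after readback lands closer to the original than quantizing the linear value directly, which is the usual motivation for this kind of encoding.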

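Incidentally, deciding where a new card lands in the atlas is a bin packing problem; at render time it is enough to set the start point and size on the viewport. The type responsible is FBinnedTextureLayout, and related types include FTextureLayout and FTextureLayout3d. For example, in one frame capture the card's viewport is at (0, 0) with a size of (64, 64), meaning it is rendered into the first 64x64 region of the atlas.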
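To make the packing idea concrete, here is a minimal sketch of a shelf (row-based) rectangle allocator. It is deliberately not the algorithm used by FBinnedTextureLayout, whose internals are not shown in this article; `ShelfAtlasAllocator` and `AtlasRect` are hypothetical names, and freeing rectangles is left out.

```cpp
#include <algorithm>
#include <cassert>

struct AtlasRect { int X, Y, Width, Height; };

// Hypothetical shelf packer: fills the atlas row by row, left to right.
class ShelfAtlasAllocator
{
public:
    ShelfAtlasAllocator(int InWidth, int InHeight)
        : AtlasWidth(InWidth), AtlasHeight(InHeight) {}

    // Returns true and fills OutRect on success. State is left
    // unchanged when the request does not fit.
    bool Allocate(int Width, int Height, AtlasRect& OutRect)
    {
        if (Width <= 0 || Height <= 0 || Width > AtlasWidth)
        {
            return false;
        }

        int X = CursorX;
        int Y = CursorY;
        int CurrentShelfHeight = ShelfHeight;

        // Open a new shelf below the current one when the row is full.
        if (X + Width > AtlasWidth)
        {
            Y += CurrentShelfHeight;
            X = 0;
            CurrentShelfHeight = 0;
        }

        if (Y + Height > AtlasHeight)
        {
            return false; // Out of vertical space.
        }

        CursorX = X + Width;
        CursorY = Y;
        ShelfHeight = std::max(CurrentShelfHeight, Height);
        OutRect = AtlasRect{ X, Y, Width, Height };
        return true;
    }

private:
    int AtlasWidth;
    int AtlasHeight;
    int CursorX = 0;
    int CursorY = 0;
    int ShelfHeight = 0;
};
```

With a 128x128 atlas, the first 64x64 request lands at (0, 0), matching the viewport described above. The returned rectangle's origin, together with origin plus size as the maximum, corresponds to the min and max arguments passed to SetViewport in the capture pass.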

Also, the draw commands for mesh cards are built in FLumenCardMeshProcessor:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneRendering.cpp

void FLumenCardMeshProcessor::AddMeshBatch(const FMeshBatch& RESTRICT MeshBatch, uint64 BatchElementMask, const FPrimitiveSceneProxy* RESTRICT PrimitiveSceneProxy, int32 StaticMeshId)
{
    LLM_SCOPE_BYTAG(Lumen);

    if (MeshBatch.bUseForMaterial && DoesPlatformSupportLumenGI(GetFeatureLevelShaderPlatform(FeatureLevel)))
    {
        // Resolve the material.
        const FMaterialRenderProxy* FallbackMaterialRenderProxyPtr = nullptr;
        const FMaterial& Material = MeshBatch.MaterialRenderProxy->GetMaterialWithFallback(FeatureLevel, FallbackMaterialRenderProxyPtr);

        const FMaterialRenderProxy& MaterialRenderProxy = FallbackMaterialRenderProxyPtr ? *FallbackMaterialRenderProxyPtr : *MeshBatch.MaterialRenderProxy;

        // Resolve the render state.
        const EBlendMode BlendMode = Material.GetBlendMode();
        const FMaterialShadingModelField ShadingModels = Material.GetShadingModels();
        const bool bIsTranslucent = IsTranslucentBlendMode(BlendMode);
        const FMeshDrawingPolicyOverrideSettings OverrideSettings = ComputeMeshOverrideSettings(MeshBatch);
        const ERasterizerFillMode MeshFillMode = ComputeMeshFillMode(MeshBatch, Material, OverrideSettings);
        const ERasterizerCullMode MeshCullMode = ComputeMeshCullMode(MeshBatch, Material, OverrideSettings);

        if (!bIsTranslucent
            && (PrimitiveSceneProxy && PrimitiveSceneProxy->ShouldRenderInMainPass() && PrimitiveSceneProxy->AffectsDynamicIndirectLighting())
            && ShouldIncludeDomainInMeshPass(Material.GetMaterialDomain()))
        {
            // Select the shaders (VS and PS).
            const FVertexFactory* VertexFactory = MeshBatch.VertexFactory;
            FVertexFactoryType* VertexFactoryType = VertexFactory->GetType();

            TMeshProcessorShaders<FLumenCardVS, FLumenCardPS> PassShaders;

            PassShaders.VertexShader = Material.GetShader<FLumenCardVS>(VertexFactoryType);
            PassShaders.PixelShader = Material.GetShader<FLumenCardPS>(VertexFactoryType);

            FMeshMaterialShaderElementData ShaderElementData;
            ShaderElementData.InitializeMeshMaterialData(ViewIfDynamicMeshCommand, PrimitiveSceneProxy, MeshBatch, StaticMeshId, false);

            const FMeshDrawCommandSortKey SortKey = CalculateMeshStaticSortKey(PassShaders.VertexShader, PassShaders.PixelShader);

            // Build the draw commands.
            BuildMeshDrawCommands(
                MeshBatch,
                BatchElementMask,
                PrimitiveSceneProxy,
                MaterialRenderProxy,
                Material,
                PassDrawRenderState,
                PassShaders,
                MeshFillMode,
                MeshCullMode,
                SortKey,
                EMeshPassFeatures::Default,
                ShaderElementData);
        }
    }
}

6.5.5.4 RasterizeLumenCards

The logic for rasterizing Lumen cards is as follows:

if (UseNanite(ShaderPlatform) && ViewFamily.EngineShowFlags.NaniteMeshes && bAnyNaniteMeshes)
{
    (......)

    Nanite::FRasterContext RasterContext = Nanite::InitRasterContext(...);

    (......)

    Nanite::FCullingContext CullingContext = Nanite::InitCullingContext(...);

    if (GLumenSceneNaniteMultiViewCapture) // Multi-view capture mode.
    {
        const uint32 NumCardsToRender = CardsToRender.Num();

        // Split the views into batches so that no batch exceeds the per-pass maximum.
        uint32 NextCardIndex = 0;
        while(NextCardIndex < NumCardsToRender)
        {
            TArray<Nanite::FPackedView, SceneRenderingAllocator> NaniteViews;
            TArray<Nanite::FInstanceDraw, SceneRenderingAllocator> NaniteInstanceDraws;

            while(NextCardIndex < NumCardsToRender && NaniteViews.Num() < MAX_VIEWS_PER_CULL_RASTERIZE_PASS)
            {
                const FCardRenderData& CardRenderData = CardsToRender[NextCardIndex];

                if(CardRenderData.NaniteInstanceIds.Num() > 0)
                {
                    for(uint32 InstanceID : CardRenderData.NaniteInstanceIds)
                    {
                        NaniteInstanceDraws.Add(Nanite::FInstanceDraw { InstanceID, (uint32)NaniteViews.Num() });
                    }

                    Nanite::FPackedViewParams Params;
                    Params.ViewMatrices = CardRenderData.ViewMatrices;
                    Params.PrevViewMatrices = CardRenderData.ViewMatrices;
                    Params.ViewRect = CardRenderData.AtlasAllocation;
                    Params.RasterContextSize = DepthStencilAtlasSize;
                    Params.LODScaleFactor = CardRenderData.NaniteLODScaleFactor;
                    NaniteViews.Add(Nanite::CreatePackedView(Params));
                }

                NextCardIndex++;
            }

            // Instanced draw.
            if (NaniteInstanceDraws.Num() > 0)
            {
                RDG_EVENT_SCOPE(GraphBuilder, "Nanite::RasterizeLumenCards");

                Nanite::FRasterState RasterState;
                Nanite::CullRasterize(
                    GraphBuilder,
                    *Scene,
                    NaniteViews,
                    CullingContext,
                    RasterContext,
                    RasterState,
                    &NaniteInstanceDraws
                );
            }
        }
    }
    else // Single-view mode.
    {
        (......)
    }
    
    extern float GLumenDistantSceneMinInstanceBoundsRadius;

    // Render the distant-scene cards.
    for (FCardRenderData& CardRenderData : CardsToRender)
    {
        if (CardRenderData.bDistantScene)
        {
            (......)
        }
    }

    // Draw the Lumen mesh capture pass.
    Nanite::DrawLumenMeshCapturePass(
        GraphBuilder,
        *Scene,
        SharedView,
        CardsToRender,
        CullingContext,
        RasterContext,
        PassUniformParameters,
        RectMinMaxBufferSRV,
        NumRects,
        LumenSceneData.MaxAtlasSize,
        AlbedoAtlasTexture,
        NormalAtlasTexture,
        EmissiveAtlasTexture,
        DepthStencilAtlasTexture
    );
}
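
The batching pattern in the multi-view branch above can be isolated into a small CPU-side sketch. `CardToRender` and `SplitCardsIntoBatches` are hypothetical stand-ins, not engine types; the sketch only mirrors the two nested loops: every card consumes one loop step, only cards with Nanite instances produce a view, and a batch is flushed once it reaches the per-pass limit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a card; only the field needed for batching is kept.
struct CardToRender { int numNaniteInstances; };

// Splits cards into batches of at most maxViewsPerPass views each.
// Cards without Nanite instances are skipped, as in the loop above.
// Each batch stores the indices of the cards that became views.
std::vector<std::vector<std::size_t>> SplitCardsIntoBatches(
    const std::vector<CardToRender>& cards, std::size_t maxViewsPerPass)
{
    assert(maxViewsPerPass > 0);
    std::vector<std::vector<std::size_t>> batches;
    std::size_t next = 0;
    while (next < cards.size()) {
        std::vector<std::size_t> views;
        while (next < cards.size() && views.size() < maxViewsPerPass) {
            if (cards[next].numNaniteInstances > 0) {
                views.push_back(next);
            }
            ++next;
        }
        // Only issue a pass when the batch actually contains something to draw.
        if (!views.empty()) {
            batches.push_back(views);
        }
    }
    return batches;
}
```

Each resulting batch corresponds to one cull-and-rasterize call in the listing above.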

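The card rasterization stage is essentially the same as the Nanite pipeline:

The rasterization output is also the same, and includes visibility, the depth-stencil buffer, triangle IDs, and so on:

The next step draws the mesh cards; this stage is also essentially the same as in Nanite:

The output GBuffer is still the three atlases mentioned above (base color, normal, and emissive), but the new results are written into their free regions.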

6.5.6 Lumen Scene Lighting

6.5.6.1 Voxel Cone Tracing

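The following subsections rely heavily on Voxel Cone Tracing, so this subsection first covers the necessary background. It is based on two references: Interactive Indirect Illumination Using Voxel Cone Tracing, and Voxel Cone Tracing and Sparse Voxel Octree for Real-time Global Illumination.

The first step of running Voxel Cone Tracing on a scene is to build a Sparse Voxel Octree of the scene objects; UE5 uses mesh distance fields with sparse HLOD instead.

The figure below shows the Sponza scene after voxelization:

Rendering engines such as UE generally use a hybrid pipeline: direct lighting (primary rays) comes from traditional rasterization, while secondary lighting is computed with cone tracing: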

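Before voxel cone tracing, the geometry is pre-filtered and then traced as if it were a participating medium (volume ray casting can be used). The voxels represent scene objects as an opacity field plus incoming radiance, so quadrilinear interpolation can be used when sampling to approximate the footprint covered by the cone: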
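
Tracing through pre-filtered voxels as a participating medium usually comes down to front-to-back compositing along the cone axis. The sketch below shows that accumulation in isolation, using a single scalar radiance channel for brevity. `ConeSample`, `ConeResult`, and the 0.99 early-out threshold are assumptions made for this illustration; they are taken neither from the paper's code nor from UE5.

```cpp
#include <cassert>
#include <vector>

// One pre-filtered voxel lookup along the cone (hypothetical type).
struct ConeSample { float radiance; float opacity; };

struct ConeResult { float radiance; float opacity; };

// Front-to-back compositing: each sample is attenuated by the
// transmittance of everything already accumulated in front of it.
ConeResult AccumulateFrontToBack(const std::vector<ConeSample>& samples)
{
    ConeResult result{0.0f, 0.0f};
    for (const ConeSample& s : samples) {
        assert(s.opacity >= 0.0f && s.opacity <= 1.0f);
        const float transmittance = 1.0f - result.opacity;
        result.radiance += transmittance * s.opacity * s.radiance;
        result.opacity  += transmittance * s.opacity;
        // Early out once the cone is effectively fully occluded (assumed threshold).
        if (result.opacity >= 0.99f) {
            break;
        }
    }
    return result;
}
```

Samples are expected in order of increasing distance from the cone origin; an opaque sample near the origin hides everything behind it.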

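Tracing a single cone in the steps shown above requires mipmaps. The mipmaps are generated with Gaussian weights: the voxel center has the largest weight, and the weight falls off the farther a point is from the center: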
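
As a rough illustration of Gaussian-weighted filtering, the sketch below builds a normalized 3x3x3 kernel in which the center voxel receives the largest weight. The kernel size and the `sigma` parameter are assumptions chosen for the example; the references may use a different footprint.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Builds normalized 3x3x3 Gaussian weights.
// Index layout: (z + 1) * 9 + (y + 1) * 3 + (x + 1), with x, y, z in [-1, 1].
std::array<float, 27> MakeGaussianKernel3x3x3(float sigma)
{
    assert(sigma > 0.0f);
    std::array<float, 27> weights{};
    float sum = 0.0f;
    int i = 0;
    for (int z = -1; z <= 1; ++z) {
        for (int y = -1; y <= 1; ++y) {
            for (int x = -1; x <= 1; ++x) {
                const float distanceSquared = float(x * x + y * y + z * z);
                weights[i] = std::exp(-distanceSquared / (2.0f * sigma * sigma));
                sum += weights[i];
                ++i;
            }
        }
    }
    // Normalize so that filtering preserves the overall energy.
    for (float& w : weights) {
        w /= sum;
    }
    return weights;
}
```

A coarser mip voxel would then be a weighted sum of the finer voxels under this kernel.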

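Mipmaps generated with Gaussian weights get blurrier at higher levels, which matches the shape of a cone: the farther a cone travels from its origin, the larger the area it covers and the blurrier the lighting it receives. Given this, the distance between the sample point on the cone and the cone origin selects which mip level to sample quadrilinearly, which quickly yields the radiance at that point: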
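
The distance-to-level relationship can be written down directly. The sketch below assumes the common convention that the mip level is the base-2 logarithm of the cone diameter measured in finest-level voxels; the function names and the clamping choices are illustrative assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Diameter of the cone's cross-section at a given distance from its origin.
float ConeDiameter(float distance, float halfAngleRadians)
{
    return 2.0f * distance * std::tan(halfAngleRadians);
}

// Picks a fractional mip level so that one voxel at that level is roughly
// as wide as the cone; the fraction is what quadrilinear sampling blends on.
float SelectMipLevel(float distance, float halfAngleRadians, float voxelSize, float maxMip)
{
    assert(voxelSize > 0.0f && maxMip >= 0.0f);
    // Never go below the finest level.
    const float diameter = std::max(ConeDiameter(distance, halfAngleRadians), voxelSize);
    return std::min(std::log2(diameter / voxelSize), maxMip);
}
```

Doubling the distance raises the selected level by one, which is why wide or long cones end up reading the blurrier mips.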

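Voxel rendering can be split into three passes: the first is the light pass, which bakes irradiance (Reflective Shadow Map, RSM); the second is the filter pass, which downsamples radiance in the sparse octree; the third is the camera pass, which gathers irradiance for each visible fragment (pixel). (See the figure below.)

Similarly, voxel tracing can also be used for specular reflection, AO, and soft shadows. Specular reflection uses a similar tracing scheme, except that fewer and narrower specular cones are generated:

In fact, in cone tracing, surfaces of different roughness can use different numbers and sizes of cones:

Left: a high-roughness surface, i.e. diffuse reflection, needs multiple cones; middle: a fairly rough specular reflection needs only one wide-angle cone; right: a low-roughness specular reflection needs only one narrow-angle cone.

AO uses a combination of multi-sample cone tracing near the surface, far-field AO, and offline occlusion:

Soft shadows can be sampled with one cone per pixel, and the softer the shadow, the cheaper it is to compute:

The paper also describes a single-pass voxelization technique, as well as the technique and process for building the sparse octree with compute shaders: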

6.5.6.2 RenderLumenSceneLighting

Lumen's scene lighting is handled by RenderLumenSceneLighting, whose code is as follows:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneLighting.cpp

void FDeferredShadingSceneRenderer::RenderLumenSceneLighting(
    FRDGBuilder& GraphBuilder,
    FViewInfo& View)
{
    FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
    // Check whether Lumen is enabled: it is enough for either the diffuse indirect method or the reflections method to be Lumen.
    const bool bAnyLumenEnabled = GetViewPipelineState(Views[0]).DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen 
        || GetViewPipelineState(Views[0]).ReflectionsMethod == EReflectionsMethod::Lumen;

    if (bAnyLumenEnabled)
    {
        RDG_EVENT_SCOPE(GraphBuilder, "LumenSceneLighting");

        FGlobalShaderMap* GlobalShaderMap = View.ShaderMap;
        FLumenCardTracingInputs TracingInputs(GraphBuilder, Scene, Views[0]);

        if (LumenSceneData.VisibleCardsIndices.Num() > 0)
        {
            FRDGTextureRef RadiosityAtlas = GraphBuilder.RegisterExternalTexture(LumenSceneData.RadiosityAtlas, TEXT("Lumen.RadiosityAtlas"));

            // Render radiosity.
            RenderRadiosityForLumenScene(GraphBuilder, TracingInputs, GlobalShaderMap, RadiosityAtlas);

            ConvertToExternalTexture(GraphBuilder, RadiosityAtlas, LumenSceneData.RadiosityAtlas);

            FLumenCardScatterContext DirectLightingCardScatterContext;
            extern float GLumenSceneCardDirectLightingUpdateFrequencyScale;

            // Build the indirect args for writing to the card faces whose direct lighting is updated this frame.
            DirectLightingCardScatterContext.Init(
                GraphBuilder,
                View,
                LumenSceneData,
                LumenCardRenderer,
                ECullCardsMode::OperateOnSceneForceUpdateForCardsToRender,
                1);

            // Cull cards to the given shape.
            DirectLightingCardScatterContext.CullCardsToShape(
                GraphBuilder,
                View,
                LumenSceneData,
                LumenCardRenderer,
                TracingInputs.LumenCardSceneUniformBuffer,
                ECullCardsShapeType::None,
                FCullCardsShapeParameters(),
                GLumenSceneCardDirectLightingUpdateFrequencyScale,
                0);

            // Build the scatter indirect args.
            DirectLightingCardScatterContext.BuildScatterIndirectArgs(
                GraphBuilder,
                View);

            extern int32 GLumenSceneRecaptureLumenSceneEveryFrame;

            // Clear the lighting atlases: final lighting atlas, irradiance atlas, and indirect irradiance atlas.
            if (GLumenSceneRecaptureLumenSceneEveryFrame)
            {
                ClearAtlasRDG(GraphBuilder, TracingInputs.FinalLightingAtlas);
                if (Lumen::UseIrradianceAtlas(View))
                {
                    ClearAtlasRDG(GraphBuilder, TracingInputs.IrradianceAtlas);
                }
                if (Lumen::UseIndirectIrradianceAtlas(View))
                {
                    ClearAtlasRDG(GraphBuilder, TracingInputs.IndirectIrradianceAtlas);
                }
            }

            // Combine the scene lighting.
            CombineLumenSceneLighting(
                Scene,
                View,
                GraphBuilder,
                TracingInputs.LumenCardSceneUniformBuffer,
                TracingInputs.FinalLightingAtlas,
                TracingInputs.OpacityAtlas,
                RadiosityAtlas,
                GlobalShaderMap, 
                DirectLightingCardScatterContext);

            // Copy the data in TracingInputs.FinalLightingAtlas to TracingInputs.IndirectIrradianceAtlas.
            if (Lumen::UseIndirectIrradianceAtlas(View))
            {
                CopyLumenCardAtlas(
                    Scene,
                    View,
                    GraphBuilder,
                    TracingInputs.LumenCardSceneUniformBuffer,
                    TracingInputs.FinalLightingAtlas,
                    TracingInputs.IndirectIrradianceAtlas,
                    GlobalShaderMap,
                    DirectLightingCardScatterContext);
            }

            // Render direct lighting for the Lumen scene.
            RenderDirectLightingForLumenScene(
                GraphBuilder,
                TracingInputs.LumenCardSceneUniformBuffer,
                TracingInputs.FinalLightingAtlas,
                TracingInputs.OpacityAtlas,
                GlobalShaderMap,
                DirectLightingCardScatterContext);

            if (Lumen::UseIrradianceAtlas(View))
            {
                CopyLumenCardAtlas(
                    Scene,
                    View,
                    GraphBuilder,
                    TracingInputs.LumenCardSceneUniformBuffer,
                    TracingInputs.FinalLightingAtlas,
                    TracingInputs.IrradianceAtlas,
                    GlobalShaderMap,
                    DirectLightingCardScatterContext);
            }

            FRDGTextureRef AlbedoAtlas = GraphBuilder.RegisterExternalTexture(LumenSceneData.AlbedoAtlas, TEXT("Lumen.AlbedoAtlas"));
            FRDGTextureRef EmissiveAtlas = GraphBuilder.RegisterExternalTexture(LumenSceneData.EmissiveAtlas, TEXT("Lumen.EmissiveAtlas"));
            // Apply the Lumen card albedo.
            ApplyLumenCardAlbedo(
                Scene,
                View,
                GraphBuilder,
                TracingInputs.LumenCardSceneUniformBuffer,
                TracingInputs.FinalLightingAtlas,
                AlbedoAtlas,
                EmissiveAtlas,
                GlobalShaderMap,
                DirectLightingCardScatterContext);

            LumenSceneData.bFinalLightingAtlasContentsValid = true;

            // Prefilter the lighting.
            PrefilterLumenSceneLighting(GraphBuilder, View, TracingInputs, GlobalShaderMap, DirectLightingCardScatterContext);

            ConvertToExternalTexture(GraphBuilder, TracingInputs.FinalLightingAtlas, LumenSceneData.FinalLightingAtlas);
            if (Lumen::UseIrradianceAtlas(View))
            {
                ConvertToExternalTexture(GraphBuilder, TracingInputs.IrradianceAtlas, LumenSceneData.IrradianceAtlas);
            }
            if (Lumen::UseIndirectIrradianceAtlas(View))
            {
                ConvertToExternalTexture(GraphBuilder, TracingInputs.IndirectIrradianceAtlas, LumenSceneData.IndirectIrradianceAtlas);
            }
        }

        // Compute voxel lighting.
        ComputeLumenSceneVoxelLighting(GraphBuilder, TracingInputs, GlobalShaderMap);

        // Compute the translucency GI volume.
        ComputeLumenTranslucencyGIVolume(GraphBuilder, TracingInputs, GlobalShaderMap);
    }
}

A RenderDoc frame capture shows the flow above at a glance:

The following subsections analyze some of the main steps.

6.5.6.3 RenderRadiosityForLumenScene

RenderRadiosityForLumenScene renders the radiosity of the Lumen scene. Its code is as follows:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenRadiosity.cpp

void FDeferredShadingSceneRenderer::RenderRadiosityForLumenScene(
    FRDGBuilder& GraphBuilder, 
    const FLumenCardTracingInputs& TracingInputs, 
    FGlobalShaderMap* GlobalShaderMap, 
    FRDGTextureRef RadiosityAtlas)
{
    LLM_SCOPE_BYTAG(Lumen);

    const FViewInfo& MainView = Views[0];
    FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;

    extern int32 GLumenSceneRecaptureLumenSceneEveryFrame;

    if (IsRadiosityEnabled() 
        && !GLumenSceneRecaptureLumenSceneEveryFrame
        && LumenSceneData.bFinalLightingAtlasContentsValid
        && TracingInputs.NumClipmapLevels > 0)
    {
        RDG_EVENT_SCOPE(GraphBuilder, "Radiosity");

        FLumenCardScatterContext VisibleCardScatterContext;

        // Build the indirect args for writing to the card faces whose radiosity is updated this frame.
        VisibleCardScatterContext.Init(
            GraphBuilder,
            MainView,
            LumenSceneData,
            LumenCardRenderer,
            ECullCardsMode::OperateOnSceneForceUpdateForCardsToRender);

        VisibleCardScatterContext.CullCardsToShape(
            GraphBuilder,
            MainView,
            LumenSceneData,
            LumenCardRenderer,
            TracingInputs.LumenCardSceneUniformBuffer,
            ECullCardsShapeType::None,
            FCullCardsShapeParameters(),
            GLumenSceneCardRadiosityUpdateFrequencyScale,
            0);

        // Build the scatter indirect args.
        VisibleCardScatterContext.BuildScatterIndirectArgs(
            GraphBuilder,
            MainView);

        // Generate the sample directions.
        RadiosityDirections.GenerateSamples(
            FMath::Clamp(GLumenRadiosityNumTargetCones, 1, (int32)MaxRadiosityConeDirections),
            1,
            GLumenRadiosityNumTargetCones,
            false,
            true /* Cosine distribution */);

        const bool bRenderSkylight = Lumen::ShouldHandleSkyLight(Scene, ViewFamily);

        // Render the radiosity scatter.
        if (GLumenRadiosityComputeTraceBlocksScatter) // Compute shader (CS) path.
        {
            RenderRadiosityComputeScatter(
                GraphBuilder,
                Scene,
                Views[0],
                bRenderSkylight,
                LumenSceneData,
                RadiosityAtlas,
                TracingInputs,
                VisibleCardScatterContext.Parameters,
                GlobalShaderMap);
        }
        else // Pixel shader (PS) path.
        {
            FLumenCardRadiosity* PassParameters = GraphBuilder.AllocParameters<FLumenCardRadiosity>();

            PassParameters->RenderTargets[0] = FRenderTargetBinding(RadiosityAtlas, ERenderTargetLoadAction::ENoAction);

            PassParameters->VS.LumenCardScene = TracingInputs.LumenCardSceneUniformBuffer;
            PassParameters->VS.CardScatterParameters = VisibleCardScatterContext.Parameters;
            PassParameters->VS.ScatterInstanceIndex = 0;
            PassParameters->VS.CardUVSamplingOffset = FVector2D::ZeroVector;

            SetupTraceFromTexelParameters(Views[0], TracingInputs, LumenSceneData, PassParameters->PS.TraceFromTexelParameters);

            FLumenCardRadiosityPS::FPermutationDomain PermutationVector;
            PermutationVector.Set<FLumenCardRadiosityPS::FDynamicSkyLight>(bRenderSkylight);
            auto PixelShader = GlobalShaderMap->GetShader<FLumenCardRadiosityPS>(PermutationVector);

            FScene* LocalScene = Scene;
            const int32 RadiosityDownsampleArea = GLumenRadiosityDownsampleFactor * GLumenRadiosityDownsampleFactor;

            // Trace radiosity from the atlas texels.
            GraphBuilder.AddPass(
                RDG_EVENT_NAME("TraceFromAtlasTexels: %u Cones", RadiosityDirections.SampleDirections.Num()),
                PassParameters,
                ERDGPassFlags::Raster,
                [LocalScene, PixelShader, PassParameters, GlobalShaderMap](FRHICommandListImmediate& RHICmdList)
            {
                FIntPoint ViewRect = FIntPoint::DivideAndRoundDown(LocalScene->LumenSceneData->MaxAtlasSize, GLumenRadiosityDownsampleFactor);
                DrawQuadsToAtlas(ViewRect, PixelShader, PassParameters, GlobalShaderMap, TStaticBlendState<>::GetRHI(), RHICmdList);
            });
        }
    }
    else
    {
        ClearAtlasRDG(GraphBuilder, RadiosityAtlas);
    }
}

The final stage of the code above computes radiosity. It normally takes the CS path, RenderRadiosityComputeScatter, which is analyzed next:

void RenderRadiosityComputeScatter(
    FRDGBuilder& GraphBuilder,
    const FScene* Scene,
    const FViewInfo& View,
    bool bRenderSkylight, 
    const FLumenSceneData& LumenSceneData,
    FRDGTextureRef RadiosityAtlas,
    const FLumenCardTracingInputs& TracingInputs,
    const FLumenCardScatterParameters& CardScatterParameters,
    FGlobalShaderMap* GlobalShaderMap)
{
    const bool bUseIrradianceCache = GLumenRadiosityUseIrradianceCache != 0;

    // Build the indirect args for setting up the trace blocks.
    FRDGBufferRef SetupCardTraceBlocksIndirectArgsBuffer = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateIndirectDesc<FRHIDispatchIndirectParameters>(1), TEXT("SetupCardTraceBlocksIndirectArgsBuffer"));
    {
        FRDGBufferUAVRef SetupCardTraceBlocksIndirectArgsBufferUAV = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(SetupCardTraceBlocksIndirectArgsBuffer));

        FPlaceProbeIndirectArgsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FPlaceProbeIndirectArgsCS::FParameters>();
        PassParameters->RWIndirectArgs = SetupCardTraceBlocksIndirectArgsBufferUAV;
        PassParameters->QuadAllocator = CardScatterParameters.QuadAllocator;

        auto ComputeShader = GlobalShaderMap->GetShader< FPlaceProbeIndirectArgsCS >(0);

        ensure(GSetupCardTraceBlocksGroupSize == GPlaceRadiosityProbeGroupSize);
        const FIntVector GroupSize(1, 1, 1);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("SetupCardTraceBlocksIndirectArgsCS"),
            ComputeShader,
            PassParameters,
            GroupSize);
    }

    const int32 TraceBlockMaxSize = 2;
    extern int32 GLumenSceneCardLightingForceFullUpdate;
    const int32 Divisor = TraceBlockMaxSize * GLumenRadiosityDownsampleFactor * (GLumenSceneCardLightingForceFullUpdate ? 1 : GLumenRadiosityTraceBlocksAllocationDivisor);
    const int32 NumTraceBlocksToAllocate = (LumenSceneData.MaxAtlasSize.X / Divisor) 
        * (LumenSceneData.MaxAtlasSize.Y / Divisor);

    FRDGBufferRef CardTraceBlockAllocator = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(uint32), 1), TEXT("CardTraceBlockAllocator"));
    FRDGBufferRef CardTraceBlockData = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(FIntVector4), NumTraceBlocksToAllocate), TEXT("CardTraceBlockData"));
    FRDGBufferUAVRef CardTraceBlockAllocatorUAV = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(CardTraceBlockAllocator, PF_R32_UINT));
    FRDGBufferUAVRef CardTraceBlockDataUAV = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(CardTraceBlockData, PF_R32G32B32A32_UINT));

    FComputeShaderUtils::ClearUAV(GraphBuilder, View.ShaderMap, CardTraceBlockAllocatorUAV, 0);

    // Set up the card trace blocks.
    {
        FSetupCardTraceBlocksCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FSetupCardTraceBlocksCS::FParameters>();
        PassParameters->RWCardTraceBlockAllocator = CardTraceBlockAllocatorUAV;
        PassParameters->RWCardTraceBlockData = CardTraceBlockDataUAV;
        PassParameters->QuadAllocator = CardScatterParameters.QuadAllocator;
        PassParameters->QuadData = CardScatterParameters.QuadData;
        PassParameters->CardBuffer = LumenSceneData.CardBuffer.SRV;
        PassParameters->RadiosityAtlasSize = FIntPoint::DivideAndRoundDown(LumenSceneData.MaxAtlasSize, GLumenRadiosityDownsampleFactor);
        PassParameters->IndirectArgs = SetupCardTraceBlocksIndirectArgsBuffer;

        auto ComputeShader = GlobalShaderMap->GetShader<FSetupCardTraceBlocksCS>();

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("SetupCardTraceBlocksCS"),
            ComputeShader,
            PassParameters,
            SetupCardTraceBlocksIndirectArgsBuffer,
            0);
    }

    // Build the indirect args for tracing the blocks.
    FRDGBufferRef TraceBlocksIndirectArgsBuffer = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateIndirectDesc<FRHIDispatchIndirectParameters>(1), TEXT("TraceBlocksIndirectArgsBuffer"));
    {
        FRDGBufferUAVRef TraceBlocksIndirectArgsBufferUAV = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(TraceBlocksIndirectArgsBuffer));

        FTraceBlocksIndirectArgsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FTraceBlocksIndirectArgsCS::FParameters>();
        PassParameters->RWIndirectArgs = TraceBlocksIndirectArgsBufferUAV;
        PassParameters->CardTraceBlockAllocator = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(CardTraceBlockAllocator, PF_R32_UINT));

        FTraceBlocksIndirectArgsCS::FPermutationDomain PermutationVector;
        PermutationVector.Set<FTraceBlocksIndirectArgsCS::FIrradianceCache>(bUseIrradianceCache);
        auto ComputeShader = GlobalShaderMap->GetShader< FTraceBlocksIndirectArgsCS >(PermutationVector);

        const FIntVector GroupSize(1, 1, 1);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("TraceBlocksIndirectArgsCS"),
            ComputeShader,
            PassParameters,
            GroupSize);
    }

    LumenRadianceCache::FRadianceCacheInterpolationParameters RadianceCacheParameters;

    // Render the irradiance cache (through RenderRadianceCache).
    if (bUseIrradianceCache)
    {
        const LumenRadianceCache::FRadianceCacheInputs RadianceCacheInputs = LumenRadiosity::SetupRadianceCacheInputs();

        FRadiosityMarkUsedProbesData MarkUsedProbesData;
        MarkUsedProbesData.Parameters.View = View.ViewUniformBuffer;
        MarkUsedProbesData.Parameters.DepthAtlas = LumenSceneData.DepthAtlas->GetRenderTargetItem().ShaderResourceTexture;
        MarkUsedProbesData.Parameters.CurrentOpacityAtlas = LumenSceneData.OpacityAtlas->GetRenderTargetItem().ShaderResourceTexture;
        MarkUsedProbesData.Parameters.CardTraceBlockAllocator = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(CardTraceBlockAllocator, PF_R32_UINT));
        MarkUsedProbesData.Parameters.CardTraceBlockData = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(CardTraceBlockData, PF_R32G32B32A32_UINT));
        MarkUsedProbesData.Parameters.CardBuffer = LumenSceneData.CardBuffer.SRV;
        MarkUsedProbesData.Parameters.RadiosityAtlasSize = FIntPoint::DivideAndRoundDown(LumenSceneData.MaxAtlasSize, GLumenRadiosityDownsampleFactor);
        MarkUsedProbesData.Parameters.IndirectArgs = TraceBlocksIndirectArgsBuffer;

        RenderRadianceCache(
            GraphBuilder, 
            TracingInputs, 
            RadianceCacheInputs, 
            Scene,
            View, 
            nullptr, 
            nullptr, 
            FMarkUsedRadianceCacheProbes::CreateStatic(&RadianceCacheMarkUsedProbes), 
            &MarkUsedProbesData, 
            View.ViewState->RadiosityRadianceCacheState, 
            RadianceCacheParameters);
    }

    // Trace radiosity for the card texels from the atlas.
    {
        FLumenCardRadiosityTraceBlocksCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FLumenCardRadiosityTraceBlocksCS::FParameters>();
        PassParameters->RWRadiosityAtlas = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(RadiosityAtlas));
        PassParameters->RadianceCacheParameters = RadianceCacheParameters;
        PassParameters->CardTraceBlockAllocator = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(CardTraceBlockAllocator, PF_R32_UINT));
        PassParameters->CardTraceBlockData = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(CardTraceBlockData, PF_R32G32B32A32_UINT));
        PassParameters->ProbeOcclusionNormalBias = GLumenRadiosityIrradianceCacheProbeOcclusionNormalBias;
        PassParameters->IndirectArgs = TraceBlocksIndirectArgsBuffer;

        SetupTraceFromTexelParameters(View, TracingInputs, LumenSceneData, PassParameters->TraceFromTexelParameters);

        FLumenCardRadiosityTraceBlocksCS::FPermutationDomain PermutationVector;
        PermutationVector.Set<FLumenCardRadiosityTraceBlocksCS::FDynamicSkyLight>(bRenderSkylight);
        PermutationVector.Set<FLumenCardRadiosityTraceBlocksCS::FIrradianceCache>(bUseIrradianceCache);
        auto ComputeShader = GlobalShaderMap->GetShader< FLumenCardRadiosityTraceBlocksCS >(PermutationVector);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("TraceFromAtlasTexels: %u Cones", RadiosityDirections.SampleDirections.Num()),
            ComputeShader,
            PassParameters,
            TraceBlocksIndirectArgsBuffer,
            0);
    }
}
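
The size of the trace-block buffer follows from the integer arithmetic near the top of the function. The sketch below reproduces that arithmetic as a standalone helper; `ComputeNumTraceBlocks` is a hypothetical name, and the concrete atlas size and factors used when calling it are assumptions, since the listing does not show the default values of the console variables.

```cpp
#include <cassert>

// Mirrors the NumTraceBlocksToAllocate computation shown above.
// Integer division truncates, exactly as in the original expression.
int ComputeNumTraceBlocks(int atlasWidth, int atlasHeight, int traceBlockMaxSize,
                          int downsampleFactor, int allocationDivisor, bool forceFullUpdate)
{
    assert(traceBlockMaxSize > 0 && downsampleFactor > 0 && allocationDivisor > 0);
    // A forced full update ignores the allocation divisor, so more blocks are reserved.
    const int divisor = traceBlockMaxSize * downsampleFactor
        * (forceFullUpdate ? 1 : allocationDivisor);
    return (atlasWidth / divisor) * (atlasHeight / divisor);
}
```

For an assumed 4096x4096 atlas with a block size of 2, a downsample factor of 2, and an allocation divisor of 2, the divisor is 8 and 512 * 512 blocks are reserved; forcing a full update shrinks the divisor to 4 and quadruples the allocation.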

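As this shows, computing radiosity involves quite a few steps, including culling, building the trace arguments, and tracing the atlas texels:

The final texel-tracing stage mainly constructs sample directions and builds one cone per direction to trace the nearby radiosity. Its main inputs are the global distance field atlas, scene depth, scene opacity, scene normals, VoxelLighting, and similar data: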

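Data needed to trace card texels: top left, the global distance field atlas; top right, the scene depth atlas; bottom left, scene opacity; bottom right, scene normals.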
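
One common way to obtain cosine-distributed cone directions in tangent space is to sample a unit disk and project the point up onto the hemisphere. The sketch below is not the engine's GenerateSamples implementation; `GenerateCosineConeDirections` is a hypothetical helper that pairs a stratified first coordinate with a base-2 radical inverse as the second.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Van der Corput radical inverse in base 2, used as the second sample coordinate.
float RadicalInverseBase2(uint32_t i)
{
    float result = 0.0f;
    float digitWeight = 0.5f;
    while (i != 0) {
        if (i & 1u) result += digitWeight;
        i >>= 1u;
        digitWeight *= 0.5f;
    }
    return result;
}

// Returns unit directions in tangent space (+Z is the surface normal)
// whose density is proportional to the cosine of the angle to the normal.
std::vector<Vec3> GenerateCosineConeDirections(uint32_t numCones)
{
    assert(numCones > 0);
    const float twoPi = 6.28318530718f;
    std::vector<Vec3> directions;
    directions.reserve(numCones);
    for (uint32_t i = 0; i < numCones; ++i) {
        const float u = (float(i) + 0.5f) / float(numCones);  // stratified in (0, 1)
        const float v = RadicalInverseBase2(i);
        const float radius = std::sqrt(u);                      // uniform point on the disk
        const float phi = twoPi * v;
        // Projecting the disk point onto the hemisphere gives the cosine distribution.
        directions.push_back({radius * std::cos(phi), radius * std::sin(phi), std::sqrt(1.0f - u)});
    }
    return directions;
}
```

Each direction would then be rotated from tangent space into world space with a basis built around the texel's normal before a cone is traced along it.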

The output is the scene radiosity atlas:

The corresponding CS shader code is as follows:

// Engine\Shaders\Private\Lumen\LumenRadiosity.usf

float ProbeOcclusionNormalBias;
// Holds the per-thread lighting results of the thread group; note that it is groupshared.
groupshared float3 ThreadLighting[THREADGROUP_SIZE];

[numthreads(THREADGROUP_SIZE, 1, 1)]
void LumenCardRadiosityTraceBlocksCS(
    uint3 DispatchThreadId : SV_DispatchThreadID,
    uint3 GroupThreadId : SV_GroupThreadID)
{
#if IRRADIANCE_CACHE // Irradiance cache mode.
    uint ThreadIndex = DispatchThreadId.x;

    uint GlobalBlockIndex = ThreadIndex / (CARD_TRACE_BLOCK_SIZE * CARD_TRACE_BLOCK_SIZE);

    if (GlobalBlockIndex < CardTraceBlockAllocator[0])
    {
        // Compute the texel index.
        uint TexelIndexInBlock = ThreadIndex % (CARD_TRACE_BLOCK_SIZE * CARD_TRACE_BLOCK_SIZE);
        uint2 TexelOffsetInBlock = uint2(TexelIndexInBlock % CARD_TRACE_BLOCK_SIZE, TexelIndexInBlock / CARD_TRACE_BLOCK_SIZE);

        // Fetch the trace block data.
        uint4 TraceBlockData = CardTraceBlockData[GlobalBlockIndex];
        uint CardId = TraceBlockData.x;
        uint ProbeIndex = TraceBlockData.y;
        uint BlockIndex = TraceBlockData.z;

        // Fetch the card data.
        FLumenCardData CardData = GetLumenCardData(CardId, CardBuffer);

        float2 CardSizeTexels = abs(CardData.LocalExtent.xy * 2 * CardData.LocalPositionToAtlasUVScale * RadiosityAtlasSize);
        uint2 NumBlocksXY = ((uint2)CardSizeTexels + CARD_TRACE_BLOCK_SIZE - 1) / CARD_TRACE_BLOCK_SIZE;
        uint2 BlockOffset = uint2(BlockIndex % NumBlocksXY.x, BlockIndex / NumBlocksXY.x);
        float2 TexelCoord = BlockOffset * CARD_TRACE_BLOCK_SIZE + TexelOffsetInBlock;

        if (all(TexelCoord < CardSizeTexels))
        {
            // Compute the card UV.
            float2 CardUV = (TexelCoord + .5f) / (float2)CardSizeTexels;
            float2 CardUVToAtlasScale = GetCardUVToAtlasScale(CardData.LocalPositionToAtlasUVScale, CardData.LocalExtent);
            float2 CardUVToAtlasBias = GetCardUVToAtlasBias(CardUVToAtlasScale, CardData.LocalPositionToAtlasUVBias);
            float2 AtlasUV = CardUV * CardUVToAtlasScale + CardUVToAtlasBias;

            float Opacity = Texture2DSampleLevel(CurrentOpacityAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).x;

            float3 DiffuseLighting = 0;

            // Radiosity is only meaningful where opacity is greater than 0.
            if (Opacity > 0)
            {
                float Depth = 1.0f - Texture2DSampleLevel(DepthAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).x;

                float3 LocalPosition;
                LocalPosition.xy = (AtlasUV - CardData.LocalPositionToAtlasUVBias) / CardData.LocalPositionToAtlasUVScale;
                LocalPosition.z = -CardData.LocalExtent.z + Depth * 2 * CardData.LocalExtent.z;

                // Compute the world-space position and normal.
                float3 WorldPosition = mul(CardData.WorldToLocalRotation, LocalPosition) + CardData.Origin;
                float3 WorldNormal = normalize(Texture2DSampleLevel(NormalAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).xyz * 2 - 1);
                uint ClipmapIndex = GetRadianceProbeClipmap(WorldPosition);

                // Compute diffuse lighting. If the clipmap is valid, interpolate from it.
                if (ClipmapIndex < NumRadianceProbeClipmaps)
                {
                    float3 BiasOffset = WorldNormal * ProbeOcclusionNormalBias;
                    // Sample RadianceProbeIndirectionTexture to compute the diffuse term.
                    DiffuseLighting = SampleIrradianceCacheInterpolated(WorldPosition, WorldNormal, BiasOffset, ClipmapIndex);
                }
                else // No valid clipmap: compute diffuse from the sky light's spherical harmonics.
                {
                    DiffuseLighting = GetSkySHDiffuse(WorldNormal) * View.SkyLightColor.rgb;
                }
            }

            // Store the radiosity.
            uint2 AtlasCoord = uint2(AtlasUV * RadiosityAtlasSize);
            RWRadiosityAtlas[AtlasCoord] = float4(DiffuseLighting * PI, 0);
        }
    }
#else // Non-irradiance-cache mode.
    ThreadLighting[GroupThreadId.x] = 0;

    uint ThreadIndex = DispatchThreadId.x;
    uint GlobalBlockIndex = ThreadIndex / (CARD_TRACE_BLOCK_SIZE * CARD_TRACE_BLOCK_SIZE * THREADS_PER_RADIOSITY_TEXEL);
    int2 AtlasCoord = -1;

    if (GlobalBlockIndex < CardTraceBlockAllocator[0])
    {
        uint TexelIndexInBlock = (ThreadIndex / THREADS_PER_RADIOSITY_TEXEL) % (CARD_TRACE_BLOCK_SIZE * CARD_TRACE_BLOCK_SIZE);
        uint2 TexelOffsetInBlock = uint2(TexelIndexInBlock % CARD_TRACE_BLOCK_SIZE, TexelIndexInBlock / CARD_TRACE_BLOCK_SIZE);

        uint4 TraceBlockData = CardTraceBlockData[GlobalBlockIndex];
        uint CardId = TraceBlockData.x;
        uint ProbeIndex = TraceBlockData.y;
        uint BlockIndex = TraceBlockData.z;

        FLumenCardData CardData = GetLumenCardData(CardId, CardBuffer);

        float2 CardSizeTexels = abs(CardData.LocalExtent.xy * 2 * CardData.LocalPositionToAtlasUVScale * RadiosityAtlasSize);
        uint2 NumBlocksXY = ((uint2)CardSizeTexels + CARD_TRACE_BLOCK_SIZE - 1) / CARD_TRACE_BLOCK_SIZE;
        uint2 BlockOffset = uint2(BlockIndex % NumBlocksXY.x, BlockIndex / NumBlocksXY.x);
        float2 TexelCoord = BlockOffset * CARD_TRACE_BLOCK_SIZE + TexelOffsetInBlock;

        if (all(TexelCoord < CardSizeTexels))
        {
            uint TraceThreadIndex = ThreadIndex % THREADS_PER_RADIOSITY_TEXEL;

            float2 CardUV = (TexelCoord + .5f) / (float2)CardSizeTexels;
            float2 CardUVToAtlasScale = GetCardUVToAtlasScale(CardData.LocalPositionToAtlasUVScale, CardData.LocalExtent);
            float2 CardUVToAtlasBias = GetCardUVToAtlasBias(CardUVToAtlasScale, CardData.LocalPositionToAtlasUVBias);
            float2 AtlasUV = CardUV * CardUVToAtlasScale + CardUVToAtlasBias;

            uint NumTracesPerThread = NumCones / THREADS_PER_RADIOSITY_TEXEL;
            uint ConeStartIndex = TraceThreadIndex * NumTracesPerThread;
            AtlasCoord = int2(AtlasUV * RadiosityAtlasSize);
            // Trace radiosity from the card texel.
            float3 Lighting = RadiosityTraceFromTexel(AtlasUV, AtlasCoord, ProbeIndex, CardData, ConeStartIndex, ConeStartIndex + NumTracesPerThread);
            ThreadLighting[GroupThreadId.x] = Lighting;
        }
    }

    // Wait for the other threads in the same thread group to finish.
    GroupMemoryBarrierWithGroupSync();

    uint TraceThreadIndex = ThreadIndex % THREADS_PER_RADIOSITY_TEXEL;

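    // Sum the lighting of all threads that traced this texel and store it. TraceThreadIndex == 0 means only the first of each texel's THREADS_PER_RADIOSITY_TEXEL threads does this.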
    if (TraceThreadIndex == 0 && all(AtlasCoord >= 0))
    {
        float3 Lighting = 0;

        for (uint OtherThreadIndex = GroupThreadId.x; OtherThreadIndex < GroupThreadId.x + THREADS_PER_RADIOSITY_TEXEL; OtherThreadIndex += 1)
        {
            Lighting += ThreadLighting[OtherThreadIndex];
        }

        RWRadiosityAtlas[AtlasCoord] = float4(Lighting, 0);
    }
#endif
}
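
The index arithmetic of the non-irradiance-cache branch above is easy to verify on the CPU. The sketch below decodes a flat dispatch thread index into the trace block, the texel offset inside that block, and the thread's slot among the threads that share one texel. `TexelThread` and `DecodeThread` are hypothetical names, and the block size and threads-per-texel values used when calling it are assumptions; the shader takes them from CARD_TRACE_BLOCK_SIZE and THREADS_PER_RADIOSITY_TEXEL.

```cpp
#include <cassert>
#include <cstdint>

// Decoded view of one dispatch thread (hypothetical type).
struct TexelThread {
    uint32_t globalBlockIndex;   // which trace block the thread belongs to
    uint32_t texelX, texelY;     // texel offset inside the block
    uint32_t traceThreadIndex;   // which slice of the cones this thread traces
};

TexelThread DecodeThread(uint32_t threadIndex, uint32_t blockSize, uint32_t threadsPerTexel)
{
    assert(blockSize > 0 && threadsPerTexel > 0);
    const uint32_t texelsPerBlock = blockSize * blockSize;
    TexelThread t;
    t.globalBlockIndex = threadIndex / (texelsPerBlock * threadsPerTexel);
    // Consecutive threads share a texel, so drop the per-texel slot first.
    const uint32_t texelIndexInBlock = (threadIndex / threadsPerTexel) % texelsPerBlock;
    t.texelX = texelIndexInBlock % blockSize;
    t.texelY = texelIndexInBlock / blockSize;
    t.traceThreadIndex = threadIndex % threadsPerTexel;
    return t;
}
```

Because the threads of one texel are adjacent in the group, the thread with traceThreadIndex 0 can sum the next threadsPerTexel entries of the groupshared array after the barrier, which is what the final loop of the shader does.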

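As this shows, radiosity tracing supports two modes: irradiance cache mode and non-irradiance-cache mode. Irradiance cache mode obtains the radiosity by sampling and interpolating from the 3D RadianceProbeIndirectionTexture, whereas non-irradiance-cache mode traces the radiosity around each card texel in real time and sums the results. The latter uses RadiosityTraceFromTexel, whose logic is as follows: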

float3 RadiosityTraceFromTexel(float2 AtlasUV, int2 AtlasCoord, uint ProbeIndex, FLumenCardData LumenCardData, uint ConeStartIndex, uint ConeEndIndex)
{
    float Opacity = Texture2DSampleLevel(CurrentOpacityAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).x;

    float3 Lighting = 0;

    if (Opacity > 0)
    {
        float Depth = 1.0f - Texture2DSampleLevel(DepthAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).x;

        // Reconstruct the local position.
        float3 LocalPosition;
        LocalPosition.xy = (AtlasUV - LumenCardData.LocalPositionToAtlasUVBias) / LumenCardData.LocalPositionToAtlasUVScale;
        LocalPosition.z = -LumenCardData.LocalExtent.z + Depth * 2 * LumenCardData.LocalExtent.z;

        // World-space position and normal.
        float3 WorldPosition = mul(LumenCardData.WorldToLocalRotation, LocalPosition) + LumenCardData.Origin;
        float3 WorldNormal = normalize(Texture2DSampleLevel(NormalAtlas, GlobalBilinearClampedSampler, AtlasUV, 0).xyz * 2 - 1);

        //@todo - derive bias from texel world size
        WorldPosition += WorldNormal * SurfaceBias;

        // Distance at which the voxel trace starts.
        float VoxelTraceStartDistance = CalculateVoxelTraceStartDistance(MinTraceDistance, MaxTraceDistance, MaxMeshSDFTraceDistance, false);

        // Iterate over this thread's cones and accumulate their results.
        for (uint ConeIndex = ConeStartIndex; ConeIndex < ConeEndIndex; ConeIndex++)
        {
            //uint ConeIndex = ConeStartIndex;
            float3x3 TangentBasis = GetTangentBasisFrisvad(WorldNormal);

            // Compute the cone direction.
            #define PRECOMPUTED_SAMPLE_DIRECTIONS 1
            #if PRECOMPUTED_SAMPLE_DIRECTIONS // Precomputed directions.
                float3 LocalConeDirection = RadiosityConeDirections[ConeIndex].xyz;
                float3 WorldConeDirection = mul(LocalConeDirection, TangentBasis);
            #else // Not precomputed: generate the direction from a low-discrepancy sequence.
                uint2 Seed0 = Rand3DPCG16(int3(AtlasCoord + 17, 0)).xy;
                float2 E = Hammersley16(ConeIndex, NumCones, Seed0);
                float2 DiskE = UniformSampleDiskConcentric(E.xy);
                float TangentZ = sqrt(1 - length2(DiskE));
                float3 WorldConeDirection = mul(float3(DiskE, TangentZ), TangentBasis);
            #endif

            //@todo - derive bias from texel world size
            // Sample position.
            float3 SamplePosition = WorldPosition + SurfaceBias * WorldConeDirection;

            // Build the cone trace input.
            FConeTraceInput TraceInput;
            TraceInput.Setup(SamplePosition, WorldConeDirection, DiffuseConeHalfAngle, MinSampleRadius, MinTraceDistance, MaxTraceDistance, StepFactor);
            TraceInput.VoxelStepFactor = VoxelStepFactor;
            TraceInput.VoxelTraceStartDistance = VoxelTraceStartDistance;
            TraceInput.SDFStepFactor = 1;

            // Run the cone trace and store the result.
            FConeTraceResult TraceResult;
            ConeTraceVoxels(TraceInput, TraceResult);

            // Evaluate the sky light radiance for the cone.
            EvaluateSkyRadianceForCone(WorldConeDirection, TraceInput.TanConeAngle, TraceResult);

            // Accumulate the sampled lighting.
            Lighting += TraceResult.Lighting;
        }
    }

    // Scale the accumulated result by PI / NumCones so that energy is conserved.
    Lighting *= PI / (float)NumCones;
    return Lighting;
}
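
The final `PI / NumCones` scale in RadiosityTraceFromTexel matches the usual Monte Carlo estimator for cosine-distributed directions. The `#else` branch lifts a concentric disk sample onto the hemisphere, which yields directions with density cos θ / π, so the irradiance integral reduces to E ≈ (π / N) Σ Lᵢ. The C++ sketch below illustrates only that arithmetic. `EstimateIrradiance` and `kPi` are hypothetical names, and it is an assumption here that the precomputed RadiosityConeDirections follow the same distribution.

```cpp
#include <cstddef>

constexpr float kPi = 3.14159265f;

// Cosine-weighted Monte Carlo estimate: E ~= (PI / N) * sum(L_i).
// ConeRadiance holds one scalar radiance value per traced cone.
constexpr float EstimateIrradiance(const float* ConeRadiance, std::size_t NumCones) {
    float Sum = 0.0f;
    for (std::size_t i = 0; i < NumCones; ++i) {
        Sum += ConeRadiance[i];
    }
    return NumCones > 0 ? Sum * (kPi / static_cast<float>(NumCones)) : 0.0f;
}
```

Note that the shader divides by the total NumCones even though each thread traces only its own sub-range; the estimator is complete only after the per-thread partial results are summed.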

ConeTraceVoxels, the cone-tracing entry point called by RadiosityTraceFromTexel, follows the approach described in 6.5.6.1 Voxel Cone Tracing. Its code is as follows:

// Engine\Shaders\Private\Lumen\LumenTracingCommon.ush

void ConeTraceVoxels(
    FConeTraceInput TraceInput,
    inout FConeTraceResult OutResult)
{
    FGlobalSDFTraceResult SDFTraceResult;

    // Trace the SDF ray.
    {
        FGlobalSDFTraceInput SDFTraceInput = SetupGlobalSDFTraceInput(TraceInput.ConeOrigin, TraceInput.ConeDirection, TraceInput.MinTraceDistance, TraceInput.MaxTraceDistance, TraceInput.SDFStepFactor, TraceInput.VoxelStepFactor);
        SDFTraceInput.bExpandSurfaceUsingRayTimeInsteadOfMaxDistance = TraceInput.bExpandSurfaceUsingRayTimeInsteadOfMaxDistance;
        SDFTraceInput.InitialMaxDistance = TraceInput.InitialMaxDistance;

        // Trace the global distance field.
        SDFTraceResult = RayTraceGlobalDistanceField(SDFTraceInput);
    }

    float4 LightingAndAlpha = float4(0, 0, 0, 1);

    // The logic below runs only when the global distance field trace hits.
    if (GlobalSDFTraceResultIsHit(SDFTraceResult))
    {
        float3 SampleWorldPosition = TraceInput.ConeOrigin + TraceInput.ConeDirection * SDFTraceResult.HitTime;

        uint VoxelClipmapIndex = 0;
        float3 VoxelClipmapCenter = ClipmapWorldCenter[VoxelClipmapIndex].xyz;
        float3 VoxelClipmapExtent = ClipmapWorldSamplingExtent[VoxelClipmapIndex].xyz;

        bool bOutsideValidRegion = any(SampleWorldPosition > VoxelClipmapCenter + VoxelClipmapExtent || SampleWorldPosition < VoxelClipmapCenter - VoxelClipmapExtent);

        // Walk up the voxel clipmap levels until one whose valid region contains the sample position is found.
        while (bOutsideValidRegion && VoxelClipmapIndex + 1 < NumClipmapLevels)
        {
            VoxelClipmapIndex++;
            VoxelClipmapCenter = ClipmapWorldCenter[VoxelClipmapIndex].xyz;
            VoxelClipmapExtent = ClipmapWorldSamplingExtent[VoxelClipmapIndex].xyz;
            bOutsideValidRegion = any(SampleWorldPosition > VoxelClipmapCenter + VoxelClipmapExtent || SampleWorldPosition < VoxelClipmapCenter - VoxelClipmapExtent);
        }

        LightingAndAlpha.xyzw = 0.0f;

        // If the sample lies inside a valid region, compute the voxel lighting.
        if (!bOutsideValidRegion)
        {
            float3 DistanceFieldGradient = -TraceInput.ConeDirection;

            float3 ClipmapVolumeUV = ComputeGlobalUV(SampleWorldPosition, SDFTraceResult.HitClipmapIndex);
            uint PageIndex = GetGlobalDistanceFieldPage(ClipmapVolumeUV, SDFTraceResult.HitClipmapIndex);

            if (PageIndex < GLOBAL_DISTANCE_FIELD_INVALID_PAGE_ID)
            {
                float3 PageUV = ComputeGlobalDistanceFieldPageUV(ClipmapVolumeUV, PageIndex);
                DistanceFieldGradient = GlobalDistanceFieldPageCentralDiff(PageUV);
            }

            float DistanceFieldGradientLength = length(DistanceFieldGradient);
            float3 SampleNormal = DistanceFieldGradientLength > 0.001 ? DistanceFieldGradient / DistanceFieldGradientLength : -TraceInput.ConeDirection;

            // Sample the VoxelLighting 3D texture to get the lighting.
            float4 StepLighting = SampleVoxelLighting(SampleWorldPosition, -SampleNormal, VoxelClipmapIndex);

            StepLighting.xyz = StepLighting.xyz * (1.0f / max(StepLighting.w, 0.1));

            // Compute the self-occlusion factor.
            float VoxelSelfLightingBias = 1.0f;
            if (TraceInput.bExpandSurfaceUsingRayTimeInsteadOfMaxDistance)
            {
                // For diffuse rays it is better to over-occlude than to leak light.
                VoxelSelfLightingBias = smoothstep(1.5 * ClipmapVoxelSizeAndRadius[VoxelClipmapIndex].w, 2.0 * ClipmapVoxelSizeAndRadius[VoxelClipmapIndex].w, SDFTraceResult.HitTime);
            }

            // Lighting after applying self-occlusion.
            LightingAndAlpha.xyz = StepLighting.xyz * VoxelSelfLightingBias;
        }
    }

    // Fade out the lighting result according to opacity.
    LightingAndAlpha = FadeOutVoxelConeTraceMinTransparency(LightingAndAlpha);

    // Store the result.
    OutResult = (FConeTraceResult)0;
    #if !VISIBILITY_ONLY_TRACE
        OutResult.Lighting = LightingAndAlpha.rgb;
    #endif
    OutResult.Transparency = LightingAndAlpha.a;
    OutResult.NumSteps = SDFTraceResult.TotalStepsTaken;
    OutResult.OpaqueHitDistance = GlobalSDFTraceResultIsHit(SDFTraceResult) ? SDFTraceResult.HitTime : TraceInput.MaxTraceDistance;
}
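
Two small pieces of ConeTraceVoxels are easy to check in isolation: the clipmap level search and the smoothstep-based self-lighting bias. The C++ sketch below reproduces both on the CPU. `Vec3`, `Clipmap`, `SelectVoxelClipmap` and `VoxelSelfLightingBias` are hypothetical names, not engine API, and treating `ClipmapVoxelSizeAndRadius[...].w` as a voxel radius is an assumption based on the variable name.

```cpp
struct Vec3 { float X, Y, Z; };

struct Clipmap {
    Vec3 Center;  // stand-in for ClipmapWorldCenter[i].xyz
    Vec3 Extent;  // stand-in for ClipmapWorldSamplingExtent[i].xyz
};

// Same test as bOutsideValidRegion: outside if any axis leaves the box.
constexpr bool IsOutsideValidRegion(const Vec3& P, const Clipmap& C) {
    return P.X > C.Center.X + C.Extent.X || P.X < C.Center.X - C.Extent.X
        || P.Y > C.Center.Y + C.Extent.Y || P.Y < C.Center.Y - C.Extent.Y
        || P.Z > C.Center.Z + C.Extent.Z || P.Z < C.Center.Z - C.Extent.Z;
}

// Returns the first (finest) level that contains P, or -1 if none does.
// The shader expresses the -1 case as bOutsideValidRegion staying true.
constexpr int SelectVoxelClipmap(const Vec3& P, const Clipmap* Clipmaps, int NumClipmapLevels) {
    for (int Index = 0; Index < NumClipmapLevels; ++Index) {
        if (!IsOutsideValidRegion(P, Clipmaps[Index])) {
            return Index;
        }
    }
    return -1;
}

constexpr float Saturate(float V) { return V < 0.0f ? 0.0f : (V > 1.0f ? 1.0f : V); }

// HLSL smoothstep: cubic Hermite interpolation between edges A and B.
constexpr float SmoothStep(float A, float B, float X) {
    const float T = Saturate((X - A) / (B - A));
    return T * T * (3.0f - 2.0f * T);
}

// Hits closer than 1.5 voxel radii contribute nothing, hits beyond
// 2 radii contribute fully, with a smooth ramp in between.
constexpr float VoxelSelfLightingBias(float VoxelRadius, float HitTime) {
    return SmoothStep(1.5f * VoxelRadius, 2.0f * VoxelRadius, HitTime);
}
```

The bias explains the over-occlusion preference: a diffuse ray that hits almost immediately is most likely sampling the voxel it started from, so its lighting is suppressed rather than risk leaking.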

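ConeTraceVoxels samples VoxelLighting, a 3D texture that is also a clipmap. In the data I captured, its dimensions are 64x256x384, and many of its slices are black; only a few contain pixels, and those regions are small: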

6.5.6.4 CombineLumenSceneLighting

CombineLumenSceneLighting combines the lighting. Its logic is as follows:

void CombineLumenSceneLighting(
    FScene* Scene, 
    FViewInfo& View,
    FRDGBuilder& GraphBuilder,
    TRDGUniformBufferRef<FLumenCardScene> LumenCardSceneUniformBuffer,
    FRDGTextureRef FinalLightingAtlas, 
    FRDGTextureRef OpacityAtlas, 
    FRDGTextureRef RadiosityAtlas, 
    FGlobalShaderMap* GlobalShaderMap,
    const FLumenCardScatterContext& VisibleCardScatterContext)
{
    LLM_SCOPE_BYTAG(Lumen);

    FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;

    {
        FLumenCardLightingEmissive* PassParameters = GraphBuilder.AllocParameters<FLumenCardLightingEmissive>();
        
        extern int32 GLumenRadiosityDownsampleFactor;
        FVector2D CardUVSamplingOffset = FVector2D::ZeroVector;
        if (GLumenRadiosityDownsampleFactor > 1)
        {
            // Offset bilinear samples in order to not sample outside of the lower res radiosity card bounds
            CardUVSamplingOffset.X = (GLumenRadiosityDownsampleFactor * 0.25f) / LumenSceneData.MaxAtlasSize.X;
            CardUVSamplingOffset.Y = (GLumenRadiosityDownsampleFactor * 0.25f) / LumenSceneData.MaxAtlasSize.Y;
        }

        PassParameters->RenderTargets[0] = FRenderTargetBinding(FinalLightingAtlas, ERenderTargetLoadAction::ENoAction);
        PassParameters->VS.LumenCardScene = LumenCardSceneUniformBuffer;
        PassParameters->VS.CardScatterParameters = VisibleCardScatterContext.Parameters;
        PassParameters->VS.ScatterInstanceIndex = 0;
        PassParameters->VS.CardUVSamplingOffset = CardUVSamplingOffset;
        PassParameters->PS.View = View.ViewUniformBuffer;
        PassParameters->PS.LumenCardScene = LumenCardSceneUniformBuffer;
        PassParameters->PS.RadiosityAtlas = RadiosityAtlas;
        PassParameters->PS.OpacityAtlas = OpacityAtlas;

        // Add the lighting combine pass, which uses the conventional rasterization pipeline.
        GraphBuilder.AddPass(
            RDG_EVENT_NAME("LightingCombine"),
            PassParameters,
            ERDGPassFlags::Raster,
            [MaxAtlasSize = Scene->LumenSceneData->MaxAtlasSize, PassParameters, GlobalShaderMap](FRHICommandListImmediate& RHICmdList)
        {
            FLumenCardLightingInitializePS::FPermutationDomain PermutationVector;
            auto PixelShader = GlobalShaderMap->GetShader< FLumenCardLightingInitializePS >(PermutationVector);

            DrawQuadsToAtlas(MaxAtlasSize, PixelShader, PassParameters, GlobalShaderMap, TStaticBlendState<>::GetRHI(), RHICmdList);
        });
    }
}

This stage takes the scene radiosity atlas from the previous section as input and writes the radiosity color into SceneFinalLighting.
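
The CardUVSamplingOffset computed in CombineLumenSceneLighting is worth a closer look. One low-resolution radiosity texel spans `GLumenRadiosityDownsampleFactor` full-resolution texels, so `DownsampleFactor * 0.25 / MaxAtlasSize` is a quarter of a low-resolution texel expressed in atlas UV units. The C++ sketch below reproduces the arithmetic; `Vec2` and `ComputeCardUVSamplingOffset` are hypothetical names, and the atlas size is passed as two plain integers for simplicity.

```cpp
struct Vec2 { float X, Y; };

// Quarter-texel (of the downsampled radiosity atlas) offset in atlas UV
// space, or zero when the radiosity atlas is not downsampled.
constexpr Vec2 ComputeCardUVSamplingOffset(int DownsampleFactor, int AtlasWidth, int AtlasHeight) {
    if (DownsampleFactor <= 1) {
        return Vec2{0.0f, 0.0f};
    }
    return Vec2{
        (static_cast<float>(DownsampleFactor) * 0.25f) / static_cast<float>(AtlasWidth),
        (static_cast<float>(DownsampleFactor) * 0.25f) / static_cast<float>(AtlasHeight)};
}
```

For a downsample factor of 2 and a 4096-wide atlas, the offset is 0.5 / 4096 in U, which is half a full-resolution texel and a quarter of a radiosity texel.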

6.5.6.5 RenderDirectLightingForLumenScene

RenderDirectLightingForLumenScene computes the direct lighting of the Lumen scene. The flow is somewhat similar to conventional lighting:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenSceneDirectLighting.cpp

void FDeferredShadingSceneRenderer::RenderDirectLightingForLumenScene(
    FRDGBuilder& GraphBuilder,
    TRDGUniformBufferRef<FLumenCardScene> LumenCardSceneUniformBuffer,
    FRDGTextureRef FinalLightingAtlas,
    FRDGTextureRef OpacityAtlas,
    FGlobalShaderMap* GlobalShaderMap,
    const FLumenCardScatterContext& VisibleCardScatterContext)
{
    LLM_SCOPE_BYTAG(Lumen);

    if (GLumenDirectLighting)
    {
        RDG_EVENT_SCOPE(GraphBuilder, "DirectLighting");
        QUICK_SCOPE_CYCLE_COUNTER(RenderDirectLightingForLumenScene);

        const FViewInfo& MainView = Views[0];
        FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
        const bool bLumenUseHardwareRayTracedShadow = Lumen::UseHardwareRayTracedShadows(MainView);
        FLumenDirectLightingHardwareRayTracingData LumenDirectLightingHardwareRayTracingData;
        
        if(bLumenUseHardwareRayTracedShadow)
        {
            LumenDirectLightingHardwareRayTracingData.Initialize(GraphBuilder, Scene);
        }

        TArray<const FLightSceneInfo*, TInlineAllocator<64>> GatheredLocalLights;

        // Iterate over all lights in the scene.
        for (TSparseArray<FLightSceneInfoCompact>::TConstIterator LightIt(Scene->Lights); LightIt; ++LightIt)
        {
            const FLightSceneInfoCompact& LightSceneInfoCompact = *LightIt;
            const FLightSceneInfo* LightSceneInfo = LightSceneInfoCompact.LightSceneInfo;

            if (LightSceneInfo->ShouldRenderLightViewIndependent()
                && LightSceneInfo->ShouldRenderLight(MainView, true)
                && LightSceneInfo->Proxy->GetIndirectLightingScale() > 0.0f)
            {
                const ELightComponentType LightType = (ELightComponentType)LightSceneInfo->Proxy->GetLightType();

                // Directional light.
                if (LightType == LightType_Directional)
                {
                    // No culling needed; draw directly.

                    FString LightNameWithLevel;
                    FSceneRenderer::GetLightNameForDrawEvent(LightSceneInfo->Proxy, LightNameWithLevel);

                    // Render the direct light into the Lumen cards.
                    RenderDirectLightIntoLumenCards(
                        GraphBuilder,
                        Scene,
                        MainView,
                        ViewFamily.EngineShowFlags,
                        VisibleLightInfos,
                        LumenCardSceneUniformBuffer,
                        FinalLightingAtlas,
                        OpacityAtlas,
                        LightSceneInfo,
                        LightNameWithLevel,
                        VisibleCardScatterContext,
                        0,
                        LumenDirectLightingHardwareRayTracingData,
                        VirtualShadowMapArray);
                }
                else // Not a directional light: collect it into GatheredLocalLights.
                {
                    GatheredLocalLights.Add(LightSceneInfo);
                }
            }
        }

        const int32 LightBatchSize = FMath::Clamp(GLumenDirectLightingBatchSize, 1, 256);

        // Cull and draw the local lights in batches.
        for (int32 LightBatchIndex = 0; LightBatchIndex * LightBatchSize < GatheredLocalLights.Num(); ++LightBatchIndex)
        {
            const int32 FirstLightIndex = LightBatchIndex * LightBatchSize;
            const int32 LastLightIndex = FMath::Min((LightBatchIndex + 1) * LightBatchSize, GatheredLocalLights.Num());

            FLumenCardScatterContext CardScatterContext;

            {
                RDG_EVENT_SCOPE(GraphBuilder, "Cull Cards %d Lights", LastLightIndex - FirstLightIndex);

                // Initialize the context.
                CardScatterContext.Init(
                    GraphBuilder,
                    MainView,
                    LumenSceneData,
                    LumenCardRenderer,
                    ECullCardsMode::OperateOnSceneForceUpdateForCardsToRender,
                    LightBatchSize);

                // Cull the cards against the shape of each light.
                for (int32 LightIndex = FirstLightIndex; LightIndex < LastLightIndex; ++LightIndex)
                {
                    const int32 ScatterInstanceIndex = LightIndex - FirstLightIndex;
                    const FLightSceneInfo* LightSceneInfo = GatheredLocalLights[LightIndex];
                    const ELightComponentType LightType = (ELightComponentType)LightSceneInfo->Proxy->GetLightType();
                    const FSphere LightBounds = LightSceneInfo->Proxy->GetBoundingSphere();

                    ECullCardsShapeType ShapeType = ECullCardsShapeType::None;

                    if (LightType == LightType_Point)
                    {
                        ShapeType = ECullCardsShapeType::PointLight;
                    }
                    else if (LightType == LightType_Spot)
                    {
                        ShapeType = ECullCardsShapeType::SpotLight;
                    }
                    else if (LightType == LightType_Rect)
                    {
                        ShapeType = ECullCardsShapeType::RectLight;
                    }
                    else
                    {
                        ensureMsgf(false, TEXT("Need Lumen card culling for new light type"));
                    }

                    FCullCardsShapeParameters ShapeParameters;
                    ShapeParameters.InfluenceSphere = FVector4(LightBounds.Center, LightBounds.W);
                    ShapeParameters.LightPosition = LightSceneInfo->Proxy->GetPosition();
                    ShapeParameters.LightDirection = LightSceneInfo->Proxy->GetDirection();
                    ShapeParameters.LightRadius = LightSceneInfo->Proxy->GetRadius();
                    ShapeParameters.CosConeAngle = FMath::Cos(LightSceneInfo->Proxy->GetOuterConeAngle());
                    ShapeParameters.SinConeAngle = FMath::Sin(LightSceneInfo->Proxy->GetOuterConeAngle());

                    // Cull cards according to the light's shape.
                    CardScatterContext.CullCardsToShape(
                        GraphBuilder,
                        MainView,
                        LumenSceneData,
                        LumenCardRenderer,
                        LumenCardSceneUniformBuffer,
                        ShapeType,
                        ShapeParameters,
                        GLumenSceneCardDirectLightingUpdateFrequencyScale,
                        ScatterInstanceIndex);
                }

                // Build the indirect draw arguments for the scatter.
                CardScatterContext.BuildScatterIndirectArgs(
                    GraphBuilder,
                    MainView);
            }

            // Draw the non-directional lights.
            {
                RDG_EVENT_SCOPE(GraphBuilder, "Draw %d Lights", LastLightIndex - FirstLightIndex);

                for (int32 LightIndex = FirstLightIndex; LightIndex < LastLightIndex; ++LightIndex)
                {
                    const int32 ScatterInstanceIndex = LightIndex - FirstLightIndex;
                    const FLightSceneInfo* LightSceneInfo = GatheredLocalLights[LightIndex];

                    FString LightNameWithLevel;
                    FSceneRenderer::GetLightNameForDrawEvent(LightSceneInfo->Proxy, LightNameWithLevel);

                    // Draw the non-directional light into the Lumen cards.
                    RenderDirectLightIntoLumenCards(
                        GraphBuilder,
                        Scene,
                        MainView,
                        ViewFamily.EngineShowFlags,
                        VisibleLightInfos,
                        LumenCardSceneUniformBuffer,
                        FinalLightingAtlas,
                        OpacityAtlas,
                        LightSceneInfo,
                        LightNameWithLevel,
                        CardScatterContext,
                        ScatterInstanceIndex,
                        LumenDirectLightingHardwareRayTracingData,
                        VirtualShadowMapArray);
                }
            }
        }
    }
}
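
The batching of local lights in RenderDirectLightingForLumenScene comes down to a few integer formulas. The C++ sketch below isolates them; `LightBatch`, `ClampBatchSize`, `NumBatches`, `GetBatch` and `GetScatterInstanceIndex` are hypothetical helper names, not engine functions. Only the clamp range (1 to 256) and the index arithmetic are taken from the listing.

```cpp
struct LightBatch {
    int First;  // index of the first light in the batch
    int Last;   // one past the last light in the batch
};

// Same range as FMath::Clamp(GLumenDirectLightingBatchSize, 1, 256).
constexpr int ClampBatchSize(int Requested) {
    return Requested < 1 ? 1 : (Requested > 256 ? 256 : Requested);
}

// Number of iterations of the outer batch loop.
constexpr int NumBatches(int NumLights, int BatchSize) {
    return (NumLights + BatchSize - 1) / BatchSize;
}

// Mirrors FirstLightIndex / LastLightIndex for a given batch.
constexpr LightBatch GetBatch(int BatchIndex, int BatchSize, int NumLights) {
    const int First = BatchIndex * BatchSize;
    const int End = (BatchIndex + 1) * BatchSize;
    return LightBatch{First, End < NumLights ? End : NumLights};
}

// Each light in a batch culls cards into its own scatter instance slot.
constexpr int GetScatterInstanceIndex(int LightIndex, const LightBatch& Batch) {
    return LightIndex - Batch.First;
}
```

With 10 local lights and a batch size of 4, the loop runs three times and the last batch covers lights 8 and 9, which use scatter instance slots 0 and 1. Because the slot index restarts at zero in every batch, one FLumenCardScatterContext sized for LightBatchSize instances is enough per batch.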

The code of RenderDirectLightIntoLumenCards, which draws a single light, is shown below:

void RenderDirectLightIntoLumenCards(
    FRDGBuilder& GraphBuilder,
    const FScene* Scene,
    const FViewInfo& View,
    const FEngineShowFlags& EngineShowFlags,
    TArray<FVisibleLightInfo, SceneRenderingAllocator>& VisibleLightInfos,
    TRDGUniformBufferRef<FLumenCardScene> LumenCardSceneUniformBuffer,
    FRDGTextureRef FinalLightingAtlas,
    FRDGTextureRef OpacityAtlas,
    const FLightSceneInfo* LightSceneInfo,
    const FString& LightName,
    const FLumenCardScatterContext& CardScatterContext,
    int32 ScatterInstanceIndex,
    FLumenDirectLightingHardwareRayTracingData& LumenDirectLightingHardwareRayTracingData,
    const FVirtualShadowMapArray& VirtualShadowMapArray)
{
    FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;
    const FSphere LightBounds = LightSceneInfo->Proxy->GetBoundingSphere();
    const ELightComponentType LightType = (ELightComponentType)LightSceneInfo->Proxy->GetLightType();
    bool bShadowed = LightSceneInfo->Proxy->CastsDynamicShadow();

    // Convert the light type.
    ELumenLightType LumenLightType = ELumenLightType::MAX;
    {
        switch (LightType)
        {
        case LightType_Directional: LumenLightType = ELumenLightType::Directional;    break;
        case LightType_Point:        LumenLightType = ELumenLightType::Point;        break;
        case LightType_Spot:        LumenLightType = ELumenLightType::Spot;            break;
        case LightType_Rect:        LumenLightType = ELumenLightType::Rect;            break;
        }
        check(LumenLightType != ELumenLightType::MAX);
    }

    // Set up the shadow information.
    FVisibleLightInfo& VisibleLightInfo = VisibleLightInfos[LightSceneInfo->Id];
    FLumenShadowSetup ShadowSetup = GetShadowForLumenDirectLighting(VisibleLightInfo);

    const bool bDynamicallyShadowed = ShadowSetup.DenseShadowMap != nullptr;

    FDistanceFieldObjectBufferParameters ObjectBufferParameters = DistanceField::SetupObjectBufferParameters(Scene->DistanceFieldSceneData);

    FLightTileIntersectionParameters LightTileIntersectionParameters;
    FDistanceFieldCulledObjectBufferParameters CulledObjectBufferParameters;
    FMatrix WorldToMeshSDFShadowValue = FMatrix::Identity;

    const bool bLumenUseHardwareRayTracedShadow = Lumen::UseHardwareRayTracedShadows(View) && bShadowed;
    const bool bTraceMeshSDFs = bShadowed 
        && LumenLightType == ELumenLightType::Directional 
        && DoesPlatformSupportDistanceFieldShadowing(View.GetShaderPlatform())
        && GLumenDirectLightingOffscreenShadowingTraceMeshSDFs != 0
        && Lumen::UseMeshSDFTracing()
        && ObjectBufferParameters.NumSceneObjects > 0;

    // Resolve the virtual shadow map ID.
    int32 VirtualShadowMapId = -1;
    if (bDynamicallyShadowed
        && !bLumenUseHardwareRayTracedShadow
        && GLumenDirectLightingVirtualShadowMap != 0
        && VirtualShadowMapArray.IsAllocated())
    {
        if (LightType == LightType_Directional)
        {
            VirtualShadowMapId = VisibleLightInfo.VirtualShadowMapClipmaps[0]->GetVirtualShadowMap()->ID;
        }
        else if (ShadowSetup.VirtualShadowMap)
        {
            VirtualShadowMapId = ShadowSetup.VirtualShadowMap->VirtualShadowMaps[0]->ID;
        }
    }

    const bool bUseVirtualShadowMap = VirtualShadowMapId >= 0;
    if (!bUseVirtualShadowMap)
    {
        // Fallback to a complete shadow map
        ShadowSetup.VirtualShadowMap = nullptr;
        ShadowSetup.DenseShadowMap = GetShadowForInjectionIntoVolumetricFog(VisibleLightInfo);
    }

    if (bLumenUseHardwareRayTracedShadow)
    {
        RenderHardwareRayTracedShadowIntoLumenCards(
            GraphBuilder, Scene, View, LumenCardSceneUniformBuffer, OpacityAtlas, 
            LightSceneInfo, LightName, CardScatterContext, ScatterInstanceIndex,
            LumenDirectLightingHardwareRayTracingData, bDynamicallyShadowed, LumenLightType);
    }
    else if (bTraceMeshSDFs)
    {
        CullMeshSDFsForLightCards(GraphBuilder, Scene, View, LightSceneInfo, ObjectBufferParameters, WorldToMeshSDFShadowValue, CulledObjectBufferParameters, LightTileIntersectionParameters);
    }

    FLumenCardDirectLighting* PassParameters = GraphBuilder.AllocParameters<FLumenCardDirectLighting>();
    {
        PassParameters->RenderTargets[0] = FRenderTargetBinding(FinalLightingAtlas, ERenderTargetLoadAction::ELoad);
        PassParameters->VS.InfluenceSphere = FVector4(LightBounds.Center, LightBounds.W);
        PassParameters->VS.LumenCardScene = LumenCardSceneUniformBuffer;
        PassParameters->VS.CardScatterParameters = CardScatterContext.Parameters;
        PassParameters->VS.ScatterInstanceIndex = ScatterInstanceIndex;
        PassParameters->VS.CardUVSamplingOffset = FVector2D::ZeroVector;

        // Get the volume shadowing shader parameters.
        GetVolumeShadowingShaderParameters(
            GraphBuilder,
            View,
            LightSceneInfo,
            ShadowSetup.DenseShadowMap,
            0,
            bDynamicallyShadowed,
            PassParameters->PS.VolumeShadowingShaderParameters);

        // Deferred light uniform parameters.
        FDeferredLightUniformStruct DeferredLightUniforms = GetDeferredLightParameters(View, *LightSceneInfo);

        if (LightSceneInfo->Proxy->IsInverseSquared())
        {
            DeferredLightUniforms.LightParameters.FalloffExponent = 0;
        }

        PassParameters->PS.View = View.ViewUniformBuffer;
        PassParameters->PS.LumenCardScene = LumenCardSceneUniformBuffer;
        PassParameters->PS.OpacityAtlas = OpacityAtlas;
        DeferredLightUniforms.LightParameters.Color *= LightSceneInfo->Proxy->GetIndirectLightingScale();
        PassParameters->PS.DeferredLightUniforms = CreateUniformBufferImmediate(DeferredLightUniforms, UniformBuffer_SingleDraw);
        PassParameters->PS.ForwardLightData = View.ForwardLightingResources->ForwardLightDataUniformBuffer;
        SetupLightFunctionParameters(LightSceneInfo, 1.0f, PassParameters->PS.LightFunctionParameters);

        PassParameters->PS.VirtualShadowMapId = VirtualShadowMapId;
        if (bUseVirtualShadowMap)
        {
            PassParameters->PS.VirtualShadowMapSamplingParameters = VirtualShadowMapArray.GetSamplingParameters(GraphBuilder);
        }
        
        PassParameters->PS.ObjectBufferParameters = ObjectBufferParameters;
        PassParameters->PS.CulledObjectBufferParameters = CulledObjectBufferParameters;
        PassParameters->PS.LightTileIntersectionParameters = LightTileIntersectionParameters;

        FDistanceFieldAtlasParameters DistanceFieldAtlasParameters = DistanceField::SetupAtlasParameters(Scene->DistanceFieldSceneData);

        // Distance field atlas.
        PassParameters->PS.DistanceFieldAtlasParameters = DistanceFieldAtlasParameters;
        PassParameters->PS.WorldToShadow = WorldToMeshSDFShadowValue;
        extern float GTwoSidedMeshDistanceBias;
        PassParameters->PS.TwoSidedMeshDistanceBias = GTwoSidedMeshDistanceBias;

        PassParameters->PS.TanLightSourceAngle = FMath::Tan(LightSceneInfo->Proxy->GetLightSourceAngle());
        PassParameters->PS.MaxTraceDistance = GOffscreenShadowingMaxTraceDistance;
        PassParameters->PS.StepFactor = FMath::Clamp(GOffscreenShadowingTraceStepFactor, .1f, 10.0f);
        PassParameters->PS.SurfaceBias = FMath::Clamp(GShadowingSurfaceBias, .01f, 100.0f);
        PassParameters->PS.SlopeScaledSurfaceBias = FMath::Clamp(GShadowingSlopeScaledSurfaceBias, .01f, 100.0f);
        PassParameters->PS.SDFSurfaceBiasScale = FMath::Clamp(GOffscreenShadowingSDFSurfaceBiasScale, .01f, 100.0f);
        PassParameters->PS.VirtualShadowMapSurfaceBias = FMath::Clamp(GLumenDirectLightingVirtualShadowMapBias, .01f, 100.0f);
        PassParameters->PS.ForceOffscreenShadowing = GLumenDirectLightingForceOffscreenShadowing;

        if (bLumenUseHardwareRayTracedShadow)
        {
            PassParameters->PS.ShadowMaskAtlas = LumenDirectLightingHardwareRayTracingData.ShadowMaskAtlas;
        }

        // IES
        {
            FTexture* IESTextureResource = LightSceneInfo->Proxy->GetIESTextureResource();

            if (View.Family->EngineShowFlags.TexturedLightProfiles && IESTextureResource)
            {
                PassParameters->PS.UseIESProfile = 1;
                PassParameters->PS.IESTexture = IESTextureResource->TextureRHI;
            }
            else
            {
                PassParameters->PS.UseIESProfile = 0;
                PassParameters->PS.IESTexture = GWhiteTexture->TextureRHI;
            }

            PassParameters->PS.IESTextureSampler = TStaticSamplerState<SF_Bilinear,AM_Clamp,AM_Clamp,AM_Clamp>::GetRHI();
        }
    }

    FRasterizeToCardsVS::FPermutationDomain VSPermutationVector;
    VSPermutationVector.Set< FRasterizeToCardsVS::FClampToInfluenceSphere >(LightType != LightType_Directional);
    auto VertexShader = View.ShaderMap->GetShader<FRasterizeToCardsVS>(VSPermutationVector);
    const FMaterialRenderProxy* LightFunctionMaterialProxy = LightSceneInfo->Proxy->GetLightFunctionMaterial();
    bool bUseLightFunction = true;

    if (!LightFunctionMaterialProxy
        || !LightFunctionMaterialProxy->GetIncompleteMaterialWithFallback(Scene->GetFeatureLevel()).IsLightFunction()
        || !EngineShowFlags.LightFunctions)
    {
        bUseLightFunction = false;
        LightFunctionMaterialProxy = UMaterial::GetDefaultMaterial(MD_LightFunction)->GetRenderProxy();
    }

    const bool bUseCloudTransmittance = SetupLightCloudTransmittanceParameters(Scene, View, GLumenDirectLightingCloudTransmittance != 0 ? LightSceneInfo : nullptr, PassParameters->PS.LightCloudTransmittanceParameters);

    // Set up the shader permutation.
    FLumenCardDirectLightingPS::FPermutationDomain PermutationVector;
    PermutationVector.Set< FLumenCardDirectLightingPS::FLightType >(LumenLightType);
    PermutationVector.Set< FLumenCardDirectLightingPS::FDynamicallyShadowed >(bDynamicallyShadowed);
    PermutationVector.Set< FLumenCardDirectLightingPS::FShadowed >(bShadowed);
    PermutationVector.Set< FLumenCardDirectLightingPS::FTraceMeshSDFs >(bTraceMeshSDFs);
    PermutationVector.Set< FLumenCardDirectLightingPS::FVirtualShadowMap >(bUseVirtualShadowMap);
    PermutationVector.Set< FLumenCardDirectLightingPS::FLightFunction >(bUseLightFunction);
    PermutationVector.Set< FLumenCardDirectLightingPS::FRayTracingShadowPassCombine>(bLumenUseHardwareRayTracedShadow);
    PermutationVector.Set< FLumenCardDirectLightingPS::FCloudTransmittance >(bUseCloudTransmittance);
    
    PermutationVector = FLumenCardDirectLightingPS::RemapPermutation(PermutationVector);

    const FMaterial& Material = LightFunctionMaterialProxy->GetMaterialWithFallback(Scene->GetFeatureLevel(), LightFunctionMaterialProxy);
    const FMaterialShaderMap* MaterialShaderMap = Material.GetRenderingThreadShaderMap();
    auto PixelShader = MaterialShaderMap->GetShader<FLumenCardDirectLightingPS>(PermutationVector);

    ClearUnusedGraphResources(PixelShader, &PassParameters->PS);

    const uint32 CardIndirectArgOffset = CardScatterContext.GetIndirectArgOffset(ScatterInstanceIndex);

    // Lighting draw pass.
    GraphBuilder.AddPass(
        RDG_EVENT_NAME("%s %s", *LightName, bDynamicallyShadowed ? TEXT("Shadowmap") : TEXT("")),
        PassParameters,
        ERDGPassFlags::Raster,
        [MaxAtlasSize = LumenSceneData.MaxAtlasSize, PassParameters, LightSceneInfo, VertexShader, PixelShader, GlobalShaderMap = View.ShaderMap, LightFunctionMaterialProxy, &Material, &View, CardIndirectArgOffset](FRHICommandListImmediate& RHICmdList)
        {
            DrawQuadsToAtlas(
                MaxAtlasSize,
                VertexShader,
                PixelShader,
                PassParameters,
                GlobalShaderMap,
                TStaticBlendState<CW_RGBA, BO_Add, BF_One, BF_One>::GetRHI(),
                RHICmdList,
                [LightFunctionMaterialProxy, &Material, &View](FRHICommandListImmediate& RHICmdList, TShaderRefBase<FLumenCardDirectLightingPS, FShaderMapPointerTable> Shader, FRHIPixelShader* ShaderRHI, const FLumenCardDirectLightingPS::FParameters& Parameters)
                {
                    Shader->SetParameters(RHICmdList, ShaderRHI, LightFunctionMaterialProxy, Material, View);
                },
                CardIndirectArgOffset);
        });
}
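
The shadowing decisions in RenderDirectLightIntoLumenCards are spread over several booleans, so a condensed view may help. The C++ sketch below is a simplification under stated assumptions, not engine code: `ShadowInputs`, `ShadowPaths` and `SelectShadowPaths` are hypothetical names, `bVirtualShadowMapUsable` folds together the console variable check, `VirtualShadowMapArray.IsAllocated()` and the light actually owning a virtual shadow map, and `bMeshSDFTracingUsable` folds together the platform, console variable and scene object checks.

```cpp
struct ShadowInputs {
    bool bShadowed;                  // light casts dynamic shadows
    bool bDirectional;               // light is a directional light
    bool bHardwareRayTracedShadows;  // Lumen::UseHardwareRayTracedShadows(View)
    bool bHasDenseShadowMap;         // ShadowSetup.DenseShadowMap != nullptr
    bool bVirtualShadowMapUsable;    // assumption: cvar on, array allocated, light has a VSM
    bool bMeshSDFTracingUsable;      // assumption: platform, cvar and scene object checks pass
};

struct ShadowPaths {
    bool bHardwareRayTraced;       // shadow mask rendered by hardware ray tracing
    bool bVirtualShadowMap;        // VirtualShadowMapId >= 0
    bool bDenseShadowMapFallback;  // falls back to the complete shadow map
    bool bCullMeshSDFs;            // CullMeshSDFsForLightCards is invoked
};

constexpr ShadowPaths SelectShadowPaths(const ShadowInputs& In) {
    const bool bHardwareRayTraced = In.bHardwareRayTracedShadows && In.bShadowed;
    const bool bTraceMeshSDFs = In.bShadowed && In.bDirectional && In.bMeshSDFTracingUsable;
    const bool bVirtualShadowMap =
        In.bHasDenseShadowMap && !bHardwareRayTraced && In.bVirtualShadowMapUsable;
    return ShadowPaths{
        bHardwareRayTraced,
        bVirtualShadowMap,
        !bVirtualShadowMap,
        !bHardwareRayTraced && bTraceMeshSDFs};
}
```

The paths are not mutually exclusive: a directional light without hardware ray tracing can use a virtual shadow map and still cull mesh SDFs, which the parameter names suggest is meant for offscreen shadowing.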

A frame capture of the direct lighting pass shows the following flow:

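The textures fed into the lighting computation differ by light type, but every light type takes depth, normal, opacity and similar data as input. The difference is that local (non-directional) lights also take distance-field-related textures and a 16x16x16 Perlin noise 3D texture, whereas directional lights take a 128x128x128 3D texture named VolumeTexture (the figure below shows slice 0 magnified 4x):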

After the lighting computation, the output looks like this:

The pixel shader used for the direct lighting computation is shown below:

// Engine\Shaders\Private\Lumen\LumenSceneDirectLighting.usf

void LumenCardDirectLightingPS(
    FCardVSToPS CardInterpolants,
    out float4 OutColor : SV_Target0)
{
    float Opacity = Texture2DSampleLevel(OpacityAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0).x;
    float3 Irradiance = 0;

    if (Opacity > 0)
    {
        // Build the light data.
        FDeferredLightData LightData;
        {
            LightData.Position = DeferredLightUniforms.Position;
            LightData.InvRadius = DeferredLightUniforms.InvRadius;
            LightData.Color = DeferredLightUniforms.Color;
            LightData.FalloffExponent = DeferredLightUniforms.FalloffExponent;
            LightData.Direction = DeferredLightUniforms.Direction;  
            LightData.Tangent = DeferredLightUniforms.Tangent;
            LightData.SpotAngles = DeferredLightUniforms.SpotAngles;
            LightData.SourceRadius = DeferredLightUniforms.SourceRadius;
            LightData.SourceLength = DeferredLightUniforms.SourceLength;
            LightData.SoftSourceRadius = DeferredLightUniforms.SoftSourceRadius;
            LightData.SpecularScale = DeferredLightUniforms.SpecularScale;
            LightData.ContactShadowLength = abs(DeferredLightUniforms.ContactShadowLength);
            LightData.ContactShadowLengthInWS = DeferredLightUniforms.ContactShadowLength < 0.0f;
            LightData.DistanceFadeMAD = DeferredLightUniforms.DistanceFadeMAD;
            LightData.ShadowMapChannelMask = DeferredLightUniforms.ShadowMapChannelMask;
            LightData.ShadowedBits = DeferredLightUniforms.ShadowedBits;
            LightData.RectLightBarnCosAngle = DeferredLightUniforms.RectLightBarnCosAngle;
            LightData.RectLightBarnLength = DeferredLightUniforms.RectLightBarnLength;

            LightData.bInverseSquared = LightData.FalloffExponent == 0.0f;
            LightData.bRadialLight = LIGHT_TYPE != LIGHT_TYPE_DIRECTIONAL;
            LightData.bSpotLight = LIGHT_TYPE == LIGHT_TYPE_SPOT;
            LightData.bRectLight = LIGHT_TYPE == LIGHT_TYPE_RECT;
        }

        // Fetch the Lumen card data.
        FLumenCardData LumenCardData = GetLumenCardData(CardInterpolants.CardId);

        float Depth = 1.0f - Texture2DSampleLevel(LumenCardScene.DepthAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0).x;

        // Reconstruct the card-local position, then transform it to world space.
        float3 LocalPosition;
        LocalPosition.xy = (CardInterpolants.AtlasCoord - LumenCardData.LocalPositionToAtlasUVBias) / LumenCardData.LocalPositionToAtlasUVScale;
        LocalPosition.z = -LumenCardData.LocalExtent.z + Depth * 2 * LumenCardData.LocalExtent.z;

        float3 WorldPosition = mul(LumenCardData.WorldToLocalRotation, LocalPosition) + LumenCardData.Origin;

        float3 LightColor = DeferredLightUniforms.Color;
        float3 L = LightData.Direction;
        float3 ToLight = L;
    
        // Compute the light attenuation.
#if LIGHT_TYPE == LIGHT_TYPE_DIRECTIONAL
        float CombinedAttenuation = 1;
#else
        float LightMask = 1;
        if (LightData.bRadialLight)
        {
            LightMask = GetLocalLightAttenuation(WorldPosition, LightData, ToLight, L);
        }

        float Attenuation;

        if (LightData.bRectLight)
        {
            FRect Rect = GetRect(ToLight, LightData);
            FRectTexture RectTexture = InitRectTexture(DeferredLightUniforms.SourceTexture);
            Attenuation = IntegrateLight(Rect, RectTexture);
        }
        else
        {
            FCapsuleLight Capsule = GetCapsule(ToLight, LightData);
            Capsule.DistBiasSqr = 0;
            Attenuation = IntegrateLight(Capsule, LightData.bInverseSquared);
        }

        float CombinedAttenuation = Attenuation * LightMask;
#endif

        if (CombinedAttenuation > 0)
        {
            float3 WorldNormal = Texture2DSampleLevel(LumenCardScene.NormalAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0).xyz * 2 - 1;

            // Only surfaces that face the light are lit.
            if (dot(WorldNormal, L) > 0)
            {
                float ShadowFactor = 1.0f;

                #if SHADOWED_LIGHT  // Shadowed light
                {
                    // Hardware ray-traced shadows
                    #if HARDWARE_RAYTRACING_SHADOW_PASS_COMBINE
                    {
                        float2 AtlasTextureSize = LumenCardScene.AtlasSize;
                        uint2 Pos2D = CardInterpolants.AtlasCoord * AtlasTextureSize.xy - float2(0.5, 0.5) / AtlasTextureSize.xy;
                        ShadowFactor = ShadowMaskAtlas.Load(uint3(Pos2D, 0));
                    }
                    #else // No hardware ray-traced shadows
                    {
                        bool bShadowFactorComplete = false;
                        bool bVSMValid = false;

                        // Use the virtual shadow map
                        #if VIRTUAL_SHADOW_MAP
                        {
                            // Bias only ray start to maximize chances of hitting an allocated page
                            FVirtualShadowMapSampleResult VirtualShadowMapSample = SampleVirtualShadowMap(VirtualShadowMapId, WorldPosition, VirtualShadowMapSurfaceBias, WorldNormal);

                            bVSMValid = VirtualShadowMapSample.bValid;
                            bShadowFactorComplete = VirtualShadowMapSample.bValid && VirtualShadowMapSample.bOccluded;
                            ShadowFactor = VirtualShadowMapSample.ShadowFactor;
                        }
                        #endif

                        // Compute ShadowFactor if the virtual shadow map has not already resolved it.
                        if (!bShadowFactorComplete)
                        {
                            float3 WorldPositionForShadowing = GetWorldPositionForShadowing(WorldPosition, L, WorldNormal, 1.0f);

                            #if LIGHT_TYPE == LIGHT_TYPE_DIRECTIONAL
                            {
                                #if DYNAMICALLY_SHADOWED
                                    float SceneDepth = dot(WorldPositionForShadowing - View.WorldCameraOrigin, View.ViewForward);

                                    bool bShadowingFromValidUVArea = false;
                                    float NewShadowFactor = ComputeDirectionalLightDynamicShadowing(WorldPositionForShadowing, SceneDepth, bShadowingFromValidUVArea);

                                    float4 PostProjectionPosition = mul(float4(WorldPosition, 1.0), View.WorldToClip);
                                    // CSM's are culled so only query points inside the view are valid
                                    float2 ValidTexelSize = float2(length(ddx(WorldPosition)), length(ddy(WorldPosition))) * 2;
                                    if (bShadowingFromValidUVArea && all(PostProjectionPosition.xy - ValidTexelSize < PostProjectionPosition.w&& PostProjectionPosition.xy + ValidTexelSize > -PostProjectionPosition.w))
                                    { 
                                        ShadowFactor *= NewShadowFactor;
                                        bShadowFactorComplete = VIRTUAL_SHADOW_MAP ? bVSMValid : true;
                                    }
                                #endif
                            }
                            #else
                            {
                                bool bShadowingFromValidUVArea = false;
                                float NewShadowFactor = ComputeVolumeShadowing(WorldPositionForShadowing, LightData.bRadialLight && !LightData.bSpotLight, LightData.bSpotLight, bShadowingFromValidUVArea);

                                if (bShadowingFromValidUVArea) 
                                {
                                    ShadowFactor *= NewShadowFactor;
                                    bShadowFactorComplete = VIRTUAL_SHADOW_MAP ? bVSMValid : true;
                                }
                            }
                            #endif
                        }

                        // Handle offscreen shadowing.
                        bool bOffscreenShadowing = !bShadowFactorComplete;
                        if (ForceOffscreenShadowing != 0)
                        {
                            ShadowFactor = 1.0;
                            bOffscreenShadowing = true;
                        }

                        if (bOffscreenShadowing)
                        {
                            ShadowFactor *= TraceOffscreenShadows(WorldPosition, L, ToLight, WorldNormal);
                        }
                    }
                    #endif // End hardware/software shadow selection        
                }
                #endif // End ShadowLight

                // Light function
                #if LIGHT_FUNCTION
                    ShadowFactor *= GetLightFunction(WorldPosition);
                #endif

                // Cloud transmittance
                #if USE_CLOUD_TRANSMITTANCE
                {
                    float OutOpticalDepth = 0.0f;
                    ShadowFactor *= lerp(1.0f, GetCloudVolumetricShadow(WorldPosition, CloudShadowmapWorldToLightClipMatrix, CloudShadowmapFarDepthKm, CloudShadowmapTexture, CloudShadowmapSampler, OutOpticalDepth), CloudShadowmapStrength);
                }
                #endif

                // IES
                if (UseIESProfile > 0)
                {
                    ShadowFactor *= ComputeLightProfileMultiplier(WorldPosition, DeferredLightUniforms.Position, -DeferredLightUniforms.Direction, DeferredLightUniforms.Tangent);
                }

                // Final irradiance
                float NoL = saturate(dot(WorldNormal, L));
                Irradiance = LightColor * (CombinedAttenuation * NoL * ShadowFactor);
                //Irradiance = bShadowFactorValid ? float3(0, 1, 0) : float3(0.2f, 0.0f, 0.0f);
            }
        }
    }
        
    OutColor = float4(Irradiance, 0);
}
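
The position reconstruction in the shader above can be isolated into a small CPU-side sketch. It only mirrors the arithmetic shown in the listing (atlas UV to card-local XY, stored depth to card-local Z). The `Float2`, `Float3` and `CardData` types and the function name are hypothetical stand-ins, not engine types, and the rotation into world space is left out.

```cpp
#include <cassert>

struct Float2 { float x, y; };
struct Float3 { float x, y, z; };

// Hypothetical stand-in for the few FLumenCardData fields the shader reads here.
struct CardData {
    Float2 uvScale;  // plays the role of LocalPositionToAtlasUVScale
    Float2 uvBias;   // plays the role of LocalPositionToAtlasUVBias
    float  extentZ;  // plays the role of LocalExtent.z
};

// storedDepth is the raw depth-atlas sample in [0, 1]; the shader inverts it
// (Depth = 1 - sample) before using it.
Float3 CardLocalPosition(const CardData& card, Float2 atlasUV, float storedDepth)
{
    assert(card.uvScale.x != 0.0f && card.uvScale.y != 0.0f);
    const float depth = 1.0f - storedDepth;

    Float3 p;
    // Undo the local-position-to-atlas-UV mapping.
    p.x = (atlasUV.x - card.uvBias.x) / card.uvScale.x;
    p.y = (atlasUV.y - card.uvBias.y) / card.uvScale.y;
    // Map depth in [0, 1] onto [-extentZ, +extentZ].
    p.z = -card.extentZ + depth * 2.0f * card.extentZ;
    return p;
}
```

In the shader, the resulting local position is then rotated and offset by the card origin to obtain `WorldPosition`, which feeds the attenuation and shadowing code.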

6.5.6.6 PrefilterLumenSceneLighting

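This pass resembles the Geometry Prefiltering step mentioned in 6.5.6.1 Voxel Cone Tracing: it generates the mip chain of the Lumen lighting atlases, one mip level per pass: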

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenScenePrefilter.cpp

void FDeferredShadingSceneRenderer::PrefilterLumenSceneLighting(
    FRDGBuilder& GraphBuilder,
    const FViewInfo& View,
    FLumenCardTracingInputs& TracingInputs,
    FGlobalShaderMap* GlobalShaderMap,
    const FLumenCardScatterContext& VisibleCardScatterContext)
{
    LLM_SCOPE_BYTAG(Lumen);
    RDG_EVENT_SCOPE(GraphBuilder, "Prefilter");

    FLumenSceneData& LumenSceneData = *Scene->LumenSceneData;

    // Derive the number of mips from the atlas resolution.
    const int32 NumMips = FMath::CeilLogTwo(FMath::Max(LumenSceneData.MaxAtlasSize.X, LumenSceneData.MaxAtlasSize.Y)) + 1;
    {
        FIntPoint SrcSize = LumenSceneData.MaxAtlasSize;
        FIntPoint DestSize = SrcSize / 2;

        // Loop NumMips - 1 times (mip 0 is the source texture itself), generating one mip per iteration.
        for (int32 MipIndex = 1; MipIndex < NumMips; MipIndex++)
        {
            SrcSize.X = FMath::Max(SrcSize.X, 1);
            SrcSize.Y = FMath::Max(SrcSize.Y, 1);
            DestSize.X = FMath::Max(DestSize.X, 1);
            DestSize.Y = FMath::Max(DestSize.Y, 1);

            FLumenCardPrefilterLighting* PassParameters = GraphBuilder.AllocParameters<FLumenCardPrefilterLighting>();
            
            // Bind the render targets, up to three: final lighting atlas, irradiance atlas, indirect irradiance atlas.
            PassParameters->RenderTargets[0] = FRenderTargetBinding(TracingInputs.FinalLightingAtlas, ERenderTargetLoadAction::ENoAction, MipIndex);
            bool bUseIrradianceAtlas = Lumen::UseIrradianceAtlas(View);
            bool bUseIndirectIrradianceAtlas = Lumen::UseIndirectIrradianceAtlas(View);
            if (bUseIrradianceAtlas)
            {
                PassParameters->RenderTargets[1] = FRenderTargetBinding(TracingInputs.IrradianceAtlas, ERenderTargetLoadAction::ENoAction, MipIndex);
                if (bUseIndirectIrradianceAtlas)
                {
                    PassParameters->RenderTargets[2] = FRenderTargetBinding(TracingInputs.IndirectIrradianceAtlas, ERenderTargetLoadAction::ENoAction, MipIndex);
                }
            }
            else if (bUseIndirectIrradianceAtlas)
            {
                PassParameters->RenderTargets[1] = FRenderTargetBinding(TracingInputs.IndirectIrradianceAtlas, ERenderTargetLoadAction::ENoAction, MipIndex);
            }
            PassParameters->VS.LumenCardScene = TracingInputs.LumenCardSceneUniformBuffer;
            PassParameters->VS.CardScatterParameters = VisibleCardScatterContext.Parameters;
            PassParameters->VS.ScatterInstanceIndex = 0;
            PassParameters->VS.CardUVSamplingOffset = FVector2D::ZeroVector;
            PassParameters->PS.View = View.ViewUniformBuffer;
            PassParameters->PS.LumenCardScene = TracingInputs.LumenCardSceneUniformBuffer;
            PassParameters->PS.ParentFinalLightingAtlas = GraphBuilder.CreateSRV(FRDGTextureSRVDesc::CreateForMipLevel(TracingInputs.FinalLightingAtlas, MipIndex - 1));
            // Note that the SRVs are created with CreateForMipLevel, so each pass reads only the parent mip (MipIndex - 1).
            if (bUseIrradianceAtlas)
            {
                PassParameters->PS.ParentIrradianceAtlas = GraphBuilder.CreateSRV(FRDGTextureSRVDesc::CreateForMipLevel(TracingInputs.IrradianceAtlas, MipIndex - 1));
            }
            if (bUseIndirectIrradianceAtlas)
            {
                PassParameters->PS.ParentIndirectIrradianceAtlas = GraphBuilder.CreateSRV(FRDGTextureSRVDesc::CreateForMipLevel(TracingInputs.IndirectIrradianceAtlas, MipIndex - 1));
            }
            PassParameters->PS.InvSize = FVector2D(1.0f / SrcSize.X, 1.0f / SrcSize.Y);

            FScene* LocalScene = Scene;

            // Add the prefilter pass.
            GraphBuilder.AddPass(
                RDG_EVENT_NAME("PrefilterMip"),
                PassParameters,
                ERDGPassFlags::Raster,
                [LocalScene, PassParameters, DestSize, GlobalShaderMap, bUseIrradianceAtlas, bUseIndirectIrradianceAtlas](FRHICommandListImmediate& RHICmdList)
            {
                FLumenCardPrefilterLightingPS::FPermutationDomain PermutationVector;
                PermutationVector.Set<FLumenCardPrefilterLightingPS::FUseIrradianceAtlas>(bUseIrradianceAtlas != 0);
                PermutationVector.Set<FLumenCardPrefilterLightingPS::FUseIndirectIrradianceAtlas>(bUseIndirectIrradianceAtlas != 0);
                auto PixelShader = GlobalShaderMap->GetShader< FLumenCardPrefilterLightingPS >(PermutationVector);
                DrawQuadsToAtlas(DestSize, PixelShader, PassParameters, GlobalShaderMap, TStaticBlendState<>::GetRHI(), RHICmdList);
            });

            SrcSize /= 2;
            DestSize /= 2;
        }
    }
}
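
To make the loop bounds concrete, the following sketch reproduces only the size bookkeeping of the function above: the mip count derived from the larger atlas dimension, and the destination size of each pass. `Size2D`, `CeilLogTwoSketch` and `PrefilterMipSizes` are hypothetical names. `CeilLogTwoSketch` is assumed to match `FMath::CeilLogTwo` for the small positive sizes used here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct Size2D { int32_t x, y; };

// Smallest n with 2^n >= value (0 for value <= 1).
int32_t CeilLogTwoSketch(uint32_t value)
{
    assert(value <= (1u << 31));  // keeps the shift below well defined
    int32_t n = 0;
    while ((1u << n) < value) { ++n; }
    return n;
}

// Returns the destination size of every prefilter pass (mip 1 .. NumMips - 1).
std::vector<Size2D> PrefilterMipSizes(Size2D atlasSize)
{
    assert(atlasSize.x > 0 && atlasSize.y > 0);
    const int32_t numMips =
        CeilLogTwoSketch(static_cast<uint32_t>(std::max(atlasSize.x, atlasSize.y))) + 1;

    std::vector<Size2D> sizes;
    Size2D dest{atlasSize.x / 2, atlasSize.y / 2};
    for (int32_t mip = 1; mip < numMips; ++mip) {
        // Non-square atlases reach 1 on the short axis first, so clamp to 1.
        dest.x = std::max(dest.x, 1);
        dest.y = std::max(dest.y, 1);
        sizes.push_back(dest);
        dest.x /= 2;
        dest.y /= 2;
    }
    return sizes;
}
```

For a 4096x2048 atlas this yields 12 passes, from 2048x1024 down to 1x1, which is why the clamp to 1 matters on the shorter axis.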

The shader it uses is as follows:

// Engine\Shaders\Private\Lumen\LumenSceneLighting.usf

Texture2D ParentFinalLightingAtlas;
Texture2D ParentIrradianceAtlas;
Texture2D ParentIndirectIrradianceAtlas;

void LumenCardPrefilterLightingPS(
    FCardVSToPS CardInterpolants,
    out float4 OutLighting : SV_Target0,
    out float4 OutColor1 : SV_Target1,
    out float4 OutColor2 : SV_Target2)
{
    // Sample the parent mip directly with bilinear filtering; unlike section 6.5.6.1, no Gaussian weights are used.
    OutLighting = Texture2DSampleLevel(ParentFinalLightingAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0);
#if USE_IRRADIANCE_ATLAS
    OutColor1 = Texture2DSampleLevel(ParentIrradianceAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0);
    #if USE_INDIRECTIRRADIANCE_ATLAS
        OutColor2 = Texture2DSampleLevel(ParentIndirectIrradianceAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0);
    #endif
#elif USE_INDIRECTIRRADIANCE_ATLAS
    OutColor1 = Texture2DSampleLevel(ParentIndirectIrradianceAtlas, GlobalBilinearClampedSampler, CardInterpolants.AtlasCoord, 0);
#endif
}
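
A single bilinear tap per output texel behaves like a 2x2 box filter when the parent mip has even dimensions and the destination texel centre lands on the shared corner of a 2x2 parent block. The sketch below shows that equivalent CPU computation for one channel. It assumes this idealized full-texture case and ignores per-card quads and atlas padding. `Image` and `DownsampleBox2x2` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Single-channel image stored row by row.
struct Image {
    int width;
    int height;
    std::vector<float> texels;

    float at(int x, int y) const
    {
        return texels[static_cast<std::size_t>(y) * static_cast<std::size_t>(width)
                      + static_cast<std::size_t>(x)];
    }
};

// Averages each 2x2 block of the parent mip into one texel of the child mip.
Image DownsampleBox2x2(const Image& src)
{
    assert(src.width > 0 && src.height > 0);
    assert(src.width % 2 == 0 && src.height % 2 == 0);

    Image dst{src.width / 2, src.height / 2, {}};
    dst.texels.resize(static_cast<std::size_t>(dst.width) * static_cast<std::size_t>(dst.height));

    for (int y = 0; y < dst.height; ++y) {
        for (int x = 0; x < dst.width; ++x) {
            const float sum = src.at(2 * x, 2 * y) + src.at(2 * x + 1, 2 * y)
                            + src.at(2 * x, 2 * y + 1) + src.at(2 * x + 1, 2 * y + 1);
            dst.texels[static_cast<std::size_t>(y) * static_cast<std::size_t>(dst.width)
                       + static_cast<std::size_t>(x)] = 0.25f * sum;
        }
    }
    return dst;
}
```

A Gaussian prefilter would instead read a wider neighbourhood with unequal weights; the box average trades some filtering quality for a single texture fetch per output texel.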

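A frame capture shows that the number of PrefilterMip passes matches the number of mip levels generated for the texture (one pass for every level above mip 0):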

6.5.6.7 ComputeLumenSceneVoxelLighting

ComputeLumenSceneVoxelLighting computes the voxel lighting of the Lumen scene. The code is as follows:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenVoxelLighting.cpp

void FDeferredShadingSceneRenderer::ComputeLumenSceneVoxelLighting(
    FRDGBuilder& GraphBuilder,
    FLumenCardTracingInputs& TracingInputs,
    FGlobalShaderMap* GlobalShaderMap)
{
    LLM_SCOPE_BYTAG(Lumen);

    const FViewInfo& View = Views[0];

    const int32 ClampedNumClipmapLevels = GetNumLumenVoxelClipmaps();
    const FIntVector ClipmapResolution = GetClipmapResolution();
    bool bForceFullUpdate = GLumenSceneVoxelLightingForceFullUpdate != 0;

    // Set up the voxel lighting 3D texture.
    FRDGTextureRef VoxelLighting = TracingInputs.VoxelLighting;
    {
        FRDGTextureDesc LightingDesc(FRDGTextureDesc::Create3D(
            FIntVector(
                ClipmapResolution.X,
                ClipmapResolution.Y * ClampedNumClipmapLevels,
                ClipmapResolution.Z * GNumVoxelDirections),
            PF_FloatRGBA,
            FClearValueBinding::Black,
            TexCreate_ShaderResource | TexCreate_UAV | TexCreate_3DTiling));

        if (!VoxelLighting || VoxelLighting->Desc != LightingDesc)
        {
            bForceFullUpdate = true;
            VoxelLighting = GraphBuilder.CreateTexture(LightingDesc, TEXT("Lumen.VoxelLighting"));
        }
    }

    // Set up the visibility buffer 3D texture.
    FRDGTextureRef VoxelVisBuffer = View.ViewState->Lumen.VoxelVisBuffer ? GraphBuilder.RegisterExternalTexture(View.ViewState->Lumen.VoxelVisBuffer) : nullptr;
    {
        FRDGTextureDesc VoxelVisBufferDesc(FRDGTextureDesc::Create3D(
            FIntVector(
                ClipmapResolution.X,
                ClipmapResolution.Y * ClampedNumClipmapLevels,
                ClipmapResolution.Z * GNumVoxelDirections),
            PF_R32_UINT,
            FClearValueBinding::Black,
            TexCreate_ShaderResource | TexCreate_UAV | TexCreate_3DTiling));

        if (!VoxelVisBuffer
            || VoxelVisBuffer->Desc.Extent != VoxelVisBufferDesc.Extent
            || VoxelVisBuffer->Desc.Depth != VoxelVisBufferDesc.Depth)
        {
            bForceFullUpdate = true;
            VoxelVisBuffer = GraphBuilder.CreateTexture(VoxelVisBufferDesc, TEXT("Lumen.VoxelVisBuffer"));

            uint32 VisBufferClearValue[4] = { 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF };
            AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(VoxelVisBuffer), VisBufferClearValue);
        }
    }

    // The visibility buffer is only valid for a specific scene; force a full update if the scene has changed.
    if (View.ViewState->Lumen.VoxelVisBufferCachedScene != Scene)
    {
        bForceFullUpdate = true;
        View.ViewState->Lumen.VoxelVisBufferCachedScene = Scene;
    }

    // Collect the clipmaps that need updating this frame.
    TArray<int32, SceneRenderingAllocator> ClipmapsToUpdate;
    ClipmapsToUpdate.Empty(ClampedNumClipmapLevels);

    for (int32 ClipmapIndex = 0; ClipmapIndex < ClampedNumClipmapLevels; ClipmapIndex++)
    {
        if (bForceFullUpdate || ShouldUpdateVoxelClipmap(ClipmapIndex, ClampedNumClipmapLevels, View.ViewState->GetFrameIndex()))
        {
            ClipmapsToUpdate.Add(ClipmapIndex);
        }
    }

    ensureMsgf(bForceFullUpdate || ClipmapsToUpdate.Num() <= 1, TEXT("Tweak ShouldUpdateVoxelClipmap for better clipmap update distribution"));

    FString ClipmapsToUpdateString;

    for (int32 ToUpdateIndex = 0; ToUpdateIndex < ClipmapsToUpdate.Num(); ++ToUpdateIndex)
    {
        ClipmapsToUpdateString += FString::FromInt(ClipmapsToUpdate[ToUpdateIndex]);
        if (ToUpdateIndex + 1 < ClipmapsToUpdate.Num())
        {
            ClipmapsToUpdateString += TEXT(",");
        }
    }

    RDG_EVENT_SCOPE(GraphBuilder, "VoxelizeCards Clipmaps=[%s]", *ClipmapsToUpdateString);

    // Update and voxelize the visibility buffer.
    if (ClipmapsToUpdate.Num() > 0)
    {
        TracingInputs.VoxelLighting = VoxelLighting;
        TracingInputs.VoxelGridResolution = GetClipmapResolution();
        TracingInputs.NumClipmapLevels = ClampedNumClipmapLevels;

        // Update the visibility buffer.
        UpdateVoxelVisBuffer(GraphBuilder, Scene, View, TracingInputs, VoxelVisBuffer, ClipmapsToUpdate, bForceFullUpdate);
        // Voxelize the visibility buffer.
        VoxelizeVisBuffer(View, Scene, TracingInputs, VoxelLighting, VoxelVisBuffer, ClipmapsToUpdate, GraphBuilder);

        ConvertToExternalTexture(GraphBuilder, VoxelLighting, View.ViewState->Lumen.VoxelLighting);
        View.ViewState->Lumen.VoxelGridResolution = TracingInputs.VoxelGridResolution;
        View.ViewState->Lumen.NumClipmapLevels = TracingInputs.NumClipmapLevels;
    }

    ConvertToExternalTexture(GraphBuilder, VoxelVisBuffer, View.ViewState->Lumen.VoxelVisBuffer);
}
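
Two details of the function above lend themselves to a small sketch: how all clipmaps and directions are packed into one 3D texture, and how clipmap updates can be spread across frames so that the `ensureMsgf` condition (at most one clipmap per frame unless a full update is forced) holds. The body of `ShouldUpdateVoxelClipmap` is not shown in the text, so the schedule below is a hypothetical one that merely satisfies that condition. All names here are illustrative, and the direction count is passed in rather than assumed.

```cpp
#include <cassert>
#include <cstdint>

struct Extent3D { int32_t x, y, z; };

// Mirrors the descriptor in the listing: clipmaps are stacked along Y and
// voxel directions along Z.
Extent3D VoxelAtlasExtent(Extent3D clipmapResolution, int32_t numClipmaps, int32_t numDirections)
{
    assert(numClipmaps > 0 && numDirections > 0);
    return Extent3D{clipmapResolution.x,
                    clipmapResolution.y * numClipmaps,
                    clipmapResolution.z * numDirections};
}

// Hypothetical schedule: clipmap i updates every 2^(i+1) frames, and the last
// clipmap takes the one remaining slot, so exactly one clipmap updates per frame.
bool ShouldUpdateClipmapSketch(int32_t clipmapIndex, int32_t numClipmaps, uint32_t frame)
{
    assert(numClipmaps > 0 && numClipmaps < 32);
    assert(clipmapIndex >= 0 && clipmapIndex < numClipmaps);

    const bool isLast = (clipmapIndex == numClipmaps - 1);
    const uint32_t period = 1u << (isLast ? clipmapIndex : clipmapIndex + 1);
    const uint32_t phase = (1u << clipmapIndex) - 1u;
    return frame % period == phase;
}

// Counts how many clipmaps the sketch schedule would update on a given frame.
int32_t CountClipmapsToUpdate(int32_t numClipmaps, uint32_t frame)
{
    int32_t count = 0;
    for (int32_t i = 0; i < numClipmaps; ++i) {
        if (ShouldUpdateClipmapSketch(i, numClipmaps, frame)) { ++count; }
    }
    return count;
}
```

With four clipmaps, this schedule refreshes clipmap 0 every second frame, clipmap 1 every fourth frame, and clipmaps 2 and 3 every eighth frame, so the finest clipmap (nearest the camera) is refreshed most often.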

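The code above updates and voxelizes the visibility buffer. Those two routines are not analyzed in detail here, but a frame capture shows the following passes: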

The final stage of UpdateVoxelVisBuffer, VoxelTraceCS, takes the distance field brick 3D texture as input and writes the VoxelVisBuffer 3D texture:

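The final stage of VoxelizeVisBuffer, VisBufferShading, takes SceneFinalLighting, SceneOpacity, SceneDepth, the distance field brick 3D texture, and VoxelVisBuffer as inputs and writes the VoxelLighting 3D texture. After this stage, the lighting of the Lumen scene is stored in the voxelized 3D texture: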

6.5.7 Lumen Indirect Lighting

6.5.7.1 RenderDiffuseIndirectAndAmbientOcclusion

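This stage uses the data Lumen generated in the earlier passes to compute the final indirect lighting, which approximates global illumination. Its passes are shown below: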

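It consists of several stages: SSGI denoising, screen probe gather, reflections, and indirect lighting composition. The corresponding source code, RenderDiffuseIndirectAndAmbientOcclusion, is as follows: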

// Engine\Source\Runtime\Renderer\Private\IndirectLightRendering.cpp

void FDeferredShadingSceneRenderer::RenderDiffuseIndirectAndAmbientOcclusion(
    FRDGBuilder& GraphBuilder,
    FSceneTextures& SceneTextures,
    FRDGTextureRef LightingChannelsTexture,
    bool bIsVisualizePass)
{
    using namespace HybridIndirectLighting;

    if (ViewFamily.EngineShowFlags.VisualizeLumenIndirectDiffuse != bIsVisualizePass)
    {
        return;
    }

    RDG_EVENT_SCOPE(GraphBuilder, "DiffuseIndirectAndAO");

    FSceneTextureParameters SceneTextureParameters = GetSceneTextureParameters(GraphBuilder, SceneTextures.UniformBuffer);
    FRDGTextureRef SceneColorTexture = SceneTextures.Color.Target;

    const FRDGSystemTextures& SystemTextures = FRDGSystemTextures::Get(GraphBuilder);

    // Each view is processed separately.
    for (FViewInfo& View : Views)
    {
        RDG_GPU_MASK_SCOPE(GraphBuilder, View.GPUMask);

        const FPerViewPipelineState& ViewPipelineState = GetViewPipelineState(View);

        int32 DenoiseMode = CVarDiffuseIndirectDenoiser.GetValueOnRenderThread();

        // Set up the common diffuse indirect parameters.
        FCommonParameters CommonDiffuseParameters;
        SetupCommonDiffuseIndirectParameters(GraphBuilder, SceneTextureParameters, View, /* out */ CommonDiffuseParameters);

        // Fill in the legacy ray tracing config for the denoiser.
        IScreenSpaceDenoiser::FAmbientOcclusionRayTracingConfig RayTracingConfig;
        {
            RayTracingConfig.RayCountPerPixel = CommonDiffuseParameters.RayCountPerPixel;
            RayTracingConfig.ResolutionFraction = 1.0f / float(CommonDiffuseParameters.DownscaleFactor);
        }

        // Previous frame's scene color
        ScreenSpaceRayTracing::FPrevSceneColorMip PrevSceneColorMip;
        if ((ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen || ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::SSGI) && View.PrevViewInfo.ScreenSpaceRayTracingInput.IsValid())
        {
            PrevSceneColorMip = ScreenSpaceRayTracing::ReducePrevSceneColorMip(GraphBuilder, SceneTextureParameters, View);
        }

        // Denoiser inputs and outputs
        FSSDSignalTextures DenoiserOutputs;
        IScreenSpaceDenoiser::FDiffuseIndirectInputs DenoiserInputs;
        IScreenSpaceDenoiser::FDiffuseIndirectHarmonic DenoiserSphericalHarmonicInputs;
        FLumenReflectionCompositeParameters LumenReflectionCompositeParameters;
        bool bLumenUseDenoiserComposite = ViewPipelineState.bUseLumenProbeHierarchy;

        // Fill the denoiser inputs or outputs depending on the indirect lighting method.
        
        // Lumen probe hierarchy
        if (ViewPipelineState.bUseLumenProbeHierarchy)
        {
            check(ViewPipelineState.DiffuseIndirectDenoiser == IScreenSpaceDenoiser::EMode::Disabled);
            DenoiserOutputs = RenderLumenProbeHierarchy(
                GraphBuilder,
                SceneTextures,
                CommonDiffuseParameters, PrevSceneColorMip,
                View, &View.PrevViewInfo);
        }
        // Screen space global illumination (SSGI)
        else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::SSGI)
        {
            RDG_EVENT_SCOPE(GraphBuilder, "SSGI %dx%d", CommonDiffuseParameters.TracingViewportSize.X, CommonDiffuseParameters.TracingViewportSize.Y);
            DenoiserInputs = ScreenSpaceRayTracing::CastStandaloneDiffuseIndirectRays(
                GraphBuilder, CommonDiffuseParameters, PrevSceneColorMip, View);
        }
        // Ray traced global illumination (RTGI)
        else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::RTGI)
        {
            // TODO: Refactor under the HybridIndirectLighting standard API.
            // TODO: hybrid SSGI / RTGI
            RenderRayTracingGlobalIllumination(GraphBuilder, SceneTextureParameters, View, /* out */ &RayTracingConfig, /* out */ &DenoiserInputs);
        }
        // Lumen global illumination
        else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
        {
            check(ViewPipelineState.DiffuseIndirectDenoiser == IScreenSpaceDenoiser::EMode::Disabled);

            FLumenMeshSDFGridParameters MeshSDFGridParameters;

            DenoiserOutputs = RenderLumenScreenProbeGather(
                GraphBuilder, 
                SceneTextures,
                PrevSceneColorMip, 
                LightingChannelsTexture,
                View,
                &View.PrevViewInfo,
                bLumenUseDenoiserComposite,
                MeshSDFGridParameters);

            if (ViewPipelineState.ReflectionsMethod == EReflectionsMethod::Lumen)
            {
                DenoiserOutputs.Textures[2] = RenderLumenReflections(
                    GraphBuilder,
                    View,
                    SceneTextures, 
                    MeshSDFGridParameters,
                    LumenReflectionCompositeParameters);
            }

            if (!DenoiserOutputs.Textures[2])
            {
                DenoiserOutputs.Textures[2] = DenoiserOutputs.Textures[1];
            }
        }

        FRDGTextureRef AmbientOcclusionMask = DenoiserInputs.AmbientOcclusionMask;

        // Denoise.
        if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
        {
            // Lumen's output is already denoised, so nothing needs to be done here.
        }
        else if (ViewPipelineState.DiffuseIndirectDenoiser == IScreenSpaceDenoiser::EMode::Disabled)
        {
            DenoiserOutputs.Textures[0] = DenoiserInputs.Color;
            DenoiserOutputs.Textures[1] = SystemTextures.White;
        }
        else
        {
            const IScreenSpaceDenoiser* DefaultDenoiser = IScreenSpaceDenoiser::GetDefaultDenoiser();
            const IScreenSpaceDenoiser* DenoiserToUse = 
                ViewPipelineState.DiffuseIndirectDenoiser == IScreenSpaceDenoiser::EMode::DefaultDenoiser
                ? DefaultDenoiser : GScreenSpaceDenoiser;

            RDG_EVENT_SCOPE(GraphBuilder, "%s%s(DiffuseIndirect) %dx%d",
                DenoiserToUse != DefaultDenoiser ? TEXT("ThirdParty ") : TEXT(""),
                DenoiserToUse->GetDebugName(),
                View.ViewRect.Width(), View.ViewRect.Height());

            if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::RTGI)
            {
                // Denoise the RTGI result.
                DenoiserOutputs = DenoiserToUse->DenoiseDiffuseIndirect(
                    GraphBuilder,
                    View,
                    &View.PrevViewInfo,
                    SceneTextureParameters,
                    DenoiserInputs,
                    RayTracingConfig);

                AmbientOcclusionMask = DenoiserOutputs.Textures[1];
            }
            else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::SSGI)
            {
                // Denoise the SSGI result.
                DenoiserOutputs = DenoiserToUse->DenoiseScreenSpaceDiffuseIndirect(
                    GraphBuilder,
                    View,
                    &View.PrevViewInfo,
                    SceneTextureParameters,
                    DenoiserInputs,
                    RayTracingConfig);

                AmbientOcclusionMask = DenoiserOutputs.Textures[1];
            }
        }

        // Render ambient occlusion
        bool bWritableAmbientOcclusionMask = true;
        if (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::Disabled)
        {
            ensure(!HasBeenProduced(SceneTextures.ScreenSpaceAO));
            AmbientOcclusionMask = nullptr;
            bWritableAmbientOcclusionMask = false;
        }
        else if (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::RTAO)
        {
            RenderRayTracingAmbientOcclusion(
                GraphBuilder,
                View,
                SceneTextureParameters,
                &AmbientOcclusionMask);
        }
        else if (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::SSGI)
        {
            check(AmbientOcclusionMask);
        }
        else if (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::SSAO)
        {
            // Fetch result of SSAO that was done earlier.
            if (HasBeenProduced(SceneTextures.ScreenSpaceAO))
            {
                AmbientOcclusionMask = SceneTextures.ScreenSpaceAO;
            }
            else
            {
                AmbientOcclusionMask = GetScreenSpaceAOFallback(SystemTextures);
                bWritableAmbientOcclusionMask = false;
            }
        }
        else
        {
            unimplemented();
            bWritableAmbientOcclusionMask = false;
        }

        // Extract the dynamic AO for application of AO beyond RenderDiffuseIndirectAndAmbientOcclusion()
        if (AmbientOcclusionMask && ViewPipelineState.AmbientOcclusionMethod != EAmbientOcclusionMethod::SSAO)
        {
            ensureMsgf(Views.Num() == 1, TEXT("Need to add support for one AO texture per view in FSceneTextures"));
            SceneTextures.ScreenSpaceAO = AmbientOcclusionMask;
        }

        if (HairStrands::HasViewHairStrandsData(View) && (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::SSGI || ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::SSAO) && bWritableAmbientOcclusionMask)
        {
            RenderHairStrandsAmbientOcclusion(
                GraphBuilder,
                View,
                AmbientOcclusionMask);
        }

        // Apply diffuse indirect lighting and ambient occlusion to the scene color.
        if ((DenoiserOutputs.Textures[0] || AmbientOcclusionMask) && (!bIsVisualizePass || ViewPipelineState.DiffuseIndirectDenoiser != IScreenSpaceDenoiser::EMode::Disabled || ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
            && !IsMetalPlatform(ShaderPlatform))
        {
            // The pixel shader used is FDiffuseIndirectCompositePS.
            FDiffuseIndirectCompositePS::FParameters* PassParameters = GraphBuilder.AllocParameters<FDiffuseIndirectCompositePS::FParameters>();
            
            PassParameters->AmbientOcclusionStaticFraction = FMath::Clamp(View.FinalPostProcessSettings.AmbientOcclusionStaticFraction, 0.0f, 1.0f);

            PassParameters->ApplyAOToDynamicDiffuseIndirect = 0.0f;

            if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
            {
                PassParameters->ApplyAOToDynamicDiffuseIndirect = 1.0f;
            }

            const FIntPoint BufferExtent = SceneTextureParameters.SceneDepthTexture->Desc.Extent;

            {
                // Placeholder texture for textures pulled in from SSDCommon.ush
                FRDGTextureDesc Desc = FRDGTextureDesc::Create2D(
                    FIntPoint(1),
                    PF_R32_UINT,
                    FClearValueBinding::Black,
                    TexCreate_ShaderResource);
                FRDGTextureRef CompressedMetadataPlaceholder = GraphBuilder.CreateTexture(Desc, TEXT("CompressedMetadataPlaceholder"));

                PassParameters->CompressedMetadata[0] = CompressedMetadataPlaceholder;
                PassParameters->CompressedMetadata[1] = CompressedMetadataPlaceholder;
            }

            PassParameters->BufferUVToOutputPixelPosition = BufferExtent;
            PassParameters->EyeAdaptation = GetEyeAdaptationTexture(GraphBuilder, View);
            PassParameters->LumenReflectionCompositeParameters = LumenReflectionCompositeParameters;

            PassParameters->bVisualizeDiffuseIndirect = bIsVisualizePass;

            PassParameters->DiffuseIndirect = DenoiserOutputs;
            PassParameters->DiffuseIndirectSampler = TStaticSamplerState<SF_Point>::GetRHI();

            PassParameters->PreIntegratedGF = GSystemTextures.PreintegratedGF->GetRenderTargetItem().ShaderResourceTexture;
            PassParameters->PreIntegratedGFSampler = TStaticSamplerState<SF_Bilinear, AM_Clamp, AM_Clamp, AM_Clamp>::GetRHI();

            PassParameters->AmbientOcclusionTexture = AmbientOcclusionMask;
            PassParameters->AmbientOcclusionSampler = TStaticSamplerState<SF_Point>::GetRHI();
            
            if (!PassParameters->AmbientOcclusionTexture || bIsVisualizePass)
            {
                PassParameters->AmbientOcclusionTexture = SystemTextures.White;
            }

            // Set up the denoiser's common shader parameters.
            Denoiser::SetupCommonShaderParameters(
                View, SceneTextureParameters,
                View.ViewRect,
                1.0f / CommonDiffuseParameters.DownscaleFactor,
                /* out */ &PassParameters->DenoiserCommonParameters);
            PassParameters->SceneTextures = SceneTextureParameters;
            PassParameters->ViewUniformBuffer = View.ViewUniformBuffer;

            PassParameters->RenderTargets[0] = FRenderTargetBinding(
                SceneColorTexture, ERenderTargetLoadAction::ELoad);

            {
                FRDGTextureDesc Desc = FRDGTextureDesc::Create2D(
                    SceneColorTexture->Desc.Extent,
                    PF_FloatRGBA,
                    FClearValueBinding::None,
                    TexCreate_ShaderResource | TexCreate_UAV);

                PassParameters->PassDebugOutput = GraphBuilder.CreateUAV(
                    GraphBuilder.CreateTexture(Desc, TEXT("DebugDiffuseIndirectComposite")));
            }

            const TCHAR* DiffuseIndirectSampling = TEXT("Disabled");
            FDiffuseIndirectCompositePS::FPermutationDomain PermutationVector;
            bool bUpscale = false;

            if (DenoiserOutputs.Textures[0])
            {
                if (bLumenUseDenoiserComposite)
                {
                    PermutationVector.Set<FDiffuseIndirectCompositePS::FApplyDiffuseIndirectDim>(2);
                    DiffuseIndirectSampling = TEXT("ProbeHierarchy");
                }
                else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::RTGI)
                {
                    PermutationVector.Set<FDiffuseIndirectCompositePS::FApplyDiffuseIndirectDim>(3);
                    DiffuseIndirectSampling = TEXT("RTGI");
                }
                else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
                {
                    PermutationVector.Set<FDiffuseIndirectCompositePS::FApplyDiffuseIndirectDim>(4);
                    DiffuseIndirectSampling = TEXT("ScreenProbeGather");
                }
                else
                {
                    PermutationVector.Set<FDiffuseIndirectCompositePS::FApplyDiffuseIndirectDim>(1);
                    DiffuseIndirectSampling = TEXT("SSGI");
                    bUpscale = DenoiserOutputs.Textures[0]->Desc.Extent != SceneColorTexture->Desc.Extent;
                }

                PermutationVector.Set<FDiffuseIndirectCompositePS::FUpscaleDiffuseIndirectDim>(bUpscale);
            }

            TShaderMapRef<FDiffuseIndirectCompositePS> PixelShader(View.ShaderMap, PermutationVector);
            // Clear shader resource bindings that this permutation does not use.
            ClearUnusedGraphResources(PixelShader, PassParameters);

            FRHIBlendState* BlendState = TStaticBlendState<CW_RGBA, BO_Add, BF_One, BF_Source1Color, BO_Add, BF_One, BF_Source1Alpha>::GetRHI();

            if (bIsVisualizePass)
            {
                BlendState = TStaticBlendState<>::GetRHI();
            }

            // Indirect lighting composite pass.
            FPixelShaderUtils::AddFullscreenPass(
                GraphBuilder,
                View.ShaderMap,
                RDG_EVENT_NAME(
                    "DiffuseIndirectComposite(DiffuseIndirect=%s%s%s%s) %dx%d",
                    DiffuseIndirectSampling,
                    PermutationVector.Get<FDiffuseIndirectCompositePS::FUpscaleDiffuseIndirectDim>() ? TEXT(" UpscaleDiffuseIndirect") : TEXT(""),
                    AmbientOcclusionMask ? TEXT(" ApplyAOToSceneColor") : TEXT(""),
                    PassParameters->ApplyAOToDynamicDiffuseIndirect > 0.0f ? TEXT(" ApplyAOToDynamicDiffuseIndirect") : TEXT(""),
                    View.ViewRect.Width(), View.ViewRect.Height()),
                PixelShader,
                PassParameters,
                View.ViewRect,
                BlendState);
        } // if (DenoiserOutputs.Color || bApplySSAO)

        // Apply the ambient cubemap.
        if (IsAmbientCubemapPassRequired(View) && !bIsVisualizePass && !ViewPipelineState.bUseLumenProbeHierarchy)
        {
            FAmbientCubemapCompositePS::FParameters* PassParameters = GraphBuilder.AllocParameters<FAmbientCubemapCompositePS::FParameters>();
            
            PassParameters->PreIntegratedGF = GSystemTextures.PreintegratedGF->GetRenderTargetItem().ShaderResourceTexture;
            PassParameters->PreIntegratedGFSampler = TStaticSamplerState<SF_Bilinear, AM_Clamp, AM_Clamp, AM_Clamp>::GetRHI();
            
            PassParameters->AmbientOcclusionTexture = AmbientOcclusionMask;
            PassParameters->AmbientOcclusionSampler = TStaticSamplerState<SF_Point>::GetRHI();
            
            if (!PassParameters->AmbientOcclusionTexture)
            {
                PassParameters->AmbientOcclusionTexture = SystemTextures.White;
            }

            PassParameters->SceneTextures = SceneTextureParameters;
            PassParameters->ViewUniformBuffer = View.ViewUniformBuffer;

            PassParameters->RenderTargets[0] = FRenderTargetBinding(
                SceneColorTexture, ERenderTargetLoadAction::ELoad);
        
            TShaderMapRef<FAmbientCubemapCompositePS> PixelShader(View.ShaderMap);
            GraphBuilder.AddPass(
                RDG_EVENT_NAME("AmbientCubemapComposite %dx%d", View.ViewRect.Width(), View.ViewRect.Height()),
                PassParameters,
                ERDGPassFlags::Raster,
                [PassParameters, &View, PixelShader](FRHICommandList& RHICmdList)
            {
                TShaderMapRef<FPostProcessVS> VertexShader(View.ShaderMap);
                
                RHICmdList.SetViewport(View.ViewRect.Min.X, View.ViewRect.Min.Y, 0.0f, View.ViewRect.Max.X, View.ViewRect.Max.Y, 0.0);

                FGraphicsPipelineStateInitializer GraphicsPSOInit;
                RHICmdList.ApplyCachedRenderTargets(GraphicsPSOInit);

                // set the state
                GraphicsPSOInit.BlendState = TStaticBlendState<CW_RGB, BO_Add, BF_One, BF_One, BO_Add, BF_One, BF_One>::GetRHI();
                GraphicsPSOInit.RasterizerState = TStaticRasterizerState<>::GetRHI();
                GraphicsPSOInit.DepthStencilState = TStaticDepthStencilState<false, CF_Always>::GetRHI();

                GraphicsPSOInit.BoundShaderState.VertexDeclarationRHI = GFilterVertexDeclaration.VertexDeclarationRHI;
                GraphicsPSOInit.BoundShaderState.VertexShaderRHI = VertexShader.GetVertexShader();
                GraphicsPSOInit.BoundShaderState.PixelShaderRHI = PixelShader.GetPixelShader();
                GraphicsPSOInit.PrimitiveType = PT_TriangleList;

                SetGraphicsPipelineState(RHICmdList, GraphicsPSOInit);

                uint32 Count = View.FinalPostProcessSettings.ContributingCubemaps.Num();
                for (const FFinalPostProcessSettings::FCubemapEntry& CubemapEntry : View.FinalPostProcessSettings.ContributingCubemaps)
                {
                    FAmbientCubemapCompositePS::FParameters ShaderParameters = *PassParameters;
                    SetupAmbientCubemapParameters(CubemapEntry, &ShaderParameters.AmbientCubemap);
                    SetShaderParameters(RHICmdList, PixelShader, PixelShader.GetPixelShader(), ShaderParameters);
                    
                    DrawPostProcessPass(
                        RHICmdList,
                        0, 0,
                        View.ViewRect.Width(), View.ViewRect.Height(),
                        View.ViewRect.Min.X, View.ViewRect.Min.Y,
                        View.ViewRect.Width(), View.ViewRect.Height(),
                        View.ViewRect.Size(),
                        GetSceneTextureExtent(),
                        VertexShader,
                        View.StereoPass, 
                        false, // TODO.
                        EDRF_UseTriangleOptimization);
                }
            });
        } // if (IsAmbientCubemapPassRequired(View))
    } // for (FViewInfo& View : Views)
}
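The blend state chosen for the composite pass above, `TStaticBlendState<CW_RGBA, BO_Add, BF_One, BF_Source1Color, ...>`, is a dual-source blend: the first pixel shader output is added to the scene color, while the second output scales the scene color that is already in the render target. The following is only a minimal CPU-side sketch of that arithmetic. `Rgb`, `BlendDualSource`, and `BlendOpaque` are hypothetical names introduced for illustration and are not engine code; what the shader actually writes to its second output is not shown in this excerpt.

```cpp
// Hypothetical stand-in for a linear RGB color; this is not a UE type.
struct Rgb
{
    float R, G, B;
};

// Color half of TStaticBlendState<CW_RGBA, BO_Add, BF_One, BF_Source1Color, ...>:
//   Result = Src0 * One + Dest * Src1
// Src0: first pixel shader output (lighting to add).
// Src1: second pixel shader output (per-channel scale for the existing scene color).
// Dest: scene color already stored in the render target.
constexpr Rgb BlendDualSource(Rgb Src0, Rgb Src1, Rgb Dest)
{
    return Rgb{
        Src0.R + Dest.R * Src1.R,
        Src0.G + Dest.G * Src1.G,
        Src0.B + Dest.B * Src1.B};
}

// Default TStaticBlendState<> (used when bIsVisualizePass is set): blending is
// disabled, so the shader output simply overwrites the destination.
constexpr Rgb BlendOpaque(Rgb Src0, Rgb /*Dest*/)
{
    return Src0;
}
```

With this form a single fullscreen pass can both darken the existing scene color and add the new indirect lighting, instead of needing two passes.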

6.5.7.2 RenderLumenScreenProbeGather

RenderLumenScreenProbeGather renders the Lumen screen-space probe gather. Its code is as follows:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenScreenProbeGather.cpp

FSSDSignalTextures FDeferredShadingSceneRenderer::RenderLumenScreenProbeGather(
    FRDGBuilder& GraphBuilder,
    const FSceneTextures& SceneTextures,
    const ScreenSpaceRayTracing::FPrevSceneColorMip& PrevSceneColorMip,
    FRDGTextureRef LightingChannelsTexture,
    const FViewInfo& View,
    FPreviousViewInfo* PreviousViewInfos,
    bool& bLumenUseDenoiserComposite,
    FLumenMeshSDFGridParameters& MeshSDFGridParameters)
{
    LLM_SCOPE_BYTAG(Lumen);

    // Render the Lumen irradiance field gather.
    if (GLumenIrradianceFieldGather != 0)
    {
        bLumenUseDenoiserComposite = false;
        return RenderLumenIrradianceFieldGather(GraphBuilder, SceneTextures, View);
    }

    RDG_EVENT_SCOPE(GraphBuilder, "LumenScreenProbeGather");
    RDG_GPU_STAT_SCOPE(GraphBuilder, LumenScreenProbeGather);

    check(ShouldRenderLumenDiffuseGI(Scene, View, true));
    const FRDGSystemTextures& SystemTextures = FRDGSystemTextures::Get(GraphBuilder);

    if (!LightingChannelsTexture)
    {
        LightingChannelsTexture = SystemTextures.Black;
    }

    // If LumenScreenProbeGather is disabled, just clear the denoiser inputs and return.
    if (!GLumenScreenProbeGather)
    {
        FSSDSignalTextures ScreenSpaceDenoiserInputs;
        ScreenSpaceDenoiserInputs.Textures[0] = SystemTextures.Black;
        FRDGTextureDesc RoughSpecularIndirectDesc = FRDGTextureDesc::Create2D(SceneTextures.Config.Extent, PF_FloatRGB, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV);
        ScreenSpaceDenoiserInputs.Textures[1] = GraphBuilder.CreateTexture(RoughSpecularIndirectDesc, TEXT("Lumen.ScreenProbeGather.RoughSpecularIndirect"));
        AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenSpaceDenoiserInputs.Textures[1])), FLinearColor::Black);
        bLumenUseDenoiserComposite = false;
        return ScreenSpaceDenoiserInputs;
    }

    // Pull from the uniform buffer to get fallback textures.
    const FSceneTextureParameters SceneTextureParameters = GetSceneTextureParameters(GraphBuilder, SceneTextures.UniformBuffer);

    // Set up the screen-space probe parameters.
    FScreenProbeParameters ScreenProbeParameters;
    ScreenProbeParameters.ScreenProbeTracingOctahedronResolution = LumenScreenProbeGather::GetTracingOctahedronResolution(View);
    ensureMsgf(ScreenProbeParameters.ScreenProbeTracingOctahedronResolution < (1 << 6) - 1, TEXT("Tracing resolution %u was larger than supported by PackRayInfo()"), ScreenProbeParameters.ScreenProbeTracingOctahedronResolution);
    ScreenProbeParameters.ScreenProbeGatherOctahedronResolution = LumenScreenProbeGather::GetGatherOctahedronResolution(ScreenProbeParameters.ScreenProbeTracingOctahedronResolution);
    ScreenProbeParameters.ScreenProbeGatherOctahedronResolutionWithBorder = ScreenProbeParameters.ScreenProbeGatherOctahedronResolution + 2 * (1 << (GLumenScreenProbeGatherNumMips - 1));
    ScreenProbeParameters.ScreenProbeDownsampleFactor = LumenScreenProbeGather::GetScreenDownsampleFactor(View);

    ScreenProbeParameters.ScreenProbeViewSize = FIntPoint::DivideAndRoundUp(View.ViewRect.Size(), (int32)ScreenProbeParameters.ScreenProbeDownsampleFactor);
    ScreenProbeParameters.ScreenProbeAtlasViewSize = ScreenProbeParameters.ScreenProbeViewSize;
    ScreenProbeParameters.ScreenProbeAtlasViewSize.Y += FMath::TruncToInt(ScreenProbeParameters.ScreenProbeViewSize.Y * GLumenScreenProbeGatherAdaptiveProbeAllocationFraction);

    ScreenProbeParameters.ScreenProbeAtlasBufferSize = FIntPoint::DivideAndRoundUp(SceneTextures.Config.Extent, (int32)ScreenProbeParameters.ScreenProbeDownsampleFactor);
    ScreenProbeParameters.ScreenProbeAtlasBufferSize.Y += FMath::TruncToInt(ScreenProbeParameters.ScreenProbeAtlasBufferSize.Y * GLumenScreenProbeGatherAdaptiveProbeAllocationFraction);

    ScreenProbeParameters.ScreenProbeGatherMaxMip = GLumenScreenProbeGatherNumMips - 1;
    ScreenProbeParameters.RelativeSpeedDifferenceToConsiderLightingMoving = GLumenScreenProbeRelativeSpeedDifferenceToConsiderLightingMoving;
    ScreenProbeParameters.ScreenTraceNoFallbackThicknessScale = Lumen::UseHardwareRayTracedScreenProbeGather() ? 1.0f : GLumenScreenProbeScreenTracesThicknessScaleWhenNoFallback;
    ScreenProbeParameters.NumUniformScreenProbes = ScreenProbeParameters.ScreenProbeViewSize.X * ScreenProbeParameters.ScreenProbeViewSize.Y;
    ScreenProbeParameters.MaxNumAdaptiveProbes = FMath::TruncToInt(ScreenProbeParameters.NumUniformScreenProbes * GLumenScreenProbeGatherAdaptiveProbeAllocationFraction);
    extern int32 GLumenScreenProbeGatherVisualizeTraces;
    ScreenProbeParameters.FixedJitterIndex = GLumenScreenProbeGatherVisualizeTraces == 0 ? GLumenScreenProbeFixedJitterIndex : 6;

    FRDGTextureDesc DownsampledDepthDesc(FRDGTextureDesc::Create2D(ScreenProbeParameters.ScreenProbeAtlasBufferSize, PF_R32_UINT, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    ScreenProbeParameters.ScreenProbeSceneDepth = GraphBuilder.CreateTexture(DownsampledDepthDesc, TEXT("Lumen.ScreenProbeGather.ScreenProbeSceneDepth"));

    FRDGTextureDesc DownsampledSpeedDesc(FRDGTextureDesc::Create2D(ScreenProbeParameters.ScreenProbeAtlasBufferSize, PF_R16F, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    ScreenProbeParameters.ScreenProbeWorldSpeed = GraphBuilder.CreateTexture(DownsampledSpeedDesc, TEXT("Lumen.ScreenProbeGather.ScreenProbeWorldSpeed"));

    FBlueNoise BlueNoise;
    InitializeBlueNoise(BlueNoise);
    ScreenProbeParameters.BlueNoise = CreateUniformBufferImmediate(BlueNoise, EUniformBufferUsage::UniformBuffer_SingleDraw);

    ScreenProbeParameters.OctahedralSolidAngleParameters.OctahedralSolidAngleTextureResolutionSq = GLumenOctahedralSolidAngleTextureSize * GLumenOctahedralSolidAngleTextureSize;
    ScreenProbeParameters.OctahedralSolidAngleParameters.OctahedralSolidAngleTexture = InitializeOctahedralSolidAngleTexture(GraphBuilder, View.ShaderMap, GLumenOctahedralSolidAngleTextureSize, View.ViewState->Lumen.ScreenProbeGatherState.OctahedralSolidAngleTextureRT);

    // Downsample scene depth for the uniformly placed probes.
    {
        FScreenProbeDownsampleDepthUniformCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeDownsampleDepthUniformCS::FParameters>();
        PassParameters->RWScreenProbeSceneDepth = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenProbeSceneDepth));
        PassParameters->RWScreenProbeWorldSpeed = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenProbeWorldSpeed));
        PassParameters->View = View.ViewUniformBuffer;
        PassParameters->SceneTexturesStruct = SceneTextures.UniformBuffer;
        PassParameters->SceneTextures = SceneTextureParameters;
        PassParameters->ScreenProbeParameters = ScreenProbeParameters;

        auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeDownsampleDepthUniformCS>(0);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("UniformPlacement DownsampleFactor=%u", ScreenProbeParameters.ScreenProbeDownsampleFactor),
            ComputeShader,
            PassParameters,
            FComputeShaderUtils::GetGroupCount(ScreenProbeParameters.ScreenProbeViewSize, FScreenProbeDownsampleDepthUniformCS::GetGroupSize()));
    }

    FRDGBufferRef NumAdaptiveScreenProbes = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(uint32), 1), TEXT("Lumen.ScreenProbeGather.NumAdaptiveScreenProbes"));
    FRDGBufferRef AdaptiveScreenProbeData = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(uint32), FMath::Max<uint32>(ScreenProbeParameters.MaxNumAdaptiveProbes, 1)), TEXT("Lumen.ScreenProbeGather.AdaptiveScreenProbeData"));

    ScreenProbeParameters.NumAdaptiveScreenProbes = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(NumAdaptiveScreenProbes, PF_R32_UINT));
    ScreenProbeParameters.AdaptiveScreenProbeData = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(AdaptiveScreenProbeData, PF_R32_UINT));

    const FIntPoint ScreenProbeViewportBufferSize = FIntPoint::DivideAndRoundUp(SceneTextures.Config.Extent, (int32)ScreenProbeParameters.ScreenProbeDownsampleFactor);
    FRDGTextureDesc ScreenTileAdaptiveProbeHeaderDesc(FRDGTextureDesc::Create2D(ScreenProbeViewportBufferSize, PF_R32_UINT, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    FIntPoint ScreenTileAdaptiveProbeIndicesBufferSize = FIntPoint(ScreenProbeViewportBufferSize.X * ScreenProbeParameters.ScreenProbeDownsampleFactor, ScreenProbeViewportBufferSize.Y * ScreenProbeParameters.ScreenProbeDownsampleFactor);
    FRDGTextureDesc ScreenTileAdaptiveProbeIndicesDesc(FRDGTextureDesc::Create2D(ScreenTileAdaptiveProbeIndicesBufferSize, PF_R16_UINT, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    ScreenProbeParameters.ScreenTileAdaptiveProbeHeader = GraphBuilder.CreateTexture(ScreenTileAdaptiveProbeHeaderDesc, TEXT("Lumen.ScreenProbeGather.ScreenTileAdaptiveProbeHeader"));
    ScreenProbeParameters.ScreenTileAdaptiveProbeIndices = GraphBuilder.CreateTexture(ScreenTileAdaptiveProbeIndicesDesc, TEXT("Lumen.ScreenProbeGather.ScreenTileAdaptiveProbeIndices"));

    FComputeShaderUtils::ClearUAV(GraphBuilder, View.ShaderMap, GraphBuilder.CreateUAV(FRDGBufferUAVDesc(NumAdaptiveScreenProbes, PF_R32_UINT)), 0);
    uint32 ClearValues[4] = {0, 0, 0, 0};
    AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenTileAdaptiveProbeHeader)), ClearValues);

    const uint32 AdaptiveProbeMinDownsampleFactor = FMath::Clamp(GLumenScreenProbeGatherAdaptiveProbeMinDownsampleFactor, 1, 64);

    if (ScreenProbeParameters.MaxNumAdaptiveProbes > 0 && AdaptiveProbeMinDownsampleFactor < ScreenProbeParameters.ScreenProbeDownsampleFactor)
    { 
        // Adaptive probe placement.
        uint32 PlacementDownsampleFactor = ScreenProbeParameters.ScreenProbeDownsampleFactor;
        do
        {
            PlacementDownsampleFactor /= 2;
            FScreenProbeAdaptivePlacementCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeAdaptivePlacementCS::FParameters>();
            PassParameters->RWScreenProbeSceneDepth = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenProbeSceneDepth));
            PassParameters->RWScreenProbeWorldSpeed = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenProbeWorldSpeed));
            PassParameters->RWNumAdaptiveScreenProbes = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(NumAdaptiveScreenProbes, PF_R32_UINT));
            PassParameters->RWAdaptiveScreenProbeData = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(AdaptiveScreenProbeData, PF_R32_UINT));
            PassParameters->RWScreenTileAdaptiveProbeHeader = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenTileAdaptiveProbeHeader));
            PassParameters->RWScreenTileAdaptiveProbeIndices = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenTileAdaptiveProbeIndices));
            PassParameters->View = View.ViewUniformBuffer;
            PassParameters->SceneTexturesStruct = SceneTextures.UniformBuffer;
            PassParameters->SceneTextures = SceneTextureParameters;
            PassParameters->ScreenProbeParameters = ScreenProbeParameters;
            PassParameters->PlacementDownsampleFactor = PlacementDownsampleFactor;

            auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeAdaptivePlacementCS>(0);

            FComputeShaderUtils::AddPass(
                GraphBuilder,
                RDG_EVENT_NAME("AdaptivePlacement DownsampleFactor=%u", PlacementDownsampleFactor),
                ComputeShader,
                PassParameters,
                FComputeShaderUtils::GetGroupCount(FIntPoint::DivideAndRoundDown(View.ViewRect.Size(), (int32)PlacementDownsampleFactor), FScreenProbeAdaptivePlacementCS::GetGroupSize()));
        }
        while (PlacementDownsampleFactor > AdaptiveProbeMinDownsampleFactor);
    }
    else
    {
        FComputeShaderUtils::ClearUAV(GraphBuilder, View.ShaderMap, GraphBuilder.CreateUAV(FRDGBufferUAVDesc(AdaptiveScreenProbeData, PF_R32_UINT)), 0);
        AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenTileAdaptiveProbeIndices)), ClearValues);
    }

    FRDGBufferRef ScreenProbeIndirectArgs = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateIndirectDesc<FRHIDispatchIndirectParameters>((uint32)EScreenProbeIndirectArgs::Max), TEXT("Lumen.ScreenProbeGather.ScreenProbeIndirectArgs"));

    // Set up the indirect dispatch arguments for the adaptive probes.
    {
        FSetupAdaptiveProbeIndirectArgsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FSetupAdaptiveProbeIndirectArgsCS::FParameters>();
        PassParameters->RWScreenProbeIndirectArgs = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(ScreenProbeIndirectArgs, PF_R32_UINT));
        PassParameters->ScreenProbeParameters = ScreenProbeParameters;

        auto ComputeShader = View.ShaderMap->GetShader<FSetupAdaptiveProbeIndirectArgsCS>(0);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("SetupAdaptiveProbeIndirectArgs"),
            ComputeShader,
            PassParameters,
            FIntVector(1, 1, 1));
    }

    ScreenProbeParameters.ProbeIndirectArgs = ScreenProbeIndirectArgs;

    FLumenCardTracingInputs TracingInputs(GraphBuilder, Scene, View);

    FRDGTextureRef BRDFProbabilityDensityFunction = nullptr;
    FRDGBufferSRVRef BRDFProbabilityDensityFunctionSH = nullptr;
    GenerateBRDF_PDF(GraphBuilder, View, SceneTextures, BRDFProbabilityDensityFunction, BRDFProbabilityDensityFunctionSH, ScreenProbeParameters);

    const LumenRadianceCache::FRadianceCacheInputs RadianceCacheInputs = LumenScreenProbeGatherRadianceCache::SetupRadianceCacheInputs();
    LumenRadianceCache::FRadianceCacheInterpolationParameters RadianceCacheParameters;

    // Radiance cache.
    if (LumenScreenProbeGather::UseRadianceCache(View))
    {
        FScreenGatherMarkUsedProbesData MarkUsedProbesData;
        MarkUsedProbesData.Parameters.View = View.ViewUniformBuffer;
        MarkUsedProbesData.Parameters.SceneTexturesStruct = SceneTextures.UniformBuffer;
        MarkUsedProbesData.Parameters.ScreenProbeParameters = ScreenProbeParameters;
        MarkUsedProbesData.Parameters.VisualizeLumenScene = View.Family->EngineShowFlags.VisualizeLumenScene != 0 ? 1 : 0;
        MarkUsedProbesData.Parameters.RadianceCacheParameters = RadianceCacheParameters;

        // Render the radiance cache.
        RenderRadianceCache(
            GraphBuilder, 
            TracingInputs, 
            RadianceCacheInputs, 
            Scene,
            View, 
            &ScreenProbeParameters, 
            BRDFProbabilityDensityFunctionSH, 
            FMarkUsedRadianceCacheProbes::CreateStatic(&ScreenGatherMarkUsedProbes), 
            &MarkUsedProbesData, 
            View.ViewState->RadianceCacheState, 
            RadianceCacheParameters);
    }

    if (LumenScreenProbeGather::UseImportanceSampling(View))
    {
        // Generate importance-sampled rays.
        GenerateImportanceSamplingRays(
            GraphBuilder,
            View,
            SceneTextures,
            RadianceCacheParameters,
            BRDFProbabilityDensityFunction,
            BRDFProbabilityDensityFunctionSH,
            ScreenProbeParameters);
    }

    const FIntPoint ScreenProbeTraceBufferSize = ScreenProbeParameters.ScreenProbeAtlasBufferSize * ScreenProbeParameters.ScreenProbeTracingOctahedronResolution;
    FRDGTextureDesc TraceRadianceDesc(FRDGTextureDesc::Create2D(ScreenProbeTraceBufferSize, PF_FloatRGB, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    ScreenProbeParameters.TraceRadiance = GraphBuilder.CreateTexture(TraceRadianceDesc, TEXT("Lumen.ScreenProbeGather.TraceRadiance"));
    ScreenProbeParameters.RWTraceRadiance = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.TraceRadiance));

    FRDGTextureDesc TraceHitDesc(FRDGTextureDesc::Create2D(ScreenProbeTraceBufferSize, PF_R32_UINT, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    ScreenProbeParameters.TraceHit = GraphBuilder.CreateTexture(TraceHitDesc, TEXT("Lumen.ScreenProbeGather.TraceHit"));
    ScreenProbeParameters.RWTraceHit = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.TraceHit));

    // Trace the screen-space probes.
    TraceScreenProbes(
        GraphBuilder, 
        Scene,
        View, 
        GLumenGatherCvars.TraceMeshSDFs != 0 && Lumen::UseMeshSDFTracing(),
        SceneTextures.UniformBuffer,
        PrevSceneColorMip,
        LightingChannelsTexture,
        TracingInputs,
        RadianceCacheParameters,
        ScreenProbeParameters,
        MeshSDFGridParameters);
    
    FScreenProbeGatherParameters GatherParameters;
    // Filter the screen-space probes.
    FilterScreenProbes(GraphBuilder, View, ScreenProbeParameters, GatherParameters);

    FScreenSpaceBentNormalParameters ScreenSpaceBentNormalParameters;
    ScreenSpaceBentNormalParameters.UseScreenBentNormal = 0;
    ScreenSpaceBentNormalParameters.ScreenBentNormal = SystemTextures.Black;
    ScreenSpaceBentNormalParameters.ScreenDiffuseLighting = SystemTextures.Black;

    // Compute the screen-space bent normal.
    if (LumenScreenProbeGather::UseScreenSpaceBentNormal())
    {
        ScreenSpaceBentNormalParameters = ComputeScreenSpaceBentNormal(GraphBuilder, Scene, View, SceneTextures, LightingChannelsTexture, ScreenProbeParameters);
    }

    FRDGTextureDesc DiffuseIndirectDesc = FRDGTextureDesc::Create2D(SceneTextures.Config.Extent, PF_FloatRGBA, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV);
    FRDGTextureRef DiffuseIndirect = GraphBuilder.CreateTexture(DiffuseIndirectDesc, TEXT("Lumen.ScreenProbeGather.DiffuseIndirect"));

    FRDGTextureDesc RoughSpecularIndirectDesc = FRDGTextureDesc::Create2D(SceneTextures.Config.Extent, PF_FloatRGB, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV);
    FRDGTextureRef RoughSpecularIndirect = GraphBuilder.CreateTexture(RoughSpecularIndirectDesc, TEXT("Lumen.ScreenProbeGather.RoughSpecularIndirect"));

    {
        FScreenProbeIndirectCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeIndirectCS::FParameters>();
        PassParameters->RWDiffuseIndirect = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(DiffuseIndirect));
        PassParameters->RWRoughSpecularIndirect = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(RoughSpecularIndirect));
        PassParameters->GatherParameters = GatherParameters;
        PassParameters->ScreenProbeParameters = ScreenProbeParameters;
        PassParameters->View = View.ViewUniformBuffer;
        PassParameters->SceneTexturesStruct = SceneTextures.UniformBuffer;
        PassParameters->FullResolutionJitterWidth = GLumenScreenProbeFullResolutionJitterWidth;
        extern float GLumenReflectionMaxRoughnessToTrace;
        extern float GLumenReflectionRoughnessFadeLength;
        PassParameters->MaxRoughnessToTrace = GLumenReflectionMaxRoughnessToTrace;
        PassParameters->RoughnessFadeLength = GLumenReflectionRoughnessFadeLength;
        PassParameters->ScreenSpaceBentNormalParameters = ScreenSpaceBentNormalParameters;

        FScreenProbeIndirectCS::FPermutationDomain PermutationVector;
        PermutationVector.Set< FScreenProbeIndirectCS::FDiffuseIntegralMethod >(LumenScreenProbeGather::GetDiffuseIntegralMethod());
        auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeIndirectCS>(PermutationVector);

        // Compute indirect lighting from the screen-space probes.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("ComputeIndirect %ux%u", View.ViewRect.Width(), View.ViewRect.Height()),
            ComputeShader,
            PassParameters,
            FComputeShaderUtils::GetGroupCount(View.ViewRect.Size(), FScreenProbeIndirectCS::GetGroupSize()));
    }

    FSSDSignalTextures DenoiserOutputs;
    DenoiserOutputs.Textures[0] = DiffuseIndirect;
    DenoiserOutputs.Textures[1] = RoughSpecularIndirect;
    bLumenUseDenoiserComposite = false;

    // Temporal filtering of the screen-space probe result.
    if (GLumenScreenProbeTemporalFilter)
    {
        if (GLumenScreenProbeUseHistoryNeighborhoodClamp)
        {
            FRDGTextureRef CompressedDepthTexture;
            FRDGTextureRef CompressedShadingModelTexture;
            {
                FRDGTextureDesc Desc = FRDGTextureDesc::Create2D(
                    SceneTextures.Depth.Resolve->Desc.Extent,
                    PF_R16F,
                    FClearValueBinding::None,                    
                    /* InTargetableFlags = */ TexCreate_ShaderResource | TexCreate_UAV);

                CompressedDepthTexture = GraphBuilder.CreateTexture(Desc, TEXT("Lumen.ScreenProbeGather.CompressedDepth"));

                Desc.Format = PF_R8_UINT;
                CompressedShadingModelTexture = GraphBuilder.CreateTexture(Desc, TEXT("Lumen.ScreenProbeGather.CompressedShadingModelID"));
            }

            {
                FGenerateCompressedGBuffer::FParameters* PassParameters = GraphBuilder.AllocParameters<FGenerateCompressedGBuffer::FParameters>();
                PassParameters->RWCompressedDepthBufferOutput = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(CompressedDepthTexture));
                PassParameters->RWCompressedShadingModelOutput = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(CompressedShadingModelTexture));
                PassParameters->View = View.ViewUniformBuffer;
                PassParameters->SceneTextures = SceneTextureParameters;

                auto ComputeShader = View.ShaderMap->GetShader<FGenerateCompressedGBuffer>(0);

                FComputeShaderUtils::AddPass(
                    GraphBuilder,
                    RDG_EVENT_NAME("GenerateCompressedGBuffer"),
                    ComputeShader,
                    PassParameters,
                    FComputeShaderUtils::GetGroupCount(View.ViewRect.Size(), FGenerateCompressedGBuffer::GetGroupSize()));
            }

            FSSDSignalTextures ScreenSpaceDenoiserInputs;
            ScreenSpaceDenoiserInputs.Textures[0] = DiffuseIndirect;
            ScreenSpaceDenoiserInputs.Textures[1] = RoughSpecularIndirect;

            DenoiserOutputs = IScreenSpaceDenoiser::DenoiseIndirectProbeHierarchy(
                GraphBuilder,
                View, 
                PreviousViewInfos,
                SceneTextureParameters,
                ScreenSpaceDenoiserInputs,
                CompressedDepthTexture,
                CompressedShadingModelTexture);

            bLumenUseDenoiserComposite = true;
        }
        else
        {
            UpdateHistoryScreenProbeGather(
                GraphBuilder,
                View,
                SceneTextures,
                DiffuseIndirect,
                RoughSpecularIndirect);

            DenoiserOutputs.Textures[0] = DiffuseIndirect;
            DenoiserOutputs.Textures[1] = RoughSpecularIndirect;
        }
    }

    return DenoiserOutputs;
}
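The probe-grid sizing near the top of RenderLumenScreenProbeGather is easier to follow with concrete numbers. The sketch below repeats that arithmetic on the CPU: the view is divided by the downsample factor (rounding up), the atlas gets extra rows for adaptive probes, and the adaptive probe budget is a fraction of the uniform probe count. `IntPoint`, `ProbeLayout`, and `ComputeProbeLayout` are hypothetical names for illustration only. The inputs used in the example (a 1920x1080 view, factor 16, fraction 0.5) are assumed values, not something the listing above guarantees.

```cpp
#include <cstdint>

// Hypothetical stand-in for FIntPoint.
struct IntPoint
{
    int32_t X, Y;
};

// Same rounding as FIntPoint::DivideAndRoundUp, for positive inputs.
constexpr int32_t DivideAndRoundUp(int32_t Value, int32_t Divisor)
{
    return (Value + Divisor - 1) / Divisor;
}

struct ProbeLayout
{
    IntPoint ViewSize;            // ScreenProbeViewSize: the uniform probe grid
    IntPoint AtlasViewSize;       // ScreenProbeAtlasViewSize: grid plus adaptive rows
    int32_t NumUniformProbes;     // NumUniformScreenProbes
    int32_t MaxNumAdaptiveProbes; // MaxNumAdaptiveProbes
};

constexpr ProbeLayout ComputeProbeLayout(IntPoint ViewRectSize, int32_t DownsampleFactor, float AdaptiveFraction)
{
    const IntPoint ViewSize{
        DivideAndRoundUp(ViewRectSize.X, DownsampleFactor),
        DivideAndRoundUp(ViewRectSize.Y, DownsampleFactor)};

    // FMath::TruncToInt truncates toward zero, like static_cast<int32_t>.
    const int32_t ExtraRows = static_cast<int32_t>(ViewSize.Y * AdaptiveFraction);
    const int32_t NumUniform = ViewSize.X * ViewSize.Y;

    return ProbeLayout{
        ViewSize,
        IntPoint{ViewSize.X, ViewSize.Y + ExtraRows},
        NumUniform,
        static_cast<int32_t>(NumUniform * AdaptiveFraction)};
}
```

For the assumed inputs, `ComputeProbeLayout({1920, 1080}, 16, 0.5f)` yields a 120x68 uniform grid (8160 probes), a 120x102 atlas, and room for at most 4080 adaptive probes. The trace textures created later multiply the atlas size by the tracing octahedron resolution.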

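Combining the source code with RenderDoc frame-capture data shows that the screen-space probe gather stage is extremely complex. The main steps of the regular flow are: uniform (global) probe placement followed by adaptive placement, computing the BRDF, rendering the radiance cache, computing the lighting PDF, generating sample rays, tracing the screen-space probes, compacting the trace results, tracing voxels, compositing the trace results, filtering the radiance to be gathered, processing bent normals, computing indirect lighting, and updating the history data.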
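The adaptive placement step is dispatched as a series of passes, one per halving of the downsample factor, by the `do { ... } while` loop in RenderLumenScreenProbeGather. The sketch below only reproduces which factors that loop would visit. `PlacementSchedule` and `BuildAdaptivePlacementSchedule` are hypothetical names, and the fixed capacity of 8 entries is a safety guard added here that the engine loop does not have.

```cpp
#include <cstdint>

// Hypothetical container for the downsample factors visited by the loop.
struct PlacementSchedule
{
    uint32_t Factors[8];
    int32_t Count;
};

constexpr PlacementSchedule BuildAdaptivePlacementSchedule(
    uint32_t UniformDownsampleFactor, // ScreenProbeParameters.ScreenProbeDownsampleFactor
    int32_t MinDownsampleFactorCvar,  // GLumenScreenProbeGatherAdaptiveProbeMinDownsampleFactor
    uint32_t MaxNumAdaptiveProbes)
{
    PlacementSchedule Schedule{};

    // FMath::Clamp(Cvar, 1, 64)
    const uint32_t MinFactor = MinDownsampleFactorCvar < 1 ? 1u
        : (MinDownsampleFactorCvar > 64 ? 64u : static_cast<uint32_t>(MinDownsampleFactorCvar));

    if (MaxNumAdaptiveProbes > 0 && MinFactor < UniformDownsampleFactor)
    {
        uint32_t Factor = UniformDownsampleFactor;
        do
        {
            Factor /= 2; // each pass places probes on a grid twice as dense
            Schedule.Factors[Schedule.Count++] = Factor;
        }
        while (Factor > MinFactor && Schedule.Count < 8); // Count check is this sketch's guard
    }
    return Schedule;
}
```

For example, a uniform factor of 16 with a minimum factor of 4 produces two passes, at factors 8 and 4. If the minimum is not below the uniform factor, no adaptive pass runs and the listing clears the adaptive probe buffers instead.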

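Because there are too many steps to cover in full, only some of the more important ones are analyzed below, based on the frame-capture data.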

  • RadianceCache

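The radiance cache (RadianceCache) is likewise a complex sequence of passes: clearing, marking, updating, and allocating probes, setting up the draw arguments, tracing the probes, and filtering the probe radiance: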

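The most important step of RadianceCache is tracing from the probes (TraceFromProbes). Its inputs include textures such as the global distance field and VoxelLighting.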

The output is a 4096x4096 radiance probe atlas plus depth:

The probe atlas output by TraceFromProbes (zoomed-in crop).
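The atlas is a grid of square probe tiles, and the TraceFromProbesCS listing in this section derives each probe's base texel from its index with a modulo and a divide. A minimal sketch of that addressing follows. `UInt2` and `ProbeAtlasBaseCoord` are hypothetical names for illustration. The probe resolution of 32 texels (giving 128 probes per row in a 4096-wide atlas) is an assumption for the example and is not confirmed by the capture above.

```cpp
#include <cstdint>

// Hypothetical stand-in for HLSL uint2.
struct UInt2
{
    uint32_t X, Y;
};

// Mirrors the shader expression:
//   RadianceProbeResolution * uint2(ProbeIndex % ResInProbes.x, ProbeIndex / ResInProbes.x)
// Returns the top-left texel of the tile owned by ProbeIndex.
constexpr UInt2 ProbeAtlasBaseCoord(
    uint32_t ProbeIndex,
    uint32_t RadianceProbeResolution,        // texels per probe edge
    uint32_t ProbeAtlasResolutionInProbesX)  // probes per atlas row
{
    return UInt2{
        RadianceProbeResolution * (ProbeIndex % ProbeAtlasResolutionInProbesX),
        RadianceProbeResolution * (ProbeIndex / ProbeAtlasResolutionInProbesX)};
}
```

Under the assumed layout, probe 130 lands in column 2 of row 1, so its tile starts at texel (64, 32).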

The compute shader used by TraceFromProbes is as follows:

// Engine\Shaders\Private\Lumen\LumenRadianceCache.usf

groupshared float3 SharedTraceRadiance[THREADGROUP_SIZE][THREADGROUP_SIZE];
groupshared float SharedTraceHitDistance[THREADGROUP_SIZE][THREADGROUP_SIZE];

[numthreads(THREADGROUP_SIZE, THREADGROUP_SIZE, 1)]
void TraceFromProbesCS(
    uint3 GroupId : SV_GroupID,
    uint2 GroupThreadId : SV_GroupThreadID)
{
    uint TraceTileIndex = GroupId.y * TRACE_TILE_GROUP_STRIDE + GroupId.x;

    if (TraceTileIndex < ProbeTraceTileAllocator[0])
    {
        uint2 TraceTileCoord;
        uint TraceTileLevel;
        uint ProbeTraceIndex;
        // Unpack the trace tile info.
        UnpackTraceTileInfo(ProbeTraceTileData[TraceTileIndex], TraceTileCoord, TraceTileLevel, ProbeTraceIndex);

        uint TraceResolution = (RadianceProbeResolution / 2) << TraceTileLevel;
        // Texel coordinate within the probe.
        uint2 ProbeTexelCoord = TraceTileCoord * THREADGROUP_SIZE + GroupThreadId.xy;


        float3 ProbeWorldCenter;
        uint ClipmapIndex;
        uint ProbeIndex;
        // Fetch the probe's trace data.
        GetProbeTraceData(ProbeTraceIndex, ProbeWorldCenter, ClipmapIndex, ProbeIndex);

        if (all(ProbeTexelCoord < TraceResolution))
        {
            float2 ProbeTexelCenter = float2(0.5, 0.5);
            float2 ProbeUV = (ProbeTexelCoord + ProbeTexelCenter) / float(TraceResolution);
            float3 WorldConeDirection = OctahedralMapToDirection(ProbeUV);

            float FinalMinTraceDistance = max(MinTraceDistance, GetRadianceProbeTMin(ClipmapIndex));
            float FinalMaxTraceDistance = MaxTraceDistance;
            float EffectiveStepFactor = StepFactor;

            // Distribute the sphere's solid angle evenly over all cones, rather than following the octahedral distortion.
            float ConeHalfAngle = acosFast(1.0f - 1.0f / (float)(TraceResolution * TraceResolution));

            // Set up the cone trace input.
            FConeTraceInput TraceInput;
            TraceInput.Setup(
                ProbeWorldCenter, WorldConeDirection,
                ConeHalfAngle, MinSampleRadius,
                FinalMinTraceDistance, FinalMaxTraceDistance,
                EffectiveStepFactor);
            TraceInput.VoxelStepFactor = VoxelStepFactor;

            bool bContinueCardTracing = false;

            TraceInput.VoxelTraceStartDistance = CalculateVoxelTraceStartDistance(FinalMinTraceDistance, FinalMaxTraceDistance, MaxMeshSDFTraceDistance, bContinueCardTracing);

            // Run the cone trace for this probe texel.
            FConeTraceResult TraceResult = TraceForProbeTexel(TraceInput);

            // Store the traced lighting in group-shared memory.
            SharedTraceRadiance[GroupThreadId.y][GroupThreadId.x] = TraceResult.Lighting;

            // Store the traced hit distance.
            #if RADIANCE_CACHE_STORE_DEPTHS
                SharedTraceHitDistance[GroupThreadId.y][GroupThreadId.x] = TraceResult.OpaqueHitDistance;
            #endif
        }

        GroupMemoryBarrierWithGroupSync();

        uint2 ProbeAtlasBaseCoord = RadianceProbeResolution * uint2(ProbeIndex % ProbeAtlasResolutionInProbes.x, ProbeIndex / ProbeAtlasResolutionInProbes.x);

        // Write the lighting and the hit distance to the probe atlases.
        if (TraceResolution < RadianceProbeResolution)
        {
            uint UpsampleFactor = RadianceProbeResolution / TraceResolution;
            ProbeAtlasBaseCoord += (THREADGROUP_SIZE * TraceTileCoord + GroupThreadId.xy) * UpsampleFactor;

            float3 Lighting = SharedTraceRadiance[GroupThreadId.y][GroupThreadId.x];

            for (uint Y = 0; Y < UpsampleFactor; Y++)
            {
                for (uint X = 0; X < UpsampleFactor; X++)
                {
                    RWRadianceProbeAtlasTexture[ProbeAtlasBaseCoord + uint2(X, Y)] = Lighting;
                }
            }

            #if RADIANCE_CACHE_STORE_DEPTHS
                float HitDistance = min(SharedTraceHitDistance[GroupThreadId.y][GroupThreadId.x], MaxHalfFloat);

                for (uint Y = 0; Y < UpsampleFactor; Y++)
                {
                    for (uint X = 0; X < UpsampleFactor; X++)
                    {
                        RWDepthProbeAtlasTexture[ProbeAtlasBaseCoord + uint2(X, Y)] = HitDistance;
                    }
                }
            #endif
        }
        else
        {
            uint DownsampleFactor = TraceResolution / RadianceProbeResolution;
            uint WriteTileSize = THREADGROUP_SIZE / DownsampleFactor;

            if (all(GroupThreadId.xy < WriteTileSize))
            {
                float3 Lighting = 0;

                for (uint Y = 0; Y < DownsampleFactor; Y++)
                {
                    for (uint X = 0; X < DownsampleFactor; X++)
                    {
                        Lighting += SharedTraceRadiance[GroupThreadId.y * DownsampleFactor + Y][GroupThreadId.x * DownsampleFactor + X];
                    }
                }

                ProbeAtlasBaseCoord += WriteTileSize * TraceTileCoord + GroupThreadId.xy;
                RWRadianceProbeAtlasTexture[ProbeAtlasBaseCoord] = Lighting / (float)(DownsampleFactor * DownsampleFactor);

                #if RADIANCE_CACHE_STORE_DEPTHS
                    float HitDistance = MaxHalfFloat;

                    for (uint Y = 0; Y < DownsampleFactor; Y++)
                    {
                        for (uint X = 0; X < DownsampleFactor; X++)
                        {
                            HitDistance = min(HitDistance, SharedTraceHitDistance[GroupThreadId.y * DownsampleFactor + Y][GroupThreadId.x * DownsampleFactor + X]);
                        }
                    }

                    RWDepthProbeAtlasTexture[ProbeAtlasBaseCoord] = HitDistance;
                #endif
            }
        }
    }
}
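
The downsampling branch at the end of TraceFromProbesCS can be sketched on the CPU. The following is a minimal sketch, not the engine's code: `TraceTile`, `ResolvedTexel`, and `DownsampleBlock` are hypothetical names, the radiance is reduced to one channel for brevity, and the tile is assumed to be stored row by row. It shows the two reductions the shader applies to each block of traces: radiance is box-filtered (averaged), while the hit distance keeps the nearest hit.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical CPU stand-in for one trace tile kept in group-shared memory.
struct TraceTile {
    int size;                        // edge length, plays the role of THREADGROUP_SIZE
    std::vector<float> radiance;     // size * size entries (single channel for brevity)
    std::vector<float> hitDistance;  // size * size entries
};

struct ResolvedTexel {
    float radiance;
    float hitDistance;
};

// Largest finite half-float value, used as the "no hit yet" starting distance.
constexpr float kMaxHalfFloat = 65504.0f;

// Collapse a factor x factor block of traces into one probe atlas texel.
ResolvedTexel DownsampleBlock(const TraceTile& tile, int outX, int outY, int factor)
{
    assert(factor > 0);
    assert((outX + 1) * factor <= tile.size && (outY + 1) * factor <= tile.size);

    float radianceSum = 0.0f;
    float nearestHit = kMaxHalfFloat;
    for (int y = 0; y < factor; ++y) {
        for (int x = 0; x < factor; ++x) {
            const int index = (outY * factor + y) * tile.size + (outX * factor + x);
            radianceSum += tile.radiance[index];                         // box filter
            nearestHit = std::min(nearestHit, tile.hitDistance[index]);  // nearest hit wins
        }
    }
    return {radianceSum / static_cast<float>(factor * factor), nearestHit};
}
```

Taking the minimum for the distance is the conservative choice: an averaged distance would describe a surface that none of the individual traces actually hit.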

Next, step into TraceForProbeTexel to follow the call stack for tracing a single probe texel:

FConeTraceResult TraceForProbeTexel(FConeTraceInput TraceInput)
{
    // Initialize the trace result struct.
    FConeTraceResult TraceResult;
    TraceResult = (FConeTraceResult)0;
    TraceResult.Lighting = 0.0;
    TraceResult.Transparency = 1.0;
    TraceResult.OpaqueHitDistance = TraceInput.MaxTraceDistance;

    // Cone trace the Lumen scene voxels (analyzed below).
    ConeTraceLumenSceneVoxels(TraceInput, TraceResult);

    // Trace the distant scene.
#if TRACE_DISTANT_SCENE
    if (TraceResult.Transparency > .01f)
    {
        FConeTraceResult DistantTraceResult;
        // Cone trace the Lumen distant scene (analyzed below).
        ConeTraceLumenDistantScene(TraceInput, DistantTraceResult);
        TraceResult.Lighting += DistantTraceResult.Lighting * TraceResult.Transparency;
        TraceResult.Transparency *= DistantTraceResult.Transparency;
    }
#endif

    // Apply the sky light.
#if ENABLE_DYNAMIC_SKY_LIGHT
    if (ReflectionStruct.SkyLightParameters.y > 0)
    {
        float SkyAverageBrightness = 1.0f;
        float Roughness = TanConeAngleToRoughness(tan(TraceInput.ConeAngle));

        TraceResult.Lighting = TraceResult.Lighting + GetSkyLightReflection(TraceInput.ConeDirection, Roughness, SkyAverageBrightness) * TraceResult.Transparency;
    }
#endif

    return TraceResult;
}

// Cone trace the Lumen scene voxels.
void ConeTraceLumenSceneVoxels(
    FConeTraceInput TraceInput,
    inout FConeTraceResult OutResult)
{
#if SCENE_TRACE_VOXELS
    if (TraceInput.VoxelTraceStartDistance < TraceInput.MaxTraceDistance)
    {
        FConeTraceInput VoxelTraceInput = TraceInput;
        VoxelTraceInput.MinTraceDistance = TraceInput.VoxelTraceStartDistance;
        FConeTraceResult VoxelTraceResult;
        // Cone trace the voxels (already analyzed earlier).
        ConeTraceVoxels(VoxelTraceInput, VoxelTraceResult);

        // Composite using the accumulated transparency.
        #if !VISIBILITY_ONLY_TRACE
            OutResult.Lighting += VoxelTraceResult.Lighting * OutResult.Transparency;
        #endif
        OutResult.Transparency *= VoxelTraceResult.Transparency;
        OutResult.NumSteps += VoxelTraceResult.NumSteps;
        OutResult.OpaqueHitDistance = min(OutResult.OpaqueHitDistance, VoxelTraceResult.OpaqueHitDistance);
    }
#endif
}

// Cone trace the Lumen distant scene.
void ConeTraceLumenDistantScene(
    FConeTraceInput TraceInput,
    inout FConeTraceResult OutResult)
{
    float3 debug = 0;
    TraceInput.MaxTraceDistance = LumenCardScene.DistantSceneMaxTraceDistance;
    TraceInput.bBlackOutSteepIntersections = true;

    FCardTraceBlendState CardTraceBlendState;
    CardTraceBlendState.Initialize(TraceInput.MaxTraceDistance);

    if (LumenCardScene.NumDistantCards > 0)
    {
        // Derive the minimum trace distance from the last voxel clipmap.
        if (NumClipmapLevels > 0)
        {
            float3 VoxelLightingCenter = ClipmapWorldCenter[NumClipmapLevels - 1].xyz;
            float3 VoxelLightingExtent = ClipmapWorldSamplingExtent[NumClipmapLevels - 1].xyz;

            float3 RayEnd = TraceInput.ConeOrigin + TraceInput.ConeDirection * TraceInput.MaxTraceDistance;
            float2 IntersectionTimes = LineBoxIntersect(TraceInput.ConeOrigin, RayEnd, VoxelLightingCenter - VoxelLightingExtent, VoxelLightingCenter + VoxelLightingExtent);

            // If we are starting inside the voxel clipmaps, move the start of the trace past the voxel clipmaps
            if (IntersectionTimes.x < IntersectionTimes.y && IntersectionTimes.x < .001f)
            {
                TraceInput.MinTraceDistance = IntersectionTimes.y * TraceInput.MaxTraceDistance;
            }
        }

        float TraceEndDistance = TraceInput.MinTraceDistance;

        {
            uint ListIndex = 0;
            uint CardIndex = LumenCardScene.DistantCardIndices[ListIndex];

            // Cone trace a single Lumen card (analyzed below).
            ConeTraceSingleLumenCard(
                TraceInput,
                CardIndex,
                debug,
                TraceEndDistance,
                CardTraceBlendState);
        }
    }

    OutResult = (FConeTraceResult)0;

    // Store the results.
    #if !VISIBILITY_ONLY_TRACE
        OutResult.Lighting = CardTraceBlendState.GetFinalLighting();
    #endif
    OutResult.Transparency = CardTraceBlendState.GetTransparency();
    OutResult.NumSteps = CardTraceBlendState.NumSteps;
    OutResult.NumOverlaps = CardTraceBlendState.NumOverlaps;
    OutResult.OpaqueHitDistance = CardTraceBlendState.OpaqueHitDistance;
    OutResult.Debug = debug;
}

// Cone trace a single Lumen card.
void ConeTraceSingleLumenCard(
    FConeTraceInput TraceInput,
    uint CardIndex,
    inout float3 Debug,
    inout float OutTraceEndDistance,
    inout FCardTraceBlendState CardTraceBlendState)
{
    // Fetch the card data.
    FLumenCardData LumenCardData = GetLumenCardData(CardIndex);

    // Transform the cone into the card's local space.
    float3 LocalConeOrigin = mul(TraceInput.ConeOrigin - LumenCardData.Origin, LumenCardData.WorldToLocalRotation);
    float3 LocalConeDirection = mul(TraceInput.ConeDirection, LumenCardData.WorldToLocalRotation);
    float3 LocalTraceEnd = LocalConeOrigin + LocalConeDirection * TraceInput.MaxTraceDistance;

    // Intersection range of the ray against the card's bounding box.
    float2 IntersectionRange = LineBoxIntersect(LocalConeOrigin, LocalTraceEnd, -LumenCardData.LocalExtent, LumenCardData.LocalExtent);
    IntersectionRange.x = max(IntersectionRange.x, TraceInput.MinTraceDistance / TraceInput.MaxTraceDistance);
    OutTraceEndDistance = IntersectionRange.y * TraceInput.MaxTraceDistance;

    if (IntersectionRange.y > IntersectionRange.x
        && LumenCardData.bVisible)
    {
        {
            // Blend state for this card trace.
            FCardTraceBlendState ConeStepBlendState;
            ConeStepBlendState.Initialize(TraceInput.MaxTraceDistance);

            float StepTime = IntersectionRange.x * TraceInput.MaxTraceDistance;
            float3 SamplePosition = LocalConeOrigin + StepTime * LocalConeDirection;
            float TraceEndDistance = IntersectionRange.y * TraceInput.MaxTraceDistance;

            float IntersectionLength = (IntersectionRange.y - IntersectionRange.x) * TraceInput.MaxTraceDistance;
            float MinStepSize = IntersectionLength / (float)LumenCardScene.MaxConeSteps;

            float PreviousStepTime = StepTime;
            float3 PreviousSamplePosition = SamplePosition;
            // Magic value to prevent linear intersection approximation on first step
            float PreviousHeightfieldZ = -2;

            bool bClampedToEnd = false;
            bool bFoundSurface = false;
            bool bRayAboveSurface = false;
            float IntersectionStepTime = 0;
            float2 IntersectionSamplePositionXY = SamplePosition.xy;
            float IntersectionSlope = 0;

            uint NumStepsPerLoop = 4; // Take 4 samples per loop iteration.
            for (uint StepIndex = 0; StepIndex < LumenCardScene.MaxConeSteps && StepTime < TraceEndDistance; StepIndex += NumStepsPerLoop)
            {
                float SampleRadius = max(TraceInput.ConeStartRadius + TraceInput.TanConeAngle * StepTime, TraceInput.MinSampleRadius);
                float StepSize = max(SampleRadius * TraceInput.StepFactor, MinStepSize);
                float TraceClampDistance = TraceEndDistance - StepSize * .0001f;

                float DepthMip;
                float2 DepthValidRegionScale;
                CalculateMip(SampleRadius, LumenCardData, LumenCardData.LocalExtent, LumenCardData.MaxMip, DepthMip, DepthValidRegionScale);

                // The 4 sample positions.
                float3 SamplePosition1 = LocalConeOrigin + min(StepTime + 0 * StepSize, TraceClampDistance) * LocalConeDirection;
                float3 SamplePosition2 = LocalConeOrigin + min(StepTime + 1 * StepSize, TraceClampDistance) * LocalConeDirection;
                float3 SamplePosition3 = LocalConeOrigin + min(StepTime + 2 * StepSize, TraceClampDistance) * LocalConeDirection;
                float3 SamplePosition4 = LocalConeOrigin + min(StepTime + 3 * StepSize, TraceClampDistance) * LocalConeDirection;

                // The 4 depth atlas UVs.
                float2 DepthAtlasUV1 = CalculateAtlasUV(SamplePosition1.xy, DepthValidRegionScale, LumenCardData);
                float2 DepthAtlasUV2 = CalculateAtlasUV(SamplePosition2.xy, DepthValidRegionScale, LumenCardData);
                float2 DepthAtlasUV3 = CalculateAtlasUV(SamplePosition3.xy, DepthValidRegionScale, LumenCardData);
                float2 DepthAtlasUV4 = CalculateAtlasUV(SamplePosition4.xy, DepthValidRegionScale, LumenCardData);

                // The 4 depth samples.
                float Depth1 = Texture2DSampleLevel(DepthAtlas, TRACING_ATLAS_SAMPLER, DepthAtlasUV1, DepthMip).x;
                float Depth2 = Texture2DSampleLevel(DepthAtlas, TRACING_ATLAS_SAMPLER, DepthAtlasUV2, DepthMip).x;
                float Depth3 = Texture2DSampleLevel(DepthAtlas, TRACING_ATLAS_SAMPLER, DepthAtlasUV3, DepthMip).x;
                float Depth4 = Texture2DSampleLevel(DepthAtlas, TRACING_ATLAS_SAMPLER, DepthAtlasUV4, DepthMip).x;

                // The 4 heightfield Z values.
                float HeightfieldZ1 = LumenCardData.LocalExtent.z - Depth1 * 2 * LumenCardData.LocalExtent.z;
                float HeightfieldZ2 = LumenCardData.LocalExtent.z - Depth2 * 2 * LumenCardData.LocalExtent.z;
                float HeightfieldZ3 = LumenCardData.LocalExtent.z - Depth3 * 2 * LumenCardData.LocalExtent.z;
                float HeightfieldZ4 = LumenCardData.LocalExtent.z - Depth4 * 2 * LumenCardData.LocalExtent.z;

                ConeStepBlendState.RegisterStep(NumStepsPerLoop);

                // Whether each sample lies below the heightfield.
                bool4 HeightfieldHit = bool4(
                    SamplePosition1.z < HeightfieldZ1,
                    SamplePosition2.z < HeightfieldZ2,
                    SamplePosition3.z < HeightfieldZ3,
                    SamplePosition4.z < HeightfieldZ4);

                bool bRayBelowHeightfield = any(HeightfieldHit);
                bool bRayWasAboveSurface = bRayAboveSurface;

                if (!bRayBelowHeightfield)
                {
                    bRayAboveSurface = true;
                }

                // A trace that starts below the heightfield must first rise above it before it can register a hit.
                if (bRayBelowHeightfield && bRayWasAboveSurface)
                {
                    float HeightfieldZ;
                    if (HeightfieldHit.x)
                    {
                        SamplePosition = SamplePosition1;
                        HeightfieldZ = HeightfieldZ1;
                        StepTime = StepTime + 0 * StepSize;
                    }
                    else if (HeightfieldHit.y)
                    {
                        PreviousSamplePosition = SamplePosition1;
                        PreviousHeightfieldZ = HeightfieldZ1;
                        PreviousStepTime = StepTime + 0 * StepSize;

                        SamplePosition = SamplePosition2;
                        HeightfieldZ = HeightfieldZ2;
                        StepTime = StepTime + 1 * StepSize;
                    }
                    else if (HeightfieldHit.z)
                    {
                        PreviousSamplePosition = SamplePosition2;
                        PreviousHeightfieldZ = HeightfieldZ2;
                        PreviousStepTime = StepTime + 1 * StepSize;

                        SamplePosition = SamplePosition3;
                        HeightfieldZ = HeightfieldZ3;
                        StepTime = StepTime + 2 * StepSize;
                    }
                    else
                    {
                        PreviousSamplePosition = SamplePosition3;
                        PreviousHeightfieldZ = HeightfieldZ3;
                        PreviousStepTime = StepTime + 2 * StepSize;

                        SamplePosition = SamplePosition4;
                        HeightfieldZ = HeightfieldZ4;
                        StepTime = StepTime + 3 * StepSize;
                    }

                    StepTime = min(StepTime, TraceClampDistance);

                    if (PreviousHeightfieldZ != -2)
                    {
                        // Solve for the intersection step time by linear interpolation between the previous and current samples.
                        IntersectionStepTime = PreviousStepTime + ((PreviousSamplePosition.z - PreviousHeightfieldZ) * (StepTime - PreviousStepTime)) / (HeightfieldZ - PreviousHeightfieldZ + PreviousSamplePosition.z - SamplePosition.z);

                        float2 LocalPositionSlopeXY = (SamplePosition.xy - PreviousSamplePosition.xy) / (StepTime - PreviousStepTime);
                        IntersectionSamplePositionXY = LocalPositionSlopeXY * (IntersectionStepTime - PreviousStepTime) + PreviousSamplePosition.xy;

                        IntersectionSlope = abs(PreviousHeightfieldZ - HeightfieldZ) / max(length(PreviousSamplePosition.xy - SamplePosition.xy), .0001f);

                        PreviousHeightfieldZ = -2;
                        // A surface was found.
                        bFoundSurface = true;
                    }
                    break;
                }

                PreviousStepTime = StepTime + 3 * StepSize;
                PreviousSamplePosition = SamplePosition4;
                PreviousHeightfieldZ = HeightfieldZ4;
                StepTime += 4 * StepSize;

                if (StepTime >= TraceEndDistance && !bClampedToEnd)
                {
                    bClampedToEnd = true;
                    // Stop the last step just before the intersection end, since the linear approximation needs to step past the surface to detect a hit, without terminating the loop
                    StepTime = TraceClampDistance;
                }
            }

            // If a surface point was found.
            if (bFoundSurface)
            {
                float IntersectionSampleRadius = TraceInput.ConeStartRadius + TraceInput.TanConeAngle * IntersectionStepTime;

                float MaxMip;
                float2 ValidRegionScale;
                CalculateMip(IntersectionSampleRadius, LumenCardData, LumenCardData.LocalExtent, LumenCardData.MaxMip, MaxMip, ValidRegionScale);

                float2 IntersectionAtlasUV = CalculateAtlasUV(IntersectionSamplePositionXY, ValidRegionScale, LumenCardData);

                float DistanceToSurface = 0;
                float ConeIntersectSurface = saturate(DistanceToSurface / IntersectionSampleRadius);
                float ConeVisibility = ConeIntersectSurface;

                float MaxDistanceFade = 1;

                ConeStepBlendState.RegisterOpaqueHit(IntersectionStepTime);
                OutTraceEndDistance = IntersectionStepTime;

                float Opacity = Texture2DSampleLevel(OpacityAtlas, TRACING_ATLAS_SAMPLER, IntersectionAtlasUV, MaxMip).x;
                float ConeOcclusion = (1.0f - ConeVisibility) * Opacity * MaxDistanceFade;

                #if VISIBILITY_ONLY_TRACE
                    float3 StepLighting = 0;
                #else
                    float3 StepLighting = Texture2DSampleLevel(FinalLightingAtlas, TRACING_ATLAS_SAMPLER, IntersectionAtlasUV, MaxMip).rgb;
                #endif
            
                if (TraceInput.bBlackOutSteepIntersections)
                {
                    // Assume steep regions are covered by other cards, and fade them out.
                    float SlopeFade = 1 - saturate((IntersectionSlope - 5) / 1.0f);
                    StepLighting = lerp(0, StepLighting, SlopeFade);
                    ConeOcclusion = lerp(0, ConeOcclusion, SlopeFade);
                }

                ConeStepBlendState.AddLighting(StepLighting, ConeOcclusion, IntersectionStepTime);
            }

            CardTraceBlendState.AddCardTrace(ConeStepBlendState);
        }
    }
}
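
The IntersectionStepTime expression in ConeTraceSingleLumenCard is easier to read once it is written in terms of the gap between the ray and the heightfield. The following is a minimal sketch under stated assumptions: `HeightSample` and `IntersectStepTime` are hypothetical names, and the previous sample is assumed to lie on or above the surface while the current one lies below it. If the gap is treated as linear between the two samples, the crossing sits where the gap reaches zero.

```cpp
#include <cassert>

// One marching sample: distance along the ray, the ray's height at that
// distance, and the heightfield height fetched at the same position.
struct HeightSample {
    float stepTime;
    float rayZ;
    float heightfieldZ;
};

// Assumes prev is on or above the surface and curr is below it.
float IntersectStepTime(const HeightSample& prev, const HeightSample& curr)
{
    const float gapPrev = prev.rayZ - prev.heightfieldZ;  // >= 0: ray above the surface
    const float gapCurr = curr.rayZ - curr.heightfieldZ;  // <  0: ray below the surface
    assert(gapPrev >= 0.0f && gapCurr < 0.0f);

    // The gap falls linearly from gapPrev to gapCurr, so it reaches zero
    // after the fraction gapPrev / (gapPrev - gapCurr) of the step.
    return prev.stepTime + gapPrev * (curr.stepTime - prev.stepTime) / (gapPrev - gapCurr);
}
```

Expanding `gapPrev - gapCurr` gives `HeightfieldZ - PreviousHeightfieldZ + PreviousSamplePosition.z - SamplePosition.z`, which is the denominator used in the shader.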

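As the code above shows, the RadianceCache stage runs through an intricate rendering process: TraceFromProbes alone first cone-traces the voxel lighting, then the cards of the distant scene, and finally accounts for the sky light.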
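
The order in which TraceForProbeTexel combines these sources is a front-to-back composite: each farther layer is weighted by the transparency left over from the nearer ones. The following is a minimal sketch, assuming single-channel lighting and hypothetical names (`ConeResult`, `CompositeBehind`, `TraceProbeTexelSketch`); in the shader the distant-scene and sky-light steps are additionally guarded by compile-time switches, which are omitted here.

```cpp
// Accumulated result of a cone trace (single channel for brevity).
struct ConeResult {
    float lighting;
    float transparency;  // 1 = nothing hit yet, 0 = fully occluded
};

// Add a farther layer behind what has been accumulated so far.
void CompositeBehind(ConeResult& accumulated, const ConeResult& layer)
{
    accumulated.lighting += layer.lighting * accumulated.transparency;
    accumulated.transparency *= layer.transparency;
}

// Voxels first, then the distant scene, then the sky light.
ConeResult TraceProbeTexelSketch(const ConeResult& voxels, const ConeResult& distant, float skyLighting)
{
    ConeResult result{0.0f, 1.0f};
    CompositeBehind(result, voxels);

    // Skip the distant scene when the cone is already almost opaque.
    if (result.transparency > 0.01f) {
        CompositeBehind(result, distant);
    }

    // The sky only fills in what is still visible; it does not change transparency.
    result.lighting += skyLighting * result.transparency;
    return result;
}
```

For example, with a half-transparent voxel result of 0.5, a half-transparent distant result of 1.0, and a sky value of 2.0, the three terms contribute 0.5, 0.5, and 0.5 respectively.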

  • TraceScreenProbes

TraceScreenProbes covers the screen-space probe traces, the mesh distance field traces, the voxel lighting traces, and more. The code is as follows:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenScreenProbeTracing.cpp

void TraceScreenProbes(
    FRDGBuilder& GraphBuilder, 
    const FScene* Scene,
    const FViewInfo& View, 
    bool bTraceMeshSDFs,
    TRDGUniformBufferRef<FSceneTextureUniformParameters> SceneTexturesUniformBuffer,
    const ScreenSpaceRayTracing::FPrevSceneColorMip& PrevSceneColor,
    FRDGTextureRef LightingChannelsTexture,
    const FLumenCardTracingInputs& TracingInputs,
    const LumenRadianceCache::FRadianceCacheInterpolationParameters& RadianceCacheParameters,
    FScreenProbeParameters& ScreenProbeParameters,
    FLumenMeshSDFGridParameters& MeshSDFGridParameters)
{
    const FSceneTextureParameters SceneTextures = GetSceneTextureParameters(GraphBuilder, SceneTexturesUniformBuffer);

    // Clear the probe traces.
    {
        FClearTracesCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FClearTracesCS::FParameters>();
        PassParameters->ScreenProbeParameters = ScreenProbeParameters;

        auto ComputeShader = View.ShaderMap->GetShader<FClearTracesCS>(0);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("ClearTraces %ux%u", ScreenProbeParameters.ScreenProbeTracingOctahedronResolution, ScreenProbeParameters.ScreenProbeTracingOctahedronResolution),
            ComputeShader,
            PassParameters,
            ScreenProbeParameters.ProbeIndirectArgs,
            (uint32)EScreenProbeIndirectArgs::ThreadPerTrace * sizeof(FRHIDispatchIndirectParameters));
    }

    FLumenIndirectTracingParameters IndirectTracingParameters;
    SetupLumenDiffuseTracingParameters(IndirectTracingParameters);

    const bool bTraceScreen = View.PrevViewInfo.ScreenSpaceRayTracingInput.IsValid() 
        && GLumenScreenProbeGatherScreenTraces != 0
        && !View.Family->EngineShowFlags.VisualizeLumenIndirectDiffuse;

    // Screen-space traces for the probes.
    if (bTraceScreen)
    {
        FScreenProbeTraceScreenTexturesCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeTraceScreenTexturesCS::FParameters>();

        ScreenSpaceRayTracing::SetupCommonScreenSpaceRayParameters(GraphBuilder, SceneTextures, PrevSceneColor, View, /* out */ &PassParameters->ScreenSpaceRayParameters);

        PassParameters->ScreenSpaceRayParameters.CommonDiffuseParameters.SceneTextures = SceneTextures;

        {
            const FVector2D HZBUvFactor(
                float(View.ViewRect.Width()) / float(2 * View.HZBMipmap0Size.X),
                float(View.ViewRect.Height()) / float(2 * View.HZBMipmap0Size.Y));

            const FVector4 ScreenPositionScaleBias = View.GetScreenPositionScaleBias(SceneTextures.SceneDepthTexture->Desc.Extent, View.ViewRect);
            const FVector2D HZBUVToScreenUVScale = FVector2D(1.0f / HZBUvFactor.X, 1.0f / HZBUvFactor.Y) * FVector2D(2.0f, -2.0f) * FVector2D(ScreenPositionScaleBias.X, ScreenPositionScaleBias.Y);
            const FVector2D HZBUVToScreenUVBias = FVector2D(-1.0f, 1.0f) * FVector2D(ScreenPositionScaleBias.X, ScreenPositionScaleBias.Y) + FVector2D(ScreenPositionScaleBias.W, ScreenPositionScaleBias.Z);
            PassParameters->HZBUVToScreenUVScaleBias = FVector4(HZBUVToScreenUVScale, HZBUVToScreenUVBias);
        }

        checkf(View.ClosestHZB, TEXT("Lumen screen tracing: ClosestHZB was not setup, should have been setup by FDeferredShadingSceneRenderer::RenderHzb"));
        PassParameters->ClosestHZBTexture = View.ClosestHZB;
        PassParameters->SceneDepthTexture = SceneTextures.SceneDepthTexture;
        PassParameters->LightingChannelsTexture = LightingChannelsTexture;
        PassParameters->HZBBaseTexelSize = FVector2D(1.0f / View.ClosestHZB->Desc.Extent.X, 1.0f / View.ClosestHZB->Desc.Extent.Y);
        PassParameters->MaxHierarchicalScreenTraceIterations = GLumenScreenProbeGatherHierarchicalScreenTracesMaxIterations;
        PassParameters->UncertainTraceRelativeDepthThreshold = GLumenScreenProbeGatherUncertainTraceRelativeDepthThreshold;
        PassParameters->NumThicknessStepsToDetermineCertainty = GLumenScreenProbeGatherNumThicknessStepsToDetermineCertainty;

        PassParameters->ScreenProbeParameters = ScreenProbeParameters;
        PassParameters->IndirectTracingParameters = IndirectTracingParameters;
        PassParameters->RadianceCacheParameters = RadianceCacheParameters;

        FScreenProbeTraceScreenTexturesCS::FPermutationDomain PermutationVector;
        PermutationVector.Set< FScreenProbeTraceScreenTexturesCS::FRadianceCache >(LumenScreenProbeGather::UseRadianceCache(View));
        PermutationVector.Set< FScreenProbeTraceScreenTexturesCS::FHierarchicalScreenTracing >(GLumenScreenProbeGatherHierarchicalScreenTraces != 0);
        PermutationVector.Set< FScreenProbeTraceScreenTexturesCS::FStructuredImportanceSampling >(LumenScreenProbeGather::UseImportanceSampling(View));
        auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeTraceScreenTexturesCS>(PermutationVector);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("TraceScreen"),
            ComputeShader,
            PassParameters,
            ScreenProbeParameters.ProbeIndirectArgs,
            (uint32)EScreenProbeIndirectArgs::ThreadPerTrace * sizeof(FRHIDispatchIndirectParameters));
    }

    // Trace the mesh distance fields.
    if (bTraceMeshSDFs)
    {
        // Hardware ray tracing path.
        if (Lumen::UseHardwareRayTracedScreenProbeGather())
        {
            FCompactedTraceParameters CompactedTraceParameters = CompactTraces(
                GraphBuilder,
                View,
                ScreenProbeParameters,
                WORLD_MAX,
                IndirectTracingParameters.MaxTraceDistance);

            RenderHardwareRayTracingScreenProbe(GraphBuilder,
                Scene,
                SceneTextures,
                ScreenProbeParameters,
                View,
                TracingInputs,
                IndirectTracingParameters,
                RadianceCacheParameters,
                CompactedTraceParameters);
        }
        // Software tracing path.
        else
        {
            CullForCardTracing(
                GraphBuilder,
                Scene, View,
                TracingInputs,
                IndirectTracingParameters,
                /* out */ MeshSDFGridParameters);

            if (MeshSDFGridParameters.TracingParameters.DistanceFieldObjectBuffers.NumSceneObjects > 0)
            {
                FCompactedTraceParameters CompactedTraceParameters = CompactTraces(
                    GraphBuilder,
                    View,
                    ScreenProbeParameters,
                    IndirectTracingParameters.CardTraceEndDistanceFromCamera,
                    IndirectTracingParameters.MaxMeshSDFTraceDistance);

                {
                    FScreenProbeTraceMeshSDFsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeTraceMeshSDFsCS::FParameters>();
                    GetLumenCardTracingParameters(View, TracingInputs, PassParameters->TracingParameters);
                    PassParameters->MeshSDFGridParameters = MeshSDFGridParameters;
                    PassParameters->ScreenProbeParameters = ScreenProbeParameters;
                    PassParameters->IndirectTracingParameters = IndirectTracingParameters;
                    PassParameters->SceneTexturesStruct = SceneTexturesUniformBuffer;
                    PassParameters->CompactedTraceParameters = CompactedTraceParameters;

                    FScreenProbeTraceMeshSDFsCS::FPermutationDomain PermutationVector;
                    PermutationVector.Set< FScreenProbeTraceMeshSDFsCS::FStructuredImportanceSampling >(LumenScreenProbeGather::UseImportanceSampling(View));
                    auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeTraceMeshSDFsCS>(PermutationVector);

                    FComputeShaderUtils::AddPass(
                        GraphBuilder,
                        RDG_EVENT_NAME("TraceMeshSDFs"),
                        ComputeShader,
                        PassParameters,
                        CompactedTraceParameters.IndirectArgs,
                        0);
                }
            }
        }
    }

    // Compact the traces for the voxel pass.
    FCompactedTraceParameters CompactedTraceParameters = CompactTraces(
        GraphBuilder,
        View,
        ScreenProbeParameters,
        WORLD_MAX,
        // Make sure the shader runs on all misses to apply radiance cache + skylight
        IndirectTracingParameters.MaxTraceDistance + 1);

    // Trace the voxel lighting.
    {
        FScreenProbeTraceVoxelsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeTraceVoxelsCS::FParameters>();
        PassParameters->RadianceCacheParameters = RadianceCacheParameters;
        GetLumenCardTracingParameters(View, TracingInputs, PassParameters->TracingParameters);
        PassParameters->ScreenProbeParameters = ScreenProbeParameters;
        PassParameters->IndirectTracingParameters = IndirectTracingParameters;
        PassParameters->SceneTexturesStruct = SceneTexturesUniformBuffer;
        PassParameters->CompactedTraceParameters = CompactedTraceParameters;

        const bool bRadianceCache = LumenScreenProbeGather::UseRadianceCache(View);

        FScreenProbeTraceVoxelsCS::FPermutationDomain PermutationVector;
        PermutationVector.Set< FScreenProbeTraceVoxelsCS::FDynamicSkyLight >(Lumen::ShouldHandleSkyLight(Scene, *View.Family));
        PermutationVector.Set< FScreenProbeTraceVoxelsCS::FTraceDistantScene >(Scene->LumenSceneData->DistantCardIndices.Num() > 0);
        PermutationVector.Set< FScreenProbeTraceVoxelsCS::FRadianceCache >(bRadianceCache);
        PermutationVector.Set< FScreenProbeTraceVoxelsCS::FStructuredImportanceSampling >(LumenScreenProbeGather::UseImportanceSampling(View));
        auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeTraceVoxelsCS>(PermutationVector);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("TraceVoxels"),
            ComputeShader,
            PassParameters,
            CompactedTraceParameters.IndirectArgs,
            0);
    }

    if (GLumenScreenProbeGatherVisualizeTraces)
    {
        SetupVisualizeTraces(GraphBuilder, Scene, View, ScreenProbeParameters);
    }
}
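
The control flow of TraceScreenProbes reduces to a short decision sequence. The following is a minimal sketch that only lists which compute passes would be scheduled for a given configuration: `TracePassConfig` and `ListTracePasses` are hypothetical, the pass names are taken from the RDG_EVENT_NAME strings above except "HardwareRayTracing", which is a placeholder, and helper work such as CullForCardTracing and CompactTraces is left out.

```cpp
#include <string>
#include <vector>

// Hypothetical summary of the conditions checked inside TraceScreenProbes.
struct TracePassConfig {
    bool traceScreen;            // stands in for bTraceScreen
    bool traceMeshSDFs;          // stands in for bTraceMeshSDFs
    bool useHardwareRayTracing;  // stands in for Lumen::UseHardwareRayTracedScreenProbeGather()
    int numSceneObjects;         // stands in for DistanceFieldObjectBuffers.NumSceneObjects
};

std::vector<std::string> ListTracePasses(const TracePassConfig& config)
{
    std::vector<std::string> passes;
    passes.push_back("ClearTraces");  // always runs first

    if (config.traceScreen) {
        passes.push_back("TraceScreen");
    }

    if (config.traceMeshSDFs) {
        if (config.useHardwareRayTracing) {
            passes.push_back("HardwareRayTracing");  // placeholder name, not from the source
        } else if (config.numSceneObjects > 0) {
            passes.push_back("TraceMeshSDFs");
        }
    }

    passes.push_back("TraceVoxels");  // always runs last, on the remaining traces
    return passes;
}
```

The voxel pass is unconditional because, as the source comment notes, it must run on all misses in order to apply the radiance cache and the sky light.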

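Start with TraceScreen, using a frame capture as reference. Its inputs are textures such as BlueNoise, Velocity, depth, probe velocity, ray info, HZB, and SSRReducedSceneColor; its outputs are the TraceRadiance texture in R11G11B10 format and the TraceHit texture in R32 format: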

Left: TraceRadiance; right: TraceHit.

The compute shader it uses is as follows:

// Engine\Shaders\Private\Lumen\LumenScreenProbeTracing.usf

[numthreads(PROBE_THREADGROUP_SIZE_2D, PROBE_THREADGROUP_SIZE_2D, 1)]
void ScreenProbeTraceScreenTexturesCS(
    uint3 GroupId : SV_GroupID,
    uint3 DispatchThreadId : SV_DispatchThreadID,
    uint3 GroupThreadId : SV_GroupThreadID)
{
#define DEINTERLEAVED_SCREEN_TRACING 1
    // Compute the probe atlas coordinate and the trace texel coordinate.
#if DEINTERLEAVED_SCREEN_TRACING
    uint2 AtlasSizeInProbes = uint2(ScreenProbeAtlasViewSize.x, (GetNumScreenProbes() + ScreenProbeAtlasViewSize.x - 1) / ScreenProbeAtlasViewSize.x);
    uint2 ScreenProbeAtlasCoord = DispatchThreadId.xy % AtlasSizeInProbes;
    uint2 TraceTexelCoord = DispatchThreadId.xy / AtlasSizeInProbes;
#else
    uint2 ScreenProbeAtlasCoord = DispatchThreadId.xy / ScreenProbeTracingOctahedronResolution;
    uint2 TraceTexelCoord = DispatchThreadId.xy - ScreenProbeAtlasCoord * ScreenProbeTracingOctahedronResolution;
#endif

    uint ScreenProbeIndex = ScreenProbeAtlasCoord.y * ScreenProbeAtlasViewSize.x + ScreenProbeAtlasCoord.x;

    uint2 ScreenProbeScreenPosition = GetScreenProbeScreenPosition(ScreenProbeIndex);
    uint2 ScreenTileCoord = GetScreenTileCoord(ScreenProbeScreenPosition);

    if (ScreenProbeIndex < GetNumScreenProbes() && all(TraceTexelCoord < ScreenProbeTracingOctahedronResolution))
    {
        float2 ScreenUV = GetScreenUVFromScreenProbePosition(ScreenProbeScreenPosition);
        float SceneDepth = GetScreenProbeDepth(ScreenProbeAtlasCoord);

        if (SceneDepth > 0.0f)
        {
            float3 WorldPosition = GetWorldPositionFromScreenUV(ScreenUV, SceneDepth);

            float2 ProbeUV;
            float ConeHalfAngle;
            // Get the probe tracing UV.
            GetProbeTracingUV(ScreenProbeAtlasCoord, TraceTexelCoord, GetProbeTexelCenter(ScreenTileCoord), 1, ProbeUV, ConeHalfAngle);

            float3 WorldConeDirection = OctahedralMapToDirection(ProbeUV);

            float DepthThresholdScale = HasDistanceFieldRepresentation(ScreenUV) ? 1.0f : ScreenTraceNoFallbackThicknessScale;

            {
                float TraceDistance = MaxTraceDistance;
                bool bCoveredByRadianceCache = false;
                #if RADIANCE_CACHE
                    float ProbeOcclusionDistance = GetRadianceProbeOcclusionDistanceWithInterpolation(WorldPosition, WorldConeDirection, bCoveredByRadianceCache);
                    TraceDistance = min(TraceDistance, ProbeOcclusionDistance);
                #endif


#if HIERARCHICAL_SCREEN_TRACING // Hierarchical screen tracing

                bool bHit;
                bool bUncertain;
                float3 HitUVz;

                // Screen trace.
                TraceScreen(
                    WorldPosition + View.PreViewTranslation,
                    WorldConeDirection,
                    TraceDistance,
                    HZBUvFactorAndInvFactor,
                    MaxHierarchicalScreenTraceIterations, 
                    UncertainTraceRelativeDepthThreshold * DepthThresholdScale,
                    NumThicknessStepsToDetermineCertainty,
                    bHit,
                    bUncertain,
                    HitUVz);
                
                float Level = 1;
                bool bWriteDepthOnMiss = true;
#else // Non-hierarchical screen tracing
    
                uint NumSteps = 16;
                float StartMipLevel = 1.0f;
                float MaxScreenTraceFraction = .2f;

                // Fixed-step-count screen traces only give decent quality when the trace distance is limited.
                float MaxWorldTraceDistance = SceneDepth * MaxScreenTraceFraction * 2.0 * GetTanHalfFieldOfView().x;
                TraceDistance = min(TraceDistance, MaxWorldTraceDistance);

                uint2 NoiseCoord = ScreenProbeAtlasCoord * ScreenProbeTracingOctahedronResolution + TraceTexelCoord;
                float StepOffset = InterleavedGradientNoise(NoiseCoord + 0.5f, 0);

                float RayRoughness = .2f;
                StepOffset = StepOffset - .9f;

                FSSRTCastingSettings CastSettings = CreateDefaultCastSettings();
                CastSettings.bStopWhenUncertain = true;

                bool bHit = false;
                float Level;
                float3 HitUVz;
                bool bRayWasClipped;

                // Initialize the screen-space ray from the world-space ray.
                FSSRTRay Ray = InitScreenSpaceRayFromWorldSpace(
                    WorldPosition + View.PreViewTranslation, WorldConeDirection,
                    /* WorldTMax = */ TraceDistance,
                    /* SceneDepth = */ SceneDepth,
                    /* SlopeCompareToleranceScale */ 2.0f * DepthThresholdScale,
                    /* bExtendRayToScreenBorder = */ false,
                    /* out */ bRayWasClipped);

                bool bUncertain;
                float3 DebugOutput;

                // Cast the screen-space ray.
                CastScreenSpaceRay(
                    FurthestHZBTexture, FurthestHZBTextureSampler,
                    StartMipLevel,
                    CastSettings,
                    Ray, RayRoughness, NumSteps, StepOffset,
                    HZBUvFactorAndInvFactor, false,
                    /* out */ DebugOutput,
                    /* out */ HitUVz,
                    /* out */ Level,
                    /* out */ bHit,
                    /* out */ bUncertain);

                // CastScreenSpaceRay skips Mesh SDF tracing in a lot of places where it shouldn't, in particular missing thin occluders due to low NumSteps.  
                bool bWriteDepthOnMiss = !bUncertain;

#endif
                bHit = bHit && !bUncertain;

                uint2 TraceCoord = GetTraceBufferCoord(ScreenProbeAtlasCoord, TraceTexelCoord);
                bool bFastMoving = false;

                // Handle the hit.
                if (bHit)
                {
                    float2 ReducedColorUV = HitUVz.xy * ColorBufferScaleBias.xy + ColorBufferScaleBias.zw;
                    ReducedColorUV = min(ReducedColorUV, ReducedColorUVMax);

                    float3 Lighting = ColorTexture.SampleLevel(ColorTextureSampler, ReducedColorUV, Level).rgb;
                    
                    #if DEBUG_VISUALIZE_TRACE_TYPES
                        RWTraceRadiance[TraceCoord] = float3(.5f, 0, 0) * View.PreExposure;
                    #else
                        RWTraceRadiance[TraceCoord] = Lighting;
                    #endif

                    float3 HitWorldVelocity;
                    {
                        float2 HitScreenUV = HitUVz.xy;
                        float2 HitScreenPosition = (HitScreenUV.xy - View.ScreenPositionScaleBias.wz) / View.ScreenPositionScaleBias.xy;

                        float HitDeviceZ = HitUVz.z;
                        float HitSceneDepth = ConvertFromDeviceZ(HitUVz.z);
                        float3 HitHistoryScreenPosition = GetHistoryScreenPosition(HitScreenPosition, HitScreenUV, HitDeviceZ);

                        float3 HitTranslatedWorldPosition = mul(float4(HitScreenPosition * HitSceneDepth, HitSceneDepth, 1), View.ScreenToTranslatedWorld).xyz;
                        HitWorldVelocity = HitTranslatedWorldPosition - GetPrevTranslatedWorldPosition(HitHistoryScreenPosition);
                    }

                    float ProbeWorldSpeed = ScreenProbeWorldSpeed.Load(int3(ScreenProbeAtlasCoord, 0)).x;
                    float HitWorldSpeed = length(HitWorldVelocity);

                    bFastMoving = abs(ProbeWorldSpeed - HitWorldSpeed) / max(SceneDepth, 100.0f) > RelativeSpeedDifferenceToConsiderLightingMoving;
                }

                // Store the hit distance on a hit, or when writing depth on a miss is requested.
                if (bHit || bWriteDepthOnMiss)
                {
                    float HitDistance = min(sqrt(ComputeRayHitSqrDistance(WorldPosition + View.PreViewTranslation, HitUVz)), MaxTraceDistance);
                    RWTraceHit[TraceCoord] = EncodeProbeRayDistance(HitDistance, bHit, bFastMoving);
                }
            }
        }
    }
}
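
The first step of the shader above, before the two tracing branches diverge, is to split each dispatch thread ID into a probe coordinate in the atlas and a texel inside that probe's octahedral tile. The following is a minimal CPU-side sketch of the `#else` branch of that decode, written in C++ for illustration only. The names `UInt2`, `ProbeTexel`, and `DecodeProbeTexel` are hypothetical and do not exist in the engine.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for HLSL's uint2.
struct UInt2 { uint32_t x, y; };

// Hypothetical result type: which probe, which texel inside it, and the linear probe index.
struct ProbeTexel {
    UInt2 atlasCoord;    // probe coordinate in the screen probe atlas
    UInt2 texelCoord;    // texel inside the probe's octahedral tile
    uint32_t probeIndex; // atlasCoord.y * atlasViewWidth + atlasCoord.x
};

// Mirrors the arithmetic of the #else branch:
//   ScreenProbeAtlasCoord = DispatchThreadId.xy / Resolution
//   TraceTexelCoord       = DispatchThreadId.xy - ScreenProbeAtlasCoord * Resolution
inline ProbeTexel DecodeProbeTexel(UInt2 dispatchThreadId,
                                   uint32_t octahedronResolution,
                                   uint32_t atlasViewWidth)
{
    assert(octahedronResolution > 0 && atlasViewWidth > 0);
    ProbeTexel result;
    result.atlasCoord = { dispatchThreadId.x / octahedronResolution,
                          dispatchThreadId.y / octahedronResolution };
    result.texelCoord = { dispatchThreadId.x - result.atlasCoord.x * octahedronResolution,
                          dispatchThreadId.y - result.atlasCoord.y * octahedronResolution };
    result.probeIndex = result.atlasCoord.y * atlasViewWidth + result.atlasCoord.x;
    return result;
}
```

With an 8x8 octahedral tile and an atlas 4 probes wide, thread (19, 9) maps to probe (2, 1), texel (3, 1), and probe index 6.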

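Depending on whether HIERARCHICAL_SCREEN_TRACING is defined, the shader above takes one of two screen tracing paths. The frame capture shows HIERARCHICAL_SCREEN_TRACING set to 1, so execution goes through TraceScreen rather than CastScreenSpaceRay. TraceScreen is analyzed next: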

// Engine\Shaders\Private\Lumen\LumenScreenTracing.ush

// Traces in screen space by traversing the HZB; accurate but relatively slow.
void TraceScreen(
    float3 RayTranslatedWorldOrigin, 
    float3 RayWorldDirection,
    float MaxWorldTraceDistance,
    float4 HZBUvFactorAndInvFactor,
    float MaxIterations,
    float UncertainTraceRelativeDepthThreshold,
    float NumThicknessStepsToDetermineCertainty,
    inout bool bHit,
    inout bool bUncertain,
    inout float3 OutScreenUV)
{
    // Compute the screen UV of the ray start.
    float3 RayStartScreenUV;
    {
        float4 RayStartClip = mul(float4(RayTranslatedWorldOrigin, 1.0f), View.TranslatedWorldToClip);
        float3 RayStartScreenPosition = RayStartClip.xyz / max(RayStartClip.w, 1.0f);
        RayStartScreenUV = float3((RayStartScreenPosition.xy * float2(0.5f, -0.5f) + 0.5f) * HZBUvFactorAndInvFactor.xy, RayStartScreenPosition.z);
    }
    
    // Compute the screen UV of the ray end.
    float3 RayEndScreenUV;
    {
        float3 ViewRayDirection = mul(float4(RayWorldDirection, 0.0), View.TranslatedWorldToView).xyz;
        float SceneDepth = mul(float4(RayTranslatedWorldOrigin, 1.0f), View.TranslatedWorldToView).z;
        // Clamp the ray to end at the Z == 0 plane so the end point is valid in NDC space.
        float RayEndWorldDistance = ViewRayDirection.z < 0.0 ? min(-0.99f * SceneDepth / ViewRayDirection.z, MaxWorldTraceDistance) : MaxWorldTraceDistance;

        float3 RayWorldEnd = RayTranslatedWorldOrigin + RayWorldDirection * RayEndWorldDistance;
        float4 RayEndClip = mul(float4(RayWorldEnd, 1.0f), View.TranslatedWorldToClip);
        float3 RayEndScreenPosition = RayEndClip.xyz / RayEndClip.w;
        RayEndScreenUV = float3((RayEndScreenPosition.xy * float2(0.5f, -0.5f) + 0.5f) * HZBUvFactorAndInvFactor.xy, RayEndScreenPosition.z);

        float2 ScreenEdgeIntersections = LineBoxIntersect(RayStartScreenUV, RayEndScreenUV, float3(0, 0, 0), float3(HZBUvFactorAndInvFactor.xy, 1));

        // Recompute the end point where the ray leaves the screen.
        RayEndScreenUV = RayStartScreenUV + (RayEndScreenUV - RayStartScreenUV) * ScreenEdgeIntersections.y;
    }

    float BaseMipLevel = HZB_TRACE_INCLUDE_FULL_RES_DEPTH ? -1 : 0;
    float MipLevel = BaseMipLevel;

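    // Step out of the current tile without a hit test to avoid self-intersection. This is necessary because HZB mip 0 stores the closest depth of each 2x2 block and the HZB is stored as 16-bit floats.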
    bool bStepOutOfCurrentTile = true;
    if (bStepOutOfCurrentTile)
    {
        float2 HZBTileSize = exp2(MipLevel) * HZBBaseTexelSize;
        float2 BiasedUV = RayStartScreenUV.xy;
        float3 HZBTileMin = float3(floor(BiasedUV.xy / HZBTileSize) * HZBTileSize, 0.0f);
        float3 HZBTileMax = float3(HZBTileMin.xy + HZBTileSize, 1);
        float2 TileIntersections = LineBoxIntersect(RayStartScreenUV, RayEndScreenUV, HZBTileMin, HZBTileMax);

        {
            float3 RayTileHit = RayStartScreenUV + (RayEndScreenUV - RayStartScreenUV) * TileIntersections.y;
            RayStartScreenUV = RayTileHit;
        }
    }

    bHit = false;
    bUncertain = false;

    float RayLength2D = length(RayEndScreenUV.xy - RayStartScreenUV.xy);
    float2 RayDirectionScreenUV = (RayEndScreenUV.xy - RayStartScreenUV.xy) / max(RayLength2D, .0001f);
    float3 RayScreenUV = RayStartScreenUV;
    float NumIterations = 0;
    
    // Stackless HZB traversal.
    while (MipLevel >= BaseMipLevel && NumIterations < MaxIterations)
    {
        float2 HZBTileSize = exp2(MipLevel) * HZBBaseTexelSize;
        // RayScreenUV is on a tile boundary due to bStepOutOfCurrentTile
        // Offset the UV along the ray direction so it always quantizes to the next tile
        float2 BiasedUV = RayScreenUV.xy + .01f * RayDirectionScreenUV.xy * HZBTileSize;
        float3 HZBTileMin = float3(floor(BiasedUV / HZBTileSize) * HZBTileSize, 0.0f);
        float3 HZBTileMax = float3(HZBTileMin.xy + HZBTileSize, 1);
        float2 TileIntersections = LineBoxIntersect(RayStartScreenUV, RayEndScreenUV, HZBTileMin, HZBTileMax);
        float3 RayTileHit = RayStartScreenUV + (RayEndScreenUV - RayStartScreenUV) * TileIntersections.y;

        float TileZ;
        float AvoidSelfIntersectionZScale = 1.0f;

#if HZB_TRACE_INCLUDE_FULL_RES_DEPTH
        if (MipLevel < 0)
        {
            TileZ = SceneDepthTexture.SampleLevel(GlobalPointClampedSampler, BiasedUV * HZBUVToScreenUVScaleBias.xy + HZBUVToScreenUVScaleBias.zw, 0).x;
        }
        else
#endif
        {
            TileZ = ClosestHZBTexture.SampleLevel(GlobalPointClampedSampler, BiasedUV, MipLevel).x;
            // Heuristic to avoid false self-intersection, since HZB mip 0 stores the closest depth of each 2x2 block and the HZB is stored as 16-bit floats.
            AvoidSelfIntersectionZScale = lerp(.99f, 1.0f, saturate(TileIntersections.y * 10.0f));
        }

        if (RayTileHit.z > TileZ * AvoidSelfIntersectionZScale)
        {
            RayScreenUV = RayTileHit;
            MipLevel++;

            if (TileIntersections.y == 1.0f)
            {
                // The ray did not intersect any HZB tile; dropping MipLevel below BaseMipLevel ends the loop.
                MipLevel = BaseMipLevel - 1;
            }
        }
        else
        {
            if (abs(MipLevel - BaseMipLevel) < .1f)
            {
                // Snap the hit UV to the texel center for the SceneColor lookup.
                RayScreenUV = float3(.5f * (HZBTileMin.xy + HZBTileMax.xy), RayTileHit.z);
                bHit = true;
                float IntersectionDepth = ConvertFromDeviceZ(TileZ);
                float RayTileEnterZ = RayStartScreenUV.z + (RayEndScreenUV.z - RayStartScreenUV.z) * TileIntersections.x;
                bUncertain = (ConvertFromDeviceZ(RayTileEnterZ) - IntersectionDepth) / max(IntersectionDepth, .00001f) > UncertainTraceRelativeDepthThreshold;
            }

            MipLevel--;
        }

        NumIterations++;
    }

    // Take linear steps along the ray to estimate thickness and reject hits behind very thin surfaces (grass, hair, foliage).
    if (bHit && !bUncertain && NumThicknessStepsToDetermineCertainty > 0)
    {
        float ThicknessSearchMipLevel = 0.0f;
        float MipNumTexels = exp2(ThicknessSearchMipLevel);
        float2 HZBTileSize = MipNumTexels * HZBBaseTexelSize;
        float NumSteps = NumThicknessStepsToDetermineCertainty / MipNumTexels;
        float ThicknessSearchEndTime = min(length(RayDirectionScreenUV * HZBTileSize * NumSteps) / length(RayEndScreenUV.xy - RayScreenUV.xy), 1.0f);

        for (float I = 0; I < NumSteps; I++)
        {
            float3 SampleUV = RayScreenUV + (I / NumSteps) * ThicknessSearchEndTime * (RayEndScreenUV - RayScreenUV);

            if (all(SampleUV.xy > 0 && SampleUV.xy < HZBUvFactorAndInvFactor.xy))
            {
                float SampleTileZ = ClosestHZBTexture.SampleLevel(GlobalPointClampedSampler, SampleUV.xy, ThicknessSearchMipLevel).x;

                if (SampleUV.z > SampleTileZ)
                {
                    bUncertain = true;
                }
            }
        }
    }

    OutScreenUV.xy = RayScreenUV.xy * HZBUVToScreenUVScaleBias.xy + HZBUVToScreenUVScaleBias.zw;
    OutScreenUV.z = RayScreenUV.z;
}

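For ray tracing in HZB screen space, Lingqi Yan's graphics course GAMES202: High-Quality Real-Time Rendering, Lecture 9, Real-Time Global Illumination (Screen Space), is recommended; its video walks through the HZB traversal and tracing process in detail with animations. The figure below is a single frame taken from that video: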
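
To make the stackless traversal in TraceScreen easier to follow, here is a heavily simplified one-dimensional sketch in C++. It is not the engine code: it assumes conventional linear depth (smaller means closer) rather than device Z, works in texel units on a power-of-two row, and tests the deepest point of the ray inside each tile instead of only the exit point. `BuildClosestDepthMips` and `TraceHierarchical` are hypothetical names. The core idea matches the listing above: when the ray stays in front of a tile's closest depth, skip the tile and move to a coarser mip; otherwise descend, and report a hit at the finest mip.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Builds a mip chain where level k stores the closest (minimum) depth over 2^k texels.
inline std::vector<std::vector<float>> BuildClosestDepthMips(const std::vector<float>& depth)
{
    assert(!depth.empty() && (depth.size() & (depth.size() - 1)) == 0); // power of two
    std::vector<std::vector<float>> mips{depth};
    while (mips.back().size() > 1) {
        const std::vector<float>& prev = mips.back();
        std::vector<float> next(prev.size() / 2);
        for (std::size_t i = 0; i < next.size(); ++i)
            next[i] = std::min(prev[2 * i], prev[2 * i + 1]);
        mips.push_back(std::move(next));
    }
    return mips;
}

// Traces a ray from (x0, z0) to (x1, z1), x in texel units and x1 > x0.
// Returns the index of the first texel the ray passes behind, or -1 on a miss.
inline int TraceHierarchical(const std::vector<std::vector<float>>& mips,
                             float x0, float z0, float x1, float z1,
                             int maxIterations)
{
    assert(x1 > x0 && !mips.empty());
    const int topLevel = static_cast<int>(mips.size()) - 1;
    const float width = static_cast<float>(mips[0].size());
    auto depthAt = [&](float x) { return z0 + (z1 - z0) * (x - x0) / (x1 - x0); };

    float x = x0;
    int level = 0;
    for (int i = 0; i < maxIterations; ++i) {
        if (x >= x1 || x >= width)
            return -1; // ray ended or left the screen
        const float tileSize = static_cast<float>(1 << level);
        const float tileMin = std::floor(x / tileSize) * tileSize;
        const float tileExit = std::min(tileMin + tileSize, x1);
        const float rayDeepest = std::max(depthAt(x), depthAt(tileExit));
        const float tileClosest = mips[level][static_cast<std::size_t>(tileMin / tileSize)];

        if (rayDeepest < tileClosest) {
            x = tileExit;                          // nothing to hit: skip the tile
            level = std::min(level + 1, topLevel); // and try a coarser mip
        } else if (level == 0) {
            return static_cast<int>(tileMin);      // hit at the finest level
        } else {
            --level;                               // possible hit: refine
        }
    }
    return -1; // iteration budget exhausted
}
```

This sketch ignores surface thickness entirely. The real shader additionally flags a hit as uncertain when the ray enters the tile too far behind the stored depth, and performs the extra linear thickness steps shown at the end of TraceScreen.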

  • TraceVoxels

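The voxel tracing pass takes as input the global distance field, normals, depth, sky light, blue noise, VoxelLighting, RadianceProbeIndirectTexture, FinalRadianceAtlas, ray information, and so on. Its outputs are TraceHit in R32 format and TraceRadiance in R11G11B10 format:

TraceVoxels output texture TraceHit, which stores the depth of the hit point. Note that the display range in the upper-right corner was adjusted.

TraceVoxels output texture TraceRadiance, which stores the radiance at the hit point.

Next, the compute shader it uses: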

// Engine\Shaders\Private\Lumen\LumenScreenProbeTracing.usf

[numthreads(PROBE_THREADGROUP_SIZE_1D, 1, 1)]
void ScreenProbeTraceVoxelsCS(
    uint3 GroupId : SV_GroupID,
    uint3 DispatchThreadId : SV_DispatchThreadID,
    uint3 GroupThreadId : SV_GroupThreadID)
{
    if (DispatchThreadId.x < CompactedTraceTexelAllocator[0])
    {
        uint ScreenProbeIndex;
        uint2 TraceTexelCoord;
        float TraceHitDistance;
        // Decode the trace texel to process.
        DecodeTraceTexel(CompactedTraceTexelData[DispatchThreadId.x], ScreenProbeIndex, TraceTexelCoord, TraceHitDistance);

        // Compute the probe's coordinate in the atlas.
        uint2 ScreenProbeAtlasCoord = uint2(ScreenProbeIndex % ScreenProbeAtlasViewSize.x, ScreenProbeIndex / ScreenProbeAtlasViewSize.x);
        // Trace voxel lighting for this probe texel.
        TraceVoxels(ScreenProbeAtlasCoord, TraceTexelCoord, ScreenProbeIndex, TraceHitDistance);
    }
}

void TraceVoxels(
    uint2 ScreenProbeAtlasCoord,
    uint2 TraceTexelCoord,
    uint ScreenProbeIndex,
    float TraceHitDistance)
{
    // Compute the probe's screen position, screen tile, and trace buffer coordinate.
    uint2 ScreenProbeScreenPosition = GetScreenProbeScreenPosition(ScreenProbeIndex);
    uint2 ScreenTileCoord = GetScreenTileCoord(ScreenProbeScreenPosition);

    uint2 TraceCoord = GetTraceBufferCoord(ScreenProbeAtlasCoord, TraceTexelCoord);
    
    {
        // Fetch the screen-space data.
        float2 ScreenUV = GetScreenUVFromScreenProbePosition(ScreenProbeScreenPosition);
        float SceneDepth = GetScreenProbeDepth(ScreenProbeAtlasCoord);
        float3 SceneNormal = DecodeNormal(SceneTexturesStruct.GBufferATexture.Load(int3(ScreenUV * View.BufferSizeAndInvSize.xy, 0)).xyz);

        bool bHit = false;

        {
            // Compute the world position.
            float3 WorldPosition = GetWorldPositionFromScreenUV(ScreenUV, SceneDepth);

            float2 ProbeUV;
            float ConeHalfAngle;
            // Get the probe tracing UV.
            GetProbeTracingUV(ScreenProbeAtlasCoord, TraceTexelCoord, GetProbeTexelCenter(ScreenTileCoord), 1, ProbeUV, ConeHalfAngle);

            // Convert the octahedral map UV back to a direction.
            float3 WorldConeDirection = OctahedralMapToDirection(ProbeUV);

            // Sample position, biased along the cone direction and the surface normal.
            float3 SamplePosition = WorldPosition + SurfaceBias * WorldConeDirection;
            SamplePosition += SurfaceBias * SceneNormal;

            float TraceDistance = MaxTraceDistance;
            bool bCoveredByRadianceCache = false;
#if RADIANCE_CACHE
            float ProbeOcclusionDistance = GetRadianceProbeOcclusionDistanceWithInterpolation(WorldPosition, WorldConeDirection, bCoveredByRadianceCache);
            TraceDistance = min(TraceDistance, ProbeOcclusionDistance);
#endif

            // Set up the cone trace input.
            FConeTraceInput TraceInput;
            TraceInput.Setup(SamplePosition, WorldConeDirection, ConeHalfAngle, MinSampleRadius, MinTraceDistance, TraceDistance, StepFactor);
            TraceInput.VoxelStepFactor = VoxelStepFactor;
            TraceInput.VoxelTraceStartDistance = max(MinTraceDistance, TraceHitDistance);

            // Set up the cone trace output.
            FConeTraceResult TraceResult = (FConeTraceResult)0;
            TraceResult.Lighting = 0;
            TraceResult.Transparency = 1;
            TraceResult.OpaqueHitDistance = TraceInput.MaxTraceDistance;

            // Cone trace the Lumen scene's lighting voxels.
            ConeTraceLumenSceneVoxels(TraceInput, TraceResult);

            if (TraceResult.Transparency <= .5f)
            {
                // Self-intersection from grazing-angle traces causes noise that the spatial filter cannot remove.
                #define USE_VOXEL_TRACE_HIT_DISTANCE 0
                #if USE_VOXEL_TRACE_HIT_DISTANCE
                    TraceHitDistance = TraceResult.OpaqueHitDistance;
                #else
                    TraceHitDistance = TraceDistance;
                #endif
                bHit = true;
            }

#if RADIANCE_CACHE
            if (bCoveredByRadianceCache)
            {
                if (TraceResult.Transparency > .5f)
                {
                    // The depth of Radiance Cache hits is not stored.
                    TraceHitDistance = MaxTraceDistance;
                }

                SampleRadianceCacheAndApply(WorldPosition, WorldConeDirection, ConeHalfAngle, float3(0, 0, 0), TraceResult.Lighting, TraceResult.Transparency);
            }
            else
#endif
            {
#if TRACE_DISTANT_SCENE
                // Trace the distant scene.
                if (TraceResult.Transparency > .01f)
                {
                    FConeTraceResult DistantTraceResult;
                    ConeTraceLumenDistantScene(TraceInput, DistantTraceResult);
                    TraceResult.Lighting += DistantTraceResult.Lighting * TraceResult.Transparency;
                    TraceResult.Transparency *= DistantTraceResult.Transparency;
                }
#endif
                // Evaluate the sky light.
                EvaluateSkyRadianceForCone(WorldConeDirection, tan(ConeHalfAngle), TraceResult);

                if (TraceHitDistance >= GetProbeMaxHitDistance())
                {
                    TraceHitDistance = MaxTraceDistance;
                }
            }
            
            #if USE_PREEXPOSURE
                TraceResult.Lighting *= View.PreExposure;
            #endif

            #if DEBUG_VISUALIZE_TRACE_TYPES
                RWTraceRadiance[TraceCoord] = float3(0, 0, .5f) * View.PreExposure;
            #else
                RWTraceRadiance[TraceCoord] = TraceResult.Lighting;
            #endif
        }

        // Store the trace result, encoding hit distance, hit flag, and moving flag into a 32-bit unsigned integer.
        RWTraceHit[TraceCoord] = EncodeProbeRayDistance(TraceHitDistance, bHit, false);
    }
}
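
The distant-scene branch of TraceVoxels above composites a farther trace result behind a nearer one: the farther lighting is attenuated by the transparency accumulated so far, and the transparencies multiply. The following C++ sketch isolates just that step. `Float3`, `ConeTraceResult`, and `CompositeBehind` are illustrative names for this example, not engine types.

```cpp
#include <cassert>

// Hypothetical stand-in for HLSL's float3.
struct Float3 { float r, g, b; };

// Hypothetical, reduced version of a cone trace result.
struct ConeTraceResult {
    Float3 lighting;    // radiance accumulated so far
    float transparency; // 1 = nothing hit yet, 0 = fully occluded
};

// Front-to-back compositing, matching the two lines in the TRACE_DISTANT_SCENE block:
//   Lighting     += Behind.Lighting * Transparency;
//   Transparency *= Behind.Transparency;
inline void CompositeBehind(ConeTraceResult& front, const ConeTraceResult& behind)
{
    assert(front.transparency >= 0.0f && front.transparency <= 1.0f);
    front.lighting.r += behind.lighting.r * front.transparency;
    front.lighting.g += behind.lighting.g * front.transparency;
    front.lighting.b += behind.lighting.b * front.transparency;
    front.transparency *= behind.transparency;
}
```

Because the nearer result is applied first, a fully opaque voxel hit (transparency 0) makes everything behind it contribute nothing, which is why the shader only traces the distant scene when the remaining transparency exceeds .01f.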
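
  • CompositeTraces

CompositeTraces takes the TraceHit, RayInfo, and TraceRadiance produced by the previous steps and generates the ScreenProbeRadiance, ScreenProbeHitDistance, and ScreenProbeTraceMoving textures. Its compute shader lives in LumenScreenProbeFiltering.usf, with ScreenProbeCompositeTracesWithScatterCS as the entry point; the code is omitted here.

  • FilterRadianceWithGather

CompositeTraces is followed by several FilterRadianceWithGather passes, which filter the probe radiance:

Left: ScreenProbeRadiance before filtering; right: ScreenProbeRadiance after several filtering passes.

  • ComputeIndirect

This stage uses the screen-space probe data generated earlier (depth, normals, base color, FilteredScreenProbeRadiance, BentNormal) to compute the final indirect lighting color of the scene (figure below):

6.5.7.3 RenderLumenReflections

RenderLumenReflections renders reflections for the smoother, low-roughness surfaces in the Lumen scene. Its flow resembles RenderLumenScreenProbeGather but is simpler and has fewer steps:

The C++ rendering code involved is as follows: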

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenReflections.cpp

FRDGTextureRef FDeferredShadingSceneRenderer::RenderLumenReflections(
    FRDGBuilder& GraphBuilder, 
    const FViewInfo& View,
    const FSceneTextures& SceneTextures,
    const FLumenMeshSDFGridParameters& MeshSDFGridParameters,
    FLumenReflectionCompositeParameters& OutCompositeParameters)
{
    // Maximum roughness for reflection tracing; rougher surfaces are skipped.
    OutCompositeParameters.MaxRoughnessToTrace = GLumenReflectionMaxRoughnessToTrace;
    OutCompositeParameters.InvRoughnessFadeLength = 1.0f / GLumenReflectionRoughnessFadeLength;

    (......)

    {
        (......)

        auto ComputeShader = View.ShaderMap->GetShader<FReflectionGenerateRaysCS>(0);

        // Ray generation pass.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("GenerateRaysCS"),
            ComputeShader,
            PassParameters,
            ReflectionTileParameters.TracingIndirectArgs,
            0);
    }

    FLumenCardTracingInputs TracingInputs(GraphBuilder, Scene, View);

    (......)

    // Trace reflections.
    TraceReflections(
        GraphBuilder, 
        Scene,
        View, 
        GLumenReflectionTraceMeshSDFs != 0 && Lumen::UseMeshSDFTracing(),
        SceneTextures,
        TracingInputs,
        ReflectionTracingParameters,
        ReflectionTileParameters,
        MeshSDFGridParameters);
    
    (......)

    {
        FReflectionResolveCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FReflectionResolveCS::FParameters>();
        
        (......)
        
        auto ComputeShader = View.ShaderMap->GetShader<FReflectionResolveCS>(PermutationVector);

        // Resolve reflections.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("ReflectionResolve"),
            ComputeShader,
            PassParameters,
            ReflectionTileParameters.ResolveIndirectArgs,
            0);
    }

    (......)

    // Update the history data.
    UpdateHistoryReflections(
        GraphBuilder,
        View,
        SceneTextures,
        ReflectionTileParameters,
        ResolvedSpecularIndirect,
        SpecularIndirect);

    return SpecularIndirect;
}

void TraceReflections(
    FRDGBuilder& GraphBuilder,
    const FScene* Scene,
    const FViewInfo& View,
    bool bTraceMeshSDFs,
    const FSceneTextures& SceneTextures,
    const FLumenCardTracingInputs& TracingInputs,
    const FLumenReflectionTracingParameters& ReflectionTracingParameters,
    const FLumenReflectionTileParameters& ReflectionTileParameters,
    const FLumenMeshSDFGridParameters& InMeshSDFGridParameters)
{
    {
        (......)

        auto ComputeShader = View.ShaderMap->GetShader<FReflectionClearTracesCS>(0);

        // Clear the trace output textures.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("ClearTraces"),
            ComputeShader,
            PassParameters,
            ReflectionTileParameters.TracingIndirectArgs,
            0);
    }

    FLumenIndirectTracingParameters IndirectTracingParameters;
    SetupIndirectTracingParametersForReflections(IndirectTracingParameters);

    const FSceneTextureParameters& SceneTextureParameters = GetSceneTextureParameters(GraphBuilder, SceneTextures);

    const bool bScreenTraces = GLumenReflectionScreenTraces != 0;

    if (bScreenTraces)
    {
        FReflectionTraceScreenTexturesCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FReflectionTraceScreenTexturesCS::FParameters>();

        (......)

        FReflectionTraceScreenTexturesCS::FPermutationDomain PermutationVector;
        auto ComputeShader = View.ShaderMap->GetShader<FReflectionTraceScreenTexturesCS>(PermutationVector);

        // Screen tracing.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("TraceScreen"),
            ComputeShader,
            PassParameters,
            ReflectionTileParameters.TracingIndirectArgs,
            0);
    }
    
    // Mesh distance field tracing.
    if (bTraceMeshSDFs)
    {
        if (Lumen::UseHardwareRayTracedReflections()) // Hardware ray-traced reflections.
        {
            FCompactedReflectionTraceParameters CompactedTraceParameters = CompactTraces(
                GraphBuilder,
                View,
                ReflectionTracingParameters,
                ReflectionTileParameters,
                WORLD_MAX,
                IndirectTracingParameters.MaxTraceDistance);

            RenderLumenHardwareRayTracingReflections(
                GraphBuilder,
                SceneTextureParameters,
                View,
                ReflectionTracingParameters,
                ReflectionTileParameters,
                TracingInputs,
                CompactedTraceParameters,
                IndirectTracingParameters.MaxTraceDistance);
        }
        else
        {
            FLumenMeshSDFGridParameters MeshSDFGridParameters = InMeshSDFGridParameters;
            if (!MeshSDFGridParameters.NumGridCulledMeshSDFObjects)
            {
                CullForCardTracing(
                    GraphBuilder,
                    Scene, View,
                    TracingInputs,
                    IndirectTracingParameters,
                    /* out */ MeshSDFGridParameters);
            }

            if (MeshSDFGridParameters.TracingParameters.DistanceFieldObjectBuffers.NumSceneObjects > 0)
            {
                // Compact the traces.
                FCompactedReflectionTraceParameters CompactedTraceParameters = CompactTraces(
                    GraphBuilder,
                    View,
                    ReflectionTracingParameters,
                    ReflectionTileParameters,
                    IndirectTracingParameters.CardTraceEndDistanceFromCamera,
                    IndirectTracingParameters.MaxMeshSDFTraceDistance);

                {
                    (......)
                    
                    auto ComputeShader = View.ShaderMap->GetShader<FReflectionTraceMeshSDFsCS>(PermutationVector);

                    // Trace mesh distance fields.
                    FComputeShaderUtils::AddPass(
                        GraphBuilder,
                        RDG_EVENT_NAME("TraceMeshSDFs"),
                        ComputeShader,
                        PassParameters,
                        CompactedTraceParameters.IndirectArgs,
                        0);
                }
            }
        }
    }

    FCompactedReflectionTraceParameters CompactedTraceParameters = CompactTraces(...);

    {
        (......)
        
        auto ComputeShader = View.ShaderMap->GetShader<FReflectionTraceVoxelsCS>(PermutationVector);

        // Trace voxel lighting.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("TraceVoxels"),
            ComputeShader,
            PassParameters,
            CompactedTraceParameters.IndirectArgs,
            0);
    }
}

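The most important difference between Lumen reflection indirect lighting and Lumen diffuse indirect lighting lies in how many rays they trace and how they trace them. Lumen reflections need a maximum roughness to trace, GLumenReflectionMaxRoughnessToTrace (default 0.4, adjustable with the console variable r.Lumen.Reflections.MaxRoughnessToTrace), and the resulting TraceHit and TraceRadiance differ as well.

Because reflections and diffuse share highly similar techniques, this article does not dig further into the technical details.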
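
RenderLumenReflections stores two values for the later composite: MaxRoughnessToTrace and InvRoughnessFadeLength, the reciprocal of a fade length. The listing does not show how they are consumed, so the C++ sketch below is an assumption about one plausible use: a weight that is 1 for smooth surfaces and falls linearly to 0 as roughness approaches the cutoff, used to blend traced reflections with the rough specular fallback. All function names here are hypothetical, and the fade length of 0.1 in the usage note is an illustrative value, not a documented default.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical mirror of the two composite parameters set in RenderLumenReflections.
struct ReflectionCompositeParams {
    float maxRoughnessToTrace;
    float invRoughnessFadeLength; // 1 / fade length, as in the C++ listing
};

inline ReflectionCompositeParams MakeCompositeParams(float maxRoughnessToTrace,
                                                     float roughnessFadeLength)
{
    assert(roughnessFadeLength > 0.0f);
    return { maxRoughnessToTrace, 1.0f / roughnessFadeLength };
}

// Assumed fade: 1 well below the cutoff, 0 at or above it, linear in between.
inline float TracedReflectionWeight(const ReflectionCompositeParams& params, float roughness)
{
    const float t = (params.maxRoughnessToTrace - roughness) * params.invRoughnessFadeLength;
    return std::min(std::max(t, 0.0f), 1.0f); // saturate
}

// Blends a traced reflection value with a rough-specular fallback using the weight above.
inline float BlendReflection(const ReflectionCompositeParams& params, float roughness,
                             float tracedReflection, float roughFallback)
{
    const float w = TracedReflectionWeight(params, roughness);
    return roughFallback + (tracedReflection - roughFallback) * w;
}
```

With the default cutoff of 0.4 and the illustrative fade length of 0.1, a surface with roughness 0.1 uses the traced reflection only, a surface with roughness 0.5 uses the fallback only, and roughness 0.35 sits about halfway between the two.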

6.5.7.4 DiffuseIndirectComposite

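This stage combines the probe data produced by RenderLumenScreenProbeGather (DiffuseIndirect, RoughSpecularIndirect) with the reflection data produced by RenderLumenReflections (SpecularIndirect), together with the scene GBuffer and related data, to produce the final scene color:

Scene color after combining GI diffuse and specular. (Magnified 1.5x; the color range was adjusted.)

The combination process can be found in the pixel shader it uses: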

// Engine\Shaders\Private\DiffuseIndirectComposite.usf

void MainPS(
    float4 SvPosition : SV_POSITION
    , out float4 OutAddColor : SV_Target0
    , out float4 OutMultiplyColor : SV_Target1
)
{
    float2 SceneBufferUV = SvPositionToBufferUV(SvPosition);
    float2 ScreenPosition = SvPositionToScreenPosition(SvPosition).xy;

    // Sample the scene GBuffer.
    FGBufferData GBuffer = GetGBufferDataFromSceneTextures(SceneBufferUV);

    // Sample the AO generated dynamically each frame.
    float DynamicAmbientOcclusion = AmbientOcclusionTexture.SampleLevel(AmbientOcclusionSampler, SceneBufferUV, 0).r;

    // Compute the final AO to apply.
    float AOMask = (GBuffer.ShadingModelID != SHADINGMODELID_UNLIT);
    float FinalAmbientOcclusion = lerp(1.0f, GBuffer.GBufferAO * DynamicAmbientOcclusion, AOMask * AmbientOcclusionStaticFraction);

    float3 TranslatedWorldPosition = mul(float4(ScreenPosition * GBuffer.Depth, GBuffer.Depth, 1), View.ScreenToTranslatedWorld).xyz;

    float3 N = GBuffer.WorldNormal;
    float3 V = normalize(View.TranslatedWorldCameraOrigin - TranslatedWorldPosition);
    float NoV = saturate(dot(N, V));

    // Apply indirect diffuse.
#if DIM_APPLY_DIFFUSE_INDIRECT
    {
        float3 DiffuseIndirectLighting = 0;
        float3 RoughSpecularIndirectLighting = 0;
        float3 SpecularIndirectLighting = 0;

        #if DIM_APPLY_DIFFUSE_INDIRECT == 4
            DiffuseIndirectLighting = DiffuseIndirect_Textures_0.SampleLevel(GlobalPointClampedSampler, SceneBufferUV, 0).rgb;
            RoughSpecularIndirectLighting = DiffuseIndirect_Textures_1.SampleLevel(GlobalPointClampedSampler, SceneBufferUV, 0).rgb;
            SpecularIndirectLighting = DiffuseIndirect_Textures_2.SampleLevel(GlobalPointClampedSampler, SceneBufferUV, 0).rgb;
        #else
        {
            // Sample the denoiser output.
            FSSDKernelConfig KernelConfig = CreateKernelConfig();
                
            #if DEBUG_OUTPUT
            {
                KernelConfig.DebugPixelPosition = uint2(SvPosition.xy);
                KernelConfig.DebugEventCounter = 0;
            }
            #endif

            // Compile time.
            KernelConfig.bSampleKernelCenter = true;
            KernelConfig.BufferLayout = CONFIG_SIGNAL_INPUT_LAYOUT;
            KernelConfig.bUnroll = true;

            #if DIM_UPSCALE_DIFFUSE_INDIRECT
            {
                KernelConfig.SampleSet = SAMPLE_SET_2X2_BILINEAR;
                KernelConfig.BilateralDistanceComputation = SIGNAL_WORLD_FREQUENCY_REF_METADATA_ONLY;
                KernelConfig.WorldBluringDistanceMultiplier = 16.0;
                
                KernelConfig.BilateralSettings[0] = BILATERAL_POSITION_BASED(3);
                
                // SGPRs (Scalar General Purpose Registers)
                KernelConfig.BufferSizeAndInvSize = View.BufferSizeAndInvSize * float4(0.5, 0.5, 2.0, 2.0);
                KernelConfig.BufferBilinearUVMinMax = View.BufferBilinearUVMinMax;
            }
            #else
            {
                KernelConfig.SampleSet = SAMPLE_SET_1X1;
                KernelConfig.bNormalizeSample = true;
                
                // SGPRs
                KernelConfig.BufferSizeAndInvSize = View.BufferSizeAndInvSize;
                KernelConfig.BufferBilinearUVMinMax = View.BufferBilinearUVMinMax;
            }
            #endif

            // VGPRs (Vector General Purpose Registers)
            KernelConfig.BufferUV = SceneBufferUV; 
            {
                KernelConfig.CompressedRefSceneMetadata = GBufferDataToCompressedSceneMetadata(GBuffer);
                KernelConfig.RefBufferUV = SceneBufferUV;
                KernelConfig.RefSceneMetadataLayout = METADATA_BUFFER_LAYOUT_DISABLED;
            }
            KernelConfig.HammersleySeed = Rand3DPCG16(int3(SvPosition.xy, View.StateFrameIndexMod8)).xy;
                
            FSSDSignalAccumulatorArray UncompressedAccumulators = CreateSignalAccumulatorArray();
            FSSDCompressedSignalAccumulatorArray CompressedAccumulators = CompressAccumulatorArray(
                UncompressedAccumulators, CONFIG_ACCUMULATOR_VGPR_COMPRESSION);

            // Accumulate the convolution kernel.
            AccumulateKernel(
                KernelConfig,
                DiffuseIndirect_Textures_0,
                DiffuseIndirect_Textures_1,
                DiffuseIndirect_Textures_2,
                DiffuseIndirect_Textures_3,
                /* inout */ UncompressedAccumulators,
                /* inout */ CompressedAccumulators);

            // Fetch the accumulated sample.
            FSSDSignalSample Sample;
            #if DIM_UPSCALE_DIFFUSE_INDIRECT
                Sample = NormalizeToOneSample(UncompressedAccumulators.Array[0].Moment1);
            #else
                Sample = UncompressedAccumulators.Array[0].Moment1;
            #endif
            
            // When DIM_APPLY_DIFFUSE_INDIRECT is 1 or 3, only diffuse indirect lighting is present.
            #if DIM_APPLY_DIFFUSE_INDIRECT == 1 || DIM_APPLY_DIFFUSE_INDIRECT == 3
            {
                DiffuseIndirectLighting = Sample.SceneColor.rgb;
            }
            // When DIM_APPLY_DIFFUSE_INDIRECT is 2, both diffuse and specular indirect lighting are present.
            #elif DIM_APPLY_DIFFUSE_INDIRECT == 2
            {
                DiffuseIndirectLighting = UncompressedAccumulators.Array[0].Moment1.ColorArray[0];
                SpecularIndirectLighting = UncompressedAccumulators.Array[0].Moment1.ColorArray[1];
            }
            #else
                #error Unimplemented
            #endif
        }
        #endif

        float3 DiffuseColor = bVisualizeDiffuseIndirect ? float3(.18f, .18f, .18f) : GBuffer.DiffuseColor;
        float3 SpecularColor = GBuffer.SpecularColor;

        #if DIM_APPLY_DIFFUSE_INDIRECT == 4
            RemapClearCoatDiffuseAndSpecularColor(GBuffer, NoV, DiffuseColor, SpecularColor);
        #endif

        #if DIM_APPLY_DIFFUSE_INDIRECT == 2 || DIM_APPLY_DIFFUSE_INDIRECT == 4
            float DiffuseIndirectAO = 1;
        #else
            float DiffuseIndirectAO = lerp(1, FinalAmbientOcclusion, ApplyAOToDynamicDiffuseIndirect);
        #endif

        FDirectLighting IndirectLighting;
        if (GBuffer.ShadingModelID == SHADINGMODELID_HAIR)
        {
            IndirectLighting.Diffuse = DiffuseIndirectLighting * GBuffer.BaseColor;
            IndirectLighting.Specular = 0;
        }
        else
        {
            IndirectLighting.Diffuse = DiffuseIndirectLighting * DiffuseColor * DiffuseIndirectAO;
            IndirectLighting.Transmission = 0;

            #if DIM_APPLY_DIFFUSE_INDIRECT == 4
                IndirectLighting.Specular = CombineRoughSpecular(GBuffer, NoV, SpecularIndirectLighting, RoughSpecularIndirectLighting, SpecularColor);
            #else
                IndirectLighting.Specular = SpecularIndirectLighting * EnvBRDF(SpecularColor, GBuffer.Roughness, NoV);
            #endif
        }

        const bool bNeedsSeparateSubsurfaceLightAccumulation = UseSubsurfaceProfile(GBuffer.ShadingModelID);

        if (bNeedsSeparateSubsurfaceLightAccumulation &&
            View.bSubsurfacePostprocessEnabled > 0 && View.bCheckerboardSubsurfaceProfileRendering > 0)
        {
            bool bChecker = CheckerFromSceneColorUV(SceneBufferUV);

            // Adjust for checkerboard. only apply non-diffuse lighting (including emissive) 
            // to the specular component, otherwise lighting is applied twice
            IndirectLighting.Specular *= !bChecker;
        }

        // Accumulate the lighting result.
        FLightAccumulator LightAccumulator = (FLightAccumulator)0;
        LightAccumulator_Add(
            LightAccumulator,
            IndirectLighting.Diffuse + IndirectLighting.Specular,
            IndirectLighting.Diffuse,
            1.0f,
            bNeedsSeparateSubsurfaceLightAccumulation);
        // Fetch the lighting result.
        OutAddColor = LightAccumulator_GetResult(LightAccumulator);
    }
    #else
    {
        OutAddColor = 0;
    }
    #endif

    OutMultiplyColor = FinalAmbientOcclusion;
}

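6.5.8 Lumen Summary

Lumen involves many intricate steps, but they boil down to the following:

1. Build the MeshCards and LumenCards and keep them updated.

2. Using the Card information of the Lumen scene, trace and update the corresponding texels.

3. In the diffuse and specular reflection stages, trace and compute the lighting of screen-space surfaces using several methods.

4. Combine the diffuse and specular indirect lighting from the previous steps to obtain the final scene color with indirect lighting applied.

In addition, tracing uses several methods that blend into one another by weight (see the figure below).

Hybrid tracing diagram. Red indicates screen tracing, green indicates mesh distance field tracing, and blue indicates Voxel Lighting tracing. Color gradients mark the transitions between trace types.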
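The weighted transition between trace types can be illustrated with a minimal C++ sketch. It is not Lumen's actual code: the `Radiance` type, the function names, and the linear ramp are assumptions made for illustration, and the engine may use a different falloff.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical RGB radiance value used only for this illustration.
struct Radiance
{
    float R, G, B;
};

// Returns 0 at or before FadeStart, 1 at or after FadeEnd,
// and a linear ramp in between (the ramp shape is an assumption).
inline float TransitionWeight(float Distance, float FadeStart, float FadeEnd)
{
    assert(FadeEnd > FadeStart);
    const float T = (Distance - FadeStart) / (FadeEnd - FadeStart);
    return std::min(std::max(T, 0.0f), 1.0f);
}

// Linearly cross-fades the result of a nearer trace type into that of a farther one.
inline Radiance BlendTraces(const Radiance& Near, const Radiance& Far, float Weight)
{
    return {
        Near.R + (Far.R - Near.R) * Weight,
        Near.G + (Far.G - Near.G) * Weight,
        Near.B + (Far.B - Near.B) * Weight};
}
```

With a transition band of 2 to 4 units, a hit at distance 3 would take half of its lighting from each trace type, which corresponds to the color gradients in the diagram.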

Setting DEBUG_VISUALIZE_TRACE_TYPES to 1 and turning off ShowFlag.DirectLighting from the console enables the trace-weight visualization mode:

// Engine\Shaders\Private\Lumen\LumenScreenProbeTracing.usf

#define DEBUG_VISUALIZE_TRACE_TYPES 1 // Enable trace-weight visualization (default is 0).

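Overall, Lumen combines several tracing techniques (SSGI, SDF tracing with Mesh SDF and Global SDF, Lumen Cards, Voxel Cone tracing) and generates many kinds of data along the way (adaptive Screen Space Probes, Irradiance Probes, Surface Cache, Prefilter Radiance, Voxel Lighting, RSM, Virtual Texture, Clipmap). From these it computes the diffuse and specular indirect lighting and finally blends them by weight into the scene color.

Lumen's diffuse GI supports both software and hardware paths. With default parameters, the trace types used by the software path are as follows: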

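Trace type | Range | Description
Screen Trace | Whole scene | Equivalent to SSGI; whenever an intersection is found, its bounce information takes priority.
Voxel Lighting Trace | Within 200 m of the camera | Cone-based ray tracing that samples MIP levels to quickly obtain information at different hit distances.
Detail MeshCard Trace | 2 to 40 m | When sampling MeshCard lighting, occlusion is estimated probabilistically, in a manner similar to VSM.
Distant MeshCard Trace | 200 to 1000 m | Traces the pre-generated global distance field and no longer uses occlusion estimation.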
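To make the ranges in the table easier to reason about, here is a small C++ lookup that lists which trace types cover a given distance. It only restates the table: the `TraceRange` struct and `CandidateTraces` function are hypothetical, every range is treated as a plain distance in meters with inclusive bounds (a simplification), and nothing here reflects how the engine actually selects a tracer.

```cpp
#include <limits>
#include <string>
#include <vector>

// One row of the table: a trace type and the distance interval (in meters) it covers.
struct TraceRange
{
    const char* Name;
    float MinMeters;
    float MaxMeters;
};

// Returns the trace types from the table whose range contains the given distance.
// Treating both bounds as inclusive is an assumption of this sketch.
inline std::vector<std::string> CandidateTraces(float DistanceMeters)
{
    static const TraceRange Ranges[] = {
        {"Screen Trace", 0.0f, std::numeric_limits<float>::infinity()},
        {"Voxel Lighting Trace", 0.0f, 200.0f},
        {"Detail MeshCard Trace", 2.0f, 40.0f},
        {"Distant MeshCard Trace", 200.0f, 1000.0f},
    };

    std::vector<std::string> Result;
    for (const TraceRange& Range : Ranges)
    {
        if (DistanceMeters >= Range.MinMeters && DistanceMeters <= Range.MaxMeters)
        {
            Result.push_back(Range.Name);
        }
    }
    return Result;
}
```

For example, a distance of 10 m falls in the screen, voxel lighting, and detail MeshCard ranges, while 500 m is covered only by the screen and distant MeshCard traces.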

Lumen's specular GI also supports software and hardware paths; the software path combines SSR with SDF tracing (Mesh SDF and Global SDF).

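6.6 Other Rendering Techniques

6.6.1 Temporal Super Resolution

Temporal Super Resolution (TSR) is a new-generation temporal anti-aliasing algorithm that replaces the traditional (UE4) TAA. It produces high-resolution output from low-resolution input with quality approaching native resolution, shows less ghosting and flickering on high-frequency detail, and is optimized for platforms such as the PS5, but it requires a graphics platform with SM5.0 or above.

TSR's techniques resemble NVIDIA's DLSS and AMD's FidelityFX Super Resolution (FSR); the difference is that DLSS accelerates its deep learning on Tensor Cores, whereas TSR does not depend on Tensor Cores. In other words, TSR does not need an RTX graphics card and can run on devices from other GPU vendors. Because TSR can produce a high-resolution image from a low-resolution render, it not only improves anti-aliasing but can also raise rendering performance and reduce power consumption.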
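The performance argument is easy to quantify. The following C++ sketch computes the internal render resolution for a given screen percentage and the fraction of output pixels that are actually shaded. The `Resolution` struct, the function names, and the round-to-nearest rule are assumptions for illustration; the engine's own rounding may differ.

```cpp
#include <cassert>

struct Resolution
{
    int Width;
    int Height;
};

// Scales each axis of the output resolution by ScreenPercentage (1..100),
// rounding to the nearest pixel (the rounding rule is an assumption).
inline Resolution InputResolution(Resolution Output, int ScreenPercentage)
{
    assert(ScreenPercentage > 0 && ScreenPercentage <= 100);
    return {
        (Output.Width * ScreenPercentage + 50) / 100,
        (Output.Height * ScreenPercentage + 50) / 100};
}

// Fraction of the output pixel count that is rendered before upscaling.
inline double ShadedPixelFraction(Resolution Input, Resolution Output)
{
    return (double(Input.Width) * Input.Height) /
           (double(Output.Width) * Output.Height);
}
```

At a 3840x2160 output and a 50% screen percentage, the scene is rendered at 1920x1080, so only a quarter of the output pixels go through the full shading pipeline before the upscaler reconstructs the rest.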

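Unlike UE4, in UE5 post-processing takes the TSR path whichever anti-aliasing method is selected, as long as the configuration does not explicitly disable TemporalAA (and subject to the CVarTAAAlgorithm and platform-support checks in the code below). The call stack is as follows: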

// Engine\Source\Runtime\Renderer\Private\PostProcess\PostProcessing.cpp

void AddPostProcessingPasses(FRDGBuilder& GraphBuilder, const FViewInfo& View, ...)
{
    (......)
    
    // TAA anti-aliasing.
    EMainTAAPassConfig TAAConfig = ITemporalUpscaler::GetMainTAAPassConfig(View);
    // The TAA configuration is not disabled.
    if (TAAConfig != EMainTAAPassConfig::Disabled)
    {
        (......)
        
        // Calls FDefaultTemporalUpscaler::AddPasses; see the analysis below.
        UpscalerToUse->AddPasses(
            GraphBuilder,
            View,
            UpscalerPassInputs,
            &SceneColor.Texture,
            &SecondaryViewRect,
            &DownsampledSceneColor.Texture,
            &DownsampledSceneColor.ViewRect);
    }
    
    (......)
}

// Engine\Source\Runtime\Renderer\Private\PostProcess\TemporalAA.cpp

void FDefaultTemporalUpscaler::AddPasses(FRDGBuilder& GraphBuilder, const FViewInfo& View,...) const final
{
    // If enabled and Gen5 TAA is supported on this platform, take the TSR path.
    if (CVarTAAAlgorithm.GetValueOnRenderThread() && DoesPlatformSupportGen5TAA(View.GetShaderPlatform()))
    {
        *OutSceneColorHalfResTexture = nullptr;

        return AddTemporalSuperResolutionPasses(
            GraphBuilder,
            View,
            PassInputs,
            OutSceneColorTexture,
            OutSceneColorViewRect);
    }
    (......)
}

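This leads into AddTemporalSuperResolutionPasses. Below is the TSR rendering process captured with RenderDoc:

As the capture shows, TSR has many more passes than UE4's TAA. The main stages are: clearing the previous frame's textures, dilating the velocity buffer, rejecting invalid velocities, filtering frequencies, comparing against history, post-filtering the reprojection, dilating the reprojection, and updating the history.

The most important of these is the update-history stage. From inputs such as the scene color, depth, dilated velocity, parallax factor, and previous-frame history (dilated reprojection, reprojection, high frequency, low frequency, metadata, subpixel information), it produces the final anti-aliased scene color and the current frame's history data.

Left: input scene color; right: scene color output after TSR.

History data output by TSR: low frequency, high frequency, metadata, and subpixel information.

We now go straight to the Compute Shader used by the update-history stage: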

// /Engine/Private/TemporalAA/TAAUpdateHistory.usf

[numthreads(TILE_SIZE, TILE_SIZE, 1)]
void MainCS(
    uint2 GroupId : SV_GroupID,
    uint GroupThreadIndex : SV_GroupIndex)
{
    uint GroupWaveIndex = GetGroupWaveIndex(GroupThreadIndex, /* GroupSize = */ TILE_SIZE * TILE_SIZE);

    float4 Debug = 0.0;

    // History pixel position.
    taa_short2 HistoryPixelPos = (
        taa_short2(GroupId) * taa_short2(TILE_SIZE, TILE_SIZE) +
        Map8x8Tile2x2Lane(GroupThreadIndex));

    float2 ViewportUV = (float2(HistoryPixelPos) + 0.5f) * HistoryInfo_ViewportSizeInverse;
    float2 ScreenPos = ViewportUVToScreenPos(ViewportUV);
    
    // Pixel coordinate, in the input viewport, of the center of output pixel O.
    float2 PPCo = ViewportUV * InputInfo_ViewportSize + InputJitter;

    // Pixel coordinate of the center of the nearest input pixel K.
    float2 PPCk = floor(PPCo) + 0.5;
    
    taa_short2 InputPixelPos = ClampPixelOffset(
        taa_short2(InputPixelPosMin) + taa_short2(PPCo),
        InputPixelPosMin, InputPixelPosMax);

    // Fetch reprojection-related information.
    float2 PrevScreenPos = ScreenPos;
    taa_half ParallaxRejectionMask = taa_half(1.0);
    taa_half LowFrequencyRejection = taa_half(1.0);
    taa_half OutputPixelVelocity = taa_half(0.0);
    #if 1
    {
        float2 EncodedVelocity = DilatedVelocityTexture[InputPixelPos];
        ParallaxRejectionMask = ParallaxRejectionMaskTexture[InputPixelPos];

        float2 ScreenVelocity = DecodeVelocityFromTexture(float4(EncodedVelocity, 0.0, 0.0)).xy;

        PrevScreenPos = ScreenPos - ScreenVelocity;
        OutputPixelVelocity = taa_half(length(ScreenVelocity * HistoryInfo_ViewportSize));

        taa_ushort2 RejectionPixelPos = (taa_ushort2(InputPixelPos) - taa_short2(InputPixelPosMin)) / 2;
        LowFrequencyRejection = HistoryRejectionTexture[RejectionPixelPos];
        
        #if !CONFIG_CLAMP
        {
            ParallaxRejectionMask = taa_half(1.0);
            LowFrequencyRejection = taa_half(1.0);
        }
        #endif
    }
    #endif

    // Determine whether the pixel is a responsive-AA pixel.
    bool bIsResponsiveAAPixel = false;
    #if CONFIG_RESPONSIVE_STENCIL
    {
        const uint kResponsiveStencilMask = 1 << 3;
            
        uint SceneStencilRef = InputSceneStencilTexture.Load(int3(InputPixelPos, 0)) STENCIL_COMPONENT_SWIZZLE;

        bIsResponsiveAAPixel = (SceneStencilRef & kResponsiveStencilMask) != 0;
    }
    #endif
    
    // Check whether HistoryBufferUV falls outside the viewport.
    bool bOffScreen = IsOffScreen(bCameraCut, PrevScreenPos, ParallaxRejectionMask);
    
    taa_half TotalRejection = bOffScreen ? 0.0 : saturate(LowFrequencyRejection * 4.0);


    // Filter the input scene color at the predicted frequency.
    taa_half3 FilteredInputColor;
    taa_half3 InputMinColor;
    taa_half3 InputMaxColor;
    taa_half InputPixelAlignement;
    taa_half ClosestInputLuma4;
    
    ISOLATE
    {
        // Vector from pixel K to pixel O.
        taa_half2 dKO = taa_half2(PPCo - PPCk);

        FilteredInputColor = taa_half(0.0);

        taa_half FilteredInputColorWeight = taa_half(0.0);
        
        #if 0 // shader compiler bug :'(
            taa_half InputToHistoryFactor = taa_half(HistoryInfo_ViewportSize.x * InputInfo_ViewportSizeInverse.x);
            taa_half FinalInputToHistoryFactor = bOffScreen ? taa_half(1.0) : InputToHistoryFactor;
        #else
            float InputToHistoryFactor = float(HistoryInfo_ViewportSize.x * InputInfo_ViewportSizeInverse.x);
            float FinalInputToHistoryFactor = lerp(1.0, InputToHistoryFactor, TotalRejection);
        #endif

        InputMinColor = taa_half(INFINITE_FLOAT);
        InputMaxColor = taa_half(-INFINITE_FLOAT);

        // Generate sample coordinates according to CONFIG_SAMPLES and sample the input scene color.
        UNROLL_N(CONFIG_SAMPLES)
        for (uint SampleId = 0; SampleId < CONFIG_SAMPLES; SampleId++)
        {
            taa_short2 SampleInputPixelPos;
            taa_half2 PixelOffset;
            
            #if CONFIG_SAMPLES == 9
            {
                taa_short2 iPixelOffset = taa_short2(kOffsets3x3[kSquareIndexes3x3[SampleId]]);
                PixelOffset = taa_half2(iPixelOffset);
                
                SampleInputPixelPos = AddAndClampPixelOffset(
                    InputPixelPos,
                    iPixelOffset, iPixelOffset,
                    InputPixelPosMin, InputPixelPosMax);
            }
            #elif CONFIG_SAMPLES == 5 || CONFIG_SAMPLES == 6
            {
                if (SampleId == 5)
                {
                    taa_short2 iPixelOffset;
                    #if CONFIG_COMPILE_FP16
                        iPixelOffset = int16_t2(1, 1) - int16_t2((asuint16(dKO) & uint16_t(0x8000)) >> uint16_t(14));
                        PixelOffset = asfloat16(asuint16(half(1.0)).xx | (asuint16(dKO) & uint16_t(0x8000)));
                    #else
                        iPixelOffset = SignFastInt(dKO);
                        PixelOffset = asfloat(asuint(1.0).xx | (asuint(dKO) & uint(0x80000000)));
                    #endif
                        
                    SampleInputPixelPos = ClampPixelOffset(InputPixelPos, InputPixelPosMin, InputPixelPosMax);
                }
                else
                {
                    taa_short2 iPixelOffset = taa_short2(kOffsets3x3[kPlusIndexes3x3[SampleId]]);
                    PixelOffset = taa_half2(iPixelOffset);
                    
                    SampleInputPixelPos = AddAndClampPixelOffset(
                        InputPixelPos,
                        iPixelOffset, iPixelOffset,
                        InputPixelPosMin, InputPixelPosMax);
                }
            }
            #else
                #error Unknown sample count
            #endif

            taa_half3 InputColor = InputSceneColorTexture[SampleInputPixelPos];

            taa_half2 dPP = PixelOffset - dKO;
            taa_half SampleSpatialWeight = ComputeSampleWeigth(FinalInputToHistoryFactor, dPP, /* MinimalContribution = */ float(0.005));

            taa_half ToneWeight = HdrWeight4(InputColor);

            FilteredInputColor       += (SampleSpatialWeight * ToneWeight) * InputColor;
            FilteredInputColorWeight += (SampleSpatialWeight * ToneWeight);

            if (SampleId == 0)
            {
                ClosestInputLuma4 = Luma4(InputColor);
                InputMinColor = TransformColorForClampingBox(InputColor);
                InputMaxColor = TransformColorForClampingBox(InputColor);
            }
            else
            {
                InputMinColor = min(InputMinColor, TransformColorForClampingBox(InputColor));
                InputMaxColor = max(InputMaxColor, TransformColorForClampingBox(InputColor));
            }
        }
        
        FilteredInputColor *= rcp(FilteredInputColorWeight);

        InputPixelAlignement = ComputeSampleWeigth(InputToHistoryFactor, dKO, /* MinimalContribution = */ float(0.0));
    }
        
    // Spill to LDS to free up VGPRs for sampling the history.
    #if CONFIG_MANUAL_LDS_SPILL
    ISOLATE
    {
        uint LocalGroupThreadIndex = GetGroupThreadIndex(GroupThreadIndex, GroupWaveIndex);

        SharedArray0[LocalGroupThreadIndex] = taa_half4(FilteredInputColor, LowFrequencyRejection);
        SharedArray1[LocalGroupThreadIndex] = taa_half4(InputMinColor, InputPixelAlignement);
        SharedArray2[LocalGroupThreadIndex] = taa_half4(InputMaxColor, OutputPixelVelocity);
    }
    #endif
    
    // Reproject the history.
    taa_half3 PrevHistoryMoment1;
    taa_half PrevHistoryValidity;
    
    taa_half3 PrevHistoryMommentMin;
    taa_half3 PrevHistoryMommentMax;

    taa_half3 PrevFallbackColor;
    taa_half PrevFallbackWeight;
    
    taa_subpixel_details PrevSubpixelDetails;

    ISOLATE
    {
        // Reproject the history.
        taa_half3 RawHistory0 = taa_half(0);
        taa_half3 RawHistory1 = taa_half(0);
        taa_half2 RawHistory2 = taa_half(0);

        taa_half3 RawHistory1Min = INFINITE_FLOAT;
        taa_half3 RawHistory1Max = -INFINITE_FLOAT;

        // Sample the raw history.
        {
            float2 PrevHistoryBufferUV = (PrevHistoryInfo_ScreenPosToViewportScale * PrevScreenPos + PrevHistoryInfo_ScreenPosToViewportBias) * PrevHistoryInfo_ExtentInverse;
            PrevHistoryBufferUV = clamp(PrevHistoryBufferUV, PrevHistoryInfo_UVViewportBilinearMin, PrevHistoryInfo_UVViewportBilinearMax);

            #if 1
            {
                FCatmullRomSamples Samples = GetBicubic2DCatmullRomSamples(PrevHistoryBufferUV, PrevHistoryInfo_Extent, PrevHistoryInfo_ExtentInverse);

                UNROLL
                for (uint i = 0; i < Samples.Count; i++)
                {
                    float2 SampleUV = clamp(Samples.UV[i], PrevHistoryInfo_UVViewportBilinearMin, PrevHistoryInfo_UVViewportBilinearMax);

                    taa_half3 Sample0 = PrevHistory_Textures_0.SampleLevel(GlobalBilinearClampedSampler, SampleUV, 0);
                    taa_half3 Sample1 = PrevHistory_Textures_1.SampleLevel(GlobalBilinearClampedSampler, SampleUV, 0);
                    taa_half2 Sample2 = PrevHistory_Textures_2.SampleLevel(GlobalBilinearClampedSampler, SampleUV, 0);

                    RawHistory1Min = min(RawHistory1Min, Sample1 * SafeRcp(Sample2.g));
                    RawHistory1Max = max(RawHistory1Max, Sample1 * SafeRcp(Sample2.g));

                    RawHistory0 += Sample0 * taa_half(Samples.Weight[i]);
                    RawHistory1 += Sample1 * taa_half(Samples.Weight[i]);
                    RawHistory2 += Sample2 * taa_half(Samples.Weight[i]);
                }
                RawHistory0 *= taa_half(Samples.FinalMultiplier);
                RawHistory1 *= taa_half(Samples.FinalMultiplier);
                RawHistory2 *= taa_half(Samples.FinalMultiplier);
            }
            #else
            {
                RawHistory0 = PrevHistory_Textures_0.SampleLevel(GlobalBilinearClampedSampler, PrevHistoryBufferUV, 0);
                RawHistory1 = PrevHistory_Textures_1.SampleLevel(GlobalBilinearClampedSampler, PrevHistoryBufferUV, 0);
                RawHistory2 = PrevHistory_Textures_2.SampleLevel(GlobalBilinearClampedSampler, PrevHistoryBufferUV, 0);
            }
            #endif
            
            FSubpixelNeighborhood SubpixelNeighborhood = GatherPrevSubpixelNeighborhood(PrevHistory_Textures_3, PrevHistoryBufferUV);
            {
                PrevSubpixelDetails = 0;
                UNROLL_N(SUB_PIXEL_COUNT)
                for (uint SubpixelId = 0; SubpixelId < SUB_PIXEL_COUNT; SubpixelId++)
                {
                    taa_subpixel_payload SubpixelPayload = GetSubpixelPayload(SubpixelNeighborhood, SubpixelId);
                    PrevSubpixelDetails |= SubpixelPayload << (SUB_PIXEL_BIT_COUNT * SubpixelId);
                }
            }

            RawHistory0 = -min(-RawHistory0, taa_half(0.0));
            RawHistory1 = -min(-RawHistory1, taa_half(0.0));
            RawHistory2 = -min(-RawHistory2, taa_half(0.0));
        }
        
        // Unpack the history.
        {
            PrevFallbackColor = RawHistory0;
            PrevFallbackWeight = RawHistory2.r;
            
            PrevHistoryMommentMin = RawHistory1Min;
            PrevHistoryMommentMax = RawHistory1Max;

            PrevHistoryMoment1 = RawHistory1;
            PrevHistoryValidity = RawHistory2.g;
        }

        // Correct the history (pre-exposure).
        {
            PrevHistoryMommentMin *= taa_half(HistoryPreExposureCorrection);
            PrevHistoryMommentMax *= taa_half(HistoryPreExposureCorrection);
            PrevHistoryMoment1 *= taa_half(HistoryPreExposureCorrection);
            PrevFallbackColor *= taa_half(HistoryPreExposureCorrection);
        }
    }
    
    // Read the data back from LDS.
    #if CONFIG_MANUAL_LDS_SPILL
    ISOLATE
    {
        uint LocalGroupThreadIndex = GetGroupThreadIndex(GroupThreadIndex, GroupWaveIndex);

        taa_half4 RawLDS0 = SharedArray0[LocalGroupThreadIndex];
        taa_half4 RawLDS1 = SharedArray1[LocalGroupThreadIndex];
        taa_half4 RawLDS2 = SharedArray2[LocalGroupThreadIndex];

        FilteredInputColor = RawLDS0.rgb;
        InputMinColor = RawLDS1.rgb;
        InputMaxColor = RawLDS2.rgb;
        
        LowFrequencyRejection = RawLDS0.a;
        InputPixelAlignement = RawLDS1.a;
        OutputPixelVelocity = RawLDS2.a;
    }
    #endif

    // Reject high-frequency detail if the current low frequency drifts from the history's low frequency.
    #if CONFIG_LOW_FREQUENCY_DRIFT_REJECTION
    {
        taa_half3 PrevHighFrequencyYCoCg = TransformColorForClampingBox(PrevHistoryMoment1 * SafeRcp(PrevHistoryValidity));
        taa_half3 PrevYCoCg = TransformColorForClampingBox(PrevFallbackColor);
        taa_half3 ClampedPrevYCoCg = TransformColorForClampingBox(clamp(PrevFallbackColor, PrevHistoryMommentMin, PrevHistoryMommentMax));

        taa_half HighFrequencyRejection = MeasureRejectionFactor(
            PrevYCoCg, ClampedPrevYCoCg,
            PrevHighFrequencyYCoCg, InputMinColor, InputMaxColor);
        
        PrevHistoryMoment1 *= HighFrequencyRejection;
        PrevHistoryValidity *= HighFrequencyRejection;
    }
    #endif

    // Feed the current frame's input into the next frame's predictor.
    const taa_half Histeresis = rcp(taa_half(MAX_SAMPLE_COUNT));
    const taa_half PredictionOnlyValidity = Histeresis * taa_half(2.0);
    
    // Clamp the fallback history.
    taa_half LumaMin;
    taa_half LumaMax;
    taa_half3 ClampedFallbackColor;
    taa_half FallbackRejection;
    {
        LumaMin = InputMinColor.x;
        LumaMax = InputMaxColor.x;

        taa_half3 PrevYCoCg = TransformColorForClampingBox(PrevFallbackColor);
        taa_half3 ClampedPrevYCoCg = clamp(PrevYCoCg, InputMinColor, InputMaxColor);
        taa_half3 InputCenterYCoCg = TransformColorForClampingBox(FilteredInputColor);

        ClampedFallbackColor = YCoCgToRGB(ClampedPrevYCoCg);
        
        FallbackRejection = MeasureRejectionFactor(
            PrevYCoCg, ClampedPrevYCoCg,
            InputCenterYCoCg, InputMinColor, InputMaxColor);

        #if !CONFIG_CLAMP
        {
            ClampedFallbackColor = PrevFallbackColor;
            FallbackRejection = taa_half(1.0);
        }
        #endif
    }

    taa_half3 FinalHistoryMoment1;
    taa_half FinalHistoryValidity;
    {
        // Compute how much of the history must be rejected, based on its completeness.
        taa_half PrevHistoryRejectionWeight = LowFrequencyRejection;
            
        FLATTEN
        if (bOffScreen)
        {
            PrevHistoryRejectionWeight = taa_half(0.0);
        }

        taa_half DesiredCurrentContribution = max(Histeresis * InputPixelAlignement, taa_half(0.0));

        // Decide whether the prediction-based rejection is confident enough.
        taa_half RejectionConfidentEnough = taa_half(1); // saturate(RejectionValidity * MAX_SAMPLE_COUNT - 3.0);

        // Compute the validity after rejection.
        taa_half RejectedValidity = (
            min(PrevHistoryValidity, PredictionOnlyValidity - DesiredCurrentContribution) +
            max(PrevHistoryValidity - PredictionOnlyValidity + DesiredCurrentContribution, taa_half(0.0)) * PrevHistoryRejectionWeight);

        RejectedValidity = PrevHistoryValidity * PrevHistoryRejectionWeight;

        // Compute the maximum output validity.
        taa_half OutputValidity = (
            clamp(RejectedValidity + DesiredCurrentContribution, taa_half(0.0), PredictionOnlyValidity) +
            clamp(RejectedValidity + DesiredCurrentContribution * PrevHistoryRejectionWeight * RejectionConfidentEnough - PredictionOnlyValidity, 0.0, 1.0 - PredictionOnlyValidity));

        FLATTEN
        if (bIsResponsiveAAPixel)
        {
            OutputValidity = taa_half(0.0);
        }
        
        taa_half InvPrevHistoryValidity = SafeRcp(PrevHistoryValidity);

        taa_half PrevMomentWeight = max(OutputValidity - DesiredCurrentContribution, taa_half(0.0));
        taa_half CurrentMomentWeight = min(DesiredCurrentContribution, OutputValidity);
        
        {
            taa_half PrevHistoryToneWeight = HdrWeightY(Luma4(PrevHistoryMoment1) * InvPrevHistoryValidity);
            taa_half FilteredInputToneWeight = HdrWeight4(FilteredInputColor);
            
            taa_half BlendPrevHistory = PrevMomentWeight * PrevHistoryToneWeight;
            taa_half BlendFilteredInput = CurrentMomentWeight * FilteredInputToneWeight;

            taa_half CommonWeight = OutputValidity * SafeRcp(BlendPrevHistory + BlendFilteredInput);

            FinalHistoryMoment1 = (
                PrevHistoryMoment1 * (CommonWeight * BlendPrevHistory * InvPrevHistoryValidity) +
                FilteredInputColor * (CommonWeight * BlendFilteredInput));
        }

        // Adjust validity for its 8-bit encoding quantization to avoid numerical drift.
        taa_half OutputInvValidity = SafeRcp(OutputValidity);
        FinalHistoryValidity = ceil(taa_half(255.0) * OutputValidity) * rcp(taa_half(255.0));
        FinalHistoryMoment1 *= FinalHistoryValidity * OutputInvValidity;
    }

    // Compute the fallback history.
    taa_half3 FinalFallbackColor;
    taa_half FinalFallbackWeight;
    {
        const taa_half TargetHesteresisCurrentFrameWeight = rcp(taa_half(MAX_FALLBACK_SAMPLE_COUNT));

        taa_half LumaHistory = Luma4(PrevFallbackColor);
        taa_half LumaFiltered = Luma4(FilteredInputColor);

        {
            taa_half OutputBlend = ComputeFallbackContribution(FinalHistoryValidity);
        }

        taa_half BlendFinal;
        #if 1
        {
            taa_half CurrentFrameSampleCount = max(InputPixelAlignement, taa_half(0.005));
            
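            // Using a single sample count recovers very quickly from history rejection, then stabilizes immediately so subpixel frequencies can be used as soon as possible.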
            taa_half PrevFallbackSampleCount;
            FLATTEN
            if (PrevFallbackWeight < taa_half(1.0))
            {
                PrevFallbackSampleCount = PrevFallbackWeight;
            }
            else
            {
                PrevFallbackSampleCount = taa_half(MAX_FALLBACK_SAMPLE_COUNT);
            }

            // Reject history based on the low frequency.
            #if 1
            {
                taa_half PrevFallbackRejectionFactor = saturate(LowFrequencyRejection * (CurrentFrameSampleCount + PrevFallbackSampleCount) / PrevFallbackSampleCount);

                PrevFallbackSampleCount *= PrevFallbackRejectionFactor;
            }
            #endif

            BlendFinal = CurrentFrameSampleCount / (CurrentFrameSampleCount + PrevFallbackSampleCount);

            // Increase the blend weight under motion.
            #if 1
            {
                BlendFinal = lerp(BlendFinal, max(taa_half(0.2), BlendFinal), saturate(OutputPixelVelocity * rcp(taa_half(40.0))));
            }
            #endif

            // Anti-flicker.
            #if 1
            {
                taa_half DistToClamp = min( abs(LumaHistory - LumaMin), abs(LumaHistory - LumaMax) ) / max3( LumaHistory, LumaFiltered, taa_half(1e-4) );
                BlendFinal *= taa_half(0.2) + taa_half(0.8) * saturate(taa_half(0.5) * DistToClamp);
            }
            #endif
            
            // Ensure at least a small contribution.
            #if 1
            {
                BlendFinal = max( BlendFinal, saturate( taa_half(0.01) * LumaHistory / abs( LumaFiltered - LumaHistory ) ) );
            }
            #endif

            // Responsive pixels take 1/4 of the new frame.
            BlendFinal = bIsResponsiveAAPixel ? taa_half(1.0/4.0) : BlendFinal;

            // Reject the history completely.
            {
                PrevFallbackSampleCount *= TotalRejection;
                BlendFinal = lerp(1.0, BlendFinal, TotalRejection);
            }

            FinalFallbackWeight = saturate(CurrentFrameSampleCount + PrevFallbackSampleCount);
            
            #if 1
                FinalFallbackWeight = saturate(floor(255.0 * (CurrentFrameSampleCount + PrevFallbackSampleCount)) * rcp(255.0));
            #endif
        }
        #endif

        {
            taa_half FilterWeight = HdrWeight4(FilteredInputColor);
            taa_half ClampedHistoryWeight = HdrWeight4(ClampedFallbackColor);

            taa_half2 Weights = WeightedLerpFactors(ClampedHistoryWeight, FilterWeight, BlendFinal);

            FinalFallbackColor = ClampedFallbackColor * Weights.x + FilteredInputColor * Weights.y;
        }
    }

    // Update the subpixel details.
    taa_subpixel_details FinalSubpixelDetails;
    {
        taa_half2 dKO = taa_half2(PPCo - PPCk);

        bool bUpdate = all(abs(dKO) < 0.5 * (InputInfo_ViewportSize.x * HistoryInfo_ViewportSizeInverse.x));

        FinalSubpixelDetails = PrevSubpixelDetails;

        taa_subpixel_payload ParallaxFactorBits = ParallaxFactorTexture[InputPixelPos] & SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK;

        {
            const uint ParallaxFactorMask = (
                (SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK << (SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET + 0 * SUB_PIXEL_BIT_COUNT)) | 
                (SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK << (SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET + 1 * SUB_PIXEL_BIT_COUNT)) | 
                (SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK << (SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET + 2 * SUB_PIXEL_BIT_COUNT)) | 
                (SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK << (SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET + 3 * SUB_PIXEL_BIT_COUNT)) | 
                0x0);
            
            // Reset the parallax factor.
            FLATTEN
            if (bOffScreen)
            {
                FinalSubpixelDetails = FinalSubpixelDetails & ~ParallaxFactorMask;
            }
        }

        FLATTEN
        if (bUpdate)
        {
            bool2 bBool = dKO < 0.0;

            uint SubpixelId = dot(uint2(bBool), uint2(1, SUB_PIXEL_GRID_SIZE));
            uint SubpixelShift = SubpixelId * SUB_PIXEL_BIT_COUNT;

            taa_subpixel_payload SubpixelPayload = (ParallaxFactorBits << SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET);

            FinalSubpixelDetails = (FinalSubpixelDetails & (~(SUB_PIXEL_BIT_MASK << SubpixelShift))) | (SubpixelPayload << SubpixelShift);
        }
    }

    // Compute the final output.
    taa_half3 FinalOutputColor;
    taa_half FinalOutputValidity;
    {
        taa_half OutputBlend = ComputeFallbackContribution(FinalHistoryValidity);

        FinalOutputValidity = lerp(taa_half(1.0), saturate(FinalHistoryValidity), OutputBlend);

        taa_half3 NormalizedFinalHistoryMoment1 = taa_half3(FinalHistoryMoment1 * float(SafeRcp(FinalHistoryValidity)));

        taa_half FallbackWeight = HdrWeight4(FinalFallbackColor);
        taa_half Moment1Weight = HdrWeight4(NormalizedFinalHistoryMoment1);

        taa_half2 Weights = WeightedLerpFactors(FallbackWeight, Moment1Weight, OutputBlend);

        #if DEBUG_FALLBACK_BLENDING
            taa_half3 FallbackColor = taa_half3(1, 0.25, 0.25);
            taa_half3 HighFrequencyColor = taa_half3(0.25, 1, 0.25);

            FinalOutputColor = FinalFallbackColor * Weights.x * FallbackColor + NormalizedFinalHistoryMoment1 * Weights.y * HighFrequencyColor;
        #elif DEBUG_LOW_FREQUENCY_REJECTION
            taa_half3 DebugColor = lerp(taa_half3(1, 0.5, 0.5), taa_half3(0.5, 1, 0.5), LowFrequencyRejection);
            
            FinalOutputColor = FinalFallbackColor * Weights.x * DebugColor + NormalizedFinalHistoryMoment1 * Weights.y * DebugColor;
        #else
            FinalOutputColor = FinalFallbackColor * Weights.x + NormalizedFinalHistoryMoment1 * Weights.y;
        #endif
    }

    ISOLATE
    {
        uint LocalGroupThreadIndex = GetGroupThreadIndex(GroupThreadIndex, GroupWaveIndex);

        taa_short2 LocalHistoryPixelPos = (
            taa_short2(GroupId) * taa_short2(TILE_SIZE, TILE_SIZE) +
            Map8x8Tile2x2Lane(LocalGroupThreadIndex));
            
        LocalHistoryPixelPos = InvalidateOutputPixelPos(LocalHistoryPixelPos, HistoryInfo_ViewportMax);

        // Output the final history.
        {
            #if CONFIG_ENABLE_STOCASTIC_QUANTIZATION
            {
                uint2 Random = Rand3DPCG16(int3(LocalHistoryPixelPos, View.StateFrameIndexMod8)).xy;
                float2 E = Hammersley16(0, 1, Random);

                FinalHistoryMoment1 = QuantizeForFloatRenderTarget(FinalHistoryMoment1, E.x, HistoryQuantizationError);
                FinalFallbackColor = QuantizeForFloatRenderTarget(FinalFallbackColor, E.x, HistoryQuantizationError);
            }
            #endif

            FinalFallbackColor = -min(-FinalFallbackColor, taa_half(0.0));
            FinalHistoryMoment1 = -min(-FinalHistoryMoment1, taa_half(0.0));
            FinalFallbackColor = min(FinalFallbackColor, taa_half(Max10BitsFloat));
            FinalHistoryMoment1 = min(FinalHistoryMoment1, taa_half(Max10BitsFloat));
            
            HistoryOutput_Textures_0[LocalHistoryPixelPos] = FinalFallbackColor;
            HistoryOutput_Textures_1[LocalHistoryPixelPos] = FinalHistoryMoment1;
            HistoryOutput_Textures_2[LocalHistoryPixelPos] = taa_half2(FinalFallbackWeight, FinalHistoryValidity);
            HistoryOutput_Textures_3[LocalHistoryPixelPos] = FinalSubpixelDetails;

            #if DEBUG_OUTPUT
            {
                DebugOutput[LocalHistoryPixelPos] = Debug;
            }
            #endif
        }

        // Output the final scene color.
        {
            taa_half3 OutputColor = FinalOutputColor;
                
            OutputColor = -min(-OutputColor, taa_half(0.0));
            OutputColor = min(OutputColor, taa_half(Max10BitsFloat));

            SceneColorOutput[LocalHistoryPixelPos] = OutputColor;
        }
    }
}

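As shown above, compared with traditional TAA, TSR adds a great deal of data, including current and historical high-frequency, low-frequency, parallax-factor, and reprojection data. Based on this information it successively rejects or restores history, derives the blend weights for the current frame, and finally computes the anti-aliased scene color and the history data.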
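The core of the fallback blend weight in the listing is a sample-count ratio, which the following C++ sketch isolates. It is a simplified restatement, not the shader itself: the low-frequency rejection, motion, anti-flicker, and responsive-AA adjustments are omitted, the 8-bit quantization of the stored weight is left out, and `MaxFallbackSampleCount` is a parameter standing in for MAX_FALLBACK_SAMPLE_COUNT, whose value the listing does not show.

```cpp
#include <algorithm>
#include <cassert>

// Result of one accumulation step (hypothetical type for this sketch).
struct FallbackBlend
{
    float BlendFinal; // Share of the current frame in the blended color.
    float NewWeight;  // Weight stored in the history for the next frame.
};

inline FallbackBlend AccumulateFallback(
    float CurrentSampleCount, float PrevFallbackWeight, float MaxFallbackSampleCount)
{
    assert(CurrentSampleCount > 0.0f);

    // A stored weight below 1 is used directly as the sample count;
    // a saturated weight is treated as a full history.
    const float PrevSampleCount =
        PrevFallbackWeight < 1.0f ? PrevFallbackWeight : MaxFallbackSampleCount;

    const float BlendFinal =
        CurrentSampleCount / (CurrentSampleCount + PrevSampleCount);
    const float NewWeight =
        std::min(std::max(CurrentSampleCount + PrevSampleCount, 0.0f), 1.0f);

    return {BlendFinal, NewWeight};
}
```

With an empty history the current frame is taken in full, and once the stored weight saturates, the current frame contributes only 1 / (1 + MaxFallbackSampleCount) of the blend, which is what lets the history converge quickly and then stay stable.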

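The code above is only the final, update-history stage of TSR. Many earlier steps generate the data this stage needs; they are not analyzed here and are left for readers to study.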
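One detail of the listing worth isolating is the tone weighting applied while filtering the input color: each sample is scaled by both a spatial weight and an HDR weight before normalization, so very bright samples cannot dominate the average. The C++ sketch below illustrates the idea. The definitions of `Luma4` and `HdrWeight4` are not shown in the listing; the forms used here are assumptions modeled on the helpers in UE4's TAA shaders and may not match the TSR versions exactly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Color
{
    float R, G, B;
};

// Assumed definition: a cheap luma approximation that weights green twice.
inline float Luma4(const Color& C) { return C.G * 2.0f + (C.R + C.B); }

// Assumed definition: the weight shrinks as the sample gets brighter.
inline float HdrWeight4(const Color& C) { return 1.0f / (Luma4(C) + 4.0f); }

// Weighted average in which each sample counts as SpatialWeight * HdrWeight4.
// Assumes non-negative colors and positive spatial weights.
inline Color ToneWeightedAverage(
    const std::vector<Color>& Samples, const std::vector<float>& SpatialWeights)
{
    assert(!Samples.empty() && Samples.size() == SpatialWeights.size());

    Color Sum = {0.0f, 0.0f, 0.0f};
    float WeightSum = 0.0f;
    for (std::size_t i = 0; i < Samples.size(); ++i)
    {
        const float W = SpatialWeights[i] * HdrWeight4(Samples[i]);
        Sum.R += W * Samples[i].R;
        Sum.G += W * Samples[i].G;
        Sum.B += W * Samples[i].B;
        WeightSum += W;
    }

    const float InvWeightSum = 1.0f / WeightSum;
    return {Sum.R * InvWeightSum, Sum.G * InvWeightSum, Sum.B * InvWeightSum};
}
```

Averaging a gray sample of intensity 1 with a highlight of intensity 100 this way gives a result close to 3 rather than the plain mean of 50.5, which is why this kind of weighting reduces flickering from isolated bright pixels.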

6.6.2 Strata

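From a brief look at the Strata code, Strata appears similar to UE4's Material Layer, but it seems to be used mainly for material projection, blending, and lighting of Nanite geometry. Strata has its own materials, material nodes, shading models, visualization modes, and shader modules. However, in the current EA version it is still experimental and has many restrictions. The main files involving Strata are: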

  • Strata.h/cpp
  • StrataMaterial.h/cpp
  • StrataDefinitions.h
  • MaterialExpressionStrata.h
  • Strata.ush
  • BasePassPixelShader.usf
  • DeferredLightPixelShaders.usf
  • Code related to the scene rendering pipeline and lighting.

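Interested readers can study the relevant source code on their own.

6.7 Summary

This article mainly covered UE5's editor features, Nanite, Lumen, and related rendering techniques. Because UE5 changes so much, it cannot cover every technical point; beyond what is discussed here there is much more, which interested readers will need to explore in the UE source code.

In the UE5 EA stage, both Nanite and Lumen have many flaws: Nanite supports only static objects, Lumen suffers from noise and light leaking, TSR flickers and blurs, shadow precision is insufficient (see the figure below), and a large number of traditional features are unsupported.

Blurred objects and shadow artifacts that appear when the camera is close enough to an object.

Although UE5 currently has many flaws, it is a sapling bathed in sunshine, rain, and dew. Carefully nurtured by Epic Games, it will in time grow into a towering, leafy tree that shelters every industry connected with the UE engine. UE5 No.1!!!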

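Special Notes

  • Thanks to the authors of all the references. Some images come from the references and the internet and will be removed on request if they infringe.
  • This series is the author's original work and is published only on Cnblogs. Sharing the link to this article is welcome, but reposting without permission is not allowed.
  • This series is ongoing; for the full table of contents, see the content outline.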

References