2023-05-23

How to return an object count from a compute shader

I've implemented occlusion culling via a compute shader, in conjunction with indirect rendering, in HLSL on DX12.

I would like to read back on the CPU a count of the objects that were culled, so I can print it to the console.

Most of my code is adapted from existing examples, and I don't know the best way to do what I guess is a reduction. I've seen things like InterlockedAdd, but I don't know whether that's the right route to take.

My current code looks like this (details of culling omitted):

SamplerState DepthSampler                                  : register(s0);
StructuredBuffer<IndirectCommand> inputCommands            : register(t0);      // SRV: Indirect commands
StructuredBuffer<VSIndirectConstants> indirectConstants    : register(t1);      // SRV: of per-object constants
StructuredBuffer<TransformData> TransformBuffer            : register(t2);      // SRV: transforms (per object)
Texture2D<float> DepthTexture                              : register(t3);
AppendStructuredBuffer<IndirectCommand> outputCommands     : register(u0);      // UAV: Processed indirect commands

bool isOccluded(uint index)
{
    bool occluded = false;
    uint transformIndex = indirectConstants[index].transformIndex;
    TransformData tData = TransformBuffer[transformIndex];
    VSIndirectConstants constants = indirectConstants[index];
    ...
}

[numthreads(threadBlockSize, 1, 1)]
void main(uint3 groupId : SV_GroupID, uint groupIndex : SV_GroupIndex)
{
    // Each thread of the CS operates on one of the indirect commands.
    uint index = (groupId.x * threadBlockSize) + groupIndex;

    // Don't attempt to access commands that don't exist if more threads are allocated
    // than commands.
    if (index < (uint)commandCount)
    {
        if (isWithinFrustum(index) && !isOccluded(index))
        {
            outputCommands.Append(inputCommands[index]);
        }
    }
}

I'd just like to increment a counter whenever an object is culled and read it back on the CPU efficiently. I'd appreciate suggestions on the approach to take; I shouldn't need too much detail code-wise.
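
To make the question concrete, here is a rough sketch of what the InterlockedAdd route might look like if added to the shader above. It is not a tested implementation. The buffer name culledCounter and its slot u1 are hypothetical, and the sketch assumes the application clears the counter to zero before each dispatch and adds a matching UAV entry to the root signature. It reuses threadBlockSize, commandCount, isWithinFrustum and isOccluded from the code above.

```hlsl
// Hypothetical counter: a single 32-bit uint stored at byte offset 0.
// Assumed to be bound at u1 and cleared to zero before every dispatch.
RWByteAddressBuffer culledCounter : register(u1);

[numthreads(threadBlockSize, 1, 1)]
void main(uint3 groupId : SV_GroupID, uint groupIndex : SV_GroupIndex)
{
    // Each thread of the CS operates on one of the indirect commands.
    uint index = (groupId.x * threadBlockSize) + groupIndex;

    if (index < (uint)commandCount)
    {
        if (isWithinFrustum(index) && !isOccluded(index))
        {
            outputCommands.Append(inputCommands[index]);
        }
        else
        {
            // Atomically add 1 to the counter for each culled object.
            // The previous value is returned but not needed here.
            uint previous;
            culledCounter.InterlockedAdd(0, 1, previous);
        }
    }
}
```

On the CPU side, the usual pattern would be to copy those 4 bytes into a buffer in a readback heap (D3D12_HEAP_TYPE_READBACK) with CopyBufferRegion, then Map it once a fence shows the GPU has finished that frame. A separate counter may not even be necessary: in D3D12 the hidden counter of an AppendStructuredBuffer lives in a resource the application supplies when creating the UAV, so the culled count could also be derived as commandCount minus that counter's value after the dispatch.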


