Compute shader synchronization

Furtys · Jun 26, 2017

Hello!

I need your help today! I am starting to work with compute shaders on a really simple use case:
I have a depth camera and I want to calculate the bounding box of an object close to the camera.

But I have too many pixels to process, so I want to use GPGPU: a compute shader that processes them in parallel.

My current problem is that when I run my program, I get the same min and max coordinates. So I think that all my groups and threads write to my StructuredBuffers at the same time.

Do you have any idea how to fix that?

Thanks a lot!

PS: Sorry for my English, I'm French

Here is the code of my compute shader:

Code (HLSL Compute Shader):

#pragma kernel ComputeBoundingBox

//We define the size of a group in the x, y and z directions, z direction will just be one
#define thread_group_size_x 1024
#define thread_group_size_y 1
#define thread_group_size_z 1

//Size of the depthData frame
#define width 512;
#define height 424;

//DataBuffer = depthData of the camera
//minBuffer, maxBuffer, array of size 3 with min/max x, y and z
//mask = image area to process
RWStructuredBuffer<float> dataBuffer;
globallycoherent RWStructuredBuffer<float> minBuffer;
globallycoherent RWStructuredBuffer<float> maxBuffer;
RWStructuredBuffer<float> mask;

float xValue = 0, yValue = 0, zValue = 0;

[numthreads(thread_group_size_x, thread_group_size_y, thread_group_size_z)]
void ComputeBoundingBox(uint3 id : SV_DispatchThreadID)
{
    xValue = (id.x + 1) % width;
    yValue = (id.x + 1) / width;
    zValue = dataBuffer[id.x];

    if (mask[id.x] > 0.49)
    {
        if (zValue > 500 && zValue < 1500)
        {
            if (xValue < minBuffer[0])
                minBuffer[0] = xValue;
            else if (xValue > maxBuffer[0])
                maxBuffer[0] = xValue;

            if (yValue < minBuffer[1])
                minBuffer[1] = yValue;
            else if (yValue > maxBuffer[1])
                maxBuffer[1] = yValue;

            if (zValue < minBuffer[2])
                minBuffer[2] = zValue;
            else if (zValue > maxBuffer[2])
                maxBuffer[2] = zValue;
        }
    }
}

LukasCh · Jun 28, 2017

Yes, this does require synchronization. You can try interlocked operations (https://msdn.microsoft.com/en-us/library/windows/desktop/ff471411(v=vs.85).aspx). That said, I really recommend that you rethink this code, because performance is going to be extremely poor:
- All your threads and thread groups are going to fight for locks, as basically all the work is in critical sections
- You have a lot of branching, so only a few threads in each group will actually be doing work at any time
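To make the interlocked suggestion concrete, here is a minimal sketch of one possible rewrite, not a tested drop-in replacement. It rests on several assumptions that are not in the original post: `InterlockedMin` and `InterlockedMax` operate on `int`/`uint` values rather than `float`, so `minBuffer` and `maxBuffer` are redeclared as `uint` buffers and the depth is truncated to a whole number; the calling script is assumed to reset `minBuffer` to `0xFFFFFFFF` and `maxBuffer` to `0` before every dispatch; pixels are assumed to be stored row by row with a zero-based index. The names `WIDTH`, `HEIGHT`, `GROUP_SIZE`, `groupMin` and `groupMax` are introduced only for this illustration. To reduce the lock contention mentioned above, each group first reduces into `groupshared` memory and then issues only a handful of atomics on the global buffers.

```hlsl
#pragma kernel ComputeBoundingBox

#define GROUP_SIZE 1024
#define WIDTH 512      // no trailing semicolon, so the macro is safe inside expressions
#define HEIGHT 424

StructuredBuffer<float> dataBuffer;   // depth value per pixel (read-only here)
StructuredBuffer<float> mask;         // image area to process (read-only here)

// Assumption: uint buffers of size 3 (x, y, z), reset by the calling script
// before each dispatch: minBuffer to 0xFFFFFFFF, maxBuffer to 0.
RWStructuredBuffer<uint> minBuffer;
RWStructuredBuffer<uint> maxBuffer;

// Per-group partial results, so most atomics stay in fast group memory.
groupshared uint groupMin[3];
groupshared uint groupMax[3];

[numthreads(GROUP_SIZE, 1, 1)]
void ComputeBoundingBox(uint3 id : SV_DispatchThreadID, uint gi : SV_GroupIndex)
{
    // The first three threads of each group initialise the shared slots.
    if (gi < 3)
    {
        groupMin[gi] = 0xFFFFFFFF;
        groupMax[gi] = 0;
    }
    GroupMemoryBarrierWithGroupSync();

    // Per-thread values live in locals, not in global variables.
    uint index = id.x;
    if (index < WIDTH * HEIGHT)
    {
        float z = dataBuffer[index];
        if (mask[index] > 0.49 && z > 500 && z < 1500)
        {
            uint x = index % WIDTH;
            uint y = index / WIDTH;
            uint depth = (uint)z;   // positive here, so truncation keeps ordering

            // No if/else chain: a pixel may be both the new min and the new max.
            InterlockedMin(groupMin[0], x);
            InterlockedMax(groupMax[0], x);
            InterlockedMin(groupMin[1], y);
            InterlockedMax(groupMax[1], y);
            InterlockedMin(groupMin[2], depth);
            InterlockedMax(groupMax[2], depth);
        }
    }
    // Both barriers sit outside the branches, so every thread reaches them.
    GroupMemoryBarrierWithGroupSync();

    // Three threads per group merge the partial result into the global buffers.
    if (gi < 3)
    {
        InterlockedMin(minBuffer[gi], groupMin[gi]);
        InterlockedMax(maxBuffer[gi], groupMax[gi]);
    }
}
```

With a 512 x 424 frame and 1024 threads per group, this would be dispatched with 212 groups in x (217088 / 1024). If no pixel passes the mask and depth tests, the buffers keep their reset values, which the script can treat as "no object found".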
