Monday, 12 August 2013

Why not allow sequential execution across a grid dimension in CUDA?

Just wondering: is there a reason there isn't an option to force blocks
to execute sequentially along a particular grid dimension?
For example, all the blocks (and their threads) with blockIdx.x == k would
run concurrently, but the blocks with blockIdx.x == k+1 would start only
after every block with blockIdx.x == k has finished.
Obviously, if gridDim.y == 1 or some other small number, this wouldn't be
very efficient, since few blocks could run at once. But in some cases it
would let you prevent race conditions.
Is it just not worth the trouble for NVIDIA to add such a feature, or is
there an actual reason for not allowing this?
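
To make the desired ordering concrete, here is a minimal sketch of how the same effect can be approximated today: instead of one grid with k in blockIdx.x, launch one kernel per slice k in the same stream, since kernels issued to a single stream run in issue order. The kernel process_slice, the helper slices_ran_in_order, and the toy recurrence (row k depends on neighbouring elements of row k-1, which may belong to a different block) are all hypothetical names and data made up for illustration, not anything from an existing code base.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: handles one slice k. The slice index is a kernel
// argument here, standing in for blockIdx.x in the single-grid version.
__global__ void process_slice(int *rows, int width, int k)
{
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j >= width) return;
    const int *prev = rows + (k - 1) * width;  // written by slice k-1
    int *cur = rows + k * width;
    // Reads an element that another block of slice k-1 may have produced,
    // so slice k-1 must be completely finished before this runs.
    cur[j] = prev[j] + prev[(j + 1) % width];
}

// Hypothetical helper: runs slices 1..n_slices-1 one after another and
// checks that every element of row k equals 2^k (row 0 is all ones).
bool slices_ran_in_order(int n_slices, int width)
{
    std::vector<int> host(static_cast<size_t>(n_slices) * width, 0);
    for (int j = 0; j < width; ++j) host[j] = 1;

    int *dev = nullptr;
    size_t bytes = host.size() * sizeof(int);
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return false;
    cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 128;
    int blocks = (width + threads - 1) / threads;
    // All launches go to the default stream, so launch k starts only
    // after launch k-1 has completed. Blocks within a launch still run
    // concurrently.
    for (int k = 1; k < n_slices; ++k)
        process_slice<<<blocks, threads>>>(dev, width, k);

    // This blocking copy also waits for the kernels issued above.
    cudaError_t err = cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return false;

    for (int k = 0; k < n_slices; ++k)
        for (int j = 0; j < width; ++j)
            if (host[static_cast<size_t>(k) * width + j] != (1 << k)) return false;
    return true;
}

int main()
{
    bool ok = slices_ran_in_order(10, 1000);
    std::printf("%s\n", ok ? "ordered" : "failed");
    return ok ? 0 : 1;
}
```

The cost of this approach is one kernel launch per slice, which is part of why a built-in ordering along a grid dimension would be attractive when there are many small slices.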
