Mark Harris’ talk on Tuesday evening was really interesting. He outlined some of the key algorithms for data-parallel computing:
- (Map) Reduce – e.g. Sum
- Split – e.g. Radix Sort, Trees
- Compact – remove unneeded elements
- Allocate – variable output per thread (e.g. "marching cubes")
The key to all of these turns out to be implementing Parallel Prefix Sum (Scan). The GTC decks aren’t online yet (there is an older deck from SC06), but scan implementations in CUDA are covered in GPU Gems 3, which is online. I’m looking at this because I’d like to implement a Barnes-Hut tree code on the GPU.
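To make the "scan is the key" point a bit more concrete, here is a rough sketch of how compact falls out of an exclusive scan. This is plain sequential Python rather than CUDA, and it is not the code from the talk or from GPU Gems 3; the function names `exclusive_scan` and `compact` are my own placeholders. The scan uses the up-sweep/down-sweep pattern, where each inner `for` loop is the part a GPU would run with one thread per iteration.

```python
def exclusive_scan(values):
    """Exclusive prefix sum using an up-sweep/down-sweep pattern.

    Sequential sketch: each inner loop touches disjoint elements, so on a
    GPU its iterations could run in parallel.
    """
    n = len(values)
    if n == 0:
        return []

    # Pad to a power of two so the implicit balanced tree is complete.
    size = 1
    while size < n:
        size *= 2
    data = list(values) + [0] * (size - n)

    # Up-sweep (reduce): build partial sums up the tree.
    stride = 1
    while stride < size:
        for i in range(2 * stride - 1, size, 2 * stride):
            data[i] += data[i - stride]
        stride *= 2

    # Down-sweep: clear the root, then push prefix sums back down.
    data[size - 1] = 0
    stride = size // 2
    while stride >= 1:
        for i in range(2 * stride - 1, size, 2 * stride):
            left = data[i - stride]
            data[i - stride] = data[i]
            data[i] += left
        stride //= 2

    return data[:n]


def compact(values, keep):
    """Stream compaction: keep elements where keep(v) is true.

    The scan of the 0/1 flags gives each surviving element its output index,
    so every element can be written independently of the others.
    """
    if not values:
        return []
    flags = [1 if keep(v) else 0 for v in values]
    offsets = exclusive_scan(flags)
    total = offsets[-1] + flags[-1]  # number of survivors
    out = [None] * total
    for i, v in enumerate(values):
        if flags[i]:
            out[offsets[i]] = v
    return out


if __name__ == "__main__":
    print(exclusive_scan([3, 1, 7, 0, 4, 1, 6, 3]))
    print(compact([3, 0, 7, 0, 4, 0], lambda v: v != 0))
```

The same offsets trick is what makes allocate work: scan the per-thread output counts instead of 0/1 flags and each thread knows where to write. A sum reduction also drops out for free, as the last scan value plus the last input element.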
This morning’s keynote was thought-provoking to say the least… Computational chemistry on GPGPUs, literally trying to cure (diseases like) cancer.