EE5355: Algorithmic Techniques for Scalable Many-core Computing
3 Credits | Field Study | Online Available
Algorithmic techniques for enhancing the scalability of parallel software: scatter-to-gather, problem decomposition, binning, privatization, tiling, regularization, compaction, double-buffering, and data layout. These techniques address the most challenging problems in building scalable parallel software: limited parallelism, data contention, insufficient memory bandwidth, load imbalance, and communication latency. Programming assignments reinforce understanding of the techniques.
Prerequisites: basic knowledge of CUDA, experience working in a Unix environment, and experience developing and running scientific codes written in C or C++. Completion of EE 5351 is not required but is highly recommended.