Introduction to heterogeneous systems
- GPU architecture
- CUDA
- Programming and execution models
- Memory organization in CUDA
- Task-level parallelism: streams, events, dynamic parallelism
- Tools: CUDA compiler, profiler, and debugger
Optimizing parallel patterns in CUDA with some case studies:
Brief overview of OpenACC, OpenCL, and HSA foundations
CSE 599I: Accelerated Computing - Programming GPUs (tschmidt23.github.io)
Heterogenous Parallel Programming - CUDA Programming - YouTube