Anatomy of an Autograd
cuDNN!
Level 1: DNN
Resources
- GPU MODE Lecture 6: Optimizing Optimizers
- GPU MODE Lecture 7: Advanced Quantization
- GPU MODE Lecture 11: Sparsity
- GPU MODE Lecture 12: Flash Attention
- GPU MODE Lecture 13: Ring Attention
- GPU MODE Lecture 30: Quantized Training
- GPU MODE Lecture 60: Optimizing Linear Attention
- GPU MODE Lecture 65: Neighborhood Attention
- GPU MODE Lecture 73: Quantization in Large Models
- GPU MODE Lecture 23: Tensor Cores
- GPU MODE Lecture 15: CUTLASS
- GPU MODE Lecture 36: CUTLASS and Flash Attention 3
- GPU MODE Lecture 57: CuTe
Differentiable Compilation
Resources