3. Scaling Networks
age of scaling (2020 - 2025)
Contents
- 3.1 Scaling Large Language Models
- 3.2 Accelerating nanogpt with FlashAttention
- 3.3 Programming a Compiled and Distributed Tensor
OpCode, OpNode Intermediate Graph Representation, ExecItem Kernelizer/Fuser, Scheduler, Runtime, Allocator, Heterogeneous Runtime