Machine Learning Systems: A Programmer's Perspective

The hacker's guide to accelerated machine learning
Train nanogpt and nanochat with picograd: the bridge from micrograd to tinygrad
Implement your own distributed parallel compiler for differentiating high dimensional functions (from scratch)

by j4orz.
First Edition.