Le Moyen Age et la Renaissance Paris. 1848-1859

1. Elements of Learning

A computational process is indeed much like a sorcerer's idea of a spirit. It cannot be seen or touched. It is not composed of matter at all. However, it is very real. It can perform intellectual work. It can answer questions. — Harold Abelson and Gerald Sussman

1.1 Regression with Least Squares

all the math and axiomatizations
distributions, high dimensional functions, optimization

import numpy
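As a placeholder for this section, here is a minimal sketch of least-squares regression on a tiny synthetic dataset. The helper name `fit_least_squares` and the data (points on the line y = 2x + 1) are assumptions made for illustration; the only library call used is NumPy's `numpy.linalg.lstsq`.

```python
import numpy as np

def fit_least_squares(X, y):
    """Return the weights w that minimize ||X w - y||^2 (hypothetical helper)."""
    # lstsq returns (solution, residuals, rank, singular values); keep the solution.
    w, _residuals, _rank, _singular_values = np.linalg.lstsq(X, y, rcond=None)
    return w

# Synthetic, noise-free data on the line y = 2x + 1 (an assumption for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# Design matrix: one column for the slope, one column of ones for the intercept.
X = np.column_stack([x, np.ones_like(x)])

slope, intercept = fit_least_squares(X, y)
print(slope, intercept)
```

Because the data are noise-free, the recovered slope and intercept should match the generating line up to floating-point error.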

1.2 Matrix Multiply on Serial Processors

programmers: map abstraction to assembly. semantic gap. computation: orchestrating electrons.

As mentioned in the introduction, the Ancient Greeks were among the first people to discover the beauty in mathematics and programming, even though they executed algorithms such as Egyptian Multiplication and the Euclidean GCD Algorithm manually, carrying out each instruction by hand. For the next 2000 years humanity would discover and invent new algorithms (like Newton's method, which will soon become our friend), but execution remained a manual process, the profession of human "computers", until a group of mathematicians in the 1930s (Alonzo Church, Kurt Gödel, and Alan Turing among them) found formal models of computation itself. Although these computational models have the scary-sounding names of the lambda calculus, recursive functions, and Turing machines, the simple yet powerful formal result is that all three are able to express one another. This equivalence motivates the conjectured Church-Turing thesis: any algorithm can be computed by a Turing machine (or an equivalent computational model).
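The two hand-executed algorithms named above can be sketched in a few lines. This is an illustrative version, not a historical reconstruction: the function names are hypothetical, the multiplication assumes non-negative integers, and the GCD uses the modern remainder form rather than repeated subtraction.

```python
def egyptian_multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers using only doubling, halving, and addition."""
    result = 0
    while b > 0:
        if b & 1:       # b is odd: the current doubling of a contributes to the product
            result += a
        a += a          # double a
        b >>= 1         # halve b, discarding the remainder
    return result

def euclid_gcd(a: int, b: int) -> int:
    """Greatest common divisor of two non-negative integers (remainder form)."""
    while b != 0:
        a, b = b, a % b  # gcd(a, b) == gcd(b, a mod b)
    return a

print(egyptian_multiply(41, 59))  # 2419
print(euclid_gcd(1071, 462))      # 21
```

Both loops are the kind of procedure a human "computer" could carry out with pen and paper: each step needs only addition, halving, or a remainder.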

von Neumann architecture. mainframes, minis, micros.
C on the DEC PDP-11. IBM green card (Intel/AMD/ARM CPU illusion: still a PDP-11) to RISC green card.
operating systems and compilers were the first applications to use C. but in numerical computing, there was the BLAS.
let's make matmul the fastest on CPU.
matmul is linear transformation, and linear transformation is matmul.

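import numpy as np

def matmul(A, B):
  raise NotImplementedError("todo")

if __name__ == "__main__":
  N = 4096
  # Two random N x N operands, so the call below matches the signature of matmul.
  A = np.random.randn(N, N)
  B = np.random.randn(N, N)
  matmul(A, B)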
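Until the stub is filled in, a naive triple loop gives a baseline to optimize against. The sketch below is one possible starting point, not the author's implementation: the name `matmul_naive` is hypothetical, it assumes 2-D NumPy arrays, and it uses a small N because pure-Python loops would be far too slow at N = 4096.

```python
import numpy as np

def matmul_naive(A, B):
    """C[i, j] = sum over p of A[i, p] * B[p, j], computed with three nested loops."""
    n, k = A.shape
    k2, m = B.shape
    if k != k2:
        raise ValueError("inner dimensions must match")
    C = np.zeros((n, m), dtype=np.float64)
    for i in range(n):          # rows of A
        for j in range(m):      # columns of B
            acc = 0.0
            for p in range(k):  # dot product of row i of A with column j of B
                acc += A[i, p] * B[p, j]
            C[i, j] = acc
    return C

N = 32  # deliberately small; an assumption to keep the pure-Python loops fast
rng = np.random.default_rng(0)
A = rng.standard_normal((N, N))
B = rng.standard_normal((N, N))
C = matmul_naive(A, B)
print(np.allclose(C, A @ B))
```

The comparison against NumPy's `@` operator serves as the reference answer; any faster variant should produce the same result within floating-point tolerance.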

function matmul(A,B) {
  throw new Error("todo");
}

fn matmul(x: Vec<f64>, y: Vec<f64>) -> Vec<f64> {
  todo!()
}

fn main() {
  let n = 4096;
}

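#include <cuda_runtime.h>
#include <stdio.h>

// Elementwise add: each thread computes one element of C = A + B.
// n is the number of elements; the guard stops surplus threads in the
// last block from reading or writing out of bounds.
__global__ void addKernel(int *C, const int *A, const int *B, int n) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        C[tid] = A[tid] + B[tid];
    }
}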

correctness
perf
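One way to make these two notes concrete is a small harness that checks a matmul against a reference and reports throughput. This is a sketch under assumptions: `check_and_time` is a hypothetical helper, NumPy's `@` stands in as the reference, the tolerances are arbitrary, and the FLOP count uses the conventional 2n^3 for an n x n product.

```python
import time
import numpy as np

def check_and_time(matmul_fn, n=256, repeats=3, seed=0):
    """Return (is_correct, gflops) for matmul_fn on random n x n float32 inputs."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n)).astype(np.float32)
    B = rng.standard_normal((n, n)).astype(np.float32)

    # Correctness: compare against NumPy's matmul as the reference answer.
    C = matmul_fn(A, B)
    is_correct = bool(np.allclose(C, A @ B, rtol=1e-4, atol=1e-4))

    # Perf: keep the best wall-clock time over a few repeats.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        matmul_fn(A, B)
        best = min(best, time.perf_counter() - start)

    flops = 2.0 * n**3                 # conventional count: one multiply and one add per term
    gflops = flops / max(best, 1e-12) / 1e9
    return is_correct, gflops

is_correct, gflops = check_and_time(np.matmul)
print(f"correct={is_correct} gflops={gflops:.1f}")
```

Passing a different callable with the same `(A, B)` signature lets the same harness compare implementations; the reported number depends on the machine, so no expected output is given.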

SITP

1. Elements of Learning

1.1 Regression with Least Squares

1.2 Matrix Multiply on Serial Processors

1.3 Matrix Multiply on Parallel Processors

1.4 Tensor Languages, Device Runtimes

1.5 Matrix Calculus for Optimization