🤖 Machine Learning
Based on Deisenroth, Faisal & Ong, Mathematics for Machine Learning, licensed CC BY 4.0.
Also draws from Zhang et al., Dive into Deep Learning, licensed CC BY-SA 4.0.
Scheme REPLs run in the browser; Python equivalents appear in collapsible blocks.
| # | Chapter | Key idea | |
|---|---|---|---|
| 1. | What is ML? | Function approximation from data — supervised, unsupervised, reinforcement | 🤖 |
| 2. | Linear Regression | Find weights that minimize squared error — normal equation or gradient descent | 🤖 |
| 3. | Optimization | Follow the negative gradient downhill — learning rate, convexity, convergence | 🤖 |
| 4. | Logistic Regression | Squash linear output through sigmoid for probabilities — cross-entropy loss | 🤖 |
| 5. | Kernel Methods | Replace dot products with kernel functions to learn nonlinear boundaries | 🤖 |
| 6. | Neural Networks | Stack linear layers with nonlinearities — universal approximation by composition | 🤖 |
| 7. | Backpropagation | The chain rule applied layer by layer — how gradients flow backward through a network | 🤖 |
| 8. | Regularization | Penalize complexity to prevent overfitting — L1, L2, dropout, early stopping | 🤖 |
| 9. | Convolutional Networks | Exploit spatial structure — shared filters, pooling, translation equivariance | 🤖 |
| 10. | Recurrent Networks | Process sequences by feeding hidden state forward — vanishing gradients and LSTMs | 🤖 |
| 11. | Dimensionality Reduction | PCA finds the axes of maximum variance — compress data without losing structure | 🤖 |
| 12. | Clustering | Group data by similarity — k-means, EM, and the bias-variance tradeoff in unsupervised learning | 🤖 |
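As a taste of chapters 2 and 3, here is a minimal Python sketch of fitting a line by gradient descent on squared error. The synthetic data, learning rate, and step count are illustrative choices, not taken from any chapter; only NumPy is assumed.

```python
import numpy as np

# Synthetic data: y = 3x + 1 plus a little Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100)
y = 3.0 * x + 1.0 + rng.normal(0, 0.1, 100)

# Gradient descent on mean squared error for y_hat = w*x + b.
w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    err = w * x + b - y              # residuals of current fit
    w -= lr * 2 * np.mean(err * x)   # d(MSE)/dw
    b -= lr * 2 * np.mean(err)       # d(MSE)/db

print(w, b)  # typically lands near w = 3.0, b = 1.0
```

The same weights drop out in closed form from the normal equation; the iterative version is shown because it generalizes to the models in later chapters, where no closed form exists.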
📺 Video lectures: Stanford CS229: Machine Learning · 3Blue1Brown: Neural Networks
Neighbors
- 📐 Linear Algebra — the language of weights, features, and transformations
- 🎰 Probability — generative models and Bayesian inference
- ∫ Calculus — gradient descent and backpropagation
- 💡 Information Theory — cross-entropy loss and mutual information
- 🏛️ Cognitive Architecture — transformers as cognitive models