Flash Attention derived and coded from first principles with Triton (Python)
Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation
ML Interpretability: feature visualization, adversarial examples, interpretability for language models
Kolmogorov-Arnold Networks: MLP vs KAN, Math, B-Splines, Universal Approximation Theorem
Reinforcement Learning from Human Feedback explained with math derivations and the PyTorch code
Mamba and S4 Explained: Architecture, Parallel Scan, Kernel Fusion, Recurrent, Convolution, Math
Distributed Training with PyTorch: complete tutorial with cloud infrastructure and code
Quantization explained with PyTorch - Post-Training Quantization, Quantization-Aware Training
Retrieval Augmented Generation (RAG) Explained: Embedding, Sentence BERT, Vector Database (HNSW)
BERT explained: Training, Inference, BERT vs GPT/LLaMA, Fine-tuning, [CLS] token
๐ Coding Stable Diffusion From Scratch
๐ฆ Coding LLaMA 2 From Scratch
๐ฆ LLaMA explained: KV-Cache, Rotary Positional Embedding, RMS Norm, Grouped Query Attention, SwiGLU
๐ Segment Anything - Model explanation with code
LoRA: Low-Rank Adaptation of Large Language Models - Explained visually + PyTorch code from scratch (see the LoRA sketch below)
LongNet: Scaling Transformers to 1,000,000,000 tokens: Python Code + Explanation
How diffusion models work - explanation and code!
Variational Autoencoder - Model, ELBO, loss function and maths explained easily!
Coding a Transformer from scratch in PyTorch, with full explanation, training and inference
Attention is all you need (Transformer) - Model explanation (including math), Inference and Training (see the attention sketch below)
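
As a companion to the two Transformer entries above, here is a minimal sketch of scaled dot-product attention, the operation at the heart of "Attention is all you need". This is an illustrative example in plain PyTorch, not the code built in the videos; the function name and tensor shapes are assumptions of mine.

```python
import math

import torch


def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    # q, k, v: (batch, heads, seq_len, head_dim)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        # Positions where mask == 0 cannot be attended to (e.g. a causal mask).
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)  # each row sums to 1
    return weights @ v
```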
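
And for the LoRA entry: a minimal sketch of low-rank adaptation, where the pretrained weight is frozen and only a rank-r update B A, scaled by alpha / r, is trained. The class name, initialization, and default hyperparameters are illustrative assumptions, not the video's exact code.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    # Wraps a pretrained nn.Linear: output = W x + (alpha / r) * B A x,
    # with W frozen and only A (r x in_features) and B (out_features x r) trainable.
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: starts identical to the base model
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scaling


# Usage: wrap an existing layer; only the two LoRA matrices receive gradients.
layer = LoRALinear(nn.Linear(512, 512))
out = layer(torch.randn(2, 10, 512))
```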