Darshan Fofadiya

Senior Applied Scientist at Amazon | AI Researcher
Designing Novel Transformer Architectures


The Illustrated Guide to LLM Inference at Scale

14 parts across 3 phases · 4 published

A visual, math-heavy deep dive into running Llama-70B with a 1 million token context window on GPU clusters. From memory constraints through parallelism strategies to production serving — with step-by-step calculations and animations.
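As a rough illustration of the memory constraints involved, here is a minimal back-of-the-envelope sketch of the KV cache and weight footprint for a 70B model at a 1 million token context. The layer count, KV-head count, head dimension, and 16-bit precision are assumptions based on the publicly documented Llama-2-70B configuration, not figures quoted from the series, and the function names are hypothetical.

```python
# Back-of-the-envelope memory estimate for a 70B model with a long context.
# ASSUMPTIONS (not taken from the series): Llama-2-70B-style dimensions
# with grouped-query attention and 16-bit weights and KV cache.
NUM_LAYERS = 80
NUM_KV_HEADS = 8          # grouped-query attention: 8 KV heads
HEAD_DIM = 128
BYTES_PER_VALUE = 2       # fp16 / bf16
NUM_PARAMS = 70_000_000_000
CONTEXT_TOKENS = 1_000_000


def kv_cache_bytes_per_token(layers, kv_heads, head_dim, bytes_per_value):
    """Bytes of KV cache stored for one token across all layers."""
    # Factor of 2: one key vector and one value vector per KV head per layer.
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def weight_bytes(num_params, bytes_per_value):
    """Bytes needed to hold the model weights at the given precision."""
    return num_params * bytes_per_value


per_token = kv_cache_bytes_per_token(
    NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, BYTES_PER_VALUE
)
kv_total_gb = per_token * CONTEXT_TOKENS / 1e9
weights_gb = weight_bytes(NUM_PARAMS, BYTES_PER_VALUE) / 1e9

print(f"KV cache per token: {per_token} bytes")
print(f"KV cache at {CONTEXT_TOKENS:,} tokens: {kv_total_gb:.2f} GB")
print(f"Weights at 16-bit: {weights_gb:.0f} GB")
```

Under these assumptions, a single 1M-token sequence needs more memory for its KV cache than the weights themselves, which is why one GPU is not enough and the work has to be split across a cluster.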

Read the series →
Phase 1: Parallelism
Phase 2: Quantization
  • Part 6: Why Quantization — The Memory Multiplier
  • Part 7: Post-Training Quantization (PTQ)
  • Part 8: Weight-Only vs Weight-Activation Quantization
  • Part 9: Advanced Quantization — GGUF, AQLM, QuIP#
Phase 3: Production
  • Part 10: Throughput & Latency
  • Part 11: Serving Mixed Workloads
  • Part 12: KV Cache Management
  • Part 13: Multi-Node Deployment
  • Part 14: Code Walkthrough

Zero to Quantum

14 parts across 4 phases · 1 published

From knowing nothing about quantum computing to understanding and designing quantum algorithms. Every concept from first principles — qubits, gates, Grover's search, Shor's algorithm, variational methods, error correction — with math, code, and Colab notebooks you can run in your browser.

Read the series →
Phase 1: Foundations
Phase 2: Algorithms
  • Part 4: Shor's Algorithm
  • Part 5: Quantum Phase Estimation
  • Part 6: Quantum Walks
  • Part 7: Variational Quantum Eigensolver (VQE)
Phase 3: Applications
  • Part 8: Quantum Machine Learning
  • Part 9: Quantum Optimization (QAOA)
  • Part 10: Quantum Simulation
  • Part 11: Quantum Cryptography
Phase 4: Hardware & Error Correction
  • Part 12: Quantum Error Correction
  • Part 13: Quantum Hardware Deep Dive
  • Part 14: The Road to Fault Tolerance

Research Paper Deep Dives

Illustrated, math-heavy breakdowns of important papers

Open Source

Libraries and tools I'm building in the open
