< BLOG.FEED />

Writing

// dispatches from the intersection of math, ML, and madness

notes, reflections, and the occasional deep dive

SYSTEMS TUTORIAL

The Multi-GPU Training Gauntlet: Threads, Processes, and the GIL

Four attempts to parallelize PyTorch training across 8 GPUs -- from threading deadlocks to GIL starvation to torch.compile explosions. A war story about why OS process isolation beats every clever concurrency trick.

RL TUTORIAL

Landing the Actor-Critic: DQN, A2C & PPO in the Browser

How we built a Lunar Lander with DQN, A2C, and PPO in pure JavaScript -- ~4,200 parameters, zero GPUs, and a live actor-critic visualization that makes the GAN-RL connection click.

RL TUTORIAL

Teaching Pong to Play Itself: 3 RL Algorithms in the Browser

How we trained Q-Learning, DQN, and REINFORCE to play Pong in pure JavaScript -- 579 parameters, zero GPUs, and the reward shaping tricks the textbooks leave out. A build log of what worked, what didn't, and what we changed.

GAN RL MATH

When GANs Meet RL: The Adversarial Game Behind Generative AI

GANs and RL are secretly the same game. A mathematical deep dive into how generators are actors, discriminators are critics, and RLHF completes the picture -- through game theory, f-divergences, and the Boltzmann connection.

DIFFUSION MATH

Score Matching Explained: From Theory to Diffusion Models

A deep dive into score matching and DDPM -- the two intertwined threads behind modern diffusion models. From score functions to reverse-time SDEs to classifier-free guidance, with all the math you need.