RL PONG

Play against an AI that learns in real time. Hit AUTO-TRAIN or just start playing.

ALGO: Q-LEARNING | EPISODE: 0 | AI WIN %: 0 | ε: 1.00 | BOT: —

MOVE YOUR MOUSE TO PLAY

or arrow keys / W,S / touch

Q-Learning

The simplest RL algorithm here. It builds a big lookup table that stores, for every game state, an estimated value for each action (UP / STAY / DOWN); the best action is the one with the highest value. Each frame it updates one entry in the table based on the reward it received and the best value it expects from the next state. Early on it explores randomly (ε-greedy, with ε starting at 1.00), but as ε decays it increasingly exploits what it has learned. You should see improvement within ~20 rounds.
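
The description above can be sketched as a small tabular Q-learning agent. This is a minimal illustration, not the game's actual code: the class name `QLearner`, the hyperparameters (`alpha`, `gamma`, `epsilon_decay`, `epsilon_min`) and their default values are all assumptions, and the state is assumed to be any hashable value, such as a tuple of binned ball and paddle positions.

```python
import random
from collections import defaultdict

ACTIONS = ("UP", "STAY", "DOWN")


class QLearner:
    """Hypothetical tabular Q-learning agent with an epsilon-greedy policy."""

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=1.0,
                 epsilon_decay=0.995, epsilon_min=0.05, seed=None):
        # Lookup table: state -> one estimated value per action, all starting at 0.
        self.q = defaultdict(lambda: [0.0] * len(ACTIONS))
        self.alpha = alpha                  # learning rate (assumed value)
        self.gamma = gamma                  # discount factor (assumed value)
        self.epsilon = epsilon              # exploration rate, starts at 1.00
        self.epsilon_decay = epsilon_decay  # assumed multiplicative decay
        self.epsilon_min = epsilon_min      # assumed exploration floor
        self.rng = random.Random(seed)

    def act(self, state):
        """Return an action index: random with probability epsilon, else greedy."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(ACTIONS))
        values = self.q[state]
        return values.index(max(values))

    def update(self, state, action, reward, next_state, done):
        """Move one table entry toward reward + gamma * best next-state value."""
        if done:
            target = reward
        else:
            target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])

    def end_episode(self):
        """Decay epsilon once per round, never dropping below the floor."""
        self.epsilon = max(self.epsilon_min, self.epsilon * self.epsilon_decay)
```

In a game loop, each frame would call `act` on the current state, apply the chosen move, then call `update` with the observed reward and next state; `end_episode` would run after each point. How the real game encodes its states and rewards is not specified here, so those details are left to the caller.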