RL PONG
Play against an AI that learns in real time. Hit AUTO-TRAIN or just start playing.
ALGO Q-LEARNING
EPISODE 0
AI WIN % 0
ε 1.00
BOT —
Q-Learning
The simplest RL algorithm here. It builds a big lookup table that stores, for every game state, an estimated value for each action (UP / STAY / DOWN); the best action in a state is the one with the highest value. Each frame it updates one entry in the table based on the reward it received and the value of the state it ended up in. Early on it explores randomly (ε-greedy), but as ε decays it increasingly exploits what it has learned. You should see improvement within ~20 rounds.
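For the curious, the table update and ε-greedy choice described above can be sketched in a few lines of Python. This is a minimal illustration, not the code running on this page. The class name `QLearner`, the hyperparameters (`alpha`, `gamma`, `epsilon_decay`, `epsilon_min`) and their default values are all assumptions chosen for the example.

```python
import random
from collections import defaultdict

ACTIONS = ("UP", "STAY", "DOWN")


class QLearner:
    """Tabular Q-learning with epsilon-greedy exploration (illustrative sketch)."""

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=1.0,
                 epsilon_decay=0.99, epsilon_min=0.05, seed=None):
        # All default values here are assumptions, not the game's settings.
        self.alpha = alpha                  # learning rate
        self.gamma = gamma                  # discount factor for future reward
        self.epsilon = epsilon              # probability of a random action
        self.epsilon_decay = epsilon_decay  # multiplier applied once per episode
        self.epsilon_min = epsilon_min      # floor so some exploration remains
        self.rng = random.Random(seed)
        # The lookup table: state -> one estimated value per action.
        self.q = defaultdict(lambda: [0.0] * len(ACTIONS))

    def choose_action(self, state):
        """Return an action index: random with probability epsilon, else greedy."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(ACTIONS))
        values = self.q[state]
        return values.index(max(values))

    def update(self, state, action, reward, next_state, done=False):
        """Move one table entry toward reward + gamma * best next-state value."""
        best_next = 0.0 if done else max(self.q[next_state])
        target = reward + self.gamma * best_next
        self.q[state][action] += self.alpha * (target - self.q[state][action])

    def end_episode(self):
        """Decay epsilon so the agent explores less as it learns."""
        self.epsilon = max(self.epsilon_min, self.epsilon * self.epsilon_decay)
```

In a game loop, the state would need to be a small hashable value, for example a tuple of coarsely bucketed ball and paddle positions (again an assumption); each frame would call `choose_action`, apply the move, then call `update` with the observed reward.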