Reinforcement-Learning

The Loop Problem

Every RL agent that has played a text adventure has tried to take the lantern fifty times in a row. The fix is not better exploration heuristics. The fix is representing state properly.

Mar 24, 2026 · 10 min read

julia, bayesian, machine-learning, ai, reinforcement-learning