Reinforcement-Learning

The Loop Problem

Every RL agent that has played a text adventure has tried to take the lantern fifty times in a row. The fix is not better exploration heuristics. The fix is representing state properly.
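A toy sketch of what "representing state properly" can mean here, not any particular agent's implementation. Everything in it is a made-up assumption for illustration: the `State` class, the `step` transition function, the two-action game. The idea is that once the inventory is part of the state, "take lantern" visibly stops doing anything after the first time, whereas a room-only view cannot tell the two situations apart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """Hypothetical game state: the current room plus what the agent carries."""
    room: str
    inventory: frozenset = frozenset()


def step(state, action):
    """Toy transition function (an assumption, not a real game engine)."""
    if action == "take lantern" and "lantern" not in state.inventory:
        return State(state.room, state.inventory | {"lantern"})
    if action == "go north" and state.room == "cellar":
        return State("hall", state.inventory)
    return state  # the action changes nothing in this state


def useful_actions(state, actions):
    """Keep only the actions that lead to a different state."""
    return [a for a in actions if step(state, a) != state]


ACTIONS = ["take lantern", "go north"]
start = State("cellar")
after_take = step(start, "take lantern")

# Looking only at the room, the two situations are indistinguishable.
same_room = start.room == after_take.room
# With the inventory included, they are different states.
different_state = start != after_take
```

In this sketch `useful_actions(after_take, ACTIONS)` no longer contains "take lantern", because repeating it leaves the state unchanged. A real agent would not have `step` handed to it, but the same distinction is what lets it learn that the second attempt is pointless.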

julia, bayesian, machine-learning, ai, reinforcement-learning