Alternatives to Exploration and Exploitation in Q-Learning
Abstract
In Q-learning, the choice of the next action strongly affects learning performance. Exploration and exploitation are the two main approaches to action selection; although they are opposing concepts, they have complementary properties. The most common strategy is to favor exploration in the early stages of learning and exploitation in the later stages. Balancing these two choices is essential for the agent to learn the optimal policy efficiently. In this paper, we propose alternatives to exploration and exploitation in Q-learning and evaluate them experimentally. The proposed methods demonstrate reasonable performance.
Keywords - Reinforcement Learning, Q-Learning, Exploration, Exploitation, Optimal Policy
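The explore-early, exploit-late strategy mentioned in the abstract is commonly realized with an epsilon-greedy policy whose epsilon decays over episodes. The sketch below illustrates that standard baseline only; the function names and decay parameters are illustrative assumptions, not the alternatives proposed in this paper.

```python
import random

def epsilon_greedy(q_row, epsilon, rng=random):
    """Select an action from one row of the Q-table:
    explore (random action) with probability epsilon, else exploit (greedy)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))  # explore: uniformly random action
    return max(range(len(q_row)), key=lambda a: q_row[a])  # exploit: argmax Q

def decayed_epsilon(episode, start=1.0, end=0.05, decay=0.99):
    """Exploration-heavy early in training, exploitation-heavy later.
    Parameters here are illustrative, not taken from the paper."""
    return max(end, start * decay ** episode)
```

For example, `decayed_epsilon(0)` returns 1.0 (pure exploration), while after many episodes epsilon settles at the floor of 0.05, so the agent mostly exploits its learned Q-values.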