https://jjcit.org/paper/24
BAT Q-LEARNING ALGORITHM
10.5455/jjcit.71-1480540385
Bilal H. Abed-alguni
Q-learning, Bat algorithm, Optimization, Cooperative reinforcement learning.
17
333
156
2016-11-30
2017-02-01
2017-02-23
The cooperative Q-learning approach allows multiple learners to learn independently and then share their
Q-values with one another using a Q-value sharing strategy. A main problem with this approach is that
the learners' solutions may not converge to optimality, because the optimal Q-values may not be
found. Another problem is that some cooperative algorithms perform very well on single-task problems,
but quite poorly on multi-task problems. This paper proposes a new cooperative Q-learning algorithm
called the Bat Q-learning algorithm (BQ-learning) that implements a Q-value sharing strategy based on
the Bat algorithm. The Bat algorithm is a powerful optimization algorithm that increases the likelihood of
finding the optimal Q-values by tuning its parameters to balance the exploration and exploitation of
actions. The BQ-learning algorithm was tested on two problems: the shortest
path problem (single-task problem) and the taxi problem (multi-task problem). The experimental results
suggest that BQ-learning performs better than single-agent Q-learning and some well-known cooperative
Q-learning algorithms.
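
The general cooperative scheme described above (independent learning followed by Q-value sharing) can be sketched as follows. This is a minimal illustration only, not the BQ-learning algorithm itself: the function names `q_update` and `share_max`, the dictionary-based Q-table, the values of `alpha` and `gamma`, and the element-wise maximum sharing rule are all assumptions made for the example. The Bat-algorithm-based sharing strategy proposed in the paper is not reproduced here.

```python
def q_update(q, state, action, reward, next_state, n_actions,
             alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update to the table `q`.

    `q` maps (state, action) pairs to Q-values; missing entries count as 0.0.
    `alpha` (learning rate) and `gamma` (discount factor) are illustrative.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in range(n_actions))
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


def share_max(q_tables):
    """Hypothetical sharing step: each learner adopts, for every
    (state, action) pair, the largest Q-value held by any learner.

    This is a simple stand-in for a Q-value sharing strategy, not the
    Bat-based strategy of BQ-learning.
    """
    keys = set().union(*(q.keys() for q in q_tables))
    merged = {k: max(q.get(k, 0.0) for q in q_tables) for k in keys}
    # Give every learner its own copy so later updates stay independent.
    return [dict(merged) for _ in q_tables]


# Two learners update independently, then share their Q-values.
learner_a, learner_b = {}, {}
q_update(learner_a, state=0, action=1, reward=1.0, next_state=1, n_actions=2)
q_update(learner_b, state=1, action=0, reward=2.0, next_state=0, n_actions=2)
learner_a, learner_b = share_max([learner_a, learner_b])
```

After the sharing step both learners hold the same table, which is why the choice of sharing rule matters: a rule that spreads a poor estimate can pull every learner away from the optimal Q-values.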