BAT Q-LEARNING ALGORITHM
DOI: 10.5455/jjcit.71-1480540385
Bilal H. Abed-alguni
Keywords: Q-learning, Bat algorithm, Optimization, Cooperative reinforcement learning.
17 333 156
2016-11-30 2017-02-01 2017-02-23

The cooperative Q-learning approach allows multiple learners to learn independently and then share their Q-values with one another using a Q-value sharing strategy. A main problem with this approach is that the learners' solutions may not converge to optimality, because the optimal Q-values may not be found. Another problem is that some cooperative algorithms perform very well on single-task problems but quite poorly on multi-task problems. This paper proposes a new cooperative Q-learning algorithm, the Bat Q-learning algorithm (BQ-learning), which implements a Q-value sharing strategy based on the Bat algorithm. The Bat algorithm is a powerful optimization algorithm whose parameters can be tuned to balance the exploration and exploitation of actions, which increases the likelihood of finding the optimal Q-values. The BQ-learning algorithm was tested on two problems: the shortest path problem (a single-task problem) and the taxi problem (a multi-task problem). The experimental results suggest that BQ-learning performs better than single-agent Q-learning and some well-known cooperative Q-learning algorithms.
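
To make the cooperative setting concrete, the following is a minimal sketch of two tabular Q-learners that learn independently and then share their Q-values. It is not the BQ-learning algorithm: the Bat-based sharing strategy is not specified in the text above, so the sketch swaps in a simple placeholder rule in which every learner adopts the element-wise maximum Q-value. The names `make_q_table`, `q_update` and `share_max`, the learning rate `alpha`, the discount factor `gamma`, and the toy transitions are all assumptions made for illustration.

```python
from typing import List

QTable = List[List[float]]  # indexed as Q[state][action]


def make_q_table(n_states: int, n_actions: int) -> QTable:
    """Create a Q-table initialised to zero (hypothetical helper)."""
    return [[0.0] * n_actions for _ in range(n_states)]


def q_update(Q: QTable, s: int, a: int, r: float, s_next: int,
             alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Standard one-step tabular Q-learning update, applied in place."""
    td_target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (td_target - Q[s][a])


def share_max(tables: List[QTable]) -> List[QTable]:
    """Placeholder sharing rule (an assumption, not the Bat-based strategy):
    each learner receives its own copy of the element-wise maximum."""
    n_states, n_actions = len(tables[0]), len(tables[0][0])
    best = [[max(t[s][a] for t in tables) for a in range(n_actions)]
            for s in range(n_states)]
    return [[row[:] for row in best] for _ in tables]


# Independent learning phase: each learner sees a different toy transition.
learners = [make_q_table(3, 2) for _ in range(2)]
q_update(learners[0], s=0, a=1, r=1.0, s_next=1)
q_update(learners[1], s=2, a=0, r=2.0, s_next=0)

# Sharing phase: both learners now hold what either one has learned.
learners = share_max(learners)
```

In this sketch the sharing rule is the only part a cooperative algorithm would replace; BQ-learning, as described above, substitutes a strategy based on the Bat algorithm at that step.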