Abstract—Intrinsic motivation is a promising candidate for improving the performance of reinforcement learning algorithms in complex environments. The method enhances exploration capability without explicit guidance from the designer, which makes it well suited to multi-agent reinforcement learning, where environment complexity exceeds that of standard settings. In this paper, the Random Network Distillation method is applied to implement intrinsic motivation in a multi-agent environment. Two intrinsic motivation architectures are developed and compared with a benchmark across different scenarios. The experiments show a performance increase in very complex environments but little to no improvement in non-complex ones. Although some overhead reduces sample efficiency, the centralized intrinsic motivation architecture achieves long-term optimization performance on par with or better than the benchmark, as it can explore more states. In the 2s3z environment, the centralized architecture shows a solid improvement, achieving an almost 70% win rate versus the benchmark's 43%.
Index Terms—Reinforcement learning, multi-agent learning, curiosity exploration, intrinsic reward.
The authors are with the Department of Computer Engineering, Chulalongkorn University, Bangkok 10330, Thailand (e-mail: email@example.com, Yachai.firstname.lastname@example.org)
Cite: K. Charoenpitaks and Y. Limpiyakorn, "Multi-Agent Reinforcement Learning with Clipping Intrinsic Motivation," International Journal of Machine Learning and Computing, vol. 12, no. 3, pp. 85-90, 2022. Copyright © 2022 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.