Automatic Convergence Estimation by Utilizing Fractal Dimensional Analysis for Reinforcement Learning
Abstract
This paper presents an automatic convergence estimation method for the reinforcement learning (RL) of autonomous agents. Recently, multi-agent robot systems (MARS) that use RL algorithms have been studied in real-world situations. Such a system can acquire behaviors autonomously through multi-agent reinforcement learning (MARL). However, MARL takes a long time to obtain an optimal or near-optimal solution. Furthermore, if the agents continue learning after a solution has been obtained, overfitting may result. It is difficult for the MARL operator to determine whether learning has converged, because the operator has to judge the convergence of every learning agent. Convergence of the learning curve indicates that knowledge has been obtained. However, judging the convergence of RL depends on human intuition and experience, and few studies have discussed convergence estimation methods. Therefore, in this paper, we address the development of an automatic convergence estimation method that does not require human judgement. In prior work, we proposed an automatic convergence estimation method that uses fractal dimensional analysis to evaluate the learning curves of the learning agents, and we confirmed its effectiveness in a computer simulation. In this study, we propose a method based on fractal dimensional analysis that takes implementation on robots into account, and we evaluate its effectiveness using an expanded experimental setup that includes both a computer simulation and an actual mobile robot environment.
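The abstract does not state how the fractal dimension of a learning curve is computed. Purely as an illustration of the general idea, the sketch below uses a box-counting estimate, which is an assumption and may differ from the estimator used in the paper. The function name `box_counting_dimension`, the set of scales, and the synthetic curves are all hypothetical. A curve that still fluctuates strongly should yield an estimate well above 1, while a curve that has settled to a smooth line should yield an estimate close to 1.

```python
import math
import random


def box_counting_dimension(curve, scales=(2, 4, 8, 16, 32)):
    """Box-counting estimate for a sampled curve (illustrative assumption,
    not the paper's confirmed procedure). Needs len(curve) >= max(scales)."""
    n = len(curve)
    lo, hi = min(curve), max(curve)
    span = hi - lo
    # Normalise values into [0, 1]; a constant curve maps to all zeros.
    ys = [(v - lo) / span if span > 0 else 0.0 for v in curve]

    log_inv_eps, log_count = [], []
    for k in scales:
        eps = 1.0 / k  # box side length on the unit square
        total = 0
        for col in range(k):
            # Samples whose position along the x axis falls in this column.
            segment = ys[col * n // k:(col + 1) * n // k]
            if not segment:
                continue
            top = min(int(max(segment) / eps), k - 1)
            bottom = min(int(min(segment) / eps), k - 1)
            # Count every box between the lowest and highest sample.
            total += top - bottom + 1
        log_inv_eps.append(math.log(k))
        log_count.append(math.log(total))

    # Least-squares slope of log N(eps) against log(1 / eps).
    m = len(log_inv_eps)
    mean_x = sum(log_inv_eps) / m
    mean_y = sum(log_count) / m
    num = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log_inv_eps, log_count))
    den = sum((x - mean_x) ** 2 for x in log_inv_eps)
    return num / den


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical data: a strongly fluctuating phase and a settled phase.
    fluctuating = [random.random() for _ in range(1024)]
    settled = [1.0] * 1024
    print("fluctuating:", box_counting_dimension(fluctuating))
    print("settled:", box_counting_dimension(settled))
```

Under these assumptions, a convergence test could compare the estimate over a sliding window of recent episodes against a threshold near 1. The window length and the threshold would have to be chosen per task and are not specified in the abstract.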
This work is licensed under a Creative Commons Attribution 3.0 License.