
Air Combat Strategies for an Unmanned Fighter Using Q-learning

Dongjin Lee, Hyochoong Bang, Jangseong Park

Abstract


Autonomous air combat maneuvers for an unmanned combat aircraft are accomplished using an intelligent learning approach. While interacting with an opponent, the unmanned fighter learns the opponent's action model through a reinforcement learning algorithm, Q-learning. The air combat problem is formulated in terms of combat states and the kinematic equations of motion of all vehicles. The bogey aircraft is assumed to follow pure proportional navigation guidance (PPNG) to attack the unmanned fighter. A one-step Q-learning algorithm is employed, and the state and action spaces are quantized. A reward function is constructed so that the unmanned fighter learns winning strategies during the engagement. A numerical simulation is performed to evaluate the proposed approach, and the results show that the unmanned fighter wins the engagement with the optimized action-value function.
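
To make the abstract's setup concrete, the sketch below shows a standard one-step Q-learning update over quantized state and action spaces, together with a planar pure proportional navigation command of the kind the bogey is assumed to follow. This is a minimal illustration, not the paper's implementation: the state/action dimensions, learning parameters, navigation gain, and reward are placeholder assumptions.

    import numpy as np

    # Illustrative sizes for the quantized spaces; the paper's actual
    # discretization is not given in the abstract.
    N_STATES = 1024      # hypothetical number of quantized combat states
    N_ACTIONS = 9        # hypothetical number of quantized maneuver commands
    ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1   # assumed learning parameters

    Q = np.zeros((N_STATES, N_ACTIONS))  # action-value table
    rng = np.random.default_rng(0)

    def select_action(s):
        """Epsilon-greedy choice over the quantized action set."""
        if rng.random() < EPS:
            return int(rng.integers(N_ACTIONS))
        return int(np.argmax(Q[s]))

    def q_update(s, a, r, s_next):
        """One-step Q-learning update:
        Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
        Q[s, a] += ALPHA * (r + GAMMA * np.max(Q[s_next]) - Q[s, a])

    def ppn_lateral_accel(p_pos, p_vel, t_pos, t_vel, nav_gain=3.0):
        """Planar pure proportional navigation: commanded lateral
        acceleration a = N * V_p * lambda_dot, applied normal to the
        pursuer's velocity. The gain N = 3 is a typical textbook value,
        assumed here."""
        r = t_pos - p_pos                   # line-of-sight vector
        v = t_vel - p_vel                   # relative velocity
        los_rate = (r[0] * v[1] - r[1] * v[0]) / np.dot(r, r)  # lambda_dot
        return nav_gain * np.linalg.norm(p_vel) * los_rate

In the paper's setting, s would be the quantized engagement state (relative geometry of the two aircraft), a a discrete maneuver command, and r the reward function constructed to encode winning the engagement.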

Full Text:

PDF

References


R. Isaacs, Differential Games, Wiley, New York, 1965.

W. Grimm and K. H. Well, “Modeling Air Combat as Differential Game: Recent Approach and Future Requirement,” Differential Games - Developments in Modeling and Computation, 1991.

J. Karelahti, “Modeling and On-line Solution of Air Combat Optimization Problems and Games,” Ph.D. Thesis, Helsinki University of Technology, 2007.

K. Virtanen, J. Karelahti and T. Raivio, “Modeling Air Combat by a Moving Horizon Influence Diagram Game,” Journal of Guidance, Control, and Dynamics, vol. 29, no. 5, 2006.

J. Ben-Asher, E. M. Cliff, and H. J. Kelley, “Optimal evasion against a proportionally guided pursuer,” Journal of Guidance, Control, and Dynamics, vol. 12, no. 4, pp. 598-600, 1989.

F. Imado and S. Miwa, “Fighter evasive maneuvers against proportional navigation missile,” Journal of Aircraft, vol. 23, no. 11, pp. 825-830, 1986.

S. Y. Ong and B. L. Pierson, “Optimal planar evasive aircraft maneuvers against proportional navigation missiles,” Journal of Guidance, Control, and Dynamics, vol. 19, no. 6, pp. 1210-1215, 1996.

C. Watkins, “Learning from delayed rewards,” Ph.D. Thesis, University of Cambridge, England, 1989.

M. S. Manju, “An analysis of Q-learning algorithms with strategies of reward function,” International Journal on Computer Science and Engineering, vol. 3, no. 2, pp. 814-820, February 2011.

C. D. Yang and C. C. Yang, “Optimal pure proportional navigation for maneuvering targets,” IEEE Transactions on Aerospace and Electronic Systems, vol. 33, pp. 949-957, 1997.

T. Raivio, “Capture set computation of an optimal guided missile,” Journal of Guidance, Control, and Dynamics, vol. 24, no. 6, pp. 1167-1175, 2001.

R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, The MIT Press, 1998.

D. Lee, H. Bang and K. Baek, “Autorotation of an Unmanned Helicopter by a Reinforcement Learning Algorithm,” AIAA Guidance, Navigation and Control Conference and Exhibit, Honolulu, Hawaii, 2008.

B. Jung, K. S. Kim and Y. Kim, “Guidance law for evasive aircraft maneuvers using artificial intelligence,” AIAA Guidance, Navigation, and Control Conference and Exhibit, Austin, Texas, 2003.




DOI: http://dx.doi.org/10.21535/ProICIUS.2012.v8.755
