
Learning Strategic Information Support for Controlling Traffic Flow

Sachiyo Arai, Yuta Mabuchi


We study how information services should be designed to control the behavior of a multiagent network system. This paper takes up Braess’s paradox in traffic flow, in which each driver selects a route in a shortsighted manner. The difficulty is that the optimal traffic assignment for the system as a whole does not always coincide with the minimal travel time for each driver. This paper therefore attempts to find a compromise between these two conflicting viewpoints. Most previous research has focused on the drivers’ decision-making processes to resolve the paradox. In contrast, we focus on the information-service side to make the traffic flow desirable.

First, we adopt a reinforcement learning framework to acquire a strategy for resolving the paradox. Second, we show, through experiments, criteria for distributing information that realize a desirable traffic flow.
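The conflict between system-optimal assignment and individually minimal travel time can be illustrated with the standard textbook Braess network. The sketch below is not the authors' experimental setup: the network shape, the figure of 4000 drivers, the link costs (n/100 minutes on congestible links, 45 minutes on fixed links), and the names `N_DRIVERS` and `route_times` are all assumptions chosen for illustration.

```python
# Textbook Braess network (illustrative numbers, not taken from the paper).
# Congestible links cost n/100 minutes for n drivers; fixed links cost 45 minutes.
N_DRIVERS = 4000


def route_times(n_top, n_bottom, n_shortcut):
    """Return the travel time in minutes of each route for the given loads.

    top:      Start -> A (n/100), A -> End (45)
    bottom:   Start -> B (45),    B -> End (n/100)
    shortcut: Start -> A (n/100), A -> B (0), B -> End (n/100)
    """
    load_start_a = n_top + n_shortcut   # drivers on the Start -> A link
    load_b_end = n_bottom + n_shortcut  # drivers on the B -> End link
    return {
        "top": load_start_a / 100 + 45,
        "bottom": 45 + load_b_end / 100,
        "shortcut": load_start_a / 100 + 0 + load_b_end / 100,
    }


# Without the shortcut, drivers split evenly between the two routes.
before = route_times(N_DRIVERS // 2, N_DRIVERS // 2, 0)

# With a free A -> B shortcut, shortsighted drivers all end up on it.
after = route_times(0, 0, N_DRIVERS)

print("even split, no shortcut:", before["top"], before["bottom"])
print("all on shortcut:", after["shortcut"])
print("time a driver would get by deviating:", after["top"], after["bottom"])
```

Under these assumed costs, the even split gives every driver 65 minutes. Once the shortcut exists, everyone using it is an equilibrium at 80 minutes, since deviating to either original route would take 85 minutes. No driver can improve alone, yet all are worse off than before, which is the kind of gap that an information-service strategy would aim to close.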






