TY - JOUR
T1 - Comparing a class of dynamic model-based reinforcement learning schemes for handoff prioritization in mobile communication networks
AU - El-Alfy, El Sayed M.
AU - Yao, Yu Dong
PY - 2011/7
Y1 - 2011/7
N2 - This paper presents and compares three model-based reinforcement learning schemes for admission policy with handoff prioritization in mobile communication networks. The goal is to reduce handoff failures while making efficient use of the wireless network resources. A performance measure is formed as a weighted linear function of the blocking probability of new connection requests and the handoff failure probability. Then, the problem is formulated as a semi-Markov decision process with an average cost criterion, and a simulation-based learning algorithm is developed to approximate the optimal control policy. The proposed schemes are driven by a dynamic model estimated simultaneously while learning the control policy using samples generated from direct interactions with the network. Extensive simulations are provided to assess the effectiveness of the proposed schemes under a variety of traffic conditions and to compare them with some well-known policies.
AB - This paper presents and compares three model-based reinforcement learning schemes for admission policy with handoff prioritization in mobile communication networks. The goal is to reduce handoff failures while making efficient use of the wireless network resources. A performance measure is formed as a weighted linear function of the blocking probability of new connection requests and the handoff failure probability. Then, the problem is formulated as a semi-Markov decision process with an average cost criterion, and a simulation-based learning algorithm is developed to approximate the optimal control policy. The proposed schemes are driven by a dynamic model estimated simultaneously while learning the control policy using samples generated from direct interactions with the network. Extensive simulations are provided to assess the effectiveness of the proposed schemes under a variety of traffic conditions and to compare them with some well-known policies.
KW - Cellular systems
KW - Handoff prioritization
KW - Mobile communication networks
KW - Reinforcement learning
KW - Resource management
KW - Semi-Markov decision process
UR - http://www.scopus.com/inward/record.url?scp=79952445859&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79952445859&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2011.01.082
DO - 10.1016/j.eswa.2011.01.082
M3 - Article
AN - SCOPUS:79952445859
SN - 0957-4174
VL - 38
SP - 8730
EP - 8737
JO - Expert Systems with Applications
JF - Expert Systems with Applications
IS - 7
ER -