Dynamic routing in large-scale service systems with heterogeneous servers. (English) Zbl 1094.60058

A single Poisson stream of stochastically identical customers is fed into a large scale service system of pools with heterogeneous exponential servers. Letting the arrival intensity go to infinity and taking the overall service capacity as the demand size plus a safety capacity proportional to the square root of the demand yields the so called Halfin-Whitt regime. Under this asymptotics it is shown that for large systems the policy that sends customers to the fastest servers first is nearly optimal. (This is different from the finite system situation where threshold policies are optimal.) It turns out that in the limit a state space collapse occurs. Using this, asymptotic performance measures based on diffusion approximations are computed. Especially it is shown that in the limiting regime under consideration both the quality of service (delay probability) and the system efficiency achieve high performance.


60K25 Queueing theory (aspects of probability theory)
68M20 Performance evaluation, queueing, and scheduling in the context of computer systems
90B22 Queues and service in operations research
Full Text: DOI


[1] M. Armony and N. Bambos, Queueing dynamics and maximal throughput scheduling in switched processing systems, Queueing Systems 44 (2003) 209–252. · Zbl 1035.90017
[2] M. Armony and C. Maglaras, On customer contact centers with a call-back option: Customer decisions, routing rules and system design, {Operations Research} {52}(2) (2004) 271–292. · Zbl 1165.90385
[3] M. Armony and C. Maglaras, Contact centers with a call-back option and real-time delay information, {Operations Research} {52}(4) (2004) 527–545. · Zbl 1165.90386
[4] M. Armony and A. Mandelbaum, Routing and staffing in large-scale service systems with heterogeneous servers and impatient customers, Preprint (2005). · Zbl 1217.90062
[5] R. Atar, A diffusion model of scheduling control in queueing systems with many servers, { Ann. Appl. Probab.} { 15}(1B) (2005) 820–852. · Zbl 1084.60053
[6] R. Atar, Scheduling control for queueing systems with many servers: Asymptotic optimality in heavy traffic, { Ann. Appl. Probab.}, to appear (2005). · Zbl 1098.60083
[7] R. Atar, A. Mandelbaum and M. Reiman, Scheduling a multi-class queue with many exponential servers: Asymptotic optimality in heavy traffic, { Ann. Appl. Prob} { 14}(3) (2004) 1084–1134. · Zbl 1049.60079
[8] A. Bassamboo, J.M. Harrison and A. Zeevi, Design and control of a large call center: Asymptotic analysis of an LP-based method, preprint (2004). · Zbl 1167.90528
[9] A. Bassamboo, J.M. Harrison and A. Zeevi, Dynamic routing and admission control in high-volume service systems: Asymptotic analysis via multi-scale fluid limits, preprint (2004). · Zbl 1081.60064
[10] S.L. Bell and R.J. Williams, Dynamic scheduling of a system with two parallel servers in heavy traffic with complete resource pooling: Asymptotic optimality of a continuous review threshold policy, {Annals of Applied Probability} {11} (2001) 608–649. · Zbl 1015.60080
[11] S. Bhulai and G. Koole, A queueing model for call blending in call centers, {IEEE Transactions on Automatic Control} {48} (2003) 1434–1438. · Zbl 1364.90117
[12] S. Borst, A. Mandelbaum and M. Reiman, Dimensioning large call centers, { Operations Research} {52}(1) (2004) 17–34. · Zbl 1165.90388
[13] M. Bramson, State space collapse with applications to heavy-traffic limits for multiclass queueing networks, {Queueing Systems} { 30} (1997) 89–148. · Zbl 0911.90162
[14] A. Brandt and M. Brandt, On a two-queue priority system with impatience and its application to a call center, { Methodology and Computing in Applied Probablity} {1} (1999) 191–210. · Zbl 0948.60087
[15] S. Browne and W. Whitt, Piecewise-linear diffusion processes, in: {Advances in Queueing. Theory, Methods, and Open Problems}, ed. J.H. Dshalalow, (CRC Press, 1995), Chapter 18, pp. 463–480. · Zbl 0845.60087
[16] H. Chen and D. Yao, Fundamentals of Queueing Networks: Performance, Asymptotics, and Optimization., (Springer, New-York, 2001). · Zbl 0992.60003
[17] J.G. Dai, Stability of fluid and stochastic processing networks, {MaPhySto}, 9 (1999). · Zbl 0923.76045
[18] F. de Véricourt and Y.-P. Zhou, A routing problem for call centers with customer callbacks after service failure, { Operations Research}, to appear, (2004).
[19] F. de Véricourt and Y.-P. Zhou, On the incomplete results for the multiple-server slow-server problem, Technical report, Duke University, The Fuqua School of Business (2004).
[20] A.K. Erlang, On the rational determination of the number of circuits, in: The Life and Works of. A.K. Erlang. eds. E. Brockmeyer, H.L. Halstrom, and A. Jensen, (Copenhagen: The Copenhagen Telephone Company, Copenhagen, 1948).
[21] S.N. Ethier and T.G. Kurtz, Markov Processes, Characterization and Convergence. (John Wiley & Sons, 1985).
[22] A. Federgruen and H. Groenevelt, M/G/c. systems with multiple customer classes: Characterization and achievable performance unrder nonpreemptive priority rules, Management Science {34} (1988) 1121–1138. · Zbl 0651.90033
[23] P. Fleming, A. Stolyar and B. Simon, Heavy traffic limit for a mobile phone system loss model, in: Proceedings of 2nd Int’l Conf. on Telecomm. Syst. Mod. and Analysis., Nashville, TN (1994).
[24] G.J. Foschini, On heavy traffic diffusion analysis and dynamic routing in packet switched networks, in: Computer Performance., eds. K.M. Chandy and M. Reiser (North Holland, 1977).
[25] N. Gans, G. Koole and A. Mandelbaum, Telephone call centers: Tutorial, review and research prospects, {Manufacturing & Service Operations Management} 5(2) (2003) 79–141.
[26] N. Gans and Y.-P. Zhou, A call-routing problem with service-level constraints, {Operations Research} {51}(2) (2003) 255–271. · Zbl 1163.90547
[27] O. Garnett, A. Mandelbaum and M. Reiman, Designing a call center with impatient customers, { Manufacturing & Service Operations Management} {4}(3) (2002) 208–227.
[28] K. Glazebrook and J. Ni {n}o-Mora, Parallel scheduling of multiclass M./M./m. queues: Approximate and heavy-traffic optimization of achievable performance, { Operations Research} 49(4) (2001) 609–623. · Zbl 1163.90422
[29] P.W. Glynn, Diffusion approximations, in: Stochastic Models, Handbooks in OR & MS., eds. D. Heyman and M. Sobel (North-Holland, 1990) vol. 2, pp. 145–198.
[30] I. Gurvich, Design and control of the M/M/N. queue with multi-class customers and many servers, {Masters Thesis}, Technion Institute of Technology, Israel (2004).
[31] I. Gurvich, M. Armony and A. Mandelbaum, Staffing and control of large-scale service systems with multiple customer classes and fully flexible servers, preprint, (2004).
[32] S. Halfin and W. Whitt, Heavy-traffic limits for queues with many exponential servers, {Operations Research} {29}(3) (1981) 567–588. · Zbl 0455.60079
[33] J.M. Harrison, Heavy traffic analysis of a system with parallel servers: Asymptotic analysis of discrete-review policies, { Annals of Applied Probability} {8} (1998) 822–848. · Zbl 0938.60094
[34] J.M. Harrison and A. Zeevi, Dynamic scheduling of a multiclass queue in the Halfin and Whitt heavy traffic regime, {Operations Research} {52}(2) (2004) 243-257. · Zbl 1165.90474
[35] D.L. Jagerman, Some properties of the Erlang loss function, {Bell Systems Technical Journal} {53}(3) (1974) 525–551. · Zbl 0276.60092
[36] P. Jelenkovic, A. Mandelbaum and P. Momcilović, The GI/D/N queue in the {QED} regime, {Queueing Systems} {47} (2004) 53–69. · Zbl 1048.60069
[37] O. Kella and U. Yechiali, Waiting times in the nonpreemptive priority M/M/c. queue, {Stochastic Models} {1}(2) (1985) 257–262. · Zbl 0594.60096
[38] F.P. Kelly and C.N. Laws, Dynamic routing in open queueing networks: Brownian models, cut constraints and resource pooling, {Queueing Systems} {13} (1993) 47–86. · Zbl 0772.60068
[39] W. Lin and P.R. Kumar, Optimal control of a queueing system with two heterogeneous servers, {IEEE Trans. Automat. Control} {29} (1984) 696–703. · Zbl 0546.90035
[40] R.Sh. Lipster and A.N. Shiryaev, Theory of Martingales., (Kluwer, Amsterdam, 1989).
[41] H.P. Luh and I. Viniotis, Threshold control policies for heterogeneous server systems, {Math Meth Oper Res} {55} (2002) 121–142. · Zbl 1037.90010
[42] C. Maglaras and A. Zeevi, Pricing and capacity sizing for systems with shared resources: Scaling relations and approximate solutions, {Management Science} {49}(8) (2003) 1018–1038. · Zbl 06007525
[43] A. Mandelbaum and A.L. Stolyar, Scheduling flexible servers with convex delay costs: Heavy-traffic optimality of the generalized c.{\(\mu\)}-rule, Operations Research 52(6) (2004) 836–855. · Zbl 1165.90402
[44] W.A. Massey and R.B. Wallace, An optimal design of the M/M/C/K. queue for call centers, { Queueing Systems} to appear (2004).
[45] A. Puhalskii, On the invariance principle for the first passage time, { Mathematics of Operations Research} {19}(4) (1994) 946–954. · Zbl 0837.60034
[46] A.A Puhalskii and M.I. Reiman, The multiclass GI/PH/N. queue in the Halfin-Whitt regime, { Advances in Applied Probability} {32} (2000) 564–595. · Zbl 0962.60089
[47] M.I. Reiman, Some diffusion approximations with state space collapse, in: Modelling and Performance Evaluation Methodology., eds. F. Baccelli and G. Fayolle (Springer-Verlag, 1984) pp. 209–240.
[48] V.V. Rykov, Monotone control of queueing systems with heterogeneous servers, {Queueing Systems} {37} (2001) 391–403. · Zbl 1017.90026
[49] V.V. Rykov and D. Efrosinin, Optimal control of queueing systems with heterogeneous servers, { Queueing Systems} {46}, (2004) 389–407. · Zbl 1061.90035
[50] A.A. Scheller-Wolf, Necessary and sufficient conditions for delay moments in FIFO multiserver queues with an application comparing slow servers with one fast one, { Operations Research} {51} (2003) 748–758. · Zbl 1165.90406
[51] J.G. Shanthikumar and D.D. Yao, Comparing ordered-entry queues with heterogeneous servers, { Queueing Systems} {2} (1987) 235–244. · Zbl 0666.60099
[52] R.A. Shumsky, Approximation and analysis of a call center with flexible and Specialized servers, {OR Spectrum} {26}(3) (2004) 307–330. · Zbl 1109.90025
[53] S. Stolyar, Optimal routing in output-queues flexible server systems, {Probability in the Engineering and Informational Sciences} 19 (2005) 141–189. · Zbl 1071.60090
[54] D.Y. Sze, A queueing model for telephone operator staffing, {Operations Research} {32} (1984) 229–249.
[55] Y.-Ch. Teh and A.R. Ward, Critical thresholds for dynamic routing in queueing networks, {Queueing Systems} {42} (2002) 297–316. · Zbl 1011.60075
[56] R.B. Wallace and W. Whitt, A Staffing Algorithm for Call Centers with Skill-Based Routing, working paper (2004).
[57] W. Whitt, Heavy traffic approximations for service systems with blocking, { AT & T Bell Lab. Tech. Journal} {63} (1984) 689–708. · Zbl 0591.90034
[58] W. Whitt, {Stochastic-process limits: An introduction to stochastic-process limits and their application to queues}, Springer (2002). · Zbl 0993.60001
[59] W. Whitt, A diffusion approximation for the G/GI/n/m queue, {Operations Research} {52}(6) (2004) 922–941. · Zbl 1165.90413
[60] W. Whitt, Heavy-traffic limits for the G./H.2 */n./m. queue, {Mathematics of Operations Research} {30}(1) (2005) 1–27. · Zbl 1082.90019
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.