##
**Join-the-shortest queue diffusion limit in Halfin-Whitt regime: tail asymptotics and scaling of extrema.**
*(English)*
Zbl 1467.60069

Summary: Consider a system of \(N\) parallel single-server queues with unit-ex ponential service time distribution and a single dispatcher where tasks arrive as a Poisson process of rate \(\lambda(N)\). When a task arrives, the dispatcher assigns it to one of the servers according to the Join-the-Shortest Queue (JSQ) policy. P. Eschenfeldt and D. Gamarnik [Math. Oper. Res. 43, No. 3, 867–886 (2018; Zbl 1433.60087)] established that in the Halfin-Whitt regime where \((N-\lambda(N))/\sqrt{N}\to\beta>0\) as \(N\to\infty\), appropriately scaled occupancy measure of the system under the JSQ policy converges weakly on any finite time interval to a certain diffusion process as \(N\to\infty\). Recently, it was further established by A. Braverman [Math. Oper. Res. 45, No. 3, 1069–1103 (2020; Zbl 1455.90035)] that the convergence result extends to the steady state as well, that is, stationary occupancy measure of the system converges weakly to the steady state of the diffusion process as \(N\to\infty\), proving the interchange of limits result.

In this paper, we perform a detailed analysis of the steady state of the above diffusion process. Specifically, we establish precise tail-asymptotics of the stationary distribution and scaling of extrema of the process on large time interval. Our results imply that the asymptotic steady-state scaled number of servers with queue length two or larger exhibits an exponential tail, whereas that for the number of idle servers turns out to be Gaussian. From the methodological point of view, the diffusion process under consideration goes beyond the state-of-the-art techniques in the study of the steady state of diffusion processes. Lack of any closed-form expression for the steady state and intricate interdependency of the process dynamics on its local times make the analysis significantly challenging. We develop a technique involving the theory of regenerative processes that provides a tractable form for the stationary measure, and in conjunction with several sharp hitting time estimates, acts as a key vehicle in establishing the results. The technique and the intermediate results might be of independent interest, and can possibly be used in understanding the bulk behavior of the process.

In this paper, we perform a detailed analysis of the steady state of the above diffusion process. Specifically, we establish precise tail-asymptotics of the stationary distribution and scaling of extrema of the process on large time interval. Our results imply that the asymptotic steady-state scaled number of servers with queue length two or larger exhibits an exponential tail, whereas that for the number of idle servers turns out to be Gaussian. From the methodological point of view, the diffusion process under consideration goes beyond the state-of-the-art techniques in the study of the steady state of diffusion processes. Lack of any closed-form expression for the steady state and intricate interdependency of the process dynamics on its local times make the analysis significantly challenging. We develop a technique involving the theory of regenerative processes that provides a tractable form for the stationary measure, and in conjunction with several sharp hitting time estimates, acts as a key vehicle in establishing the results. The technique and the intermediate results might be of independent interest, and can possibly be used in understanding the bulk behavior of the process.

### MSC:

60K25 | Queueing theory (aspects of probability theory) |

60J60 | Diffusion processes |

60K05 | Renewal theory |

60H20 | Stochastic integral equations |

### Keywords:

join the shortest queue; diffusion limit; steady state analysis; local time; nonelliptic diffusion; Halfin-Whitt regime; regenerative processes
PDF
BibTeX
XML
Cite

\textit{S. Banerjee} and \textit{D. Mukherjee}, Ann. Appl. Probab. 29, No. 2, 1262--1309 (2019; Zbl 1467.60069)

### References:

[1] | Atar, R., Budhiraja, A. and Dupuis, P. (2001). On positive recurrence of constrained diffusion processes. Ann. Probab.29 979–1000. · Zbl 1018.60081 |

[2] | Banerjee, S., Burdzy, K. and Duarte, M. (2015). Gravitation versus Brownian motion. Ann. Inst. Henri Poincaré B, Probab. Stat. To appear. |

[3] | Braverman, A. (2018). Steady-state analysis of the join the shortest queue model in the Halfin–Whitt regime. Available at arXiv:1801.05121. |

[4] | Budhiraja, A. and Lee, C. (2007). Long time asymptotics for constrained diffusions in polyhedral domains. Stochastic Process. Appl.117 1014–1036. · Zbl 1119.60066 |

[5] | Dupuis, P. and Williams, R. J. (1994). Lyapunov functions for semimartingale reflecting Brownian motions. Ann. Probab.22 680–702. · Zbl 0808.60068 |

[6] | Eschenfeldt, P. and Gamarnik, D. (2018). Join the shortest queue with many servers. The heavy traffic asymptotics. Math. Oper. Res.43 867–886. |

[7] | Fralix, B., Knessl, C. and van Leeuwaarden, J. S. H. (2014). First passage times to congested states of many-server systems in the Halfin–Whitt regime. Stoch. Models30 162–186. · Zbl 1303.60086 |

[8] | Gamarnik, D. and Goldberg, D. A. (2013). On the rate of convergence to stationarity of the \(\rm M/M/N\) queue in the Halfin–Whitt regime. Ann. Appl. Probab.23 1879–1912. · Zbl 1287.60111 |

[9] | Gamarnik, D. and Goldberg, D. A. (2013). Steady-state \(GI/GI/n\) queue in the Halfin–Whitt regime. Ann. Appl. Probab.23 2382–2419. · Zbl 1285.60090 |

[10] | Hairer, M. and Mattingly, J. C. (2009). Slow energy dissipation in anharmonic oscillator chains. Comm. Pure Appl. Math.62 999–1032. · Zbl 1169.82012 |

[11] | Halfin, S. and Whitt, W. (1981). Heavy-traffic limits for queues with many exponential servers. Oper. Res.29 567–588. · Zbl 0455.60079 |

[12] | Karatzas, I. and Shreve, S. E. (2012). Brownian Motion and Stochastic Calculus. Graduate Texts in Mathematics113. Springer, New York. · Zbl 0734.60060 |

[13] | Massart, P. (2007). Concentration Inequalities and Model Selection. Lecture Notes in Math.1896. Springer, Berlin. · Zbl 1170.60006 |

[14] | Mukherjee, D., Borst, S. C., van Leeuwaarden, J. S. H. and Whiting, P. A. (2016). Universality of load balancing schemes on the diffusion scale. J. Appl. Probab.53 1111–1124. · Zbl 1356.60149 |

[15] | Mukherjee, D., Borst, S. C., Van Leeuwaarden, J. S. H. and Whiting, P. A. (2016). Universality of power-of-\(d\) load balancing in many-server systems. Stoch. Syst. To appear. · Zbl 1356.60149 |

[16] | Rogers, L. C. G. and Williams, D. (2000). Diffusions, Markov Processes, and Martingales. Vol. 2: Itô Calculus. Cambridge Mathematical Library. Cambridge Univ. Press, Cambridge. Reprint of the second (1994) edition. |

[17] | Ross, S. M. (2010). Introduction to Probability Models, 10th ed. Academic Press, San Diego. · Zbl 1184.60002 |

[18] | Thorisson, H. (2000). Coupling, Stationarity, and Regeneration. Probability and Its Applications (New York). Springer, New York. · Zbl 0949.60007 |

[19] | Van Leeuwaarden, J. S. H. (2018). Asymptotically optimal load balancing topologies. Proc. ACM Meas. Anal. Comput. Syst.2 1–29. DOI:10.1145/3179417. |

[20] | Van Leeuwaarden, J. S. H. and Knessl, C. (2011). Transient behavior of the Halfin–Whitt diffusion. Stochastic Process. Appl.121 1524–1545. · Zbl 1235.60133 |

[21] | Van Leeuwaarden, J. S. H. and Knessl, C. (2012). Spectral gap of the Erlang A model in the Halfin–Whitt regime. Stoch. Syst.2 149–207. · Zbl 1296.60256 |

[22] | Van Leeuwaarden, J. S. H., Mathijsen, B. W. J. and Zwart, B. (2017). Economies-of-scale in resource sharing systems: Tutorial and partial review of the QED heavy-traffic regime. Available at arXiv:1706.05397. |

[23] | Van der Boor, M., Borst, S. C., Van Leeuwaarden, J. S. H. and Mukherjee, D. (2018). Scalable load balancing in networked systems: Universality properties and stochastic coupling methods. In Proc. |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.