Extremal dependence analysis of network sessions. (English) Zbl 1329.62234

Summary: We refine a stimulating study by S. Sarvotham et al. [“Network and user driven alpha-beta on-off source model for network traffic”, Comput. Networks 48, No. 3, 335–350 (2005; doi:10.1016/j.comnet.2004.11.024)] which highlighted the influence of peak transmission rate on network burstiness. From TCP packet headers, we amalgamate packets into sessions, where each session is characterized by a 5-tuple \((S,D,R,R^{\vee},\Gamma)\) = (total payload, duration, average transmission rate, peak transmission rate, initiation time). Careful consideration shows that a new definition of peak rate is required. Unlike Sarvotham et al. [loc. cit.], who segmented sessions into two groups labelled alpha and beta, we segment sessions into 10 groups according to the empirical quantiles of the peak rate variable, as a demonstration that the beta group is far from homogeneous. This finer segmentation reveals additional structure that is missed by segmentation into two groups. In each segment, we study the dependence structure of \((S,D,R)\) and find that it varies across the groups. Furthermore, within each segment, session initiation times are well approximated by a Poisson process, whereas this property does not hold for the data set taken as a whole. We therefore conclude that the peak rate level is important for understanding structure and for constructing accurate simulations of data in the wild. We outline a simple method of simulating network traffic based on our findings.
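The segmentation step described in the summary can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `segment_by_peak_rate` and `interarrival_cv`, the synthetic Pareto peak rates, and the synthetic exponential inter-arrival times are all assumptions made for illustration. The coefficient of variation of the inter-arrival gaps is used here only as a crude stand-in for a Poisson check (it is close to 1 for a homogeneous Poisson process); the paper's own diagnostics may differ.

```python
import numpy as np


def segment_by_peak_rate(peak_rate, n_groups=10):
    """Label each session 0..n_groups-1 by empirical quantiles of its peak rate."""
    peak_rate = np.asarray(peak_rate, dtype=float)
    # Interior cut points: the 10%, 20%, ..., 90% quantiles when n_groups == 10.
    cuts = np.quantile(peak_rate, np.linspace(0.0, 1.0, n_groups + 1)[1:-1])
    # searchsorted counts how many cut points lie at or below each value.
    return np.searchsorted(cuts, peak_rate, side="right")


def interarrival_cv(start_times):
    """Coefficient of variation of the gaps between sorted initiation times."""
    gaps = np.diff(np.sort(np.asarray(start_times, dtype=float)))
    return gaps.std(ddof=1) / gaps.mean()


# Synthetic stand-in data (hypothetical, not the traces used in the paper).
rng = np.random.default_rng(0)
n_sessions = 5000
peak = rng.pareto(1.5, size=n_sessions) + 1.0               # heavy-tailed peak rates
start = np.cumsum(rng.exponential(1.0, size=n_sessions))    # Poisson initiation times

groups = segment_by_peak_rate(peak)
for g in range(10):
    in_group = groups == g
    print(g, int(in_group.sum()), round(float(interarrival_cv(start[in_group])), 3))
```

On real session data one would replace `peak` and `start` with the measured \(R^{\vee}\) and \(\Gamma\) values and examine \((S,D,R)\) separately within each label returned by `segment_by_peak_rate`.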


62G32 Statistics of extreme values; tail inference
62P30 Applications of statistics in engineering and industry; control charts


Software: QRM; ismev


[1] Arlitt, M., Williamson, C.: Web server workload characterization: the search for invariants. Master’s thesis, University of Saskatchewan (1996)
[2] Athreya, K.B.: Bootstrap of the mean in the infinite variance case. Ann. Stat. 15(2), 724–731 (1987) · Zbl 0628.62042 · doi:10.1214/aos/1176350371
[3] Balkema, A.A., de Haan, L.: Residual life time at great age. Ann. Probab. 2(5), 792–804 (1974) · Zbl 0295.60014 · doi:10.1214/aop/1176996548
[4] Beirlant, J., Goegebeur, Y., Teugels, J., Segers, J.: Statistics of extremes. In: Wiley Series in Probability and Statistics, Theory and Applications, With contributions from Daniel De Waal and Chris Ferro. Wiley, Chichester (2004) · Zbl 1070.62036
[5] Brockwell, P., Davis, R.: Time Series: Theory and Methods, 2nd edn. Springer, New York (1991) · Zbl 0709.62080
[6] Coles, S.: An introduction to statistical modeling of extreme values. In: Springer Series in Statistics, xiv, 210 p. Springer, London (2001) · Zbl 0980.62043
[7] Crovella, M., Bestavros, A.: Self-similarity in world wide web traffic: evidence and possible causes. IEEE/ACM Trans. Netw. 5(6), 835–846 (1997) · doi:10.1109/90.650143
[8] Csörgő, S., Deheuvels, P., Mason, D.: Kernel estimates for the tail index of a distribution. Ann. Stat. 13(3), 1050–1077 (1985) · Zbl 0588.62051 · doi:10.1214/aos/1176349656
[9] Das, B., Resnick, S.: Conditioning on an extreme component: model consistency and regular variation on cones. Tech. rep., Cornell University, School of ORIE. http://arxiv.org/abs/0805.4373 (2009a) · Zbl 1284.60103
[10] Das, B., Resnick, S.: Detecting a conditional extreme value model. Tech. rep., Cornell University, School of ORIE. http://arxiv.org/abs/0902.2996 (2009b)
[11] D’Auria, B., Resnick, S.: Data network models of burstiness. Adv. Appl. Probab. 38(2), 373–404 (2006) · Zbl 1103.90029 · doi:10.1239/aap/1151337076
[12] D’Auria, B., Resnick, S.: The influence of dependence on data network models. Adv. Appl. Probab. 40(1), 60–94 (2008) · Zbl 1157.90349 · doi:10.1239/aap/1208358887
[13] Davis, R., Resnick, S.: Tail estimates motivated by extreme value theory. Ann. Stat. 12(4), 1467–1487 (1984) · Zbl 0555.62035 · doi:10.1214/aos/1176346804
[14] Davison, A.C., Smith, R.L.: Models for exceedances over high thresholds (with discussion). J. R. Stat. Soc. B. 52(3), 393–442 (1990) · Zbl 0706.62039
[15] Deheuvels, P., Mason, D., Shorack, G.: Some results on the influence of extremes on the bootstrap. Ann. Inst. Henri Poincaré Probab. Stat. 29(1), 83–103 (1993) · Zbl 0774.62042
[16] Dekkers, A., de Haan, L.: On the estimation of the extreme-value index and large quantile estimation. Ann. Stat. 17(4), 1795–1832 (1989) · Zbl 0699.62028 · doi:10.1214/aos/1176347396
[17] Dietrich, D., de Haan, L., Hüsler, J.: Testing extreme value conditions. Extremes 5(1), 71–85 (2002) · Zbl 1035.60050 · doi:10.1023/A:1020934126695
[18] Drees, H., de Haan, L., Li, D.: Approximations to the tail empirical distribution function with application to testing extreme value conditions. J. Stat. Plan. Inference 136(10), 3498–3538 (2006) · Zbl 1093.62052 · doi:10.1016/j.jspi.2005.02.017
[19] Embrechts, P., Klüppelberg, C., Mikosch, T.: Modelling Extreme Events for Insurance and Finance. Springer, Berlin (1997) · Zbl 0873.62116
[20] Geluk, J., de Haan, L., Resnick, S., Stărică, C.: Second-order regular variation, convolution and the central limit theorem. Stoch. Process. Their Appl. 69(2), 139–159 (1997) · Zbl 0913.60001 · doi:10.1016/S0304-4149(97)00042-2
[21] Giné, E., Zinn, J.: Necessary conditions for the bootstrap of the mean. Ann. Stat. 17(2), 684–691 (1989) · Zbl 0672.62026 · doi:10.1214/aos/1176347134
[22] Guerin, C., Nyberg, H., Perrin, O., Resnick, S., Rootzén, H., Stărică, C.: Empirical testing of the infinite source Poisson data traffic model. Stoch. Models 19(2), 151–200 (2003) · Zbl 1048.62080 · doi:10.1081/STM-120020386
[23] de Haan, L., Ferreira, A.: Extreme Value Theory: An Introduction. Springer, New York (2006) · Zbl 1101.62002
[24] de Haan, L., Peng, L.: Comparison of tail index estimators. Stat. Neerl. 52(1), 60–70 (1998) · Zbl 0937.62050 · doi:10.1111/1467-9574.00068
[25] de Haan, L., Resnick, S.: Limit theory for multivariate sample extremes. Z. Wahrscheinlichkeitstheor. Verw. Geb. 40(4), 317–337 (1977) · Zbl 0375.60031 · doi:10.1007/BF00533086
[26] de Haan, L., Resnick, S.: Estimating the limit distribution of multivariate extremes. Stoch. Models 9(2), 275–309 (1993) · Zbl 0777.62036 · doi:10.1080/15326349308807267
[27] de Haan, L., Resnick, S.: On asymptotic normality of the Hill estimator. Stoch. Models 14(4), 849–867 (1998) · Zbl 1002.60519 · doi:10.1080/15326349808807504
[28] Hall, P.: On some simple estimates of an exponent of regular variation. J. R. Stat. Soc. B 44(1), 37–42 (1982) · Zbl 0521.62024
[29] Hall, P.: Asymptotic properties of the bootstrap for heavy-tailed distributions. Ann. Probab. 18(3), 1342–1360 (1990) · Zbl 0714.62035 · doi:10.1214/aop/1176990748
[30] Heffernan, J., Resnick, S.: Limit laws for random vectors with an extreme component. Ann. Appl. Probab. 17(2), 537–571 (2007) · Zbl 1125.60049 · doi:10.1214/105051606000000835
[31] Heffernan, J., Tawn, J.: A conditional approach for multivariate extreme values (with discussion). J. R. Stat. Soc. B 66(3), 497–546 (2004) · Zbl 1046.62051 · doi:10.1111/j.1467-9868.2004.02050.x
[32] Hill, B.: A simple general approach to inference about the tail of a distribution. Ann. Stat. 3(5), 1163–1174 (1975) · Zbl 0323.62033 · doi:10.1214/aos/1176343247
[33] Hohn, N., Veitch, D., Abry, P.: Cluster processes: a natural language for network traffic. IEEE Trans. Signal Process. 51(8), 2229–2244 (2003) · doi:10.1109/TSP.2003.814460
[34] Huang, X.: Statistics of bivariate extreme values. Ph.D. thesis, Tinbergen Institute Research Series 22, Erasmus University Rotterdam, Postbus 1735, 3000DR, Rotterdam, The Netherlands (1992)
[35] Hüsler, J., Li, D.: On testing extreme value conditions. Extremes 9(1), 69–86 (2006) · Zbl 1164.62352 · doi:10.1007/s10687-006-0025-8
[36] Keshav, S.: An Engineering Approach to Computer Networking; ATM Networks, the Internet, and the Telephone Network. Addison-Wesley, Reading (1997)
[37] Lehmann, E., Romano, J.: Testing statistical hypotheses, 3rd edn. In: Springer Texts in Statistics. Springer, New York (2005) · Zbl 1076.62018
[38] Leland, W., Taqqu, M., Willinger, W., Wilson, D.: On the self-similar nature of Ethernet traffic (extended version). IEEE/ACM Trans. Netw. 2(1), 1–15 (1994) · doi:10.1109/90.282603
[39] Mason, D., Turova, T.: Weak convergence of the Hill estimator process. In: Galambos, J., Lechner, J., Simiu, E. (eds.) Extreme Value Theory and Applications, pp. 419–432. Kluwer Academic, Dordrecht (1994)
[40] Maulik, K., Resnick, S., Rootzén, H.: Asymptotic independence and a network traffic model. J. Appl. Probab. 39(4), 671–699 (2002) · Zbl 1090.90017 · doi:10.1239/jap/1037816012
[41] McNeil, A., Frey, R., Embrechts, P.: Quantitative risk management. In: Princeton Series in Finance, Concepts, Techniques and Tools. Princeton University Press, Princeton (2005) · Zbl 1089.91037
[42] Park, C., Shen, H., Marron, J.S., Hernandez-Campos, F., Veitch, D.: Capturing the elusive poissonity in web traffic. In: MASCOTS ’06: Proceedings of the 14th IEEE International Symposium on Modeling, Analysis, and Simulation, pp. 189–196. IEEE Computer Society, Washington, DC (2006). doi: 10.1109/MASCOTS.2006.17
[43] Paxson, V., Floyd, S.: Wide-area traffic: the failure of Poisson modeling. IEEE/ACM Trans. Netw. 3(3), 226–244 (1995) · doi:10.1109/90.392383
[44] Peng, L.: Second order condition and extreme value theory. Ph.D. thesis, Tinbergen Institute, Erasmus University, Rotterdam (1998)
[45] Pickands, J.: Statistical inference using extreme order statistics. Ann. Stat. 3, 119–131 (1975) · Zbl 0312.62038 · doi:10.1214/aos/1176343003
[46] Reiss, R.D., Thomas, M.: Statistical Analysis of Extreme Values, 3rd edn. Birkhäuser Verlag, Basel (2007) · Zbl 1122.62036
[47] Resnick, S.: Tail equivalence and its applications. J. Appl. Probab. 8, 136–156 (1971) · Zbl 0217.49903 · doi:10.2307/3211844
[48] Resnick, S.: Modeling data networks. In: Finkenstädt, B., Rootzén, H. (eds.) SemStat: Séminaire Européen de Statistique, Extreme Values in Finance, Telecommunications, and the Environment, pp. 287–372. Chapman-Hall, London (2003)
[49] Resnick, S.: Heavy tail phenomena: probabilistic and statistical modeling. In: Springer Series in Operations Research and Financial Engineering. Springer, New York. ISBN: 0-387-24272-4 (2007) · Zbl 1152.62029
[50] Resnick, S.: Extreme Values, Regular Variation and Point Processes. Reprint of the 1987 Original. Springer, New York (2008) · Zbl 1136.60004
[51] Sarvotham, S., Riedi, R., Baraniuk, R.: Network and user driven alpha-beta on-off source model for network traffic. Special issue on Long-range Dependent Traffic. Comput. Networks 48(3), 335–350 (2005) · doi:10.1016/j.comnet.2004.11.024
[52] Willinger, W., Paxson, V.: Where mathematics meets the Internet. Not. Am. Math. Soc. 45(8), 961–970 (1998) · Zbl 0973.00523
[53] Willinger, W., Taqqu, M., Leland, W., Wilson, D.: Self-similarity in high-speed packet traffic: analysis and modelling of Ethernet traffic measurements. Stat. Sci. 10, 67–85 (1995) · Zbl 1148.90310 · doi:10.1214/ss/1177010131
[54] Willinger, W., Paxson, V., Taqqu, M.: Self-similarity and heavy tails: structural modeling of network traffic. In: Adler, R., Feldman, R., Taqqu, M. (eds.) A Practical Guide to Heavy Tails. Statistical Techniques and Applications, pp. 27–53. Birkhäuser Boston, Boston (1998) · Zbl 0926.90014
[55] Willinger, W., Taqqu, M., Sherman, R., Wilson, D.: Self-similarity through high variability: statistical analysis of Ethernet LAN traffic at the source level. IEEE/ACM Trans. Netw. 5(1), 71–86 (1997) · doi:10.1109/90.554723
[56] Zhang, Y., Breslau, L., Paxson, V., Shenker, S.: On the characteristics and origins of internet flow rates. In: ACM SIGCOMM 2002 Conference, Pittsburgh, 19–23 August 2002