##
**Detection and localization of change-points in high-dimensional network traffic data.**
*(English)*
Zbl 1166.62094

Summary: We propose a novel and efficient method, that we call TopRank for detecting change-points in high-dimensional data. This issue is of growing concern to the network security community since network anomalies such as Denial of Service (DoS) attacks lead to changes in Internet traffic. Our method consists of a data reduction stage based on record filtering, followed by a nonparametric change-point detection test based on \(U\)-statistics. Using this approach, we can address massive data streams and perform anomaly detection and localization on the fly.

We show how it applies to some real Internet traffic provided by France-Télécom (a French Internet service provider) in the framework of the ANR-RNRT OSCAR project. This approach is very attractive since it benefits from a low computational load and is able to detect and localize several types of network anomalies. We also assess the performance of the TopRank algorithm using synthetic data and compare it with alternative approaches based on random aggregation.

We show how it applies to some real Internet traffic provided by France-Télécom (a French Internet service provider) in the framework of the ANR-RNRT OSCAR project. This approach is very attractive since it benefits from a low computational load and is able to detect and localize several types of network anomalies. We also assess the performance of the TopRank algorithm using synthetic data and compare it with alternative approaches based on random aggregation.

### MSC:

62P99 | Applications of statistics |

90B18 | Communication networks in operations research |

62G10 | Nonparametric hypothesis testing |

PDF
BibTeX
XML
Cite

\textit{C. Lévy-Leduc} and \textit{F. Roueff}, Ann. Appl. Stat. 3, No. 2, 637--662 (2009; Zbl 1166.62094)

### References:

[1] | Abry, P., Borgnat, P. and Dewaele, G. (2007). Sketch based anomaly detection, identification and performance evaluation. In Proceedings of the 2007 International Symposium on Applications and the Internet Workshops 80 . IEEE Computer Society, Washington, DC. |

[2] | Basseville, M. and Nikiforov, I. V. (1993). Detection of Abrupt Changes: Theory and Applications . Prentice-Hall, Englewood Cliffs, NJ. |

[3] | Bickel, P. J. and Doksum, K. A. (1976). Mathematical Statistics . Holden Day, San Francisco, CA. · Zbl 0403.62001 |

[4] | Brodsky, B. E. and Darkhovsky, B. S. (1993). Nonparametric Methods in Change-Point Problems . Kluwer Academic, Dordrecht. · Zbl 1274.62512 |

[5] | Csörgo, M. and Horvath, L. (1997). Limit Theorems in Change-Point Analysis . Wiley, New York. · Zbl 0884.62023 |

[6] | Gehan, E. (1965). A generalized Wilcoxon test for comparing arbitrarily single censored samples. Biometrika 52 203-223. · Zbl 0133.41901 |

[7] | Gombay, E. and Liu, S. (2000). A nonparametric test for change in randomly censored data. Can. J. Statist. 28 113-121. · Zbl 0963.62042 |

[8] | Krishnamurthy, B., Subhabrata, S., Zhang, Y. and Chen, Y. (2003). Sketch-based change detection: Methods, evaluation and applications. In Proceedings of the 3rd ACM SIGCOMM Conference on Internet Measurement 234-247. ACM, New York. |

[9] | Lakhina, A., Crovella, M. and Diot, C. (2004). Diagnosing network-wide traffic anomalies. In Proceedings of the 2004 Conference of Applications, Technologies, Architectures, and Protocols for Computer Comunnications 219-230. ACM, New York. |

[10] | Lévy-Leduc, C., Benmammar, B. and Roueff, F. (2008). Toprank algorithm. Registered at the “Agence pour la Protection des Programmes” (http://app.legalis.net/). IDDN.FR.001.100004.000.S.P.2008.000.20700. |

[11] | Li, X., Bian, F., Crovella, M., Diot, C., Govindan, R., Iannaccone, G. and Lakhina, A. (2006). Detection and identification of network anomalies using sketch subspaces. In Proceedings of SIGCOMM 147-152. ACM, New York. |

[12] | Liu, S. (1998). Nonparametric tests for change-point problems with random censorship. Ph.D. thesis, Dept. of Mathematical Sciences, Univ. Alberta. |

[13] | Mantel, N. (1967). Ranking procedures for arbitrarily restricted observations. Biometrics 23 65-78. |

[14] | Midodzi, W. K. (2001). Nonparametric sequential detection in the distribution of randomly censored data. Ph.D. thesis, Dept. of Mathematical Sciences, Univ. Alberta. |

[15] | Page, E. S. (1954). Continuous inspection schemes. Biometrika 41 100-115. · Zbl 0056.38002 |

[16] | Paxson, V. (1999). Bro: A system for detecting network intruders in real-time. Computer Network 31 2435-2463. |

[17] | Roesch, M. (1999). Snort: Lightweight intrusion detection for networks. In Proceedings of LISA’99 229-238. USENIX Association, Berkeley, CA. |

[18] | Salem, O., Vaton, S. and Gravey, A. (2007). A novel approach for anomaly detection over high-speed networks. In Proceedings of EC2ND . |

[19] | Siris, A. and Papagalou, F. (2004). Application of anomaly detection algorithms for detecting SYN flooding attacks. Computer Communications 29 1433-1442. |

[20] | Tartakovsky, A., Rozovskii, B., Blazek, R. and Kim, H. (2006a). Detection of intrusion in information systems by sequential change-point methods. Statist. Methodol. 3 252-340. · Zbl 1373.68144 |

[21] | Tartakovsky, A., Rozovskii, B., Blazek, R. and Kim, H. (2006b). A novel approach to detection of intrusions in computer networks via adaptive sequential and batch-sequential change-point detection methods. IEEE Transactions on Signal Processing 54 3372-3382. · Zbl 1373.68144 |

[22] | Thorup, M. and Zhang, Y. (2004). Tabulation based 4-universal hashing with applications to second moment estimation. In Proceedings of the Fifteenth ACM-SIAM Symposium on Discrete Algorithms 615-624. ACM, New York. · Zbl 1318.68198 |

[23] | Wang, H., Zhang, D. and Shin, G. (2002). Detecting SYN flooding attacks. In Proceedings of INFOCOM 3 1530-1539. |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.