zbMATH — the first resource for mathematics

The complexity of computation and approximation of the \(t\)-ratio over one-dimensional interval data. (English) Zbl 06984073
Summary: The main question is how to compute the upper and lower limits of the range of possible values of a given statistic, when the data range over given intervals. Initially some well-known statistics, such as sample mean, sample variance or \(F\)-ratio, are considered in order to illustrate that in some cases the limits can be computed efficiently, while in some cases their computation is NP-hard. Subsequently, the \(t\)-ratio (variation coefficient) is considered. It is investigated when the limits for \(t\)-ratio are computable in polynomial time and a new efficient algorithm is designed for this case. Conversely, complementary NP-hardness results are proved, demonstrating the cases when the computation cannot be done efficiently. Subsequently, the NP-hardness results are strengthened: it is shown that under certain assumptions, even an approximate evaluation with an arbitrary absolute error is NP-hard. Finally, it is shown that the situation can also be (in some sense) regarded positively: a new pseudopolynomial algorithm is developed. The algorithm is of practical importance, especially when the dataset to be processed is large and does not contain “excessively” large numbers.

62-XX Statistics
Full Text: DOI
[1] Alefeld, G.; Herzberger, J., Introduction to interval computations, (Computer Science and Applied Mathematics, (1983), Academic Press New York)
[2] Billard, L.; Diday, E., Symbolic data analysis: conceptual statistics and data mining, (2006), Wiley Hoboken, NJ · Zbl 1117.62002
[3] Černý, M.; Antoch, J.; Hladík, M., On the possibilistic approach to linear regression models involving uncertain, indeterminate or interval data, Inform. Sci., 244, 26-47, (2013) · Zbl 1357.62237
[4] Corani, G.; Antonucci, A., Credal ensembles of classifiers, Comput. Statist. Data Anal., 71, 818-831, (2014)
[5] Ferson, S.; Ginzburg, L.; Kreinovich, V.; Longpré, L.; Aviles, M., Exact bounds on finite populations of interval data, Reliab. Comput., 11, 3, 207-233, (2005) · Zbl 1076.65013
[6] Gladysz, B.; Kasperski, A., Computing mean absolute deviation under uncertainty, Appl. Soft Comput., 10, 2, 361-366, (2010)
[7] He, L. T.; Hu, C., Impacts of interval computing on stock market variability forecasting, Comput. Econ., 33, 3, 263-276, (2009)
[8] Horowitz, J. L.; Manski, C. F.; Ponomareva, C. F.; Stoye, J., Computation of bounds on population parameters when the data are incomplete, Reliab. Comput., 9, 6, 419-440, (2003) · Zbl 1140.62314
[9] Kreinovich, V., Why intervals? A simple limit theorem that is similar to limit theorems from statistics, Reliab. Comput., 1, 1, 33-40, (1995) · Zbl 0831.65053
[10] Kreinovich, V.; Longpré, L.; Patangay, P.; Ferson, S.; Ginzburg, L., Outlier detection under interval uncertainty: algorithmic solvability and computational complexity, Reliab. Comput., 11, 1, 59-76, (2005) · Zbl 1076.65014
[11] Kreinovich, V.; Xiang, G., Fast algorithms for computing statistics under interval uncertainty: an overview, (Huynh, V.-N.; etal., Interval/Probabilistic Uncertainty and Non-classical Logics, (2008), Springer Berlin), 19-31 · Zbl 1139.62065
[12] Kumkov, S.I., Interval approach to robust bounded-error estimation of corrupted experimental data. In: IFAC Proceedings Volumes (IFAC-PapersOnline) 16 (PART 1) (2012) 1091-1096, 16th IFAC Symposium on System Identification, Brussels.
[13] Marupally, P. R.; Paruchuri, V.; Hu, C., Bandwidth variability prediction with rolling interval least squares (RILS), (Proceedings of the 50th Annual Southeast Regional Conference, (2012), ACM New York), 209-213
[14] Matyiasevich, Y., Hilbert’s tenth problem, (1993), MIT Press
[15] Montes, I.; Miranda, E.; Montes, S., Stochastic dominance with imprecise information, Comput. Statist. Data Anal., 71, 868-886, (2014)
[16] Moore, R. E.; Kearfott, R. B.; Cloud, M. J., Introduction to interval analysis, (2009), SIAM Philadelphia, PA · Zbl 1168.65002
[17] Neumaier, A., Interval methods for systems of equations, (1990), Cambridge University Press Cambridge · Zbl 0706.15009
[18] Nguyen, H. T.; Kreinovich, V.; Wu, B.; Xiang, G., (Computing Statistics Under Interval and Fuzzy Uncertainty. Applications to Computer Science and Engineering, Studies in Computational Intelligence, vol. 393, (2012), Springer Berlin)
[19] Stoye, J., Partial identification of spread parameters, Quant. Econ., 1, 323-357, (2010) · Zbl 1205.62015
[20] Vavasis, S. A., Nonlinear optimization: complexity issues, (1991), Oxford University Press · Zbl 0785.90091
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.