×

State-based confidence bounds for data-driven stochastic reachability using Hilbert space embeddings. (English) Zbl 1485.93640

Summary: In this paper, we compute finite sample bounds for data-driven approximations of the solution to stochastic reachability problems. Our approach uses a nonparametric technique known as kernel distribution embeddings, and provides probabilistic assurances of safety for stochastic systems in a model-free manner. By implicitly embedding the stochastic kernel of a Markov control process in a reproducing kernel Hilbert space, we can approximate the safety probabilities for stochastic systems with arbitrary stochastic disturbances as simple matrix operations and inner products. We present finite sample bounds for point-based approximations of the safety probabilities through construction of probabilistic confidence bounds that are state- and input-dependent. One advantage of this approach is that the bounds are responsive to non-uniformly sampled data, meaning that tighter bounds are feasible in regions of the state- and input-space with more observations. We numerically evaluate the approach, and demonstrate its efficacy on a neural network-controlled pendulum system.

MSC:

93E20 Optimal stochastic control
93B03 Attainable sets, reachability
93C25 Control/observation systems in abstract spaces
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] Abate, Alessandro; Blom, Henk; Cauchi, Nathalie; Degiorgio, Kurt; Fränzle, Martin; Hahn, Ernst Moritz, ARCH-COMP19 category report: Stochastic modelling, EPiC Series in Computing, 61, 62-102 (2019)
[2] Abate, Alessandro; Blom, Henk; Cauchi, Nathalie; Delicaris, Joanna; Hartmanns, Arnd; Khaled, Mahmoud, ARCH-COMP20 category report: Stochastic models, EPiC Series in Computing, 74, 76-106 (2020)
[3] Abate, Alessandro; Blom, H. A.P.; Cauchi, Nathalie; Haesaert, Sofie; Hartmanns, Arnd; Lesser, Kendra, ARCH-COMP18 category report: Stochastic modelling, EPiC Series in Computing, 54 (2018)
[4] Abate, Alessandro; Prandini, Maria; Lygeros, John; Sastry, Shankar, Probabilistic reachability and safety for controlled discrete time stochastic hybrid systems, Automatica, 44, 11, 2724-2734 (2008) · Zbl 1152.93051
[5] Aronszajn, Nachman, Theory of reproducing kernels, Transactions of the American Mathematical Society, 68, 3, 337-404 (1950) · Zbl 0037.20701
[6] Bartlett, Peter L.; Mendelson, Shahar, Rademacher and Gaussian complexities: Risk bounds and structural results, Journal of Machine Learning Research, 3, 463-482 (2002) · Zbl 1084.68549
[7] Berlinet, Alain; Thomas-Agnan, Christine, Reproducing Kernel Hilbert spaces in probability and statistics (2004), Springer · Zbl 1145.62002
[8] Bernstein, Dennis S., Matrix mathematics (2009), Princeton University Press · Zbl 1183.15001
[9] Bertsekas, Dimitri P.; Shreve, Steven E., Stochastic optimal control: the discrete time case (1978), Elsevier · Zbl 0471.93002
[10] Bousquet, Olivier; Elisseef, André, Stability and generalization, Journal of Machine Learning Research, 2, 499-526 (2002) · Zbl 1007.68083
[11] Caponnetto, Andrea; De Vito, Ernesto, Optimal rates for the regularized least-squares algorithm, Foundations of Computational Mathematics, 7, 3, 331-368 (2007) · Zbl 1129.68058
[12] Cauchi, Nathalie, & Abate, Alessandro (2019). StocHy - automated verification and synthesis of stochastic processes. In Conference on hybrid systems: computation and control (pp. 258-259). · Zbl 07120162
[13] Christmann, Andreas; Steinwart, Ingo, Support vector machines (2008), Springer · Zbl 1203.68171
[14] Çinlar, Erhan, Probability and stochastics (2011), Springer · Zbl 1226.60001
[15] De Vito, Ernesto; Caponnetto, Andrea; Rosasco, Lorenzo, Model selection for regularized least-squares algorithm in learning theory, Foundations of Computational Mathematics, 5, 1, 59-85 (2005) · Zbl 1083.68106
[16] Devonport, Alex; Arcak, Murat, Data-driven reachable set computation using adaptive Gaussian process classification and Monte Carlo methods, (American control conference (2020), IEEE), 2629-2634
[17] Dutta, Souradeep, Chen, Xin, & Sankaranarayanan, Sriram (2019). Reachability analysis for neural feedback systems using regressive polynomial rule inference. In Conference on hybrid systems: computation and control (pp. 157-168). · Zbl 07120151
[18] Gleason, Joseph D.; Vinod, Abraham P.; Oishi, Meeko M. K., Underapproximation of reach-avoid sets for discrete-time stochastic systems via Lagrangian methods, (Conference on decision and control (2017), IEEE), 4283-4290
[19] Gleason, Joseph D.; Vinod, Abraham P.; Oishi, Meeko M. K., Lagrangian approximations for stochastic reachability of a target tube, Automatica, 128, Article 109546 pp. (2021) · Zbl 1461.93029
[20] Grünewälder, Steffen, Lever, Guy, Baldassarre, Luca, Patterson, Sam, Gretton, Arthur, & Pontil, Massimilano (2012). Conditional mean embeddings as regressors. In International conference on machine learning (pp. 1803-1810).
[21] Grünewälder, Steffen, Lever, Guy, Baldassarre, Luca, Pontil, Massimilano, & Gretton, Arthur (2012). Modelling transition dynamics in MDPs with RKHS embeddings. In International conference on machine learning (pp. 1603-1610).
[22] Kanagawa, Motonobu; Fukumizu, Kenji, Recovering distributions from Gaussian RKHS embeddings, (Artificial intelligence and statistics (2014), PMLR), 457-465
[23] Kariotoglou, Nikolaos; Kamgarpour, Maryam; Summers, Tyler H.; Lygeros, John, The linear programming approach to reach-avoid problems for Markov decision processes, Journal of Artificial Intelligence Research, 60, 1, 263-285 (2017) · Zbl 1427.90286
[24] Kim, Seung-Jean; Zymnis, Argyrios; Magnani, Alessandro; Koh, Kwangmoo; Boyd, Stephen, Learning the kernel via convex optimization, (Conference on acoustics, speech and signal processing (2008), IEEE), 1997-2000
[25] Lanckriet, Gert R. G.; Cristianini, Nello; Bartlett, Peter; Ghaoui, Laurent El.; Jordan, Michael I., Learning the kernel matrix with semidefinite programming, Journal of Machine Learning Research, 5, Jan, 27-72 (2004) · Zbl 1222.68241
[26] Le, Quoc, Sarlós, Tamás, & Smola, Alex, et al. (2013). Fastfood-approximating kernel expansions in loglinear time. In International conference on machine learning.
[27] Lesser, Kendra, Oishi, Meeko, & Scott Erwin, R. (2013). Stochastic reachability for control of spacecraft relative motion. In Conference on decision and control (pp. 4705-4712).
[28] Lever, Guy, & Stafford, Ronnie (2015). Modelling policies in MDPs in reproducing kernel Hilbert space. In Artificial intelligence and statistics (pp. 590-598).
[29] Lew, Thomas, & Pavone, Marco (2020). Sampling-based reachability analysis: A random set theory approach with adversarial sampling. In Conference on robot learning.
[30] Lopez, Diego Manzanas, Musau, Patrick, Tran, Hoang-Dung, & Johnson, Taylor T. (2019). Verification of closed-loop systems with neural network controllers. In International workshop on applied verification of continuous and hybrid systems. Vol. 61 (pp. 201-210).
[31] Malone, N.; Chiang, H.; Lesser, K.; Oishi, M.; Tapia, L., Conference on hybrid dynamic moving obstacle avoidance using a stochastic reachable set-based potential field, IEEE Transactions on Robotics, 33, 5, 1124-1138 (2017)
[32] McDiarmid, Colin, On the method of bounded differences, Surveys in Combinatorics, 141, 1, 148-188 (1989) · Zbl 0712.05012
[33] Micchelli, Charles A.; Pontil, Massimiliano, On learning vector-valued functions, Neural Computation, 17, 1, 177-204 (2005) · Zbl 1092.93045
[34] Muandet, Krikamol; Fukumizu, Kenji; Sriperumbudur, Bharath; Schölkopf, Bernhard, Kernel mean embedding of distributions: A review and beyond, Foundations and Trends in Machine Learning, 10, 1-2, 1-141 (2017) · Zbl 1380.68336
[35] Nishiyama, Yu, Boularias, Abdeslam, Gretton, Arthur, & Fukumizu, Kenji (2012). Hilbert space embeddings of POMDPs. In Conference on uncertainty in artificial intelligence (pp. 644-653).
[36] Parzen, Emanuel, An approach to time series analysis, The Annals of Mathematical Statistics, 951-989 (1961) · Zbl 0107.13801
[37] Prajna, Stephen; Jadbabaie, Ali, Safety verification of hybrid systems using barrier certificates, (Conference on hybrid systems: computation and control (2004), Springer), 477-492 · Zbl 1135.93317
[38] Prajna, Stephen; Jadbabaie, Ali; Pappas, George J., A framework for worst-case and stochastic safety verification using barrier certificates, Transactions on Automatic Control, 52, 8, 1415-1428 (2007) · Zbl 1366.93711
[39] Rahimi, Ali; Recht, Benjamin, Random features for large-scale kernel machines, (Platt, J.; Koller, D.; Singer, Y.; Roweis, S., Advances in neural information processing systems. Vol. 20 (2008))
[40] Roohi, Nima, Wang, Yu, West, Matthew, Dullerud, Geir E., & Viswanathan, Mahesh (2017). Statistical verification of the Toyota powertrain control verification benchmark. In Conference on hybrid systems: computation and control. New York, NY, USA (pp. 65-70).
[41] Rudin, Walter, Functional analysis (1991), McGraw-Hill · Zbl 0867.46001
[42] Scholkopf, Bernhard; Smola, Alexander J., Learning with kernels: support vector machines, regularization, optimization, and beyond (2001), MIT Press
[43] Shmarov, Fedor; Soudjani, Sadegh; Paoletti, Nicola; Bartocci, Ezio; Lin, Shan; Smolka, Scott A., Automated synthesis of safe digital controllers for sampled-data stochastic nonlinear systems, IEEE Access, 8, 180825-180843 (2020)
[44] Shmarov, Fedor, & Zuliani, Paolo (2015). Probreach: verified probabilistic delta-reachability for stochastic hybrid systems. In Conference on hybrid systems: computation and control (pp. 134-139). · Zbl 1366.68183
[45] Smola, Alex; Gretton, Arthur; Song, Le; Schölkopf, Bernhard, A Hilbert space embedding for distributions, (International conference on algorithmic learning theory (2007), Springer), 13-31 · Zbl 1142.68407
[46] Song, Le, Boots, Byron, Siddiqi, Sajid M., Gordon, Geoffrey, & Smola, Alex (2010). Hilbert space embeddings of hidden Markov models. In International conference on machine learning (pp. 991-998).
[47] Song, Le, Gretton, Arthur, & Guestrin, Carlos (2010). Nonparametric tree graphical models. In International conference on artificial intelligence and statistics (pp. 765-772).
[48] Song, Le, Huang, Jonathan, Smola, Alex, & Fukumizu, Kenji (2009). Hilbert space embeddings of conditional distributions with applications to dynamical systems. In International conference on machine learning (pp. 961-968).
[49] Soudjani, Sadegh Esmaeil Zadeh; Gevaerts, Caspar; Abate, Alessandro, FAUST2: Formal abstractions of uncountable-STate STochastic processes, (International conference on tools and algorithms for the construction and analysis of systems (2015), Springer), 272-286
[50] Sriperumbudur, Bharath K.; Fukumizu, Kenji; Lanckriet, Gert RG., Universality, characteristic kernels and RKHS embedding of measures, Journal of Machine Learning Research, 12, 7 (2011) · Zbl 1280.68198
[51] Sriperumbudur, Bharath K.; Gretton, Arthur; Fukumizu, Kenji; Schölkopf, Bernhard; Lanckriet, Gert RG., Hilbert space embeddings and metrics on probability measures, Journal of Machine Learning Research, 11, Apr, 1517-1561 (2010) · Zbl 1242.60005
[52] Summers, Sean; Lygeros, John, Verification of discrete time stochastic hybrid systems: A stochastic reach-avoid decision problem, Automatica, 46, 12, 1951-1961 (2010) · Zbl 1371.93220
[53] Thorpe, Adam J.; Oishi, Meeko MK., Model-free stochastic reachability using kernel distribution embeddings, IEEE Control Systems Letters (2019)
[54] Thorpe, Adam J.; Ortiz, Kendric R.; Oishi, Meeko MK., Learning approximate forward reachable sets using separating kernels, (Learning for dynamics and control (2021), PMLR), 201-212
[55] Thorpe, Adam J.; Sivaramakrishnan, Vignesh; Oishi, Meeko MK., Approximate stochastic reachability for high dimensional systems, (American control conference (2021), IEEE), 1287-1293
[56] Vapnik, Vladimir, Statistical learning theory (1998), Wiley · Zbl 0935.62007
[57] Vinod, Abraham P., Gleason, Joseph D., & Oishi, Meeko M. K. (2019). SReachTools: A MATLAB stochastic reachability toolbox. In Conference on hybrid systems: computation and control. (pp. 33-38). · Zbl 07120138
[58] Vinod, A. P., HomChaudhuri, B., & Oishi, M. M. K. (2017). Forward stochastic reachability analysis for uncontrolled linear systems using Fourier transforms. In Conference on hybrid systems: computation and control (pp. 35-44). · Zbl 1369.93582
[59] Vinod, Abraham P.; Oishi, Meeko MK., Scalable underapproximation for the stochastic reach-avoid problem for high-dimensional LTI systems using Fourier transforms, IEEE Control Systems Letters, 1, 2, 316-321 (2017)
[60] Vinod, Abraham P.; Oishi, Meeko M. K., Probabilistic occupancy via forward stochastic reachability for Markov jump affine systems, Transactions on Automatic Control, 66, 7, 3068-3083 (2021) · Zbl 1467.93296
[61] Wang, Yu; Roohi, Nima; West, Matthew; Viswanathan, Mahesh; Dullerud, Geir E., Statistical verification of PCTL using antithetic and stratified samples, Formal Methods in System Design, 54, 145-163 (2019) · Zbl 1425.68270
[62] Zarei, Mojtaba, Wang, Yu, & Pajic, Miroslav (2020). Statistical verification of learning-based cyber-physical systems. In Conference on hybrid systems: computation and control. New York, NY. · Zbl 07300853
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.