Mounjid, Othmane; Lehalle, Charles-Albert Improving reinforcement learning algorithms: towards optimal learning rate policies. (English) Zbl 07818737 Math. Finance 34, No. 2, 588-621 (2024). MSC: 91G15 68T05
Belomestny, Denis; Schoenmakers, John Primal-dual regression approach for Markov decision processes with general state and action spaces. (English) Zbl 07806777 SIAM J. Control Optim. 62, No. 1, 650-679 (2024). MSC: 90C40 65C05 62G08
Chen, Zaiwei; Clarke, John-Paul; Maguluri, Siva Theja Target network and truncation overcome the deadly triad in \(Q\)-learning. (English) Zbl 07786787 SIAM J. Math. Data Sci. 5, No. 4, 1078-1101 (2023). MSC: 68T05 68T07 68T09 90C40 62L20
John, Majnu; Wu, Yihren A simple illustration of interleaved learning using Kalman filter for linear least squares. (English) Zbl 07786764 Results Appl. Math. 20, Article ID 100409, 5 p. (2023). MSC: 68-XX 62-XX
Lu, Xiuyuan; Van Roy, Benjamin; Dwaracherla, Vikranth; Ibrahimi, Morteza; Osband, Ian; Wen, Zheng Reinforcement learning, bit by bit. (English) Zbl 1525.68120 Found. Trends Mach. Learn. 16, No. 6, 733-865 (2023). MSC: 68T05 68-02
Cui, Leilei; Jiang, Zhong-Ping A Lyapunov characterization of robust policy optimization. (English) Zbl 07772377 Control Theory Technol. 21, No. 3, 374-389 (2023). MSC: 93D25 93B35
Stanković, Miloš S.; Beko, Marko; Ilić, Nemanja; Stanković, Srdjan S. Multi-agent off-policy actor-critic algorithm for distributed multi-task reinforcement learning. (English) Zbl 1527.93414 Eur. J. Control 74, Article ID 100853, 9 p. (2023). MSC: 93D50 93A16 68T05
Xu, Yong; Xiang, Haoxiang; Yang, Lixin; Lu, Renquan; Quevedo, Daniel E. Optimal transmission strategy for multiple Markovian fading channels: existence, structure, and approximation. (English) Zbl 07766518 Automatica 158, Article ID 111312, 13 p. (2023). MSC: 93E10 93B70 90C40
Fu, Xingjian; Li, Zizheng Zero-sum game optimal control for the nonlinear switched systems based on heuristic dynamic programming. (English) Zbl 07754988 Optim. Control Appl. Methods 44, No. 5, 2821-2837 (2023). MSC: 93C10 93C30 49L20 91A10
Li, Qiang; Xu, Yunjun Dimension reduction based adaptive dynamic programming for optimal control of discrete-time nonlinear control-affine systems. (English) Zbl 1526.93121 Int. J. Control 96, No. 11, 2799-2811 (2023). MSC: 93C40 49L20 93C55 93C10 93D20
Reppen, Anders Max; Soner, Halil Mete Deep empirical risk minimization in finance: looking into the future. (English) Zbl 1522.91312 Math. Finance 33, No. 1, 116-145 (2023). MSC: 91G60 65C05 49N35
Long, Mingkang; An, Qing; Su, Housheng; Luo, Hui; Zhao, Jin Model-free algorithm for consensus of discrete-time multi-agent systems using reinforcement learning method. (English) Zbl 1521.93179 J. Franklin Inst. 360, No. 14, 10564-10581 (2023). MSC: 93D50 93A16 93C55
Qasem, Omar; Gao, Weinan; Vamvoudakis, Kyriakos G. Adaptive optimal control of continuous-time nonlinear affine systems via hybrid iteration. (English) Zbl 1522.93095 Automatica 157, Article ID 111261, 10 p. (2023). MSC: 93C40 90C39 93C10
Hasanbeig, Hosein; Kroening, Daniel; Abate, Alessandro Certified reinforcement learning with logic guidance. (English) Zbl 07732224 Artif. Intell. 322, Article ID 103949, 22 p. (2023). MSC: 68Txx
Sun, Changle; Li, Haitao State-flipped control and \(Q\)-learning for finite horizon output tracking of Boolean control networks. (English) Zbl 1520.93237 Int. J. Syst. Sci., Princ. Appl. Syst. Integr. 54, No. 12, 2452-2464 (2023). MSC: 93C29 93B70 93B03
Palmborg, Lina; Lindskog, Filip Premium control with reinforcement learning. (English) Zbl 1520.91347 ASTIN Bull. 53, No. 2, 233-257 (2023). MSC: 91G05 90C40 68T05
Cervellera, Cristiano Optimized ensemble value function approximation for dynamic programming. (English) Zbl 07709230 Eur. J. Oper. Res. 309, No. 2, 719-730 (2023). MSC: 90Bxx
Wang, Zixuan; Tang, Shanjian Convergence of gradient algorithms for nonconvex \(C^{1+ \alpha}\) cost functions. (English) Zbl 07708673 Chin. Ann. Math., Ser. B 44, No. 3, 445-464 (2023). MSC: 62L20 90C26
Wu, Yu; Zeng, Bo Dynamic parcel pick-up routing problem with prioritized customers and constrained capacity via lower-bound-based rollout approach. (English) Zbl 07706707 Comput. Oper. Res. 154, Article ID 106176, 14 p. (2023). MSC: 90Bxx
Luo, Xuyang; Song, Chunyue Optimal decision-making of mutual fund temporary borrowing problem via approximate dynamic programming. (English) Zbl 07706574 Comput. Oper. Res. 153, Article ID 106162, 15 p. (2023). MSC: 90Bxx
Zhang, Jian; Luo, Kelin; Florio, Alexandre M.; Van Woensel, Tom Solving large-scale dynamic vehicle routing problems with stochastic requests. (English) Zbl 07705414 Eur. J. Oper. Res. 306, No. 2, 596-614 (2023). MSC: 90Bxx
Stanković, Miloš S.; Beko, Marko; Stanković, Srdjan S. Distributed consensus-based multi-agent temporal-difference learning. (English) Zbl 1520.93516 Automatica 151, Article ID 110922, 11 p. (2023). MSC: 93D50 93A16 93A14 90C40
Malikopoulos, Andreas A. Separation of learning and control for cyber-physical systems. (English) Zbl 1520.93194 Automatica 151, Article ID 110912, 13 p. (2023). MSC: 93B70 93C83 93E20 93E35 90C40
Almudevar, Anthony A stochastic contraction mapping theorem. (English) Zbl 1519.93231 Syst. Control Lett. 174, Article ID 105482, 11 p. (2023). MSC: 93E20 93E24 93E35
Pakkhesal, Sajjad; Shamaghdari, Saeed SOS-based policy iteration for \(H_\infty\) control of polynomial systems with uncertain parameters. (English) Zbl 1519.93065 Int. J. Control 96, No. 4, 1052-1065 (2023). MSC: 93B36 93C41 93C40 90C39
Bäuerle, Nicole Mean field Markov decision processes. (English) Zbl 1517.90153 Appl. Math. Optim. 88, No. 1, Paper No. 12, 36 p. (2023). Reviewer: Wiesław Kotarski (Sosnowiec) MSC: 90C40 49L20
Darendeliler, Alp; Claeys, Dieter; Aghezzaf, El-Houssaine Integrated condition-based maintenance and multi-item lot-sizing with stochastic demand. (English) Zbl 07677917 J. Ind. Manag. Optim. 19, No. 9, 6908-6947 (2023). MSC: 90B25 90C40
Srivastava, Amber; Salapaka, Srinivasa M. Dynamic parameters in sequential decision making. (English) Zbl 1507.91061 Automatica 148, Article ID 110795, 8 p. (2023). MSC: 91B06 90C40
Liu, Kanglin; Liu, Changchun; Xiang, Xi; Tian, Zhili Testing facility location and dynamic capacity planning for pandemics with demand uncertainty. (English) Zbl 1524.90204 Eur. J. Oper. Res. 304, No. 1, 150-168 (2023). MSC: 90B80 90C15
Hao, Dong; Zhang, Dongcheng; Shi, Qi; Li, Kai Entropy regularized actor-critic based multi-agent deep reinforcement learning for stochastic games. (English) Zbl 07813886 Inf. Sci. 617, 17-40 (2022). MSC: 68-XX 91-XX
Lu, Jingwei; Wei, Qinglai; Wang, Ziyang; Zhou, Tianmin; Wang, Fei-Yue Event-triggered optimal control for discrete-time multi-player non-zero-sum games using parallel control. (English) Zbl 07798597 Inf. Sci. 584, 519-535 (2022). MSC: 93C65 93C55 91A06 49L20 93E20
Pitombeira-Neto, Anselmo R.; Murta, Arthur H. F. A reinforcement learning approach to the stochastic cutting stock problem. (English) Zbl 07711246 EURO J. Comput. Optim. 10, Article ID 100027, 26 p. (2022). MSC: 90C39 90C59 90C40
Belhenniche, Abdelkader; Guran, Liliana; Benahmed, Sfya; Lobo Pereira, Fernando Solving nonlinear and dynamic programming equations on extended \(b\)-metric spaces with the fixed-point technique. (English) Zbl 07702967 Fixed Point Theory Algorithms Sci. Eng. 2022, Paper No. 24, 22 p. (2022). MSC: 47-XX 54-XX
Cai, Xuan; Wang, Chaoli; Liu, Shuxin; Chen, Guochu; Wang, Gang Optimal output tracking control of linear discrete-time systems with unknown dynamics by adaptive dynamic programming and output feedback. (English) Zbl 1518.93083 Int. J. Syst. Sci., Princ. Appl. Syst. Integr. 53, No. 16, 3426-3448 (2022). MSC: 93C55 93C05 93B52 49L20
Qu, Guannan; Wierman, Adam; Li, Na Scalable reinforcement learning for multiagent networked systems. (English) Zbl 1512.90058 Oper. Res. 70, No. 6, 3601-3628 (2022). MSC: 90B10
Zhang, Qi; Hu, Jiaqiao Actor-critic-like stochastic adaptive search for continuous simulation optimization. (English) Zbl 1510.90267 Oper. Res. 70, No. 6, 3519-3537 (2022). MSC: 90C30 90C90
Liu, Menghan; Poppleton, Erik; Pedrielli, Giulia; Šulc, Petr; Bertsekas, Dimitri P. ExpertRNA: a new framework for RNA secondary structure prediction. (English) Zbl 07625884 INFORMS J. Comput. 34, No. 5, 2464-2484 (2022). MSC: 90Cxx
Subramanian, Jayakumar; Sinha, Amit; Seraj, Raihan; Mahajan, Aditya Approximate information state for approximate planning and reinforcement learning in partially observed systems. (English) Zbl 07625165 J. Mach. Learn. Res. 23, Paper No. 12, 83 p. (2022). MSC: 68T05
Chen, Zaiwei; Zhang, Sheng; Doan, Thinh T.; Clarke, John-Paul; Maguluri, Siva Theja Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning. (English) Zbl 1504.93364 Automatica 146, Article ID 110623, 14 p. (2022). MSC: 93E03 93C10 68T05
Wu, Xiao; Tang, Yanqiu Asymptotic optimality and rates of convergence of quantized stationary policies in continuous-time Markov decision processes. (English) Zbl 1497.90216 Discrete Dyn. Nat. Soc. 2022, Article ID 1080946, 11 p. (2022). MSC: 90C40 93E20 60J27
Ramaswamy, Arunselvan; Bhatnagar, Shalabh Analyzing approximate value iteration algorithms. (English) Zbl 1501.90106 Math. Oper. Res. 47, No. 3, 2138-2159 (2022). MSC: 90C39 62L20 90C40 47H04
Zhou, Zhengyuan; Mertikopoulos, Panayotis; Bambos, Nicholas; Glynn, Peter; Ye, Yinyu Distributed stochastic optimization with large delays. (English) Zbl 1501.90059 Math. Oper. Res. 47, No. 3, 2082-2111 (2022). MSC: 90C15 90C26 90C25 90C05
Aouad, Ali; Sarıtaç, Ömer Dynamic stochastic matching under limited time. (English) Zbl 1500.90053 Oper. Res. 70, No. 4, 2349-2383 (2022). MSC: 90C27 90C40
Prashanth, L. A.; Fu, Michael C. Risk-sensitive reinforcement learning via policy gradient search. (English) Zbl 1524.68292 Found. Trends Mach. Learn. 15, No. 5, 537-693 (2022). MSC: 68T05 90C90 68-02
Wu, Xiao; Kong, Yinying; Guo, Zhenbin Asymptotic optimality of quantized stationary policies in continuous-time Markov decision processes with Polish spaces. (Chinese. English summary) Zbl 1513.60098 Acta Math. Sci., Ser. A, Chin. Ed. 42, No. 2, 594-604 (2022). MSC: 60J10 90C40
Wang, Wei; Xie, Xiangpeng; Feng, Changyang Model-free finite-horizon optimal tracking control of discrete-time linear systems. (English) Zbl 1510.49032 Appl. Math. Comput. 433, Article ID 127400, 13 p. (2022). MSC: 49N10 49K21 93C55
Rodriguez, Ivan D.; Bonet, Blai; Sardina, Sebastian; Geffner, Hector FOND planning with explicit fairness assumptions. (English) Zbl 07566001 J. Artif. Intell. Res. (JAIR) 74, 887-916 (2022). MSC: 68Txx
Parker-Holder, Jack; Rajan, Raghu; Song, Xingyou; Biedenkapp, André; Miao, Yingjie; Eimer, Theresa; Zhang, Baohe; Nguyen, Vu; Calandra, Roberto; Faust, Aleksandra; Hutter, Frank; Lindauer, Marius Automated reinforcement learning (AutoRL): a survey and open problems. (English) Zbl 07565993 J. Artif. Intell. Res. (JAIR) 74, 517-568 (2022). MSC: 68Txx
Zanon, Mario; Gros, Sébastien; Palladino, Michele Stability-constrained Markov decision processes using MPC. (English) Zbl 1497.93065 Automatica 143, Article ID 110399, 9 p. (2022). MSC: 93B45 93E15 90C40
Hu, Yaohua; Li, Gang; Li, Minghua; Yu, Carisa Kwok Wai Multiple-sets split quasi-convex feasibility problems: adaptive subgradient methods with convergence guarantee. (English) Zbl 1493.65108 J. Nonlinear Var. Anal. 6, No. 2, 15-33 (2022). MSC: 65K05 90C25
Sirignano, Justin; Spiliopoulos, Konstantinos Asymptotics of reinforcement learning with neural networks. (English) Zbl 07547882 Stoch. Syst. 12, No. 1, 2-29 (2022). MSC: 68T05 68T07
Guo, Peijun Dynamic focus programming: a new approach to sequential decision problems under uncertainty. (English) Zbl 1507.91052 Eur. J. Oper. Res. 303, No. 1, 328-336 (2022). MSC: 91B06 90C15 90C39
Zhai, Yuexiang; Baek, Christina; Zhou, Zhengyuan; Jiao, Jiantao; Ma, Yi Computational benefits of intermediate rewards for goal-reaching policy learning. (English) Zbl 07527542 J. Artif. Intell. Res. (JAIR) 73, 847-896 (2022). MSC: 68Txx
Li, Gang; Li, Minghua; Hu, Yaohua Stochastic quasi-subgradient method for stochastic quasi-convex feasibility problems. (English) Zbl 1484.65124 Discrete Contin. Dyn. Syst., Ser. S 15, No. 4, 713-725 (2022). MSC: 65K05 90C26 49M37
Attia, Ahmed; Leyffer, Sven; Munson, Todd S. Stochastic learning approach for binary optimization: application to Bayesian optimal design of experiments. (English) Zbl 1493.62460 SIAM J. Sci. Comput. 44, No. 2, B395-B427 (2022). MSC: 62K05 62F15 62-08 35Q62 35Q93 35R30 93E35
Yu, Yue; Calderone, Dan; Li, Sarah H. Q.; Ratliff, Lillian J.; Açıkmeşe, Behçet Variable demand and multi-commodity flow in Markovian network equilibrium. (English) Zbl 1486.91019 Automatica 140, Article ID 110224, 8 p. (2022). MSC: 91A43 90C40 90B15
Avrachenkov, Konstantin E.; Borkar, Vivek S. Whittle index based Q-learning for restless bandits with average reward. (English) Zbl 1485.93341 Automatica 139, Article ID 110186, 10 p. (2022). MSC: 93C65 68T07
Akian, Marianne; Gaubert, Stéphane; Qu, Zheng; Saadi, Omar Multiply accelerated value iteration for nonsymmetric affine fixed point problems and application to Markov decision processes. (English) Zbl 1486.90203 SIAM J. Matrix Anal. Appl. 43, No. 1, 199-232 (2022). MSC: 90C39 90C40 47H09
Soeffker, Ninja; Ulmer, Marlin W.; Mattfeld, Dirk C. Stochastic dynamic vehicle routing in the light of prescriptive analytics: a review. (English) Zbl 1490.90069 Eur. J. Oper. Res. 298, No. 3, 801-820 (2022). MSC: 90B06 90C15 90C39 90-02
Boute, Robert N.; Gijsbrechts, Joren; van Jaarsveld, Willem; Vanvuchelen, Nathalie Deep reinforcement learning for inventory control: a roadmap. (English) Zbl 1490.90012 Eur. J. Oper. Res. 298, No. 2, 401-412 (2022). MSC: 90B05 68T05 90-02
Martinelli, Andrea; Gargiani, Matilde; Lygeros, John Data-driven optimal control with a relaxed linear program. (English) Zbl 1483.49018 Automatica 136, Article ID 110052, 7 p. (2022). MSC: 49J45 49L20 90C05 90C39
Lenarda, Pietro; Gnecco, Giorgio; Riccaboni, Massimo Parameter estimation in a 3-parameter \(p\)-star random graph model. (English) Zbl 07775298 Networks 77, No. 3, 403-420 (2021). MSC: 91D30 05C80
Guo, Linyuan; Rizvi, Syed Ali Asad; Lin, Zongli Optimal control of a two-wheeled self-balancing robot by reinforcement learning. (English) Zbl 1526.93175 Int. J. Robust Nonlinear Control 31, No. 6, 1885-1904 (2021). MSC: 93C85 49N10 49N35
Kim, Jeongho; Shin, Jaeuk; Yang, Insoon Hamilton-Jacobi deep Q-learning for deterministic continuous-time systems with Lipschitz continuous controls. (English) Zbl 07626721 J. Mach. Learn. Res. 22, Paper No. 206, 34 p. (2021). MSC: 68T05
Haddad, Wassim M. The role of systems biology, neuroscience, and thermodynamics in network control and learning. (English) Zbl 1500.93041 Vamvoudakis, Kyriakos G. (ed.) et al., Handbook of reinforcement learning and control. Cham: Springer. Stud. Syst. Decis. Control 325, 763-817 (2021). MSC: 93B70 93A16 93C10 92C42 93E20 80A99
Surana, Amit Reinforcement learning: an industrial perspective. (English) Zbl 07608721 Vamvoudakis, Kyriakos G. (ed.) et al., Handbook of reinforcement learning and control. Cham: Springer. Stud. Syst. Decis. Control 325, 647-672 (2021). MSC: 68Txx
Powell, Warren B. From reinforcement learning to optimal control: a unified framework for sequential decisions. (English) Zbl 07608703 Vamvoudakis, Kyriakos G. (ed.) et al., Handbook of reinforcement learning and control. Cham: Springer. Stud. Syst. Decis. Control 325, 29-74 (2021). MSC: 68Txx 93E20 49K45
Kiumarsi, Bahare; Modares, Hamidreza; Lewis, Frank Reinforcement learning for distributed control and multi-player games. (English) Zbl 07608702 Vamvoudakis, Kyriakos G. (ed.) et al., Handbook of reinforcement learning and control. Cham: Springer. Stud. Syst. Decis. Control 325, 7-27 (2021). MSC: 68Txx 91Axx
Ansari, Qamrul Hasan; Babu, Feeroz; Zeeshan, Mohd. Incremental quasi-subgradient method for minimizing sum of geodesic quasi-convex functions on Riemannian manifolds with applications. (English) Zbl 1489.90109 Numer. Funct. Anal. Optim. 42, No. 13, Part 1, 1492-1521 (2021). Reviewer: S. S. Kutateladze (Novosibirsk) MSC: 90C25 49J27 53C22 65K05 65K10
Prashanth, L. A.; Korda, Nathaniel; Munos, Rémi Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling. (English) Zbl 07432813 Mach. Learn. 110, No. 3, 559-618 (2021). MSC: 68T05
Romao, Licio; Margellos, Kostas; Notarstefano, Giuseppe; Papachristodoulou, Antonis Subgradient averaging for multi-agent optimisation with different constraint sets. (English) Zbl 1478.93640 Automatica 131, Article ID 109738, 14 p. (2021). MSC: 93D50 93A16 49J05
Liu, Jun On the convergence of reinforcement learning with Monte Carlo exploring starts. (English) Zbl 1478.93667 Automatica 129, Article ID 109693, 10 p. (2021). MSC: 93E03 68T05 90C40
Xie, Fang; Li, Haitao; Xu, Zhe An approximate dynamic programming approach to project scheduling with uncertain resource availabilities. (English) Zbl 1481.90194 Appl. Math. Modelling 97, 226-243 (2021). MSC: 90B36 90C39 90C40
Aravena, Ignacio; Papavasiliou, Anthony Asynchronous Lagrangian scenario decomposition. (English) Zbl 1473.90096 Math. Program. Comput. 13, No. 1, 1-50 (2021). MSC: 90C15 68W15 68W20 90C10 90C06
Vijayan, Nithia; Prashanth, L. A. Smoothed functional-based gradient algorithms for off-policy reinforcement learning: a non-asymptotic viewpoint. (English) Zbl 07423703 Syst. Control Lett. 155, Article ID 104988, 11 p. (2021). MSC: 68Txx
Khamaru, Koulik; Pananjady, Ashwin; Ruan, Feng; Wainwright, Martin J.; Jordan, Michael I. Is temporal difference learning optimal? An instance-dependent analysis. (English) Zbl 07419556 SIAM J. Math. Data Sci. 3, No. 4, 1013-1040 (2021). MSC: 68T05 90C40 62L20 62G20 60F15
Bhudisaksang, Theerawat; Cartea, Álvaro Adaptive robust control in continuous time. (English) Zbl 1472.93195 SIAM J. Control Optim. 59, No. 5, 3912-3945 (2021). MSC: 93E20 93E35 93B35 49L20 91G10
Bäuerle, Nicole; Glauner, Alexander Q-learning for distributionally robust Markov decision processes. (English) Zbl 1478.90138 Piunovskiy, Alexey (ed.) et al., Modern trends in controlled stochastic processes: theory and applications, V.III. Selected papers based on the presentations at the traditional Liverpool workshop on controlled stochastic processes, Liverpool, UK, July 2021. Cham: Springer. Emerg. Complex. Comput. 41, 108-128 (2021). MSC: 90C40
Reisinger, Christoph; Zhang, Yufei Regularity and stability of feedback relaxed controls. (English) Zbl 1471.93100 SIAM J. Control Optim. 59, No. 5, 3118-3151 (2021). MSC: 93B52 93B35 93E20
Belhenniche, A.; Benahmed, S.; Pereira, F. L. Extension of \(\lambda\)-PIR for weakly contractive operators via fixed point theory. (English) Zbl 1515.47114 Fixed Point Theory 22, No. 2, 511-526 (2021). Reviewer: Lisa Morhaim (Paris) MSC: 47J26 90C30 49L20
Zhao, Jingang; Zhang, Chi Finite-horizon optimal control of discrete-time linear systems with completely unknown dynamics using Q-learning. (English) Zbl 1476.49046 J. Ind. Manag. Optim. 17, No. 3, 1471-1483 (2021). MSC: 49N10 49K21 49N30 93C55
Doan, Thinh T. Finite-time analysis and restarting scheme for linear two-time-scale stochastic approximation. (English) Zbl 1471.62444 SIAM J. Control Optim. 59, No. 4, 2798-2819 (2021). MSC: 62L20 68T05
Krishnamurthy, Vikram; Yin, George Langevin dynamics for adaptive inverse reinforcement learning of stochastic gradient algorithms. (English) Zbl 07370638 J. Mach. Learn. Res. 22, Paper No. 121, 49 p. (2021). MSC: 68T05
Agarwal, Alekh; Kakade, Sham M.; Lee, Jason D.; Mahajan, Gaurav On the theory of policy gradient methods: optimality, approximation, and distribution shift. (English) Zbl 07370615 J. Mach. Learn. Res. 22, Paper No. 98, 76 p. (2021). MSC: 68T05
Metelli, Alberto Maria; Pirotta, Matteo; Calandriello, Daniele; Restelli, Marcello Safe policy iteration: a monotonically improving approximate policy iteration approach. (English) Zbl 07370614 J. Mach. Learn. Res. 22, Paper No. 97, 83 p. (2021). MSC: 68T05
Boyko, A. I.; Oseledets, I. V.; Ferrer, G. TT-QI: faster value iteration in tensor train format for stochastic optimal control. (English. Russian original) Zbl 1469.49026 Comput. Math. Math. Phys. 61, No. 5, 836-846 (2021); translation from Zh. Vychisl. Mat. Mat. Fiz. 61, No. 5, 865-877 (2021). MSC: 49K45 90C39 65F99
Doan, Thinh T.; Maguluri, Siva Theja; Romberg, Justin Finite-time performance of distributed temporal-difference learning with linear function approximation. (English) Zbl 1483.68294 SIAM J. Math. Data Sci. 3, No. 1, 298-320 (2021). MSC: 68T05 68T42 68W15 68W40 90C40
Liu, Rui-Rui; Hao, Fei; Yu, Hao Optimal DoS attack scheduling for multi-sensor remote state estimation over interference channels. (English) Zbl 1465.93090 J. Franklin Inst. 358, No. 9, 5136-5162 (2021). MSC: 93B70 93C83 91A12 90C40
Shone, Rob; Glazebrook, Kevin; Zografos, Konstantinos G. Applications of stochastic modeling in air traffic management: methods, challenges and opportunities for solving air traffic problems under uncertainty. (English) Zbl 1487.90466 Eur. J. Oper. Res. 292, No. 1, 1-26 (2021). MSC: 90B90 90B22 90B35 90B06 90C15 90-02
Anderson, Tor; Martínez, Sonia Distributed resource allocation with binary decisions via Newton-like neural network dynamics. (English) Zbl 1461.91144 Automatica 128, Article ID 109564, 11 p. (2021). MSC: 91B32 68T07
Wills, Adrian G.; Schön, Thomas B. Stochastic quasi-Newton with line-search regularisation. (English) Zbl 1461.93556 Automatica 127, Article ID 109503, 11 p. (2021). MSC: 93E20 93E12 93C10 90C53
Pedrosa, Filipe C.; Nereu, João C.; do Val, João B. R. When control and state variations increase uncertainty: modeling and stochastic control in discrete time. (English) Zbl 1461.93549 Automatica 123, Article ID 109341, 9 p. (2021). MSC: 93E20 93C41 93C55
Kalathil, Dileep; Borkar, Vivek S.; Jain, Rahul Empirical \(Q\)-value iteration. (English) Zbl 1461.68184 Stoch. Syst. 11, No. 1, 1-18 (2021). MSC: 68T05 62L20 90C39 90C40
Huré, Côme; Pham, Huyên; Bachouch, Achref; Langrené, Nicolas Deep neural networks algorithms for stochastic control problems on finite horizon: convergence analysis. (English) Zbl 1466.65007 SIAM J. Numer. Anal. 59, No. 1, 525-557 (2021). MSC: 65C05 90C39 93E35 62M45
Moazeni, Somayeh; Scott, Warren R.; Powell, Warren B. Least squares policy iteration with instrumental variables vs. direct policy search: comparison against optimal benchmarks using energy storage. (English) Zbl 1512.93152 INFOR: Inf. Syst. Oper. Res. 58, No. 1, 141-166 (2020). MSC: 93E20 90C39 93E24 93C55
Gupta, Abhishek; Chen, Hao; Pi, Jianzong; Tendolkar, Gaurav Some limit properties of Markov chains induced by recursive stochastic algorithms. (English) Zbl 1485.93646 SIAM J. Math. Data Sci. 2, No. 4, 967-1003 (2020). MSC: 93E35 60J20 90C40 90C39
Zhao, Jingang; Gan, Minggang Finite-horizon optimal control for continuous-time uncertain nonlinear systems using reinforcement learning. (English) Zbl 1483.49038 Int. J. Syst. Sci., Princ. Appl. Syst. Integr. 51, No. 13, 2429-2440 (2020). MSC: 49L12
Pan, Yunian; Peng, Guanze; Chen, Juntao; Zhu, Quanyan MASAGE: model-agnostic sequential and adaptive game estimation. (English) Zbl 1483.68047 Zhu, Quanyan (ed.) et al., Decision and game theory for security. 11th international conference, GameSec 2020, College Park, MD, USA, October 28–30, 2020. Proceedings. Cham: Springer. Lect. Notes Comput. Sci. 12513, 365-384 (2020). MSC: 68M25 91A80
Sheng, Linxue; Zhu, Yuanguo; Wang, Kai Analysis of a class of dynamic programming models for multi-stage uncertain systems. (English) Zbl 1481.90306 Appl. Math. Modelling 86, 446-459 (2020). MSC: 90C39 49K21 49K45 90B05 90B30
Köpf, Florian; Ramsteiner, Simon; Puccetti, Luca; Flad, Michael; Hohmann, Sören Adaptive dynamic programming for model-free tracking of trajectories with time-varying parameters. (English) Zbl 1469.93061 Int. J. Adapt. Control Signal Process. 34, No. 7, 839-856 (2020). MSC: 93C40 93C55 93B47 90C39