Habilitation thesis of Odalric-Ambrym Maillard

Mathematics of Statistical Sequential Decision Making

In this document, we give an overview of recent contributions to the mathematics of statistical sequential learning. Unlike research articles that start from a motivating example and provide little room to the mathematical tools in the main body of the article, we here give primary focus to these tools, in order to stress their potential as well as their role in the development of improved algorithms and proof techniques in the field. We revisit in particular properties of the log Laplace transform of a random variable, the handling of random stopping time in concentration of measure of empirical distributions, and we highlight the fundamental role of the “change of measure” argument both in the construction of performance lower-bounds as well as near-optimal strategies. We then give focus to obtaining finite-time error guarantees on the parameter estimation in parametric models before highlighting the strength of Legendre-Fenchel duality in the design of risk-averse and robust strategies. Finally, we turn the setting of Markov decision processes where we present some key insights for the development of the next generation of decision strategies. We end this manuscript by providing a more focused presentation of three key contributions in bandit theory, stochastic automata, and aggregation of experts.

defended on 11/02/2019