
Exploiting environment configurability in reinforcement learning

Bibliographic Details
Classification: Electronic Book
Main Author: Metelli, Alberto Maria (Author)
Corporate Author: IOS Press
Format: Electronic eBook
Language: English
Published: Amsterdam, Netherlands : IOS Press, 2022.
Series: Frontiers in artificial intelligence and applications ; v. 361.
Online Access: Full text
Table of Contents:
  • Intro
  • Title page
  • Abstract
  • Contents
  • List of Figures
  • List of Tables
  • List of Algorithms
  • List of Symbols and Notation
  • Acknowledgments
  • Introduction
  • What is Reinforcement Learning?
  • Why Environment Configurability?
  • Original Contributions
  • Overview
  • Foundations of Sequential Decision-Making
  • Introduction
  • Markov Decision Processes
  • Markov Reward Processes
  • Markov Chains
  • Performance Indexes
  • Value Functions
  • Optimality Criteria
  • Exact Solution Methods
  • Reinforcement Learning Algorithms
  • Temporal Difference Methods
  • Function Approximation
  • Policy Search
  • Modeling Environment Configurability
  • Configurable Markov Decision Processes
  • Introduction
  • Motivations and Examples
  • Definition
  • Value Functions
  • Bellman Equations and Operators
  • Taxonomy
  • Related Literature
  • Solution Concepts for Conf-MDPs
  • Cooperative Setting
  • Non-Cooperative Setting
  • Learning in Cooperative Configurable Markov Decision Processes
  • Learning in Finite Cooperative Conf-MDPs
  • Introduction
  • Relative Advantage Functions
  • Performance Improvement Bound
  • Safe Policy Model Iteration
  • Theoretical Analysis
  • Experimental Evaluation
  • Examples of Conf-MDPs
  • Learning in Continuous Conf-MDPs
  • Introduction
  • Solving Parametric Conf-MDPs
  • Relative Entropy Model Policy Search
  • Theoretical Analysis
  • Approximation of the Transition Model
  • Experiments
  • Applications of Configurable Markov Decision Processes
  • Policy Space Identification
  • Introduction
  • Generalized Likelihood Ratio Test
  • Policy Space Identification in a Fixed Environment
  • Analysis for the Exponential Family
  • Policy Space Identification in a Configurable Environment
  • Connections with Existing Work
  • Experimental Results
  • Control Frequency Adaptation
  • Introduction
  • Persisting Actions in MDPs
  • Bounding the Performance Loss
  • Persistent Fitted Q-Iteration
  • Persistence Selection
  • Related Works
  • Experimental Evaluation
  • Open Questions
  • Discussion and Conclusions
  • Modeling Environment Configurability
  • Learning in Conf-MDPs
  • Applications of Conf-MDPs
  • Appendices
  • Additional Results and Proofs
  • Additional Results and Proofs of Chapter 6
  • Additional Results and Proofs of Chapter 7
  • Additional Results and Proofs of Chapter 8
  • Additional Results and Proofs of Chapter 9
  • Exponential Family Policies
  • Gaussian and Boltzmann Linear Policies as Exponential Family distributions
  • Fisher Information Matrix
  • Subgaussianity Assumption
  • Bibliography