
Exploiting environment configurability in reinforcement learning / Alberto Maria Metelli

Bibliographic Details
Classification: Electronic Book
Main Author: Metelli, Alberto Maria (Author)
Corporate Author: IOS Press
Format: Electronic eBook
Language: English
Published: Amsterdam, Netherlands : IOS Press, 2022.
Series: Frontiers in artificial intelligence and applications ; v. 361.
Subjects: Reinforcement learning
Online Access: Full text

MARC

LEADER 00000cam a22000007i 4500
001 EBOOKCENTRAL_on1373388321
003 OCoLC
005 20240329122006.0
006 m o d
007 cr cnu---unuuu
008 230320s2022 ne ob 000 0 eng d
040 |a IOSPR  |b eng  |e rda  |e pn  |c IOSPR  |d N$T  |d YDX  |d EBLCP  |d OCLCF  |d OCLCQ  |d OCLCO  |d OCLCQ  |d TMA  |d OCLCQ 
019 |a 1373232816  |a 1373345094 
020 |a 9781643683638  |q (electronic bk.) 
020 |a 1643683632  |q (electronic bk.) 
029 1 |a AU@  |b 000075409896 
035 |a (OCoLC)1373388321  |z (OCoLC)1373232816  |z (OCoLC)1373345094 
037 |a 9781643683638  |b IOS Press  |n http://www.iospress.nl 
050 4 |a Q325.6 
082 0 4 |a 006.3/1  |2 23/eng/20230320 
049 |a UAMI 
100 1 |a Metelli, Alberto Maria,  |e author. 
245 1 0 |a Exploiting environment configurability in reinforcement learning /  |c Alberto Maria Metelli. 
264 1 |a Amsterdam, Netherlands :  |b IOS Press,  |c 2022. 
300 |a 1 online resource. 
336 |a text  |b txt  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
338 |a online resource  |b cr  |2 rdacarrier 
490 1 |a Frontiers in artificial intelligence and applications ;  |v volume 361 
504 |a Includes bibliographical references. 
588 0 |a Online resource; title from PDF title page (IOS Press, viewed March 20, 2023). 
505 0 |a Intro -- Title page -- Abstract -- Contents -- List of Figures -- List of Tables -- List of Algorithms -- List of Symbols and Notation -- Acknowledgments -- Introduction -- What is Reinforcement Learning? -- Why Environment Configurability? -- Original Contributions -- Overview -- Foundations of Sequential Decision-Making -- Introduction -- Markov Decision Processes -- Markov Reward Processes -- Markov Chains -- Performance Indexes -- Value Functions -- Optimality Criteria -- Exact Solution Methods -- Reinforcement Learning Algorithms -- Temporal Difference Methods -- Function Approximation 
505 8 |a Policy Search -- Modeling Environment Configurability -- Configurable Markov Decision Processes -- Introduction -- Motivations and Examples -- Definition -- Value Functions -- Bellman Equations and Operators -- Taxonomy -- Related Literature -- Solution Concepts for Conf-MDPs -- Cooperative Setting -- Non-Cooperative Setting -- Learning in Cooperative Configurable Markov Decision Processes -- Learning in Finite Cooperative Conf-MDPs -- Introduction -- Relative Advantage Functions -- Performance Improvement Bound -- Safe Policy Model Iteration -- Theoretical Analysis -- Experimental Evaluation 
505 8 |a Examples of Conf-MDPs -- Learning in Continuous Conf-MDPs -- Introduction -- Solving Parametric Conf-MDPs -- Relative Entropy Model Policy Search -- Theoretical Analysis -- Approximation of the Transition Model -- Experiments -- Applications of Configurable Markov Decision Processes -- Policy Space Identification -- Introduction -- Generalized Likelihood Ratio Test -- Policy Space Identification in a Fixed Environment -- Analysis for the Exponential Family -- Policy Space Identification in a Configurable Environment -- Connections with Existing Work -- Experimental Results -- Control Frequency Adaptation 
505 8 |a Introduction -- Persisting Actions in MDPs -- Bounding the Performance Loss -- Persistent Fitted Q-Iteration -- Persistence Selection -- Related Works -- Experimental Evaluation -- Open Questions -- Discussion and Conclusions -- Modeling Environment Configurability -- Learning in Conf-MDPs -- Applications of Conf-MDPs -- Appendices -- Additional Results and Proofs -- Additional Results and Proofs of Chapter 6 -- Additional Results and Proofs of Chapter 7 -- Additional Results and Proofs of Chapter 8 -- Additional Results and Proofs of Chapter 9 -- Exponential Family Policies 
505 8 |a Gaussian and Boltzmann Linear Policies as Exponential Family distributions -- Fisher Information Matrix -- Subgaussianity Assumption -- Bibliography 
590 |a eBooks on EBSCOhost  |b EBSCO eBook Subscription Academic Collection - Worldwide 
590 |a ProQuest Ebook Central  |b Ebook Central Academic Complete 
650 0 |a Reinforcement learning. 
650 6 |a Apprentissage par renforcement (Intelligence artificielle) 
650 7 |a Reinforcement learning  |2 fast 
710 2 |a IOS Press. 
776 0 8 |i Print version:  |a Metelli, A. M.  |t Exploiting Environment Configurability in Reinforcement Learning  |d Amsterdam : IOS Press, Incorporated, c2022  |z 9781643683621 
830 0 |a Frontiers in artificial intelligence and applications ;  |v v. 361. 
856 4 0 |u https://ebookcentral.uam.elogim.com/lib/uam-ebooks/detail.action?docID=30413114  |z Texto completo 
938 |a EBSCOhost  |b EBSC  |n 3575340 
938 |a YBP Library Services  |b YANK  |n 304689537 
938 |a ProQuest Ebook Central  |b EBLB  |n EBL30413114 
994 |a 92  |b IZTAP