
Hands-On Reinforcement Learning with Python : Master Reinforcement and Deep Reinforcement Learning Using OpenAI Gym and TensorFlow.

Reinforcement learning is a self-evolving type of machine learning that takes us closer to achieving true artificial intelligence. This easy-to-follow guide explains everything from scratch using rich examples written in Python.

Bibliographic Details
Classification: Electronic Book
Main author: Ravichandiran, Sudharsan
Format: Electronic eBook
Language: English
Published: Birmingham : Packt Publishing Ltd, 2018.
Subjects:
Online access: Full text

MARC

LEADER 00000cam a2200000Mi 4500
001 EBOOKCENTRAL_on1043619270
003 OCoLC
005 20240329122006.0
006 m o d
007 cr |n|---|||||
008 180707s2018 enk o 000 0 eng d
040 |a EBLCP  |b eng  |e pn  |c EBLCP  |d MERUC  |d IDB  |d NLE  |d OCLCQ  |d LVT  |d OCLCF  |d OCLCO  |d C6I  |d UKAHL  |d OCLCQ  |d UX1  |d K6U  |d OCLCO  |d OCLCQ  |d OCLCO 
019 |a 1175639828 
020 |a 9781788836913 
020 |a 178883691X 
020 |a 9781788836524 
020 |a 1788836529  |q (Trade Paper) 
024 3 |a 9781788836524 
029 1 |a AU@  |b 000067104158 
035 |a (OCoLC)1043619270  |z (OCoLC)1175639828 
037 |a B09792  |b 01201872 
050 4 |a Q325.5  |b .R385 2018eb 
082 0 4 |a 006.31  |2 23 
049 |a UAMI 
100 1 |a Ravichandiran, Sudharsan. 
245 1 0 |a Hands-On Reinforcement Learning with Python :  |b Master Reinforcement and Deep Reinforcement Learning Using OpenAI Gym and TensorFlow. 
260 |a Birmingham :  |b Packt Publishing Ltd,  |c 2018. 
300 |a 1 online resource (309 pages) 
336 |a text  |b txt  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
338 |a online resource  |b cr  |2 rdacarrier 
588 0 |a Print version record. 
505 0 |a Cover; Title Page; Copyright and Credits; Dedication; Packt Upsell; Contributors; Table of Contents; Preface; Chapter 1: Introduction to Reinforcement Learning; What is RL?; RL algorithm; How RL differs from other ML paradigms; Elements of RL; Agent; Policy function; Value function; Model; Agent environment interface; Types of RL environment; Deterministic environment; Stochastic environment; Fully observable environment; Partially observable environment; Discrete environment; Continuous environment; Episodic and non-episodic environment; Single and multi-agent environment; RL platforms. 
505 8 |a OpenAI Gym and Universe; DeepMind Lab; RL-Glue; Project Malmo; ViZDoom; Applications of RL; Education; Medicine and healthcare; Manufacturing; Inventory management; Finance; Natural Language Processing and Computer Vision; Summary; Questions; Further reading; Chapter 2: Getting Started with OpenAI and TensorFlow; Setting up your machine; Installing Anaconda; Installing Docker; Installing OpenAI Gym and Universe; Common error fixes; OpenAI Gym; Basic simulations; Training a robot to walk; OpenAI Universe; Building a video game bot; TensorFlow; Variables, constants, and placeholders; Variables. 
505 8 |a Constants; Placeholders; Computation graph; Sessions; TensorBoard; Adding scope; Summary; Questions; Further reading; Chapter 3: The Markov Decision Process and Dynamic Programming; The Markov chain and Markov process; Markov Decision Process; Rewards and returns; Episodic and continuous tasks; Discount factor; The policy function; State value function; State-action value function (Q function); The Bellman equation and optimality; Deriving the Bellman equation for value and Q functions; Solving the Bellman equation; Dynamic programming; Value iteration; Policy iteration. 
505 8 |a Solving the frozen lake problem; Value iteration; Policy iteration; Summary; Questions; Further reading; Chapter 4: Gaming with Monte Carlo Methods; Monte Carlo methods; Estimating the value of pi using Monte Carlo; Monte Carlo prediction; First visit Monte Carlo; Every visit Monte Carlo; Let's play Blackjack with Monte Carlo; Monte Carlo control; Monte Carlo exploration starts; On-policy Monte Carlo control; Off-policy Monte Carlo control; Summary; Questions; Further reading; Chapter 5: Temporal Difference Learning; TD learning; TD prediction; TD control; Q learning. 
505 8 |a Solving the taxi problem using Q learning; SARSA; Solving the taxi problem using SARSA; The difference between Q learning and SARSA; Summary; Questions; Further reading; Chapter 6: Multi-Armed Bandit Problem; The MAB problem; The epsilon-greedy policy; The softmax exploration algorithm; The upper confidence bound algorithm; The Thompson sampling algorithm; Applications of MAB; Identifying the right advertisement banner using MAB; Contextual bandits; Summary; Questions; Further reading; Chapter 7: Deep Learning Fundamentals; Artificial neurons; ANNs; Input layer; Hidden layer; Output layer. 
500 |a Activation functions. 
520 |a Reinforcement learning is a self-evolving type of machine learning that takes us closer to achieving true artificial intelligence. This easy-to-follow guide explains everything from scratch using rich examples written in Python. 
590 |a ProQuest Ebook Central  |b Ebook Central Academic Complete 
650 0 |a Machine learning. 
650 6 |a Apprentissage automatique. 
650 7 |a Artificial intelligence.  |2 bicssc 
650 7 |a Human-computer interaction.  |2 bicssc 
650 7 |a Neural networks & fuzzy systems.  |2 bicssc 
650 7 |a Computers  |x Intelligence (AI) & Semantics.  |2 bisacsh 
650 7 |a Computers  |x Neural Networks.  |2 bisacsh 
650 7 |a Computers  |x Social Aspects  |x Human-Computer Interaction.  |2 bisacsh 
650 7 |a Machine learning  |2 fast 
776 0 8 |i Print version:  |a Ravichandiran, Sudharsan.  |t Hands-On Reinforcement Learning with Python : Master Reinforcement and Deep Reinforcement Learning Using OpenAI Gym and TensorFlow.  |d Birmingham : Packt Publishing Ltd, ©2018  |z 9781788836524 
856 4 0 |u https://ebookcentral.uam.elogim.com/lib/uam-ebooks/detail.action?docID=5439844  |z Full text 
938 |a Askews and Holts Library Services  |b ASKH  |n BDZ0037018846 
938 |a EBL - Ebook Library  |b EBLB  |n EBL5439844 
994 |a 92  |b IZTAP