LEADER |
00000cam a2200000Mi 4500 |
001 |
EBOOKCENTRAL_on1043619270 |
003 |
OCoLC |
005 |
20240329122006.0 |
006 |
m o d |
007 |
cr |n|---||||| |
008 |
180707s2018 enk o 000 0 eng d |
040 |
|
|
|a EBLCP
|b eng
|e pn
|c EBLCP
|d MERUC
|d IDB
|d NLE
|d OCLCQ
|d LVT
|d OCLCF
|d OCLCO
|d C6I
|d UKAHL
|d OCLCQ
|d UX1
|d K6U
|d OCLCO
|d OCLCQ
|d OCLCO
|
019 |
|
|
|a 1175639828
|
020 |
|
|
|a 9781788836913
|
020 |
|
|
|a 178883691X
|
020 |
|
|
|a 9781788836524
|
020 |
|
|
|a 1788836529
|q (Trade Paper)
|
024 |
3 |
|
|a 9781788836524
|
029 |
1 |
|
|a AU@
|b 000067104158
|
035 |
|
|
|a (OCoLC)1043619270
|z (OCoLC)1175639828
|
037 |
|
|
|a B09792
|b 01201872
|
050 |
|
4 |
|a Q325.5
|b .R385 2018eb
|
082 |
0 |
4 |
|a 006.31
|2 23
|
049 |
|
|
|a UAMI
|
100 |
1 |
|
|a Ravichandiran, Sudharsan.
|
245 |
1 |
0 |
|a Hands-On Reinforcement Learning with Python :
|b Master Reinforcement and Deep Reinforcement Learning Using OpenAI Gym and TensorFlow.
|
260 |
|
|
|a Birmingham :
|b Packt Publishing Ltd,
|c 2018.
|
300 |
|
|
|a 1 online resource (309 pages)
|
336 |
|
|
|a text
|b txt
|2 rdacontent
|
337 |
|
|
|a computer
|b c
|2 rdamedia
|
338 |
|
|
|a online resource
|b cr
|2 rdacarrier
|
588 |
0 |
|
|a Print version record.
|
505 |
0 |
|
|a Cover; Title Page; Copyright and Credits; Dedication; Packt Upsell; Contributors; Table of Contents; Preface; Chapter 1: Introduction to Reinforcement Learning; What is RL?; RL algorithm; How RL differs from other ML paradigms; Elements of RL; Agent; Policy function; Value function; Model; Agent environment interface; Types of RL environment; Deterministic environment; Stochastic environment; Fully observable environment; Partially observable environment; Discrete environment; Continuous environment; Episodic and non-episodic environment; Single and multi-agent environment; RL platforms.
|
505 |
8 |
|
|a OpenAI Gym and Universe; DeepMind Lab; RL-Glue; Project Malmo; ViZDoom; Applications of RL; Education; Medicine and healthcare; Manufacturing; Inventory management; Finance; Natural Language Processing and Computer Vision; Summary; Questions; Further reading; Chapter 2: Getting Started with OpenAI and TensorFlow; Setting up your machine; Installing Anaconda; Installing Docker; Installing OpenAI Gym and Universe; Common error fixes; OpenAI Gym; Basic simulations; Training a robot to walk; OpenAI Universe; Building a video game bot; TensorFlow; Variables, constants, and placeholders; Variables.
|
505 |
8 |
|
|a Constants; Placeholders; Computation graph; Sessions; TensorBoard; Adding scope; Summary; Questions; Further reading; Chapter 3: The Markov Decision Process and Dynamic Programming; The Markov chain and Markov process; Markov Decision Process; Rewards and returns; Episodic and continuous tasks; Discount factor; The policy function; State value function; State-action value function (Q function); The Bellman equation and optimality; Deriving the Bellman equation for value and Q functions; Solving the Bellman equation; Dynamic programming; Value iteration; Policy iteration.
|
505 |
8 |
|
|a Solving the frozen lake problem; Value iteration; Policy iteration; Summary; Questions; Further reading; Chapter 4: Gaming with Monte Carlo Methods; Monte Carlo methods; Estimating the value of pi using Monte Carlo; Monte Carlo prediction; First visit Monte Carlo; Every visit Monte Carlo; Let's play Blackjack with Monte Carlo; Monte Carlo control; Monte Carlo exploration starts; On-policy Monte Carlo control; Off-policy Monte Carlo control; Summary; Questions; Further reading; Chapter 5: Temporal Difference Learning; TD learning; TD prediction; TD control; Q learning.
|
505 |
8 |
|
|a Solving the taxi problem using Q learning; SARSA; Solving the taxi problem using SARSA; The difference between Q learning and SARSA; Summary; Questions; Further reading; Chapter 6: Multi-Armed Bandit Problem; The MAB problem; The epsilon-greedy policy; The softmax exploration algorithm; The upper confidence bound algorithm; The Thompson sampling algorithm; Applications of MAB; Identifying the right advertisement banner using MAB; Contextual bandits; Summary; Questions; Further reading; Chapter 7: Deep Learning Fundamentals; Artificial neurons; ANNs; Input layer; Hidden layer; Output layer.
|
500 |
|
|
|a Activation functions.
|
520 |
|
|
|a Reinforcement learning is a self-evolving type of machine learning that takes us closer to achieving true artificial intelligence. This easy-to-follow guide explains everything from scratch using rich examples written in Python.
|
590 |
|
|
|a ProQuest Ebook Central
|b Ebook Central Academic Complete
|
650 |
|
0 |
|a Machine learning.
|
650 |
|
6 |
|a Apprentissage automatique.
|
650 |
|
7 |
|a Artificial intelligence.
|2 bicssc
|
650 |
|
7 |
|a Human-computer interaction.
|2 bicssc
|
650 |
|
7 |
|a Neural networks & fuzzy systems.
|2 bicssc
|
650 |
|
7 |
|a Computers
|x Intelligence (AI) & Semantics.
|2 bisacsh
|
650 |
|
7 |
|a Computers
|x Neural Networks.
|2 bisacsh
|
650 |
|
7 |
|a Computers
|x Social Aspects
|x Human-Computer Interaction.
|2 bisacsh
|
650 |
|
7 |
|a Machine learning
|2 fast
|
776 |
0 |
8 |
|i Print version:
|a Ravichandiran, Sudharsan.
|t Hands-On Reinforcement Learning with Python : Master Reinforcement and Deep Reinforcement Learning Using OpenAI Gym and TensorFlow.
|d Birmingham : Packt Publishing Ltd, ©2018
|z 9781788836524
|
856 |
4 |
0 |
|u https://ebookcentral.uam.elogim.com/lib/uam-ebooks/detail.action?docID=5439844
|z Full text
|
938 |
|
|
|a Askews and Holts Library Services
|b ASKH
|n BDZ0037018846
|
938 |
|
|
|a EBL - Ebook Library
|b EBLB
|n EBL5439844
|
994 |
|
|
|a 92
|b IZTAP
|