LEADER 00000cam a2200000Mi 4500
001    EBOOKCENTRAL_on1034635694
003    OCoLC
005    20240329122006.0
006    m o d
007    cr |n|---|||||
008    180505s2018 enk o 000 0 eng d
040    |a EBLCP |b eng |e pn |c EBLCP |d MERUC |d IDB |d NLE |d OCLCQ |d UKMGB |d OCLCO |d LVT |d OCLCF |d UKAHL |d C6I |d OCLCQ |d UX1 |d K6U |d OCLCO |d OCLCQ |d OCLCO |d OCLCL
015    |a GBB882209 |2 bnb
016 7  |a 018853898 |2 Uk
019    |a 1175632501
020    |a 9781788830713
020    |a 1788830717
020    |a 9781788835725
020    |a 1788835727 |q (Trade Paper)
024 3  |a 9781788835725
029 1  |a UKMGB |b 018853898
029 1  |a AU@ |b 000067097981
035    |a (OCoLC)1034635694 |z (OCoLC)1175632501
037    |a 9781788830713 |b Packt Publishing
050  4 |a Q325.6 |b .D888 2018eb
082 04 |a 006.31 |2 23
049    |a UAMI
100 1  |a Dutta, Sayon.
245 10 |a Reinforcement Learning with TensorFlow : |b a beginner's guide to designing self-learning systems with TensorFlow and OpenAI Gym.
260    |a Birmingham : |b Packt Publishing, |c 2018.
300    |a 1 online resource (327 pages)
336    |a text |b txt |2 rdacontent
337    |a computer |b c |2 rdamedia
338    |a online resource |b cr |2 rdacarrier
588 0  |a Print version record.
505 0  |a Cover; Title Page; Copyright and Credits; Packt Upsell; Contributors; Table of Contents; Preface; Chapter 1: Deep Learning -- Architectures and Frameworks; Deep learning; Activation functions for deep learning; The sigmoid function; The tanh function; The softmax function; The rectified linear unit function; How to choose the right activation function; Logistic regression as a neural network; Notation; Objective; The cost function; The gradient descent algorithm; The computational graph; Steps to solve logistic regression using gradient descent; What is Xavier initialization?
505 8  |a Why do we use Xavier initialization?; The neural network model; Recurrent neural networks; Long Short Term Memory Networks; Convolutional neural networks; The LeNet-5 convolutional neural network; The AlexNet model; The VGG-Net model; The Inception model; Limitations of deep learning; The vanishing gradient problem; The exploding gradient problem; Overcoming the limitations of deep learning; Reinforcement learning; Basic terminologies and conventions; Optimality criteria; The value function for optimality; The policy model for optimality; The Q-learning approach to reinforcement learning.
505 8  |a Asynchronous advantage actor-critic; Introduction to TensorFlow and OpenAI Gym; Basic computations in TensorFlow; An introduction to OpenAI Gym; The pioneers and breakthroughs in reinforcement learning; David Silver; Pieter Abbeel; Google DeepMind; The AlphaGo program; Libratus; Summary; Chapter 2: Training Reinforcement Learning Agents Using OpenAI Gym; The OpenAI Gym; Understanding an OpenAI Gym environment; Programming an agent using an OpenAI Gym environment; Q-Learning; The Epsilon-Greedy approach; Using the Q-Network for real-world applications; Summary; Chapter 3: Markov Decision Process.
505 8  |a Markov decision processes; The Markov property; The S state set; Actions; Transition model; Rewards; Policy; The sequence of rewards -- assumptions; The infinite horizons; Utility of sequences; The Bellman equations; Solving the Bellman equation to find policies; An example of value iteration using the Bellman equation; Policy iteration; Partially observable Markov decision processes; State estimation; Value iteration in POMDPs; Training the FrozenLake-v0 environment using MDP; Summary; Chapter 4: Policy Gradients; The policy optimization method; Why policy optimization methods?
505 8  |a Why stochastic policy?; Example 1 -- rock, paper, scissors; Example 2 -- state aliased grid-world; Policy objective functions; Policy Gradient Theorem; Temporal difference rule; TD(1) rule; TD(0) rule; TD(λ) rule; Policy gradients; The Monte Carlo policy gradient; Actor-critic algorithms; Using a baseline to reduce variance; Vanilla policy gradient; Agent learning pong using policy gradients; Summary; Chapter 5: Q-Learning and Deep Q-Networks; Why reinforcement learning?; Model based learning and model free learning; Monte Carlo learning; Temporal difference learning.
500    |a On-policy and off-policy learning.
520    |a Reinforcement learning allows you to develop intelligent, self-learning systems. This book shows you how to put the concepts of reinforcement learning into practice to train efficient models. You will use popular reinforcement learning algorithms to implement use cases in image processing and NLP by combining the power of TensorFlow and OpenAI Gym.
590    |a ProQuest Ebook Central |b Ebook Central Academic Complete
650  0 |a Reinforcement learning.
650  6 |a Apprentissage par renforcement (Intelligence artificielle)
650  7 |a Reinforcement learning |2 fast
758    |i has work: |a Reinforcement Learning with TensorFlow : a beginner's guide to designing self-learning systems with TensorFlow and OpenAI Gym (Text) |1 https://id.oclc.org/worldcat/entity/E39PD3WC7BtP3brwWXD88QxhVy |4 https://id.oclc.org/worldcat/ontology/hasWork
776 08 |i Print version: |a Dutta, Sayon. |t Reinforcement Learning with TensorFlow : A beginner's guide to designing self-learning systems with TensorFlow and OpenAI Gym. |d Birmingham : Packt Publishing, ©2018
856 40 |u https://ebookcentral.uam.elogim.com/lib/uam-ebooks/detail.action?docID=5371683 |z Texto completo
938    |a Askews and Holts Library Services |b ASKH |n BDZ0036672668
938    |a EBL - Ebook Library |b EBLB |n EBL5371683
994    |a 92 |b IZTAP