Table of Contents:
  • Intro
  • Deep Learning and the Game of Go
  • Max Pumperla and Kevin Ferguson
  • Brief Table of Contents
  • Table of Contents
  • Foreword
  • Preface
  • About this book
  • Who should read this book
  • Roadmap
  • About the code
  • Part 1. Foundations
  • 1 Toward deep learning: a machine-learning introduction
  • 1.1. What is machine learning?
  • 1.1.1. How does machine learning relate to AI?
  • 1.1.2. What you can and can't do with machine learning
  • 1.2. Machine learning by example
  • 1.2.1. Using machine learning in software applications
  • 1.2.2. Supervised learning
  • 1.2.3. Unsupervised learning
  • 1.2.4. Reinforcement learning
  • 1.3. Deep learning
  • 1.4. What you'll learn in this book
  • 1.5. Summary
  • 2 Go as a machine-learning problem
  • 2.1. Why games?
  • 2.2. A lightning introduction to the game of Go
  • 2.2.1. Understanding the board
  • 2.2.2. Placing and capturing stones
  • 2.2.3. Ending the game and counting
  • 2.2.4. Understanding ko
  • 2.3. Handicaps
  • 2.4. Where to learn more
  • 2.5. What can we teach a machine?
  • 2.5.1. Selecting moves in the opening
  • 2.5.2. Searching game states
  • 2.5.3. Reducing the number of moves to consider
  • 2.5.4. Evaluating game states
  • 2.6. How to measure your Go AI's strength
  • 2.6.1. Traditional Go ranks
  • 2.6.2. Benchmarking your Go AI
  • 2.7. Summary
  • 3 Implementing your first Go bot
  • 3.1. Representing a game of Go in Python
  • 3.1.1. Implementing the Go board
  • 3.1.2. Tracking connected groups of stones in Go: strings
  • 3.1.3. Placing and capturing stones on a Go board
  • 3.2. Capturing game state and checking for illegal moves
  • 3.2.1. Self-capture
  • 3.2.2. Ko
  • 3.3. Ending a game
  • 3.4. Creating your first bot: the weakest Go AI imaginable
  • 3.5. Speeding up game play with Zobrist hashing
  • 3.6. Playing against your bot
  • 3.7. Summary
  • Part 2. Machine learning and game AI
  • 4 Playing games with tree search
  • 4.1. Classifying games
  • 4.2. Anticipating your opponent with minimax search
  • 4.3. Solving tic-tac-toe: a minimax example
  • 4.4. Reducing search space with pruning
  • 4.4.1. Reducing search depth with position evaluation
  • 4.4.2. Reducing search width with alpha-beta pruning
  • 4.5. Evaluating game states with Monte Carlo tree search
  • 4.5.1. Implementing Monte Carlo tree search in Python
  • 4.5.2. How to select which branch to explore
  • 4.5.3. Applying Monte Carlo tree search to Go
  • 4.6. Summary
  • 5 Getting started with neural networks
  • 5.1. A simple use case: classifying handwritten digits
  • 5.1.1. The MNIST data set of handwritten digits
  • 5.1.2. MNIST data preprocessing
  • 5.2. The basics of neural networks
  • 5.2.1. Logistic regression as a simple artificial neural network
  • 5.2.2. Networks with more than one output dimension
  • 5.3. Feed-forward networks
  • 5.4. How good are our predictions? Loss functions and optimization
  • 5.4.1. What is a loss function?
  • 5.4.2. Mean squared error
  • 5.4.3. Finding minima in loss functions
  • 5.4.4. Gradient descent to find minima
  • 5.4.5. Stochastic gradient descent for loss functions
  • 5.4.6. Propagating gradients back through your network
  • 5.5. Training a neural network step-by-step in Python
  • 5.5.1. Neural network layers in Python
  • 5.5.2. Activation layers in neural networks
  • 5.5.3. Dense layers in Python as building blocks for feed-forward networks
  • 5.5.4. Sequential neural networks with Python
  • 5.5.5. Applying your network to handwritten digit classification
  • 5.6. Summary
  • 6 Designing a neural network for Go data
  • 6.1. Encoding a Go game position for neural networks
  • 6.2. Generating tree-search games as network training data
  • 6.3. Using the Keras deep-learning library
  • 6.3.1. Understanding Keras design principles
  • 6.3.2. Installing the Keras deep-learning library
  • 6.3.3. Running a familiar first example with Keras
  • 6.3.4. Go move prediction with feed-forward neural networks in Keras
  • 6.4. Analyzing space with convolutional networks
  • 6.4.1. What convolutions do intuitively
  • 6.4.2. Building convolutional neural networks with Keras
  • 6.4.3. Reducing space with pooling layers
  • 6.5. Predicting Go move probabilities
  • 6.5.1. Using the softmax activation function in the last layer
  • 6.5.2. Cross-entropy loss for classification problems
  • 6.6. Building deeper networks with dropout and rectified linear units
  • 6.6.1. Dropping neurons for regularization
  • 6.6.2. The rectified linear unit activation function
  • 6.7. Putting it all together for a stronger Go move-prediction network
  • 6.8. Summary
  • 7 Learning from data: a deep-learning bot
  • 7.1. Importing Go game records
  • 7.1.1. The SGF file format
  • 7.1.2. Downloading and replaying Go game records from KGS
  • 7.2. Preparing Go data for deep learning
  • 7.2.1. Replaying a Go game from an SGF record
  • 7.2.2. Building a Go data processor
  • 7.2.3. Building a Go data generator to load data efficiently
  • 7.2.4. Parallel Go data processing and generators
  • 7.3. Training a deep-learning model on human game-play data
  • 7.4. Building more-realistic Go data encoders
  • 7.5. Training efficiently with adaptive gradients
  • 7.5.1. Decay and momentum in SGD
  • 7.5.2. Optimizing neural networks with Adagrad
  • 7.5.3. Refining adaptive gradients with Adadelta
  • 7.6. Running your own experiments and evaluating performance
  • 7.6.1. A guideline to testing architectures and hyperparameters
  • 7.6.2. Evaluating performance metrics for training and test data
  • 12.1.1. What is advantage?
  • 12.1.2. Calculating advantage during self-play
  • 12.2. Designing a neural network for actor-critic learning
  • 12.3. Playing games with an actor-critic agent
  • 12.4. Training an actor-critic agent from experience data
  • 12.5. Summary
  • Part 3. Greater than the sum of its parts
  • 13 AlphaGo: Bringing it all together
  • 13.1. Training deep neural networks for AlphaGo
  • 13.1.1. Network architectures in AlphaGo
  • 13.1.2. The AlphaGo board encoder
  • 13.1.3. Training AlphaGo-style policy networks
  • 13.2. Bootstrapping self-play from policy networks
  • 13.3. Deriving a value network from self-play data
  • 13.4. Better search with policy and value networks
  • 13.4.1. Using neural networks to improve Monte Carlo rollouts
  • 13.4.2. Tree search with a combined value function
  • 13.4.3. Implementing AlphaGo's search algorithm
  • 13.5. Practical considerations for training your own AlphaGo
  • 13.6. Summary
  • 14 AlphaGo Zero: Integrating tree search with reinforcement learning
  • 14.1. Building a neural network for tree search
  • 14.2. Guiding tree search with a neural network
  • 14.2.1. Walking down the tree
  • 14.2.2. Expanding the tree
  • 14.2.3. Selecting a move
  • 14.3. Training
  • 14.4. Improving exploration with Dirichlet noise
  • 14.5. Modern techniques for deeper neural networks
  • 14.5.1. Batch normalization
  • 14.5.2. Residual networks
  • 14.6. Exploring additional resources
  • 14.7. Wrapping up
  • 14.8. Summary
  • Appendix A. Mathematical foundations
  • Vectors, matrices, and beyond: a linear algebra primer
  • Vectors: one-dimensional data
  • Matrices: two-dimensional data
  • Rank 3 tensors
  • Rank 4 tensors
  • Calculus in five minutes: derivatives and finding maxima
  • Appendix B. The backpropagation algorithm
  • A bit of notation
  • The backpropagation algorithm for feed-forward networks