Deep Learning and the Game of Go
The ancient strategy game of Go is an incredible case study for AI. In 2016, a deep learning-based system shocked the Go world by defeating a world champion. Shortly after that, the upgraded AlphaGo Zero crushed the original bot by using deep reinforcement learning to master the game. Now, you can learn those same cutting-edge techniques by building your own Go bot!
| Classification: | eBook |
|---|---|
| Main authors: | Max Pumperla, Kevin Ferguson |
| Format: | Electronic eBook |
| Language: | English |
| Published: | Shelter Island, NY: Manning Publications Co., [2019] |
| Online access: | Full text (requires prior registration with an institutional email address) |
Table of Contents:
- Intro
- Deep Learning and the Game of Go
- Max Pumperla and Kevin Ferguson
- Copyright
- Dedication
- Brief Table of Contents
- Table of Contents
- front matter
- Foreword
- Preface
- Acknowledgments
- About this book
- Who should read this book
- Roadmap
- About the code
- Book forum
- About the authors
- About the cover illustration
- Part 1. Foundations
- 1 Toward deep learning: a machine-learning introduction
- 1.1. What is machine learning?
- 1.1.1. How does machine learning relate to AI?
- 1.1.2. What you can and can't do with machine learning
- 1.2. Machine learning by example
- 1.2.1. Using machine learning in software applications
- 1.2.2. Supervised learning
- 1.2.3. Unsupervised learning
- 1.2.4. Reinforcement learning
- 1.3. Deep learning
- 1.4. What you'll learn in this book
- 1.5. Summary
- 2 Go as a machine-learning problem
- 2.1. Why games?
- 2.2. A lightning introduction to the game of Go
- 2.2.1. Understanding the board
- 2.2.2. Placing and capturing stones
- 2.2.3. Ending the game and counting
- 2.2.4. Understanding ko
- 2.3. Handicaps
- 2.4. Where to learn more
- 2.5. What can we teach a machine?
- 2.5.1. Selecting moves in the opening
- 2.5.2. Searching game states
- 2.5.3. Reducing the number of moves to consider
- 2.5.4. Evaluating game states
- 2.6. How to measure your Go AI's strength
- 2.6.1. Traditional Go ranks
- 2.6.2. Benchmarking your Go AI
- 2.7. Summary
- 3 Implementing your first Go bot
- 3.1. Representing a game of Go in Python
- 3.1.1. Implementing the Go board
- 3.1.2. Tracking connected groups of stones in Go: strings
- 3.1.3. Placing and capturing stones on a Go board
- 3.2. Capturing game state and checking for illegal moves
- 3.2.1. Self-capture
- 3.2.2. Ko
- 3.3. Ending a game
- 3.4. Creating your first bot: the weakest Go AI imaginable
- 3.5. Speeding up game play with Zobrist hashing
- 3.6. Playing against your bot
- 3.7. Summary
- Part 2. Machine learning and game AI
- 4 Playing games with tree search
- 4.1. Classifying games
- 4.2. Anticipating your opponent with minimax search
- 4.3. Solving tic-tac-toe: a minimax example
- 4.4. Reducing search space with pruning
- 4.4.1. Reducing search depth with position evaluation
- 4.4.2. Reducing search width with alpha-beta pruning
- 4.5. Evaluating game states with Monte Carlo tree search
- 4.5.1. Implementing Monte Carlo tree search in Python
- 4.5.2. How to select which branch to explore
- 4.5.3. Applying Monte Carlo tree search to Go
- 4.6. Summary
- 5 Getting started with neural networks
- 5.1. A simple use case: classifying handwritten digits
- 5.1.1. The MNIST data set of handwritten digits
- 5.1.2. MNIST data preprocessing
- 5.2. The basics of neural networks
- 5.2.1. Logistic regression as a simple artificial neural network
- 5.2.2. Networks with more than one output dimension
- 5.3. Feed-forward networks
- 5.4. How good are our predictions? Loss functions and optimization
- 5.4.1. What is a loss function?
- 5.4.2. Mean squared error
- 5.4.3. Finding minima in loss functions
- 5.4.4. Gradient descent to find minima
- 5.4.5. Stochastic gradient descent for loss functions
- 5.4.6. Propagating gradients back through your network
- 5.5. Training a neural network step-by-step in Python
- 5.5.1. Neural network layers in Python
- 5.5.2. Activation layers in neural networks
- 5.5.3. Dense layers in Python as building blocks for feed-forward networks
- 5.5.4. Sequential neural networks with Python
- 5.5.5. Applying your network to handwritten digit classification
- 5.6. Summary
- 6 Designing a neural network for Go data
- 6.1. Encoding a Go game position for neural networks
- 6.2. Generating tree-search games as network training data
- 6.3. Using the Keras deep-learning library
- 6.3.1. Understanding Keras design principles
- 6.3.2. Installing the Keras deep-learning library
- 6.3.3. Running a familiar first example with Keras
- 6.3.4. Go move prediction with feed-forward neural networks in Keras
- 6.4. Analyzing space with convolutional networks
- 6.4.1. What convolutions do intuitively
- 6.4.2. Building convolutional neural networks with Keras
- 6.4.3. Reducing space with pooling layers
- 6.5. Predicting Go move probabilities
- 6.5.1. Using the softmax activation function in the last layer
- 6.5.2. Cross-entropy loss for classification problems
- 6.6. Building deeper networks with dropout and rectified linear units
- 6.6.1. Dropping neurons for regularization
- 6.6.2. The rectified linear unit activation function
- 6.7. Putting it all together for a stronger Go move-prediction network
- 6.8. Summary
- 7 Learning from data: a deep-learning bot
- 7.1. Importing Go game records
- 7.1.1. The SGF file format
- 7.1.2. Downloading and replaying Go game records from KGS
- 7.2. Preparing Go data for deep learning
- 7.2.1. Replaying a Go game from an SGF record
- 7.2.2. Building a Go data processor
- 7.2.3. Building a Go data generator to load data efficiently
- 7.2.4. Parallel Go data processing and generators
- 7.3. Training a deep-learning model on human game-play data
- 7.4. Building more-realistic Go data encoders
- 7.5. Training efficiently with adaptive gradients
- 7.5.1. Decay and momentum in SGD
- 7.5.2. Optimizing neural networks with Adagrad
- 7.5.3. Refining adaptive gradients with Adadelta
- 7.6. Running your own experiments and evaluating performance
- 7.6.1. A guideline to testing architectures and hyperparameters
- 7.6.2. Evaluating performance metrics for training and test data
- 12.1.1. What is advantage?
- 12.1.2. Calculating advantage during self-play
- 12.2. Designing a neural network for actor-critic learning
- 12.3. Playing games with an actor-critic agent
- 12.4. Training an actor-critic agent from experience data
- 12.5. Summary
- Part 3. Greater than the sum of its parts
- 13 AlphaGo: Bringing it all together
- 13.1. Training deep neural networks for AlphaGo
- 13.1.1. Network architectures in AlphaGo
- 13.1.2. The AlphaGo board encoder
- 13.1.3. Training AlphaGo-style policy networks
- 13.2. Bootstrapping self-play from policy networks
- 13.3. Deriving a value network from self-play data
- 13.4. Better search with policy and value networks
- 13.4.1. Using neural networks to improve Monte Carlo rollouts
- 13.4.2. Tree search with a combined value function
- 13.4.3. Implementing AlphaGo's search algorithm
- 13.5. Practical considerations for training your own AlphaGo
- 13.6. Summary
- 14 AlphaGo Zero: Integrating tree search with reinforcement learning
- 14.1. Building a neural network for tree search
- 14.2. Guiding tree search with a neural network
- 14.2.1. Walking down the tree
- 14.2.2. Expanding the tree
- 14.2.3. Selecting a move
- 14.3. Training
- 14.4. Improving exploration with Dirichlet noise
- 14.5. Modern techniques for deeper neural networks
- 14.5.1. Batch normalization
- 14.5.2. Residual networks
- 14.6. Exploring additional resources
- 14.7. Wrapping up
- 14.8. Summary
- Appendix A. Mathematical foundations
- Vectors, matrices, and beyond: a linear algebra primer
- Vectors: one-dimensional data
- Matrices: two-dimensional data
- Rank 3 tensors
- Rank 4 tensors
- Calculus in five minutes: derivatives and finding maxima
- Appendix B. The backpropagation algorithm
- A bit of notation
- The backpropagation algorithm for feed-forward networks