Hands-on reinforcement learning with Python : Master reinforcement and deep reinforcement learning using OpenAI Gym and TensorFlow

Reinforcement learning is a self-evolving type of machine learning that takes us closer to achieving true artificial intelligence. This easy-to-follow guide explains everything from scratch using rich examples written in Python.

Bibliographic Details
Main Author:	Ravichandiran, Sudharsan
Format:	Printed Book
Published:	Birmingham-Mumbai: Packt, 2018.
Edition:	1 Ed.
Subjects:	Machine learning. Artificial intelligence. Human-computer interaction.


LEADER	03911nam a22001817a 4500
020			\|a 9781788836524
082			\|a 006.31 \|b RAV-H
100			\|a Ravichandiran, Sudharsan
245			\|a Hands-on reinforcement learning with Python : \|b Master reinforcement and deep reinforcement learning using OpenAI Gym and TensorFlow \|c Sudharsan Ravichandiran
250			\|a 1 Ed.
260			\|a Birmingham-Mumbai: \|b Packt, \|c 2018.
300			\|a i-vi+305p.
505			\|a Cover; Title Page; Copyright and Credits; Dedication; Packt Upsell; Contributors; Table of Contents; Preface; Chapter 1: Introduction to Reinforcement Learning; What is RL?; RL algorithm; How RL differs from other ML paradigms; Elements of RL; Agent; Policy function; Value function; Model; Agent environment interface; Types of RL environment; Deterministic environment; Stochastic environment; Fully observable environment; Partially observable environment; Discrete environment; Continuous environment; Episodic and non-episodic environment; Single and multi-agent environment; RL platforms; OpenAI Gym and Universe; DeepMind Lab; RL-Glue; Project Malmo; ViZDoom; Applications of RL; Education; Medicine and healthcare; Manufacturing; Inventory management; Finance; Natural Language Processing and Computer Vision; Summary; Questions; Further reading; Chapter 2: Getting Started with OpenAI and TensorFlow; Setting up your machine; Installing Anaconda; Installing Docker; Installing OpenAI Gym and Universe; Common error fixes; OpenAI Gym; Basic simulations; Training a robot to walk; OpenAI Universe; Building a video game bot; TensorFlow; Variables, constants, and placeholders; Variables; Constants; Placeholders; Computation graph; Sessions; TensorBoard; Adding scope; Summary; Questions; Further reading; Chapter 3: The Markov Decision Process and Dynamic Programming; The Markov chain and Markov process; Markov Decision Process; Rewards and returns; Episodic and continuous tasks; Discount factor; The policy function; State value function; State-action value function (Q function); The Bellman equation and optimality; Deriving the Bellman equation for value and Q functions; Solving the Bellman equation; Dynamic programming; Value iteration; Policy iteration; Solving the frozen lake problem; Value iteration; Policy iteration; Summary; Questions; Further reading; Chapter 4: Gaming with Monte Carlo Methods; Monte Carlo methods; Estimating the value of pi using Monte Carlo; Monte Carlo prediction; First visit Monte Carlo; Every visit Monte Carlo; Let's play Blackjack with Monte Carlo; Monte Carlo control; Monte Carlo exploration starts; On-policy Monte Carlo control; Off-policy Monte Carlo control; Summary; Questions; Further reading; Chapter 5: Temporal Difference Learning; TD learning; TD prediction; TD control; Q learning; Solving the taxi problem using Q learning; SARSA; Solving the taxi problem using SARSA; The difference between Q learning and SARSA; Summary; Questions; Further reading; Chapter 6: Multi-Armed Bandit Problem; The MAB problem; The epsilon-greedy policy; The softmax exploration algorithm; The upper confidence bound algorithm; The Thompson sampling algorithm; Applications of MAB; Identifying the right advertisement banner using MAB; Contextual bandits; Summary; Questions; Further reading; Chapter 7: Deep Learning Fundamentals; Artificial neurons; ANNs; Input layer; Hidden layer; Output la
520			\|a Reinforcement learning is a self-evolving type of machine learning that takes us closer to achieving true artificial intelligence. This easy-to-follow guide explains everything from scratch using rich examples written in Python.
650			\|a Machine learning. Artificial intelligence. Human-computer interaction.
942			\|c BK
999			\|c 313803 \|d 313803
952			\|0 0 \|1 0 \|4 0 \|6 006_310000000000000_RAVH \|7 0 \|9 335546 \|a DCB \|b DCB \|d 2020-11-24 \|l 1 \|o 006.31 RAV-H \|p DCB3871 \|q 2020-12-08 \|r 2020-11-24 \|s 2020-11-24 \|w 2020-11-24 \|y BK
