Learning YARN : moving beyond MapReduce--learn resource management and big data processing using YARN /

Moving beyond MapReduce - learn resource management and big data processing using YARN. About This Book: Deep dive into YARN components, schedulers, life cycle management and security architecture; create your own Hadoop-YARN applications and integrate big data technologies with YARN; step-by-step guide...

Bibliographic Details
Classification: Electronic Book
Main Authors: Arora, Akhil (Author), Mehrotra, Shrey (Author)
Format: Electronic eBook
Language: English
Published: Birmingham, UK : Packt Publishing, 2015.
Series: Community experience distilled.
Subjects:
Online Access: Full text (requires prior registration with an institutional email address)
Table of Contents:
  • Cover; Copyright; Credits; About the Authors; Acknowledgments; About the Reviewers; www.PacktPub.com; Table of Contents; Preface.
  • Chapter 1: Starting with YARN Basics; Introduction to MapReduce v1; Shortcomings of MapReduce v1; An overview of YARN components; ResourceManager; NodeManager; ApplicationMaster; Container; The YARN architecture; How YARN satisfies big data needs; Projects powered by YARN; Summary.
  • Chapter 2: Setting up a Hadoop-YARN Cluster; Starting with the basics; Supported platforms; Hardware requirements; Software requirements; Basic Linux commands / utilities; Sudo; Nano editor; Source; Jps; Netstat; Man; Preparing a node for a Hadoop-YARN cluster; Install Java; Create a Hadoop dedicated user and group; Disable firewall or open Hadoop ports; Configure domain name resolution; Install SSH and configure passwordless SSH from the master to all slaves; The Hadoop-YARN single node installation; Prerequisites; Installation steps; Step 1: Download and extract the Hadoop bundle; Step 2: Configure the environment variables; Step 3: Configure the Hadoop configuration files; Step 4: Format NameNode; Step 5: Start Hadoop daemons; An overview of web user interfaces; Run a sample application; The Hadoop-YARN multi-node installation; Prerequisites; Installation steps; Step 1: Configure the master node as a single-node Hadoop-YARN installation; Step 2: Copy the Hadoop folder to all the slave nodes; Step 3: Configure environment variables on slave nodes; Step 4: Format NameNode; Step 5: Start Hadoop daemons; An overview of the Hortonworks and Cloudera installations; Summary.
  • Chapter 3: Administering a Hadoop-YARN Cluster; Using the Hadoop-YARN commands; The user commands; Jar; Application; Node; Logs; Classpath; Version; Administration commands; ResourceManager / NodeManager / ProxyServer; RMAdmin; DaemonLog; Configuring the Hadoop-YARN services; The ResourceManager service; The NodeManager service; The Timeline server; The web application proxy server; Ports summary; Managing the Hadoop-YARN services; Managing service logs; Managing pid files; Monitoring the YARN services; JMX monitoring; The ResourceManager JMX beans; The NodeManager JMX beans; Ganglia monitoring; Ganglia daemons; Integrating Ganglia with Hadoop; Understanding ResourceManager's High Availability; Architecture; Failover mechanisms; Configuring ResourceManager High Availability; Define nodes; The RM state store mechanism; The failover proxy provider; Automatic failover; High Availability admin commands; Monitoring NodeManager's health; The health checker script; Summary.
  • Chapter 4: Executing Applications Using YARN; Understanding application execution flow; Phase 1: application initialization and submission; Phase 2: allocate memory and start ApplicationMaster; Phase 3: ApplicationMaster registration and resource allocation; Phase 4: launch and monitor containers; Phase 5: application progress report; Phase 6: application completion.