Hadoop essentials: delve into the key concepts of Hadoop and get a thorough understanding of the Hadoop ecosystem

If you are a system or application developer interested in learning how to solve practical problems using the Hadoop framework, then this book is ideal for you. This book is also meant for Hadoop professionals who want to find solutions to the different challenges they come across in their Hadoop pr...

Bibliographic Details
Classification:	Electronic Book
Main author:	Achari, Shiva (Author)
Format:	Electronic eBook
Language:	English
Published:	Birmingham, UK : Packt Publishing, 2015.
Series:	Community experience distilled.
Subjects:	Apache Hadoop. Electronic data processing > Distributed processing. Web sites > Design. Web site development. Traitement réparti. Sites Web > Conception. Sites Web > Développement. COMPUTERS > Computer Literacy. COMPUTERS > Computer Science. COMPUTERS > Data Processing. COMPUTERS > Hardware > General. COMPUTERS > Information Technology. COMPUTERS > Machine Theory. COMPUTERS > Reference.
Online access:	Full text

Table of Contents:

Cover; Copyright; Credits; About the Author; Acknowledgments; About the Reviewers; www.PacktPub.com; Table of Contents; Preface

Chapter 1: Introduction to Big Data and Hadoop; V's of big data; Volume; Velocity; Variety; Understanding big data; NoSQL; Types of NoSQL databases; Analytical database; Who is creating the big data?; Big data use cases; Big data use case patterns; Big data as a storage pattern; Big data as a data transformation pattern; Big data for a data analysis pattern; Big data for data in a real-time pattern; Big data for a low latency caching pattern; Hadoop; Hadoop history; Description; Advantages of Hadoop; Uses of Hadoop; Hadoop ecosystem; Apache Hadoop; Hadoop distributions; Pillars of Hadoop - HDFS, MapReduce, and YARN; Data access components - Hive and Pig; Data storage component - HBase; Data ingestion in Hadoop - Sqoop and Flume; Streaming and real-time analysis - Storm and Spark; Summary

Chapter 2: Hadoop Ecosystem; Traditional systems; Database trend; Hadoop use cases; Hadoop basic data flow; Hadoop integration; The Hadoop ecosystem; Distributed filesystem; HDFS; Distributed programming; NoSQL databases; Apache HBase; Data ingestion; Service Programming; Apache YARN; Apache Zookeeper; Scheduling; Data analytics and machine learning; System management; Apache Ambari; Summary

Chapter 3: Pillars of Hadoop - HDFS, MapReduce, and YARN; HDFS; Features of HDFS; HDFS Architecture; NameNode; DataNode; Checkpoint NameNode or Secondary NameNode; BackupNode; Data storage in HDFS; Read pipeline; Write pipeline; Rack awareness; Advantages of rack awareness in HDFS; HDFS Federation; Limitations of HDFS 1.0; The benefit of HDFS Federation; HDFS ports; HDFS commands; MapReduce; MapReduce architecture; JobTracker; TaskTracker; Serialization data types; Writable interface; Writable Comparable interface; MapReduce example; The MapReduce process; Mapper; Shuffle and sorting; Reducer; Speculative execution; FileFormats; InputFormats; RecordReader; OutputFormats; RecordWriter; Writing a MapReduce program; Mapper code; Reducer code; Driver code; Auxiliary steps; Combiner; Partitioner; YARN; YARN Architecture; ResourceManager; NodeManager; ApplicationMaster; Applications powered by YARN; Summary

Chapter 4: Data Access Components - Hive and Pig; Need of a data processing tool on Hadoop; Pig; Pig data types; Pig architecture; The logical plan; The physical plan; The MapReduce plan; Pig modes; Grunt shell; Input data; Loading data; Dump; Store; Filter; Group By; Limit; Aggregation; Cogroup; DESCRIBE; EXPLAIN; ILLUSTRATE; Hive; Hive architecture; Metastore; Query compiler; Execution engine; Data types and schemas; Installing Hive; Starting Hive Shell; HiveQL; DDL (Data Definition Language) operations; DML (Data Manipulation Language) operations; SQL operation; Built-in functions; Custom UDF (User Defined Functions); Managing tables (external versus managed); SerDe; Partitioning; Bucketing; Summary

Chapter 5: Storage Component - HBase
