
Hadoop MapReduce cookbook : recipes for analyzing large and complex datasets with Hadoop MapReduce /

Individual, self-contained code recipes. Solve specific problems using individual recipes, or work through the book to develop your capabilities. If you are a big data enthusiast striving to use Hadoop to solve your problems, this book is for you. It is aimed at Java programmers with some knowledge of...


Bibliographic Details
Classification: Electronic book
Main author: Perera, Srinath
Other authors: Gunarathne, Thilina
Format: Electronic eBook
Language: English
Published: Birmingham : Packt Pub., 2013.
Series: Community experience distilled.
Subjects:
Online access: Full text
Table of Contents:
  • Cover; Copyright; Credits; About the Authors; About the Reviewers; www.PacktPub.com; Table of Contents; Preface
  • Chapter 1: Getting Hadoop Up and Running in a Cluster: Introduction; Setting up Hadoop in your machine; Writing a WordCount MapReduce sample, bundling it, and running it using standalone Hadoop; Adding the combiner step to the WordCount MapReduce program; Setting up HDFS; Using HDFS monitoring UI; HDFS basic command-line file operations; Setting Hadoop in a distributed cluster environment; Running WordCount program in a distributed cluster environment; Using MapReduce monitoring UI
  • Chapter 2: Advanced HDFS: Introduction; Benchmarking HDFS; Adding a new DataNode; Decommissioning DataNodes; Using multiple disks/volumes and limiting HDFS disk usage; Setting HDFS block size; Setting the file replication factor; Using HDFS Java API; Using HDFS C API (libhdfs); Mounting HDFS (Fuse-DFS); Merging files in HDFS
  • Chapter 3: Advanced Hadoop MapReduce Administration: Introduction; Tuning Hadoop configurations for cluster deployments; Running benchmarks to verify the Hadoop installation; Reusing Java VMs to improve the performance; Fault tolerance and speculative execution; Debug scripts - analyzing task failures; Setting failure percentages and skipping bad records; Shared-user Hadoop clusters - using fair and other schedulers; Hadoop security - integrating with Kerberos; Using the Hadoop Tool interface
  • Chapter 4: Developing Complex Hadoop MapReduce Applications: Introduction; Choosing appropriate Hadoop data types; Implementing a custom Hadoop Writable data type; Implementing a custom Hadoop key type; Emitting data of different value types from a mapper; Choosing a suitable Hadoop InputFormat for your input data format; Adding support for new input data formats - implementing a custom InputFormat; Formatting the results of MapReduce computations - using Hadoop OutputFormats; Hadoop intermediate (map to reduce) data partitioning; Broadcasting and distributing shared resources to tasks in a MapReduce job - Hadoop DistributedCache; Using Hadoop with legacy applications - Hadoop Streaming; Adding dependencies between MapReduce jobs; Hadoop counters for reporting custom metrics
  • Chapter 5: Hadoop Ecosystem: Introduction; Installing HBase; Data random access using Java client APIs; Running MapReduce jobs on HBase (table input/output); Installing Pig; Running your first Pig command; Set operations (join, union) and sorting with Pig; Installing Hive; Running SQL-style query with Hive; Performing a join with Hive; Installing Mahout; Running K-means with Mahout; Visualizing K-means results
  • Chapter 6: Analytics: Introduction; Simple analytics using MapReduce; Performing Group-By using MapReduce; Calculating frequency distributions and sorting using MapReduce; Plotting the Hadoop results using GNU Plot; Calculating histograms using MapReduce
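The table of contents opens with the book's canonical WordCount recipe. As an illustrative sketch only (plain Python standing in for the Hadoop Java API; the function names below are invented for illustration, not part of Hadoop), the map-shuffle-reduce flow behind WordCount looks like this:

```python
from collections import defaultdict

# Plain-Python simulation of the MapReduce WordCount data flow.
# This is NOT the Hadoop API; it only mirrors the three phases.

def map_phase(line):
    """Mapper: emit a (word, 1) pair for each word in an input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle/sort: group all values by key, as Hadoop does between
    the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reducer: sum the per-word counts."""
    return key, sum(values)

lines = ["Hadoop MapReduce cookbook", "Hadoop MapReduce recipes"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
# {'hadoop': 2, 'mapreduce': 2, 'cookbook': 1, 'recipes': 1}
```

In real Hadoop, the same roles are played by a `Mapper` and `Reducer` subclass, and the shuffle is performed by the framework; the combiner step mentioned in Chapter 1 is essentially this reducer run early, on each mapper's local output.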