Learning Big Data with Amazon Elastic MapReduce.

Amazon Elastic MapReduce is a web service used to process and store vast amounts of data, and it is one of the largest Hadoop operators in the world. With the increase in the amount of data generated and collected by many businesses and the arrival of cost-effective cloud-based solutions for distribu...

Bibliographic Details
Classification: Electronic Book
Main author: Singh, Amarkant (Software developer)
Format: Electronic eBook
Language: English
Published: Packt Publishing, 2014.
Subjects:
Online access: Full text
Table of Contents:
  • Cover; Copyright; Credits; About the Authors; Acknowledgments; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Amazon Web Services; What is Amazon Web Services?; Structure and Design; Regions; Availability Zones; Services provided by AWS; Compute; Amazon EC2; Auto Scaling; Elastic Load Balancing; Amazon Workspaces; Storage; Amazon S3; Amazon EBS; Amazon Glacier; AWS Storage Gateway; AWS Import/Export; Databases; Amazon RDS; Amazon DynamoDB; Amazon Redshift; Amazon ElastiCache; Networking and CDN; Amazon VPC; Amazon Route 53; Amazon CloudFront; AWS Direct Connect
  • Analytics; Amazon EMR; Amazon Kinesis; AWS Data Pipeline; Application services; Amazon CloudSearch (Beta); Amazon SQS; Amazon SNS; Amazon SES; Amazon AppStream; Amazon Elastic Transcoder; Amazon SWF; Deployment and Management; AWS Identity and Access Management; Amazon CloudWatch; AWS Elastic Beanstalk; AWS CloudFormation; AWS OpsWorks; AWS CloudHSM; AWS CloudTrail; AWS Pricing; Creating an account on AWS; Step 1: Creating an Amazon.com account; Step 2: Providing a payment method; Step 3: Identity verification by telephone; Step 4: Selecting the AWS support plan
  • Launching the AWS management console; Getting started with Amazon EC2; How to start a machine on AWS?; Step 1: Choosing an Amazon Machine Image; Step 2: Choosing an instance type; Step 3: Configuring instance details; Step 4: Adding storage; Step 5: Tagging your instance; Step 6: Configuring a security group; Communicating with the launched instance; EC2 instance types; General purpose; Memory optimized; Compute optimized; Getting started with Amazon S3; Creating a S3 bucket; Bucket naming; S3cmd; Summary; Chapter 2: MapReduce; The map function; The reduce function; Divide and conquer
  • What is MapReduce?; The map reduce function models; The map function model; The reduce function model; Data life cycle in the MapReduce framework; Creation of input data splits; Record reader; Mapper; Combiner; Partitioner; Shuffle and sort; Reducer; Real-world examples and use cases of MapReduce; Social networks; Media and entertainment; E-commerce and websites; Fraud detection and financial analytics; Search engines and ad networks; ETL and data analytics; Software distributions built on the MapReduce framework; Apache Hadoop; MapR; Cloudera distribution; Summary; Chapter 3: Apache Hadoop
  • What is Apache Hadoop?; Hadoop modules; Hadoop Distributed File System; Major architectural goals of HDFS; Block replication and rack awareness; The HDFS architecture; NameNode; DataNode; Apache Hadoop MapReduce; Hadoop MapReduce 1.x; JobTracker; TaskTracker; Hadoop MapReduce 2.0; Hadoop YARN; Apache Hadoop as a platform; Apache Pig; Apache Hive; Summary; Chapter 4: Amazon EMR
  • Hadoop on Amazon Web Services; What is AWS EMR?; Features of EMR; Accessing Amazon EMR features; Programming on AWS EMR; The EMR architecture; Types of nodes; EMR Job Flow and Steps; Job Steps; An EMR cluster; Hadoop filesystem on EMR: S3 and HDFS