Talend for Big Data.
This book is written in a concise and easy-to-understand manner, and acts as a comprehensive guide on data analytics and integration with Talend big data processing jobs. If you are a chief information officer, enterprise architect, data architect, data scientist, software developer, software engine...
Classification: | Electronic Book |
---|---|
Main author: | |
Format: | Electronic eBook |
Language: | English |
Published: | Packt Publishing, 2014. |
Subjects: | |
Online access: | Full text |
Table of Contents:
- Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface
- Chapter 1: Getting Started with Talend Big Data; Talend Unified Platform presentation; Knowing about Hadoop ecosystem; Prerequisites for running examples; Downloading Talend Open Studio for Big Data; Installing TOSBD; Running TOSBD for the first time; Summary
- Chapter 2: Building Our First Big Data Job; TOSBD - the development environment; A simple HDFS writer job; Checking the result in HDFS; Summary
- Chapter 3: Formatting Data; Twitter Sentiment Analysis; Writing the tweets in HDFS; Setting our Apache Hive tables; Formatting tweets with Apache Hive; Summary
- Chapter 4: Processing Tweets with Apache Hive; Extracting hashtags; Extracting emoticons; Joining the dots; Summary
- Chapter 5: Aggregate Data with Apache Pig; Knowing about Pig; Extracting the top Twitter users; Extracting the top hashtags, emoticons, and sentiments; Summary
- Chapter 6: Back to the SQL Database; Linking HDFS and RDBMS with Sqoop; Exporting and importing data to a MySQL database; Summary
- Chapter 7: Big Data Architecture and Integration patterns; Streaming pattern; The Partitioning pattern; Summary
- Appendix: Installing Your Hadoop Cluster with Cloudera CDH VM; Downloading Cloudera CDH VM; Launching the VM for the first time; Basic required configuration; Summary
- Index