Big Data
This book addresses models and methods for designing and implementing Big Data Systems to support mixed and complex decision processes, giving special attention to Big Data Warehouses as a way of efficiently storing and processing batch or streaming data for structured or semi-structured analytical pro...
Classification: | Electronic Book |
---|---|
Main Author: | |
Other Authors: | |
Format: | Electronic eBook |
Language: | English |
Published: | Aalborg : River Publishers, 2020. |
Series: | River Publishers series in information science and technology. |
Subjects: | |
Online Access: | Full text |
Table of Contents:
- Front Cover
- Half Title
- Series Page
- RIVER PUBLISHERS SERIES IN INFORMATION SCIENCE AND TECHNOLOGY
- Title Page
- Copyright Page
- CONTENTS
- List of Figures
- List of Tables
- The Authors
- Acknowledgments
- Foreword
- Notation
- 1. Introduction
- 1.1. Objectives of this Book
- 1.2. Intended Audience
- 1.3. Book Structure
- 2. Big Data Concepts, Techniques, and Technologies
- 2.1. Big Data Relevance
- 2.2. Big Data Characteristics
- 2.3. Big Data Challenges
- 2.3.1. Big Data General Dilemmas
- 2.3.2. Challenges in the Big Data Life Cycle
- 2.3.3. Big Data in Secure, Private, and Monitored Environments
- 2.3.4. Organizational Change
- 2.4. Techniques for Big Data Solutions
- 2.4.1. Big Data Life Cycle and Requirements
- 2.4.1.1. General Steps to Process and Analyze Big Data
- 2.4.1.2. Architectural and Infrastructural Requirements
- 2.4.2. The Lambda Architecture
- 2.4.3. Towards Standardization: the NIST Reference Architecture
- 2.5. Big Data Technologies
- 2.5.1. Hadoop and Related Projects
- 2.5.2. Landscape of Distributed SQL Engines
- 2.5.3. Other Technologies for Big Data Analytics
- 3. OLTP-oriented Databases for Big Data Environments
- 3.1. NoSQL and NewSQL: an Overview
- 3.2. NoSQL Databases
- 3.2.1. Key-value Databases
- 3.2.1.1. Overview
- 3.2.1.2. Redis
- 3.2.2. Column-oriented Databases
- 3.2.2.1. Overview
- 3.2.2.2. HBase
- 3.2.2.3. From Relational Models to HBase Data Models
- 3.2.3. Document-oriented Databases
- 3.2.3.1. Overview
- 3.2.3.2. MongoDB
- 3.2.4. Graph Databases
- 3.2.4.1. Overview
- 3.2.4.2. Neo4j
- 3.3. NewSQL Databases and Translytical Databases
- 4. OLAP-oriented Databases for Big Data Environments
- 4.1. Hive: the De Facto SQL-on-Hadoop Engine
- 4.1.1. Data Storage Formats
- 4.1.1.1. Text File
- 4.1.1.2. Sequence File
- 4.1.1.3. RCFile
- 4.1.1.4. ORC File
- 4.1.1.5. Avro File
- 4.1.1.6. Parquet
- 4.1.2. Partitions and Buckets
- 4.2. From Dimensional Models to Tabular Models
- 4.2.1. Primary Data Tables
- 4.2.2. Derived Data Tables
- 4.3. Optimizing OLAP workloads with Druid
- 5. Design and Implementation of Big Data Warehouses
- 5.1. Big Data Warehousing: an Overview
- 5.2. Model of Logical Components and Data Flows
- 5.2.1. Data Provider and Data Consumer
- 5.2.2. Big Data Application Provider
- 5.2.3. Big Data Framework Provider
- 5.2.3.1. Messaging/Communications, Resource Management, and Infrastructures
- 5.2.3.2. Processing
- 5.2.3.3. Storage: Data Organization and Distribution
- 5.2.4. System Orchestrator and Security, Privacy, and Management
- 5.3. Model of Technological Infrastructure
- 5.4. Method for Data Modeling
- 5.4.1. Analytical Objects and their Related Concepts
- 5.4.2. Joining, Uniting, and Materializing Analytical Objects
- 5.4.3. Dimensional Big Data with Outsourced Descriptive Families