Big Data

This book addresses models and methods for designing and implementing Big Data Systems to support mixed and complex decision processes, giving special attention to Big Data Warehouses as a way of efficiently storing and processing batch or streaming data for structured or semi-structured analytical pro...

Bibliographic Details
Classification: Electronic Book
Main Author: Santos, Maribel Yasmina
Other Authors: Costa, Carlos
Format: Electronic eBook
Language: English
Published: Aalborg : River Publishers, 2020.
Series: River Publishers series in information science and technology.
Online Access: Full text
Table of Contents:
  • Front Cover
  • Half Title
  • Series Page
  • RIVER PUBLISHERS SERIES IN INFORMATION SCIENCE AND TECHNOLOGY
  • Title Page
  • Copyright Page
  • CONTENTS
  • List of Figures
  • List of Tables
  • The Authors
  • Acknowledgments
  • Foreword
  • Notation
  • 1. Introduction
  • 1.1. Objectives of this Book
  • 1.2. Intended Audience
  • 1.3. Book Structure
  • 2. Big Data Concepts, Techniques, and Technologies
  • 2.1. Big Data Relevance
  • 2.2. Big Data Characteristics
  • 2.3. Big Data Challenges
  • 2.3.1. Big Data General Dilemmas
  • 2.3.2. Challenges in the Big Data Life Cycle
  • 2.3.3. Big Data in Secure, Private, and Monitored Environments
  • 2.3.4. Organizational Change
  • 2.4. Techniques for Big Data Solutions
  • 2.4.1. Big Data Life Cycle and Requirements
  • 2.4.1.1. General Steps to Process and Analyze Big Data
  • 2.4.1.2. Architectural and Infrastructural Requirements
  • 2.4.2. The Lambda Architecture
  • 2.4.3. Towards Standardization: the NIST Reference Architecture
  • 2.5. Big Data Technologies
  • 2.5.1. Hadoop and Related Projects
  • 2.5.2. Landscape of Distributed SQL Engines
  • 2.5.3. Other Technologies for Big Data Analytics
  • 3. OLTP-oriented Databases for Big Data Environments
  • 3.1. NoSQL and NewSQL: an Overview
  • 3.2. NoSQL Databases
  • 3.2.1. Key-value Databases
  • 3.2.1.1. Overview
  • 3.2.1.2. Redis
  • 3.2.2. Column-oriented Databases
  • 3.2.2.1. Overview
  • 3.2.2.2. HBase
  • 3.2.2.3. From Relational Models to HBase Data Models
  • 3.2.3. Document-oriented Databases
  • 3.2.3.1. Overview
  • 3.2.3.2. MongoDB
  • 3.2.4. Graph Databases
  • 3.2.4.1. Overview
  • 3.2.4.2. Neo4j
  • 3.3. NewSQL Databases and Translytical Databases
  • 4. OLAP-oriented Databases for Big Data Environments
  • 4.1. Hive: the De Facto SQL-on-Hadoop Engine
  • 4.1.1. Data Storage Formats
  • 4.1.1.1. Text File
  • 4.1.1.2. Sequence File
  • 4.1.1.3. RCFile
  • 4.1.1.4. ORC File
  • 4.1.1.5. Avro File
  • 4.1.1.6. Parquet
  • 4.1.2. Partitions and Buckets
  • 4.2. From Dimensional Models to Tabular Models
  • 4.2.1. Primary Data Tables
  • 4.2.2. Derived Data Tables
  • 4.3. Optimizing OLAP workloads with Druid
  • 5. Design and Implementation of Big Data Warehouses
  • 5.1. Big Data Warehousing: an Overview
  • 5.2. Model of Logical Components and Data Flows
  • 5.2.1. Data Provider and Data Consumer
  • 5.2.2. Big Data Application Provider
  • 5.2.3. Big Data Framework Provider
  • 5.2.3.1. Messaging/Communications, Resource Management, and Infrastructures
  • 5.2.3.2. Processing
  • 5.2.3.3. Storage: Data Organization and Distribution
  • 5.2.4. System Orchestrator and Security, Privacy, and Management
  • 5.3. Model of Technological Infrastructure
  • 5.4. Method for Data Modeling
  • 5.4.1. Analytical Objects and their Related Concepts
  • 5.4.2. Joining, Uniting, and Materializing Analytical Objects
  • 5.4.3. Dimensional Big Data with Outsourced Descriptive Families