Data lakes /

The concept of a data lake is less than 10 years old, but data lakes are already widely deployed in large companies. Their goal is to deal efficiently with ever-growing volumes of heterogeneous data while meeting a variety of sophisticated user needs. However, defining and building a data lake is s...

Bibliographic Details
Classification:	Electronic Book
Other Authors:	Laurent, Anne, 1976-; Laurent, Dominique; Madera, Cédrine
Format:	Electronic eBook
Language:	English
Published:	London : Hoboken : ISTE, Ltd. ; Wiley, 2020.
Series:	Computer engineering series. Databases and big data set ; volume 2.
Subjects:	Big data. Databases. Données volumineuses.
Online Access:	Full text (requires prior registration with an institutional email)

Table of Contents:

Cover
Half-Title Page
Dedication
Title Page
Copyright Page
Contents
Preface
1. Introduction to Data Lakes: Definitions and Discussions
1.1. Introduction to data lakes
1.2. Literature review and discussion
1.3. The data lake challenges
1.4. Data lakes versus decision-making systems
1.5. Urbanization for data lakes
1.6. Data lake functionalities
1.7. Summary and concluding remarks
2. Architecture of Data Lakes
2.1. Introduction
2.2. State of the art and practice
2.2.1. Definition
2.2.2. Architecture
2.2.3. Metadata
2.2.4. Data quality
2.2.5. Schema-on-read
2.3. System architecture
2.3.1. Ingestion layer
2.3.2. Storage layer
2.3.3. Transformation layer
2.3.4. Interaction layer
2.4. Use case: the Constance system
2.4.1. System overview
2.4.2. Ingestion layer
2.4.3. Maintenance layer
2.4.4. Query layer
2.4.5. Data quality control
2.4.6. Extensibility and flexibility
2.5. Concluding remarks
3. Exploiting Software Product Lines and Formal Concept Analysis for the Design of Data Lake Architectures
3.1. Our expectations
3.2. Modeling data lake functionalities
3.3. Building the knowledge base of industrial data lakes
3.4. Our formalization approach
3.5. Applying our approach
3.6. Analysis of our first results
3.7. Concluding remarks
4. Metadata in Data Lake Ecosystems
4.1. Definitions and concepts
4.2. Classification of metadata by NISO
4.2.1. Metadata schema
4.2.2. Knowledge base and catalog
4.3. Other categories of metadata
4.3.1. Business metadata
4.3.2. Navigational integration
4.3.3. Operational metadata
4.4. Sources of metadata
4.5. Metadata classification
4.6. Why metadata are needed
4.6.1. Selection of information (re)sources
4.6.2. Organization of information resources
4.6.3. Interoperability and integration
4.6.4. Unique digital identification
4.6.5. Data archiving and preservation
4.7. Business value of metadata
4.8. Metadata architecture
4.8.1. Architecture scenario 1: point-to-point metadata architecture
4.8.2. Architecture scenario 2: hub and spoke metadata architecture
4.8.3. Architecture scenario 3: tool of record metadata architecture
4.8.4. Architecture scenario 4: hybrid metadata architecture
4.8.5. Architecture scenario 5: federated metadata architecture
4.9. Metadata management
4.10. Metadata and data lakes
4.10.1. Application and workload layer
4.10.2. Data layer
4.10.3. System layer
4.10.4. Metadata types
4.11. Metadata management in data lakes
4.11.1. Metadata directory
4.11.2. Metadata storage
4.11.3. Metadata discovery
4.11.4. Metadata lineage
4.11.5. Metadata querying
4.11.6. Data source selection
4.12. Metadata and master data management
4.13. Conclusion
5. A Use Case of Data Lake Metadata Management
5.1. Context