Data lakes

The concept of a data lake is less than 10 years old, but data lakes are already widely deployed within large companies. Their goal is to deal efficiently with ever-growing volumes of heterogeneous data while also meeting varied and sophisticated user needs. However, defining and building a data lake is s...

Bibliographic Details
Classification: Electronic Book
Other Authors: Laurent, Anne, 1976-, Laurent, Dominique, Madera, Cédrine
Format: Electronic eBook
Language: English
Published: London : Hoboken : ISTE, Ltd. ; Wiley, 2020.
Series: Computer engineering series. Databases and big data set ; volume 2.
Subjects: Big data; Databases
Online access: Full text (requires prior registration with an institutional email address)

MARC

LEADER 00000cam a2200000 a 4500
001 OR_on1151184484
003 OCoLC
005 20231017213018.0
006 m o d
007 cr un|---aucuu
008 200418s2020 enk ob 001 0 eng d
040 |a EBLCP  |b eng  |e pn  |c EBLCP  |d DG1  |d OCLCO  |d EBLCP  |d UKAHL  |d OCLCF  |d OCLCQ  |d S2H  |d TOH  |d N$T  |d K6U  |d OCLCO  |d OCLCQ  |d SFB  |d OCLCQ  |d OCLCO 
020 |a 9781119720430  |q (electronic bk. ;  |q oBook) 
020 |a 1119720435  |q (electronic bk. ;  |q oBook) 
020 |a 1119720427 
020 |a 9781119720423  |q (electronic bk.) 
029 1 |a AU@  |b 000067253887 
035 |a (OCoLC)1151184484 
050 4 |a QA76.9.B45 
082 0 4 |a 005.7  |2 23 
049 |a UAMI 
245 0 0 |a Data lakes /  |c edited by Anne Laurent, Dominique Laurent, Cédrine Madera. 
260 |a London :  |b ISTE, Ltd. ;  |a Hoboken :  |b Wiley,  |c 2020. 
300 |a 1 online resource (249 pages) 
336 |a text  |b txt  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
338 |a online resource  |b cr  |2 rdacarrier 
490 1 |a Computer engineering series, databases and big data set ;  |v volume 2 
588 0 |a Print version record. 
505 0 |a Cover -- Half-Title Page -- Dedication -- Title Page -- Copyright Page -- Contents -- Preface -- 1. Introduction to Data Lakes: Definitions and Discussions -- 1.1. Introduction to data lakes -- 1.2. Literature review and discussion -- 1.3. The data lake challenges -- 1.4. Data lakes versus decision-making systems -- 1.5. Urbanization for data lakes -- 1.6. Data lake functionalities -- 1.7. Summary and concluding remarks -- 2. Architecture of Data Lakes -- 2.1. Introduction -- 2.2. State of the art and practice -- 2.2.1. Definition -- 2.2.2. Architecture -- 2.2.3. Metadata 
505 8 |a 2.2.4. Data quality -- 2.2.5. Schema-on-read -- 2.3. System architecture -- 2.3.1. Ingestion layer -- 2.3.2. Storage layer -- 2.3.3. Transformation layer -- 2.3.4. Interaction layer -- 2.4. Use case: the Constance system -- 2.4.1. System overview -- 2.4.2. Ingestion layer -- 2.4.3. Maintenance layer -- 2.4.4. Query layer -- 2.4.5. Data quality control -- 2.4.6. Extensibility and flexibility -- 2.5. Concluding remarks -- 3. Exploiting Software Product Lines and Formal Concept Analysis for the Design of Data Lake Architectures -- 3.1. Our expectations -- 3.2. Modeling data lake functionalities 
505 8 |a 3.3. Building the knowledge base of industrial data lakes -- 3.4. Our formalization approach -- 3.5. Applying our approach -- 3.6. Analysis of our first results -- 3.7. Concluding remarks -- 4. Metadata in Data Lake Ecosystems -- 4.1. Definitions and concepts -- 4.2. Classification of metadata by NISO -- 4.2.1. Metadata schema -- 4.2.2. Knowledge base and catalog -- 4.3. Other categories of metadata -- 4.3.1. Business metadata -- 4.3.2. Navigational integration -- 4.3.3. Operational metadata -- 4.4. Sources of metadata -- 4.5. Metadata classification -- 4.6. Why metadata are needed 
505 8 |a 4.6.1. Selection of information (re)sources -- 4.6.2. Organization of information resources -- 4.6.3. Interoperability and integration -- 4.6.4. Unique digital identification -- 4.6.5. Data archiving and preservation -- 4.7. Business value of metadata -- 4.8. Metadata architecture -- 4.8.1. Architecture scenario 1: point-to-point metadata architecture -- 4.8.2. Architecture scenario 2: hub and spoke metadata architecture -- 4.8.3. Architecture scenario 3: tool of record metadata architecture -- 4.8.4. Architecture scenario 4: hybrid metadata architecture 
505 8 |a 4.8.5. Architecture scenario 5: federated metadata architecture -- 4.9. Metadata management -- 4.10. Metadata and data lakes -- 4.10.1. Application and workload layer -- 4.10.2. Data layer -- 4.10.3. System layer -- 4.10.4. Metadata types -- 4.11. Metadata management in data lakes -- 4.11.1. Metadata directory -- 4.11.2. Metadata storage -- 4.11.3. Metadata discovery -- 4.11.4. Metadata lineage -- 4.11.5. Metadata querying -- 4.11.6. Data source selection -- 4.12. Metadata and master data management -- 4.13. Conclusion -- 5. A Use Case of Data Lake Metadata Management -- 5.1. Context 
500 |a 5.1.1. Data lake definition 
504 |a Includes bibliographical references and index. 
520 |a The concept of a data lake is less than 10 years old, but data lakes are already widely deployed within large companies. Their goal is to deal efficiently with ever-growing volumes of heterogeneous data while also meeting varied and sophisticated user needs. However, defining and building a data lake is still a challenge, as no consensus has been reached so far. Data Lakes presents recent outcomes and trends in the field of data repositories. The main topics discussed are the data-driven architecture of a data lake; the management of metadata - supplying key information about the stored data, master data and reference data; the roles of linked data and fog computing in a data lake ecosystem; and how gravity principles apply in the context of data lakes. A variety of case studies is also presented, giving the reader practical examples of data lake management. 
590 |a O'Reilly  |b O'Reilly Online Learning: Academic/Public Library Edition 
650 0 |a Big data. 
650 0 |a Databases. 
650 6 |a Données volumineuses. 
650 7 |a Big data  |2 fast 
650 7 |a Databases  |2 fast 
700 1 |a Laurent, Anne,  |d 1976- 
700 1 |a Laurent, Dominique. 
700 1 |a Madera, Cédrine. 
776 0 8 |i Print version:  |a Laurent, Anne.  |t Data Lakes.  |d Newark : John Wiley & Sons, Incorporated, ©2020  |z 9781786305855 
830 0 |a Computer engineering series.  |p Databases and big data set ;  |v volume 2. 
856 4 0 |u https://learning.oreilly.com/library/view/~/9781786305855/?ar  |z Full text (requires prior registration with an institutional email address) 
938 |a Askews and Holts Library Services  |b ASKH  |n AH37732084 
938 |a Askews and Holts Library Services  |b ASKH  |n AH37348401 
938 |a ProQuest Ebook Central  |b EBLB  |n EBL6173691 
938 |a EBSCOhost  |b EBSC  |n 2436380 
994 |a 92  |b IZTAP