Principles and practice of big data : preparing, sharing, and analyzing complex information /
Principles and Practice of Big Data: Preparing, Sharing, and Analyzing Complex Information, Second Edition updates and expands on the first edition, bringing a set of techniques and algorithms that are tailored to Big Data projects. The book stresses the point that most data analyses conducted on la...
Classification: | Electronic Book |
---|---|
Main author: | |
Format: | Electronic eBook |
Language: | English |
Published: | London : Academic Press, [2018] |
Edition: | Second edition. |
Subjects: | |
Online access: | Full text (requires prior registration with an institutional email address) |
Table of Contents:
- Introduction
- Providing structure to unstructured data
- Identification, deidentification, and reidentification
- Metadata, semantics, and triples
- Classifications and ontologies
- Introspection
- Standards and data integration
- Immutability and immortality
- Assessing the adequacy of a big data resource
- Measurement
- Indispensable tips for fast and simple big data analysis
- Finding the clues in large collections of data
- Using random numbers to knock your big data analytic problems down to size
- Special considerations in big data analysis
- Big data failures and how to avoid (some of) them
- Data reanalysis : much more important than analysis
- Repurposing big data
- Data sharing and data security
- Legalities
- Societal issues.
- Front Cover; Principles and Practice of Big Data: Preparing, sharing, and analyzing complex information; Copyright; Other Books by Jules J. Berman; Dedication; Contents; About the Author; Author's Preface to Second Edition; Author's Preface to First Edition; References
- Chapter 1: Introduction; Section 1.1. Definition of Big Data; Section 1.2. Big Data Versus Small Data; Section 1.3. Whence Comest Big Data?; Section 1.4. The Most Common Purpose of Big Data Is to Produce Small Data; Section 1.5. Big Data Sits at the Center of the Research Universe; Glossary; References
- Chapter 2: Providing Structure to Unstructured Data; Section 2.1. Nearly All Data Is Unstructured and Unusable in Its Raw Form; Section 2.2. Concordances; Section 2.3. Term Extraction; Section 2.4. Indexing; Section 2.5. Autocoding; Section 2.6. Case Study: Instantly Finding the Precise Location of Any Atom in the Universe (Some Assembly Required); Section 2.7. Case Study (Advanced): A Complete Autocoder (in 12 Lines of Python Code); Section 2.8. Case Study: Concordances as Transformations of Text; Section 2.9. Case Study (Advanced): Burrows Wheeler Transform (BWT); Glossary; References
- Chapter 3: Identification, Deidentification, and Reidentification; Section 3.1. What Are Identifiers?; Section 3.2. Difference Between an Identifier and an Identifier System; Section 3.3. Generating Unique Identifiers; Section 3.4. Really Bad Identifier Methods; Section 3.5. Registering Unique Object Identifiers; Section 3.6. Deidentification and Reidentification; Section 3.7. Case Study: Data Scrubbing; Section 3.8. Case Study (Advanced): Identifiers in Image Headers; Section 3.9. Case Study: One-Way Hashes; Glossary; References
- Chapter 4: Metadata, Semantics, and Triples; Section 4.1. Metadata; Section 4.2. eXtensible Markup Language; Section 4.3. Semantics and Triples; Section 4.4. Namespaces; Section 4.5. Case Study: A Syntax for Triples; Section 4.6. Case Study: Dublin Core; Glossary; References
- Chapter 5: Classifications and Ontologies; Section 5.1. It's All About Object Relationships; Section 5.2. Classifications, the Simplest of Ontologies; Section 5.3. Ontologies, Classes With Multiple Parents; Section 5.4. Choosing a Class Model; Section 5.5. Class Blending; Section 5.6. Common Pitfalls in Ontology Development; Section 5.7. Case Study: An Upper Level Ontology; Section 5.8. Case Study (Advanced): Paradoxes; Section 5.9. Case Study (Advanced): RDF Schemas and Class Properties; Section 5.10. Case Study (Advanced): Visualizing Class Relationships; Glossary; References
- Chapter 6: Introspection; Section 6.1. Knowledge of Self; Section 6.2. Data Objects: The Essential Ingredient of Every Big Data Collection; Section 6.3. How Big Data Uses Introspection; Section 6.4. Case Study: Time Stamping Data; Section 6.5. Case Study: A Visit to the TripleStore