Principles and practice of big data : preparing, sharing, and analyzing complex information

Principles and Practice of Big Data: Preparing, Sharing, and Analyzing Complex Information, Second Edition updates and expands on the first edition, bringing a set of techniques and algorithms that are tailored to Big Data projects. The book stresses the point that most data analyses conducted on la...

Bibliographic Details
Classification: Electronic Book
Main Author: Berman, Jules J. (Author)
Format: Electronic eBook
Language: English
Published: London : Academic Press, [2018]
Edition: Second edition.
Subjects:
Online Access: Full text
Table of Contents:
  • Introduction
  • Providing structure to unstructured data
  • Identification, deidentification, and reidentification
  • Metadata, semantics, and triples
  • Classifications and ontologies
  • Introspection
  • Standards and data integration
  • Immutability and immortality
  • Assessing the adequacy of a big data resource
  • Measurement
  • Indispensable tips for fast and simple big data analysis
  • Finding the clues in large collections of data
  • Using random numbers to knock your big data analytic problems down to size
  • Special considerations in big data analysis
  • Big data failures and how to avoid (some of) them
  • Data reanalysis : much more important than analysis
  • Repurposing big data
  • Data sharing and data security
  • Legalities
  • Societal issues.
  • Front Cover; Principles and Practice of Big Data: Preparing, sharing, and analyzing complex information; Copyright; Other Books by Jules J. Berman; Dedication; Contents; About the Author; Author's Preface to Second Edition; Author's Preface to First Edition; References
  • Chapter 1: Introduction; Section 1.1. Definition of Big Data; Section 1.2. Big Data Versus Small Data; Section 1.3. Whence Comest Big Data?; Section 1.4. The Most Common Purpose of Big Data Is to Produce Small Data; Section 1.5. Big Data Sits at the Center of the Research Universe; Glossary; References
  • Chapter 2: Providing Structure to Unstructured Data; Section 2.1. Nearly All Data Is Unstructured and Unusable in Its Raw Form; Section 2.2. Concordances; Section 2.3. Term Extraction; Section 2.4. Indexing; Section 2.5. Autocoding; Section 2.6. Case Study: Instantly Finding the Precise Location of Any Atom in the Universe (Some Assembly Required); Section 2.7. Case Study (Advanced): A Complete Autocoder (in 12 Lines of Python Code); Section 2.8. Case Study: Concordances as Transformations of Text; Section 2.9. Case Study (Advanced): Burrows Wheeler Transform (BWT); Glossary; References
  • Chapter 3: Identification, Deidentification, and Reidentification; Section 3.1. What Are Identifiers?; Section 3.2. Difference Between an Identifier and an Identifier System; Section 3.3. Generating Unique Identifiers; Section 3.4. Really Bad Identifier Methods; Section 3.5. Registering Unique Object Identifiers; Section 3.6. Deidentification and Reidentification; Section 3.7. Case Study: Data Scrubbing; Section 3.8. Case Study (Advanced): Identifiers in Image Headers; Section 3.9. Case Study: One-Way Hashes; Glossary; References
  • Chapter 4: Metadata, Semantics, and Triples; Section 4.1. Metadata; Section 4.2. eXtensible Markup Language; Section 4.3. Semantics and Triples; Section 4.4. Namespaces; Section 4.5. Case Study: A Syntax for Triples; Section 4.6. Case Study: Dublin Core; Glossary; References
  • Chapter 5: Classifications and Ontologies; Section 5.1. It's All About Object Relationships; Section 5.2. Classifications, the Simplest of Ontologies; Section 5.3. Ontologies, Classes With Multiple Parents; Section 5.4. Choosing a Class Model; Section 5.5. Class Blending; Section 5.6. Common Pitfalls in Ontology Development; Section 5.7. Case Study: An Upper Level Ontology; Section 5.8. Case Study (Advanced): Paradoxes; Section 5.9. Case Study (Advanced): RDF Schemas and Class Properties; Section 5.10. Case Study (Advanced): Visualizing Class Relationships; Glossary; References
  • Chapter 6: Introspection; Section 6.1. Knowledge of Self; Section 6.2. Data Objects: The Essential Ingredient of Every Big Data Collection; Section 6.3. How Big Data Uses Introspection; Section 6.4. Case Study: Time Stamping Data; Section 6.5. Case Study: A Visit to the TripleStore