
Principles of big data : preparing, sharing, and analyzing complex information.

Bibliographic Details
Classification: Electronic Book
Main author: Berman, Jules J.
Format: Electronic eBook
Language: English
Published: Boston: Morgan Kaufmann, 2013.
Subjects:
Online access: Full text (requires prior registration with an institutional email address)
Table of Contents:
  • 1. Providing Structure to Unstructured Data
  • Background
  • Machine Translation
  • Autocoding
  • Indexing
  • Term Extraction
  • 2. Identification, Deidentification, and Reidentification
  • Background
  • Features of an Identifier System
  • Registered Unique Object Identifiers
  • Really Bad Identifier Methods
  • Embedding Information in an Identifier: Not Recommended
  • One-Way Hashes
  • Use Case: Hospital Registration
  • Deidentification
  • Data Scrubbing
  • Reidentification
  • Lessons Learned
  • 3. Ontologies and Semantics
  • Background
  • Classifications, the Simplest of Ontologies
  • Ontologies, Classes with Multiple Parents
  • Choosing a Class Model
  • Introduction to Resource Description Framework Schema
  • Common Pitfalls in Ontology Development
  • 4. Introspection
  • Background
  • Knowledge of Self
  • eXtensible Markup Language
  • Introduction to Meaning
  • Namespaces and the Aggregation of Meaningful Assertions
  • Resource Description Framework Triples
  • Reflection
  • Use Case: Trusted Time Stamp
  • Summary
  • 5. Data Integration and Software Interoperability
  • Background
  • Committee to Survey Standards
  • Standard Trajectory
  • Specifications and Standards
  • Versioning
  • Compliance Issues
  • Interfaces to Big Data Resources
  • 6. Immutability and Immortality
  • Background
  • Immutability and Identifiers
  • Data Objects
  • Legacy Data
  • Data Born from Data
  • Reconciling Identifiers across Institutions
  • Zero-Knowledge Reconciliation
  • Curator's Burden
  • 7. Measurement
  • Background
  • Counting
  • Gene Counting
  • Dealing with Negations
  • Understanding Your Control
  • Practical Significance of Measurements
  • Obsessive-Compulsive Disorder: The Mark of a Great Data Manager
  • 8. Simple but Powerful Big Data Techniques
  • Background
  • Look at the Data
  • Data Range
  • Denominator
  • Frequency Distributions
  • Mean and Standard Deviation
  • Estimation Only Analyses
  • Use Case: Watching Data Trends with Google Ngrams
  • Use Case: Estimating Movie Preferences
  • 9. Analysis
  • Background
  • Analytic Tasks
  • Clustering, Classifying, Recommending, and Modeling
  • Data Reduction
  • Normalizing and Adjusting Data
  • Big Data Software: Speed and Scalability
  • Find Relationships, Not Similarities
  • 10. Special Considerations in Big Data Analysis
  • Background
  • Theory in Search of Data
  • Data in Search of a Theory
  • Overfitting
  • Bigness Bias
  • Too Much Data
  • Fixing Data
  • Data Subsets in Big Data: Neither Additive nor Transitive
  • Additional Big Data Pitfalls
  • 11. Stepwise Approach to Big Data Analysis
  • Background
  • Step 1 Question Is Formulated
  • Step 2 Resource Evaluation
  • Step 3 Question Is Reformulated
  • Step 4 Query Output Adequacy
  • Step 5 Data Description
  • Step 6 Data Reduction
  • Step 7 Algorithms Are Selected, If Absolutely Necessary
  • Step 8 Results Are Reviewed and Conclusions Are Asserted
  • Step 9 Conclusions Are Examined and Subjected to Validation
  • 12. Failure
  • Background
  • Failure Is Common
  • Failed Standards
  • Complexity
  • When Does Complexity Help?
  • When Redundancy Fails
  • Save Money; Don't Protect Harmless Information
  • After Failure
  • Use Case: Cancer Biomedical Informatics Grid, a Bridge Too Far
  • 13. Legalities
  • Background
  • Responsibility for the Accuracy and Legitimacy of Contained Data
  • Rights to Create, Use, and Share the Resource
  • Copyright and Patent Infringements Incurred by Using Standards
  • Protections for Individuals
  • Consent
  • Unconsented Data
  • Good Policies Are a Good Policy
  • Use Case: The Havasupai Story
  • 14. Societal Issues
  • Background
  • How Big Data Is Perceived
  • Necessity of Data Sharing, Even When It Seems Irrelevant
  • Reducing Costs and Increasing Productivity with Big Data
  • Public Mistrust
  • Saving Us from Ourselves
  • Hubris and Hyperbole
  • 15. Future
  • Background
  • Last Words