Principles of big data : preparing, sharing, and analyzing complex information.
Classification: | Electronic Book |
---|---|
Main author: | |
Format: | Electronic (eBook) |
Language: | English |
Published: | Boston: Morgan Kaufmann, 2013. |
Subjects: | |
Online access: | Full text (requires prior registration with an institutional email address) |
Table of Contents:
- 1. Providing Structure to Unstructured Data
- Background
- Machine Translation
- Autocoding
- Indexing
- Term Extraction
- 2. Identification, Deidentification, and Reidentification
- Background
- Features of an Identifier System
- Registered Unique Object Identifiers
- Really Bad Identifier Methods
- Embedding Information in an Identifier: Not Recommended
- One-Way Hashes
- Use Case: Hospital Registration
- Deidentification
- Data Scrubbing
- Reidentification
- Lessons Learned
- 3. Ontologies and Semantics
- Background
- Classifications, the Simplest of Ontologies
- Ontologies, Classes with Multiple Parents
- Choosing a Class Model
- Introduction to Resource Description Framework Schema
- Common Pitfalls in Ontology Development
- 4. Introspection
- Background
- Knowledge of Self
- eXtensible Markup Language
- Introduction to Meaning
- Namespaces and the Aggregation of Meaningful Assertions
- Resource Description Framework Triples
- Reflection
- Use Case: Trusted Time Stamp
- Summary
- 5. Data Integration and Software Interoperability
- Background
- Committee to Survey Standards
- Standard Trajectory
- Specifications and Standards
- Versioning
- Compliance Issues
- Interfaces to Big Data Resources
- 6. Immutability and Immortality
- Background
- Immutability and Identifiers
- Data Objects
- Legacy Data
- Data Born from Data
- Reconciling Identifiers across Institutions
- Zero-Knowledge Reconciliation
- Curator's Burden
- 7. Measurement
- Background
- Counting
- Gene Counting
- Dealing with Negations
- Understanding Your Control
- Practical Significance of Measurements
- Obsessive-Compulsive Disorder: The Mark of a Great Data Manager
- 8. Simple but Powerful Big Data Techniques
- Background
- Look at the Data
- Data Range
- Denominator
- Frequency Distributions
- Mean and Standard Deviation
- Estimation Only Analyses
- Use Case: Watching Data Trends with Google Ngrams
- Use Case: Estimating Movie Preferences
- 9. Analysis
- Background
- Analytic Tasks
- Clustering, Classifying, Recommending, and Modeling
- Data Reduction
- Normalizing and Adjusting Data
- Big Data Software: Speed and Scalability
- Find Relationships, Not Similarities
- 10. Special Considerations in Big Data Analysis
- Background
- Theory in Search of Data
- Data in Search of a Theory
- Overfitting
- Bigness Bias
- Too Much Data
- Fixing Data
- Data Subsets in Big Data: Neither Additive nor Transitive
- Additional Big Data Pitfalls
- 11. Stepwise Approach to Big Data Analysis
- Background
- Step 1 Question Is Formulated
- Step 2 Resource Evaluation
- Step 3 Question Is Reformulated
- Step 4 Query Output Adequacy
- Step 5 Data Description
- Step 6 Data Reduction
- Step 7 Algorithms Are Selected, If Absolutely Necessary
- Step 8 Results Are Reviewed and Conclusions Are Asserted
- Step 9 Conclusions Are Examined and Subjected to Validation
- 12. Failure
- Background
- Failure Is Common
- Failed Standards
- Complexity
- When Does Complexity Help?
- When Redundancy Fails
- Save Money; Don't Protect Harmless Information
- After Failure
- Use Case: Cancer Biomedical Informatics Grid, a Bridge Too Far
- 13. Legalities
- Background
- Responsibility for the Accuracy and Legitimacy of Contained Data
- Rights to Create, Use, and Share the Resource
- Copyright and Patent Infringements Incurred by Using Standards
- Protections for Individuals
- Consent
- Unconsented Data
- Good Policies Are a Good Policy
- Use Case: The Havasupai Story
- 14. Societal Issues
- Background
- How Big Data Is Perceived
- Necessity of Data Sharing, Even When It Seems Irrelevant
- Reducing Costs and Increasing Productivity with Big Data
- Public Mistrust
- Saving Us from Ourselves
- Hubris and Hyperbole
- 15. Future
- Background
- Last Words