PolyBase revealed: data virtualization with SQL Server, Hadoop, Apache Spark, and beyond
Harness the power of PolyBase data virtualization software to make data from a variety of sources easily accessible through SQL queries while using the T-SQL skills you already know and have mastered. PolyBase Revealed shows you how to use the PolyBase feature of SQL Server 2019 to integrate SQL Ser...
Classification: | Electronic Book |
---|---|
Main author: | |
Format: | Electronic eBook |
Language: | English |
Published: | Berkeley, CA: Apress L.P., 2020. |
Subjects: | |
Online access: | Full text (requires prior registration with an institutional email) |
Table of Contents:
- Intro
- Table of Contents
- About the Author
- About the Technical Reviewer
- Acknowledgments
- Introduction
- Chapter 1: Installing and Configuring PolyBase
- Choose the Form of Your PolyBase
- Installing PolyBase Standalone-Windows
- Installing PolyBase Scale-Out Group
- Building a Configuration File
- Installing Without a GUI
- Installing PolyBase Standalone-Linux
- Configuring PolyBase
- Configuring a Client
- Enable PolyBase
- Mandatory Configuration
- Scale-Out Group Configuration
- Troubleshooting Common Errors
- Testing for Success
- Conclusion
- Chapter 2: Connecting to Azure Blob Storage
- Making Preparations in Azure
- Create a Storage Account
- Upload Data
- Building a Link
- Credentials
- External Data Sources
- External File Formats
- Delimited Files
- Flat File Compression
- Define an External File Format
- External Tables
- Querying External Data
- Inserting into External Tables
- PolyBase Data Insertion Considerations
- PolyBase Is Insert-Only
- Insert Only into Folders
- Conclusion
- Chapter 3: Connecting to Hadoop
- Hadoop Prerequisites
- Preparing Files in HDFS
- Gather Configuration Settings
- Configuring SQL Server
- Update PolyBase Configuration Files
- External PolyBase Objects for Hadoop
- Credentials
- External Data Sources
- External File Formats
- Delimited Files
- RCFile
- ORC
- Parquet
- External Tables
- Querying Data in Hadoop
- Row Counts with Police Incident Data
- Newlines and Quotes with Fire Incident Data
- Going Faster with Parking Violations Data
- Inserting Data into Hadoop
- Conclusion
- Chapter 4: Using Predicate Pushdown to Enhance Query Performance
- The Importance of Predicate Pushdown
- Predicate Pushdown in PolyBase
- Diving into Predicate Pushdown
- Packet Capture Without Predicate Pushdown
- Packet Capture with Predicate Pushdown
- When Predicate Pushdown Makes Sense
- Small Data: Raleigh Police Incidents
- Bigger Data: New York City Parking Violations
- Limitations in Pushdown-Eligible Predicates
- Limitations on Pushdown with Complex Filters
- MapReduce and Pushdown in Summary
- Conclusion
- Chapter 5: Common Hadoop and Blob Storage Integration Errors
- Finding the Real Logger
- PolyBase Log Files
- DMS Errors
- DMS Movement
- DWEngine Errors
- DWEngine Movement
- DWEngine Server
- DMS PolyBase
- DWEngine PolyBase
- Hadoop Log Files
- Job Tracker
- YARN Resource Manager
- JobHistory UI
- NameNode Logs
- Log Files
- Configuration Issues
- SQL Server Configuration
- Check External Resources
- Check SQL Server Configuration Files
- Hadoop-Side Configuration
- Invalid User Permissions or Missing Account
- Could Not Obtain Block
- Host File Pointing to 127.0.0.1
- Kerberos Should Be On or Off, Not Both
- PolyBase and Dockerized Data Nodes
- Data Issues
- Structural Mismatch
- Unsupported Characters or Formats
- PolyBase Data Limitations
- Curate Your Data