Trino : the definitive guide : SQL at any scale, on any storage, in any environment
Perform fast interactive analytics against different data sources using the Trino high-performance distributed SQL query engine. In the second edition of this practical guide, you'll learn how to conduct analytics on data where it lives, whether it's a data lake using Hive, a modern lakeho...
Classification: | Electronic Book |
---|---|
Main Author: | |
Other Authors: | |
Format: | Electronic eBook |
Language: | English |
Published: | Sebastopol, California : O'Reilly Media, Inc., [2022] |
Edition: | 2nd edition. |
Subjects: | |
Online Access: | Full text (requires prior registration with an institutional email address) |
Table of Contents:
- Cover
- Copyright
- Table of Contents
- Foreword
- Preface
- Conventions Used in This Book
- Code Examples, Permissions, and Attribution
- O'Reilly Online Learning
- How to Contact Us
- Acknowledgments
- Part I. Getting Started with Trino
- Chapter 1. Introducing Trino
- The Problems with Big Data
- Trino to the Rescue
- Designed for Performance and Scale
- SQL-on-Anything
- Separation of Data Storage and Query Compute Resources
- Trino Use Cases
- One SQL Analytics Access Point
- Access Point to Data Warehouse and Source Systems
- Provide SQL-Based Access to Anything
- Federated Queries
- Semantic Layer for a Virtual Data Warehouse
- Data Lake Query Engine
- SQL Conversions and ETL
- Better Insights Due to Faster Response Times
- Big Data, Machine Learning, and Artificial Intelligence
- Other Use Cases
- Trino Resources
- Website
- Documentation
- Community Chat
- Source Code, License, and Version
- Contributing
- Book Repository
- Iris Data Set
- Flight Data Set
- A Brief History of Trino
- Conclusion
- Chapter 2. Installing and Configuring Trino
- Trying Trino with the Docker Container
- Installing from the Archive File
- Java Virtual Machine
- Python
- Installation
- Configuration
- Adding a Data Source
- Running Trino
- Conclusion
- Chapter 3. Using Trino
- Trino Command-Line Interface
- Getting Started
- Pagination
- History and Completion
- Additional Diagnostics
- Executing Queries
- Output Formats
- Ignoring Errors
- Trino JDBC Driver
- Downloading and Registering the Driver
- Establishing a Connection to Trino
- Trino and ODBC
- Client Libraries
- Trino Web UI
- SQL with Trino
- Concepts
- First Examples
- Conclusion
- Part II. Diving Deeper into Trino
- Chapter 4. Trino Architecture
- Coordinator and Workers in a Cluster
- Coordinator
- Discovery Service
- Workers
- Connector-Based Architecture
- Catalogs, Schemas, and Tables
- Query Execution Model
- Query Planning
- Parsing and Analysis
- Initial Query Planning
- Optimization Rules
- Predicate Pushdown
- Cross Join Elimination
- TopN
- Partial Aggregations
- Implementation Rules
- Lateral Join Decorrelation
- Semi-Join (IN) Decorrelation
- Cost-Based Optimizer
- The Cost Concept
- Cost of the Join
- Table Statistics
- Filter Statistics
- Table Statistics for Partitioned Tables
- Join Enumeration
- Broadcast Versus Distributed Joins
- Working with Table Statistics
- Trino ANALYZE
- Gathering Statistics When Writing to Disk
- Hive ANALYZE
- Displaying Table Statistics
- Conclusion
- Chapter 5. Production-Ready Deployment
- Configuration Details
- Server Configuration
- Logging
- Node Configuration
- JVM Configuration
- Launcher
- Cluster Installation
- RPM Installation
- Installation Directory Structure
- Configuration
- Uninstall Trino
- Installation in the Cloud
- Helm Chart for Kubernetes Deployment