
Trino : the definitive guide : SQL at any scale, on any storage, in any environment

Perform fast interactive analytics against different data sources using the Trino high-performance distributed SQL query engine. In the second edition of this practical guide, you'll learn how to conduct analytics on data where it lives, whether it's a data lake using Hive, a modern lakeho...


Bibliographic Details
Classification: Electronic Book
Main Author: Fuller, Matt (Writer on computer program languages)
Other Authors: Moser, Manfred (Software engineer), Traverso, Martin
Format: Electronic eBook
Language: English
Published: Sebastopol, California : O'Reilly Media, Inc., [2022]
Edition: 2nd edition.
Subjects:
Online Access: Full text (requires prior registration with an institutional email address)
Table of Contents:
  • Cover
  • Copyright
  • Table of Contents
  • Foreword
  • Preface
  • Conventions Used in This Book
  • Code Examples, Permissions, and Attribution
  • O'Reilly Online Learning
  • How to Contact Us
  • Acknowledgments
  • Part I. Getting Started with Trino
  • Chapter 1. Introducing Trino
  • The Problems with Big Data
  • Trino to the Rescue
  • Designed for Performance and Scale
  • SQL-on-Anything
  • Separation of Data Storage and Query Compute Resources
  • Trino Use Cases
  • One SQL Analytics Access Point
  • Access Point to Data Warehouse and Source Systems
  • Provide SQL-Based Access to Anything
  • Federated Queries
  • Semantic Layer for a Virtual Data Warehouse
  • Data Lake Query Engine
  • SQL Conversions and ETL
  • Better Insights Due to Faster Response Times
  • Big Data, Machine Learning, and Artificial Intelligence
  • Other Use Cases
  • Trino Resources
  • Website
  • Documentation
  • Community Chat
  • Source Code, License, and Version
  • Contributing
  • Book Repository
  • Iris Data Set
  • Flight Data Set
  • A Brief History of Trino
  • Conclusion
  • Chapter 2. Installing and Configuring Trino
  • Trying Trino with the Docker Container
  • Installing from the Archive File
  • Java Virtual Machine
  • Python
  • Installation
  • Configuration
  • Adding a Data Source
  • Running Trino
  • Conclusion
  • Chapter 3. Using Trino
  • Trino Command-Line Interface
  • Getting Started
  • Pagination
  • History and Completion
  • Additional Diagnostics
  • Executing Queries
  • Output Formats
  • Ignoring Errors
  • Trino JDBC Driver
  • Downloading and Registering the Driver
  • Establishing a Connection to Trino
  • Trino and ODBC
  • Client Libraries
  • Trino Web UI
  • SQL with Trino
  • Concepts
  • First Examples
  • Conclusion
  • Part II. Diving Deeper into Trino
  • Chapter 4. Trino Architecture
  • Coordinator and Workers in a Cluster
  • Coordinator
  • Discovery Service
  • Workers
  • Connector-Based Architecture
  • Catalogs, Schemas, and Tables
  • Query Execution Model
  • Query Planning
  • Parsing and Analysis
  • Initial Query Planning
  • Optimization Rules
  • Predicate Pushdown
  • Cross Join Elimination
  • TopN
  • Partial Aggregations
  • Implementation Rules
  • Lateral Join Decorrelation
  • Semi-Join (IN) Decorrelation
  • Cost-Based Optimizer
  • The Cost Concept
  • Cost of the Join
  • Table Statistics
  • Filter Statistics
  • Table Statistics for Partitioned Tables
  • Join Enumeration
  • Broadcast Versus Distributed Joins
  • Working with Table Statistics
  • Trino ANALYZE
  • Gathering Statistics When Writing to Disk
  • Hive ANALYZE
  • Displaying Table Statistics
  • Conclusion
  • Chapter 5. Production-Ready Deployment
  • Configuration Details
  • Server Configuration
  • Logging
  • Node Configuration
  • JVM Configuration
  • Launcher
  • Cluster Installation
  • RPM Installation
  • Installation Directory Structure
  • Configuration
  • Uninstall Trino
  • Installation in the Cloud
  • Helm Chart for Kubernetes Deployment