Essential PySpark for Scalable Data Analytics /

Get started with distributed computing using PySpark, a single unified framework to solve end-to-end data analytics at scale.

Key Features:
  • Discover how to convert huge amounts of raw data into meaningful and actionable insights
  • Use Spark's unified analytics engine for end-to-end analytics, from...

Bibliographic Details
Call Number: Libro Electrónico
Main Author: Nudurupati, Sreeram (Author)
Corporate Author: Safari, an O'Reilly Media Company
Format: Electronic eBook
Language: English
Published: Packt Publishing, 2021.
Edition: 1st edition.
Subjects:
Online Access: Full text (requires prior registration with an institutional email address)
Table of Contents:
  • Distributed Computing Primer
  • Data Ingestion
  • Data Cleansing and Integration
  • Real-time Data Analytics
  • Scalable Machine Learning with PySpark
  • Feature Engineering – Extraction, Transformation, and Selection
  • Supervised Machine Learning
  • Unsupervised Machine Learning
  • Machine Learning Life Cycle Management
  • Scaling Out Single-Node Machine Learning Using PySpark
  • Data Visualization with PySpark
  • Spark SQL Primer
  • Integrating External Tools with Spark SQL
  • The Data Lakehouse