Apache Spark 3 for Data Engineering and Analytics with Python

Master Python and PySpark 3.0.1 for Data Engineering / Analytics (Databricks). About This Video: Apply PySpark and SQL concepts to analyze data. Understand the Databricks interface and use Spark on Databricks. Learn Spark transformations and actions using the RDD (Resilient Distributed Datasets) API. In...

Bibliographic Details
Main Author: Mngadi, David (Author)
Corporate Author: Safari, an O'Reilly Media Company
Format: Electronic Video
Language: English
Published: Packt Publishing, 2021.
Edition: 1st edition.
Online Access: Full text (requires prior registration with an institutional email address)

MARC

LEADER 00000cgm a22000007a 4500
001 OR_on1268278733
003 OCoLC
005 20231017213018.0
006 m o c
007 cr cnu||||||||
007 vz czazuu
008 020921s2021 xx --- o vleng d
040 |a AU@  |b eng  |c AU@  |d NZCPL  |d OCLCF  |d OCLCO  |d OCLCQ 
019 |a 1277196384  |a 1305857731 
020 |z 9781803244303 
024 8 |a 9781803244303 
029 0 |a AU@  |b 000069849391 
035 |a (OCoLC)1268278733  |z (OCoLC)1277196384  |z (OCoLC)1305857731 
049 |a UAMI 
100 1 |a Mngadi, David,  |e author. 
245 1 0 |a Apache Spark 3 for Data Engineering and Analytics with Python /  |c Mngadi, David. 
250 |a 1st edition. 
264 1 |b Packt Publishing,  |c 2021. 
300 |a 1 online resource (1 video file, approximately 8 hr., 31 min.) 
336 |a two-dimensional moving image  |b tdi  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
338 |a online resource  |b cr  |2 rdacarrier 
347 |a video file 
520 |a Master Python and PySpark 3.0.1 for Data Engineering / Analytics (Databricks). About This Video: Apply PySpark and SQL concepts to analyze data. Understand the Databricks interface and use Spark on Databricks. Learn Spark transformations and actions using the RDD (Resilient Distributed Datasets) API. In Detail: Apache Spark 3 is an open-source distributed engine for querying and processing data. This course will provide you with a detailed understanding of PySpark and its stack. This course is carefully developed and designed to guide you through the process of data analytics using Python Spark. The author uses an interactive approach in explaining key concepts of PySpark such as the Spark architecture, Spark execution, transformations and actions using the structured API, and much more. You will be able to leverage the power of Python, Java, and SQL and put it to use in the Spark ecosystem. You will start by getting a firm understanding of the Apache Spark architecture and how to set up a Python environment for Spark, followed by the techniques for collecting, cleaning, and visualizing data by creating dashboards in Databricks. You will learn how to use SQL to interact with DataFrames. The author provides an in-depth review of RDDs and contrasts them with DataFrames. There are multiple problem challenges provided at intervals in the course so that you get a firm grasp of the concepts taught in the course. Who this book is for: This course is designed for Python developers who wish to learn how to use the language for data engineering and analytics with PySpark; aspiring data engineering and analytics professionals; data scientists/analysts who wish to learn an analytical processing strategy that can be deployed over a big data cluster; and data managers who want to gain a deeper understanding of managing data over a cluster. 
542 |f Packt Publishing  |g 2021 
550 |a Made available through: Safari, an O'Reilly Media Company. 
588 |a Online resource; Title from title screen (viewed August 30, 2021) 
590 |a O'Reilly  |b O'Reilly Online Learning: Academic/Public Library Edition 
710 2 |a Safari, an O'Reilly Media Company. 
856 4 0 |u https://learning.oreilly.com/videos/~/9781803244303/?ar  |z Full text (requires prior registration with an institutional email address) 
936 |a BATCHLOAD 
994 |a 92  |b IZTAP