
Data algorithms with Spark

Apache Spark's speed, ease of use, sophisticated analytics, and multilanguage support make practical knowledge of this cluster-computing framework a required skill for data engineers and data scientists. With this hands-on guide, anyone looking for an introduction to Spark will learn practical...


Bibliographic Details
Classification: Electronic Book
Main author: Parsian, Mahmoud (Author)
Format: Electronic eBook
Language: English
Published: Sebastopol, CA : O'Reilly Media, Inc., [2022]
Edition: [First edition].
Subjects:
Online access: Full text (requires prior registration with an institutional email address)

MARC

LEADER 00000cam a22000007i 4500
001 OR_on1310466096
003 OCoLC
005 20231017213018.0
006 m o d
007 cr cnu|||unuuu
008 220412s2022 caua o 001 0 eng d
040 |a ORMDA  |b eng  |e rda  |e pn  |c ORMDA  |d OCLCO  |d OCLCF  |d OCLCQ  |d OCLCO 
020 |z 9781492082385 
035 |a (OCoLC)1310466096 
037 |a 9781492082378  |b O'Reilly Media 
050 4 |a QA76.9.A43 
082 0 4 |a 005.1  |2 23 
049 |a UAMI 
100 1 |a Parsian, Mahmoud,  |e author. 
245 1 0 |a Data algorithms with Spark /  |c by Mahmoud Parsian. 
250 |a [First edition]. 
264 1 |a Sebastopol, CA :  |b O'Reilly Media, Inc.,  |c [2022] 
300 |a 1 online resource (435 pages) :  |b illustrations 
336 |a text  |b txt  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
338 |a online resource  |b cr  |2 rdacarrier 
500 |a Includes index. 
520 |a Apache Spark's speed, ease of use, sophisticated analytics, and multilanguage support make practical knowledge of this cluster-computing framework a required skill for data engineers and data scientists. With this hands-on guide, anyone looking for an introduction to Spark will learn practical algorithms and examples using PySpark. In each chapter, author Mahmoud Parsian shows you how to solve a data problem with a set of Spark transformations and algorithms. You'll learn how to tackle problems involving ETL, design patterns, machine learning algorithms, data partitioning, and genomics analysis. Each detailed recipe includes PySpark algorithms using the PySpark driver and shell script. With this book, you will: learn how to select Spark transformations for optimized solutions; explore powerful transformations and reductions including reduceByKey(), combineByKey(), and mapPartitions(); understand data partitioning for optimized queries; build and apply a model using PySpark design patterns; apply motif-finding algorithms to graph data; analyze graph data by using the GraphFrames API; apply PySpark algorithms to clinical and genomics data; learn how to use and apply feature engineering in ML algorithms; and understand and use practical and pragmatic data design patterns. 
590 |a O'Reilly  |b O'Reilly Online Learning: Academic/Public Library Edition 
630 0 0 |a SPARK (Electronic resource) 
630 0 7 |a SPARK (Electronic resource)  |2 fast 
650 0 |a Data mining. 
650 0 |a Computer programming. 
650 2 |a Data Mining 
650 6 |a Exploration de données (Informatique) 
650 6 |a Programmation (Informatique) 
650 7 |a computer programming.  |2 aat 
650 7 |a Computer programming  |2 fast 
650 7 |a Data mining  |2 fast 
856 4 0 |u https://learning.oreilly.com/library/view/~/9781492082378/?ar  |z Full text (requires prior registration with an institutional email address) 
994 |a 92  |b IZTAP