
Mastering Big Data Analytics with PySpark



Bibliographic Details
Main Author: Meijer, Danny (Author)
Corporate Author: Safari, an O'Reilly Media Company
Format: Electronic Video
Language: English
Published: Packt Publishing, 2020.
Edition: 1st edition.
Subjects:
Online Access: Full text (requires prior registration with an institutional email address)
Description
Summary: Effectively apply advanced analytics to large datasets using the power of PySpark. About This Video: Solve your big data problems by building powerful machine learning models with Spark and implementing them using Python. Get up and running with Spark's essential libraries and tools (such as PySpark, Spark Streaming, Spark SQL, and Spark MLlib) and learn to apply them in practical, real-world big data applications. Leverage Spark 2.x, one of the most popular big data technologies, to discover how powerful Spark machine learning is and how easily you can apply it. In Detail: PySpark helps you perform data analysis at scale; it enables you to build more scalable analyses and pipelines. This course starts by introducing you to PySpark's potential for performing effective analyses of large datasets. You'll learn how to interact with Spark from Python and connect Jupyter to Spark to provide rich data visualizations. After that, you'll delve into Spark's components and architecture. You'll learn to work with Apache Spark and perform ML tasks more smoothly than before. You'll gather and query data using Spark SQL to overcome the challenges involved in reading it. You'll use the DataFrame API to operate with Spark MLlib and learn about the Pipeline API. Finally, the course provides tips and tricks for deploying your code and tuning performance. By the end of this course, you will not only be able to perform efficient data analytics but will also have learned to use PySpark to easily analyze large datasets at scale in your organization.
Physical Description: 1 online resource (1 video file, approximately 8 hr., 7 min.)