Mastering Large Datasets with Python
Main author: | |
---|---|
Corporate author: | |
Format: | Electronic eBook |
Language: | English |
Published: | Manning Publications, 2020. |
Edition: | 1st edition. |
Subjects: | |
Online access: | Full text (requires prior registration with an institutional email address) |
Summary: | Mastering Large Datasets with Python teaches you to write code that can handle datasets of any size. You'll start with laptop-sized datasets that teach you to parallelize data analysis by breaking large tasks into smaller ones that can run simultaneously. You'll then scale those same programs to industrial-sized datasets on a cluster of cloud servers. With the map and reduce paradigm firmly in place, you'll explore tools like Hadoop and PySpark to efficiently process massive distributed datasets, speed up decision-making with machine learning, and simplify your data storage with AWS S3. |
---|---|
Physical description: | 1 online resource (312 pages) |