
Machine Learning with Apache Spark Quick Start Guide : Uncover Patterns, Derive Actionable Insights, and Learn from Big Data Using MLlib.

Chapter 3: Artificial Intelligence and Machine Learning; Artificial intelligence; Machine learning; Supervised learning; Unsupervised learning; Reinforced learning; Deep learning; Natural neuron; Artificial neuron; Weights; Activation function; Heaviside step function; Sigmoid function; Hyperbolic t...


Bibliographic Details
Classification: Electronic Book
Main author: Quddus, Jillur
Format: Electronic eBook
Language: English
Published: Birmingham : Packt Publishing Ltd, 2018.
Subjects:
Online access: Full text
Table of Contents:
  • Cover; Title Page; Copyright and Credits; Dedication; About Packt; Contributors; Table of Contents; Preface; Chapter 1: The Big Data Ecosystem; A brief history of data; Vertical scaling; Master/slave architecture; Sharding; Data processing and analysis; Data becomes big; Big data ecosystem; Horizontal scaling; Distributed systems; Distributed data stores; Distributed filesystems; Distributed databases; NoSQL databases; Document databases; Columnar databases; Key-value databases; Graph databases; CAP theorem; Distributed search engines; Distributed processing; MapReduce; Apache Spark
  • RDDs, DataFrames, and datasets; RDDs; DataFrames; Datasets; Jobs, stages, and tasks; Job; Stage; Tasks; Distributed messaging; Distributed streaming; Distributed ledgers; Artificial intelligence and machine learning; Cloud computing platforms; Data insights platform; Reference logical architecture; Data sources layer; Ingestion layer; Persistent data storage layer; Data processing layer; Serving data storage layer; Data intelligence layer; Unified access layer; Data insights and reporting layer; Platform governance, management, and administration; Open source implementation; Summary
  • Chapter 2: Setting Up a Local Development Environment; CentOS Linux 7 virtual machine; Java SE Development Kit 8; Scala 2.11; Anaconda 5 with Python 3; Basic conda commands; Additional Python packages; Jupyter Notebook; Starting Jupyter Notebook; Troubleshooting Jupyter Notebook; Apache Spark 2.3; Spark binaries; Local working directories; Spark configuration; Spark properties; Environmental variables; Standalone master server; Spark worker node; PySpark and Jupyter Notebook; Apache Kafka 2.0; Kafka binaries; Local working directories; Kafka configuration; Start the Kafka server; Testing Kafka
  • Univariate linear regression; Residuals; Root mean square error; R-squared; Univariate linear regression in Apache Spark; Multivariate linear regression; Correlation; Multivariate linear regression in Apache Spark; Logistic regression; Threshold value; Confusion matrix; Receiver operator characteristic curve; Area under the ROC curve; Case study: predicting breast cancer; Classification and Regression Trees; Case study: predicting political affiliation; Random forests; K-Fold cross validation; Summary; Chapter 5: Unsupervised Learning Using Apache Spark; Clustering; Euclidean distance