Modern Scala projects : leverage the power of Scala for building data-driven and high-performant projects /

Scala is a multipurpose programming language, especially for analyzing large datasets without impacting the application performance. Its functional libraries can interact with databases and build scalable frameworks that create robust data pipelines. This book showcases how you can use Scala and its...

Bibliographic Details
Classification: Electronic Book
Main author: Gurusamy, Ilango (Author)
Format: Electronic eBook
Language: English
Published: Birmingham, UK : Packt Publishing, 2018.
Subjects:
Online access: Full text (requires prior registration with an institutional email address)

MARC

LEADER 00000cam a2200000Ii 4500
001 OR_on1050169895
003 OCoLC
005 20231017213018.0
006 m o d
007 cr unu||||||||
008 180829s2018 enka ob 000 0 eng d
040 |a UMI  |b eng  |e rda  |e pn  |c UMI  |d OCLCF  |d STF  |d TEFOD  |d CEF  |d G3B  |d TEFOD  |d EBLCP  |d MERUC  |d UAB  |d UKAHL  |d OCLCQ  |d N$T  |d OCLCQ  |d UX1  |d K6U  |d NLW  |d OCLCO  |d OCLCQ  |d OCLCO 
019 |a 1175628449 
020 |a 9781788625272  |q (electronic bk.) 
020 |a 1788625277  |q (electronic bk.) 
020 |z 9781788624114 
020 |a 1788624114  |q (Trade Paper) 
020 |a 9781788624114 
024 3 |a 9781788624114 
029 1 |a CHNEW  |b 001039839 
029 1 |a CHVBK  |b 559035039 
029 1 |a AU@  |b 000067288824 
035 |a (OCoLC)1050169895  |z (OCoLC)1175628449 
037 |a CL0500000988  |b Safari Books Online 
037 |a E415B6AB-04D2-4BE4-9357-9B9C1E2372B3  |b OverDrive, Inc.  |n http://www.overdrive.com 
050 4 |a QA76.73.S28 
082 0 4 |a 005.133  |2 23 
049 |a UAMI 
100 1 |a Gurusamy, Ilango,  |e author. 
245 1 0 |a Modern Scala projects :  |b leverage the power of Scala for building data-driven and high-performant projects /  |c Ilango Gurusamy. 
264 1 |a Birmingham, UK :  |b Packt Publishing,  |c 2018. 
300 |a 1 online resource :  |b illustrations 
336 |a text  |b txt  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
338 |a online resource  |b cr  |2 rdacarrier 
588 0 |a Online resource; title from title page (Safari, viewed August 27, 2018). 
504 |a Includes bibliographical references. 
505 0 |a Cover; Title Page; Copyright and Credits; Packt Upsell; Contributors; Table of Contents; Preface; Chapter 1: Predict the Class of a Flower from the Iris Dataset; A multivariate classification problem; Understanding multivariate; Different kinds of variables; Categorical variables; Fischer's Iris dataset; The Iris dataset represents a multiclass, multidimensional classification task; The training dataset; The mapping function; An algorithm and its mapping function; Supervised learning -- how it relates to the Iris classification task; Random Forest classification algorithm 
505 8 |a Project overview -- problem formulation; Getting started with Spark; Setting up prerequisite software; Installing Spark in standalone deploy mode; Developing a simple interactive data analysis utility; Reading a data file and deriving DataFrame out of it; Implementing the Iris pipeline; Iris pipeline implementation objectives; Step 1 -- getting the Iris dataset from the UCI Machine Learning Repository; Step 2 -- preliminary EDA; Firing up Spark shell; Loading the iris.csv file and building a DataFrame; Calculating statistics; Inspecting your SparkConf again; Calculating statistics again 
505 8 |a Step 3 -- creating an SBT project; Step 4 -- creating Scala files in SBT project; Step 5 -- preprocessing, data transformation, and DataFrame creation; DataFrame Creation; Step 6 -- creating, training, and testing data; Step 7 -- creating a Random Forest classifier; Step 8 -- training the Random Forest classifier; Step 9 -- applying the Random Forest classifier to test data; Step 10 -- evaluate Random Forest classifier; Step 11 -- running the pipeline as an SBT application; Step 12 -- packaging the application; Step 13 -- submitting the pipeline application to Spark local; Summary; Questions 
505 8 |a Chapter 2: Build a Breast Cancer Prognosis Pipeline with the Power of Spark and Scala; Breast cancer classification problem; Breast cancer dataset at a glance; Logistic regression algorithm; Salient characteristics of LR; Binary logistic regression assumptions; A fictitious dataset and LR; LR as opposed to linear regression; Formulation of a linear regression classification model; Logit function as a mathematical equation; LR function; Getting started; Setting up prerequisite software; Implementation objectives; Implementation objective 1 -- getting the breast cancer dataset 
505 8 |a Implementation objective 2 -- deriving a dataframe for EDA; Step 1 -- conducting preliminary EDA; Step 2 -- loading data and converting it to an RDD[String]; Step 3 -- splitting the resilient distributed dataset and reorganizing individual rows into an array; Step 4 -- purging the dataset of rows containing question mark characters; Step 5 -- running a count after purging the dataset of rows with questionable characters; Step 6 -- getting rid of header; Step 7 -- creating a two-column DataFrame; Step 8 -- creating the final DataFrame; Random Forest breast cancer pipeline 
520 |a Scala is a multipurpose programming language, especially for analyzing large datasets without impacting the application performance. Its functional libraries can interact with databases and build scalable frameworks that create robust data pipelines. This book showcases how you can use Scala and its constructs to meet specific project demands. 
590 |a O'Reilly  |b O'Reilly Online Learning: Academic/Public Library Edition 
650 0 |a Scala (Computer program language) 
650 0 |a Machine learning. 
650 0 |a Electronic data processing. 
650 6 |a Scala (Langage de programmation) 
650 6 |a Apprentissage automatique. 
650 7 |a Project management software.  |2 bicssc 
650 7 |a Database software.  |2 bicssc 
650 7 |a Programming & scripting languages: general.  |2 bicssc 
650 7 |a Computers  |x Desktop Applications  |x Project Management Software.  |2 bisacsh 
650 7 |a Computers  |x Desktop Applications  |x Databases.  |2 bisacsh 
650 7 |a Computers  |x Programming Languages  |x Java.  |2 bisacsh 
650 7 |a Electronic data processing  |2 fast 
650 7 |a Machine learning  |2 fast 
650 7 |a Scala (Computer program language)  |2 fast 
856 4 0 |u https://learning.oreilly.com/library/view/~/9781788624114/?ar  |z Texto completo (Requiere registro previo con correo institucional) 
938 |a Askews and Holts Library Services  |b ASKH  |n BDZ0037628266 
938 |a ProQuest Ebook Central  |b EBLB  |n EBL5520893 
938 |a EBSCOhost  |b EBSC  |n 1860849 
994 |a 92  |b IZTAP