
Mastering Spark for data science /

"Master the techniques and sophisticated analytics used to construct Spark-based solutions that scale to deliver production-grade data science products."

Bibliographic Details
Classification:	Electronic Book
Main Author:	Morgan, Andrew
Other Authors:	Amend, Antoine; George, David; Hallett, Matthew
Format:	Electronic (eBook)
Language:	English
Published:	Birmingham, UK : Packt Publishing Ltd., 2017.
Subjects:	Spark (Electronic resource : Apache Software Foundation); Data mining; Machine learning; Big data; Exploration de données (Informatique); Apprentissage automatique; Données volumineuses; COMPUTERS > Databases > Data Mining; Big data; Data mining; Machine learning
Online Access:	Full text

MARC


LEADER	00000cam a2200000Ii 4500
001	EBSCO_ocn981985497
003	OCoLC
005	20231017213018.0
006	m o d
007	cr \|n\|\|\|\|\|\|\|\|\|
008	170407s2017 enk o 001 0 eng d
040			\|a IDEBK \|b eng \|e pn \|c IDEBK \|d YDX \|d MERUC \|d N$T \|d EBLCP \|d OCLCF \|d COO \|d IDEBK \|d OCLCQ \|d OCLCO \|d OCLCQ \|d OCLCO \|d LVT \|d UKAHL \|d OCLCQ \|d OCLCO \|d OCLCQ \|d OCLCO
019			\|a 981591538 \|a 981844508 \|a 982010852
020			\|a 1785888285 \|q (electronic bk.)
020			\|a 9781785888281 \|q (electronic bk.)
020			\|z 1785882147
029	1		\|a AU@ \|b 000066231506
029	1		\|a CHNEW \|b 000953095
029	1		\|a CHVBK \|b 484641344
029	1		\|a AU@ \|b 000067024705
029	1		\|a AU@ \|b 000067103185
035			\|a (OCoLC)981985497 \|z (OCoLC)981591538 \|z (OCoLC)981844508 \|z (OCoLC)982010852
037			\|a 1003903 \|b MIL
050		4	\|a QA76.9.D343
072		7	\|a COM \|x 021030 \|2 bisacsh
082	0	4	\|a 005.75/85 \|2 23
049			\|a UAMI
100	1		\|a Morgan, Andrew.
245	1	0	\|a Mastering Spark for data science / \|c Andrew Morgan, Antoine Amend, Matthew Hallett, David George ; foreword by Harry Powell.
260			\|a Birmingham, UK : \|b Packt Publishing Ltd., \|c 2017.
300			\|a 1 online resource
336			\|a text \|b txt \|2 rdacontent
337			\|a computer \|b c \|2 rdamedia
338			\|a online resource \|b cr \|2 rdacarrier
500			\|a Includes index.
520			\|a "Master the techniques and sophisticated analytics used to construct Spark-based solutions that scale to deliver production-grade data science products."
588	0		\|a Print version record.
505	0		\|a Cover; Copyright; Credits; Foreword; About the Authors; About the Reviewer; www.PacktPub.com; Customer Feedback; Table of Contents; Preface; Chapter 1: The Big Data Science Ecosystem; Introducing the Big Data ecosystem; Data management; Data management responsibilities; The right tool for the job; Overall architecture; Data Ingestion; Data Lake; Reliable storage; Scalable data processing capability; Data science platform; Data Access; Data technologies; The role of Apache Spark; Companion tools; Apache HDFS; Advantages; Disadvantages; Installation; Amazon S3; Advantages; Disadvantages.
505	8		\|a Installation; Apache Kafka; Advantages; Disadvantages; Installation; Apache Parquet; Advantages; Disadvantages; Installation; Apache Avro; Advantages; Disadvantages; Installation; Apache NiFi; Advantages; Disadvantages; Installation; Apache YARN; Advantages; Disadvantages; Installation; Apache Lucene; Advantages; Disadvantages; Installation; Kibana; Advantages; Disadvantages; Installation; Elasticsearch; Advantages; Disadvantages; Installation; Accumulo; Advantages; Disadvantages; Installation; Summary; Chapter 2: Data Acquisition; Data pipelines; Universal ingestion framework.
505	8		\|a Introducing the GDELT news stream; Discovering GDELT in real-time; Our first GDELT feed; Improving with publish and subscribe; Content registry; Choices and more choices; Going with the flow; Metadata model; Kibana dashboard; Quality assurance; Example 1 -- Basic quality checking, no contending users; Example 2 -- Advanced quality checking, no contending users; Example 3 -- Basic quality checking, 50% utility due to contending users; Summary; Chapter 3: Input Formats and Schema; A structured life is a good life; GDELT dimensional modeling.
505	8		\|a GDELT model; First look at the data; Core global knowledge graph model; Hidden complexity; Denormalized models; Challenges with flattened data; Issue 1 -- Loss of contextual information; Issue 2: Re-establishing dimensions; Issue 3: Including reference data; Loading your data; Schema agility; Reality check; GKG ELT; Position matters; Avro; Spark-Avro method; Pedagogical method; When to perform Avro transformation; Parquet; Summary; Chapter 4: Exploratory Data Analysis; The problem, principles and planning; Understanding the EDA problem; Design principles; General plan of exploration; Preparation.
505	8		\|a Introducing mask based data profiling; Introducing character class masks; Building a mask based profiler; Setting up Apache Zeppelin; Constructing a reusable notebook; Exploring GDELT; GDELT GKG datasets; The files; Special collections; Reference data; Exploring the GKG v2.1; The Translingual files; A configurable GCAM time series EDA; Plot.ly charting on Apache Zeppelin; Exploring translation sourced GCAM sentiment with plot.ly; Concluding remarks; A configurable GCAM Spatio-Temporal EDA; Introducing GeoGCAM; Does our spatial pivot work?; Summary; Chapter 5: Spark for Geographic Analysis.
590			\|a eBooks on EBSCOhost \|b EBSCO eBook Subscription Academic Collection - Worldwide
630	0	0	\|a Spark (Electronic resource : Apache Software Foundation)
630	0	7	\|a Spark (Electronic resource : Apache Software Foundation) \|2 fast
650		0	\|a Data mining.
650		0	\|a Machine learning.
650		0	\|a Big data.
650		6	\|a Exploration de données (Informatique)
650		6	\|a Apprentissage automatique.
650		6	\|a Données volumineuses.
650		7	\|a COMPUTERS \|x Databases \|x Data Mining. \|2 bisacsh
650		7	\|a Big data \|2 fast
650		7	\|a Data mining \|2 fast
650		7	\|a Machine learning \|2 fast
700	1		\|a Amend, Antoine.
700	1		\|a George, David.
700	1		\|a Hallett, Matthew.
776	0	8	\|i Print version: \|a Morgan, Andrew. \|t Mastering Spark for Data Science. \|d Birmingham : Packt Publishing, ©2017
856	4	0	\|u https://ebsco.uam.elogim.com/login.aspx?direct=true&scope=site&db=nlebk&AN=1495812 \|z Full text
938			\|a Askews and Holts Library Services \|b ASKH \|n AH30656483
938			\|a EBL - Ebook Library \|b EBLB \|n EBL4833930
938			\|a EBSCOhost \|b EBSC \|n 1495812
938			\|a ProQuest MyiLibrary Digital eBook Collection \|b IDEB \|n cis34561627
938			\|a YBP Library Services \|b YANK \|n 13953597
994			\|a 92 \|b IZTAP
