|
|
|
|
LEADER |
00000cam a2200000Ii 4500 |
001 |
EBSCO_ocn981985497 |
003 |
OCoLC |
005 |
20231017213018.0 |
006 |
m o d |
007 |
cr |n||||||||| |
008 |
170407s2017 enk o 001 0 eng d |
040 |
|
|
|a IDEBK
|b eng
|e pn
|c IDEBK
|d YDX
|d MERUC
|d N$T
|d EBLCP
|d OCLCF
|d COO
|d IDEBK
|d OCLCQ
|d OCLCO
|d OCLCQ
|d OCLCO
|d LVT
|d UKAHL
|d OCLCQ
|d OCLCO
|d OCLCQ
|d OCLCO
|
019 |
|
|
|a 981591538
|a 981844508
|a 982010852
|
020 |
|
|
|a 1785888285
|q (electronic bk.)
|
020 |
|
|
|a 9781785888281
|q (electronic bk.)
|
020 |
|
|
|z 1785882147
|
029 |
1 |
|
|a AU@
|b 000066231506
|
029 |
1 |
|
|a CHNEW
|b 000953095
|
029 |
1 |
|
|a CHVBK
|b 484641344
|
029 |
1 |
|
|a AU@
|b 000067024705
|
029 |
1 |
|
|a AU@
|b 000067103185
|
035 |
|
|
|a (OCoLC)981985497
|z (OCoLC)981591538
|z (OCoLC)981844508
|z (OCoLC)982010852
|
037 |
|
|
|a 1003903
|b MIL
|
050 |
|
4 |
|a QA76.9.D343
|
072 |
|
7 |
|a COM
|x 021030
|2 bisacsh
|
082 |
0 |
4 |
|a 005.75/85
|2 23
|
049 |
|
|
|a UAMI
|
100 |
1 |
|
|a Morgan, Andrew.
|
245 |
1 |
0 |
|a Mastering Spark for data science /
|c Andrew Morgan, Antoine Amend, Matthew Hallett, David George ; foreword by Harry Powell.
|
260 |
|
|
|a Birmingham, UK :
|b Packt Publishing Ltd.,
|c 2017.
|
300 |
|
|
|a 1 online resource
|
336 |
|
|
|a text
|b txt
|2 rdacontent
|
337 |
|
|
|a computer
|b c
|2 rdamedia
|
338 |
|
|
|a online resource
|b cr
|2 rdacarrier
|
500 |
|
|
|a Includes index.
|
520 |
|
|
|a "Master the techniques and sophisticated analytics used to construct Spark-based solutions that scale to deliver production-grade data science products."
|
588 |
0 |
|
|a Print version record.
|
505 |
0 |
|
|a Cover; Copyright; Credits; Foreword; About the Authors; About the Reviewer; www.PacktPub.com; Customer Feedback; Table of Contents; Preface; Chapter 1: The Big Data Science Ecosystem; Introducing the Big Data ecosystem; Data management; Data management responsibilities; The right tool for the job; Overall architecture; Data Ingestion; Data Lake; Reliable storage; Scalable data processing capability; Data science platform; Data Access; Data technologies; The role of Apache Spark; Companion tools; Apache HDFS; Advantages; Disadvantages; Installation; Amazon S3; Advantages; Disadvantages.
|
505 |
8 |
|
|a Installation; Apache Kafka; Advantages; Disadvantages; Installation; Apache Parquet; Advantages; Disadvantages; Installation; Apache Avro; Advantages; Disadvantages; Installation; Apache NiFi; Advantages; Disadvantages; Installation; Apache YARN; Advantages; Disadvantages; Installation; Apache Lucene; Advantages; Disadvantages; Installation; Kibana; Advantages; Disadvantages; Installation; Elasticsearch; Advantages; Disadvantages; Installation; Accumulo; Advantages; Disadvantages; Installation; Summary; Chapter 2: Data Acquisition; Data pipelines; Universal ingestion framework.
|
505 |
8 |
|
|a Introducing the GDELT news stream; Discovering GDELT in real-time; Our first GDELT feed; Improving with publish and subscribe; Content registry; Choices and more choices; Going with the flow; Metadata model; Kibana dashboard; Quality assurance; Example 1 -- Basic quality checking, no contending users; Example 2 -- Advanced quality checking, no contending users; Example 3 -- Basic quality checking, 50% utility due to contending users; Summary; Chapter 3: Input Formats and Schema; A structured life is a good life; GDELT dimensional modeling.
|
505 |
8 |
|
|a GDELT model; First look at the data; Core global knowledge graph model; Hidden complexity; Denormalized models; Challenges with flattened data; Issue 1 -- Loss of contextual information; Issue 2: Re-establishing dimensions; Issue 3: Including reference data; Loading your data; Schema agility; Reality check; GKG ELT; Position matters; Avro; Spark-Avro method; Pedagogical method; When to perform Avro transformation; Parquet; Summary; Chapter 4: Exploratory Data Analysis; The problem, principles and planning; Understanding the EDA problem; Design principles; General plan of exploration; Preparation.
|
505 |
8 |
|
|a Introducing mask based data profiling; Introducing character class masks; Building a mask based profiler; Setting up Apache Zeppelin; Constructing a reusable notebook; Exploring GDELT; GDELT GKG datasets; The files; Special collections; Reference data; Exploring the GKG v2.1; The Translingual files; A configurable GCAM time series EDA; Plot.ly charting on Apache Zeppelin; Exploring translation sourced GCAM sentiment with plot.ly; Concluding remarks; A configurable GCAM Spatio-Temporal EDA; Introducing GeoGCAM; Does our spatial pivot work?; Summary; Chapter 5: Spark for Geographic Analysis.
|
590 |
|
|
|a eBooks on EBSCOhost
|b EBSCO eBook Subscription Academic Collection - Worldwide
|
630 |
0 |
0 |
|a Spark (Electronic resource : Apache Software Foundation)
|
630 |
0 |
7 |
|a Spark (Electronic resource : Apache Software Foundation)
|2 fast
|
650 |
|
0 |
|a Data mining.
|
650 |
|
0 |
|a Machine learning.
|
650 |
|
0 |
|a Big data.
|
650 |
|
6 |
|a Exploration de données (Informatique)
|
650 |
|
6 |
|a Apprentissage automatique.
|
650 |
|
6 |
|a Données volumineuses.
|
650 |
|
7 |
|a COMPUTERS
|x Databases
|x Data Mining.
|2 bisacsh
|
650 |
|
7 |
|a Big data
|2 fast
|
650 |
|
7 |
|a Data mining
|2 fast
|
650 |
|
7 |
|a Machine learning
|2 fast
|
700 |
1 |
|
|a Amend, Antoine.
|
700 |
1 |
|
|a George, David.
|
700 |
1 |
|
|a Hallett, Matthew.
|
776 |
0 |
8 |
|i Print version:
|a Morgan, Andrew.
|t Mastering Spark for Data Science.
|d Birmingham : Packt Publishing, ©2017
|
856 |
4 |
0 |
|u https://ebsco.uam.elogim.com/login.aspx?direct=true&scope=site&db=nlebk&AN=1495812
|z Full text
|
938 |
|
|
|a Askews and Holts Library Services
|b ASKH
|n AH30656483
|
938 |
|
|
|a EBL - Ebook Library
|b EBLB
|n EBL4833930
|
938 |
|
|
|a EBSCOhost
|b EBSC
|n 1495812
|
938 |
|
|
|a ProQuest MyiLibrary Digital eBook Collection
|b IDEB
|n cis34561627
|
938 |
|
|
|a YBP Library Services
|b YANK
|n 13953597
|
994 |
|
|
|a 92
|b IZTAP
|