
Big data made easy : a working guide to the complete Hadoop toolset /

Many corporations are finding that the size of their data sets is outgrowing the capability of their systems to store and process them. The data is becoming too big to manage and use with traditional tools. The solution: implementing a big data system. As Big Data Made Easy: A Working Guide to the...


Bibliographic Details
Classification: Electronic Book
Main author: Frampton, Michael (Author)
Format: Electronic eBook
Language: English
Published: [Berkeley, CA] : Apress, [2015]
Series: Expert's voice in big data.
Subjects:
Online access: Full text (requires prior registration with an institutional email address)

MARC

LEADER 00000cam a2200000Ii 4500
001 OR_ocn899211442
003 OCoLC
005 20231017213018.0
006 m o d
007 cr cnu|||unuuu
008 150105s2015 caua o 001 0 eng d
040 |a N$T  |b eng  |e rda  |e pn  |c N$T  |d N$T  |d GW5XE  |d COO  |d UMI  |d E7B  |d B24X7  |d IDEBK  |d DEBBG  |d YDXCP  |d BTCTA  |d VLB  |d AUD  |d EBLCP  |d OCLCF  |d OCLCQ  |d UAB  |d Z5A  |d VGM  |d LIV  |d OCLCQ  |d MERUC  |d ESU  |d OCLCQ  |d VT2  |d IOG  |d OCLCO  |d REB  |d U3W  |d OCL  |d CEF  |d DEHBZ  |d OCLCQ  |d OCLCO  |d INT  |d OCLCQ  |d OCLCO  |d WYU  |d OCLCQ  |d OCLCO  |d UKMGB  |d UKAHL  |d OCLCQ  |d OCLCO  |d NJT  |d ERF  |d OCLCQ  |d WURST  |d UNITY  |d DCT  |d OCLCO  |d OCLCQ  |d OCLCO 
015 |a GBB902136  |2 bnb 
016 7 |a 019192417  |2 Uk 
019 |a 901701236  |a 908082861  |a 910990739  |a 1005789961  |a 1026453037  |a 1048157809  |a 1066673377  |a 1066691743  |a 1086533726  |a 1112590125  |a 1129348069  |a 1153040509  |a 1192335356  |a 1204000444  |a 1240525773 
020 |a 9781484200940  |q (electronic bk.) 
020 |a 1484200942  |q (electronic bk.) 
020 |a 1484200950  |q (print) 
020 |a 9781484200957  |q (print) 
020 |z 9781484200957 
024 7 |a 10.1007/978-1-4842-0094-0  |2 doi 
029 1 |a AU@  |b 000055220455 
029 1 |a CHNEW  |b 000890473 
029 1 |a CHVBK  |b 374491909 
029 1 |a DEBBG  |b BV042487425 
029 1 |a DEBBG  |b BV043617658 
029 1 |a DEBSZ  |b 434828238 
029 1 |a GBVCP  |b 882841165 
029 1 |a NLGGC  |b 387316167 
029 1 |a UKMGB  |b 019192417 
035 |a (OCoLC)899211442  |z (OCoLC)901701236  |z (OCoLC)908082861  |z (OCoLC)910990739  |z (OCoLC)1005789961  |z (OCoLC)1026453037  |z (OCoLC)1048157809  |z (OCoLC)1066673377  |z (OCoLC)1066691743  |z (OCoLC)1086533726  |z (OCoLC)1112590125  |z (OCoLC)1129348069  |z (OCoLC)1153040509  |z (OCoLC)1192335356  |z (OCoLC)1204000444  |z (OCoLC)1240525773 
037 |a CL0500000542  |b Safari Books Online 
050 4 |a QA76.9.D5  |b F73 2015eb 
072 7 |a COM  |x 013000  |2 bisacsh 
072 7 |a COM  |x 014000  |2 bisacsh 
072 7 |a COM  |x 018000  |2 bisacsh 
072 7 |a COM  |x 067000  |2 bisacsh 
072 7 |a COM  |x 032000  |2 bisacsh 
072 7 |a COM  |x 037000  |2 bisacsh 
072 7 |a COM  |x 052000  |2 bisacsh 
072 7 |a UN  |2 bicssc 
072 7 |a UMT  |2 bicssc 
082 0 4 |a 004/.36  |2 23 
049 |a UAMI 
100 1 |a Frampton, Michael,  |e author. 
245 1 0 |a Big data made easy :  |b a working guide to the complete Hadoop toolset /  |c Michael Frampton. 
264 1 |a [Berkeley, CA] :  |b Apress,  |c [2015] 
264 4 |c ©2015 
300 |a 1 online resource :  |b color illustrations 
336 |a text  |b txt  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
338 |a online resource  |b cr  |2 rdacarrier 
347 |a text file  |b PDF  |2 rda 
490 1 |a The expert's voice in big data 
500 |a Includes index. 
588 0 |a Online resource; title from PDF title page (EBSCO, viewed January 9, 2015). 
520 |a Many corporations are finding that the size of their data sets is outgrowing the capability of their systems to store and process them. The data is becoming too big to manage and use with traditional tools. The solution: implementing a big data system. As Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset shows, Apache Hadoop offers a scalable, fault-tolerant system for storing and processing data in parallel. It has a very rich toolset that allows for storage (Hadoop), configuration (YARN and ZooKeeper), collection (Nutch and Solr), processing (Storm, Pig, and Map Reduce), scheduling (Oozie), moving (Sqoop and Avro), monitoring (Chukwa, Ambari, and Hue), testing (Big Top), and analysis (Hive). The problem is that the Internet offers IT pros wading into big data many versions of the truth and some outright falsehoods born of ignorance. What is needed is a book just like this one: a wide-ranging but easily understood set of instructions to explain where to get Hadoop tools, what they can do, how to install them, how to configure them, how to integrate them, and how to use them successfully. And you need an expert who has worked in this area for a decade: someone just like author and big data expert Mike Frampton. Big Data Made Easy approaches the problem of managing massive data sets from a systems perspective, and it explains the roles for each project (like architect and tester, for example) and shows how the Hadoop toolset can be used at each system stage. It explains, in an easily understood manner and through numerous examples, how to use each tool. The book also explains the sliding scale of tools available depending upon data size and when and how to use them. 
Big Data Made Easy shows developers and architects, as well as testers and project managers, how to: store big data; configure big data; process big data; schedule processes; move data among SQL and NoSQL systems; monitor data; perform big data analytics; report on big data processes and projects; and test big data systems. Big Data Made Easy also explains the best part, which is that this toolset is free. Anyone can download it and, with the help of this book, start to use it within a day. With the skills this book will teach you under your belt, you will add value to your company or client immediately, not to mention your career. 
505 0 |a At a Glance; Introduction; Chapter 1: The Problem with Data; A Definition of "Big Data"; The Potentials and Difficulties of Big Data; Requirements for a Big Data System; How Hadoop Tools Can Help; My Approach; Overview of the Big Data System; Big Data Flow and Storage; Benefits of Big Data Systems; What's in This Book; Storage: Chapter 2; Data Collection: Chapter 3; Processing: Chapter 4; Scheduling: Chapter 5; Data Movement: Chapter 6; Monitoring: Chapter 7; Cluster Management: Chapter 8; Analysis: Chapter 9; ETL: Chapter 10; Reports: Chapter 11; Summary. 
505 8 |a Chapter 2: Storing and Configuring Data with Hadoop, YARN, and ZooKeeper; An Overview of Hadoop; The Hadoop V1 Architecture; The Differences in Hadoop V2; The Hadoop Stack; Environment Management; Hadoop V1 Installation; Hadoop 1.2.1 Single-Node Installation; 1. Set up Bash shell file for hadoop HOME/.bashrc; 2. Set up conf/hadoop-env.sh; 3. Create Hadoop temporary directory; 4. Set up conf/core-site.xml; 5. Set up conf/mapred-site.xml; 6. Set up file conf/hdfs-site.xml; 7. Format the file system; Setting up the Cluster; Running a Map Reduce Job Check; Hadoop User Interfaces. 
505 8 |a Hadoop V2 Installation; ZooKeeper Installation; Manually Accessing the ZooKeeper Servers; The ZooKeeper Client; Hadoop MRv2 and YARN; Running Another Map Reduce Job Test; Hadoop Commands; Hadoop Shell Commands; Hadoop User Commands; Hadoop Administration Commands; Summary; Chapter 3: Collecting Data with Nutch and Solr; The Environment; Stopping the Servers; Changing the Environment Scripts; Starting the Servers; Architecture 1: Nutch 1.x; Nutch Installation; Solr Installation; Running Nutch with Hadoop 1.8; Architecture 2: Nutch 2.x; Nutch and Solr Configuration; HBase Installation. 
505 8 |a Gora Configuration; Running the Nutch Crawl; Potential Errors; A Brief Comparison; Summary; Chapter 4: Processing Data with Map Reduce; An Overview of the Word-Count Algorithm; Map Reduce Native; Java Word-Count Example 1; Describing the Example 1 Code; Running the Example 1 Code; Java Word-Count Example 2; Describing the Example 2 Code; Running the Example 2 Code; Comparing the Examples; Map Reduce with Pig; Installing Pig; Running Pig; Pig User-Defined Functions; Map Reduce with Hive; Installing Hive; Hive Word-Count Example; Map Reduce with Perl; Summary; Chapter 5: Scheduling and Workflow. 
505 8 |a An Overview of Scheduling; The Capacity Scheduler; The Fair Scheduler; Scheduling in Hadoop V1; V1 Capacity Scheduler; V1 Fair Scheduler; Scheduling in Hadoop V2; V2 Capacity Scheduler; V2 Fair Scheduler; Using Oozie for Workflow; Installing Oozie; The Mechanics of the Oozie Workflow; Oozie Workflow Control Nodes; Oozie Workflow Actions; Creating an Oozie Workflow; The Workflow Configuration File; Running an Oozie Workflow; Scheduling an Oozie Workflow; Summary; Chapter 6: Moving Data; Moving File System Data; The Cat Command; The CopyFromLocal Command; The CopyToLocal Command; The Cp Command. 
590 |a O'Reilly  |b O'Reilly Online Learning: Academic/Public Library Edition 
630 0 0 |a Apache Hadoop. 
630 0 7 |a Apache Hadoop  |2 fast 
650 0 |a Electronic data processing  |x Distributed processing. 
650 0 |a Big data  |x Computer programs. 
650 6 |a Traitement réparti. 
650 6 |a Données volumineuses  |x Logiciels. 
650 7 |a Databases.  |2 bicssc 
650 7 |a Computer networking & communications.  |2 bicssc 
650 7 |a COMPUTERS  |x Computer Literacy.  |2 bisacsh 
650 7 |a COMPUTERS  |x Computer Science.  |2 bisacsh 
650 7 |a COMPUTERS  |x Data Processing.  |2 bisacsh 
650 7 |a COMPUTERS  |x Hardware  |x General.  |2 bisacsh 
650 7 |a COMPUTERS  |x Information Technology.  |2 bisacsh 
650 7 |a COMPUTERS  |x Machine Theory.  |2 bisacsh 
650 7 |a COMPUTERS  |x Reference.  |2 bisacsh 
650 7 |a Electronic data processing  |x Distributed processing  |2 fast 
653 0 0 |a computerwetenschappen 
653 0 0 |a computer sciences 
653 0 0 |a informatiesystemen 
653 0 0 |a information systems 
653 0 0 |a communicatie 
653 0 0 |a communication 
653 0 0 |a databasebeheer 
653 0 0 |a database management 
653 1 0 |a Information and Communication Technology (General) 
653 1 0 |a Informatie- en communicatietechnologie (algemeen) 
773 0 |t Springer eBooks 
776 0 8 |i Printed edition:  |z 9781484200957 
830 0 |a Expert's voice in big data. 
856 4 0 |u https://learning.oreilly.com/library/view/~/9781484200940/?ar  |z Full text (requires prior registration with an institutional email address) 
938 |a Askews and Holts Library Services  |b ASKH  |n AH29490584 
938 |a Books 24x7  |b B247  |n bks00078463 
938 |a Baker and Taylor  |b BTCP  |n BK0017513114 
938 |a EBL - Ebook Library  |b EBLB  |n EBL1964849 
938 |a ebrary  |b EBRY  |n ebr11003169 
938 |a EBSCOhost  |b EBSC  |n 935086 
938 |a ProQuest MyiLibrary Digital eBook Collection  |b IDEB  |n cis30426715 
938 |a YBP Library Services  |b YANK  |n 12229113 
994 |a 92  |b IZTAP