
Apache Flume: distributed log collection for Hadoop: design and implement a series of Flume agents to send streamed data into Hadoop

If you are a Hadoop programmer who wants to learn about Flume to be able to move datasets into Hadoop in a timely and replicable manner, then this book is ideal for you. No prior knowledge about Apache Flume is necessary, but a basic knowledge of Hadoop and the Hadoop File System (HDFS) is assumed.

Bibliographic Details
Classification: Electronic Book
Main Author: Hoffman, Steve (Author)
Format: Electronic eBook
Language: English
Published: Birmingham, UK : Packt Publishing, 2015.
Edition: Second edition.
Series: Community experience distilled.
Subjects:
Online Access: Full text (requires prior registration with an institutional email)

MARC

LEADER 00000cam a2200000Ii 4500
001 OR_ocn906041062
003 OCoLC
005 20231017213018.0
006 m o d
007 cr unu||||||||
008 150402s2015 enka o 001 0 eng d
040 |a UMI  |b eng  |e rda  |e pn  |c UMI  |d DEBBG  |d EBLCP  |d YDXCP  |d COO  |d OCLCF  |d N$T  |d IDB  |d MERUC  |d OCLCQ  |d OCLCO  |d CEF  |d OCLCQ  |d CNNOR  |d DKC  |d AU@  |d OCLCO  |d OCLCQ  |d OCLCO  |d OCLCQ  |d OCLCO  |d OCLCQ  |d QGK  |d OCLCO 
019 |a 904517864  |a 905735633  |a 1259131329 
020 |a 9781784399146  |q (electronic bk.) 
020 |a 1784399140  |q (electronic bk.) 
020 |z 1784399140 
020 |z 1784392170 
020 |z 9781784392178 
029 1 |a CHNEW  |b 000890848 
029 1 |a CHVBK  |b 374495653 
029 1 |a DEBBG  |b BV042683008 
029 1 |a DEBBG  |b BV043619014 
029 1 |a DEBSZ  |b 446582107 
029 1 |a GBVCP  |b 835872424 
035 |a (OCoLC)906041062  |z (OCoLC)904517864  |z (OCoLC)905735633  |z (OCoLC)1259131329 
037 |a CL0500000573  |b Safari Books Online 
050 4 |a QA76.9.D5 
072 7 |a COM  |x 013000  |2 bisacsh 
072 7 |a COM  |x 014000  |2 bisacsh 
072 7 |a COM  |x 018000  |2 bisacsh 
072 7 |a COM  |x 067000  |2 bisacsh 
072 7 |a COM  |x 032000  |2 bisacsh 
072 7 |a COM  |x 037000  |2 bisacsh 
072 7 |a COM  |x 052000  |2 bisacsh 
082 0 4 |a 004.36 
049 |a UAMI 
100 1 |a Hoffman, Steve,  |e author. 
245 1 0 |a Apache Flume: distributed log collection for Hadoop :  |b design and implement a series of Flume agents to send streamed data into Hadoop /  |c Steve Hoffman. 
246 3 0 |a Design and implement a series of Flume agents to send streamed data into Hadoop 
250 |a Second edition. 
264 1 |a Birmingham, UK :  |b Packt Publishing,  |c 2015. 
300 |a 1 online resource (1 volume) :  |b illustrations 
336 |a text  |b txt  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
338 |a online resource  |b cr  |2 rdacarrier 
347 |a text file 
490 1 |a Community experience distilled 
588 0 |a Online resource; title from cover page (Safari, viewed March 26, 2015). 
500 |a Includes index. 
505 0 |a Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Overview and Architecture; Flume 0.9; Flume 1.X (Flume-NG); The problem with HDFS and streaming data/logs; Sources, channels, and sinks; Flume events; Interceptors, channel selectors, and sink processors; Tiered data collection (multiple flows and/or agents); The Kite SDK; Summary; Chapter 2: A Quick Start Guide to Flume; Downloading Flume; Flume in Hadoop distributions; An overview of the Flume configuration file; Starting up with "Hello, World!"; Summary. 
505 8 |a Chapter 3: Channels; The memory channel; The file channel; Spillable Memory Channel; Summary; Chapter 4: Sinks and Sink Processors; HDFS sink; Path and filename; File rotation; Compression codecs; Event Serializers; Text output; Text with headers; Apache Avro; User-provided Avro schema; File type; SequenceFile; DataStream; CompressedStream; Timeouts and workers; Sink groups; Load balancing; Failover; MorphlineSolrSink; Morphline configuration files; Typical SolrSink configuration; Sink configuration; ElasticSearchSink; LogStash Serializer; Dynamic Serializer; Summary. 
505 8 |a Chapter 5: Sources and Channel Selectors; The problem with using tail; The Exec source; Spooling Directory Source; Syslog sources; The syslog UDP source; The syslog TCP source; The multiport syslog TCP source; JMS source; Channel selectors; Replicating; Multiplexing; Summary; Chapter 6: Interceptors, ETL, and Routing; Interceptors; Timestamp; Host; Static; Regular expression filtering; Regular expression extractor; Morphline interceptor; Custom interceptors; The plugins directory; Tiering flows; The Avro source/sink; Compressing Avro; SSL Avro flows; The Thrift source/sink. 
505 8 |a Using command-line Avro; The Log4J appender; The Log4J load-balancing appender; The embedded agent; Configuration and startup; Sending data; Shutdown; Routing; Summary; Chapter 7: Putting it All Together; Web logs to searchable UI; Setting up the web server; Configuring log rotation to the spool directory; Setting up the target -- Elasticsearch; Setting up Flume on collector/relay; Setting up Flume on the client; Creating more search fields with an interceptor; Setting up a better user interface -- Kibana; Archiving to HDFS; Summary; Chapter 8: Monitoring Flume; Monitoring the agent process. 
505 8 |a Monit; Nagios; Monitoring performance metrics; Ganglia; Internal HTTP server; Custom monitoring hooks; Summary; Chapter 9: There Is No Spoon -- the Realities of Real-time Distributed Data Collection; Transport time versus log time; Time zones are evil; Capacity planning; Considerations for multiple data centers; Compliance and data expiry; Summary; Index. 
520 |a If you are a Hadoop programmer who wants to learn about Flume to be able to move datasets into Hadoop in a timely and replicable manner, then this book is ideal for you. No prior knowledge about Apache Flume is necessary, but a basic knowledge of Hadoop and the Hadoop File System (HDFS) is assumed. 
546 |a English. 
590 |a eBooks on EBSCOhost  |b EBSCO eBook Subscription Academic Collection - Worldwide 
590 |a O'Reilly  |b O'Reilly Online Learning: Academic/Public Library Edition 
630 0 0 |a Apache Hadoop. 
630 0 7 |a Apache Hadoop  |2 fast 
650 0 |a Electronic data processing  |x Distributed processing. 
650 0 |a File organization (Computer science) 
650 6 |a Traitement réparti. 
650 6 |a Fichiers (Informatique)  |x Organisation. 
650 7 |a COMPUTERS  |x Computer Literacy.  |2 bisacsh 
650 7 |a COMPUTERS  |x Computer Science.  |2 bisacsh 
650 7 |a COMPUTERS  |x Data Processing.  |2 bisacsh 
650 7 |a COMPUTERS  |x Hardware  |x General.  |2 bisacsh 
650 7 |a COMPUTERS  |x Information Technology.  |2 bisacsh 
650 7 |a COMPUTERS  |x Machine Theory.  |2 bisacsh 
650 7 |a COMPUTERS  |x Reference.  |2 bisacsh 
650 7 |a Electronic data processing  |x Distributed processing  |2 fast 
650 7 |a File organization (Computer science)  |2 fast 
776 0 8 |i Print version:  |a Hoffman, Steve.  |t Apache Flume : Distributed Log Collection for Hadoop.  |d Birmingham : Packt Publishing, ©2015  |z 9781784392178 
830 0 |a Community experience distilled. 
856 4 0 |u https://learning.oreilly.com/library/view/~/9781784392178/?ar  |z Full text (requires prior registration with an institutional email) 
938 |a ProQuest Ebook Central  |b EBLB  |n EBL1969460 
938 |a EBSCOhost  |b EBSC  |n 959552 
938 |a YBP Library Services  |b YANK  |n 12316839 
994 |a 92  |b IZTAP