
Apache Flume : Distributed Log Collection for Hadoop.

A starter guide that covers Apache Flume in detail. Apache Flume: Distributed Log Collection for Hadoop is intended for people responsible for moving datasets into Hadoop in a timely and reliable manner, such as software engineers, database administrators, and data warehouse administrators.

Bibliographic Details
Classification: Electronic Book
Main Author: Hoffman, Steve
Format: Electronic eBook
Language: English
Published: Packt Publishing, 2013.
Series: Community experience distilled.
Subjects:
Online Access: Full text (requires prior registration with an institutional email)

MARC

LEADER 00000cam a2200000Ma 4500
001 OR_ocn853239283
003 OCoLC
005 20231017213018.0
006 m o d
007 cr |n|||||||||
008 130719s2013 xx o 000 0 eng d
040 |a IDEBK  |b eng  |e pn  |c IDEBK  |d EBLCP  |d MHW  |d MEAUC  |d UMI  |d OCLCQ  |d DEBSZ  |d OCLCQ  |d DEBBG  |d RIV  |d OCLCQ  |d YDXCP  |d OCLCF  |d OCLCQ  |d FEM  |d MERUC  |d OCLCQ  |d OCLCO  |d CEF  |d OCLCQ  |d OCLCO  |d UAB  |d STF  |d OCLCQ  |d OCLCO 
019 |a 858027908  |a 968098015  |a 969020022 
020 |a 1299735142  |q (ebk) 
020 |a 9781299735149  |q (ebk) 
020 |a 9781782167921 
020 |a 1782167927 
020 |a 1782167919 
020 |a 9781782167914 
020 |z 9781782167914 
029 1 |a AU@  |b 000052281630 
029 1 |a CHNEW  |b 001051908 
029 1 |a CHVBK  |b 567707288 
029 1 |a DEBBG  |b BV041776498 
029 1 |a DEBSZ  |b 397584504 
029 1 |a DEBSZ  |b 404319416 
029 1 |a GBVCP  |b 785371680 
029 1 |a AU@  |b 000067101711 
035 |a (OCoLC)853239283  |z (OCoLC)858027908  |z (OCoLC)968098015  |z (OCoLC)969020022 
037 |a 504765  |b MIL 
050 4 |a QA76.76 .A65 
082 0 4 |a 004.6 
049 |a UAMI 
100 1 |a Hoffman, Steve. 
245 1 0 |a Apache Flume :  |b Distributed Log Collection for Hadoop. 
260 |b Packt Publishing,  |c 2013. 
300 |a 1 online resource 
336 |a text  |b txt  |2 rdacontent 
337 |a computer  |b c  |2 rdamedia 
338 |a online resource  |b cr  |2 rdacarrier 
347 |a text file  |2 rda 
490 1 |a Community experience distilled 
520 |a A starter guide that covers Apache Flume in detail. Apache Flume: Distributed Log Collection for Hadoop is intended for people who are responsible for moving datasets into Hadoop in a timely and reliable manner like software engineers, database administrators, and data warehouse administrators. 
505 0 |a Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Overview and Architecture; Flume 0.9; Flume 1.X (Flume-NG); The problem with HDFS and streaming data/logs; Sources, channels, and sinks; Flume events; Interceptors, channel selectors, and sink processors; Tiered data collection (multiple flows and/or agents); Chapter 2: Flume Quick Start; Downloading Flume; Flume in Hadoop distributions; Flume configuration file overview; Starting up with Hello World -- Summary; Chapter 3: Channels; Memory channel; File channel; Summary. 
505 8 |a Chapter 4: Sinks and Sink Processors; HDFS sink; Path and filename; File rotation; Compression codecs; Event serializers; Text output; Text with headers; Apache Avro; File type; Sequence file; Data stream; Compressed stream; Timeouts and workers; Sink groups; Load balancing; Failover; Summary; Chapter 5: Sources and Channel Selectors; The problem with using tail; The exec source; The spooling directory source; Syslog sources; The syslog UDP source; The syslog TCP source; The multiport syslog TCP source; Channel selectors; Replicating; Multiplexing; Summary. 
505 8 |a Chapter 6: Interceptors, ETL, and Routing; Interceptors; Timestamp; Host; Static; Regular expression filtering; Regular expression extractor; Custom interceptors; Tiering data flows; Avro Source/Sink; Command-line Avro; Log4J Appender; The Load Balancing Log4J Appender; Routing; Summary; Chapter 7: Monitoring Flume; Monitoring the agent process; Monit; Nagios; Monitoring performance metrics; Ganglia; The internal HTTP server; Custom monitoring hooks; Summary; Chapter 8: There Is No Spoon -- The Realities of Real-time Distributed Data Collection; Transport time versus log time. 
505 8 |a Time zones are evil; Capacity planning; Considerations for multiple data centers; Compliance and data expiry; Summary; Index. 
588 0 |a Print version record. 
590 |a O'Reilly  |b O'Reilly Online Learning: Academic/Public Library Edition 
630 0 0 |a Apache Hadoop. 
630 0 7 |a Apache Hadoop.  |2 blmlsh 
630 0 7 |a Apache Hadoop.  |2 fast  |0 (OCoLC)fst01911570 
650 0 |a File organization (Computer science) 
650 0 |a Electronic data processing  |x Distributed processing. 
650 6 |a Fichiers (Informatique)  |x Organisation. 
650 6 |a Traitement réparti. 
650 7 |a Electronic data processing  |x Distributed processing.  |2 fast  |0 (OCoLC)fst00906987 
650 7 |a File organization (Computer science)  |2 fast  |0 (OCoLC)fst00924147 
655 4 |a Llibres electrònics. 
776 0 8 |i Print version:  |z 9781299735149 
830 0 |a Community experience distilled. 
856 4 0 |u https://learning.oreilly.com/library/view/~/9781782167914/?ar  |z Full text (requires prior registration with an institutional email) 
938 |a EBL - Ebook Library  |b EBLB  |n EBL1274883 
938 |a ProQuest MyiLibrary Digital eBook Collection  |b IDEB  |n cis25840600 
938 |a YBP Library Services  |b YANK  |n 10861127 
994 |a 92  |b IZTAP