LEADER |
00000cam a22000003i 4500 |
001 |
OR_on1308508409 |
003 |
OCoLC |
005 |
20231017213018.0 |
006 |
m o d |
007 |
cr cnu|||unuuu |
008 |
220401s2021 xx o 000 0 eng d |
040 |
|
|
|a N$T
|b eng
|e rda
|e pn
|c N$T
|d YDX
|d EBLCP
|d TOH
|d AU@
|d VT2
|d OCLCF
|d DST
|d UKAHL
|d OCLCO
|d OCLCQ
|
019 |
|
|
|a 1256713426
|a 1256804372
|a 1272923830
|a 1281717485
|
020 |
|
|
|a 9781638356837
|q (electronic bk.)
|
020 |
|
|
|a 1638356831
|q (electronic bk.)
|
020 |
|
|
|z 9781617296901
|
020 |
|
|
|z 1617296902
|
024 |
8 |
|
|a 9781617296901
|
029 |
1 |
|
|a AU@
|b 000069347134
|
029 |
1 |
|
|a AU@
|b 000071968419
|
035 |
|
|
|a (OCoLC)1308508409
|z (OCoLC)1256713426
|z (OCoLC)1256804372
|z (OCoLC)1272923830
|z (OCoLC)1281717485
|
050 |
|
4 |
|a QA76.9.D343
|b .H374 2021
|
082 |
0 |
4 |
|a 006.3/12
|2 23
|
049 |
|
|
|a UAMI
|
100 |
1 |
|
|a Ruiter, Julian de.
|
245 |
1 |
0 |
|a Data Pipelines with Apache Airflow.
|
264 |
|
1 |
|a [Place of publication not identified] :
|b Simon & Schuster :
|b Manning,
|c 2021.
|
300 |
|
|
|a 1 online resource
|
336 |
|
|
|a text
|b txt
|2 rdacontent
|
337 |
|
|
|a computer
|b c
|2 rdamedia
|
338 |
|
|
|a online resource
|b cr
|2 rdacarrier
|
347 |
|
|
|a text file
|
588 |
0 |
|
|a Vendor-supplied metadata.
|
505 |
0 |
|
|a Intro -- inside front cover -- Data Pipelines with Apache Airflow -- Copyright -- brief contents -- contents -- front matter -- preface -- acknowledgments -- Bas Harenslak -- Julian de Ruiter -- about this book -- Who should read this book -- How this book is organized: A road map -- About the code -- LiveBook discussion forum -- about the authors -- about the cover illustration -- Part 1. Getting started -- 1 Meet Apache Airflow -- 1.1 Introducing data pipelines -- 1.1.1 Data pipelines as graphs -- 1.1.2 Executing a pipeline graph -- 1.1.3 Pipeline graphs vs. sequential scripts
|
505 |
8 |
|
|a 1.1.4 Running pipeline using workflow managers -- 1.2 Introducing Airflow -- 1.2.1 Defining pipelines flexibly in (Python) code -- 1.2.2 Scheduling and executing pipelines -- 1.2.3 Monitoring and handling failures -- 1.2.4 Incremental loading and backfilling -- 1.3 When to use Airflow -- 1.3.1 Reasons to choose Airflow -- 1.3.2 Reasons not to choose Airflow -- 1.4 The rest of this book -- Summary -- 2 Anatomy of an Airflow DAG -- 2.1 Collecting data from numerous sources -- 2.1.1 Exploring the data -- 2.2 Writing your first Airflow DAG -- 2.2.1 Tasks vs. operators
|
505 |
8 |
|
|a 2.2.2 Running arbitrary Python code -- 2.3 Running a DAG in Airflow -- 2.3.1 Running Airflow in a Python environment -- 2.3.2 Running Airflow in Docker containers -- 2.3.3 Inspecting the Airflow UI -- 2.4 Running at regular intervals -- 2.5 Handling failing tasks -- Summary -- 3 Scheduling in Airflow -- 3.1 An example: Processing user events -- 3.2 Running at regular intervals -- 3.2.1 Defining scheduling intervals -- 3.2.2 Cron-based intervals -- 3.2.3 Frequency-based intervals -- 3.3 Processing data incrementally -- 3.3.1 Fetching events incrementally
|
505 |
8 |
|
|a 3.3.2 Dynamic time references using execution dates -- 3.3.3 Partitioning your data -- 3.4 Understanding Airflow's execution dates -- 3.4.1 Executing work in fixed-length intervals -- 3.5 Using backfilling to fill in past gaps -- 3.5.1 Executing work back in time -- 3.6 Best practices for designing tasks -- 3.6.1 Atomicity -- 3.6.2 Idempotency -- Summary -- 4 Templating tasks using the Airflow context -- 4.1 Inspecting data for processing with Airflow -- 4.1.1 Determining how to load incremental data -- 4.2 Task context and Jinja templating -- 4.2.1 Templating operator arguments
|
505 |
8 |
|
|a 4.2.2 What is available for templating? -- 4.2.3 Templating the PythonOperator -- 4.2.4 Providing variables to the PythonOperator -- 4.2.5 Inspecting templated arguments -- 4.3 Hooking up other systems -- Summary -- 5 Defining dependencies between tasks -- 5.1 Basic dependencies -- 5.1.1 Linear dependencies -- 5.1.2 Fan-in/-out dependencies -- 5.2 Branching -- 5.2.1 Branching within tasks -- 5.2.2 Branching within the DAG -- 5.3 Conditional tasks -- 5.3.1 Conditions within tasks -- 5.3.2 Making tasks conditional -- 5.3.3 Using built-in operators -- 5.4 More about trigger rules
|
520 |
|
|
|a Data Pipelines with Apache Airflow teaches you how to build and maintain effective data pipelines. You'll explore the most common usage patterns, including aggregating multiple data sources, connecting to and from data lakes, and cloud deployment. Part reference and part tutorial, this practical guide covers every aspect of the directed acyclic graphs (DAGs) that power Airflow, and how to customize them for your pipeline's needs.
|
542 |
|
|
|f © 2021 Manning Publications Co. All rights reserved.
|g 2021
|
590 |
|
|
|a O'Reilly
|b O'Reilly Online Learning: Academic/Public Library Edition
|
650 |
|
0 |
|a Data mining.
|
650 |
|
0 |
|a Cloud computing.
|
650 |
|
0 |
|a Programming languages (Electronic computers)
|
650 |
|
0 |
|a Python (Computer program language)
|
650 |
|
0 |
|a Big data.
|
650 |
|
0 |
|a Machine learning.
|
650 |
|
0 |
|a Electronic data processing.
|
650 |
|
0 |
|a Information storage and retrieval systems
|x Scalability.
|
650 |
|
0 |
|a Application program interfaces (Computer software)
|
650 |
|
2 |
|a Data Mining
|
650 |
|
6 |
|a Exploration de données (Informatique)
|
650 |
|
6 |
|a Infonuagique.
|
650 |
|
6 |
|a Python (Langage de programmation)
|
650 |
|
6 |
|a Données volumineuses.
|
650 |
|
6 |
|a Apprentissage automatique.
|
650 |
|
6 |
|a Interfaces de programmation d'applications.
|
650 |
|
7 |
|a APIs (interfaces)
|2 aat
|
650 |
|
7 |
|a Application program interfaces (Computer software)
|2 fast
|0 (OCoLC)fst00811704
|
650 |
|
7 |
|a Big data.
|2 fast
|0 (OCoLC)fst01892965
|
650 |
|
7 |
|a Cloud computing.
|2 fast
|0 (OCoLC)fst01745899
|
650 |
|
7 |
|a Data mining.
|2 fast
|0 (OCoLC)fst00887946
|
650 |
|
7 |
|a Electronic data processing.
|2 fast
|0 (OCoLC)fst00906956
|
650 |
|
7 |
|a Information storage and retrieval systems
|x Scalability.
|2 fast
|0 (OCoLC)fst01921149
|
650 |
|
7 |
|a Machine learning.
|2 fast
|0 (OCoLC)fst01004795
|
650 |
|
7 |
|a Programming languages (Electronic computers)
|2 fast
|0 (OCoLC)fst01078704
|
650 |
|
7 |
|a Python (Computer program language)
|2 fast
|0 (OCoLC)fst01084736
|
700 |
1 |
|
|a Harenslak, Bas.
|
776 |
0 |
8 |
|i Print version:
|a Ruiter, Julian de.
|t Data Pipelines with Apache Airflow.
|d [Place of publication not identified] : Simon & Schuster : Manning, 2021
|z 9781617296901
|z 1617296902
|w (OCoLC)1249108869
|
856 |
4 |
0 |
|u https://learning.oreilly.com/library/view/~/9781617296901/?ar
|z Full text (requires prior registration with an institutional email address)
|
938 |
|
|
|a Askews and Holts Library Services
|b ASKH
|n AH39609424
|
938 |
|
|
|a ProQuest Ebook Central
|b EBLB
|n EBL6642618
|
938 |
|
|
|a EBSCOhost
|b EBSC
|n 2949094
|
938 |
|
|
|a YBP Library Services
|b YANK
|n 302273010
|
994 |
|
|
|a 92
|b IZTAP
|