Pentaho 3.2 data integration : beginner's guide : explore, transform, validate, and integrate your data with ease
"Pentaho Data Integration (a.k.a. Kettle) is a full-featured open source ETL (Extract, Transform, and Load) solution. Although PDI is a feature-rich tool, effectively capturing, manipulating, cleansing, transferring, and loading data can get complicated. This book is full of practical examples...
Classification: | Electronic Book |
---|---|
Main author: | |
Format: | Electronic eBook |
Language: | English |
Published: | Birmingham, U.K. : Packt Pub., ©2010. |
Subjects: | |
Online access: | Full text (requires prior registration with an institutional email address) |
Table of Contents:
- Cover; Copyright; Credits; Foreword; The Kettle Project; About the Author; About the Reviewers; Table of Contents; Preface
- Chapter 1: Getting started with Pentaho Data Integration; Pentaho Data Integration and Pentaho BI Suite; Exploring the Pentaho Demo; Pentaho Data Integration; Using PDI in real world scenarios; Loading data warehouses or data marts; Integrating data; Data cleansing; Migrating information; Exporting data; Integrating PDI using Pentaho BI; Installing PDI; Time for action - installing PDI; Launching the PDI graphical designer: Spoon; Time for action - starting and customizing Spoon; Spoon; Setting preferences in the Options window; Storing transformations and jobs in a repository; Creating your first transformation; Time for action - creating a hello world transformation; Directing the Kettle engine with transformations; Exploring the Spoon interface; Running and previewing the transformation; Time for action - running and previewing the hello_world transformation; Installing MySQL; Time for action - installing MySQL on Windows; Time for action - installing MySQL on Ubuntu; Summary
- Chapter 2: Getting Started with Transformations; Reading data from files; Time for action - reading results of football matches from files; Input files; Input steps; Reading several files at once; Time for action - reading all your files at a time using a single Text file input step; Time for action - reading all your files at a time using a single Text file input step and regular expressions; Regular expressions; Grids; Sending data to files; Time for action - sending the results of matches to a plain file; Output files; Output steps; Some data definitions; Rowset; Streams; The Select values step; Getting system information; Time for action - updating a file with news about examinations; Getting information by using Get System Info step; Data types; Date fields; Numeric fields; Running transformations from a terminal window; Time for action - running the examination transformation from a terminal window; XML files; Time for action - getting data from an XML file with information about countries; What is XML; PDI transformation files; Getting data from XML files; XPath; Configuring the Get data from XML step; Kettle variables; How and when you can use variables; Time for action - finding out which language people speak