
Pentaho 3.2 data integration : beginner's guide : explore, transform, validate, and integrate your data with ease /

"Pentaho Data Integration (a.k.a. Kettle) is a full-featured open source ETL (Extract, Transform, and Load) solution. Although PDI is a feature-rich tool, effectively capturing, manipulating, cleansing, transferring, and loading data can get complicated. This book is full of practical examples...


Bibliographic Details
Classification: Electronic Book
Main Author: Roldán, María Carina
Format: Electronic eBook
Language: English
Published: Birmingham, U.K. : Packt Pub., ©2010.
Subjects:
Online Access: Full text
Table of Contents:
  • Cover; Copyright; Credits; Foreword; The Kettle Project; About the Author; About the Reviewers; Table of Contents; Preface
  • Chapter 1: Getting started with Pentaho Data Integration; Pentaho Data Integration and Pentaho BI Suite; Exploring the Pentaho Demo; Pentaho Data Integration; Using PDI in real world scenarios; Loading data warehouses or data marts; Integrating data; Data cleansing; Migrating information; Exporting data; Integrating PDI using Pentaho BI; Installing PDI; Time for action - installing PDI; Launching the PDI graphical designer: Spoon; Time for action - starting and customizing Spoon; Spoon; Setting preferences in the Options window; Storing transformations and jobs in a repository; Creating your first transformation; Time for action - creating a hello world transformation; Directing the Kettle engine with transformations; Exploring the Spoon interface; Running and previewing the transformation; Time for action - running and previewing the hello_world transformation; Installing MySQL; Time for action - installing MySQL on Windows; Time for action - installing MySQL on Ubuntu; Summary
  • Chapter 2: Getting Started with Transformations; Reading data from files; Time for action - reading results of football matches from files; Input files; Input steps; Reading several files at once; Time for action - reading all your files at a time using a single Text file input step; Time for action - reading all your files at a time using a single Text file input step and regular expressions; Regular expressions; Grids; Sending data to files; Time for action - sending the results of matches to a plain file; Output files; Output steps; Some data definitions; Rowset; Streams; The Select values step; Getting system information; Time for action - updating a file with news about examinations; Getting information by using Get System Info step; Data types; Date fields; Numeric fields; Running transformations from a terminal window; Time for action - running the examination transformation from a terminal window; XML files; Time for action - getting data from an XML file with information about countries; What is XML; PDI transformation files; Getting data from XML files; XPath; Configuring the Get data from XML step; Kettle variables; How and when you can use variables; Time for action - finding out which language people speak