Hands-On Data Warehousing with Azure Data Factory : ETL techniques to load and transform data from various sources, both on-premises and on cloud.
Azure Data Factory (ADF) is a Microsoft Azure PaaS solution that supports data movement between many on-premises and cloud data sources. This book covers custom-tailored tutorials to help you develop, maintain and troubleshoot data movement processes and environments using Azure Data Factory V2 and...
Classification: | Electronic Book |
---|---|
Main Author: | |
Other Authors: | |
Format: | Electronic eBook |
Language: | English |
Published: | Birmingham : Packt Publishing, 2018. |
Subjects: | |
Online Access: | Full text |
Table of Contents:
- Cover; Title Page; Copyright and Credits; Packt Upsell; Contributors; Table of Contents; Preface; Chapter 1: The Modern Data Warehouse; The need for a data warehouse; Driven by IT; Self-service BI; Cloud-based BI - big data and artificial intelligence; The modern data warehouse; Main components of a data warehouse; Staging area; Data warehouse; Cubes; Consumption layer - BI and analytics; What is Azure Data Factory; Limitations of ADF V1.0; What's new in V2.0?; Integration runtime; Linked services; Datasets; Pipelines; Activities; Parameters; Expressions; Controlling the flow of activities.
- SSIS package deployment in Azure; Spark cluster data store; Summary; Chapter 2: Getting Started with Our First Data Factory; Resource group; Azure Data Factory; Datasets; Linked services; Integration runtimes; Activities; Monitoring the data factory pipeline runs; Azure Blob storage; Blob containers; Types of blobs; Block blobs; Page blobs; Replication of storage; Creating an Azure Blob storage account; SQL Azure database; Creating the Azure SQL Server; Attaching the BACPAC to our database; Copying data using our data factory; Summary; Chapter 3: SSIS Lift and Shift; SSIS in ADF; Sample setup.
- Sample databases; SSIS components; Integration services catalog setup; Sample solution in Visual Studio; Deploying the project on-premises; Leveraging our package in ADF V2; Integration runtimes; Azure integration runtime; Self-hosted runtime; SSIS integration runtime; Adding an SSIS integration runtime to the factory; SSIS execution from a pipeline; Summary; Chapter 4: Azure Data Lake; Creating and configuring Data Lake Store; Next Steps; Ways to copy/import data from a database to the Data Lake; Ways to store imported data in files in the Data Lake; Easily moving data to the Data Lake Store.
- Ways to directly copy files into the Data Lake; Prerequisites for the next steps; Creating a Data Lake Analytics resource; Using the data factory to manipulate data in the Data Lake; Task 1 - copy/import data from SQL Server to a blob storage file using data factory; Task 2 - run a U-SQL task from the data factory pipeline to summarize data; Service principal authentication; Run U-SQL from a job in the Data Lake Analytics; Summary; Chapter 5: Machine Learning on the Cloud; Machine learning overview; Machine learning algorithms; Supervised learning; Unsupervised learning; Reinforcement learning.
- Machine learning tasks; Making predictions with regression algorithms; Automated classification using machine learning; Identifying groups using clustering methods; Dimensionality reduction to improve performance; Feature selection; Feature extraction; Azure Machine Learning Studio; Azure Machine Learning Studio account; Azure Machine Learning Studio experiment; Dataset; Module; Work area; Breast cancer detection; Get the data; Prepare the data; Train the model; Score and evaluate the model; Summary; Chapter 6: Introduction to Azure Databricks; Azure Databricks setup; Prepare the data to ingest.