
Data Warehousing in the Age of Big Data

"In conclusion, as you come to the end of this book, the concept of a Data Warehouse, with its primary goal of serving as the enterprise version of the truth and being the single platform for all the sources of information, will continue to remain intact and valid for many years to come. As we have discus...


Bibliographic Details
Classification: Electronic Book
Main Author: Krishnan, Krish
Format: Electronic eBook
Language: English
Published: Amsterdam : Morgan Kaufmann, an imprint of Elsevier, 2013.
Series: Morgan Kaufmann Series on Business Intelligence.
Online Access: Full text
Table of Contents:
  • Front Cover
  • Data Warehousing in the Age of Big Data
  • Copyright Page
  • Contents
  • Acknowledgments
  • About the Author
  • Introduction
  • Part 1: Big Data
  • Part 2: The Data Warehousing
  • Part 3: Building the Big Data – Data Warehouse
  • Appendixes
  • Companion website
  • 1 BIG DATA
  • 1 Introduction to Big Data
  • Introduction
  • Big Data
  • Defining Big Data
  • Why Big Data and why now?
  • Big Data example
  • Social Media posts
  • Survey data analysis
  • Survey data
  • Weather data
  • Twitter data
  • Integration and analysis
  • Additional data types
  • Summary
  • Further reading
  • 2 Working with Big Data
  • Introduction
  • Data explosion
  • Data volume
  • Machine data
  • Application log
  • Clickstream logs
  • External or third-party data
  • Emails
  • Contracts
  • Geographic information systems and geo-spatial data
  • Example: Funshots, Inc.
  • Data velocity
  • Amazon, Facebook, Yahoo, and Google
  • Sensor data
  • Mobile networks
  • Social media
  • Data variety
  • Summary
  • 3 Big Data Processing Architectures
  • Introduction
  • Data processing revisited
  • Data processing techniques
  • Data processing infrastructure challenges
  • Storage
  • Transportation
  • Processing
  • Speed or throughput
  • Shared-everything and shared-nothing architectures
  • Shared-everything architecture
  • Shared-nothing architecture
  • OLTP versus data warehousing
  • Big Data processing
  • Infrastructure explained
  • Data processing explained
  • Telco Big Data study
  • Infrastructure
  • Data processing
  • 4 Introducing Big Data Technologies
  • Introduction
  • Distributed data processing
  • Big Data processing requirements
  • Technologies for Big Data processing
  • Google file system
  • Hadoop
  • Hadoop core components
  • HDFS
  • HDFS architecture
  • NameNode
  • DataNodes
  • Image
  • Journal
  • Checkpoint
  • HDFS startup
  • Block allocation and storage in HDFS
  • HDFS client
  • Replication and recovery
  • Communication and management
  • Heartbeats
  • CheckpointNode and BackupNode
  • CheckpointNode
  • BackupNode
  • File system snapshots
  • JobTracker and TaskTracker
  • MapReduce
  • MapReduce programming model
  • MapReduce program design
  • MapReduce implementation architecture
  • MapReduce job processing and management
  • MapReduce limitations (Version 1, Hadoop MapReduce)
  • MapReduce v2 (YARN)
  • YARN scalability
  • Comparison between MapReduce v1 and v2
  • SQL/MapReduce
  • Zookeeper
  • Zookeeper features
  • Locks and processing
  • Failure and recovery
  • Pig
  • Programming with Pig Latin
  • Pig data types
  • Running Pig programs
  • Pig program flow
  • Common Pig commands
  • HBase
  • HBase architecture
  • HBase components
  • Write-ahead log
  • Hive
  • Hive architecture
  • Infrastructure
  • Execution: how does Hive process queries?
  • Hive data types
  • Hive query language (HiveQL)
  • Chukwa
  • Flume
  • Oozie
  • HCatalog
  • Sqoop
  • Sqoop1
  • Sqoop2
  • Hadoop summary
  • NoSQL
  • CAP theorem
  • Key-value pair: Voldemort
  • Column family store: Cassandra
  • Data model