Big Data Analytics with R.
Utilize R to uncover hidden patterns in your Big Data. About This Book: Perform computational analyses on Big Data to generate meaningful results; Get a practical knowledge of the R programming language while working on Big Data platforms like Hadoop, Spark, H2O and SQL/NoSQL databases; Explore fast, stream...
Classification: | Electronic Book |
---|---|
Main Author: | |
Format: | Electronic eBook |
Language: | English |
Published: | Packt Publishing, 2016. |
Edition: | 1. |
Subjects: | |
Online Access: | Full text |
Table of Contents:
- Cover; Copyright; Credits; About the Author; Acknowledgement; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: The Era of Big Data; Big Data - The monster re-defined; Big Data toolbox - dealing with the giant; Hadoop - the elephant in the room; Databases; Hadoop Spark-ed up; R - The unsung Big Data hero; Summary; Chapter 2: Introduction to R Programming Language and Statistical Environment; Learning R; Revisiting R basics; Getting R and RStudio ready; Setting the URLs to R repositories; R data structures; Vectors; Scalars; Matrices; Arrays; Data frames; Lists.
- Exporting R data objects; Applied data science with R; Importing data from different formats; Exploratory Data Analysis; Data aggregations and contingency tables; Hypothesis testing and statistical inference; Tests of differences; Independent t-test example (with power and effect size estimates); ANOVA example; Tests of relationships; An example of Pearson's r correlations; Multiple regression example; Data visualization packages; Summary; Chapter 3: Unleashing the Power of R from Within; Traditional limitations of R; Out-of-memory data; Processing speed; To the memory limits and beyond.
- Data transformations and aggregations with the ff and ffbase packages; Generalized linear models with the ff and ffbase packages; Logistic regression example with ffbase and biglm; Expanding memory with the bigmemory package; Parallel R; From bigmemory to faster computations; An apply() example with the big.matrix object; A for() loop example with the ffdf object; Using apply() and for() loop examples on a data.frame; A parallel package example; A foreach package example; The future of parallel processing in R; Utilizing Graphics Processing Units with R.
- Multi-threading with Microsoft R Open distribution; Parallel machine learning with H2O and R; Boosting R performance with the data.table package and other tools; Fast data import and manipulation with the data.table package; Data import with data.table; Lightning-fast subsets and aggregations on data.table; Chaining, more complex aggregations, and pivot tables with data.table; Writing better R code; Summary; Chapter 4: Hadoop and MapReduce Framework for R; Hadoop architecture; Hadoop Distributed File System; MapReduce framework; A simple MapReduce word count example; Other Hadoop native tools.
- Learning Hadoop; A single-node Hadoop in Cloud; Deploying Hortonworks Sandbox on Azure; A word count example in Hadoop using Java; A word count example in Hadoop using the R language; RStudio Server on a Linux RedHat/CentOS virtual machine; Installing and configuring RHadoop packages; HDFS management and MapReduce in R - a word count example; HDInsight - a multi-node Hadoop cluster on Azure; Creating your first HDInsight cluster; Creating a new Resource Group; Deploying a Virtual Network; Creating a Network Security Group; Setting up and configuring an HDInsight cluster.