Data Mining Applications with R
Data Mining Applications with R is a great resource for researchers and professionals to understand the wide use of R, a free software environment for statistical computing and graphics, in solving different problems in industry. R is widely used in leveraging data mining techniques across many diff...
Classification: | Electronic Book |
---|---|
Main Author: | |
Other Authors: | |
Format: | Electronic eBook |
Language: | English |
Published: | Waltham, MA : Academic Press, ©2014. |
Subjects: | |
Online Access: | Full text |
Table of Contents:
- Front Cover; Data Mining Applications with R; Copyright; Contents; Preface; Background; Objectives and Significance; Target Audience; Acknowledgments; Review Committee; Additional Reviewers; Foreword; References; Chapter 1: Power Grid Data Analysis with R and Hadoop; 1.1. Introduction; 1.2. A Brief Overview of the Power Grid; 1.3. Introduction to MapReduce, Hadoop, and RHIPE; 1.3.1. MapReduce; 1.3.1.1. An Example: The Iris Data; 1.3.2. Hadoop; 1.3.3. RHIPE: R with Hadoop; 1.3.3.1. Installation; 1.3.3.2. Iris MapReduce Example with RHIPE; 1.3.3.2.1. The Map Expression.
- 1.3.3.2.2. The Reduce Expression; 1.3.3.2.3. Running the Job; 1.3.3.2.4. Looking at Results; 1.3.4. Other Parallel R Packages; 1.4. Power Grid Analytical Approach; 1.4.1. Data Preparation; 1.4.2. Exploratory Analysis and Data Cleaning; 1.4.2.1. 5-min Summaries; 1.4.2.2. Quantile Plots of Frequency; 1.4.2.3. Tabulating Frequency by Flag; 1.4.2.4. Distribution of Repeated Values; 1.4.2.5. White Noise; 1.4.3. Event Extraction; 1.4.3.1. OOS Frequency Events; 1.4.3.2. Finding Generator Trip Features; 1.4.3.3. Creating Overlapping Frequency Data; 1.5. Discussion and Conclusions; Appendix; References.
- Chapter 2: Picturing Bayesian Classifiers: A Visual Data Mining Approach to Parameters Optimization; 2.1. Introduction; 2.2. Related Works; 2.3. Motivations and Requirements; 2.3.1. R Packages Requirements; 2.4. Probabilistic Framework of NB Classifiers; 2.4.1. Choosing the Model; 2.4.1.1. Multivariate Bernoulli model; 2.4.1.2. Multinomial Model; 2.4.1.3. Poisson Model; 2.4.2. Estimating the Parameters; 2.5. Two-Dimensional Visualization System; 2.5.1. Design Choices; 2.5.2. Visualization Design; 2.6. A Case Study: Text Classification; 2.6.1. Description of the Dataset.
- 2.6.2. Creating Document-Term Matrices; 2.6.3. Loading Existing Term-Document Matrices; 2.6.4. Running the Program; 2.6.4.1. Comparing Models; 2.7. Conclusions; Acknowledgments; References; Chapter 3: Discovery of Emergent Issues and Controversies in Anthropology Using Text Mining, Topic Modeling, and Social Ne ... ; 3.1. Introduction; 3.2. How Many Messages and How Many Twitter-Users in the Sample?; 3.3. Who Is Writing All These Twitter Messages?; 3.4. Who Are the Influential Twitter-Users in This Sample?; 3.5. What Is the Community Structure of These Twitter-Users?
- 3.6. What Were Twitter-Users Writing About During the Meeting?; 3.7. What Do the Twitter Messages Reveal About the Opinions of Their Authors?; 3.8. What Can Be Discovered in the Less Frequently Used Words in the Sample?; 3.9. What Are the Topics That Can Be Algorithmically Discovered in This Sample?; 3.10. Conclusion; References; Chapter 4: Text Mining and Network Analysis of Digital Libraries in R; 4.1. Introduction; 4.2. Dataset Preparation; 4.3. Manipulating the Document-Term Matrix; 4.3.1. The Document-Term Matrix; 4.3.2. Term Frequency-Inverse Document Frequency.