Learning data mining with R: develop key skills and techniques with R to create and customize data mining algorithms
This book is intended for the budding data scientist or quantitative analyst with only a basic exposure to R and statistics. It assumes familiarity with only the very basics of R, such as the main data types, simple functions, and how to move data around. No prior experience with data mining...
Classification: | Electronic Book |
---|---|
Main author: | |
Format: | Electronic eBook |
Language: | English |
Published: | Birmingham, UK : Packt Publishing, 2015. |
Series: | Community experience distilled. |
Subjects: | |
Online access: | Full text |
Table of Contents:
- Cover
- Copyright
- Credits
- About the Author
- Acknowledgments
- About the Reviewers
- www.PacktPub.com
- Table of Contents
- Preface
- Chapter 1: Warming Up
- Big data
- Scalability and efficiency
- Data source
- Data mining
- Feature extraction
- Summarization
- The data mining process
- CRISP-DM
- SEMMA
- Social network mining
- Social network
- Text mining
- Information retrieval and text mining
- Mining text for prediction
- Web data mining
- Why R?
- What is the disadvantage of R?
- Statistics
- Statistics and data mining
- Statistics and machine learning
- Statistics and R
- The limitations of statistics on data mining
- Machine learning
- Approaches to machine learning
- Machine learning architecture
- Data attributes and description
- Numeric attributes
- Categorical attributes
- Data description
- Data measuring
- Data cleaning
- Missing values
- Junk, noisy data, or outlier
- Data integration
- Data dimension reduction
- Eigenvalues and Eigenvectors
- Principal-Component Analysis
- Singular-value decomposition
- CUR decomposition
- Data transformation and discretization
- Data transformation
- Normalization data transformation methods
- Data discretization
- Visualization of results
- Visualization with R
- Time for action
- Summary
- Chapter 2: Mining Frequent Patterns, Associations, and Correlations
- An overview of associations and patterns
- Patterns and pattern discovery
- The frequent itemset
- The frequent subsequence
- The frequent substructures
- Relationship or rules discovery
- Association rules
- Correlation rules
- Market basket analysis
- The market basket model
- A-Priori algorithm
- Input data characteristics and data structure
- The A-Priori algorithm
- The R implementation
- A-Priori algorithm variants
- The Eclat algorithm
- The R implementation
- The FP-growth algorithm
- Input data characteristics and data structure
- The FP-growth algorithm
- The R implementation
- The GenMax algorithm with maximal frequent itemsets
- The R implementation
- The Charm algorithm with closed frequent itemsets
- The R implementation
- The algorithm to generate association rules
- The R implementation
- Hybrid association rules mining
- Mining multilevel and multidimensional association rules
- Constraint-based frequent pattern mining
- Mining sequence dataset
- Sequence dataset
- The GSP algorithm
- The R implementation
- The SPADE algorithm
- The R implementation
- Rule generation from sequential patterns
- High-performance algorithms
- Time for action
- Summary
- Chapter 3: Classification
- Classification
- Generic decision tree induction