Elementary Cluster Analysis
The availability of packaged clustering programs means that anyone with data can easily do cluster analysis on it. But many users of this technology don't fully appreciate its many hidden dangers. In today's world of "grab and go algorithms," part of my motivation for writing thi...
Classification: | Electronic Book |
---|---|
Main author: | |
Format: | Electronic eBook |
Language: | English |
Published: | Aalborg : River Publishers, 2022. |
Series: | River Publishers Series in Mathematical and Engineering Sciences Ser. |
Subjects: | |
Online access: | Full text |
Table of Contents:
- Front Cover
- Elementary Cluster Analysis: Four Basic Methods that (Usually) Work
- Contents
- Preface
- List of Figures
- List of Tables
- List of Abbreviations
- Appendix A. List of Algorithms
- Appendix D. List of Definitions
- Appendix E. List of Examples
- Appendix L. List of Lemmas and Theorems
- Appendix V. List of Video Links
- I The Art and Science of Clustering
- 1 Clusters: The Human Point of View (HPOV)
- 1.1 Introduction
- 1.2 What are Clusters?
- 1.3 Notes and Remarks
- 1.4 Exercises
- 2 Uncertainty: Fuzzy Sets and Models
- 2.1 Introduction
- 2.2 Fuzzy Sets and Models
- 2.3 Fuzziness and Probability
- 2.4 Notes and Remarks
- 2.5 Exercises
- 3 Clusters: The Computer Point of View (CPOV)
- 3.1 Introduction
- 3.2 Label Vectors
- 3.3 Partition Matrices
- 3.4 How Many Clusters are Present in a Data Set?
- 3.5 CPOV Clusters: The Computer's Point of View
- 3.6 Notes and Remarks
- 3.7 Exercises
- 4 The Three Canonical Problems
- 4.1 Introduction
- 4.2 Tendency Assessment (Are There Clusters?)
- 4.2.1 An Overview of Tendency Assessment
- 4.2.2 Minimal Spanning Trees (MSTs)
- 4.2.3 Visual Assessment of Clustering Tendency
- 4.2.4 The VAT and iVAT Reordering Algorithms
- 4.3 Clustering (Partitioning the Data into Clusters)
- 4.4 Cluster Validity (Which Clusters are "Best"?)
- 4.5 Notes and Remarks
- 4.6 Exercises
- 5 Feature Analysis
- 5.1 Introduction
- 5.2 Feature Nomination
- 5.3 Feature Analysis
- 5.4 Feature Selection
- 5.5 Feature Extraction
- 5.5.1 Principal Components Analysis
- 5.5.2 Random Projection
- 5.5.3 Sammon's Algorithm
- 5.5.4 Autoencoders
- 5.5.5 Relational Data
- 5.6 Normalization and Statistical Standardization
- 5.7 Notes and Remarks
- 5.8 Exercises
- II Four Basic Models and Algorithms
- 6 The c-Means (aka k-Means) Models
- 6.1 Introduction
- 6.2 The Geometry of Partition Spaces
- 6.3 The HCM/FCM Models and Basic AO Algorithms
- 6.4 Cluster Accuracy for Labeled Data
- 6.5 Choosing Model Parameters (c, m, ||*||A)
- 6.5.1 How to Pick the Number of Clusters c
- 6.5.2 How to Pick the Weighting Exponent m
- 6.5.3 Choosing the Weight Matrix (A) for the Model Norm
- 6.6 Choosing Execution Parameters (V0, ε, ||*||err, T)
- 6.6.1 Choosing Termination and Iterate Limit Criteria
- 6.6.2 How to Pick an Initial V0 (or U0)
- 6.6.3 Acceleration Schemes for HCM (aka k-Means) and (FCM)
- 6.7 Cluster Validity With the Best c Method
- 6.7.1 Scale Normalization
- 6.7.2 Statistical Standardization
- 6.7.3 Stochastic Correction for Chance
- 6.7.4 Best c Validation With Internal CVIs
- 6.7.5 Crisp Cluster Validity Indices
- 6.7.6 Soft Cluster Validity Indices
- 6.8 Alternate Forms of Hard c-Means (aka k-Means)
- 6.8.1 Bounds on k-Means in Randomly Projected Downspaces
- 6.8.2 Matrix Factorization for HCM for Clustering
- 6.8.3 SVD: A Global Bound for J1(U, V; X)
- 6.9 Notes and Remarks
- 6.10 Exercises
- 7 Probabilistic Clustering: GMD/EM