Data mining and data visualization

This book focuses on dealing with large-scale data, a field commonly referred to as data mining. The book is divided into three sections. The first deals with an introduction to statistical aspects of data mining and machine learning and includes applications to text analysis, computer intrusion det...

Bibliographic Details
Classification: Electronic Book
Other Authors: Rao, C. Radhakrishna (Calyampudi Radhakrishna), 1920- (Editor), Wegman, Edward J., 1943- (Editor), Solka, Jeffrey L., 1955- (Editor)
Format: Electronic eBook
Language: English
Published: Amsterdam ; San Diego, CA : Elsevier North Holland, 2005.
Edition: 1st ed.
Series: Handbook of statistics (Amsterdam, Netherlands) ; v. 24.
Online Access: Full text
Table of Contents:
  • Cover
  • front cover
  • copyright
  • Table of contents
  • Preface
  • Contributors
  • 1. Statistical Data Mining
  • Introduction 1
  • Computational complexity
  • The computer science roots of data mining
  • Data preparation
  • Databases
  • Statistical methods for data mining
  • Visual data mining
  • Streaming data
  • A final word
  • Acknowledgements 1
  • References 1
  • 2. From Data Mining to Knowledge Mining
  • Introduction 2
  • Knowledge generation operators
  • Discovering rules and patterns via AQ learning
  • Types of problems in learning from examples
  • Clustering of entities into conceptually meaningful categories
  • Automated improvement of the search space: constructive induction
  • Reducing the amount of data: selecting representative examples
  • Integrating qualitative and quantitative methods of numerical discovery
  • Predicting processes qualitatively
  • Knowledge improvement via incremental learning
  • Summarizing the logical data analysis approach
  • Strong patterns vs. complete and consistent rules
  • Ruleset visualization via concept association graphs
  • Integration of knowledge generation operators
  • Summary 2
  • Acknowledgements 2
  • References 2
  • 3. Mining Computer Security Data
  • Introduction 3
  • Basic TCP/IP
  • Overview of networking
  • The threat
  • Probes and scans
  • Denial of service attacks
  • Gaining access
  • Network monitoring
  • TCP sessions
  • Signatures versus anomalies
  • User profiling
  • Program profiling
  • Conclusions 3
  • References 3
  • 4. Data Mining of Text Files
  • Introduction and background
  • Natural language processing at the word and sentence level
  • Hidden Markov models
  • Probabilistic context-free grammars
  • Word sense disambiguation
  • Approaches beyond the word and sentence level
  • Information retrieval
  • Other approaches
  • Summary 4
  • References 4
  • 5. Text Data Mining with Minimal Spanning Trees
  • Introduction 5
  • Approach
  • Results 5
  • Datasets
  • Feature extraction
  • Automated serendipity extraction on the Science News data set with no user driven focus of attention
  • Automated serendipity extraction on the ONR ILIR data set with no user driven focus of attention
  • Automated serendipity extraction on the Science News data set with user driven focus of attention
  • Clustering results on the ONR ILIR dataset
  • Clustering results on the Science News dataset
  • Conclusions 5
  • Acknowledgements 5
  • References 5
  • 6. Information Hiding: Steganography and Steganalysis
  • Introduction 6
  • Image formats
  • Steganography
  • Embedding by modifying carrier bits
  • Embedding using pairs of values
  • Steganalysis
  • Relationship of steganography to watermarking
  • Literature survey
  • Conclusions 6
  • References 6
  • 7. Canonical Variate Analysis and Related Methods for Reduction of Dimensionality and Graphical Representation
  • Introduction 7
  • Canonical coordinates
  • Mahalanobis space