
IBM SPSS Modeler cookbook : over 60 practical recipes to achieve better results using the experts' methods for data mining /

This is a practical cookbook with intermediate-advanced recipes for SPSS Modeler data analysts. It is loaded with step-by-step examples explaining the process followed by the experts. If you have had some hands-on experience with IBM SPSS Modeler and now want to go deeper and take more control over...


Bibliographic Details
Classification: Electronic Book
Other Authors: McCormick, Keith
Format: Electronic eBook
Language: English
Published: Birmingham, UK : Packt Pub., 2013.
Subjects:
Online Access: Full text
Table of Contents:
  • Cover; Copyright; Credits; Foreword; About the Authors; About the Reviewers; www.PacktPub.com; Table of Contents; Preface
  • Chapter 1: Data Understanding; Introduction; Using an empty aggregate to evaluate sample size; Evaluating the need to sample from the initial data; Using CHAID stumps when interviewing an SME; Using a single cluster K-means as an alternative to anomaly detection; Using an @NULL multiple Derive to explore missing data; Creating an outlier report to give to SMEs; Detecting potential model instability early using the Partition node and Feature Selection
  • Chapter 2: Data Preparation - Select; Introduction; Using the Feature Selection node creatively to remove, or decapitate, perfect predictors; Running a Statistics node on anti-join to evaluate potential missing data; Evaluating the use of sampling for speed; Removing redundant variables using correlation matrices; Selecting variables using the CHAID modeling node; Selecting variables using the Means node; Selecting variables using single-antecedent association rules
  • Chapter 3: Data Preparation - Clean; Introduction; Binning scale variables to address missing data; Using a full data model/partial data model approach to address missing data; Imputing in-stream mean or median; Imputing missing values randomly from uniform or normal distributions; Using random imputation to match a variable's distribution; Searching for similar records using a neural network for inexact matching; Using neuro-fuzzy searching to find similar names; Producing longer Soundex codes
  • Chapter 4: Data Preparation - Construct; Introduction; Building transformations with multiple Derive nodes; Calculating and comparing conversion rates; Grouping categorical values; Transforming high skew and kurtosis variables with a multiple Derive node; Creating flag variables for aggregation; Using Association Rules for interaction detection/feature creation; Creating time-aligned cohorts
  • Chapter 5: Data Preparation - Integrate and Format; Introduction; Speeding up merge with caching and optimization settings; Merging a look-up table; Shuffle-down (nonstandard aggregation); Cartesian product merge using key-less merge by key; Multiplying out using Cartesian product merge, user source, and derive dummy; Changing large numbers of variable names without scripting; Parsing nonstandard dates; Parsing and performing a conversion on a complex stream; Sequence processing
  • Chapter 6: Selecting and Building a Model; Introduction; Evaluating balancing with the Auto Classifier; Building models with and without outliers; Neural Network Feature Selection; Creating a bootstrap sample; Creating bagged logistic regression models; Using KNN to match similar cases; Using Auto Classifier to tune models; Next-Best-Offer for large datasets
  • Chapter 7: Modeling - Assessment, Evaluation, Deployment, and Monitoring; Introduction; How (and why) to validate as well as test.