Mastering predictive analytics with R : master the craft of predictive modeling by developing strategy, intuition, and a solid foundation in essential concepts
This book is intended for the budding data scientist, predictive modeler, or quantitative analyst with only a basic exposure to R and statistics. It is also designed to be a reference for experienced professionals wanting to brush up on the details of a particular type of predictive model. Mastering...
Classification: | Electronic Book |
---|---|
Main author: | |
Format: | Electronic eBook |
Language: | English |
Published: | Birmingham, UK : Packt Publishing, 2015. |
Series: | Community experience distilled. |
Subjects: | |
Online access: | Full text (requires prior registration with institutional email) |
Table of Contents:
- Cover
- Copyright
- Credits
- About the Author
- Acknowledgments
- About the Reviewers
- www.PacktPub.com
- Preface
- Chapter 1: Gearing Up for Predictive Modeling
- Models
- Learning from data
- The core components of a model
- Our first model: k-nearest neighbors
- Types of models
- Supervised, unsupervised, semi-supervised, and reinforcement learning models
- Parametric and nonparametric models
- Regression and classification models
- Real time and batch machine learning models
- The process of predictive modeling
- Defining the model's objective
- Collecting the data
- Picking a model
- Pre-processing the data
- Exploratory data analysis
- Feature transformations
- Encoding categorical features
- Missing data
- Outliers
- Removing problematic features
- Feature engineering and dimensionality reduction
- Training and assessing the model
- Repeating with different models and final model selection
- Deploying the model
- Performance metrics
- Assessing regression models
- Assessing classification models
- Assessing binary classification models
- Summary
- Chapter 2: Linear Regression
- Linear regression
- Assumptions of linear regression
- Simple linear regression
- Estimating the regression coefficients
- Multiple linear regression
- Predicting CPU performance
- Predicting the price of used cars
- Assessing linear regression models
- Residual analysis
- Significance tests for linear regression
- Performance metrics for linear regression
- Comparing different regression models
- Test set performance
- Problems with linear regression
- Multicollinearity
- Outliers
- Feature selection
- Regularization
- Ridge regression
- Least absolute shrinkage and selection operator (lasso)
- Implementing regularization in R
- Summary
- Chapter 3: Logistic Regression
- Classifying with linear regression
- Logistic regression
- Generalized linear models
- Interpreting coefficients in logistic regression
- Assumptions of logistic regression
- Maximum likelihood estimation
- Predicting heart disease
- Assessing logistic regression models
- Model deviance
- Test set performance
- Regularization with the lasso
- Classification metrics
- Extensions of the binary logistic classifier
- Multinomial logistic regression
- Predicting glass type
- Ordinal logistic regression
- Predicting wine quality
- Summary
- Chapter 4: Neural Networks
- The biological neuron
- The artificial neuron
- Stochastic gradient descent
- Gradient descent and local minima
- The perceptron algorithm
- Linear separation
- The logistic neuron
- Multilayer perceptron networks
- Training multilayer perceptron networks
- Predicting the energy efficiency of buildings
- Evaluating multilayer perceptrons for regression
- Predicting glass type revisited
- Predicting handwritten digits
- Receiver operating characteristic curves
- Summary
- Chapter 5: Support Vector Machines
- Maximal margin classification
- Support vector classification
- Inner products
- Kernels and support vector machines
- Predicting chemical biodegradation
- Cross-validation
- Predicting credit scores
- Multi-class classification with support vector machines
- Summary
- Chapter 6: Tree-based Methods
- The intuition for tree models
- Algorithms for training decision trees
- Classification and regression trees
- CART regression trees
- Tree pruning
- Missing data
- Regression model trees
- CART classification trees
- C5.0
- Predicting class membership on synthetic 2D data
- Predicting the authenticity of banknotes
- Predicting complex skill learning
- Tuning model parameters in CART trees
- Variable importance in tree models
- Regression model trees in action
- Summary
- Chapter 7: Ensemble Methods
- Bagging
- Margins and out-of-bag observations
- Predicting complex skill learning with bagging
- Predicting heart disease with bagging
- Limitations of bagging
- Boosting
- AdaBoost
- Predicting atmospheric gamma ray radiation
- Predicting complex skill learning with boosting
- Limitations of boosting
- Random forests
- The importance of variables in random forests
- Summary
- Chapter 8: Probabilistic Graphical Models
- A Little Graph Theory
- Bayes' Theorem
- Conditional independence
- Bayesian networks
- The Naïve Bayes classifier
- Predicting the sentiment of movie reviews
- Hidden Markov models
- Predicting promoter gene sequences
- Predicting letter patterns in English words
- Summary
- Chapter 9: Time Series Analysis
- Fundamental concepts of time series
- Time series summary functions
- Some fundamental time series
- White noise
- Fitting a white noise time series
- Random walk
- Fitting a random walk
- Stationarity
- Stationary time series models
- Moving average models
- Autoregressive models
- Autoregressive moving average models
- Non-stationary time series models
- Autoregressive integrated moving average models
- Autoregressive conditional heteroscedasticity models
- Generalized autoregressive conditional heteroscedasticity models
- Predicting intense earthquakes
- Predicting lynx trappings
- Predicting foreign exchange rates
- Other time series models
- Summary
- Chapter 10: Topic Modeling
- An overview of topic modeling
- Latent Dirichlet Allocation
- The Dirichlet distribution
- The generative process
- Fitting an LDA model
- Modeling the topics of online news stories
- Model stability
- Finding the number of topics
- Topic distributions
- Word distributions
- LDA extensions
- Summary
- Chapter 11: Recommendation Systems
- Rating matrix
- Measuring user similarity
- Collaborative filtering
- User-based collaborative filtering
- Item-based collaborative filtering
- Singular value decomposition
- R and Big Data
- Predicting recommendations for movies and jokes
- Loading and preprocessing the data
- Exploring the data
- Evaluating binary top-N recommendations
- Evaluating non-binary top-N recommendations
- Evaluating individual predictions
- Other approaches to recommendation systems
- Summary
- Index