Regression analysis with Python: learn the art of regression analysis with Python
Learn the art of regression analysis with Python. About This Book: Become competent at implementing regression analysis in Python; solve some of the complex data science problems related to predicting outcomes; get to grips with various types of regression for effective data analysis. Who This Book Is Fo...
Classification: | Electronic Book |
---|---|
Main authors: | , |
Format: | Electronic eBook |
Language: | English |
Published: |
Birmingham, UK :
Packt Publishing,
2016.
|
Series: | Community experience distilled.
|
Subjects: | |
Online access: | Full text (requires prior registration with an institutional email) |
Table of Contents:
- Cover; Copyright; Credits; About the Authors; About the Reviewers; www.PacktPub.com; Table of Contents; Preface
- Chapter 1: Regression - The Workhorse of Data Science; Regression analysis and data science; Exploring the promise of data science; The challenge; The linear models; What you are going to find in the book; Python for data science; Installing Python; Choosing between Python 2 and Python 3; Step-by-step installation; Installing packages; Package upgrades; Scientific distributions; Introducing Jupyter or IPython; Python packages and functions for linear models; NumPy; SciPy
- Statsmodels; Scikit-learn; Summary; Chapter 2: Approaching Simple Linear Regression; Defining a regression problem; Linear models and supervised learning; Reflecting on predictive variables; Reflecting on response variables; The family of linear models; Preparing to discover simple linear regression; Starting from the basics; A measure of linear relationship; Extending to linear regression; Regressing with StatsModels; The coefficient of determination; Meaning and significance of coefficients; Evaluating the fitted values; Correlation is not causation; Predicting with a regression model
- Regressing with Scikit-learn; Minimizing the cost function; Explaining the reason for using squared errors; Pseudoinverse and other optimization methods; Gradient Descent at work; Summary; Chapter 3: Multiple Regression in Action; Using multiple features; Model building with Statsmodels; Using formulas as an alternative; The correlation matrix; Revisiting gradient descent; Feature scaling; Unstandardizing coefficients; Estimating feature importance; Inspecting standardized coefficients; Comparing models by R-squared; Interaction models; Discovering interactions; Polynomial regression
- Testing linear versus cubic transformation; Going for higher-degree solutions; Introducing underfitting and overfitting; Summary; Chapter 4: Logistic Regression; Defining a classification problem; Formalization of the problem: binary classification; Assessing the classifier's performance; Defining a probability-based approach; More on the logistic and logit functions; Let's see some code; Pros and cons of logistic regression; Revisiting Gradient Descent; Multiclass Logistic Regression; An example; Summary; Chapter 5: Data Preparation; Numeric feature scaling; Mean centering; Standardization
- Normalization; The logistic regression case; Qualitative feature encoding; Dummy coding with Pandas; DictVectorizer and one-hot encoding; Feature hasher; Numeric feature transformation; Observing residuals; Summarizations by binning; Missing data; Missing data imputation; Keeping track of missing values; Outliers; Outliers on the response; Outliers among the predictors; Removing or replacing outliers; Summary; Chapter 6: Achieving Generalization; Checking on out-of-sample data; Testing by sample split; Cross-validation; Bootstrapping; Greedy selection of features; The Madelon dataset