Machine learning for hackers
If you're an experienced programmer interested in crunching data, this book will get you started with machine learning--a toolkit of algorithms that enables computers to train themselves to automate useful tasks. Authors Drew Conway and John Myles White help you understand machine learning and...
Classification: | Electronic Book |
---|---|
Main Author: | |
Other Authors: | |
Format: | Electronic eBook |
Language: | English |
Published: | Sebastopol, CA : O'Reilly, 2012. |
Edition: | 1st ed. |
Subjects: | |
Online Access: | Full text (Requires prior registration with institutional email) |
Table of Contents:
- Table of Contents; Preface; Machine Learning for Hackers; How This Book Is Organized; Conventions Used in This Book; Using Code Examples; Safari® Books Online; How to Contact Us; Acknowledgements; Chapter 1. Using R; R for Machine Learning; Downloading and Installing R; Windows; Mac OS X; Linux; IDEs and Text Editors; Loading and Installing R Packages; R Basics for Machine Learning; Loading libraries and the data; Converting date strings and dealing with malformed data; Organizing location data; Dealing with data outside our scope; Aggregating and organizing the data; Analyzing the data.
- Further Reading on R; Chapter 2. Data Exploration; Exploration versus Confirmation; What Is Data?; Inferring the Types of Columns in Your Data; Inferring Meaning; Numeric Summaries; Means, Medians, and Modes; Quantiles; Standard Deviations and Variances; Exploratory Data Visualization; Visualizing the Relationships Between Columns; Chapter 3. Classification: Spam Filtering; This or That: Binary Classification; Moving Gently into Conditional Probability; Writing Our First Bayesian Spam Classifier; Defining the Classifier and Testing It with Hard Ham.
- Testing the Classifier Against All Email Types; Improving the Results; Chapter 4. Ranking: Priority Inbox; How Do You Sort Something When You Don't Know the Order?; Ordering Email Messages by Priority; Priority Features of Email; Writing a Priority Inbox; Functions for Extracting the Feature Set; Creating a Weighting Scheme for Ranking; A log-weighting scheme; Weighting from Email Thread Activity; Training and Testing the Ranker; Chapter 5. Regression: Predicting Page Views; Introducing Regression; The Baseline Model; Regression Using Dummy Variables; Linear Regression in a Nutshell.
- Predicting Web Traffic; Defining Correlation; Chapter 6. Regularization: Text Regression; Nonlinear Relationships Between Columns: Beyond Straight Lines; Introducing Polynomial Regression; Methods for Preventing Overfitting; Preventing Overfitting with Regularization; Text Regression; Logistic Regression to the Rescue; Chapter 7. Optimization: Breaking Codes; Introduction to Optimization; Ridge Regression; Code Breaking as Optimization; Chapter 8. PCA: Building a Market Index; Unsupervised Learning; Chapter 9. MDS: Visually Exploring US Senator Similarity; Clustering Based on Similarity.
- A Brief Introduction to Distance Metrics and Multidirectional Scaling; How Do US Senators Cluster?; Analyzing US Senator Roll Call Data (101st-111th Congresses); Exploring senator MDS clustering by Congress; Chapter 10. kNN: Recommendation Systems; The k-Nearest Neighbors Algorithm; R Package Installation Data; Chapter 11. Analyzing Social Graphs; Social Network Analysis; Thinking Graphically; Hacking Twitter Social Graph Data; Working with the Google SocialGraph API; Analyzing Twitter Networks; Local Community Structure; Visualizing the Clustered Twitter Network with Gephi.