Data science from scratch
Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they're also a good way to dive into the discipline without actually understanding data science. In this book, you'll learn how many of the most fundamental data science tools and algorithms wor...
Classification: | Electronic Book |
---|---|
Main author: | |
Format: | Electronic eBook |
Language: | English |
Published: | Sebastopol, CA : O'Reilly Media, [2015] |
Edition: | First edition. |
Subjects: | |
Online access: | Full text (requires prior registration with an institutional email address) |
Table of Contents:
- The Ascendance of Data
- What Is Data Science?
- Motivating Hypothetical: DataSciencester
- Finding Key Connectors
- Data Scientists You May Know
- Salaries and Experience
- Paid Accounts
- Topics of Interest
- Onward
- The Basics
- Getting Python
- The Zen of Python
- Whitespace Formatting
- Modules
- Arithmetic
- Functions
- Strings
- Exceptions
- Lists
- Tuples
- Dictionaries
- Sets
- Control Flow
- Truthiness
- The Not-So-Basics
- Sorting
- List Comprehensions
- Generators and Iterators
- Randomness
- Regular Expressions
- Object-Oriented Programming
- Functional Tools
- enumerate
- zip and Argument Unpacking
- args and kwargs
- Welcome to DataSciencester!
- For Further Exploration
- matplotlib
- Bar Charts
- Line Charts
- Scatterplots
- For Further Exploration
- Vectors
- Matrices
- For Further Exploration
- Describing a Single Set of Data
- Central Tendencies
- Dispersion
- Correlation
- Simpson's Paradox
- Some Other Correlational Caveats
- Correlation and Causation
- For Further Exploration
- Dependence and Independence
- Conditional Probability
- Bayes's Theorem
- Random Variables
- Continuous Distributions
- The Normal Distribution
- The Central Limit Theorem
- For Further Exploration
- Statistical Hypothesis Testing
- Example: Flipping a Coin
- Confidence Intervals
- P-hacking
- Example: Running an A/B Test
- Bayesian Inference
- For Further Exploration
- The Idea Behind Gradient Descent
- Estimating the Gradient
- Using the Gradient
- Choosing the Right Step Size
- Putting It All Together
- Stochastic Gradient Descent
- For Further Exploration
- stdin and stdout
- Reading Files
- The Basics of Text Files
- Delimited Files
- Scraping the Web
- HTML and the Parsing Thereof
- Example: O'Reilly Books About Data
- Using APIs
- JSON (and XML)
- Using an Unauthenticated API
- Finding APIs
- Example: Using the Twitter APIs
- Getting Credentials
- For Further Exploration
- Exploring Your Data
- Exploring One-Dimensional Data
- Two Dimensions
- Many Dimensions
- Cleaning and Munging
- Manipulating Data
- Rescaling
- Dimensionality Reduction
- For Further Exploration
- Modeling
- What Is Machine Learning?
- Overfitting and Underfitting
- Correctness
- The Bias-Variance Trade-off
- Feature Extraction and Selection
- For Further Exploration
- The Model
- Example: Favorite Languages
- The Curse of Dimensionality
- For Further Exploration
- A Really Dumb Spam Filter
- A More Sophisticated Spam Filter
- Implementation
- Testing Our Model
- For Further Exploration
- The Model
- Using Gradient Descent
- Maximum Likelihood Estimation
- For Further Exploration
- The Model
- Further Assumptions of the Least Squares Model
- Fitting the Model
- Interpreting the Model
- Goodness of Fit
- Digression: The Bootstrap
- Standard Errors of Regression Coefficients
- Regularization
- For Further Exploration
- The Problem
- The Logistic Function
- Applying the Model
- Goodness of Fit
- Support Vector Machines
- For Further Investigation
- What Is a Decision Tree?
- Entropy
- The Entropy of a Partition
- Creating a Decision Tree
- Putting It All Together
- Random Forests
- For Further Exploration
- Perceptrons
- Feed-Forward Neural Networks
- Backpropagation
- Example: Defeating a CAPTCHA
- For Further Exploration
- The Idea
- The Model
- Example: Meetups
- Choosing k
- Example: Clustering Colors
- Bottom-up Hierarchical Clustering
- For Further Exploration
- Word Clouds
- n-gram Models
- Grammars
- An Aside: Gibbs Sampling
- Topic Modeling
- For Further Exploration
- Betweenness Centrality
- Eigenvector Centrality
- Matrix Multiplication
- Centrality
- Directed Graphs and PageRank
- For Further Exploration
- Manual Curation
- Recommending What's Popular
- User-Based Collaborative Filtering
- Item-Based Collaborative Filtering
- For Further Exploration
- CREATE TABLE and INSERT
- UPDATE
- DELETE
- SELECT
- GROUP BY
- ORDER BY
- JOIN
- Subqueries
- Indexes
- Query Optimization
- NoSQL
- For Further Exploration
- Example: Word Count
- Why MapReduce?
- MapReduce More Generally
- Example: Analyzing Status Updates
- Example: Matrix Multiplication
- An Aside: Combiners
- For Further Exploration
- IPython
- Mathematics
- Not from Scratch
- NumPy
- pandas
- scikit-learn
- Visualization
- R
- Find Data
- Do Data Science
- Hacker News
- Fire Trucks
- T-shirts
- And You?