Pandas Basics

This book is intended for those who plan to become data scientists as well as anyone who needs to perform data cleaning tasks using Pandas and NumPy.

Bibliographic Details
Classification: Electronic Book
Main author: Campesato, Oswald
Format: Electronic eBook
Language: English
Published: Bloomfield : Mercury Learning & Information, 2022.
Subjects:
Online access: Full text
Table of Contents:
  • Cover
  • Title Page
  • Copyright
  • Dedication
  • Contents
  • Preface
  • Chapter 1: Introduction to Python
  • Tools for Python
  • easy_install and pip
  • virtualenv
  • IPython
  • Python Installation
  • Setting the PATH Environment Variable (Windows Only)
  • Launching Python on Your Machine
  • The Python Interactive Interpreter
  • Python Identifiers
  • Lines, Indentation, and Multi-lines
  • Quotations and Comments
  • Saving Your Code in a Module
  • Some Standard Modules
  • The help() and dir() Functions
  • Compile Time and Runtime Code Checking
  • Simple Data Types
  • Working with Numbers
  • Working with Other Bases
  • The chr() Function
  • The round() Function
  • Formatting Numbers
  • Working with Fractions
  • Unicode and UTF-8
  • Working with Unicode
  • Working with Strings
  • Comparing Strings
  • Formatting Strings
  • Uninitialized Variables and the Value None
  • Slicing and Splicing Strings
  • Testing for Digits and Alphabetic Characters
  • Search and Replace a String in Other Strings
  • Remove Leading and Trailing Characters
  • Printing Text without NewLine Characters
  • Text Alignment
  • Working with Dates
  • Converting Strings to Dates
  • Exception Handling
  • Handling User Input
  • Command-line Arguments
  • Summary
  • Chapter 2: Working with Data
  • Dealing with Data: What Can Go Wrong?
  • What is Data Drift?
  • What are Datasets?
  • Data Preprocessing
  • Data Types
  • Preparing Datasets
  • Discrete Data Versus Continuous Data
  • Binning Continuous Data
  • Scaling Numeric Data via Normalization
  • Scaling Numeric Data via Standardization
  • Scaling Numeric Data via Robust Standardization
  • What to Look for in Categorical Data
  • Mapping Categorical Data to Numeric Values
  • Working with Dates
  • Working with Currency
  • Working with Outliers and Anomalies
  • Outlier Detection/Removal
  • Finding Outliers with NumPy
  • Finding Outliers with Pandas
  • Calculating Z-scores to Find Outliers
  • Finding Outliers with SkLearn (Optional)
  • Working with Missing Data
  • Imputing Values: When is Zero a Valid Value?
  • Dealing with Imbalanced Datasets
  • What is SMOTE?
  • SMOTE extensions
  • The Bias-Variance Tradeoff
  • Types of Bias in Data
  • Analyzing Classifiers (Optional)
  • What is LIME?
  • What is ANOVA?
  • Summary
  • Chapter 3: Introduction to Probability and Statistics
  • What is a Probability?
  • Calculating the Expected Value
  • Random Variables
  • Discrete versus Continuous Random Variables
  • Well-known Probability Distributions
  • Fundamental Concepts in Statistics
  • The Mean
  • The Median
  • The Mode
  • The Variance and Standard Deviation
  • Population, Sample, and Population Variance
  • Chebyshev's Inequality
  • What is a p-value?
  • The Moments of a Function (Optional)
  • What is Skewness?
  • What is Kurtosis?
  • Data and Statistics
  • The Central Limit Theorem
  • Correlation versus Causation
  • Statistical Inferences
  • Statistical Terms: RSS, TSS, R^2, and F1 Score