Learning pandas

Get to grips with pandas, a versatile and high-performance Python library for data manipulation, analysis, and discovery.

About This Book
* Get comfortable using pandas and Python as an effective data exploration and analysis tool
* Explore pandas through a framework of data analysis, with an explanatio...

Bibliographic Details
Classification: Electronic Book
Main author: Heydt, Michael (Author)
Format: Electronic eBook
Language: English
Published: Birmingham : Packt Publishing, 2017.
Edition: Second edition.
Subjects:
Online access: Full text
Table of Contents:
  • Cover
  • Copyright
  • Credits
  • About the Author
  • About the Reviewers
  • www.PacktPub.com
  • Customer Feedback
  • Table of Contents
  • Preface
  • Chapter 1: pandas and Data Analysis
  • Introducing pandas
  • Data manipulation, analysis, science, and pandas
  • Data manipulation
  • Data analysis
  • Data science
  • Where does pandas fit?
  • The process of data analysis
  • The process
  • Ideation
  • Retrieval
  • Preparation
  • Exploration
  • Modeling
  • Presentation
  • Reproduction
  • A note on being iterative and agile
  • Relating the book to the process
  • Concepts of data and analysis in our tour of pandas
  • Types of data
  • Structured
  • Unstructured
  • Semi-structured
  • Variables
  • Categorical
  • Continuous
  • Discrete
  • Time series data
  • General concepts of analysis and statistics
  • Quantitative versus qualitative data/analysis
  • Single and multivariate analysis
  • Descriptive statistics
  • Inferential statistics
  • Stochastic models
  • Probability and Bayesian statistics
  • Correlation
  • Regression
  • Other Python libraries of value with pandas
  • Numeric and scientific computing
  • NumPy and SciPy
  • Statistical analysis
  • StatsModels
  • Machine learning
  • scikit-learn
  • PyMC: stochastic Bayesian modeling
  • Data visualization
  • matplotlib and seaborn
  • Matplotlib
  • Seaborn
  • Summary
  • Chapter 2: Up and Running with pandas
  • Installation of Anaconda
  • IPython and Jupyter Notebook
  • IPython
  • Jupyter Notebook
  • Introducing the pandas Series and DataFrame
  • Importing pandas
  • The pandas Series
  • The pandas DataFrame
  • Loading data from files into a DataFrame
  • Visualization
  • Summary
  • Chapter 3: Representing Univariate Data with the Series
  • Configuring pandas
  • Creating a Series
  • Creating a Series using Python lists and dictionaries
  • Creation using NumPy functions
  • Creation using a scalar value
  • The .index and .values properties
  • The size and shape of a Series
  • Specifying an index at creation
  • Heads, tails, and takes
  • Retrieving values in a Series by label or position
  • Lookup by label using the [] operator and the .ix property
  • Explicit lookup by position with .iloc
  • Explicit lookup by labels with .loc
  • Slicing a Series into subsets
  • Alignment via index labels
  • Performing Boolean selection
  • Re-indexing a Series
  • Modifying a Series in-place
  • Summary
  • Chapter 4: Representing Tabular and Multivariate Data with the DataFrame
  • Configuring pandas
  • Creating DataFrame objects
  • Creating a DataFrame using NumPy function results
  • Creating a DataFrame using a Python dictionary and pandas Series objects
  • Creating a DataFrame from a CSV file
  • Accessing data within a DataFrame
  • Selecting the columns of a DataFrame
  • Selecting rows of a DataFrame
  • Scalar lookup by label or location using .at and .iat
  • Slicing using the [] operator
  • Selecting rows using Boolean selection
  • Selecting across both rows and columns
  • Summary
  • Chapter 5: Manipulating DataFrame Structure
  • Configuring pandas
  • Renaming columns
  • Adding new columns with [] and .insert()
  • Adding columns through enlargement
  • Adding columns using concatenation
  • Reordering columns
  • Replacing the contents of a column
  • Deleting columns
  • Appending new rows
  • Concatenating rows
  • Adding and replacing rows via enlargement
  • Removing rows using .drop()
  • Removing rows using Boolean selection
  • Removing rows using a slice
  • Summary
  • Chapter 6: Indexing Data
  • Configuring pandas
  • The importance of indexes
  • The pandas index types
  • The fundamental type: Index
  • Integer index labels using Int64Index and RangeIndex
  • Floating-point labels using Float64Index
  • Representing discrete intervals using IntervalIndex
  • Categorical values as an index: CategoricalIndex
  • Indexing by date and time using DatetimeIndex
  • Indexing periods of time using PeriodIndex
  • Working with Indexes
  • Creating and using an index with a Series or DataFrame
  • Selecting values using an index
  • Moving data to and from the index
  • Reindexing a pandas object
  • Hierarchical indexing
  • Summary
  • Chapter 7: Categorical Data
  • Configuring pandas
  • Creating Categoricals
  • Renaming categories
  • Appending new categories
  • Removing categories
  • Removing unused categories
  • Setting categories
  • Descriptive information of a Categorical
  • Munging school grades
  • Summary
  • Chapter 8: Numerical and Statistical Methods
  • Configuring pandas
  • Performing numerical methods on pandas objects
  • Performing arithmetic on a DataFrame or Series
  • Getting the counts of values
  • Determining unique values (and their counts)
  • Finding minimum and maximum values
  • Locating the n-smallest and n-largest values
  • Calculating accumulated values
  • Performing statistical processes on pandas objects
  • Retrieving summary descriptive statistics
  • Measuring central tendency: mean, median, and mode
  • Calculating the mean
  • Finding the median
  • Determining the mode
  • Calculating variance and standard deviation
  • Measuring variance
  • Finding the standard deviation
  • Determining covariance and correlation
  • Calculating covariance
  • Determining correlation
  • Performing discretization and quantiling of data
  • Calculating the rank of values
  • Calculating the percent change at each sample of a series
  • Performing moving-window operations
  • Executing random sampling of data
  • Summary
  • Chapter 9: Accessing Data
  • Configuring pandas
  • Working with CSV and text/tabular format data
  • Examining the sample CSV data set
  • Reading a CSV file into a DataFrame
  • Specifying the index column when reading a CSV file
  • Data type inference and specification
  • Specifying column names
  • Specifying specific columns to load
  • Saving DataFrame to a CSV file
  • Working with general field-delimited data
  • Handling variants of formats in field-delimited data
  • Reading and writing data in Excel format
  • Reading and writing JSON files
  • Reading HTML data from the web
  • Reading and writing HDF5 format files
  • Accessing CSV data on the web
  • Reading and writing from/to SQL databases
  • Reading data from remote data services
  • Reading stock data from Yahoo! and Google Finance
  • Retrieving options data from Google Finance
  • Reading economic data from the Federal Reserve Bank of St. Louis
  • Accessing Kenneth French's data
  • Reading from the World Bank
  • Summary
  • Chapter 10: Tidying Up Your Data
  • Configuring pandas
  • What is tidying your data?
  • How to work with missing data
  • Determining NaN values in pandas objects
  • Selecting out or dropping missing data
  • Handling of NaN values in mathematical operations
  • Filling in missing data
  • Forward and backward filling of missing values
  • Filling using index labels
  • Performing interpolation of missing values
  • Handling duplicate data
  • Transforming data
  • Mapping data into different values
  • Replacing values
  • Applying functions to transform data
  • Summary
  • Chapter 11: Combining, Relating, and Reshaping Data
  • Configuring pandas
  • Concatenating data in multiple objects
  • Understanding the default semantics of concatenation
  • Switching axes of alignment
  • Specifying join type
  • Appending versus concatenation
  • Ignoring the index labels
  • Merging and joining data
  • Merging data from multiple pandas objects
  • Specifying the join semantics of a merge operation
  • Pivoting data to and from value and indexes
  • Stacking and unstacking
  • Stacking using non-hierarchical indexes
  • Unstacking using hierarchical indexes
  • Melting data to and from long and wide format
  • Performance benefits of stacked data
  • Summary
  • Chapter 12: Data Aggregation
  • Configuring pandas
  • The split, apply, and combine (SAC) pattern
  • Data for the examples
  • Splitting data
  • Grouping by a single column's values
  • Accessing the results of a grouping
  • Grouping using multiple columns
  • Grouping using index levels
  • Applying aggregate functions, transforms, and filters
  • Applying aggregation functions to groups
  • Transforming groups of data
  • The general process of transformation
  • Filling missing values with the mean of the group
  • Calculating normalized z-scores with a transformation
  • Filtering groups from aggregation
  • Summary
  • Chapter 13: Time-Series Modelling
  • Setting up the IPython notebook
  • Representation of dates, time, and intervals
  • The datetime, day, and time objects
  • Representing a point in time with a Timestamp
  • Using a Timedelta to represent a time interval
  • Introducing time-series data
  • Indexing using DatetimeIndex
  • Creating time-series with specific frequencies
  • Calculating new dates using offsets
  • Representing data intervals with date offsets
  • Anchored offsets
  • Representing durations of time using Period
  • Modelling an interval of time with a Period
  • Indexing using the PeriodIndex
  • Handling holidays using calendars
  • Normalizing timestamps using time zones
  • Manipulating time-series data
  • Shifting and lagging
  • Performing frequency conversion on a time-series
  • Up and down resampling of a time-series
  • Time-series moving-window operations
  • Summary
  • Chapter 14: Visualization
  • Configuring pandas
  • Plotting basics with pandas
  • Creating time-series charts
  • Adorning and styling your time-series plot