Learning pandas /

Get to grips with pandas, a versatile and high-performance Python library for data manipulation, analysis, and discovery.

About This Book

* Get comfortable using pandas and Python as an effective data exploration and analysis tool
* Explore pandas through a framework of data analysis, with an explanatio...

Bibliographic Details
Classification:	Electronic Book
Main Author:	Heydt, Michael (Author)
Format:	Electronic eBook
Language:	English
Published:	Birmingham : Packt Publishing, 2017.
Edition:	Second edition.
Subjects:	Python. Electronic data processing. COMPUTERS > Data Processing. COMPUTERS > Programming Languages > Python. COMPUTERS > Data Visualization.
Online Access:	Full text

Table of Contents:

Cover
Copyright
Credits
About the Author
About the Reviewers
www.PacktPub.com
Customer Feedback
Table of Contents
Preface
Chapter 1: pandas and Data Analysis
Introducing pandas
Data manipulation, analysis, science, and pandas
Data manipulation
Data analysis
Data science
Where does pandas fit?
The process of data analysis
The process
Ideation
Retrieval
Preparation
Exploration
Modeling
Presentation
Reproduction
A note on being iterative and agile
Relating the book to the process
Concepts of data and analysis in our tour of pandas
Types of data
Structured
Unstructured
Semi-structured
Variables
Categorical
Continuous
Discrete
Time series data
General concepts of analysis and statistics
Quantitative versus qualitative data/analysis
Single and multivariate analysis
Descriptive statistics
Inferential statistics
Stochastic models
Probability and Bayesian statistics
Correlation
Regression
Other Python libraries of value with pandas
Numeric and scientific computing
NumPy and SciPy
Statistical analysis
StatsModels
Machine learning
scikit-learn
PyMC - stochastic Bayesian modeling
Data visualization
matplotlib and seaborn
Matplotlib
Seaborn
Summary
Chapter 2: Up and Running with pandas
Installation of Anaconda
IPython and Jupyter Notebook
IPython
Jupyter Notebook
Introducing the pandas Series and DataFrame
Importing pandas
The pandas Series
The pandas DataFrame
Loading data from files into a DataFrame
Visualization
Summary
Chapter 3: Representing Univariate Data with the Series
Configuring pandas
Creating a Series
Creating a Series using Python lists and dictionaries
Creation using NumPy functions
Creation using a scalar value
The .index and .values properties
The size and shape of a Series
Specifying an index at creation
Heads, tails, and takes
Retrieving values in a Series by label or position
Lookup by label using the [] operator and the .ix property
Explicit lookup by position with .iloc
Explicit lookup by labels with .loc
Slicing a Series into subsets
Alignment via index labels
Performing Boolean selection
Re-indexing a Series
Modifying a Series in-place
Summary
Chapter 4: Representing Tabular and Multivariate Data with the DataFrame
Configuring pandas
Creating DataFrame objects
Creating a DataFrame using NumPy function results
Creating a DataFrame using a Python dictionary and pandas Series objects
Creating a DataFrame from a CSV file
Accessing data within a DataFrame
Selecting the columns of a DataFrame
Selecting rows of a DataFrame
Scalar lookup by label or location using .at and .iat
Slicing using the [] operator
Selecting rows using Boolean selection
Selecting across both rows and columns
Summary
Chapter 5: Manipulating DataFrame Structure
Configuring pandas
Renaming columns
Adding new columns with [] and .insert()
Adding columns through enlargement
Adding columns using concatenation
Reordering columns
Replacing the contents of a column
Deleting columns
Appending new rows
Concatenating rows
Adding and replacing rows via enlargement
Removing rows using .drop()
Removing rows using Boolean selection
Removing rows using a slice
Summary
Chapter 6: Indexing Data
Configuring pandas
The importance of indexes
The pandas index types
The fundamental type - Index
Integer index labels using Int64Index and RangeIndex
Floating-point labels using Float64Index
Representing discrete intervals using IntervalIndex
Categorical values as an index - CategoricalIndex
Indexing by date and time using DatetimeIndex
Indexing periods of time using PeriodIndex
Working with Indexes
Creating and using an index with a Series or DataFrame
Selecting values using an index
Moving data to and from the index
Reindexing a pandas object
Hierarchical indexing
Summary
Chapter 7: Categorical Data
Configuring pandas
Creating Categoricals
Renaming categories
Appending new categories
Removing categories
Removing unused categories
Setting categories
Descriptive information of a Categorical
Munging school grades
Summary
Chapter 8: Numerical and Statistical Methods
Configuring pandas
Performing numerical methods on pandas objects
Performing arithmetic on a DataFrame or Series
Getting the counts of values
Determining unique values (and their counts)
Finding minimum and maximum values
Locating the n-smallest and n-largest values
Calculating accumulated values
Performing statistical processes on pandas objects
Retrieving summary descriptive statistics
Measuring central tendency: mean, median, and mode
Calculating the mean
Finding the median
Determining the mode
Calculating variance and standard deviation
Measuring variance
Finding the standard deviation
Determining covariance and correlation
Calculating covariance
Determining correlation
Performing discretization and quantiling of data
Calculating the rank of values
Calculating the percent change at each sample of a series
Performing moving-window operations
Executing random sampling of data
Summary
Chapter 9: Accessing Data
Configuring pandas
Working with CSV and text/tabular format data
Examining the sample CSV data set
Reading a CSV file into a DataFrame
Specifying the index column when reading a CSV file
Data type inference and specification
Specifying column names
Specifying specific columns to load
Saving DataFrame to a CSV file
Working with general field-delimited data
Handling variants of formats in field-delimited data
Reading and writing data in Excel format
Reading and writing JSON files
Reading HTML data from the web
Reading and writing HDF5 format files
Accessing CSV data on the web
Reading and writing from/to SQL databases
Reading data from remote data services
Reading stock data from Yahoo! and Google Finance
Retrieving options data from Google Finance
Reading economic data from the Federal Reserve Bank of St. Louis
Accessing Kenneth French's data
Reading from the World Bank
Summary
Chapter 10: Tidying Up Your Data
Configuring pandas
What is tidying your data?
How to work with missing data
Determining NaN values in pandas objects
Selecting out or dropping missing data
Handling of NaN values in mathematical operations
Filling in missing data
Forward and backward filling of missing values
Filling using index labels
Performing interpolation of missing values
Handling duplicate data
Transforming data
Mapping data into different values
Replacing values
Applying functions to transform data
Summary
Chapter 11: Combining, Relating, and Reshaping Data
Configuring pandas
Concatenating data in multiple objects
Understanding the default semantics of concatenation
Switching axes of alignment
Specifying join type
Appending versus concatenation
Ignoring the index labels
Merging and joining data
Merging data from multiple pandas objects
Specifying the join semantics of a merge operation
Pivoting data to and from value and indexes
Stacking and unstacking
Stacking using non-hierarchical indexes
Unstacking using hierarchical indexes
Melting data to and from long and wide format
Performance benefits of stacked data
Summary
Chapter 12: Data Aggregation
Configuring pandas
The split, apply, and combine (SAC) pattern
Data for the examples
Splitting data
Grouping by a single column's values
Accessing the results of a grouping
Grouping using multiple columns
Grouping using index levels
Applying aggregate functions, transforms, and filters
Applying aggregation functions to groups
Transforming groups of data
The general process of transformation
Filling missing values with the mean of the group
Calculating normalized z-scores with a transformation
Filtering groups from aggregation
Summary
Chapter 13: Time-Series Modelling
Setting up the IPython notebook
Representation of dates, time, and intervals
The datetime, day, and time objects
Representing a point in time with a Timestamp
Using a Timedelta to represent a time interval
Introducing time-series data
Indexing using DatetimeIndex
Creating time-series with specific frequencies
Calculating new dates using offsets
Representing data intervals with date offsets
Anchored offsets
Representing durations of time using Period
Modelling an interval of time with a Period
Indexing using the PeriodIndex
Handling holidays using calendars
Normalizing timestamps using time zones
Manipulating time-series data
Shifting and lagging
Performing frequency conversion on a time-series
Up and down resampling of a time-series
Time-series moving-window operations
Summary
Chapter 14: Visualization
Configuring pandas
Plotting basics with pandas
Creating time-series charts
Adorning and styling your time-series plot
