
Data science at the command line : obtain, scrub, explore, and model data with Unix power tools /

This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, au...

Bibliographic Details
Classification:	Electronic Book
Main Author:	Janssens, Jeroen, (Data scientist) (Author)
Format:	Electronic eBook
Language:	English
Published:	Sebastopol, CA : O'Reilly Media, Inc., 2021.
Edition:	Second edition.
Subjects:	Electronic data processing. Database management. Information science. Command languages (Computer science) User interfaces (Computer systems) Bases de données > Gestion. Sciences de l'information. Langages de commande (Informatique) Interfaces utilisateurs (Informatique) information science.
Online Access:	Full text (Requires prior registration with institutional email)

MARC


LEADER	00000cam a2200000 i 4500
001	OR_on1242402237
003	OCoLC
005	20231017213018.0
006	m o d
007	cr cnu---unuuu
008	110920s2021 cau ob 001 0 eng
040			\|a AU@ \|b eng \|e rda \|c AU@ \|d EBLCP \|d YDX \|d OCLCO \|d OCLCF \|d TEFOD \|d NTG \|d UKAHL \|d OCLCO \|d OCLCQ \|d OCL
019			\|a 1264472070 \|a 1277060788
020			\|a 9781492087885 \|q electronic book
020			\|a 1492087882 \|q electronic book
020			\|a 9781492087861 \|q (electronic bk.)
020			\|a 1492087866 \|q (electronic bk.)
020			\|z 9781492087915
024	8		\|a 9781492087908
029	0		\|a AU@ \|b 000068857662
035			\|a (OCoLC)1242402237 \|z (OCoLC)1264472070 \|z (OCoLC)1277060788
037			\|a 0F0298B8-E89C-47DB-A15D-692FAEE4972E \|b OverDrive, Inc. \|n http://www.overdrive.com
050		4	\|a QA76.76.O63 \|b J36 2021
082	0	4	\|a 005.7 \|2 23
049			\|a UAMI
100	1		\|a Janssens, Jeroen, \|c (Data scientist), \|e author.
245	1	0	\|a Data science at the command line : \|b obtain, scrub, explore, and model data with Unix power tools / \|c Jeroen Janssens.
250			\|a Second edition.
264		1	\|a Sebastopol, CA : \|b O'Reilly Media, Inc., \|c 2021.
300			\|a 1 online resource (250 pages)
336			\|a text \|b txt \|2 rdacontent
337			\|a computer \|b c \|2 rdamedia
338			\|a online resource \|b cr \|2 rdacarrier
520			\|a This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packed with over 80 tools--useful whether you work with Windows, macOS, or Linux. You'll quickly discover why the command line is an agile, scalable, and extensible technology. Even if you're comfortable processing data with Python or R, you'll learn how to greatly improve your data science workflow by leveraging the command line's power. This book is ideal for data scientists, analysts, and engineers; software and machine learning engineers; and system administrators. Obtain data from websites, APIs, databases, and spreadsheets. Perform scrub operations on text, CSV, HTML, XML, and JSON files. Explore data, compute descriptive statistics, and create visualizations. Manage your data science workflow. Create reusable command-line tools from one-liners and existing Python or R code. Parallelize and distribute data-intensive pipelines. Model data with dimensionality reduction, clustering, regression, and classification algorithms.
504			\|a Includes bibliographical references and index.
505	0		\|a Cover -- Copyright -- Table of Contents -- Foreword -- Preface -- What to Expect from This Book -- Changes for the Second Edition -- How to Read This Book -- Who This Book Is For -- Conventions Used in This Book -- O'Reilly Online Learning -- How to Contact Us -- Acknowledgments for the Second Edition (2021) -- Acknowledgments for the First Edition (2014) -- Chapter 1. Introduction -- Data Science Is OSEMN -- Obtaining Data -- Scrubbing Data -- Exploring Data -- Modeling Data -- Interpreting Data -- Intermezzo Chapters -- What Is the Command Line? -- Why Data Science at the Command Line?
505	8		\|a The Command Line Is Agile -- The Command Line Is Augmenting -- The Command Line Is Scalable -- The Command Line Is Extensible -- The Command Line Is Ubiquitous -- Summary -- For Further Exploration -- Chapter 2. Getting Started -- Getting the Data -- Installing the Docker Image -- Essential Unix Concepts -- The Environment -- Executing a Command-Line Tool -- Five Types of Command-Line Tools -- Combining Command-Line Tools -- Redirecting Input and Output -- Working with Files and Directories -- Managing Output -- Help! -- Summary -- For Further Exploration -- Chapter 3. Obtaining Data -- Overview
505	8		\|a Copying Local Files to the Docker Container -- Downloading from the Internet -- Introducing curl -- Saving -- Other Protocols -- Following Redirects -- Decompressing Files -- Converting Microsoft Excel Spreadsheets to CSV -- Querying Relational Databases -- Calling Web APIs -- Authentication -- Streaming APIs -- Summary -- For Further Exploration -- Chapter 4. Creating Command-Line Tools -- Overview -- Converting One-Liners into Shell Scripts -- Step 1: Create a File -- Step 2: Give Permission to Execute -- Step 3: Define a Shebang -- Step 4: Remove the Fixed Input -- Step 5: Add Arguments
505	8		\|a Step 6: Extend Your PATH -- Creating Command-Line Tools with Python and R -- Porting the Shell Script -- Processing Streaming Data from Standard Input -- Summary -- For Further Exploration -- Chapter 5. Scrubbing Data -- Overview -- Transformations, Transformations Everywhere -- Plain Text -- Filtering Lines -- Extracting Values -- Replacing and Deleting Values -- CSV -- Bodies and Headers and Columns, Oh My! -- Performing SQL Queries on CSV -- Extracting and Reordering Columns -- Filtering Rows -- Merging Columns -- Combining Multiple CSV Files -- Working with XML/HTML and JSON -- Summary
505	8		\|a For Further Exploration -- Chapter 6. Project Management with Make -- Overview -- Introducing Make -- Running Tasks -- Building, for Real -- Adding Dependencies -- Summary -- For Further Exploration -- Chapter 7. Exploring Data -- Overview -- Inspecting Data and Its Properties -- Header or Not, Here I Come -- Inspect All the Data -- Feature Names and Data Types -- Unique Identifiers, Continuous Variables, and Factors -- Computing Descriptive Statistics -- Column Statistics -- R One-Liners on the Shell -- Creating Visualizations -- Displaying Images from the Command Line -- Plotting in a Rush
590			\|a O'Reilly \|b O'Reilly Online Learning: Academic/Public Library Edition
650		0	\|a Electronic data processing.
650		0	\|a Database management.
650		0	\|a Information science.
650		0	\|a Command languages (Computer science)
650		0	\|a User interfaces (Computer systems)
650		6	\|a Bases de données \|x Gestion.
650		6	\|a Sciences de l'information.
650		6	\|a Langages de commande (Informatique)
650		6	\|a Interfaces utilisateurs (Informatique)
650		7	\|a information science. \|2 aat
650		7	\|a Information science. \|2 fast \|0 (OCoLC)fst00972640
650		7	\|a Electronic data processing. \|2 fast \|0 (OCoLC)fst00906956
650		7	\|a Database management. \|2 fast \|0 (OCoLC)fst00888037
650		7	\|a Command languages (Computer science) \|2 fast \|0 (OCoLC)fst01739453
650		7	\|a User interfaces (Computer systems) \|2 fast \|0 (OCoLC)fst01163191
776	0	8	\|i Print version: \|a Janssens, Jeroen \|t Data Science at the Command Line \|d Sebastopol : O'Reilly Media, Incorporated,c2021 \|z 9781492087915
856	4	0	\|u https://learning.oreilly.com/library/view/~/9781492087908/?ar \|z Full text (Requires prior registration with institutional email)
938			\|a YBP Library Services \|b YANK \|n 302404626
938			\|a ProQuest Ebook Central \|b EBLB \|n EBL6706301
938			\|a Askews and Holts Library Services \|b ASKH \|n AH39158631
994			\|a 92 \|b IZTAP
