
Applied text analysis with Python : enabling language-aware data products with machine learning /

From news and speeches to informal chatter on social media, natural language is one of the richest and most underutilized sources of data. Not only does it come in a constant stream, always changing and adapting in context; it also contains information that is not conveyed by traditional data source...


Bibliographic Details
Classification:	Electronic Book
Main Authors:	Bengfort, Benjamin, 1984- (Author), Bilbro, Rebecca (Author), Ojeda, Tony (Author)
Format:	Electronic eBook
Language:	English
Published:	Sebastopol, CA : O'Reilly Media, [2018]
Edition:	First edition.
Subjects:	Python (Computer program language); Natural language processing (Computer science); Machine learning; Natural Language Processing; Machine Learning; Python (Langage de programmation); Traitement automatique des langues naturelles; Apprentissage automatique; COMPUTERS > Programming Languages > Python; Machine learning
Online Access:	Full text (requires prior registration with an institutional email address)

MARC


LEADER	00000cam a2200000 i 4500
001	OR_on1046057318
003	OCoLC
005	20231017213018.0
006	m o d
007	cr unu\|\|\|\|\|\|\|\|
008	180725s2018 caua ob 001 0 eng d
040			\|a UMI \|b eng \|e rda \|e pn \|c UMI \|d MERER \|d STF \|d OCLCQ \|d TOH \|d OCLCF \|d DEBBG \|d CEF \|d G3B \|d UAB \|d GRG \|d N$T \|d EBLCP \|d TEFOD \|d I3U \|d AAA \|d AJB \|d NRC \|d UMC \|d S9I \|d VT2 \|d C6I \|d OCLCQ \|d YDX \|d UKAHL \|d VFL \|d OCLCQ \|d OCLCO \|d NZAUC \|d OCLCQ \|d OCLCO
019			\|a 1039887927 \|a 1040534119 \|a 1111123796 \|a 1202555092 \|a 1240512425
020			\|a 9781491963012 \|q (electronic book)
020			\|a 1491963018 \|q (electronic book)
020			\|a 9781491962992 \|q (electronic bk.)
020			\|a 1491962992 \|q (electronic bk.)
020			\|a 1491963042
020			\|a 9781491963043
020			\|z 9781491963043 \|q (paperback)
029	1		\|a GBVCP \|b 1029873151
029	1		\|a AU@ \|b 000068979674
035			\|a (OCoLC)1046057318 \|z (OCoLC)1039887927 \|z (OCoLC)1040534119 \|z (OCoLC)1111123796 \|z (OCoLC)1202555092 \|z (OCoLC)1240512425
037			\|a CL0500000981 \|b Safari Books Online
050		4	\|a QA76.73.P98
072		7	\|a COM \|x 051360 \|2 bisacsh
082	0	4	\|a 005.133 \|2 23
049			\|a UAMI
100	1		\|a Bengfort, Benjamin, \|d 1984- \|e author. \|4 aut
245	1	0	\|a Applied text analysis with Python : \|b enabling language-aware data products with machine learning / \|c Benjamin Bengfort, Rebecca Bilbro, and Tony Ojeda.
250			\|a First edition.
264		1	\|a Sebastopol, CA : \|b O'Reilly Media, \|c [2018]
264		4	\|c ©2018
300			\|a 1 online resource (xviii, 310 pages) : \|b illustrations
336			\|a text \|b txt \|2 rdacontent
337			\|a computer \|b c \|2 rdamedia
338			\|a online resource \|b cr \|2 rdacarrier
347			\|a data file
588	0		\|a Online resource; title from title page (Safari, viewed July 23, 2018).
504			\|a Includes bibliographical references and index.
505	0		\|a 1. Language and computation -- 2. Building a custom corpus -- 3. Corpus preprocessing and wrangling -- 4. Text vectorization and transformation pipelines -- 5. Classification for text analysis -- 6. Clustering for text similarity -- 7. Context-aware text analysis -- 8. Text visualization -- 9. Graph analysis of text -- 10. Chatbots -- 11. Scaling text analytics with multiprocessing and Spark -- 12. Deep learning and beyond.
505	0		\|a Cover; Copyright; Table of Contents; Preface; Computational Challenges of Natural Language; Linguistic Data: Tokens and Words; Enter Machine Learning; Tools for Text Analysis; What to Expect from This Book; Who This Book Is For; Code Examples and GitHub Repository; Conventions Used in This Book; Using Code Examples; O'Reilly Safari; How to Contact Us; Acknowledgments; Chapter 1. Language and Computation; The Data Science Paradigm; Language-Aware Data Products; The Data Product Pipeline; Language as Data; A Computational Model of Language; Language Features; Contextual Features.
505	8		\|a Structural Features; Conclusion; Chapter 2. Building a Custom Corpus; What Is a Corpus?; Domain-Specific Corpora; The Baleen Ingestion Engine; Corpus Data Management; Corpus Disk Structure; Corpus Readers; Streaming Data Access with NLTK; Reading an HTML Corpus; Reading a Corpus from a Database; Conclusion; Chapter 3. Corpus Preprocessing and Wrangling; Breaking Down Documents; Identifying and Extracting Core Content; Deconstructing Documents into Paragraphs; Segmentation: Breaking Out Sentences; Tokenization: Identifying Individual Tokens; Part-of-Speech Tagging; Intermediate Corpus Analytics.
505	8		\|a Corpus Transformation; Intermediate Preprocessing and Storage; Reading the Processed Corpus; Conclusion; Chapter 4. Text Vectorization and Transformation Pipelines; Words in Space; Frequency Vectors; One-Hot Encoding; Term Frequency-Inverse Document Frequency; Distributed Representation; The Scikit-Learn API; The BaseEstimator Interface; Extending TransformerMixin; Pipelines; Pipeline Basics; Grid Search for Hyperparameter Optimization; Enriching Feature Extraction with Feature Unions; Conclusion; Chapter 5. Classification for Text Analysis; Text Classification.
505	8		\|a Identifying Classification Problems; Classifier Models; Building a Text Classification Application; Cross-Validation; Model Construction; Model Evaluation; Model Operationalization; Conclusion; Chapter 6. Clustering for Text Similarity; Unsupervised Learning on Text; Clustering by Document Similarity; Distance Metrics; Partitive Clustering; Hierarchical Clustering; Modeling Document Topics; Latent Dirichlet Allocation; Latent Semantic Analysis; Non-Negative Matrix Factorization; Conclusion; Chapter 7. Context-Aware Text Analysis; Grammar-Based Feature Extraction; Context-Free Grammars.
505	8		\|a Syntactic Parsers; Extracting Keyphrases; Extracting Entities; n-Gram Feature Extraction; An n-Gram-Aware CorpusReader; Choosing the Right n-Gram Window; Significant Collocations; n-Gram Language Models; Frequency and Conditional Frequency; Estimating Maximum Likelihood; Unknown Words: Back-off and Smoothing; Language Generation; Conclusion; Chapter 8. Text Visualization; Visualizing Feature Space; Visual Feature Analysis; Guided Feature Engineering; Model Diagnostics; Visualizing Clusters; Visualizing Classes; Diagnosing Classification Error; Visual Steering; Silhouette Scores and Elbow Curves.
520			\|a From news and speeches to informal chatter on social media, natural language is one of the richest and most underutilized sources of data. Not only does it come in a constant stream, always changing and adapting in context; it also contains information that is not conveyed by traditional data sources. The key to unlocking natural language is through the creative application of text analytics. This practical book presents a data scientist's approach to building language-aware products with applied machine learning. You'll learn robust, repeatable, and scalable techniques for text analysis with Python, including contextual and linguistic feature engineering, vectorization, classification, topic modeling, entity resolution, graph analysis, and visual steering. By the end of the book, you'll be equipped with practical methods to solve any number of complex real-world problems. Preprocess and vectorize text into high-dimensional feature representations. Perform document classification and topic modeling. Steer the model selection process with visual diagnostics. Extract key phrases, named entities, and graph structures to reason about data in text. Build a dialog framework to enable chatbots and language-driven interaction. Use Spark to scale processing power and neural networks to scale model complexity.--Provided by publisher.
590			\|a O'Reilly \|b O'Reilly Online Learning: Academic/Public Library Edition
650		0	\|a Python (Computer program language)
650		0	\|a Natural language processing (Computer science)
650		0	\|a Machine learning.
650		2	\|a Natural Language Processing
650		2	\|a Machine Learning
650		6	\|a Python (Langage de programmation)
650		6	\|a Traitement automatique des langues naturelles.
650		6	\|a Apprentissage automatique.
650		7	\|a COMPUTERS \|x Programming Languages \|x Python. \|2 bisacsh
650		7	\|a Machine learning \|2 fast
650		7	\|a Natural language processing (Computer science) \|2 fast
650		7	\|a Python (Computer program language) \|2 fast
700	1		\|a Bilbro, Rebecca, \|e author \|4 aut
700	1		\|a Ojeda, Tony, \|e author \|4 aut
776	0	8	\|i Print version: \|a Bengfort, Benjamin, 1984- \|t Applied text analysis with Python. \|b First edition. \|d Sebastopol, CA : O'Reilly Media, Inc., 2018 \|w (DLC) 2018276483
856	4	0	\|u https://learning.oreilly.com/library/view/~/9781491963036/?ar \|z Full text (requires prior registration with an institutional email address)
938			\|a Askews and Holts Library Services \|b ASKH \|n AH37545940
938			\|a ProQuest Ebook Central \|b EBLB \|n EBL5425029
938			\|a EBSCOhost \|b EBSC \|n 1827695
938			\|a YBP Library Services \|b YANK \|n 15546767
938			\|a YBP Library Services \|b YANK \|n 15530107
994			\|a 92 \|b IZTAP
