
NATURAL LANGUAGE PROCESSING WITH SPARK NLP : learning to understand text at scale.

If you want to build an enterprise-quality application that uses natural language text but aren't sure where to begin or what tools to use, this practical guide will help get you started. Alex Thomas, principal data scientist at Wisecube, shows software engineers and data scientists how to buil...


Bibliographic Details
Classification:	Electronic Book
Main Author:	THOMAS, ALEX
Format:	Electronic eBook
Language:	English
Published:	[Place of publication not identified] O'REILLY MEDIA, 2020.
Subjects:	Spark (Electronic resource : Apache Software Foundation); Natural language processing (Computer science); Application software > Development; Text data mining; Traitement automatique des langues naturelles; Logiciels d'application > Développement.
Online Access:	Full text (requires prior registration with an institutional email address)

Table of Contents:

Intro
Copyright
Table of Contents
Preface
Why Natural Language Processing Is Important and Difficult
Background
Philosophy
Conventions Used in This Book
Using Code Examples
O'Reilly Online Learning
How to Contact Us
Acknowledgments
Part I. Basics
Chapter 1. Getting Started
Introduction
Other Tools
Setting Up Your Environment
Prerequisites
Starting Apache Spark
Checking Out the Code
Getting Familiar with Apache Spark
Starting Apache Spark with Spark NLP
Loading and Viewing Data in Apache Spark
Hello World with Spark NLP
Chapter 2. Natural Language Basics
What Is Natural Language?
Origins of Language
Spoken Language Versus Written Language
Linguistics
Phonetics and Phonology
Morphology
Syntax
Semantics
Sociolinguistics: Dialects, Registers, and Other Varieties
Formality
Context
Pragmatics
Roman Jakobson
How To Use Pragmatics
Writing Systems
Origins
Alphabets
Abjads
Abugidas
Syllabaries
Logographs
Encodings
ASCII
Unicode
UTF-8
Exercises: Tokenizing
Tokenize English
Tokenize Greek
Tokenize Ge'ez (Amharic)
Resources
Chapter 3. NLP on Apache Spark
Parallelism, Concurrency, Distributing Computation
Parallelization Before Apache Hadoop
MapReduce and Apache Hadoop
Apache Spark
Architecture of Apache Spark
Physical Architecture
Logical Architecture
Spark SQL and Spark MLlib
Transformers
Estimators and Models
Evaluators
NLP Libraries
Functionality Libraries
Annotation Libraries
NLP in Other Libraries
Spark NLP
Annotation Library
Stages
Pretrained Pipelines
Finisher
Exercises: Build a Topic Model
Resources
Chapter 4. Deep Learning Basics
Gradient Descent
Backpropagation
Convolutional Neural Networks
Filters
Pooling
Recurrent Neural Networks
Backpropagation Through Time
Elman Nets
LSTMs
Exercise 1
Exercise 2
Resources
Part II. Building Blocks
Chapter 5. Processing Words
Tokenization
Vocabulary Reduction
Stemming
Lemmatization
Stemming Versus Lemmatization
Spelling Correction
Normalization
Bag-of-Words
CountVectorizer
N-Gram
Visualizing: Word and Document Distributions
Exercises
Resources
Chapter 6. Information Retrieval
Inverted Indices
Building an Inverted Index
Vector Space Model
Stop-Word Removal
Inverse Document Frequency
In Spark
Exercises
Resources
Chapter 7. Classification and Regression
Bag-of-Words Features
Regular Expression Features
Feature Selection
Modeling
Naïve Bayes
Linear Models
Decision/Regression Trees
Deep Learning Algorithms
Iteration
Exercises
Chapter 8. Sequence Modeling with Keras
Sentence Segmentation
(Hidden) Markov Models
Section Segmentation
Part-of-Speech Tagging
Conditional Random Field
Chunking and Syntactic Parsing
