The Natural Language Processing Workshop: Confidently design and build your own NLP projects with this easy-to-understand practical guide
Make NLP easy by building chatbots and models, and executing various NLP tasks to gain data-driven insights from raw text data. Key Features: Get familiar with key natural language processing (NLP) concepts and terminology. Explore the functionalities and features of popular NLP tools. Learn how to use...
Classification: | Electronic Book |
---|---|
Main author: | |
Format: | Electronic eBook |
Language: | English |
Published: | Birmingham, UK: Packt Publishing, 2020. |
Subjects: | |
Online access: | Full text (requires prior registration with institutional email) |
Table of Contents:
- Cover
- FM
- Copyright
- Table of Contents
- Preface
- Chapter 1: Introduction to Natural Language Processing
- Introduction
- History of NLP
- Text Analytics and NLP
- Exercise 1.01: Basic Text Analytics
- Various Steps in NLP
- Tokenization
- Exercise 1.02: Tokenization of a Simple Sentence
- PoS Tagging
- Exercise 1.03: PoS Tagging
- Stop Word Removal
- Exercise 1.04: Stop Word Removal
- Text Normalization
- Exercise 1.05: Text Normalization
- Spelling Correction
- Exercise 1.06: Spelling Correction of a Word and a Sentence
- Stemming
- Exercise 1.07: Using Stemming
- Lemmatization
- Exercise 1.08: Extracting the Base Word Using Lemmatization
- Named Entity Recognition (NER)
- Exercise 1.09: Treating Named Entities
- Word Sense Disambiguation
- Exercise 1.10: Word Sense Disambiguation
- Sentence Boundary Detection
- Exercise 1.11: Sentence Boundary Detection
- Activity 1.01: Preprocessing of Raw Text
- Kick Starting an NLP Project
- Data Collection
- Data Preprocessing
- Feature Extraction
- Model Development
- Model Assessment
- Model Deployment
- Summary
- Chapter 2: Feature Extraction Methods
- Introduction
- Types of Data
- Categorizing Data Based on Structure
- Categorizing Data Based on Content
- Cleaning Text Data
- Tokenization
- Exercise 2.01: Text Cleaning and Tokenization
- Exercise 2.02: Extracting n-grams
- Exercise 2.03: Tokenizing Text with Keras and TextBlob
- Types of Tokenizers
- Exercise 2.04: Tokenizing Text Using Various Tokenizers
- Stemming
- RegexpStemmer
- Exercise 2.05: Converting Words in the Present Continuous Tense into Base Words with RegexpStemmer
- The Porter Stemmer
- Exercise 2.06: Using the Porter Stemmer
- Lemmatization
- Exercise 2.07: Performing Lemmatization
- Exercise 2.08: Singularizing and Pluralizing Words
- Language Translation
- Exercise 2.09: Language Translation
- Stop-Word Removal
- Exercise 2.10: Removing Stop Words from Text
- Activity 2.01: Extracting Top Keywords from the News Article
- Feature Extraction from Texts
- Extracting General Features from Raw Text
- Exercise 2.11: Extracting General Features from Raw Text
- Exercise 2.12: Extracting General Features from Text
- Bag of Words (BoW)
- Exercise 2.13: Creating a Bag of Words
- Zipf's Law
- Exercise 2.14: Zipf's Law
- Term Frequency-Inverse Document Frequency (TFIDF)
- Exercise 2.15: TFIDF Representation
- Finding Text Similarity
- Application of Feature Extraction
- Exercise 2.16: Calculating Text Similarity Using Jaccard and Cosine Similarity
- Word Sense Disambiguation Using the Lesk Algorithm
- Exercise 2.17: Implementing the Lesk Algorithm Using String Similarity and Text Vectorization
- Word Clouds
- Exercise 2.18: Generating Word Clouds
- Other Visualizations
- Exercise 2.19: Other Visualizations (Dependency Parse Trees and Named Entities)
- Activity 2.02: Text Visualization
- Summary
- Chapter 3: Developing a Text Classifier
- Introduction