Natural Language Processing with Java : Techniques for Building Machine Learning and Neural Network Models for NLP, 2nd Edition.
Natural Language Processing with Java will explore how to automatically organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. You will leverage the power of Java to extract relationships within different elem...
Classification: | Electronic Book |
---|---|
Main Author: | |
Other Authors: | |
Format: | Electronic eBook |
Language: | English |
Published: | Birmingham : Packt Publishing Ltd, 2018. |
Edition: | 2nd ed. |
Subjects: | |
Online Access: | Full text |
Table of Contents:
- Cover; Title Page; Copyright and Credits; Dedication; Packt Upsell; Contributors; Table of Contents; Preface; Chapter 1: Introduction to NLP; What is NLP?; Why use NLP?; Why is NLP so hard?; Survey of NLP tools; Apache OpenNLP; Stanford NLP; LingPipe; GATE; UIMA; Apache Lucene Core; Deep learning for Java; Overview of text-processing tasks; Finding parts of text; Finding sentences; Feature-engineering; Finding people and things; Detecting parts of speech; Classifying text and documents; Extracting relationships; Using combined approaches; Understanding NLP models; Identifying the task.
- Selecting a model; Building and training the model; Verifying the model; Using the model; Preparing data; Summary; Chapter 2: Finding Parts of Text; Understanding the parts of text; What is tokenization?; Uses of tokenizers; Simple Java tokenizers; Using the Scanner class; Specifying the delimiter; Using the split method; Using the BreakIterator class; Using the StreamTokenizer class; Using the StringTokenizer class; Performance considerations with Java core tokenization; NLP tokenizer APIs; Using the OpenNLPTokenizer class; Using the SimpleTokenizer class; Using the WhitespaceTokenizer class.
- Using the TokenizerME class; Using the Stanford tokenizer; Using the PTBTokenizer class; Using the DocumentPreprocessor class; Using a pipeline; Using LingPipe tokenizers; Training a tokenizer to find parts of text; Comparing tokenizers; Understanding normalization; Converting to lowercase; Removing stopwords; Creating a StopWords class; Using LingPipe to remove stopwords; Using stemming; Using the Porter Stemmer; Stemming with LingPipe; Using lemmatization; Using the StanfordLemmatizer class; Using lemmatization in OpenNLP; Normalizing using a pipeline; Summary; Chapter 3: Finding Sentences.
- The SBD process; What makes SBD difficult?; Understanding the SBD rules of LingPipe's HeuristicSentenceModel class; Simple Java SBDs; Using regular expressions; Using the BreakIterator class; Using NLP APIs; Using OpenNLP; Using the SentenceDetectorME class; Using the sentPosDetect method; Using the Stanford API; Using the PTBTokenizer class; Using the DocumentPreprocessor class; Using the StanfordCoreNLP class; Using LingPipe; Using the IndoEuropeanSentenceModel class; Using the SentenceChunker class; Using the MedlineSentenceModel class; Training a sentence-detector model.
- Using the Trained model; Evaluating the model using the SentenceDetectorEvaluator class; Summary; Chapter 4: Finding People and Things; Why is NER difficult?; Techniques for name recognition; Lists and regular expressions; Statistical classifiers; Using regular expressions for NER; Using Java's regular expressions to find entities; Using the RegExChunker class of LingPipe; Using NLP APIs; Using OpenNLP for NER; Determining the accuracy of the entity; Using other entity types; Processing multiple entity types; Using the Stanford API for NER; Using LingPipe for NER.