Elasticsearch Server - Third Edition.
Leverage Elasticsearch to create a robust, fast, and flexible search solution with ease. About This Book: Boost the searching capabilities of your system through synonyms, multilingual data handling, nested objects, and parent-child documents. Deep dive into the world of data aggregation and data analysi...
Classification: | Electronic Book |
---|---|
Main Authors: | , |
Format: | Electronic eBook |
Language: | English |
Published: | Birmingham : Packt Publishing, Limited, Feb. 2016. |
Edition: | 3rd ed. |
Subjects: | |
Online Access: | Full text |
Table of Contents:
- Cover
- Copyright
- Credits
- About the Authors
- About the Reviewer
- www.PacktPub.com
- Table of Contents
- Preface
- Chapter 1: Getting Started with Elasticsearch Cluster
- Full text searching
- The Lucene glossary and architecture
- Input data analysis
- Indexing and querying
- Scoring and query relevance
- The basics of Elasticsearch
- Key concepts of Elasticsearch
- Index
- Document
- Document type
- Mapping
- Key concepts of the Elasticsearch infrastructure
- Nodes and clusters
- Shards
- Replicas
- Gateway
- Indexing and searching
- Installing and configuring your cluster
- Installing Java
- Installing Elasticsearch
- Running Elasticsearch
- Shutting down Elasticsearch
- The directory layout
- Configuring Elasticsearch
- The system-specific installation and configuration
- Installing Elasticsearch on Linux
- Configuring Elasticsearch as a system service on Linux
- Elasticsearch as a system service on Windows
- Manipulating data with the REST API
- Understanding the REST API
- Storing data in Elasticsearch
- Creating a new document
- Retrieving documents
- Updating documents
- Dealing with non-existing documents
- Adding partial documents
- Deleting documents
- Versioning
- Usage example
- Versioning from external systems
- Searching with the URI request query
- Sample data
- URI search
- Elasticsearch query response
- Query analysis
- URI query string parameters
- The query
- The default search field
- Analyzer
- The default operator property
- Query explanation
- The fields returned
- Sorting the results
- The search timeout
- The results window
- Limiting per-shard results
- Ignoring unavailable indices
- The search type
- Lowercasing term expansion
- Wildcard and prefix analysis
- Lucene query syntax
- Summary
- Chapter 2: Indexing Your Data
- Elasticsearch indexing
- Shards and replicas
- Write consistency
- Creating indices
- Altering automatic index creation
- Settings for a newly created index
- Index deletion
- Mappings configuration
- Type determining mechanism
- Disabling the type determining mechanism
- Tuning the type determining mechanism for numeric types
- Tuning the type determining mechanism for dates
- Index structure mapping
- Type and types definition
- Fields
- Core types
- Multi fields
- The IP address type
- Token count type
- Using analyzers
- Out-of-the-box analyzers
- Defining your own analyzers
- Default analyzers
- Different similarity models
- Setting per-field similarity
- Available similarity models
- Batch indexing to speed up your indexing process
- Preparing data for bulk indexing
- Indexing the data
- The _all field
- The _source field
- Additional internal fields
- Introduction to segment merging
- Segment merging
- The need for segment merging
- The merge policy
- The merge scheduler
- Throttling
- Introduction to routing
- Default indexing
- Default searching
- Routing
- The routing parameters
- Routing fields
- Summary
- Chapter 3: Searching Your Data
- Querying Elasticsearch
- The example data
- A simple query
- Paging and result size
- Returning the version value
- Limiting the score
- Choosing the fields that we want to return
- Source filtering
- Using the script fields
- Passing parameters to the script fields
- Understanding the querying process
- Query logic
- Search type
- Search execution preference
- Search shards API
- Basic queries
- The term query
- The terms query
- The match all query
- The type query
- The exists query
- The missing query
- The common terms query
- The match query
- The Boolean match query
- The phrase match query
- The match phrase prefix query
- The multi match query
- The query string query
- Running the query string query against multiple fields
- The simple query string query
- The identifiers query
- The prefix query
- The fuzzy query
- The wildcard query
- The range query
- Regular expression query
- The more like this query
- Compound queries
- The bool query
- The dis_max query
- The boosting query
- The constant_score query
- The indices query
- Using span queries
- A span
- Span term query
- Span first query
- Span near query
- Span or query
- Span not query
- Span within query
- Span containing query
- Span multi query
- Performance considerations
- Choosing the right query
- The use cases
- Limiting results to given tags
- Searching for values in a range
- Boosting some of the matched documents
- Ignoring lower scoring partial queries
- Using Lucene query syntax in queries
- Handling user queries without errors
- Autocomplete using prefixes
- Finding terms similar to a given one
- Matching phrases
- Spans, spans everywhere
- Summary
- Chapter 4: Extending Your Querying Knowledge
- Filtering your results
- The context is the key
- Explicit filtering with bool query
- Highlighting
- Getting started with highlighting
- Field configuration
- Under the hood
- Forcing highlighter type
- Configuring HTML tags
- Controlling highlighted fragments
- Global and local settings
- Require matching
- Custom highlighting query
- The Postings highlighter
- Validating your queries
- Using the Validate API
- Sorting data
- Default sorting
- Selecting fields used for sorting
- Sorting mode
- Specifying behavior for missing fields
- Dynamic criteria
- Calculate scoring when sorting
- Query rewrite
- Prefix query as an example
- Getting back to Apache Lucene
- Query rewrite properties
- Summary
- Chapter 5: Extending Your Index Structure
- Indexing tree-like structures
- Data structure
- Analysis
- Indexing data that is not flat
- Data
- Objects
- Arrays
- Mappings
- Final mappings
- Sending the mappings to Elasticsearch
- To be or not to be dynamic
- Disabling object indexing
- Using nested objects
- Scoring and nested queries
- Using the parent-child relationship
- Index structure and data indexing
- Child mappings
- Parent mappings
- The parent document
- Child documents
- Querying
- Querying data in the child documents
- Querying data in the parent documents
- Performance considerations
- Modifying your index structure with the update API
- The mappings
- Adding a new field to the existing index
- Modifying fields of an existing index
- Summary
- Chapter 6: Make Your Search Better
- Introduction to Apache Lucene scoring
- When a document is matched
- Default scoring formula
- Relevancy matters
- Scripting capabilities of Elasticsearch
- Objects available during script execution
- Script types
- In file scripts
- Inline scripts
- Indexed scripts
- Querying with scripts
- Scripting with parameters
- Script languages
- Using other than embedded languages
- Using native code
- The factory implementation
- Implementing the native script
- The plugin definition
- Installing the plugin
- Running the script
- Searching content in different languages
- Handling languages differently
- Handling multiple languages
- Detecting the language of the document
- Sample document
- The mappings
- Querying
- Queries with an identified language
- Queries with an unknown language
- Combining queries
- Influencing scores with query boosts
- The boost
- Adding the boost to queries
- Modifying the score
- Constant score query
- Boosting query
- The function score query
- When does index-time boosting make sense?
- Defining boosting in the mappings
- Words with the same meaning
- Synonym filter
- Synonyms in the mappings
- Synonyms stored on the file system
- Defining synonym rules
- Using Apache Solr synonyms
- Using WordNet synonyms
- Query or index-time synonym expansion
- Understanding the explain information
- Understanding field analysis
- Explaining the query
- Summary
- Chapter 7: Aggregations for Data Analysis
- Aggregations
- General query structure
- Inside the aggregations engine
- Aggregation types
- Metrics aggregations
- Minimum, maximum, average, and sum
- Field value statistics and extended statistics
- Value count
- Field cardinality
- Percentiles
- Percentile ranks
- Top hits aggregation
- Geo bounds aggregation
- Scripted metrics aggregation
- Buckets aggregations
- Filter aggregation
- Filters aggregation
- Terms aggregation
- Range aggregation
- Date range aggregation
- IPv4 range aggregation
- Missing aggregation
- Histogram aggregation
- Date histogram aggregation
- Time zones
- Geo distance aggregations
- Geohash grid aggregation
- Global aggregation
- Significant terms aggregation
- Choosing significant terms
- Multiple value analysis
- Sampler aggregation
- Children aggregation
- Nested aggregation
- Reverse nested aggregation
- Nesting aggregations and ordering buckets
- Buckets ordering
- Pipeline aggregations
- Available types
- Referencing other aggregations
- Gaps in the data
- Pipeline aggregation types
- Summary
- Chapter 8: Beyond Full-text Searching
- Percolator
- The index
- Percolator preparation
- Getting deeper
- Controlling the size of returned results
- Percolator and score calculation
- Combining percolators with other functionalities
- Getting the number of matching queries
- Indexed document percolation