Overview
The ElasticsearchIndexer sends documents to Elasticsearch using the official Java Client. It provides advanced features including join relations for parent-child documents, routing, and external versioning.
Java Class: com.kmwllc.lucille.indexer.ElasticsearchIndexer
Source: ElasticsearchIndexer.java
Configuration
Basic Configuration
With Authentication
Parameters
Target Elasticsearch index name. Example: "documents", "logs-2024-01"
Elasticsearch HTTP endpoint including protocol and port. Example: "https://localhost:9200"
Use the partial update API to modify only specified fields instead of replacing the entire document.
Allow invalid TLS certificates. Use only for development/testing.
Document field that supplies the routing key for shard placement. Example: "user_id", "parent_id"
Version control type:
- External: Use external version numbers
- ExternalGte: External version must be >= current version
KafkaDocument instances.
Join Field Configuration
Elasticsearch supports parent-child relationships using join fields:
Name of the join field mapped in the index. Example: "document_join"
Whether documents being indexed are children in the join relation.
Child relation name. Required when isChild is true. Example: "comment"
Document field containing the parent document ID. Required when isChild is true. Example: "parent_id"
Parent relation name. Required when isChild is false and join is used. Example: "article"
Features
Join Relations (Parent-Child)
Elasticsearch join fields enable parent-child document relationships:
- Parent Documents
- Child Documents
For child documents, set routingField to the parent ID field.
Partial Updates
Use update mode to modify only specific fields:
- Index mode (update=false): Replaces entire document
- Update mode (update=true): Merges fields, preserving unspecified fields
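The difference between the two modes can be sketched with plain maps. This is a minimal illustration, not the indexer's code: the class and method names are hypothetical, and the merge shown here is shallow, whereas Elasticsearch's partial update also merges nested objects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: models index vs. update semantics with plain maps.
class UpdateModeSketch {

  // Index mode (update=false): the incoming document replaces the stored one,
  // so the stored fields are intentionally ignored.
  static Map<String, Object> index(Map<String, Object> stored, Map<String, Object> incoming) {
    return new HashMap<>(incoming);
  }

  // Update mode (update=true): incoming fields are merged over the stored document,
  // so fields that the update does not mention are preserved.
  static Map<String, Object> update(Map<String, Object> stored, Map<String, Object> incoming) {
    Map<String, Object> merged = new HashMap<>(stored);
    merged.putAll(incoming);
    return merged;
  }

  public static void main(String[] args) {
    Map<String, Object> stored = Map.of("title", "Old title", "views", 10);
    Map<String, Object> incoming = Map.of("title", "New title");
    System.out.println(index(stored, incoming));  // only the incoming fields remain
    System.out.println(update(stored, incoming)); // "views" survives, "title" is overwritten
  }
}
```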
Routing
Control shard placement for:
- Parent-child relationships
- Multi-tenant applications
- Query performance optimization
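Elasticsearch derives the target shard from a hash of the routing value, so documents that share a routing value land on the same shard. The sketch below illustrates the idea under simplifying assumptions: String.hashCode() stands in for the Murmur3 hash that Elasticsearch uses, and the class name, document ID, and shard count are hypothetical.

```java
// Hypothetical illustration: Elasticsearch hashes the routing value with Murmur3;
// String.hashCode() is used here only as a simple stand-in.
class RoutingSketch {

  // Maps a routing value to one of numShards primary shards.
  static int shardFor(String routingValue, int numShards) {
    return Math.floorMod(routingValue.hashCode(), numShards);
  }

  public static void main(String[] args) {
    int numShards = 5;
    String parentId = "article-42";
    int parentShard = shardFor(parentId, numShards);
    int childShard = shardFor(parentId, numShards); // the child is routed by its parent's ID
    System.out.println(parentShard == childShard);  // prints "true"
  }
}
```

Because the shard depends only on the routing value, routing a child by its parent's ID is enough to place both on the same shard.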
External Versioning
Use external version numbers (e.g., Kafka offsets from KafkaDocument.getOffset()).
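As a rough sketch of how the two external version types differ: with external versioning Elasticsearch accepts a write only if the incoming version is strictly greater than the stored one, while the "greater than or equal" variant also accepts an equal version. The class, enum, and method names below are hypothetical and only model that comparison.

```java
// Hypothetical illustration of the two external version checks.
class ExternalVersionSketch {

  enum VersionType { EXTERNAL, EXTERNAL_GTE }

  // currentVersion is null when no document is stored yet, in which case any version is accepted.
  static boolean accepts(VersionType type, Long currentVersion, long incomingVersion) {
    if (currentVersion == null) {
      return true;
    }
    return type == VersionType.EXTERNAL
        ? incomingVersion > currentVersion
        : incomingVersion >= currentVersion;
  }

  public static void main(String[] args) {
    // Versions here play the role of Kafka offsets.
    System.out.println(accepts(VersionType.EXTERNAL, 7L, 7L));     // prints "false"
    System.out.println(accepts(VersionType.EXTERNAL_GTE, 7L, 7L)); // prints "true"
    System.out.println(accepts(VersionType.EXTERNAL, 7L, 8L));     // prints "true"
  }
}
```

A rejected write surfaces as a version conflict, which is why replaying an already indexed offset fails under the strict variant but succeeds under the other.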
Index Override Not Supported
Connection Validation
The indexer pings Elasticsearch during startup.
Error Handling
Failed documents are returned with error details. Possible causes include:
- Join field validation failures
- Parent document not found (for children)
- Routing value missing
- Version conflicts
- Mapping type mismatches
Example Configurations
Simple indexing
Parent-child relationships
With routing and updates
Kafka integration
Best Practices
Design join relationships carefully
- Keep parent-child relationships shallow (one level)
- Avoid too many children per parent (Elasticsearch limit: 10000 per shard)
- Use join only when you need parent-child queries (e.g., has_child, has_parent)
- Consider denormalization as an alternative
Always route child documents
Child documents must be on the same shard as their parent. Without proper routing, a child can land on a different shard than its parent, and parent-child queries will not find it.
Prepare index mapping first
Create the index with join field mapping before indexing:
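A join field mapping of this kind might look like the request body held in the sketch below. The field name "document_join" and the relation names "article" and "comment" are only the example values used on this page, and the class name is hypothetical; substitute the names your own configuration declares.

```java
// Hypothetical illustration: holds the JSON body for creating an index with a join field.
class JoinMappingSketch {

  // "article" is the parent relation and "comment" is its child relation.
  static final String MAPPING = """
      {
        "mappings": {
          "properties": {
            "document_join": {
              "type": "join",
              "relations": { "article": "comment" }
            }
          }
        }
      }
      """;

  public static void main(String[] args) {
    // Send this body in a create-index request before running the indexer.
    System.out.println(MAPPING);
  }
}
```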
Use separate pipelines for parents and children
Index parents first, then children. This ensures parents exist before children reference them.
Troubleshooting
Join field already exists error
The indexer automatically adds the join field. If the document already has it:
- Remove the join field from your source data
- Let the indexer populate it based on configuration
Parent document not found
When indexing children:
- Ensure parent documents are indexed first
- Verify the parentDocumentIdSource field contains correct parent IDs
- Check that routing is configured correctly
Index override not supported
If you need multi-index support:
- Use separate pipelines for each index
- Or use OpenSearchIndexer which supports index override
Routing required but missing
For child documents or routing-required indices:
- Set routingField in configuration
- Ensure all documents have the routing field
- Verify routing field contains non-null values
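One way to run the checks above before indexing is a small pre-flight pass over the documents. The sketch below assumes documents are modeled as plain maps with an "id" entry, which is a simplification and not the Lucille Document API; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical pre-flight check; documents are modeled as plain maps for illustration.
class RoutingCheckSketch {

  // Returns the IDs of documents whose routing field is absent or null.
  static List<String> missingRouting(List<Map<String, Object>> docs, String routingField) {
    List<String> missing = new ArrayList<>();
    for (Map<String, Object> doc : docs) {
      if (doc.get(routingField) == null) {
        missing.add(String.valueOf(doc.get("id")));
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    List<Map<String, Object>> docs = List.of(
        Map.of("id", "c1", "parent_id", "a1"),
        Map.of("id", "c2"));
    System.out.println(missingRouting(docs, "parent_id")); // prints "[c2]"
  }
}
```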
Join Field Implementation
The indexer’s ElasticJoinData class handles join field population:
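The following is a sketch of what join field population could look like, based only on the behavior described on this page; it is not the ElasticJoinData source. It assumes documents are plain maps, uses hypothetical class and method names, and writes the join value in the shape Elasticsearch expects: a relation name for parents, plus a parent ID for children.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; this is not the ElasticJoinData source.
class JoinFieldSketch {

  // Adds the join field to a parent document: { "name": parentName }.
  static void populateParent(Map<String, Object> doc, String joinFieldName, String parentName) {
    requireAbsent(doc, joinFieldName);
    doc.put(joinFieldName, Map.of("name", parentName));
  }

  // Adds the join field to a child document: { "name": childName, "parent": parentId }.
  static void populateChild(Map<String, Object> doc, String joinFieldName,
      String childName, String parentIdSource) {
    requireAbsent(doc, joinFieldName);
    Object parentId = doc.get(parentIdSource);
    if (parentId == null) {
      throw new IllegalStateException("Missing parent ID field: " + parentIdSource);
    }
    doc.put(joinFieldName, Map.of("name", childName, "parent", parentId));
  }

  // The join field must not already be present in the source data.
  private static void requireAbsent(Map<String, Object> doc, String joinFieldName) {
    if (doc.containsKey(joinFieldName)) {
      throw new IllegalStateException("Join field already exists: " + joinFieldName);
    }
  }

  public static void main(String[] args) {
    Map<String, Object> child = new HashMap<>(Map.of("id", "c1", "parent_id", "a1"));
    populateChild(child, "document_join", "comment", "parent_id");
    System.out.println(child.get("document_join"));
  }
}
```

Calling either method on a document that already carries the join field throws, which mirrors the "join field already exists" failure described under Troubleshooting.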