[Feature Request] Introduce Sparse index for indexes with sort fields #17038

Open
mgodwan opened this issue Jan 16, 2025 · 2 comments
Labels
enhancement, Indexing:Performance

Comments

@mgodwan
Member

mgodwan commented Jan 16, 2025

Is your feature request related to a problem? Please describe

Lucene 10 and above support sparse indexing on doc values via FieldType#setDocValuesSkipIndexType. The sparse index records the minimum and maximum values per block of doc IDs. Used in conjunction with index sorting, which clusters similar documents together, it allows very space-efficient and CPU-efficient filtering.
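
To make the idea concrete, here is a toy sketch of a per-block min/max structure and how a range filter can use it to skip blocks. This is not Lucene's implementation: the class `SparseSkipIndexSketch`, its methods, and the block size of 4 are all hypothetical names and values chosen for illustration, and the real skip index is multi-level and lives inside the doc values format.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a per-block min/max skip index over one doc-values column. Not Lucene code. */
final class SparseSkipIndexSketch {
    private final long[] blockMin;
    private final long[] blockMax;
    private final int blockSize;

    private SparseSkipIndexSketch(long[] blockMin, long[] blockMax, int blockSize) {
        this.blockMin = blockMin;
        this.blockMax = blockMax;
        this.blockSize = blockSize;
    }

    /** Records the min and max value for every block of blockSize consecutive doc IDs. */
    static SparseSkipIndexSketch build(long[] values, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        int blocks = (values.length + blockSize - 1) / blockSize;
        long[] min = new long[blocks];
        long[] max = new long[blocks];
        for (int b = 0; b < blocks; b++) {
            int start = b * blockSize;
            int end = Math.min(start + blockSize, values.length);
            long lo = values[start];
            long hi = values[start];
            for (int doc = start + 1; doc < end; doc++) {
                lo = Math.min(lo, values[doc]);
                hi = Math.max(hi, values[doc]);
            }
            min[b] = lo;
            max[b] = hi;
        }
        return new SparseSkipIndexSketch(min, max, blockSize);
    }

    /** Counts blocks whose [min, max] overlaps [lower, upper]; all other blocks can be skipped. */
    int blocksToScan(long lower, long upper) {
        int count = 0;
        for (int b = 0; b < blockMin.length; b++) {
            if (blockMax[b] >= lower && blockMin[b] <= upper) {
                count++;
            }
        }
        return count;
    }

    /** Returns the doc IDs whose value lies in [lower, upper], reading only overlapping blocks. */
    List<Integer> rangeFilter(long[] values, long lower, long upper) {
        List<Integer> hits = new ArrayList<>();
        for (int b = 0; b < blockMin.length; b++) {
            if (blockMax[b] < lower || blockMin[b] > upper) {
                continue; // no document in this block can match
            }
            // If the whole block lies inside the range, every doc matches without a value check.
            boolean allMatch = blockMin[b] >= lower && blockMax[b] <= upper;
            int start = b * blockSize;
            int end = Math.min(start + blockSize, values.length);
            for (int doc = start; doc < end; doc++) {
                if (allMatch || (values[doc] >= lower && values[doc] <= upper)) {
                    hits.add(doc);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        long[] sorted = new long[16];
        long[] shuffled = new long[16];
        for (int i = 0; i < 16; i++) {
            sorted[i] = i;              // doc ID order follows the value, as with index sorting
            shuffled[i] = (i * 5) % 16; // same values, scattered across doc IDs
        }
        System.out.println("sorted:   " + build(sorted, 4).blocksToScan(5, 9) + " of 4 blocks");
        System.out.println("unsorted: " + build(shuffled, 4).blocksToScan(5, 9) + " of 4 blocks");
    }
}
```

With the sorted column, the range [5, 9] overlaps only 2 of the 4 blocks. With the same values scattered across doc IDs, every block's min/max range overlaps the query, so nothing can be skipped. This is why the sparse index pays off mainly when combined with index sorting.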

Describe the solution you'd like

OpenSearch should leverage this for indices with sort fields, because sorting makes it possible to build a sparse index on those fields. For use cases such as data streams and time-series indices sorted on timestamp, we may benefit from not having to create the points data structure for the field, resulting in reduced storage use and efficient filtering for aggregation use cases.

Related component

Indexing:Performance

Describe alternatives you've considered

No response

Additional context

No response

@mgodwan added the enhancement and untriaged labels Jan 16, 2025
@mgodwan
Member Author

mgodwan commented Jan 16, 2025

I'm analysing the integration points. I will come back with possible approaches, and with an assessment of whether this can be delivered in our upcoming OpenSearch 3.0 release.

@andrross
Member

andrross commented Feb 3, 2025

Catch All Triage - 1, 2, 3, 4, 5

@andrross removed the untriaged label Feb 3, 2025