
Fixes for concurrent document indexing and querying #684

Closed

Conversation

@danielaskdd (Contributor) commented on Jan 31, 2025

Problems Addressed

  1. When enable_llm_cache_for_entity_extract is True and enable_llm_cache is False, there is a race condition when document indexing and a user query run at the same time: extract_entities temporarily modifies the global config enable_llm_cache to True, causing the user query to also use the cache during that window.
  2. Performance bottleneck during document indexing due to LLM function instance exhaustion, causing user query latency. The limit_async_func_call decorator waits by sleeping in a loop until another task releases a slot, so under sustained indexing load a user query may never win a freed slot until indexing is over (see the sketch after this list).
  3. Thread safety issues in NanoVectorDB's disk persistence operations.
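For context, here is a minimal reconstruction of the sleep-polling limiter described in problem 2; the names and the exact loop are assumptions, not LightRAG's verbatim code:

```python
import asyncio
from functools import wraps

# Hypothetical reconstruction of a sleep-polling call limiter.
# Waiters poll in a loop, so wake-up order is arbitrary: a steady stream
# of indexing calls can keep grabbing freed slots while a query call
# keeps sleeping and re-losing the race.
def limit_async_func_call_polling(max_size: int, wait_seconds: float = 0.1):
    def decorator(func):
        current = 0

        @wraps(func)
        async def wrapper(*args, **kwargs):
            nonlocal current
            while current >= max_size:        # busy-wait, no FIFO fairness
                await asyncio.sleep(wait_seconds)
            current += 1
            try:
                return await func(*args, **kwargs)
            finally:
                current -= 1

        return wrapper
    return decorator
```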

Solutions

  1. Entity Extraction Improvements:
  • Introduce a force_llm_cache parameter to handle_cache, forcing the function to handle the cache no matter what the global config settings are.
  • Modify extract_entities to use this new parameter to control the behavior of handle_cache.
  2. Concurrent Operation Optimization:
  • Replace the function call limiting mechanism of limit_async_func_call with asyncio.Semaphore, ensuring FIFO order to prevent query starvation during indexing (sketched below).
  • Remove the redundant semaphore logic from EmbeddingFunc.
  3. Introduce an asyncio.Lock to protect file save operations in NanoVectorDBStorage.

The above changes improve the reliability of concurrent document indexing and querying while maintaining system performance and improving thread safety.
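A minimal sketch of the semaphore-based limiter from solution 2; the decorator name matches the PR, while the body is an assumption about the implementation:

```python
import asyncio
from functools import wraps

# asyncio.Semaphore queues waiters internally and wakes them in FIFO
# order, so a query issued during heavy indexing is served in turn
# instead of repeatedly losing the race to newly arriving indexing calls.
def limit_async_func_call(max_size: int):
    def decorator(func):
        sem = asyncio.Semaphore(max_size)

        @wraps(func)
        async def wrapper(*args, **kwargs):
            async with sem:                   # released even if func raises
                return await func(*args, **kwargs)

        return wrapper
    return decorator
```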

Bug fixes

  1. Fix quantize_embedding not supporting embedding vectors passed as plain Python lists (see the sketch after this list).
  2. Fix missing cache_type in llm_response_cache, and add a new cache_type, extract, for the entity extraction LLM response cache.
  3. Fix prompt response cache failure when is_embedding_cache_enabled is true.
  4. Fix llm_model_func retrieval error in handle_cache; this function is used for LLM similarity verification.
  5. Update the similarity_check prompt to avoid occasionally generating two scores.
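A hedged sketch of bug fix 1, matching the "Convert list to numpy array if needed" commit below; the min-max quantization details are assumptions about the surrounding code, not LightRAG's exact scheme:

```python
import numpy as np

def quantize_embedding(embedding, bits: int = 8):
    # The fix: accept a plain Python list as well as an np.ndarray.
    embedding = np.asarray(embedding, dtype=np.float32)
    # Min-max quantization to uint8 (illustrative of the existing scheme).
    mn, mx = float(embedding.min()), float(embedding.max())
    scale = (mx - mn) / (2**bits - 1) or 1.0   # guard against a flat vector
    quantized = np.round((embedding - mn) / scale).astype(np.uint8)
    return quantized, mn, mx
```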

- Abandon the approach of temporarily replacing the global llm_model_func configuration
- Introduce custom_llm function with new_config for handle_cache while extracting entities
- Update handle_cache to accept custom_llm
- Separate insert/query embedding funcs
- Add query-specific async limit
- Update storage classes to use new funcs
- Protect vector DB save with lock
- Improve config handling for thresholds
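A hedged sketch of the "Separate insert/query embedding funcs" and "Add query-specific async limit" commits above: the same embedding callable is wrapped twice, each wrapper with its own concurrency budget, so bulk indexing cannot exhaust the slots an interactive query needs. All names here are illustrative, not LightRAG's actual wiring:

```python
import asyncio
from functools import wraps

def with_limit(func, max_size: int):
    sem = asyncio.Semaphore(max_size)

    @wraps(func)
    async def wrapper(*args, **kwargs):
        async with sem:
            return await func(*args, **kwargs)

    return wrapper

async def embed(texts: list[str]) -> list[list[float]]:
    ...  # call the real embedding model here

insert_embedding_func = with_limit(embed, 16)  # bulk document indexing
query_embedding_func = with_limit(embed, 4)    # queries keep their own pool
```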
@danielaskdd (Contributor, Author) commented:
@ParisNeo This update involves some core functionalities in LightRAG, and I hope you could help with a preliminary review of the PR.

@danielaskdd changed the title from "Fix concurrent problem extract entity and query at the same time" to "Fixes for concurrent document indexing and querying" on Jan 31, 2025
- Convert list to numpy array if needed
- Maintain existing functionality
- Introduce asyncio.Lock for save operations
- Ensure thread-safe file writes
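A minimal sketch of the lock-protected save from the commit above; the method and attribute names follow LightRAG's storage conventions but are assumptions rather than verbatim code:

```python
import asyncio

class NanoVectorDBStorage:
    def __init__(self, client):
        self._client = client                 # the NanoVectorDB instance
        self._save_lock = asyncio.Lock()

    async def index_done_callback(self):
        # Only one coroutine may persist the DB file at a time, preventing
        # interleaved or partial writes when indexing tasks finish together.
        async with self._save_lock:
            self._client.save()
```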
…ormance.

- Replace custom counter with asyncio.Semaphore
- The existing implementation cannot guarantee FIFO order
- Removed custom LLM function in entity extraction
- Simplified cache handling logic
- Added `force_llm_cache` parameter
- Updated cache handling conditions
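A hedged sketch of the force_llm_cache parameter from the commit above: entity extraction can force cache handling without mutating the shared enable_llm_cache flag (the race in problem 1). The signature and lookup are assumptions, not LightRAG's exact code:

```python
async def handle_cache(hashing_kv, args_hash, prompt, mode="default",
                       cache_type=None, force_llm_cache=False):
    if hashing_kv is None:
        return None
    # force_llm_cache bypasses the global flag instead of rewriting it,
    # so concurrent queries see their own, untouched configuration.
    if not (force_llm_cache
            or hashing_kv.global_config.get("enable_llm_cache", False)):
        return None
    return await hashing_kv.get_by_id(args_hash)  # cached response, if any
```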
@danielaskdd closed this on Feb 1, 2025
@danielaskdd deleted the fix-extract-entity-concurrent-problem branch on February 1, 2025 at 20:38