Bug in AzureAISearch Vector Store: user_id Filter Not Working Correctly #2170

junmo1215 · 2025-01-22T08:33:07Z

🐛 Describe the bug

When configuring azure_ai_search as the vector_store in the mem0 project, updating a user's memory via m.add(xxx, user_id="") does not respect the user_id filter. Specifically, an update intended for one user inadvertently modifies another user's memory.

Sample code:

import time
from pprint import pprint

from mem0 import Memory

m = Memory.from_config(config_dict={
    "version": "v1.1",
    "vector_store": {
        "provider": "azure_ai_search",
        "config": {
            "service_name": "",
            "api_key": "",
            "collection_name": "", 
            "embedding_model_dims": 3072,
            "use_compression": False
        }
    }
})

# Add memory for Alice
m_alice = m.add("my name is Alice", user_id="Alice")
print("m_alice: ", m_alice)

# Add memory for Bob
m_bob = m.add("my name is Bob", user_id="Bob")
print("m_bob: ", m_bob)

# Wait for the operations to complete
time.sleep(10)

# Retrieve all memories
print("Final:")
pprint(m.get_all())

Output

m_alice:  {'results': [{'id': '94a4ce36-84bd-4c77-80da-3ed62bdde35c', 'memory': 'Name is Alice', 'event': 'ADD'}], 'relations': []}
m_bob:  {'results': [{'id': '94a4ce36-84bd-4c77-80da-3ed62bdde35c', 'memory': 'Name is Bob', 'event': 'UPDATE', 'previous_memory': 'Name is Alice'}], 'relations': []}
Final:
{'results': [{'created_at': '2025-01-22T00:19:10.999547-08:00',
              'hash': '2c6f48df7e8d4ea366914773ca57b8b4',
              'id': '94a4ce36-84bd-4c77-80da-3ed62bdde35c',
              'memory': 'Name is Bob',
              'metadata': None,
              'updated_at': '2025-01-22T00:19:13.253396-08:00',
              'user_id': 'Alice'}]}

Issue Details:

As shown in the output:

Adding memory for Alice works as expected.
When adding memory for Bob, instead of creating a new memory entry, it updates Alice's existing memory.
The event for Bob's operation is 'UPDATE' with 'previous_memory': 'Name is Alice'.
In the final output, there's only one memory entry with user_id: 'Alice', but the memory content is 'Name is Bob'.

Expected Behavior:

Each user should have their own separate memory entries. Adding a memory for Bob should create a new entry associated with user_id: 'Bob', without affecting Alice's memory.

Actual Behavior:

Bob's memory addition updates Alice's existing memory instead of creating a new one. This indicates that the user_id filter is not functioning properly in the azure_ai_search vector store implementation, leading to cross-user data contamination.

junmo1215 · 2025-01-22T08:51:23Z

After looking into the implementation of mem0/vector_stores/azure_ai_search.py, I noticed that the index is created with only three fields: id, vector, and payload. The current approach runs the vector query first, keeping only the top `limit` results, and then tries to filter those results on the client side.

def search(self, query, limit=5, filters=None):
    vector_query = VectorizedQuery(vector=query, k_nearest_neighbors=limit, fields="vector")     
    search_results = self.search_client.search(vector_queries=[vector_query], top=limit)

    results = []
    for result in search_results:
        payload = json.loads(result["payload"])
        if filters:
            for key, value in filters.items():
                if key not in payload or payload[key] != value:
                    continue
        results.append(OutputData(id=result["id"], score=result["@search.score"], payload=payload))
    return results

This method has a couple of issues:

Filtering Order:

Applying a limit before filtering could exclude relevant documents that should be included after filtering.

Ineffective Filtering:

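The filtering logic doesn't work as intended: the `continue` statement only advances the inner loop over the filter keys, so results.append(...) still runs for every result and documents aren't actually filtered by user_id.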
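Both points can be reproduced without Azure. The following is a minimal sketch that uses plain dicts in place of search results; the helper names buggy_filter, matches_filters, and fixed_filter are hypothetical and not part of mem0.

```python
def buggy_filter(payloads, filters=None):
    """Mirror the control flow of the search() snippet above."""
    results = []
    for payload in payloads:
        if filters:
            for key, value in filters.items():
                if key not in payload or payload[key] != value:
                    continue  # only moves on to the next filter key
        results.append(payload)  # reached for every payload
    return results


def matches_filters(payload, filters=None):
    """Return True when the payload satisfies every key/value pair."""
    if not filters:
        return True
    return all(
        key in payload and payload[key] == value
        for key, value in filters.items()
    )


def fixed_filter(payloads, filters=None):
    """Keep only the payloads that match all filters."""
    return [p for p in payloads if matches_filters(p, filters)]


payloads = [
    {"user_id": "Alice", "data": "Name is Alice"},
    {"user_id": "Bob", "data": "Name is Bob"},
]

# The buggy version returns Alice's entry even when filtering for Bob.
print(len(buggy_filter(payloads, {"user_id": "Bob"})))  # 2
print(fixed_filter(payloads, {"user_id": "Bob"}))

# Filtering order: if the limit is applied first (here, top 1),
# the only match for Bob has already been cut off.
print(fixed_filter(payloads[:1], {"user_id": "Bob"}))  # []
```

The last call shows why a corrected client-side check alone is not enough: when the limit is applied before filtering, a user can get fewer matches than requested, or none at all.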

These issues might be causing the problem where one user's memory updates another's memory. I'll submit a pull request soon to modify the implementation and address these concerns.
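One possible direction is to let the service apply the filter before the limit, using an OData filter expression. The sketch below is an assumption and not necessarily what the pull request does. It only builds the filter string, and it presumes the index is extended with filterable string fields such as user_id, which the current three-field index does not have. build_odata_filter is a hypothetical helper name.

```python
def build_odata_filter(filters):
    """Build an OData filter expression from exact-match string filters.

    Single quotes inside values are doubled, which is how OData
    escapes them in string literals.
    """
    clauses = []
    for key, value in filters.items():
        escaped = str(value).replace("'", "''")
        clauses.append(f"{key} eq '{escaped}'")
    return " and ".join(clauses)


print(build_odata_filter({"user_id": "Bob"}))
# user_id eq 'Bob'
print(build_odata_filter({"user_id": "O'Brien", "agent_id": "a1"}))
# user_id eq 'O''Brien' and agent_id eq 'a1'
```

The resulting string could then be passed as the filter keyword argument of SearchClient.search(...) alongside vector_queries, so that top=limit applies only to documents that already match the user.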

junmo1215 added a commit to junmo1215/mem0 that referenced this issue Jan 22, 2025

Fix query filter in azure ai search

32d23f4

mem0ai#2170

junmo1215 linked a pull request Jan 22, 2025 that will close this issue

Fix query filter in azure ai search #2171
