NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE. #3659

chericher · 2025-01-10T09:17:44Z

Describe the issue
"I have locally deployed bge-large-zh-v1.5 + qwen2.5-3B-Instruct + GraphRAG 1.1.2, using Python 3.10.12 and torch 2.5. When I run the graphrag index --root ./ command, I encounter the following error:"

15:05:20,104 graphrag.utils.storage INFO reading table from storage: create_final_relationships.parquet
15:05:20,108 graphrag.utils.storage INFO reading table from storage: create_final_entities.parquet
15:05:20,113 graphrag.utils.storage INFO reading table from storage: create_final_communities.parquet
15:05:20,130 graphrag.index.operations.summarize_communities.prepare_community_reports INFO Number of nodes at level=0 => 3
15:05:24,750 httpx INFO HTTP Request: POST http://localhost:8000/v1/chat/completions "HTTP/1.1 200 OK"
15:05:24,912 graphrag.utils.storage INFO reading table from storage: create_final_documents.parquet
15:05:24,917 graphrag.utils.storage INFO reading table from storage: create_final_relationships.parquet
15:05:24,922 graphrag.utils.storage INFO reading table from storage: create_final_text_units.parquet
15:05:24,927 graphrag.utils.storage INFO reading table from storage: create_final_entities.parquet
15:05:24,932 graphrag.utils.storage INFO reading table from storage: create_final_community_reports.parquet
15:05:24,942 graphrag.index.flows.generate_text_embeddings INFO Creating embeddings
15:05:24,942 graphrag.index.operations.embed_text.embed_text INFO using vector store lancedb with container_name default for embedding entity.description: default-entity-description
15:05:25,143 graphrag.index.operations.embed_text.strategies.openai INFO embedding 3 inputs via 3 snippets using 1 batches. max_batch_size=16, max_tokens=8191
15:05:25,391 httpx INFO HTTP Request: POST http://localhost:8150/v1/embeddings "HTTP/1.1 200 OK"
15:05:25,432 graphrag.index.operations.embed_text.embed_text INFO using vector store lancedb with container_name default for embedding text_unit.text: default-text_unit-text
15:05:25,436 graphrag.index.operations.embed_text.strategies.openai INFO embedding 1 inputs via 1 snippets using 1 batches. max_batch_size=16, max_tokens=8191
15:05:25,445 graphrag.index.operations.embed_text.embed_text INFO using vector store lancedb with container_name default for embedding community.full_content: default-community-full_content
15:05:25,448 graphrag.index.operations.embed_text.strategies.openai INFO embedding 1 inputs via 1 snippets using 1 batches. max_batch_size=16, max_tokens=8191
15:05:25,471 httpx INFO HTTP Request: POST http://localhost:8150/v1/embeddings "HTTP/1.1 400 Bad Request"
15:05:25,475 graphrag.callbacks.file_workflow_callbacks INFO Error Invoking LLM details={'prompt': ["# Family A\n\nThe community revolves around the key entities A, F, and M, who are related by familial ties. A is the child of F and M, and both F and M are parents of A. This family structure is central to the community's dynamics.\n\n## F and M as parents\n\nF and M are the parents of A, and their roles as parents are central to the community's structure. Their relationship with A is crucial in understanding the dynamics of the family. [Data: Entities (1, 2), Relationships (0, 1, +more)]\n\n## A as the child\n\nA is the child of F and M, and their relationship with A is central to the community's structure. A's role as a child is significant in understanding the family dynamics and potential conflicts. [Data: Entities (0), Relationships (0, 1, +more)]\n\n## F and M's combined degree\n\nF and M have a combined degree of 3, indicating their significant role in the community. Their relationship with A is crucial in understanding the family dynamics and potential conflicts. [Data: Entities (1, 2), Relationships (0, 1, +more)]\n\n## A's relationship with F and M\n\nA's relationship with F and M is central to the community's structure. Their roles as parents and the relationship with A are significant in understanding the family dynamics and potential conflicts. [Data: Entities (0), Relationships (0, 1, +more)]\n\n## Family structure\n\nThe family structure is central to the community's dynamics, with F and M as parents and A as the child. This structure is significant in understanding the potential for family disputes or conflicts. [Data: Entities (1, 2), Relationships (0, 1, +more)]"], 'kwargs': {}}
15:05:25,476 graphrag.index.run.run_workflows ERROR error running workflow generate_text_embeddings
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/graphrag/index/run/run_workflows.py", line 166, in _run_workflows
    result = await run_workflow(
  File "/usr/local/lib/python3.10/dist-packages/graphrag/index/workflows/generate_text_embeddings.py", line 45, in run_workflow
    await generate_text_embeddings(
  File "/usr/local/lib/python3.10/dist-packages/graphrag/index/flows/generate_text_embeddings.py", line 98, in generate_text_embeddings
    await _run_and_snapshot_embeddings(
  File "/usr/local/lib/python3.10/dist-packages/graphrag/index/flows/generate_text_embeddings.py", line 121, in _run_and_snapshot_embeddings
    data["embedding"] = await embed_text(
  File "/usr/local/lib/python3.10/dist-packages/graphrag/index/operations/embed_text/embed_text.py", line 89, in embed_text
    return await _text_embed_with_vector_store(
  File "/usr/local/lib/python3.10/dist-packages/graphrag/index/operations/embed_text/embed_text.py", line 179, in _text_embed_with_vector_store
    result = await strategy_exec(
  File "/usr/local/lib/python3.10/dist-packages/graphrag/index/operations/embed_text/strategies/openai.py", line 63, in run
    embeddings = await _execute(llm, text_batches, ticker, semaphore)
  File "/usr/local/lib/python3.10/dist-packages/graphrag/index/operations/embed_text/strategies/openai.py", line 103, in _execute
    results = await asyncio.gather(*futures)
  File "/usr/local/lib/python3.10/dist-packages/graphrag/index/operations/embed_text/strategies/openai.py", line 97, in embed
    chunk_embeddings = await llm(chunk)
  File "/usr/local/lib/python3.10/dist-packages/fnllm/base/base.py", line 112, in __call__
    return await self._invoke(prompt, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/fnllm/base/base.py", line 128, in _invoke
    return await self._decorated_target(prompt, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/fnllm/services/retryer.py", line 109, in invoke
    result = await execute_with_retry()
  File "/usr/local/lib/python3.10/dist-packages/fnllm/services/retryer.py", line 93, in execute_with_retry
    async for a in AsyncRetrying(
  File "/usr/local/lib/python3.10/dist-packages/tenacity/asyncio/__init__.py", line 166, in __anext__
    do = await self.iter(retry_state=self._retry_state)
  File "/usr/local/lib/python3.10/dist-packages/tenacity/asyncio/__init__.py", line 153, in iter
    result = await action(retry_state)
  File "/usr/local/lib/python3.10/dist-packages/tenacity/_utils.py", line 99, in inner
    return call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py", line 398, in <lambda>
    self._add_action_func(lambda rs: rs.outcome.result())
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.10/dist-packages/fnllm/services/retryer.py", line 101, in execute_with_retry
    return await attempt()
  File "/usr/local/lib/python3.10/dist-packages/fnllm/services/retryer.py", line 78, in attempt
    return await delegate(prompt, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/fnllm/services/rate_limiter.py", line 70, in invoke
    result = await delegate(prompt, **args)
  File "/usr/local/lib/python3.10/dist-packages/fnllm/base/base.py", line 152, in _decorator_target
    output = await self._execute_llm(prompt, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/fnllm/openai/llm/embeddings.py", line 133, in _execute_llm
    response = await self._call_embeddings_or_cache(
  File "/usr/local/lib/python3.10/dist-packages/fnllm/openai/llm/embeddings.py", line 110, in _call_embeddings_or_cache
    return await self._cache.get_or_insert(
  File "/usr/local/lib/python3.10/dist-packages/fnllm/services/cache_interactor.py", line 50, in get_or_insert
    entry = await func()
  File "/usr/local/lib/python3.10/dist-packages/openai/resources/embeddings.py", line 236, in create
    return await self._post(
  File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 1849, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
  File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 1543, in request
    return await self._request(
  File "/usr/local/lib/python3.10/dist-packages/openai/_base_client.py", line 1644, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': 'NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.\n\n(The expanded size of the tensor (513) must match the existing size (512) at non-singleton dimension 1. Target sizes: [1, 513]. Tensor sizes: [1, 512])', 'code': 50001}
15:05:25,477 graphrag.callbacks.file_workflow_callbacks INFO Error running pipeline! details=None
15:05:25,554 graphrag.cli.index ERROR Errors occurred during the pipeline run, see logs for more details.
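
The "high traffic" wording in the 400 response looks like a generic server message; the parenthesised tensor error is the informative part. As a minimal sketch (the function name `parse_size_mismatch` is hypothetical and not part of GraphRAG or any server API), the two sizes can be pulled out of the message to see how far the input overshoots. One plausible reading, given that bge-large-zh-v1.5 has a 512-token maximum sequence length, is that the community report text tokenized to 513 tokens on the embedding server.

```python
import re
from typing import Optional, Tuple

# Error text copied from the 400 response in the log above.
ERROR_MESSAGE = (
    "NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.\n\n"
    "(The expanded size of the tensor (513) must match the existing size (512) "
    "at non-singleton dimension 1. Target sizes: [1, 513]. Tensor sizes: [1, 512])"
)

# Matches the two sizes reported in the tensor size-mismatch message.
SIZE_PATTERN = re.compile(
    r"expanded size of the tensor \((\d+)\) must match the existing size \((\d+)\)"
)


def parse_size_mismatch(message: str) -> Optional[Tuple[int, int]]:
    """Return (requested_size, existing_size), or None if the message does not match."""
    match = SIZE_PATTERN.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


requested, existing = parse_size_mismatch(ERROR_MESSAGE)
overflow = requested - existing
print(f"requested={requested} existing={existing} overflow={overflow}")
```

If that reading is right, the input is only one token over the limit, so the failure depends on the length of the generated community report rather than on the size of the input file.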

Steps to reproduce
No response

GraphRAG Config Used

### This config file contains required core defaults that must be set, along with a handful of common optional settings.
### For a full list of available settings, see https://microsoft.github.io/graphrag/config/yaml/

### LLM settings ###
## There are a number of settings to tune the threading and token limits for LLM calls - check the docs.

encoding_model: cl100k_base # this needs to be matched to your model!

llm:
  api_key: ${GRAPHRAG_API_KEY} # set this in the generated .env file
  type: openai_chat # or azure_openai_chat
  model: qwen3B
  model_supports_json: false # recommended if this is available for your model.
  # audience: "https://cognitiveservices.azure.com/.default"
  api_base: http://localhost:8000/v1
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>

parallelization:
  stagger: 0.3
  # num_threads: 50

async_mode: threaded # or asyncio

embeddings:
  async_mode: threaded # or asyncio
  vector_store:
    type: lancedb
    db_uri: 'output/lancedb'
    container_name: default
    overwrite: true
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: gpt-4
    api_base: http://localhost:8150/v1
    # api_version: 2024-02-15-preview
    # audience: "https://cognitiveservices.azure.com/.default"
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>

### Input settings ###

input:
  type: file # or blob
  file_type: text # or csv
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\.txt$"

chunks:
  size: 1200
  overlap: 100
  group_by_columns: [id]

### Storage settings ###
## If blob storage is specified in the following four sections,
## connection_string and container_name must be provided

cache:
  type: file # one of [blob, cosmosdb, file]
  base_dir: "cache"

reporting:
  type: file # or console, blob
  base_dir: "logs"

storage:
  type: file # one of [blob, cosmosdb, file]
  base_dir: "output"

## only turn this on if running `graphrag index` with custom settings
## we normally use `graphrag update` with the defaults
update_index_storage:
  # type: file # or blob
  # base_dir: "update_output"

### Workflow settings ###

skip_workflows: []

entity_extraction:
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 1

summarize_descriptions:
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 1000

claim_extraction:
  enabled: false
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 1

community_reports:
  prompt: "prompts/community_report.txt"
  max_length: 1000
  max_input_length: 4000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes (embed_graph must also be enabled)

snapshots:
  graphml: false
  embeddings: false
  transient: false

### Query settings ###
## The prompt locations are required here, but each search method has a number of optional knobs that can be tuned.
## See the config docs: https://microsoft.github.io/graphrag/config/yaml/#query

local_search:
  prompt: "prompts/local_search_system_prompt.txt"

global_search:
  map_prompt: "prompts/global_search_map_system_prompt.txt"
  reduce_prompt: "prompts/global_search_reduce_system_prompt.txt"
  knowledge_prompt: "prompts/global_search_knowledge_system_prompt.txt"

drift_search:
  prompt: "prompts/drift_search_system_prompt.txt"

basic_search:
  prompt: "prompts/basic_search_system_prompt.txt"
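
The indexing log reports `max_tokens=8191` for the embedding step, while the error from the embedding server points to a 512-position limit. A rough sketch of the general technique for fitting long inputs to such a limit is to split the token sequence into windows no longer than the limit. This is an illustration only: `split_into_windows` is a hypothetical helper, not a GraphRAG function, and it assumes the token ids come from the embedding model's own tokenizer (not shown here), since counts from `cl100k_base` will generally differ.

```python
from typing import List, Sequence


def split_into_windows(
    tokens: Sequence[int], max_tokens: int = 512, overlap: int = 0
) -> List[List[int]]:
    """Split a token sequence into consecutive windows of at most max_tokens items."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must satisfy 0 <= overlap < max_tokens")

    step = max_tokens - overlap
    windows: List[List[int]] = []
    for start in range(0, len(tokens), step):
        windows.append(list(tokens[start:start + max_tokens]))
        # Stop once the current window already reaches the end of the input.
        if start + max_tokens >= len(tokens):
            break
    return windows


# A 513-token input, matching the size reported in the error message.
example_tokens = list(range(513))
windows = split_into_windows(example_tokens, max_tokens=512)
print([len(w) for w in windows])
```

BERT-style tokenizers also add special tokens such as `[CLS]` and `[SEP]`, so in practice `max_tokens` would probably need some headroom below 512.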
Logs and screenshots
(graphragtest) root@cdd2b6557714:/home/graphragtest# graphrag index --root ./

Logging enabled at /home/graphragtest/logs/indexing-engine.log
Running standard indexing.
🚀 create_base_text_units
id text document_ids n_tokens
0 b53ef702af00f35578b1cdbf74474a32866bd5bb89a30a... A的爸爸叫F。\n\nA的妈妈叫M。\n [10ae1eaa0dc9f3bd3cbbfc0ff5d391e0a4eb7ed2d604d... 22
🚀 create_final_documents
id human_readable_id title text text_unit_ids
0 10ae1eaa0dc9f3bd3cbbfc0ff5d391e0a4eb7ed2d604dd... 1 report.txt A的爸爸叫F。\n\nA的妈妈叫M。\n [b53ef702af00f35578b1cdbf74474a32866bd5bb89a30...
🚀 extract_graph
None
🚀 compute_communities
level community parent title
0 0 0 -1 A
0 0 0 -1 F
0 0 0 -1 M
🚀 create_final_entities
id human_readable_id title type description text_unit_ids
0 c137ae10-4252-48da-894b-ca30f7aef684 0 A PERSON A is a person [b53ef702af00f35578b1cdbf74474a32866bd5bb89a30...
1 5650c001-6bb1-4868-bbf9-08a8a3f95892 1 F PERSON F is the father of A [b53ef702af00f35578b1cdbf74474a32866bd5bb89a30...
2 b447f3a1-29d2-4130-b586-da16499a79a2 2 M PERSON M is the mother of A [b53ef702af00f35578b1cdbf74474a32866bd5bb89a30...
🚀 create_final_relationships
id human_readable_id source target description weight combined_degree text_unit_ids
0 69ebb419-9b02-4ca0-8d42-12335857355f 0 A F A's father is F 2.0 3 [b53ef702af00f35578b1cdbf74474a32866bd5bb89a30...
1 4860e8a2-b30f-4615-b127-64ddc3617535 1 A M A's mother is M 2.0 3 [b53ef702af00f35578b1cdbf74474a32866bd5bb89a30...
🚀 create_final_nodes
id human_readable_id title community level degree x y
0 c137ae10-4252-48da-894b-ca30f7aef684 0 A 0 0 2 0 0
1 5650c001-6bb1-4868-bbf9-08a8a3f95892 1 F 0 0 1 0 0
2 b447f3a1-29d2-4130-b586-da16499a79a2 2 M 0 0 1 0 0
🚀 create_final_communities
id human_readable_id community ... text_unit_ids period size
0 a48137e0-b5f5-4297-9919-50fb59ef270f 0 0 ... [b53ef702af00f35578b1cdbf74474a32866bd5bb89a30... 2025-01-10 3

[1 rows x 11 columns]
🚀 create_final_text_units
id ... relationship_ids
0 b53ef702af00f35578b1cdbf74474a32866bd5bb89a30a... ... [69ebb419-9b02-4ca0-8d42-12335857355f, 4860e8a...

[1 rows x 7 columns]
🚀 create_final_community_reports
id human_readable_id community ... full_content_json period size
0 54b5f0c3db3343f7a348d43a0ef6f086 0 0 ... {\n "title": "Family A",\n "summary": "T... 2025-01-10 3

[1 rows x 14 columns]
❌ generate_text_embeddings
None
⠼ GraphRAG Indexer
├── Loading Input (text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_base_text_units ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_final_documents ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── extract_graph ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── compute_communities ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_final_entities ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_final_relationships ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_final_nodes ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_final_communities ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_final_text_units ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_final_community_reports ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
❌ Errors occurred during the pipeline run, see logs for more details.

Additional Information
GraphRAG Version: 1.1.2
Operating System: Linux
Python Version: 3.10.12
Related Issues:
