Replies: 1 comment 1 reply
-
I'm curious what system you're running this on 😃 That is a lot of chunks. Did the ingest ever finish?
-
I'm running ingest.py to ingest a txt file containing question-and-answer pairs. It is over 800MB (I know that's a lot). I am using instruct-xl as the embedding model. Does anyone have an approximate idea of how long the ingest process will take to complete?
Loads all documents from the source documents directory, ignoring specified files
Loading new documents: 100%|██████████████████████| 1/1 [00:11<00:00, 11.11s/it]
Loaded 1 new documents from //SOURCE_DOCUMENTS LOCATION
Split into 981498 chunks of text (max. 1000 tokens each)
Creating embeddings. May take some minutes...
Using embedded DuckDB with persistence: data will be stored in: DB
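For a rough sense of scale: embedding time is approximately the number of chunks divided by embedding throughput. The helper below is a hypothetical back-of-the-envelope sketch (the function name and the throughput figure are assumptions, not measured values from this setup); plugging in the 981498 chunks from the log above shows why a large chunk count dominates the runtime.

```python
def estimate_ingest_hours(num_chunks: int, chunks_per_second: float) -> float:
    """Return an approximate wall-clock embedding time in hours.

    chunks_per_second is an assumed throughput; measure your own by
    timing how many chunks your hardware embeds in, say, 60 seconds.
    """
    return num_chunks / chunks_per_second / 3600


# 981498 chunks from the ingest log; 20 chunks/s is a guessed GPU rate.
print(f"{estimate_ingest_hours(981_498, 20):.1f} h")  # ~13.6 h at that rate
```

At a much slower CPU-bound rate (say 2 chunks/s) the same arithmetic gives on the order of 136 hours, so measuring your actual throughput on a small sample first is worthwhile.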