Encoding using multiple-GPUs #541
I am doing a task where I am recommending products based on their reviews. To do that I am using your library, and when I am creating the index, I would like to know whether there is a way to utilize several GPUs, because the time it takes to encode the review text is huge and I have a multi-GPU environment.

Also, the task expects explanations, so I would like to know whether the same can be done when I execute the explain method.

Thank you.
This has been brought up from time to time and is a good reminder that better integrated support should be added. One less-than-pretty workaround that has been used in the past:

```python
import torch

embeddings.model.model.model = torch.nn.DataParallel(embeddings.model.model.model)
```

Might be worth trying that to see if it improves performance at all.
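For illustration, a minimal end-to-end sketch of that workaround. It assumes the default sentence-transformers vector backend (so the nested `model.model.model` attribute path is valid) and that the vector model is loaded at construction; the model name and documents are placeholders.

```python
import torch
from txtai.embeddings import Embeddings

# Placeholder model; any sentence-transformers model works the same way
embeddings = Embeddings({"path": "sentence-transformers/all-MiniLM-L6-v2"})

# DataParallel splits each encoding batch across all visible GPUs
if torch.cuda.device_count() > 1:
    embeddings.model.model.model = torch.nn.DataParallel(embeddings.model.model.model)

# Index as usual; encoding batches now fan out across GPUs
documents = ["great product, works as advertised", "battery drains too quickly"]
embeddings.index([(uid, text, None) for uid, text in enumerate(documents)])
```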
Let's keep this open and I'll take a look at better multi-GPU support in an upcoming release. Ultimately, the best solution given Python's GIL is to spawn a pool of encoding processes, one per GPU.
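As a sketch of that process-per-GPU idea, sentence-transformers (the default encoder behind txtai) already ships a multi-process encoding pool that spawns one worker per CUDA device; the model name here is a placeholder:

```python
from sentence_transformers import SentenceTransformer

def encode(texts):
    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

    # Spawns one worker process per visible CUDA device by default
    pool = model.start_multi_process_pool()

    # Input is chunked and distributed across the worker processes
    vectors = model.encode_multi_process(texts, pool)

    model.stop_multi_process_pool(pool)
    return vectors

if __name__ == "__main__":
    # Guard is required because the pool spawns child processes
    print(encode(["first review", "second review"] * 1000).shape)
```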
Thank you for the response.
But this seems to underutilise the two GPUs in the Kaggle free tier. Any suggestions?
If you're going to use the multi-process pool, make sure to undo the torch DataParallel wrapper first. This most likely needs a dedicated effort to optimize encoding for 2 GPUs; I only develop with 1 GPU, so it's not a use case I've prioritized.
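For reference, undoing the wrapper means restoring the module that DataParallel stores under its `.module` attribute; a sketch, assuming the `embeddings` instance and attribute path from the earlier snippet:

```python
import torch

inner = embeddings.model.model.model

# DataParallel keeps the original module under .module
if isinstance(inner, torch.nn.DataParallel):
    embeddings.model.model.model = inner.module
```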
I was able to significantly speed up the process using this external encoding handling feature. An encoding that took around 1 hour in a single-GPU environment was reduced to 18 minutes. But it would be nice if this were a built-in feature of your library.
Glad to hear it! Nice to see you were able to use an external transform function to solve this. Will keep this open to add a similar method directly to txtai.
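In the meantime, a sketch of the external transform approach, assuming txtai's "external" vectorization method accepts a callable that maps a list of texts to an array of vectors (model name and documents are placeholders):

```python
from sentence_transformers import SentenceTransformer
from txtai.embeddings import Embeddings

if __name__ == "__main__":
    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

    # One encoding worker process per GPU
    pool = model.start_multi_process_pool()

    def transform(inputs):
        # txtai calls this with batches of text; encoding fans out across GPUs
        return model.encode_multi_process(list(inputs), pool)

    # "external" delegates vectorization to the supplied callable
    embeddings = Embeddings({"method": "external", "transform": transform})
    embeddings.index([(uid, text, None) for uid, text in enumerate(["doc one", "doc two"])])

    model.stop_multi_process_pool(pool)
```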