Multi-GPU support #440
Hi, in the run.py script, the parameters use_multi_gpu and devices are available for configuring multi-GPU parallel execution. Currently, the TSLib repository supports PyTorch's DataParallel mode for parallel processing. If you want to use PyTorch's DDP (Distributed Data Parallel) mode instead, please refer to the official DDP documentation and adapt the code yourself. The relevant check in run.py is:

if args.use_gpu and args.use_multi_gpu:
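As a rough illustration of what DataParallel mode does with the listed devices, here is a minimal, self-contained sketch. The model and sizes are stand-ins, not TSLib's actual classes; on a CPU-only machine the wrapping step is simply skipped.

```python
import torch
import torch.nn as nn

# Stand-in for a TSLib model (hypothetical; real models live in models/).
model = nn.Linear(16, 4)

if torch.cuda.is_available() and torch.cuda.device_count() > 1:
    # Local device ids after CUDA_VISIBLE_DEVICES remapping.
    device_ids = [0, 1]
    # DataParallel splits each input batch across the listed GPUs
    # and gathers the outputs back on device_ids[0].
    model = nn.DataParallel(model, device_ids=device_ids).cuda()

x = torch.randn(8, 16)
out = model(x)  # shape is (8, 4) regardless of how the batch was split
```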
For instance, to run on GPUs 2 and 3, the bash command should look like this:

python -u run.py \
--use_multi_gpu \
--devices 2,3 \
--task_name classification \
...

You should also restrict the visible devices manually, so the entire bash file will look like this:

export CUDA_VISIBLE_DEVICES=2,3
python -u run.py \
--use_multi_gpu \
--devices 2,3 \
--task_name classification \
...

Equivalently, you can set the variable per command:

CUDA_VISIBLE_DEVICES=2,3 python -u run.py \
--use_multi_gpu \
--devices 2,3 \
--task_name anomaly_detection \
...
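If you would rather adapt the code to DDP mode as mentioned above, a minimal single-process sketch of the DDP setup (gloo backend so it runs without GPUs; the model is a stand-in, not TSLib's actual code) looks like:

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# Rendezvous settings normally supplied by torchrun; hard-coded here
# so the sketch runs as a plain single-process script.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group(backend="gloo", rank=0, world_size=1)

# Each process wraps its own model replica; gradients are all-reduced
# across processes during backward().
model = DDP(nn.Linear(16, 4))
out = model(torch.randn(8, 16))

dist.destroy_process_group()
```

With real multi-GPU training you would launch one process per GPU via torchrun and move each replica to its local device instead of running a single process.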
As far as I remember, you might confront a device-ordinal error here. This is because of the CUDA_VISIBLE_DEVICES setting: if you set CUDA_VISIBLE_DEVICES=2,3, then torch recognizes GPU#2 as device id 0 and GPU#3 as device id 1. So if you get the error, fix the device assignment to use the local ids, e.g. args.device_ids = list(range(len(args.device_ids))) and args.device = torch.device('cuda:0'). Hope this helps!
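The remapping described above can be sketched as follows. The variable names mirror run.py's arguments, but the parsing itself is illustrative:

```python
# Value passed via --devices on the command line.
devices = "2,3"

# Physical GPU ids as the user wrote them.
physical_ids = [int(d) for d in devices.split(",")]

# After `export CUDA_VISIBLE_DEVICES=2,3`, torch re-numbers the visible
# GPUs from zero: physical GPU#2 becomes cuda:0, GPU#3 becomes cuda:1.
# So the ids handed to torch must be local, not physical.
device_ids = list(range(len(physical_ids)))
print(device_ids)  # [0, 1]
```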
Hey guys, thanks for your great work. I wonder how to set up multi-GPU runs; I have implemented some all-reduce logic myself, but it isn't elegant. Can you add support for multi-GPU training and testing? Thank you!