GPU is not used #64
Comments
Hi, one thing worth checking is if python actually imports |
Hi, could you provide some printout from your terminal so that we can check whether there are any errors or warnings in it? |
Hi, here are the logs:
We can see that it runs the training on the CPU:
|
Hi, I am sorry, this problem is new to me and I am not quite sure where the error actually is. It seems tensorflow didn't locate (or try to locate) any GPU resources. |
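One way to narrow down why no GPU resources are located is to check whether the process can see the NVIDIA driver at all. The sketch below is only an illustration, not something posted in this thread: the helper names `gpus_hidden_by_env` and `cuda_driver_library_loadable` are hypothetical, and the library name `libcuda.so.1` assumes a Linux container. It checks whether `CUDA_VISIBLE_DEVICES` hides every GPU and whether the driver library can be loaded from Python.

```python
import ctypes
import os


def gpus_hidden_by_env(environ=None):
    """Return True if CUDA_VISIBLE_DEVICES is set so that no GPU is exposed."""
    environ = os.environ if environ is None else environ
    value = environ.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return False  # unset: CUDA applications may see every GPU
    # An empty value or "-1" exposes no device to CUDA applications.
    return value.strip() in ("", "-1")


def cuda_driver_library_loadable(name="libcuda.so.1"):
    """Return True if the NVIDIA driver library can be loaded in this process.

    Assumption: Linux naming ("libcuda.so.1"); adjust for other platforms.
    """
    try:
        ctypes.CDLL(name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    print("GPUs hidden by CUDA_VISIBLE_DEVICES:", gpus_hidden_by_env())
    print("libcuda loadable:", cuda_driver_library_loadable())
```

If the driver library cannot be loaded inside the container, the container may have been started without GPU access, in which case tensorflow has nothing to find regardless of how it was installed.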
@mothguib - I'm facing a similar issue as well. Could you try this out to see whether there is an issue with the CUDA drivers themselves, which could be the reason why tensorflow couldn't find the GPU devices as mentioned above:
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
print("GPUs: ", tf.config.experimental.list_physical_devices('GPU')) |
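To separate a driver problem from a tensorflow problem, it can also help to ask the driver directly, without importing tensorflow. The following is a minimal sketch under stated assumptions: the helper names `parse_gpu_query` and `query_gpus` are hypothetical, and the choice of queried fields is only one reasonable selection. It calls `nvidia-smi` with its `--query-gpu` and `--format` options and returns one dictionary per GPU, or `None` when the tool is missing or fails.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi; this selection is an illustrative choice.
QUERY_FIELDS = ["name", "driver_version", "utilization.gpu", "memory.used"]


def parse_gpu_query(csv_text):
    """Turn nvidia-smi CSV output (no header, no units) into a list of dicts."""
    rows = []
    for line in csv_text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        # Skip lines that do not have exactly one value per requested field.
        if len(values) == len(QUERY_FIELDS):
            rows.append(dict(zip(QUERY_FIELDS, values)))
    return rows


def query_gpus():
    """Return GPU info from nvidia-smi, or None if it is unavailable or fails."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=" + ",".join(QUERY_FIELDS),
            "--format=csv,noheader,nounits",
        ],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return None
    return parse_gpu_query(result.stdout)


if __name__ == "__main__":
    gpus = query_gpus()
    if gpus is None:
        print("nvidia-smi is not available or failed; the driver may not be visible here.")
    else:
        print("GPUs reported by the driver:", gpus)
```

If this reports a GPU while the tensorflow snippet above reports none, the driver is probably fine and the problem is more likely in the tensorflow build or the CUDA libraries it loads. If both report nothing, the driver or the container setup is the more likely culprit.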
@akashjaswal - It seems this issue does indeed come from the CUDA driver itself; my GPU is not detected:
|
I finally solved the problem. It's somewhat of a mystery to me, but when using Docker image |
I faced the same issue and fixed it this way: I am using this GCP image. I found that if you do |
Hello, I've run into the same problem as you. I also use Docker with tensorflow-gpu == 1.15.2, but I don't understand your solution. Can you explain it? |
I set up Pegasus following the instructions in a Docker container with CUDA 10, but it seems that the GPU is not used, whether I run train.py or evaluate.py.
Commands run:
python3 pegasus/bin/train.py --params=aeslc_transformer --param_overrides=vocab_filename=ckpt/pegasus_ckpt/c4.unigram.newline.10pct.96000.model --train_init_checkpoint=ckpt/pegasus_ckpt/model.ckpt-1500000 --model_dir=ckpt/pegasus_ckpt/aeslc
python3 pegasus/bin/evaluate.py --params=aeslc_transformer --param_overrides=vocab_filename=ckpt/pegasus_ckpt/c4.unigram.newline.10pct.96000.model,batch_size=1,beam_size=5,beam_alpha=0.6 --model_dir=ckpt/pegasus_ckpt/aeslc
These programmes run on my 16-core CPU, but when I monitor my GPU with nvidia-smi, it shows that the GPU is not used (utilisation of 0%).
My Python env: