Issue running single gpu training script #229
Replies: 2 comments 4 replies
-
Solved: the issue was caused by Python 3.7. Running the script with Python 3.10 fixed it.
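In case it helps anyone else, a guard like the one below at the top of the training script would at least flag an older interpreter before training starts. This is only a sketch: the thread does not establish why Python 3.7 triggers the error, and `MIN_PYTHON` and `python_is_supported` are names made up here, not part of the MedSAM code.

```python
import sys

# Assumption: 3.10 is simply the version reported as working in this thread,
# not an officially documented minimum.
MIN_PYTHON = (3, 10)


def python_is_supported(current=None, minimum=MIN_PYTHON):
    """Return True if `current` (major, minor) is at least `minimum`."""
    if current is None:
        current = sys.version_info[:2]
    return tuple(current) >= tuple(minimum)


if not python_is_supported():
    # Warn rather than abort, since the root cause is not confirmed.
    print(
        "Warning: running on Python %d.%d; this thread reports the script "
        "working on Python %d.%d." % (sys.version_info[:2] + MIN_PYTHON)
    )
```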
-
@Shidhanta95 With batch = 2, I reckon the problem could be coming from mask_decoder: with But @JunMa11 Sorry for tagging you out of the blue. I have spent hours on this matter with no success. Since you are the contributor of branch 0.1, perhaps you have encountered this problem before?
-
Hi, I am new to deep learning, so apologies if the question is trivial. I am using a modified version of the train_one_gpu script to train the MedSAM model on a dataset. The first time I ran the script I had no issues, but the second time, without making any changes, I got the following error:
"RuntimeError: The size of tensor a (4) must match the size of tensor b (2) at non-singleton dimension 0"
I passed the code and the tensor dimensions to ChatGPT and asked it to output the tensor sizes; according to its answer, there should be no mismatch in the tensor dimensions.
I am clueless as to why the script runs the first time and then fails on the next run. I have attached screenshots of the first run and of the error. If required, I can share my script as well.
![medsam single gpu training](https://private-user-images.githubusercontent.com/147026855/317648938-1ec63fbc-cb90-4e4b-81e8-5e542cb94008.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk0NjY2OTksIm5iZiI6MTczOTQ2NjM5OSwicGF0aCI6Ii8xNDcwMjY4NTUvMzE3NjQ4OTM4LTFlYzYzZmJjLWNiOTAtNGU0Yi04MWU4LTVlNTQyY2I5NDAwOC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjEzJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIxM1QxNzA2MzlaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT0wMzU3ZGI0OTEzMDdkNjE3Mjc2NDBhNzZkYzgxNTRlNjE4YWRkOWNhMDE3YTRmZWM2NmFiZGJkZmZhZDY3MDVkJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.vNv6WSxZsx6wzmuqDF8HDU3ygti9bXzR68VXT80MzGg)
![medsam single gpu training error (2)](https://private-user-images.githubusercontent.com/147026855/317648971-6d870c70-3d55-4107-b0e7-39390ab4606b.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk0NjY2OTksIm5iZiI6MTczOTQ2NjM5OSwicGF0aCI6Ii8xNDcwMjY4NTUvMzE3NjQ4OTcxLTZkODcwYzcwLTNkNTUtNDEwNy1iMGU3LTM5MzkwYWI0NjA2Yi5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjEzJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIxM1QxNzA2MzlaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1jYTgzOGFhNmRiNmQ0NWY1NTMyMmVlZjA1MGYwOGIyY2RmM2QzZDhiZGRlMDBmY2JmNTRhYjlmMTU0YWE5Y2QxJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.enaYnYZRmwy0TLUR1nolwu6Q3_5AatzqERPKBJGfBFU)
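For anyone reading the error message above: it is the standard broadcasting failure, raised when two tensors have different sizes in the same dimension and neither size is 1. The check can be sketched in plain Python as below. The helper name `first_broadcast_mismatch` and the example shapes are hypothetical and are not taken from my script; only the sizes 4 and 2 at dimension 0 come from the error.

```python
def first_broadcast_mismatch(shape_a, shape_b):
    """Return (dim, size_a, size_b) for the leftmost incompatible dimension.

    Shapes are compared right-aligned, as in NumPy/PyTorch broadcasting:
    two sizes are compatible when they are equal or one of them is 1.
    Returns None when the shapes can be broadcast together.
    """
    ndim = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s so both have the same rank.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    for dim, (size_a, size_b) in enumerate(zip(a, b)):
        if size_a != size_b and size_a != 1 and size_b != 1:
            return dim, size_a, size_b
    return None


# Hypothetical shapes: one tensor with batch size 4, the other with batch size 2.
print(first_broadcast_mismatch((4, 1, 256, 256), (2, 1, 256, 256)))  # (0, 4, 2)
# A batch dimension of size 1 broadcasts, so there is no mismatch here.
print(first_broadcast_mismatch((4, 1, 256, 256), (1, 1, 256, 256)))  # None
```

Printing `.shape` for both operands just before the failing line and comparing them this way shows which tensor carries the unexpected batch size.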