-
Did you update the repo? There was a very similar issue yesterday that was fixed.
-
I have SDP set and "disable memory attention" enabled. When I start training, I get an OOM error with batch size 5 on an RTX 3060 12 GB card. Should I not even try TI training with SDP?
-
Does it work with lower batch sizes? Embedding training with a high batch size has been buggy for a long time; I don't think it has anything to do with SDP. Plus, any kind of memory-optimized attention can create a false backward pass, so training is much slower to actually learn. For the OOM itself, see the sketch below.
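A common workaround for OOM on a 12 GB card is gradient accumulation: keep the effective batch size while lowering per-step VRAM use. This is a generic PyTorch sketch, not the webui's actual training loop; the model, optimizer, and data below are toy stand-ins.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the real embedding-training objects (assumption:
# nothing here mirrors the webui's actual classes or names).
model = nn.Linear(8, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
data = [(torch.randn(1, 8), torch.randn(1, 1)) for _ in range(10)]

accum_steps = 5  # effective batch = micro-batch size * accum_steps
optimizer.zero_grad()
for step, (x, y) in enumerate(data):
    # Scale the loss so the accumulated gradients average instead of summing.
    loss = nn.functional.mse_loss(model(x), y) / accum_steps
    loss.backward()                      # gradients accumulate across micro-batches
    if (step + 1) % accum_steps == 0:
        optimizer.step()                 # one optimizer step per accum_steps batches
        optimizer.zero_grad()
```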
-
Training with cross-attention optimization enabled under torch 2.0 is just bad. It doesn't matter if it worked before; it's not recommended, and there is no way to make training work properly with cross-attention optimization in general. Optimized cross-attention shortcuts the backward pass (that's part of why it's faster), which means the loss function cannot be correctly evaluated, so you think you're learning when you're not. I asked a question; if you want to proceed with troubleshooting, let's go through that.
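If anyone wants to test that claim directly, torch 2.0 lets you force a specific SDP backend and compare its gradients against the plain "math" implementation. A minimal sketch, assuming a CUDA GPU where the memory-efficient kernel is available; the shapes are arbitrary:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

def make():
    return torch.randn(1, 8, 64, 40, device="cuda", requires_grad=True)

q, k, v = make(), make(), make()

def grads(flash, math, mem_efficient):
    # Run one forward/backward pass with only the selected SDP backend enabled.
    with torch.backends.cuda.sdp_kernel(
        enable_flash=flash, enable_math=math, enable_mem_efficient=mem_efficient
    ):
        out = F.scaled_dot_product_attention(q, k, v)
    return torch.autograd.grad(out.sum(), (q, k, v))

g_math = grads(False, True, False)   # reference "math" backend
g_mem = grads(False, False, True)    # memory-efficient backend

# If an optimized backend truly broke the backward pass, these would diverge;
# small numerical differences are expected and harmless.
for a, b in zip(g_math, g_mem):
    print((a - b).abs().max().item())
```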
-
Preparing dataset...
100%|████████████████████████████████████████████████████████████████████████████████| 334/334 [00:09<00:00, 36.39it/s]
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:02<00:00, 7.64it/s]
embedding train: TypeError█████████████████████████████████████████████████████████████| 20/20 [00:02<00:00, 8.16it/s]
┌───────────────────────────────────────── Traceback (most recent call last) ─────────────────────────────────────────┐
│ D:\automatic\modules\textual_inversion\textual_inversion.py:604 in train_embedding │
│ │
│ 603 │ │ │ │ │ │ captioned_image = caption_image_overlay(image, title, footer_lef │
│ > 604 │ │ │ │ │ │ captioned_image = insert_image_data_embed(captioned_image, data) │
│ 605 │
│ │
│ D:\automatic\modules\textual_inversion\image_embedding.py:74 in insert_image_data_embed │
│ │
│ 73 │ │
│ > 74 │ h = image.size[1] │
│ 75 │ next_size = data_np_low.shape[0] + (h-(data_np_low.shape[0] % h)) │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
TypeError: 'int' object is not subscriptable
Applying scaled dot product cross attention optimization
The images are 512x512. See the note below on the TypeError.
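On the TypeError itself: `image.size[1]` raises "'int' object is not subscriptable" when `image` is a numpy array rather than a PIL image, because `ndarray.size` is the total element count (an int), not PIL's `(width, height)` tuple. A minimal sketch of a guard, assuming that is the cause here; `as_pil` is a hypothetical helper, not part of the repo:

```python
import numpy as np
from PIL import Image

def as_pil(image):
    # Hypothetical helper: accept either a PIL image or an HxWxC numpy array.
    if isinstance(image, np.ndarray):
        return Image.fromarray(image.astype(np.uint8))
    return image

arr = np.zeros((512, 512, 3), dtype=np.uint8)
print(type(arr.size))   # <class 'int'> -- subscripting this raises the TypeError
img = as_pil(arr)
print(img.size[1])      # 512 -- tuple indexing works as image_embedding.py expects
```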