Question about training the UNet and SyncNet #68

Open
wangaocheng opened this issue Jan 11, 2025 · 0 comments

Comments

@wangaocheng

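I trained the UNet for 20,000 steps following the tutorial, and training ran normally. I only have three videos that I recorded myself, about 30 minutes in total.
However, SyncNet training fails with an error.

The first error was: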
RuntimeError: Given groups=1, weight of size [64, 48, 3, 3], expected input[4, 15, 128, 256] to have 48 channels, but got 15 channels instead
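
The first error can be read directly off the two shapes it prints: the weight `[64, 48, 3, 3]` belongs to a layer built for 48 input channels, while the batch `[4, 15, 128, 256]` carries 15. The sketch below just does that arithmetic in plain Python. The helper names are hypothetical, and the step from channels to frames assumes RGB frames stacked along the channel axis, which the thread does not confirm.

```python
def check_conv_channels(input_shape, weight_shape, groups=1):
    """Return (expected, actual) input channels for a 2D convolution.

    input_shape:  (N, C, H, W) of the tensor fed to the layer
    weight_shape: (out_channels, in_channels // groups, kH, kW)
    """
    expected = weight_shape[1] * groups
    actual = input_shape[1]
    return expected, actual


def frames_from_channels(channels, channels_per_frame=3):
    """Assumption: frames are RGB and stacked along the channel axis."""
    if channels % channels_per_frame != 0:
        raise ValueError("channel count is not a multiple of channels_per_frame")
    return channels // channels_per_frame


# Shapes copied from the error message above.
expected, actual = check_conv_channels((4, 15, 128, 256), (64, 48, 3, 3))
print(expected, actual)  # 48 15

# Under the stacked-RGB assumption: 16 frames expected, 5 supplied.
print(frames_from_channels(expected), frames_from_channels(actual))  # 16 5
```

If that assumption holds, the mismatch may come from the number of frames per sample in the data settings rather than from the encoder itself, so it could be worth comparing the two before editing `in_channels`.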

Then I changed the config to:
visual_encoder:
  in_channels: 15 # the original config had 48 here; I changed it to 15

That led to the following error:
RuntimeError: Calculated padded input size per channel: (3 x 2). Kernel size: (3 x 3). Kernel size can't be greater than actual input size

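It says the input feature map is too small. How can I fix this?

The source videos are 1080x1080, training uses 256x256, and the dataset is the preprocessed high_resolution directory.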
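
The second error means that, at some convolution, the padded input along one axis (here 2) was smaller than the kernel (3). The sketch below applies the output-size formula documented for `torch.nn.Conv2d` in plain Python to find where a stack of layers runs out of pixels. The layer list is hypothetical and is not the actual SyncNet architecture; it only reproduces the `(3 x 2)` situation on a toy input. The message alone does not tell us which layer or which input tensor triggers it in the real model.

```python
def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one axis, using the Conv2d shape formula."""
    padded = size + 2 * padding
    effective_kernel = dilation * (kernel - 1) + 1
    if padded < effective_kernel:
        raise ValueError(
            f"padded input size {padded} is smaller than kernel size {effective_kernel}"
        )
    return (padded - effective_kernel) // stride + 1


def first_failing_layer(size_hw, layers):
    """Walk (kernel, stride, padding) layers over an (H, W) size.

    Returns (layer_index, padded_input_size) for the first layer whose
    kernel no longer fits, or None if every layer fits.
    """
    h, w = size_hw
    for index, (kernel, stride, padding) in enumerate(layers):
        try:
            h, w = (
                conv_out_size(h, kernel, stride, padding),
                conv_out_size(w, kernel, stride, padding),
            )
        except ValueError:
            return index, (h + 2 * padding, w + 2 * padding)
    return None


# Hypothetical two-layer stack: 3x3 stride-2 conv, then 3x3 stride-1 conv.
layers = [(3, 2, 0), (3, 1, 0)]

print(first_failing_layer((7, 5), layers))    # (1, (3, 2))
print(first_failing_layer((16, 16), layers))  # None
```

Changing `in_channels` in the config alters only the channel count of the first layer, not the spatial size of anything that flows through the network, so a trace like this over the real layer settings and the real input shapes would be one way to locate the tensor that becomes too small.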
