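Following the tutorial, I trained the UNet for 20,000 steps and training ran normally. My dataset is only three videos I recorded myself, about 30 minutes in total. However, SyncNet training fails with an error.
The first error was: RuntimeError: Given groups=1, weight of size [64, 48, 3, 3], expected input[4, 15, 128, 256] to have 48 channels, but got 15 channels instead
I then changed the config to visual_encoder: in_channels: 15 # the original value here was 48; I changed it to 15
After that I got the following error: RuntimeError: Calculated padded input size per channel: (3 x 2). Kernel size: (3 x 3). Kernel size can't be greater than actual input size
It says the input feature map is too small. How can I fix this?
The videos are 1080x1080, training uses 256x256, and the dataset is the preprocessed high_resolution directory.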
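For anyone trying to interpret the two messages, here is a small sketch of the arithmetic behind them. It rests on assumptions that are not confirmed by the project: that the visual encoder receives RGB frames stacked along the channel axis (3 channels per frame), and that the second error comes from a feature map being downsampled until it is narrower than a 3x3 kernel. The helper names and the layer list are hypothetical and only illustrate the shape math; they are not the actual SyncNet layers.

```python
def frames_from_channels(channels, channels_per_frame=3):
    """Number of stacked frames implied by a channel count (assumes RGB stacking)."""
    frames, remainder = divmod(channels, channels_per_frame)
    if remainder:
        raise ValueError(f"{channels} is not a multiple of {channels_per_frame}")
    return frames


def conv2d_out(size, kernel, stride=1, padding=0):
    """Output length along one axis of a convolution (dilation 1)."""
    padded = size + 2 * padding
    if kernel > padded:
        # Same condition PyTorch reports as
        # "Kernel size can't be greater than actual input size".
        raise ValueError(f"kernel {kernel} > padded input {padded}")
    return (padded - kernel) // stride + 1


def trace(size, layers):
    """Apply (kernel, stride, padding) layers in turn and return every size."""
    sizes = [size]
    for kernel, stride, padding in layers:
        size = conv2d_out(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    # Under the stacking assumption, the config value and the actual input
    # disagree about how many frames make up one sample.
    print("config expects frames:", frames_from_channels(48))
    print("dataset provides frames:", frames_from_channels(15))

    # Hypothetical layer stack: three stride-2 convs, then an unpadded 3x3 conv.
    layers = [(3, 2, 1), (3, 2, 1), (3, 2, 1), (3, 1, 0)]
    for start in (20, 6):
        try:
            print(start, "->", trace(start, layers))
        except ValueError as err:
            print(start, "-> fails:", err)
```

If that reading is right, changing in_channels to 15 only hides the mismatch: the number of frames per sample still differs from what the rest of the network was sized for, so another dimension that scales with the frame count can shrink below the kernel size. Checking the frame-count setting in the data config against the checkpoint or config being used may be a more productive place to look than the encoder channels.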