On training SyncNet: with DeepSeek's help, training directly on 512-resolution images #92

endofD · 2025-01-17T03:57:34Z

Train directly on 512-resolution images.
scripts/train_syncnet.py uses configs/syncnet/syncnet_16_pixel.yaml, which contains:

  visual_encoder: # input (48, 128, 256)
    in_channels: 48
    block_out_channels: [64, 128, 256, 256, 512, 1024, 2048, 2048]
    downsample_factors: [[1, 2], 2, 2, 2, 2, 2, 2, 2]
    attn_blocks: [0, 0, 0, 0, 0, 0, 0, 0]
    dropout: 0.0

Change it to the following (one extra downsampling stage appended to each list):

  visual_encoder: # input (48, 256, 512)
    in_channels: 48
    block_out_channels: [64, 128, 256, 256, 512, 1024, 2048, 2048, 2048]
    downsample_factors: [[1, 2], 2, 2, 2, 2, 2, 2, 2, 2]
    attn_blocks: [0, 0, 0, 0, 0, 0, 0, 0, 0]
    dropout: 0.0

Also change resolution: 256 to resolution: 512.

Then, during data processing, resize the faces to 512.
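
To see why one extra stage is appended, the spatial size can be traced through the downsample factors. This is only a rough sketch: it assumes each stage divides height and width exactly by its factor, and that the encoder input height is half the resolution (as in the 256 case, where the input is 128 high and 256 wide). The helper name output_size is made up for illustration and is not part of the repository.

```python
def output_size(height, width, downsample_factors):
    """Return (height, width) after applying every stage's downsample factor.

    Assumption: each stage divides the spatial size exactly by its factor.
    A factor is either an int (applied to both axes) or a
    [height_factor, width_factor] pair.
    """
    for factor in downsample_factors:
        fh, fw = (factor, factor) if isinstance(factor, int) else factor
        height, width = height // fh, width // fw
    return height, width


# Original config from the thread: 8 stages, input (48, 128, 256).
original = [[1, 2], 2, 2, 2, 2, 2, 2, 2]
# Modified config from the thread: one extra factor-2 stage appended.
modified = original + [2]

print(output_size(128, 256, original))  # (1, 1)
print(output_size(256, 512, modified))  # (1, 1)
print(output_size(256, 512, original))  # (2, 2): the old stack stops short of 1x1
```

Under these assumptions, the ninth stage brings a 512-resolution input back down to the same 1x1 feature map that the original eight stages produce for 256.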

jishunyu · 2025-01-17T11:30:18Z

Have you managed to train a usable model with this?

endofD · 2025-01-19T03:51:00Z

So don't switch the VAE; GPU memory usage may blow up.
Train directly on 512-resolution images.
scripts/train_syncnet.py uses configs/syncnet/syncnet_16_pixel.yaml, which contains:

  visual_encoder: # input (48, 128, 256)
    in_channels: 48
    block_out_channels: [64, 128, 256, 256, 512, 1024, 2048, 2048]
    downsample_factors: [[1, 2], 2, 2, 2, 2, 2, 2, 2]
    attn_blocks: [0, 0, 0, 0, 0, 0, 0, 0]
    dropout: 0.0

Change it to the following (one extra downsampling stage appended to each list):

  visual_encoder: # input (48, 256, 512)
    in_channels: 48
    block_out_channels: [64, 128, 256, 256, 512, 1024, 2048, 2048, 2048]
    downsample_factors: [[1, 2], 2, 2, 2, 2, 2, 2, 2, 2]
    attn_blocks: [0, 0, 0, 0, 0, 0, 0, 0, 0]
    dropout: 0.0

Also change resolution: 256 to resolution: 512.

Then, during data processing, resize the faces to 512.

... I'll test this later.
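
The three list edits in the config have to stay in step with each other, so it may be less error-prone to derive the 512 config from the 256 one in code. A minimal sketch, assuming the config has already been loaded into a plain dict; add_downsample_stage is a hypothetical helper, not something the repository provides.

```python
import copy


def add_downsample_stage(encoder_cfg, out_channels=2048, factor=2, attn=0):
    """Return a copy of an encoder config with one extra stage appended.

    Hypothetical helper for illustration only. The three per-stage lists
    must keep the same length, so all of them are extended together.
    """
    cfg = copy.deepcopy(encoder_cfg)
    cfg["block_out_channels"].append(out_channels)
    cfg["downsample_factors"].append(factor)
    cfg["attn_blocks"].append(attn)
    lengths = {len(cfg[key]) for key in
               ("block_out_channels", "downsample_factors", "attn_blocks")}
    if len(lengths) != 1:
        raise ValueError("per-stage lists have different lengths")
    return cfg


# Values copied from the 256 config quoted in the thread.
visual_encoder_256 = {
    "in_channels": 48,
    "block_out_channels": [64, 128, 256, 256, 512, 1024, 2048, 2048],
    "downsample_factors": [[1, 2], 2, 2, 2, 2, 2, 2, 2],
    "attn_blocks": [0, 0, 0, 0, 0, 0, 0, 0],
    "dropout": 0.0,
}

visual_encoder_512 = add_downsample_stage(visual_encoder_256)
print(visual_encoder_512["block_out_channels"])
print(visual_encoder_512["downsample_factors"])
```

The resolution field still has to be changed from 256 to 512 separately, as described above.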

endofD · 2025-01-19T03:55:34Z

data_processing_pipeline.sh

The extraction resolution can be changed directly in this script:

python -m preprocess.data_processing_pipeline \
    --total_num_workers 20 \
    --per_gpu_num_workers 10 \
    --resolution 256 \
    --sync_conf_threshold 3 \
    --temp_dir temp \
    --input_dir /mnt/bn/maliva-gen-ai-v2/chunyu.li/VoxCeleb2/raw

Change --resolution 256 to --resolution 512.
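
If the preprocessing is launched from Python rather than by editing the shell script, the resolution can be passed in as a parameter. A minimal sketch that only builds the argument list, using the flags shown in the command above; build_preprocess_command and the /path/to/raw placeholder are made up for illustration.

```python
import shlex


def build_preprocess_command(resolution, input_dir, total_num_workers=20,
                             per_gpu_num_workers=10, sync_conf_threshold=3,
                             temp_dir="temp"):
    """Build the preprocessing command as an argument list.

    Hypothetical helper: the flags mirror the command quoted in the thread,
    and only the resolution and input directory are required arguments.
    """
    return [
        "python", "-m", "preprocess.data_processing_pipeline",
        "--total_num_workers", str(total_num_workers),
        "--per_gpu_num_workers", str(per_gpu_num_workers),
        "--resolution", str(resolution),
        "--sync_conf_threshold", str(sync_conf_threshold),
        "--temp_dir", temp_dir,
        "--input_dir", input_dir,
    ]


# "/path/to/raw" is a placeholder; substitute the real dataset directory.
cmd = build_preprocess_command(512, "/path/to/raw")
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) from the repository root. Passing a list avoids the line-continuation pitfall of a shell script, where a comment placed after a trailing backslash breaks the command.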

endofD · 2025-01-19T03:55:41Z

I'll test it later.

endofD changed the title ~~On training SyncNet: with DeepSeek's help, the steps look OK~~ On training SyncNet: with DeepSeek's help, training directly on 512-resolution images Jan 19, 2025
