run error #4
Comments
Thank you for your attention to our work. You can set the dtype in the config to fp32, and then it should work.
I still get this error after changing the dtype to fp32 in the config file you specified.
I switched the dtype to bf16 and it worked, which makes sense because the model was trained in bf16.
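For anyone else hitting this: the config dump below shows precision as a top-level `dtype` field in `configs/mvdit/inference/16x512x512.py`. A minimal sketch of the change, assuming the config is a plain Python settings file as the `.py` path suggests:

```python
# configs/mvdit/inference/16x512x512.py (excerpt; other fields unchanged)
# bf16 matches the precision the checkpoint was trained in; fp32 also avoids
# the fp16 overflow in the traceback below, at the cost of extra memory.
dtype = "bf16"  # was "fp16"
```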
How did you install apex? The command provided in this repo gives the following error for me:

Would greatly appreciate some help troubleshooting this.
```
torchrun --standalone --nproc_per_node 1 scripts/inference.py --config configs/mvdit/inference/16x512x512.py
/root/miniconda3/lib/python3.10/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/root/miniconda3/lib/python3.10/site-packages/colossalai/pipeline/schedule/_utils.py:19: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _register_pytree_node(OrderedDict, _odict_flatten, _odict_unflatten)
/root/miniconda3/lib/python3.10/site-packages/torch/utils/_pytree.py:254: UserWarning: <class 'collections.OrderedDict'> is already registered as pytree node. Overwriting the previous registration.
  warnings.warn(
[2024-07-08 11:13:11,432] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/root/miniconda3/lib/python3.10/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/mine_workspace/Mirage/repos/diffusers/src/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
Config (path: configs/mvdit/inference/16x512x512.py): {'num_frames': 16, 'fps': 8, 'image_size': (512, 512), 'model': {'type': 'MVDiT-XL/2', 'space_scale': 1.0, 'time_scale': 1.0, 'enable_flashattn': True, 'enable_layernorm_kernel': True, 'from_pretrained': '/mnt_alipayshnas/youtai.ts/checkpoints/OpenVid/MVDiT-16×512×512.pt'}, 'vae': {'type': 'VideoAutoencoderKL', 'from_pretrained': '/mnt_alipayshnas/youtai.ts/checkpoints/sd-vae-ft-ema/stabilityai__sd-vae-ft-ema', 'micro_batch_size': 2}, 'text_encoder': {'type': 't5', 'from_pretrained': '/mnt_alipayshnas/youtai.ts/checkpoints/t5-v1_1-xxl/t5-v1_1-xxl', 'model_max_length': 120}, 'scheduler': {'type': 'iddpm', 'num_sampling_steps': 100, 'cfg_scale': 7.0}, 'dtype': 'fp16', 'batch_size': 2, 'seed': 42, 'prompt_path': './assets/texts/evalcrafter.txt', 'start_idx': 0, 'end_idx': 700, 'save_dir': './outputs/samples/', 'multi_resolution': False}
Loading checkpoint shards: 100%|████████| 2/2 [04:23<00:00, 131.92s/it]
Loading /mnt_alipayshnas/youtai.ts/checkpoints/OpenVid/MVDiT-16×512×512.pt
Missing keys: ['pos_embed', 'pos_embed_temporal']
Unexpected keys: []
/ossfs/workspace/py310/workspace/OpenVid-1M/openvid/models/text_encoder/t5.py:163: MarkupResemblesLocatorWarning: The input looks more like a filename than markup. You may want to open this file and pass the filehandle into Beautiful Soup.
  caption = BeautifulSoup(caption, features="html.parser").text
  0%|          | 0/100 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/ossfs/workspace/py310/workspace/OpenVid-1M/scripts/inference.py", line 107, in <module>
    main()
  File "/ossfs/workspace/py310/workspace/OpenVid-1M/scripts/inference.py", line 87, in main
    samples = scheduler.sample(
  File "/ossfs/workspace/py310/workspace/OpenVid-1M/openvid/schedulers/iddpm/__init__.py", line 77, in sample
    samples = self.p_sample_loop(
  File "/ossfs/workspace/py310/workspace/OpenVid-1M/openvid/schedulers/iddpm/gaussian_diffusion.py", line 437, in p_sample_loop
    for sample in self.p_sample_loop_progressive(
  File "/ossfs/workspace/py310/workspace/OpenVid-1M/openvid/schedulers/iddpm/gaussian_diffusion.py", line 488, in p_sample_loop_progressive
    out = self.p_sample(
  File "/ossfs/workspace/py310/workspace/OpenVid-1M/openvid/schedulers/iddpm/gaussian_diffusion.py", line 391, in p_sample
    out = self.p_mean_variance(
  File "/ossfs/workspace/py310/workspace/OpenVid-1M/openvid/schedulers/iddpm/respace.py", line 94, in p_mean_variance
    return super().p_mean_variance(self._wrap_model(model), *args, **kwargs)
  File "/ossfs/workspace/py310/workspace/OpenVid-1M/openvid/schedulers/iddpm/gaussian_diffusion.py", line 270, in p_mean_variance
    model_output = model(x, t, **model_kwargs)
  File "/ossfs/workspace/py310/workspace/OpenVid-1M/openvid/schedulers/iddpm/respace.py", line 127, in __call__
    return self.model(x, new_ts, **kwargs)
  File "/ossfs/workspace/py310/workspace/OpenVid-1M/openvid/schedulers/iddpm/__init__.py", line 94, in forward_with_cfg
    model_out = model.forward(combined, timestep, y, **kwargs)
  File "/ossfs/workspace/py310/workspace/OpenVid-1M/openvid/models/mvdit/mvdit.py", line 331, in forward
    x, y = auto_grad_checkpoint(block, x, y, t0, t_y, t0_tmep, t_y_tmep, mask, tpe)
  File "/ossfs/workspace/py310/workspace/OpenVid-1M/openvid/acceleration/checkpoint.py", line 24, in auto_grad_checkpoint
    return module(*args, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/ossfs/workspace/py310/workspace/OpenVid-1M/openvid/models/mvdit/mvdit.py", line 162, in forward
    x = x + self.cross_attn(x, y, mask)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/ossfs/workspace/py310/workspace/OpenVid-1M/openvid/models/layers/blocks.py", line 399, in forward
    attn_bias[attn_bias==0] = exp
RuntimeError: value cannot be converted to type at::Half without overflow
```
I get this error when running MVDiT; could you give some advice? When I run STDiT, it successfully generates videos.
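For what it's worth, the final frame of the traceback points at a scalar assignment: `attn_bias` is an fp16 (`at::Half`) tensor under `dtype = 'fp16'`, and `exp` is evidently a value outside fp16's representable range (±65504). Below is a minimal sketch of that failure mode and a dtype-safe alternative, assuming `exp` is a large negative mask-fill constant; its actual definition in `blocks.py` is not shown in this thread.

```python
import torch

# fp16 attention-bias tensor, as produced under dtype = "fp16"
attn_bias = torch.zeros(4, 4, dtype=torch.half)

# Hypothetical mask-fill value; anything beyond fp16's max (~65504) overflows.
exp = torch.finfo(torch.float32).min  # ~ -3.4e38

try:
    attn_bias[attn_bias == 0] = exp
except RuntimeError as e:
    print(e)  # value cannot be converted to type at::Half without overflow

# Dtype-safe alternative: derive the fill value from the tensor's own dtype,
# so fp16, bf16, and fp32 all get a representable "minus infinity" stand-in.
attn_bias[attn_bias == 0] = torch.finfo(attn_bias.dtype).min
```

That would also explain the workarounds above: bf16 and fp32 can both represent values around -3.4e38, while fp16 cannot.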