
How can DiffBIR be run offline in a conda virtual environment on Ubuntu 20.04? #142

Open

xyhweinin opened this issue Dec 4, 2024 · 6 comments

@xyhweinin

Hello! I deployed DiffBIR v2.1 in a conda virtual environment on Ubuntu 20.04. Since the server has no external network access, running the following command fails:
python -u inference.py --task denoise --upscale 1 --version v2.1 --captioner llava --cfg_scale 8 --noise_aug 0 --input inputs/demo/bid --output results/v2.1_demo_bid

The error is:
File "/root/anaconda3/envs/diffbir/lib/python3.10/site-packages/transformers/configuration_utils.py", line 689, in _get_config_dict
resolved_config_file = cached_file(
File "/root/anaconda3/envs/diffbir/lib/python3.10/site-packages/transformers/utils/hub.py", line 451, in cached_file
raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like liuhaotian/llava-v1.5-7b is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.

How can DiffBIR be run fully offline, and which files need to be downloaded in advance? Is there any documentation describing this?

Looking forward to your reply, thank you.

@0x3f3f3f3fun
Collaborator

Hello. Model loading should work via the steps below, though I haven't tested this myself:

  1. Download the files from Hugging Face to a local directory.
  2. Change the model loading path to that local directory.

These files are needed:
(screenshot: list of required Hugging Face model files)
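The two steps above can be sketched as a quick sanity check before launching inference. This is a minimal sketch of my own: `weights/llava-v1.5-7b` is a hypothetical local directory, and the check mirrors the `config.json` requirement stated verbatim in the error message above.

```python
from pathlib import Path

def looks_like_local_hf_model(path: str) -> bool:
    """True if `path` is a directory containing the config.json that
    transformers' from_pretrained() needs for a purely local load."""
    p = Path(path)
    return p.is_dir() and (p / "config.json").is_file()

# Hypothetical local directory holding the downloaded Hugging Face files.
ok = looks_like_local_hf_model("weights/llava-v1.5-7b")  # False until downloaded
```

If this returns False, `from_pretrained()` will fall back to contacting huggingface.co, which is exactly the failure seen here.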

@xyhweinin
Author

Thank you for your reply. Following your suggestion, I downloaded the relevant files from the Hugging Face links you provided to a local directory, then changed the model loading path to that directory, e.g. model_path = "/root/VAS/DiffBIR/DiffBIR-2.1.0/weights/llava-v1.5-7b", with all downloaded model files placed under it.
Then I ran the following command:
python -u inference.py --task denoise --upscale 1 --version v2.1 --captioner llava --cfg_scale 8 --noise_aug 0 --input inputs/demo/bid --output results/v2.1_demo_bid

This produces quite a few errors. Is there a good way to fix them? Looking forward to your reply, thank you.

The full error output is:
load controlnet weight
You are using a model of type llava to instantiate a model of type llava_llama. This is not supported for all configurations of models and can yield errors.
Traceback (most recent call last):
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/urllib3/connection.py", line 199, in _new_conn
sock = connection.create_connection(
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
raise err
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
OSError: [Errno 101] Network is unreachable

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/urllib3/connectionpool.py", line 789, in urlopen
response = self._make_request(
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/urllib3/connectionpool.py", line 490, in _make_request
raise new_e
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/urllib3/connectionpool.py", line 466, in _make_request
self._validate_conn(conn)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1095, in _validate_conn
conn.connect()
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/urllib3/connection.py", line 693, in connect
self.sock = sock = self._new_conn()
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/urllib3/connection.py", line 214, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7fe044497f10>: Failed to establish a new connection: [Errno 101] Network is unreachable

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/requests/adapters.py", line 667, in send
resp = conn.urlopen(
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/urllib3/connectionpool.py", line 843, in urlopen
retries = retries.increment(
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/urllib3/util/retry.py", line 519, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /openai/clip-vit-large-patch14-336/resolve/main/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fe044497f10>: Failed to establish a new connection: [Errno 101] Network is unreachable'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1376, in _get_metadata_or_catch_error
metadata = get_hf_file_metadata(
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1296, in get_hf_file_metadata
r = _request_wrapper(
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 277, in _request_wrapper
response = _request_wrapper(
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 300, in _request_wrapper
response = get_session().request(method=method, url=url, **params)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 93, in send
return super().send(request, *args, **kwargs)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/requests/adapters.py", line 700, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /openai/clip-vit-large-patch14-336/resolve/main/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fe044497f10>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), '(Request ID: 13604a27-c537-4bff-8b50-3e10259786c6)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/transformers/utils/hub.py", line 385, in cached_file
resolved_file = hf_hub_download(
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 862, in hf_hub_download
return _hf_hub_download_to_cache_dir(
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 969, in _hf_hub_download_to_cache_dir
_raise_on_head_call_error(head_call_error, force_download, local_files_only)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1487, in _raise_on_head_call_error
raise LocalEntryNotFoundError(
huggingface_hub.errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/inference.py", line 292, in <module>
main()
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/inference.py", line 287, in main
loops[args.task](args).run()
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/diffbir/inference/loop.py", line 43, in __init__
self.load_captioner()
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/diffbir/inference/loop.py", line 124, in load_captioner
self.captioner = LLaVACaptioner(self.args.device, self.args.llava_bit)
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/diffbir/utils/caption.py", line 74, in __init__
load_pretrained_model(
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/llava/model/builder.py", line 117, in load_pretrained_model
model = LlavaLlamaForCausalLM.from_pretrained(
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3594, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/llava/model/language_model/llava_llama.py", line 46, in __init__
self.model = LlavaLlamaModel(config)
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/llava/model/language_model/llava_llama.py", line 38, in __init__
super(LlavaLlamaModel, self).__init__(config)
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/llava/model/llava_arch.py", line 35, in __init__
self.vision_tower = build_vision_tower(config, delay_load=True)
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/llava/model/multimodal_encoder/builder.py", line 13, in build_vision_tower
return CLIPVisionTower(vision_tower, args=vision_tower_cfg, **kwargs)
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/llava/model/multimodal_encoder/clip_encoder.py", line 22, in __init__
self.cfg_only = CLIPVisionConfig.from_pretrained(self.vision_tower_name)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/transformers/models/clip/configuration_clip.py", line 251, in from_pretrained
config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/transformers/configuration_utils.py", line 634, in get_config_dict
config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/transformers/configuration_utils.py", line 689, in _get_config_dict
resolved_config_file = cached_file(
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/transformers/utils/hub.py", line 425, in cached_file
raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like openai/clip-vit-large-patch14-336 is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
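The traceback shows that even with the LLaVA weights stored locally, loading still tries to fetch `openai/clip-vit-large-patch14-336`: the vision tower name that `clip_encoder.py` passes to `CLIPVisionConfig.from_pretrained()` is read from LLaVA's own config.json, where it is stored as a Hub repo id. A possible workaround (my own sketch, not tested against this repo) is to download the CLIP files locally too and rewrite that entry; the key name `mm_vision_tower` and the paths below are assumptions based on the standard LLaVA config layout.

```python
import json
from pathlib import Path

def point_vision_tower_at_local(config_path: str, local_clip_dir: str) -> dict:
    """Rewrite the mm_vision_tower entry in a LLaVA config.json so it points
    at a local CLIP snapshot instead of the 'openai/...' Hub repo id."""
    cfg_file = Path(config_path)
    cfg = json.loads(cfg_file.read_text())
    if str(cfg.get("mm_vision_tower", "")).startswith("openai/"):
        cfg["mm_vision_tower"] = local_clip_dir  # e.g. a snapshot downloaded beforehand
        cfg_file.write_text(json.dumps(cfg, indent=2))
    return cfg

# Hypothetical usage:
# point_vision_tower_at_local(
#     "weights/llava-v1.5-7b/config.json",
#     "weights/clip-vit-large-patch14-336",
# )
```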

@xyhweinin
Author

A few additional points:
1. Using the same setup as above (models downloaded locally, loading paths changed), the following v2.0 command:
python -u inference.py --task denoise --upscale 1 --version v2 --sampler spaced --steps 50 --captioner none --pos_prompt '' --neg_prompt 'low quality, blurry, low-resolution, noisy, unsharp, weird textures' --cfg_scale 4.0 --input inputs/demo/bid --output results/v2_demo_bid --device cuda --precision fp32

runs normally.
2. Running the same v2.0 command, but with the images under /root/VAS/DiffBIR/DiffBIR-2.1.0/inputs/demo/bid replaced by full-HD (1920*1080) images, fails with an out-of-GPU-memory error. The server's GPU is a GeForce RTX 3080 with 10 GB of VRAM. Full HD is probably the most common resolution in practice; can this out-of-memory problem be solved by adjusting parameters?
The error is:
load lq: inputs/demo/bid/image-00001.png
Traceback (most recent call last):
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/inference.py", line 292, in <module>
main()
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/inference.py", line 287, in main
loops[args.task](args).run()
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/diffbir/inference/loop.py", line 181, in run
batch_samples = self.pipeline.run(
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/diffbir/pipeline.py", line 274, in run
cond_img = self.apply_cleaner(
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/diffbir/pipeline.py", line 417, in apply_cleaner
output = model(lq)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/diffbir/model/scunet.py", line 232, in forward
x2 = self.m_down1(x1)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/torch/nn/modules/container.py", line 217, in forward
input = module(input)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/diffbir/model/scunet.py", line 155, in forward
trans_x = self.trans_block(trans_x)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/diffbir/model/scunet.py", line 119, in forward
x = x + self.drop_path(self.msa(self.ln1(x)))
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/diffbir/model/scunet.py", line 71, in forward
sim = torch.einsum('hbwpc,hbwqc->hbwpq', q, k) * self.scale
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 510.00 MiB. GPU 0 has a total capacity of 9.77 GiB of which 60.81 MiB is free. Including non-PyTorch memory, this process has 9.70 GiB memory in use. Of the allocated memory 9.36 GiB is allocated by PyTorch, and 82.12 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

Looking forward to your reply, thank you very much.
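The einsum that fails ('hbwpc,hbwqc->hbwpq' in scunet.py) builds a per-window attention score tensor whose size grows with the square of the tokens per window and linearly with the image area, which is why 1920x1080 inputs overflow 10 GB where the demo images fit. A rough back-of-envelope estimate (my own sketch; the window size of 8, head count of 6, and fp32 element size are illustrative assumptions, not values read from the SCUNet code):

```python
def attn_scores_mib(height, width, window=8, heads=6, dtype_bytes=4):
    """Rough size in MiB of one windowed-attention score tensor
    ('hbwpc,hbwqc->hbwpq') for a height x width feature map."""
    tokens = window * window                               # p == q axis length
    n_windows = (height // window) * (width // window)     # windows tile the map
    return heads * n_windows * tokens**2 * dtype_bytes / 2**20
```

Under these assumptions a single score tensor at 1920x1080 is already around 3 GiB, versus a few hundred MiB at 512x512, so tiling the cleaner input (or downscaling) is the natural lever.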

@0x3f3f3f3fun
Collaborator

Hello.

  1. Network-related errors
    With no access to Hugging Face, downloading LLaVA is somewhat troublesome. You can try --captioner none or --captioner ram instead. If you choose ram, you need to download the file referenced by this line of code and replace the path with your local copy.

  2. Out-of-memory
    Use tiled sampling, and also set --precision fp16 (which is in fact the default).
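Put together, the suggestion above corresponds to a command along these lines. The tiled flags are taken from inference.py's options as used elsewhere in this thread; the tile sizes and strides are illustrative starting points, not tuned values:

```shell
python -u inference.py --task denoise --upscale 1 --version v2 \
  --captioner none --precision fp16 \
  --cleaner_tiled --cleaner_tile_size 256 --cleaner_tile_stride 128 \
  --vae_encoder_tiled --vae_encoder_tile_size 256 \
  --vae_decoder_tiled --vae_decoder_tile_size 256 \
  --cldm_tiled --cldm_tile_size 512 --cldm_tile_stride 256 \
  --input inputs/demo/bid --output results/v2_demo_bid
```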

@xyhweinin
Author

2. Out-of-memory
Use tiled sampling, and also set --precision fp16 (which is in fact the default).

---- I tested this approach with each of the following commands:

  1. python -u inference.py --task denoise --upscale 1 --version v2 --sampler spaced --steps 50 --captioner none --pos_prompt '' --neg_prompt 'low quality, blurry, low-resolution, noisy, unsharp, weird textures' --cfg_scale 4.0 --input inputs/demo/bid --output results/v2_demo_bid --device cuda --precision fp16 --vae_encoder_tiled --vae_encoder_tile_size 256 --vae_decoder_tiled --vae_decoder_tile_size 256

  2. python -u inference.py --task denoise --upscale 1 --version v2 --sampler spaced --steps 50 --captioner none --pos_prompt '' --neg_prompt 'low quality, blurry, low-resolution, noisy, unsharp, weird textures' --cfg_scale 4.0 --input inputs/demo/bid --output results/v2_demo_bid --device cuda --precision fp16 --vae_decoder_tiled --vae_decoder_tile_size 256

  3. python -u inference.py --task denoise --upscale 1 --version v2 --sampler spaced --steps 50 --captioner none --pos_prompt '' --neg_prompt 'low quality, blurry, low-resolution, noisy, unsharp, weird textures' --cfg_scale 4.0 --input inputs/demo/bid --output results/v2_demo_bid --device cuda --precision fp16 --cldm_tiled --cldm_tile_size 512 --cldm_tile_stride 256

  4. python -u inference.py --task denoise --upscale 1 --version v2 --sampler spaced --steps 50 --captioner none --pos_prompt '' --neg_prompt 'low quality, blurry, low-resolution, noisy, unsharp, weird textures' --cfg_scale 4.0 --input inputs/demo/bid --output results/v2_demo_bid --device cuda --precision fp16 --cleaner_tiled --cleaner_tile_size 256 --cleaner_tile_stride 128 --vae_encoder_tiled --vae_encoder_tile_size 256 --vae_decoder_tiled --vae_decoder_tile_size 256 --cldm_tiled --cldm_tile_size 512 --cldm_tile_stride 256

All four commands run to completion, only at different speeds. However, the generated images all share the same problem: blocks of the image content end up swapped left-to-right. For details, see the two images in the attachment: "image-00001-orig.png" (the original) and "image-00001-inferenced.png" (the processed result).

v2_demo_bid.zip

What causes this, and how can it be fixed? Looking forward to your reply, thank you.

@xyhweinin
Author

1. Network-related errors
With no access to Hugging Face, downloading LLaVA is somewhat troublesome. You can try --captioner none or --captioner ram instead. If you choose ram, you need to download the file referenced by this line of code and replace the path with your local copy.

--- Tried --captioner none:
python -u inference.py --task denoise --upscale 1 --version v2.1 --captioner none --cfg_scale 8 --noise_aug 0 --input inputs/demo/bid --output results/v2.1_demo_bid
This works for the sample images shipped with the DiffBIR project. With my full-HD images, however, it runs out of GPU memory again, and when I work around that with the tiled sampling approach above, the output images have the same content-corruption problem.

--- Tried --captioner ram as suggested (downloaded the model file and changed the code to point to the local copy):
python -u inference.py --task denoise --upscale 1 --version v2.1 --captioner ram --cfg_scale 8 --noise_aug 0 --input inputs/demo/bid --output results/v2.1_demo_bid
The error is:
load controlnet weight
Traceback (most recent call last):
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/inference.py", line 292, in <module>
main()
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/inference.py", line 287, in main
loops[args.task](args).run()
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/diffbir/inference/loop.py", line 43, in __init__
self.load_captioner()
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/diffbir/inference/loop.py", line 127, in load_captioner
self.captioner = RAMCaptioner(self.args.device)
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/diffbir/utils/caption.py", line 162, in __init__
model = ram_plus(pretrained=pretrained, image_size=image_size, vit="swin_l")
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/ram/models/ram_plus.py", line 403, in ram_plus
model = RAM_plus(**kwargs)
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/ram/models/ram_plus.py", line 137, in __init__
self.tokenizer = init_tokenizer(text_encoder_type)
File "/root/VAS/DiffBIR/DiffBIR-2.1.0/ram/models/utils.py", line 131, in init_tokenizer
tokenizer = BertTokenizer.from_pretrained(text_encoder_type)
File "/root/anaconda3/envs/diffbir-21/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2013, in from_pretrained
raise EnvironmentError(
OSError: Can't load tokenizer for 'bert-base-uncased'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'bert-base-uncased' is the correct path to a directory containing all relevant files for a BertTokenizer tokenizer.
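Per the traceback, init_tokenizer() in ram/models/utils.py calls BertTokenizer.from_pretrained('bert-base-uncased'), so the fix is the same pattern as before: download the bert-base-uncased tokenizer files and pass a local directory instead of the repo id. A small helper of my own (the `weights/` layout is an assumption) that prefers a local snapshot when one exists:

```python
import os

def resolve_pretrained(name_or_path: str, local_root: str = "weights") -> str:
    """If a local snapshot exists (e.g. weights/bert-base-uncased), return its
    path so from_pretrained() never touches the network; else return the id."""
    candidate = os.path.join(local_root, name_or_path.split("/")[-1])
    return candidate if os.path.isdir(candidate) else name_or_path

# Hypothetical usage inside ram/models/utils.py:
# tokenizer = BertTokenizer.from_pretrained(resolve_pretrained(text_encoder_type))
```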

Looking forward to your reply, thank you.
