
[Issue]: Inpaint with FluxFillPipeline fails with "Input is in incorrect format." #3699

Open
troex opened this issue Jan 13, 2025 · 8 comments
Labels: cannot reproduce (Reported issue cannot be easily reproducible), question (Further information is requested)

Comments

troex commented Jan 13, 2025

Issue Description

Trying to use inpaint or img2img with FluxFillPipeline, but it fails with an error.

I simply upload an image into inpaint, mask an area, add a prompt, and try to generate.

Version Platform Description

23:15:25-918551 INFO     Starting SD.Next
23:15:25-921399 INFO     Logger: file="/data/sd-next/sdnext.log" level=INFO size=970478 mode=append
23:15:25-922322 INFO     Python: version=3.11.2 platform=Linux bin="/data/sd-next/venv/bin/python3"
                         venv="/data/sd-next/venv"
23:15:25-979704 INFO     Version: app=sd.next updated=2025-01-02 hash=3f360891 branch=master
                         url=https://github.com/vladmandic/automatic/tree/master ui=main
23:15:26-175496 INFO     Platform: arch=x86_64 cpu= system=Linux release=6.1.0-28-cloud-amd64 python=3.11.2 docker=False
23:15:26-177061 INFO     Args: []
23:15:26-187936 INFO     CUDA: nVidia toolkit detected

Relevant log output

00:19:09-209237 INFO     Base: pipeline=FluxFillPipeline task=TEXT_2_IMAGE batch=1/1x4 set={'guidance_scale': 6,
                         'generator': 'cuda:[750230034, 750230035, 750230036, 750230037]', 'num_inference_steps': 20,
                         'output_type': 'latent', 'width': 1264, 'height': 1688, 'parser': 'native', 'prompt': 'embeds'}
00:19:09-219323 ERROR    Processing: args={'prompt_embeds': 'cuda:0:torch.bfloat16:torch.Size([4, 512, 4096])',
                         'pooled_prompt_embeds': 'cuda:0:torch.bfloat16:torch.Size([4, 768])', 'guidance_scale': 6,
                         'generator': [<torch._C.Generator object at 0x7f6d5f1d1230>, <torch._C.Generator object at
                         0x7f6d5f1d2750>, <torch._C.Generator object at 0x7f6d5f1d0ff0>, <torch._C.Generator object at
                         0x7f6d5f173570>], 'callback_on_step_end': <function diffusers_callback at 0x7f6e297a60c0>,
                         'callback_on_step_end_tensor_inputs': ['latents'], 'num_inference_steps': 20, 'output_type':
                         'latent', 'width': 1264, 'height': 1688} Input is in incorrect format. Currently, we only support
                         <class 'PIL.Image.Image'>, <class 'numpy.ndarray'>, <class 'torch.Tensor'>
00:19:09-786059 INFO     Processed: images=0 its=0.00 time=0.64 timers={'gc': 0.56, 'process': 0.29, 'post': 0.28, 'init':
                         0.05} memory={'ram': {'used': 5.18, 'total': 30.91}, 'gpu': {'used': 31.97, 'total': 44.52},
                         'retries': 1, 'oom': 1}
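Note that the failing call above was dispatched as task=TEXT_2_IMAGE, so no `image`/`mask_image` arguments ever reached the pipeline, and diffusers' image preprocessing then rejects the missing input. The following is an illustrative pure-Python mimic of that kind of type check (the real check lives inside diffusers' image preprocessing and its exact code differs); it shows how passing `None` produces exactly this message:

```python
# Illustrative mimic of the diffusers input type check; not the actual code.
SUPPORTED = ("PIL.Image.Image", "numpy.ndarray", "torch.Tensor")

def check_image(image):
    # Build the fully qualified type name of the input and compare it
    # against the accepted image types.
    name = type(image).__module__ + "." + type(image).__qualname__
    if name not in SUPPORTED:
        raise ValueError(
            "Input is in incorrect format. Currently, we only support "
            + ", ".join(SUPPORTED)
        )
    return image

# Passing None (i.e. no image reached the pipeline) trips the check:
try:
    check_image(None)
except ValueError as e:
    print(e)
```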

Backend: Diffusers
UI: Standard
Branch: Master
Model: FLUX.1
Acknowledgements

  • I have read the above and searched for existing issues
  • I confirm that this is classified correctly and it's not an extension issue
@vladmandic vladmandic added backlog Valid issue but requires non-trivial work and is placed in backlog and removed backlog Valid issue but requires non-trivial work and is placed in backlog labels Jan 13, 2025
@vladmandic (Owner)

I can't reproduce - how are you loading the flux fill model?

[screenshot]

@vladmandic vladmandic added question Further information is requested cannot reproduce Reported issue cannot be easily reproducible labels Jan 14, 2025
SAC020 commented Jan 15, 2025

I have seen the same error trying to use flux fill.

I selected the Flux fill model from the model drop-down list (I know it's not selected in the screenshot right now, just illustrating).

[screenshot]

vladmandic (Owner) commented Jan 15, 2025

You cannot select flux fill from the dropdown as it's NOT a base model; you need to select it from scripts so it's properly initialized - that's why I've posted my screenshot!

SAC020 commented Jan 15, 2025

> You cannot select flux fill from the dropdown as it's NOT a base model; you need to select it from scripts so it's properly initialized - that's why I've posted my screenshot!

I assumed you would say that (and it makes sense), so I tested it, and it runs out of VRAM (Nvidia 4090).

sdnext.log

@vladmandic (Owner)

Yes, it's a horribly inefficient model - but it works fine if you enable a) balanced offload, b) on-the-fly quantization.
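For readers hitting the same VRAM wall, the gist of this advice at the diffusers level is to avoid keeping the whole model resident on one GPU. The sketch below is a hedged illustration, not the SD.Next code path (SD.Next applies its balanced-offload and on-the-fly quantization settings internally when the Fill tool is selected); it uses the stock diffusers CPU-offload helper:

```python
# Hedged sketch: loading FLUX.1-Fill-dev via diffusers with CPU offload so
# the very large model does not need to sit entirely on one GPU. This is an
# illustration only; SD.Next handles offload/quantization itself.

def load_fill_pipeline(model_id: str = "black-forest-labs/FLUX.1-Fill-dev"):
    # Imports kept inside the function so the sketch can be read/imported
    # even where diffusers is not installed.
    import torch
    from diffusers import FluxFillPipeline

    pipe = FluxFillPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    # Move each sub-module to the GPU only while it runs, then back to CPU.
    pipe.enable_model_cpu_offload()
    return pipe
```

On-the-fly quantization (e.g. NF4 via bitsandbytes) would further shrink the transformer's footprint, at some quality/speed cost.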

troex (Author) commented Jan 15, 2025

@vladmandic Not sure what I'm doing wrong, but now I'm getting a different error.

Do I understand correctly that I select Flux.1 Dev as the base model, enable Flux Tools, and select Fill?

[screenshot]

23:13:11-147012 INFO     Load model: select="Diffusers/black-forest-labs/FLUX.1-dev [0ef5fff789]"
Diffusers  2.85it/s █████████████ 100% 2/2 00:00 00:00 Loading checkpoint shards
Diffusers  2.17it/s ████████ 100% 7/7 00:03 00:00 Loading pipeline components...
23:13:14-826795 WARNING  Setting model: offload=none type=FluxPipeline large model
23:14:47-128174 INFO     Model compile: pipeline=FluxPipeline mode=default backend=stable-fast fullgraph=True
                         compile=['Model', 'VAE']
23:14:47-132967 INFO     Model compile: task=stablefast 'FluxPipeline' object has no attribute 'unet'
23:14:47-431012 INFO     Load model: time=total=95.98 move=92.30 load=3.62 options=0.05 native=1024 memory={'ram': {'used':
                         5.54, 'total': 30.91}, 'gpu': {'used': 31.97, 'total': 44.42}, 'retries': 0, 'oom': 0}
23:18:30-092598 INFO     Flux Tools: tool=Fill init
23:18:30-097130 INFO     HF search: model="black-forest-labs/FLUX.1-Fill-dev" results=['black-forest-labs/FLUX.1-Fill-dev']
23:18:30-097989 INFO     Load model: select="Diffusers/black-forest-labs/FLUX.1-Fill-dev [03216e1d67]"
Diffusers  2.80it/s █████████████ 100% 2/2 00:00 00:00 Loading checkpoint shards
Diffusers  1.72it/s ████████ 100% 7/7 00:04 00:00 Loading pipeline components...
23:19:33-608914 WARNING  Setting model: offload=none type=FluxFillPipeline large model
23:21:03-661922 INFO     Model compile: pipeline=FluxFillPipeline mode=default backend=stable-fast fullgraph=True compile=['Model', 'VAE']
23:21:03-665662 INFO     Model compile: task=stablefast 'FluxFillPipeline' object has no attribute 'unet'
23:21:03-964757 INFO     Load model: time=total=94.63 move=90.05 load=4.52 options=0.06 native=1024 memory={'ram': {'used': 6.57, 'total': 30.91},
                         'gpu': {'used': 31.97, 'total': 44.42}, 'retries': 0, 'oom': 0}
23:21:05-953959 WARNING  Sampler: sampler="DPM++ 2M SDE" does not accept sigmas
23:21:06-418244 INFO     Base: pipeline=FluxFillPipeline task=TEXT_2_IMAGE batch=1/1x1 set={'guidance_scale': 6, 'generator': 'cuda:[4108511075]',
                         'num_inference_steps': 20, 'output_type': 'latent', 'width': 896, 'height': 1152, 'image': <PIL.Image.Image image mode=RGB
                         size=896x1152 at 0x7FB1B7C9C390>, 'mask_image': <PIL.Image.Image image mode=L size=896x1152 at 0x7FB1A63D6A90>, 'parser':
                         'native', 'prompt': 'embeds'}
Progress ?it/s                                              0% 0/20 00:00 ? Base
23:21:07-725633 ERROR    Exception: cannot access local variable 'attn_output' where it is not associated with a value
23:21:07-726977 ERROR    Arguments: args=('task(ofi3808ambztdcr)', '', 1.0, 'red sport car', '', [], None, None, {'image': <PIL.Image.Image image
                         mode=RGBA size=896x1152 at 0x7FB1B7A12E50>, 'mask': <PIL.Image.Image image mode=RGB size=896x1152 at 0x7FB1A7F1C510>}, None,
                         None, None, None, 20, 4, 4, 1, 1, True, False, False, False, 1, 1, 6, 6, 0.7, 0, 0.5, 1, 0, 1, 0.4, -1.0, -1.0, 0, 0, 0, 0,
                         0, 0, 1, 1, 'None', 'Add with forward', 0, 32, 0, None, '', '', '', 0, 0, 0, 0, False, 4, 0.95, False, 0.6, 1, '#000000', 0,
                         False, None, 0.3, 1, 'Add with forward', 'None', False, 20, 1, 0, 0, 20, 0, '', '', [], 5, 1, False, 'None', 'None', 'None',
                         'None', 0.5, 0.5, 0.5, 0.5, None, None, None, None, False, False, False, False, 0, 0, 0, 0, 1, 1, 1, 1, None, None, None,
                         None, False, '', False, 0, '', [], 0, '', [], 0, '', [], False, True, False, True, False, False, False, False, 0, False,
                         'None', 2, True, 1, 0, 0, '', '', 0.5, '', 0.5, 5, None, '', 0.5, 5, None, True, 1, False, 'None', None, 'None', [], 'FaceID
                         Base', True, True, 1, 1, 1, 0.5, True, 'person', 1, 0.5, True, 'Fill', 0, True, True, 2, True, 1, 35, True, 1, 0.75, True,
                         2, 0.75, False, 3, 0.75, False, 4, 0.75, 0.65, True, False, 1, 1, 1, 0, 1, False, False, False, None, 0.1, 1, True, '', 0.5,
                         0.9, '', 0.5, 0.9, 4, 0.5, 'Linear', 'None', True, None, 1, 0, 0, 0, 0, 0, 0, 0, 1, 'OpenGVLab/InternVL-14B-224px',
                         '<span>&nbsp Outpainting</span><br>', 128, 8, ['left', 'right', 'up', 'down'], 1, 0.05, 128, 4, 0, ['left', 'right', 'up',
                         'down'], False, 0.7, 1.2, 128, False, False, 'positive', 'comma', 0, False, False, '', [], 0.8, 20, 'dpmpp_sde', 'v2',
                         False, True, 'v1.1', '<span>&nbsp SD Upscale</span><br>', 64, 0, 2, '7,8,9', 1, 0.01, 0.2, None, '', False, ['attention',
                         'adain_queries', 'adain_keys'], 1, 0, 0, 'THUDM/CogVideoX-2b', 'DDIM', 49, 6, 'balanced', True, 'None', 8, True, 1, 0, None,
                         None, '0.9.1', '', 'diffusers', True, 41, 'None', 2, True, 1, 0, False, 0.03, 'SVD 1.0', 14, True, 1, 3, 6, 0.5, 0.1,
                         'None', 2, True, 1, 0, 'None', 16, 'None', 2, True, 1, 0, 'none', 3, 4, 0.25, 0.25, 0.5, 0.5, 0, '', [], 0, '', [], 0, '',
                         [], False, True, False, True, False, False, False, False, 0, False, 'None', 2, True, 1, 0) kwargs={}
23:21:07-733227 ERROR    gradio call: UnboundLocalError
╭──────────────────────────────────────────────────────── Traceback (most recent call last) ────────────────────────────────────────────────────────╮
│ /data/sd-next/modules/call_queue.py:31 in f                                                                                                       │
│                                                                                                                                                   │
│   30 │   │   │   try:                                                                                                                             │
│ ❱ 31 │   │   │   │   res = func(*args, **kwargs)                                                                                                  │
│   32 │   │   │   │   progress.record_results(id_task, res)                                                                                        │
│                                                                                                                                                   │
│ /data/sd-next/modules/img2img.py:301 in img2img                                                                                                   │
│                                                                                                                                                   │
│   300 │   │   if processed is None:                                                                                                               │
│ ❱ 301 │   │   │   processed = processing.process_images(p)                                                                                        │
│   302 │   │   processed = modules.scripts.scripts_img2img.after(p, processed, *args)                                                              │
│                                                                                                                                                   │
│ /data/sd-next/modules/processing.py:210 in process_images                                                                                         │
│                                                                                                                                                   │
│   209 │   │   │   with context_hypertile_vae(p), context_hypertile_unet(p):                                                                       │
│ ❱ 210 │   │   │   │   processed = process_images_inner(p)                                                                                         │
│   211                                                                                                                                             │
│                                                                                                                                                   │
│ /data/sd-next/modules/processing.py:337 in process_images_inner                                                                                   │
│                                                                                                                                                   │
│   336 │   │   │   │   │   from modules.processing_diffusers import process_diffusers                                                              │
│ ❱ 337 │   │   │   │   │   samples = process_diffusers(p)                                                                                          │
│   338 │   │   │   │   else:                                                                                                                       │
│                                                                                                                                                   │
│ /data/sd-next/modules/processing_diffusers.py:447 in process_diffusers                                                                            │
│                                                                                                                                                   │
│   446 │   if 'base' not in p.skip:                                                                                                                │
│ ❱ 447 │   │   output = process_base(p)                                                                                                            │
│   448 │   else:                                                                                                                                   │
│                                                                                                                                                   │
│ /data/sd-next/modules/processing_diffusers.py:99 in process_base                                                                                  │
│                                                                                                                                                   │
│    98 │   │   else:                                                                                                                               │
│ ❱  99 │   │   │   output = shared.sd_model(**base_args)                                                                                           │
│   100 │   │   if isinstance(output, dict):                                                                                                        │
│                                                                                                                                                   │
│ /data/sd-next/venv/lib/python3.11/site-packages/torch/utils/_contextlib.py:116 in decorate_context                                                │
│                                                                                                                                                   │
│   115 │   │   with ctx_factory():                                                                                                                 │
│ ❱ 116 │   │   │   return func(*args, **kwargs)                                                                                                    │
│   117                                                                                                                                             │
│                                                                                                                                                   │
│ /data/sd-next/venv/lib/python3.11/site-packages/diffusers/pipelines/flux/pipeline_flux_fill.py:916 in __call__                                    │
│                                                                                                                                                   │
│   915 │   │   │   │                                                                                                                               │
│ ❱ 916 │   │   │   │   noise_pred = self.transformer(                                                                                              │
│   917 │   │   │   │   │   hidden_states=torch.cat((latents, masked_image_latents), dim=2),                                                        │
│                                                                                                                                                   │
│ /data/sd-next/venv/lib/python3.11/site-packages/torch/nn/modules/module.py:1736 in _wrapped_call_impl                                             │
│                                                                                                                                                   │
│   1735 │   │   else:                                                                                                                              │
│ ❱ 1736 │   │   │   return self._call_impl(*args, **kwargs)                                                                                        │
│   1737                                                                                                                                            │
│                                                                                                                                                   │
│ /data/sd-next/venv/lib/python3.11/site-packages/torch/nn/modules/module.py:1747 in _call_impl                                                     │
│                                                                                                                                                   │
│   1746 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                                                    │
│ ❱ 1747 │   │   │   return forward_call(*args, **kwargs)                                                                                           │
│   1748                                                                                                                                            │
│                                                                                                                                                   │
│ /data/sd-next/venv/lib/python3.11/site-packages/diffusers/models/transformers/transformer_flux.py:522 in forward                                  │
│                                                                                                                                                   │
│   521 │   │   │   else:                                                                                                                           │
│ ❱ 522 │   │   │   │   encoder_hidden_states, hidden_states = block(                                                                               │
│   523 │   │   │   │   │   hidden_states=hidden_states,                                                                                            │
│                                                                                                                                                   │
│ /data/sd-next/venv/lib/python3.11/site-packages/torch/nn/modules/module.py:1736 in _wrapped_call_impl                                             │
│                                                                                                                                                   │
│   1735 │   │   else:                                                                                                                              │
│ ❱ 1736 │   │   │   return self._call_impl(*args, **kwargs)                                                                                        │
│   1737                                                                                                                                            │
│                                                                                                                                                   │
│ /data/sd-next/venv/lib/python3.11/site-packages/torch/nn/modules/module.py:1747 in _call_impl                                                     │
│                                                                                                                                                   │
│   1746 │   │   │   │   or _global_forward_hooks or _global_forward_pre_hooks):                                                                    │
│ ❱ 1747 │   │   │   return forward_call(*args, **kwargs)                                                                                           │
│   1748                                                                                                                                            │
│                                                                                                                                                   │
│ /data/sd-next/venv/lib/python3.11/site-packages/diffusers/models/transformers/transformer_flux.py:193 in forward                                  │
│                                                                                                                                                   │
│   192 │   │   # Process attention outputs for the `hidden_states`.                                                                                │
│ ❱ 193 │   │   attn_output = gate_msa.unsqueeze(1) * attn_output                                                                                   │
│   194 │   │   hidden_states = hidden_states + attn_output                                                                                         │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
UnboundLocalError: cannot access local variable 'attn_output' where it is not associated with a value
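The traceback points at a classic Python failure mode: in `transformer_flux.py`, `attn_output` is only bound inside a conditional branch, so if a compile backend (here, stable-fast) reroutes or skips that branch, the later use raises UnboundLocalError. Whether that is the exact cause here is unconfirmed (hence the request below to retest without compile); this simplified sketch, which is not the actual diffusers code, just reproduces the pattern:

```python
# Minimal reproduction of the UnboundLocalError pattern in the traceback:
# a local name bound only in one branch, then used unconditionally.

def block_forward(attention_ran: bool):
    if attention_ran:
        attn_output = "tensor"
    # If the branch above was skipped, attn_output was never bound and the
    # next line raises UnboundLocalError.
    return "gated " + attn_output

try:
    block_forward(attention_ran=False)
except UnboundLocalError as e:
    print(type(e).__name__, e)
```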

vladmandic (Owner) commented Jan 15, 2025

Yes, that is correct, but please simplify - is torch compile relevant to this issue? If it works without compile, then this issue is about torch.compile compatibility with flux-fill. If it still fails without compile, reproduce with a clean debug log, without compile.
That's always the first rule when diagnosing what's wrong - simplify. I cannot try to reproduce every single combination, and like I said, the base configuration worked for me.
Once reproduced, upload the full log (with the --debug command line flag enabled) from server start until the error.

Also, using flux without offloading is really expensive; not sure why you have it disabled. And stable-fast is not a flux thing.
Really, try with the default configuration.

@vladmandic (Owner)

Any updates here?
