
ROCM support in Docker #681

Open. Krisseck wants to merge 2 commits into master.
Conversation

Krisseck

Closes issue #63

Added a new Docker profile for AUTOMATIC1111 with AMD ROCm support.

docker compose --profile auto-rocm up

Tried on my 7900 XTX, works okay.

xFormers has experimental AMD support in 0.0.25, but I could not get it to work: facebookresearch/xformers@44b0d07
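
For context, a compose service for this profile would look roughly like the sketch below. The service and profile name follow the command above; the build context, image tag, and the reuse of the shared base service are assumptions rather than the PR's actual diff, while /dev/kfd and /dev/dri are the usual device mappings for exposing an AMD GPU to a container.

auto-rocm:
  <<: *base_service                     # assumed reuse of the shared base service from docker-compose.yml
  profiles: ["auto-rocm"]               # enabled with: docker compose --profile auto-rocm up
  build:
    context: ./services/AUTOMATIC1111   # assumed build context
    dockerfile: Dockerfile.rocm         # the ROCm Dockerfile added by this PR
  image: sd-auto-rocm:latest            # assumed image tag
  devices:
    - /dev/kfd                          # AMD GPU compute interface
    - /dev/dri                          # AMD GPU render devices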

Mrrwmix commented May 20, 2024

This works for me on my 6750 XT, but I have to add HSA_OVERRIDE_GFX_VERSION=10.3.0 as an ENV variable in Dockerfile.rocm
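
For anyone reproducing this, that override is a single ENV instruction in Dockerfile.rocm; the exact placement is an assumption (anywhere before the entrypoint should do):

ENV HSA_OVERRIDE_GFX_VERSION=10.3.0   # makes ROCm treat the 6750 XT (gfx1031) as gfx1030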

Review comments on Dockerfile.rocm:

RUN . /clone.sh generative-models https://github.com/Stability-AI/generative-models 45c443b316737a4ab6e40413d7794a7f5657c19f

FROM rocm/pytorch:rocm6.0.2_ubuntu22.04_py3.10_pytorch_2.1.2
Owner:

Is this the only difference between this Dockerfile and the nvidia one?

Author (Krisseck):

Additionally, all references to CodeFormer were removed, as well as the NVIDIA_VISIBLE_DEVICES env variable.

Owner:

I think the environment variable would not make a difference in this case, right?

I am wondering if we can configure this different image as a build arg instead of duplicating the entire Dockerfile.
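
A minimal sketch of that idea, assuming the two Dockerfiles only differ in their base image (the default value and the build context path here are placeholders, not the repo's actual values):

# Dockerfile: one shared file, base image selected at build time
ARG BASE_IMAGE=pytorch/pytorch:2.1.2-cuda12.1-cudnn8-runtime   # placeholder default for the nvidia variant
FROM ${BASE_IMAGE}
# ... shared build steps ...

# docker-compose.yml: the ROCm profile overrides the base image
auto-rocm:
  build:
    context: ./services/AUTOMATIC1111   # assumed path
    args:
      BASE_IMAGE: rocm/pytorch:rocm6.0.2_ubuntu22.04_py3.10_pytorch_2.1.2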

picarica commented Jun 28, 2024

Hey, I tried testing your branch, but is the RX 550-580 series supported? I got this error:


webui-docker-auto-rocm-1  | Stable diffusion model failed to load
webui-docker-auto-rocm-1  | 
webui-docker-auto-rocm-1  | rocBLAS error: Cannot read /opt/rocm/lib/rocblas/library/TensileLibrary.dat: Illegal seek for GPU arch : gfx803
webui-docker-auto-rocm-1  |  List of available TensileLibrary Files : 

Any idea how to make it supported?

I think maybe the PyTorch version is too new, or the ROCm version, but I can't find which version to install.

@picarica

I tried changing from ROCm 6 to ROCm 5.7 and got this error: RuntimeError: "LayerNormKernelImpl" not implemented for 'Half' (sdlog.log).
What did I do wrong? :/

@picarica

Got it to run, but it still uses my CPU instead of my GPU (workingbutcpu.log).

@hakimilg

Do you think it would be possible to merge this feature, @AbdBarho? I would love to take advantage of the updates to your repo while still using my AMD GPU.

AJenbo commented Jul 22, 2024

Confirmed on Ubuntu 24.04 with a 7800 XT.

@JaCoB1123

Confirmed working on Manjaro with a 7800 GRE. I had to add HIP_VISIBLE_DEVICES=0 to the docker-compose.yml

@picarica

> Confirmed working on Manjaro with a 7800 GRE. I had to add HIP_VISIBLE_DEVICES=0 to the docker-compose.yml

Where did you add that? I'm having issues getting it to work with my RX 570.

@JaCoB1123

> Confirmed working on Manjaro with a 7800 GRE. I had to add HIP_VISIBLE_DEVICES=0 to the docker-compose.yml

> Where did you add that? I'm having issues getting it to work with my RX 570.

I can't check the file currently, but I think it was in the base-service at the top of the file, something like this (taken from the current docker-compose.yml):

x-base_service: &base_service
    ports:
      - "${WEBUI_PORT:-7860}:7860"
    volumes:
      - &v1 ./data:/data
      - &v2 ./output:/output
    stop_signal: SIGKILL
    environment:
      - HIP_VISIBLE_DEVICES=0
    tty: true
    deploy:
      resources:
        reservations:
          devices:
              - driver: nvidia
                device_ids: ['0']
                capabilities: [compute, utility]

name: webui-docker

services:
...
