Commit
Auto enable mem efficient attention on gfx1100 on pytorch nightly 2.7
I'm not sure which arches are supported yet. If you see improvements in memory usage while using --use-pytorch-cross-attention on your AMD GPU, let me know and I will add it to the list.
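For readers who want to picture the gating described above, here is a minimal sketch, not the code from this commit. It assumes the check boils down to "arch is on an allowlist and PyTorch is at least 2.7". The helper names (`parse_major_minor`, `base_arch`, `auto_enable_attention`), the `SUPPORTED_ARCHES` set and the example version strings are hypothetical; only gfx1100 and the 2.7 floor come from the commit title. On ROCm builds the arch string is typically read from the device properties (the "AMD arch:" line in the logs further down shows the kind of value), but that lookup is left out so the sketch runs without a GPU.

```python
from typing import Tuple

# Assumption: only the arch named in the commit title is allowlisted.
SUPPORTED_ARCHES = {"gfx1100"}
# Assumption: "pytorch nightly 2.7" is treated as a floor of (2, 7).
MIN_TORCH = (2, 7)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.6.0+rocm6.2.4'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def base_arch(gcn_arch_name: str) -> str:
    """Drop feature suffixes, e.g. 'gfx90a:sramecc+:xnack-' becomes 'gfx90a'."""
    return gcn_arch_name.split(":")[0]


def auto_enable_attention(gcn_arch_name: str, torch_version: str) -> bool:
    """Return True when both the version floor and the arch allowlist match."""
    new_enough = parse_major_minor(torch_version) >= MIN_TORCH
    return new_enough and base_arch(gcn_arch_name) in SUPPORTED_ARCHES


if __name__ == "__main__":
    # Illustrative inputs only; the nightly version string is made up.
    print(auto_enable_attention("gfx1100", "2.7.0.dev20250215+rocm6.3"))
    print(auto_enable_attention("gfx1100", "2.6.0+rocm6.2.4"))
    print(auto_enable_attention("gfx1030", "2.7.0"))
```

Comparing version tuples rather than individual digits keeps the check correct for a future 3.0 release, and adding a newly confirmed arch would only mean extending the set.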
d7b4bf2
Hey, after updating ComfyUI yesterday, my PC hard-locks on sampler step 5 of 12 in my workflow. It has happened 3 times in exactly the same way when using Hunyuan Video at 320x240x61.
I tried LTXv, sdxl and sd1 and had no issues.
It never happened like that before, so I tested a bit and determined that this commit was the cause. After commenting out this code, it works again.
Also, xformers has worked with the 7900 XTX for a few months now. There are still some incompatibilities with custom nodes, but not with core nodes. For example, it does not work with depth_anythingV2, but it does work with ml-depth-pro.
ComfyUI startup log below:
/ComfyUI$ MIOPEN_FIND_MODE=2 PYTORCH_TUNABLEOP_ENABLED=1 python main.py --use-pytorch-cross-attention --bf16-vae --reserve-vram 0.9
[START] Security scan
[DONE] Security scan
ComfyUI-Manager: installing dependencies done.
** ComfyUI startup time: 2025-02-15 12:29:17.545
** Platform: Linux
** Python version: 3.10.16 (main, Dec 11 2024, 16:24:50) [GCC 11.2.0]
** Python executable: /home/adminl/anaconda3/envs/ComfyUI_310_s/bin/python
** ComfyUI Path: /home/adminl/Comfy/minimal/ComfyUI
** ComfyUI Base Folder Path: /home/adminl/Comfy/minimal/ComfyUI
** User directory: /home/adminl/Comfy/minimal/ComfyUI/user
** ComfyUI-Manager config path: /home/adminl/Comfy/minimal/ComfyUI/user/default/ComfyUI-Manager/config.ini
** Log path: /home/adminl/Comfy/minimal/ComfyUI/user/comfyui.log
Prestartup times for custom nodes:
0.0 seconds: /home/adminl/Comfy/minimal/ComfyUI/custom_nodes/rgthree-comfy
0.0 seconds: /home/adminl/Comfy/minimal/ComfyUI/custom_nodes/ComfyUI-Easy-Use
1.5 seconds: /home/adminl/Comfy/minimal/ComfyUI/custom_nodes/ComfyUI-Manager
Checkpoint files will always be loaded safely.
Total VRAM 24560 MB, total RAM 128733 MB
pytorch version: 2.6.0+rocm6.2.4
xformers version: 0.0.29.post2
Set vram state to: NORMAL_VRAM
Device: cuda:0 Radeon RX 7900 XTX : native
Using pytorch attention
ComfyUI version: 0.3.14
[Prompt Server] web root: /home/adminl/Comfy/minimal/ComfyUI/web
Total VRAM 24560 MB, total RAM 128733 MB
pytorch version: 2.6.0+rocm6.2.4
xformers version: 0.0.29.post2
Set vram state to: NORMAL_VRAM
Device: cuda:0 Radeon RX 7900 XTX : native
d7b4bf2
Never mind, it seems that my files got corrupted somehow.
d7b4bf2
Tested on a notebook: SDXL, 1024x1536, euler_ancestral, simple
Hardware: Radeon 680M (Ryzen 7 6800H)
Using Arch Linux, PyTorch 2.6, up to date
I use --fp8_e5m2 to make Triton work and save 2 GB of RAM.
I use --novram to make --use-pytorch-cross-attention work; without it I get OOM.
--use-pytorch-cross-attention does not save RAM, but it can give a significant speedup.
without torch.compile, without --use-pytorch-cross-attention: 11.8s/it
without torch.compile, with --use-pytorch-cross-attention: 11.5s/it
with torch.compile, without --use-pytorch-cross-attention: 11.5s/it
with torch.compile, with --use-pytorch-cross-attention: 8s/it, about 30% less time per iteration
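As a quick check of the arithmetic behind that last figure, the sketch below compares the slowest and fastest timings reported above. The function names are hypothetical; the only inputs are the 11.8 s/it and 8 s/it numbers from this comment. It shows why the same result can be quoted as roughly 32% less time per iteration or roughly 47% more iterations per second, depending on which ratio is used.

```python
def time_saved_fraction(baseline_s_per_it: float, new_s_per_it: float) -> float:
    """Fraction of time per iteration removed relative to the baseline."""
    return (baseline_s_per_it - new_s_per_it) / baseline_s_per_it


def throughput_gain_fraction(baseline_s_per_it: float, new_s_per_it: float) -> float:
    """Relative increase in iterations per second."""
    return baseline_s_per_it / new_s_per_it - 1.0


if __name__ == "__main__":
    baseline = 11.8  # s/it with neither torch.compile nor pytorch attention
    combined = 8.0   # s/it with both enabled
    print(f"{time_saved_fraction(baseline, combined):.1%} less time per iteration")
    print(f"{throughput_gain_fraction(baseline, combined):.1%} more iterations per second")
```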
Checkpoint files will always be loaded safely.
Total VRAM 7584 MB, total RAM 15169 MB
pytorch version: 2.6.0
AMD arch: gfx1030
Set vram state to: NO_VRAM
Device: cuda:0 AMD Radeon Graphics : native
Using pytorch attention
ComfyUI version: 0.3.14
[Prompt Server] web root: /home/kane/ComfyUI/web
Import times for custom nodes:
0.0 seconds: /home/kane/ComfyUI/custom_nodes/websocket_image_save.py
0.0 seconds: /home/kane/ComfyUI/custom_nodes/ComfyUI-Custom-Scripts
Starting server
d7b4bf2
@sleppyrobot
This is not related to the commit, but it might help if you are talking about "Controlnet AIO Aux Preprocessor".
kijai/ComfyUI-DepthAnythingV2@003d7b4
d7b4bf2
Yeah, I was referring to that.