sync with repo 28.08

2024-08-28 19:33:34 +03:00
commit ad1e3ecbcb
parent 727693318c
134 changed files with 112534 additions and 12635 deletions
--- a/README.md
+++ b/README.md
@ -1,8 +1,35 @@
-ComfyUI
-=======
-The most powerful and modular stable diffusion GUI and backend.
------------
+<div align="center">
+
+# ComfyUI
+**The most powerful and modular stable diffusion GUI and backend.**
 [![Website][website-shield]][website-url]
 [![Dynamic JSON Badge][discord-shield]][discord-url]
 [![Matrix][matrix-shield]][matrix-url]
 <br>
 [![][github-release-shield]][github-release-link]
 [![][github-release-date-shield]][github-release-link]
 [![][github-downloads-shield]][github-downloads-link]
 [![][github-downloads-latest-shield]][github-downloads-link]
 [matrix-shield]: https://img.shields.io/badge/Matrix-000000?style=flat&logo=matrix&logoColor=white
 [matrix-url]: https://app.element.io/#/room/%23comfyui_space%3Amatrix.org
 [website-shield]: https://img.shields.io/badge/ComfyOrg-4285F4?style=flat
 [website-url]: https://www.comfy.org/
 <!-- Workaround to display total user from https://github.com/badges/shields/issues/4500#issuecomment-2060079995 -->
 [discord-shield]: https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fdiscord.com%2Fapi%2Finvites%2Fcomfyorg%3Fwith_counts%3Dtrue&query=%24.approximate_member_count&logo=discord&logoColor=white&label=Discord&color=green&suffix=%20total
 [discord-url]: https://www.comfy.org/discord
 [github-release-shield]: https://img.shields.io/github/v/release/comfyanonymous/ComfyUI?style=flat&sort=semver
 [github-release-link]: https://github.com/comfyanonymous/ComfyUI/releases
 [github-release-date-shield]: https://img.shields.io/github/release-date/comfyanonymous/ComfyUI?style=flat
 [github-downloads-shield]: https://img.shields.io/github/downloads/comfyanonymous/ComfyUI/total?style=flat
 [github-downloads-latest-shield]: https://img.shields.io/github/downloads/comfyanonymous/ComfyUI/latest/total?style=flat&label=downloads%40latest
 [github-downloads-link]: https://github.com/comfyanonymous/ComfyUI/releases
 ![ComfyUI Screenshot](comfyui_screenshot.png)
 </div>
 This UI will let you design and execute advanced stable diffusion pipelines using a graph/nodes/flowchart based interface. To see workflow examples and what ComfyUI can do, check out:
 ### [ComfyUI Examples](https://comfyanonymous.github.io/ComfyUI_examples/)
@ -48,6 +75,7 @@ Workflow examples can be found on the [Examples page](https://comfyanonymous.git
 |------------------------------------|--------------------------------------------------------------------------------------------------------------------|
 | Ctrl + Enter                       | Queue up current graph for generation                                                                              |
 | Ctrl + Shift + Enter               | Queue up current graph as first for generation                                                                     |
 | Ctrl + Alt + Enter                 | Cancel current generation                                                                                          |
 | Ctrl + Z/Ctrl + Y                  | Undo/Redo                                                                                                          |
 | Ctrl + S                           | Save workflow                                                                                                      |
 | Ctrl + O                           | Load workflow                                                                                                      |
@ -70,6 +98,8 @@ Workflow examples can be found on the [Examples page](https://comfyanonymous.git
 | H                                  | Toggle visibility of history                                                                                       |
 | R                                  | Refresh graph                                                                                                      |
 | Double-Click LMB                   | Open node quick search palette                                                                                     |
 | Shift + Drag                       | Move multiple wires at once                                                                                        |
 | Ctrl + Alt + LMB                   | Disconnect all wires from clicked slot                                                                             |
 Ctrl can also be replaced with Cmd instead for macOS users
@ -165,20 +195,6 @@ You can install ComfyUI in Apple Mac silicon (M1 or M2) with any recent macOS ve
 ```pip install torch-directml``` Then you can launch ComfyUI with: ```python main.py --directml```
 ### I already have another UI for Stable Diffusion installed do I really have to install all of these dependencies?
 You don't. If you have another UI installed and working with its own python venv you can use that venv to run ComfyUI. You can open up your favorite terminal and activate it:
 ```source path_to_other_sd_gui/venv/bin/activate```
 or on Windows:
 With Powershell: ```"path_to_other_sd_gui\venv\Scripts\Activate.ps1"```
 With cmd.exe: ```"path_to_other_sd_gui\venv\Scripts\activate.bat"```
 And then you can use that terminal to run ComfyUI without installing any dependencies. Note that the venv folder might be called something else depending on the SD UI.
 # Running
 ```python main.py```
@ -214,7 +230,7 @@ To use a textual inversion concepts/embeddings in a text prompt put them in the
 Use ```--preview-method auto``` to enable previews.
-The default installation includes a fast latent preview method that's low-resolution. To enable higher-quality previews with [TAESD](https://github.com/madebyollin/taesd), download the [taesd_decoder.pth](https://github.com/madebyollin/taesd/raw/main/taesd_decoder.pth) (for SD1.x and SD2.x) and [taesdxl_decoder.pth](https://github.com/madebyollin/taesd/raw/main/taesdxl_decoder.pth) (for SDXL) models and place them in the `models/vae_approx` folder. Once they're installed, restart ComfyUI to enable high-quality previews.
+The default installation includes a fast latent preview method that's low-resolution. To enable higher-quality previews with [TAESD](https://github.com/madebyollin/taesd), download the [taesd_decoder.pth, taesdxl_decoder.pth, taesd3_decoder.pth and taef1_decoder.pth](https://github.com/madebyollin/taesd/) and place them in the `models/vae_approx` folder. Once they're installed, restart ComfyUI and launch it with `--preview-method taesd` to enable high-quality previews.
 ## How to use TLS/SSL?
 Generate a self-signed certificate (not appropriate for shared/production use) and key by running the command: `openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -sha256 -days 3650 -nodes -subj "/C=XX/ST=StateName/L=CityName/O=CompanyName/OU=CompanySectionName/CN=CommonNameOrHostname"`
@ -230,6 +246,47 @@ Use `--tls-keyfile key.pem --tls-certfile cert.pem` to enable TLS/SSL, the app w
 See also: [https://www.comfy.org/](https://www.comfy.org/)
 ## Frontend Development
 As of August 15, 2024, we have transitioned to a new frontend, which is now hosted in a separate repository: [ComfyUI Frontend](https://github.com/Comfy-Org/ComfyUI_frontend). This repository now hosts the compiled JS (from TS/Vue) under the `web/` directory.
 ### Reporting Issues and Requesting Features
 For any bugs, issues, or feature requests related to the frontend, please use the [ComfyUI Frontend repository](https://github.com/Comfy-Org/ComfyUI_frontend). This will help us manage and address frontend-specific concerns more efficiently.
 ### Using the Latest Frontend
 The new frontend is now the default for ComfyUI. However, please note:
 1. The frontend in the main ComfyUI repository is updated weekly.
 2. Daily releases are available in the separate frontend repository.
 To use the most up-to-date frontend version:
 1. For the latest daily release, launch ComfyUI with this command line argument:
   ```
   --front-end-version Comfy-Org/ComfyUI_frontend@latest
   ```
 2. For a specific version, replace `latest` with the desired version number:
   ```
   --front-end-version Comfy-Org/ComfyUI_frontend@1.2.2
   ```
 This approach allows you to easily switch between the stable weekly release and the cutting-edge daily updates, or even specific versions for testing purposes.
 ### Accessing the Legacy Frontend
 If you need to use the legacy frontend for any reason, you can access it using the following command line argument:
 ```
 --front-end-version Comfy-Org/ComfyUI_legacy_frontend@latest
 ```
 This will use a snapshot of the legacy frontend preserved in the [ComfyUI Legacy Frontend repository](https://github.com/Comfy-Org/ComfyUI_legacy_frontend).
 # QA
 ### Which GPU should I buy for this?
--- a/api_server/init.py
+++ b/api_server/init.py
--- a/api_server/routes/init.py
+++ b/api_server/routes/init.py
--- a/api_server/routes/internal/README.md
+++ b/api_server/routes/internal/README.md
@ -0,0 +1,3 @@
 # ComfyUI Internal Routes
 All routes under the `/internal` path are designated for **internal use by ComfyUI only**. These routes are not intended for use by external applications and may change at any time without notice.
--- a/api_server/routes/internal/init.py
+++ b/api_server/routes/internal/init.py
--- a/api_server/routes/internal/internal_routes.py
+++ b/api_server/routes/internal/internal_routes.py
@ -0,0 +1,40 @@
 from aiohttp import web
 from typing import Optional
 from folder_paths import models_dir, user_directory, output_directory
 from api_server.services.file_service import FileService
 class InternalRoutes:
    '''
    The top level web router for internal routes: /internal/*
    The endpoints here should NOT be depended upon. It is for ComfyUI frontend use only.
    Check README.md for more information.
    '''
    def __init__(self):
        self.routes: web.RouteTableDef = web.RouteTableDef()
        self._app: Optional[web.Application] = None
        self.file_service = FileService({
            "models": models_dir,
            "user": user_directory,
            "output": output_directory
        })
    def setup_routes(self):
        @self.routes.get('/files')
        async def list_files(request):
            directory_key = request.query.get('directory', '')
            try:
                file_list = self.file_service.list_files(directory_key)
                return web.json_response({"files": file_list})
            except ValueError as e:
                return web.json_response({"error": str(e)}, status=400)
            except Exception as e:
                return web.json_response({"error": str(e)}, status=500)
    def get_app(self):
        if self._app is None:
            self._app = web.Application()
            self.setup_routes()
            self._app.add_routes(self.routes)
        return self._app
--- a/api_server/services/init.py
+++ b/api_server/services/init.py
--- a/api_server/services/file_service.py
+++ b/api_server/services/file_service.py
@ -0,0 +1,13 @@
 from typing import Dict, List, Optional
 from api_server.utils.file_operations import FileSystemOperations, FileSystemItem
 class FileService:
    def __init__(self, allowed_directories: Dict[str, str], file_system_ops: Optional[FileSystemOperations] = None):
        self.allowed_directories: Dict[str, str] = allowed_directories
        self.file_system_ops: FileSystemOperations = file_system_ops or FileSystemOperations()
    def list_files(self, directory_key: str) -> List[FileSystemItem]:
        if directory_key not in self.allowed_directories:
            raise ValueError("Invalid directory key")
        directory_path: str = self.allowed_directories[directory_key]
        return self.file_system_ops.walk_directory(directory_path)
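The key-based allow-list in `FileService.list_files` can be exercised without touching the disk by injecting a stub for the file system operations. The sketch below is a standalone illustration: `StubOps`, `FileServiceSketch`, and the `/srv/models` path are hypothetical stand-ins written for this note, and only the lookup logic mirrors the class in the diff.

```python
from typing import Dict, List


class StubOps:
    """Hypothetical stand-in for FileSystemOperations; records each path it is asked to walk."""

    def __init__(self) -> None:
        self.walked: List[str] = []

    def walk_directory(self, directory: str) -> List[dict]:
        self.walked.append(directory)
        return [{"name": "a.txt", "path": "a.txt", "type": "file", "size": 0}]


class FileServiceSketch:
    """Illustrative copy of the lookup logic: only known keys map to a directory."""

    def __init__(self, allowed_directories: Dict[str, str], file_system_ops) -> None:
        self.allowed_directories = allowed_directories
        self.file_system_ops = file_system_ops

    def list_files(self, directory_key: str) -> List[dict]:
        if directory_key not in self.allowed_directories:
            raise ValueError("Invalid directory key")
        return self.file_system_ops.walk_directory(self.allowed_directories[directory_key])


ops = StubOps()
service = FileServiceSketch({"models": "/srv/models"}, ops)
listing = service.list_files("models")

# A caller-supplied path is never used directly, so an unknown key is rejected.
try:
    service.list_files("../etc")
    rejected = False
except ValueError:
    rejected = True
```

Because the caller only ever passes a key, the directory that is actually walked always comes from the configured mapping.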
--- a/api_server/utils/file_operations.py
+++ b/api_server/utils/file_operations.py
@ -0,0 +1,42 @@
 import os
 from typing import List, Union, TypedDict, Literal
 from typing_extensions import TypeGuard
 class FileInfo(TypedDict):
    name: str
    path: str
    type: Literal["file"]
    size: int
 class DirectoryInfo(TypedDict):
    name: str
    path: str
    type: Literal["directory"]
 FileSystemItem = Union[FileInfo, DirectoryInfo]
 def is_file_info(item: FileSystemItem) -> TypeGuard[FileInfo]:
    return item["type"] == "file"
 class FileSystemOperations:
    @staticmethod
    def walk_directory(directory: str) -> List[FileSystemItem]:
        file_list: List[FileSystemItem] = []
        for root, dirs, files in os.walk(directory):
            for name in files:
                file_path = os.path.join(root, name)
                relative_path = os.path.relpath(file_path, directory)
                file_list.append({
                    "name": name,
                    "path": relative_path,
                    "type": "file",
                    "size": os.path.getsize(file_path)
                })
            for name in dirs:
                dir_path = os.path.join(root, name)
                relative_path = os.path.relpath(dir_path, directory)
                file_list.append({
                    "name": name,
                    "path": relative_path,
                    "type": "directory"
                })
        return file_list
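To make the shape of the returned list concrete, the traversal can be run against a temporary directory. This is a sketch under stated assumptions: the function below is a condensed standalone copy of `walk_directory` (it does not import `api_server`), and the `loras/style.bin` layout is invented for the example.

```python
import os
import tempfile


def walk_directory(directory):
    """Condensed standalone copy of the traversal, for illustration only."""
    items = []
    for root, dirs, files in os.walk(directory):
        for name in files:
            full = os.path.join(root, name)
            items.append({"name": name, "path": os.path.relpath(full, directory),
                          "type": "file", "size": os.path.getsize(full)})
        for name in dirs:
            full = os.path.join(root, name)
            items.append({"name": name, "path": os.path.relpath(full, directory),
                          "type": "directory"})
    return items


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: one subdirectory holding one 16-byte file.
    os.makedirs(os.path.join(tmp, "loras"))
    with open(os.path.join(tmp, "loras", "style.bin"), "wb") as f:
        f.write(b"\x00" * 16)
    result = walk_directory(tmp)

# Index by relative path; paths are relative to the walked root, not absolute.
by_path = {item["path"]: item for item in result}
```

Directory entries carry no `size` field, which is why the diff models files and directories as two separate `TypedDict` types.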
--- a/app/frontend_management.py
+++ b/app/frontend_management.py
@ -8,7 +8,7 @@ import zipfile
 from dataclasses import dataclass
 from functools import cached_property
 from pathlib import Path
-from typing import TypedDict
+from typing import TypedDict, Optional
 import requests
 from typing_extensions import NotRequired
@ -132,12 +132,13 @@ class FrontendManager:
        return match_result.group(1), match_result.group(2), match_result.group(3)
    @classmethod
-    def init_frontend_unsafe(cls, version_string: str) -> str:
+    def init_frontend_unsafe(cls, version_string: str, provider: Optional[FrontEndProvider] = None) -> str:
        """
        Initializes the frontend for the specified version.
        Args:
            version_string (str): The version string.
            provider (FrontEndProvider, optional): The provider to use. Defaults to None.
        Returns:
            str: The path to the initialized frontend.
@ -150,7 +151,7 @@ class FrontendManager:
            return cls.DEFAULT_FRONTEND_PATH
        repo_owner, repo_name, version = cls.parse_version_string(version_string)
-        provider = FrontEndProvider(repo_owner, repo_name)
+        provider = provider or FrontEndProvider(repo_owner, repo_name)
        release = provider.get_release(version)
        semantic_version = release["tag_name"].lstrip("v")
@ -158,15 +159,21 @@ class FrontendManager:
            Path(cls.CUSTOM_FRONTENDS_ROOT) / provider.folder_name / semantic_version
        )
        if not os.path.exists(web_root):
-            os.makedirs(web_root, exist_ok=True)
-            logging.info(
-                "Downloading frontend(%s) version(%s) to (%s)",
-                provider.folder_name,
-                semantic_version,
-                web_root,
-            )
-            logging.debug(release)
-            download_release_asset_zip(release, destination_path=web_root)
+            try:
+                os.makedirs(web_root, exist_ok=True)
+                logging.info(
+                    "Downloading frontend(%s) version(%s) to (%s)",
+                    provider.folder_name,
+                    semantic_version,
+                    web_root,
+                )
+                logging.debug(release)
+                download_release_asset_zip(release, destination_path=web_root)
+            finally:
+                # Clean up the directory if it is empty, i.e. the download failed
+                if not os.listdir(web_root):
+                    os.rmdir(web_root)
        return web_root
    @classmethod
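The cleanup added to `init_frontend_unsafe` follows a common pattern: create the target directory, attempt the download, and remove the directory again if nothing was written. A minimal sketch of that pattern, assuming hypothetical helpers (`fetch_into`, `failing_download`, `working_download`) that are not part of the commit:

```python
import os
import tempfile


def fetch_into(target_dir, download):
    """Create target_dir, run download(target_dir), and remove the directory if it stayed empty."""
    os.makedirs(target_dir, exist_ok=True)
    try:
        download(target_dir)
    finally:
        # An empty directory means nothing was written, so do not leave it behind.
        if not os.listdir(target_dir):
            os.rmdir(target_dir)
    return target_dir


def failing_download(path):
    # Hypothetical downloader that fails before writing anything.
    raise RuntimeError("simulated network failure")


def working_download(path):
    # Hypothetical downloader that writes a single file.
    with open(os.path.join(path, "index.html"), "w") as f:
        f.write("<html></html>")


with tempfile.TemporaryDirectory() as root:
    bad = os.path.join(root, "1.2.2")
    try:
        fetch_into(bad, failing_download)
    except RuntimeError:
        pass
    bad_exists = os.path.exists(bad)

    good = os.path.join(root, "1.2.3")
    fetch_into(good, working_download)
    good_files = os.listdir(good)
```

Unlike the diff, this sketch calls `os.makedirs` before entering the `try` block, so the cleanup never runs against a directory that could not be created. Without the cleanup, a failed download would leave an empty version folder that the `os.path.exists(web_root)` check treats as an installed frontend on the next launch.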
--- a/comfy/cli_args.py
+++ b/comfy/cli_args.py
@ -92,6 +92,10 @@ class LatentPreviewMethod(enum.Enum):
 parser.add_argument("--preview-method", type=LatentPreviewMethod, default=LatentPreviewMethod.NoPreviews, help="Default preview method for sampler nodes.", action=EnumAction)
 cache_group = parser.add_mutually_exclusive_group()
 cache_group.add_argument("--cache-classic", action="store_true", help="Use the old style (aggressive) caching.")
 cache_group.add_argument("--cache-lru", type=int, default=0, help="Use LRU caching with a maximum of N node results cached. May use more RAM/VRAM.")
 attn_group = parser.add_mutually_exclusive_group()
 attn_group.add_argument("--use-split-cross-attention", action="store_true", help="Use the split cross attention optimization. Ignored when xformers is used.")
 attn_group.add_argument("--use-quad-cross-attention", action="store_true", help="Use the sub-quadratic cross attention optimization . Ignored when xformers is used.")
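The two cache flags are registered in a mutually exclusive group, so `argparse` rejects a command line that passes both. A short standalone sketch of that behavior, reusing only the two flag names from the diff (help strings omitted):

```python
import argparse

parser = argparse.ArgumentParser()
cache_group = parser.add_mutually_exclusive_group()
cache_group.add_argument("--cache-classic", action="store_true")
cache_group.add_argument("--cache-lru", type=int, default=0)

# A single flag from the group parses normally.
args = parser.parse_args(["--cache-lru", "10"])

# Passing both flags is a usage error.
try:
    parser.parse_args(["--cache-classic", "--cache-lru", "10"])
    conflict_rejected = False
except SystemExit:
    # argparse reports the conflict on stderr and exits with status 2.
    conflict_rejected = True
```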
@ -112,10 +116,14 @@ vram_group.add_argument("--lowvram", action="store_true", help="Split the unet i
 vram_group.add_argument("--novram", action="store_true", help="When lowvram isn't enough.")
 vram_group.add_argument("--cpu", action="store_true", help="To use the CPU for everything (slow).")
 parser.add_argument("--reserve-vram", type=float, default=None, help="Set the amount of vram in GB you want to reserve for use by your OS/other software. By default some amount is reserved depending on your OS.")
 parser.add_argument("--default-hashing-function", type=str, choices=['md5', 'sha1', 'sha256', 'sha512'], default='sha256', help="Allows you to choose the hash function to use for duplicate filename / contents comparison. Default is sha256.")
 parser.add_argument("--disable-smart-memory", action="store_true", help="Force ComfyUI to aggressively offload to regular ram instead of keeping models in vram when it can.")
 parser.add_argument("--deterministic", action="store_true", help="Make pytorch use slower deterministic algorithms when it can. Note that this might not make images deterministic in all cases.")
 parser.add_argument("--fast", action="store_true", help="Enable some untested and potentially quality deteriorating optimizations.")
 parser.add_argument("--dont-print-server", action="store_true", help="Don't print server output.")
 parser.add_argument("--quick-test-for-ci", action="store_true", help="Quick test for CI.")
--- a/comfy/clip_model.py
+++ b/comfy/clip_model.py
@ -88,10 +88,11 @@ class CLIPTextModel_(torch.nn.Module):
        heads = config_dict["num_attention_heads"]
        intermediate_size = config_dict["intermediate_size"]
        intermediate_activation = config_dict["hidden_act"]
        num_positions = config_dict["max_position_embeddings"]
        self.eos_token_id = config_dict["eos_token_id"]
        super().__init__()
-        self.embeddings = CLIPEmbeddings(embed_dim, dtype=dtype, device=device, operations=operations)
+        self.embeddings = CLIPEmbeddings(embed_dim, num_positions=num_positions, dtype=dtype, device=device, operations=operations)
        self.encoder = CLIPEncoder(num_layers, embed_dim, heads, intermediate_size, intermediate_activation, dtype, device, operations)
        self.final_layer_norm = operations.LayerNorm(embed_dim, dtype=dtype, device=device)
@ -123,7 +124,6 @@ class CLIPTextModel(torch.nn.Module):
        self.text_model = CLIPTextModel_(config_dict, dtype, device, operations)
        embed_dim = config_dict["hidden_size"]
        self.text_projection = operations.Linear(embed_dim, embed_dim, bias=False, dtype=dtype, device=device)
        self.text_projection.weight.copy_(torch.eye(embed_dim))
        self.dtype = dtype
    def get_input_embeddings(self):
--- a/comfy/controlnet.py
+++ b/comfy/controlnet.py
@ -1,4 +1,24 @@
 """
    This file is part of ComfyUI.
    Copyright (C) 2024 Comfy
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.
    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <https://www.gnu.org/licenses/>.
 """
 import torch
 from enum import Enum
 import math
 import os
 import logging
@ -13,6 +33,8 @@ import comfy.cldm.cldm
 import comfy.t2i_adapter.adapter
 import comfy.ldm.cascade.controlnet
 import comfy.cldm.mmdit
 import comfy.ldm.hydit.controlnet
 import comfy.ldm.flux.controlnet_xlabs
 def broadcast_image_to(tensor, target_batch_size, batched_number):
@ -33,6 +55,10 @@ def broadcast_image_to(tensor, target_batch_size, batched_number):
    else:
        return torch.cat([tensor] * batched_number, dim=0)
 class StrengthType(Enum):
    CONSTANT = 1
    LINEAR_UP = 2
 class ControlBase:
    def __init__(self, device=None):
        self.cond_hint_original = None
@ -51,6 +77,8 @@ class ControlBase:
            device = comfy.model_management.get_torch_device()
        self.device = device
        self.previous_controlnet = None
        self.extra_conds = []
        self.strength_type = StrengthType.CONSTANT
    def set_cond_hint(self, cond_hint, strength=1.0, timestep_percent_range=(0.0, 1.0), vae=None):
        self.cond_hint_original = cond_hint
@ -93,6 +121,8 @@ class ControlBase:
        c.latent_format = self.latent_format
        c.extra_args = self.extra_args.copy()
        c.vae = self.vae
        c.extra_conds = self.extra_conds.copy()
        c.strength_type = self.strength_type
    def inference_memory_requirements(self, dtype):
        if self.previous_controlnet is not None:
@ -113,7 +143,10 @@ class ControlBase:
                    if x not in applied_to: #memory saving strategy, allow shared tensors and only apply strength to shared tensors once
                        applied_to.add(x)
-                        x *= self.strength
+                        if self.strength_type == StrengthType.CONSTANT:
                            x *= self.strength
                        elif self.strength_type == StrengthType.LINEAR_UP:
                            x *= (self.strength ** float(len(control_output) - i))
                    if x.dtype != output_dtype:
                        x = x.to(output_dtype)
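The difference between the two strength types is easiest to see as a list of per-output multipliers. The sketch below assumes that `i` is the zero-based index of an output within `control_output`, which the hunk does not show; `layer_multipliers` is a hypothetical helper written for this note.

```python
def layer_multipliers(strength, num_outputs, linear_up):
    """Return the factor applied to each control output, indexed from 0."""
    if not linear_up:
        # CONSTANT: every output is scaled by the same strength.
        return [strength for _ in range(num_outputs)]
    # LINEAR_UP: the exponent shrinks from num_outputs down to 1.
    return [strength ** float(num_outputs - i) for i in range(num_outputs)]


constant = layer_multipliers(0.5, 3, linear_up=False)
ramped = layer_multipliers(0.5, 3, linear_up=True)
print(constant)  # [0.5, 0.5, 0.5]
print(ramped)    # [0.125, 0.25, 0.5]
```

With a strength below 1, earlier outputs are attenuated more than later ones. Despite the name, the ramp is geometric in the strength rather than linear.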
@ -142,7 +175,7 @@ class ControlBase:
 class ControlNet(ControlBase):
-    def __init__(self, control_model=None, global_average_pooling=False, compression_ratio=8, latent_format=None, device=None, load_device=None, manual_cast_dtype=None):
+    def __init__(self, control_model=None, global_average_pooling=False, compression_ratio=8, latent_format=None, device=None, load_device=None, manual_cast_dtype=None, extra_conds=["y"], strength_type=StrengthType.CONSTANT):
        super().__init__(device)
        self.control_model = control_model
        self.load_device = load_device
@ -154,6 +187,8 @@ class ControlNet(ControlBase):
        self.model_sampling_current = None
        self.manual_cast_dtype = manual_cast_dtype
        self.latent_format = latent_format
        self.extra_conds += extra_conds
        self.strength_type = strength_type
    def get_control(self, x_noisy, t, cond, batched_number):
        control_prev = None
@ -191,13 +226,16 @@ class ControlNet(ControlBase):
            self.cond_hint = broadcast_image_to(self.cond_hint, x_noisy.shape[0], batched_number)
        context = cond.get('crossattn_controlnet', cond['c_crossattn'])
-        y = cond.get('y', None)
-        if y is not None:
-            y = y.to(dtype)
+        extra = self.extra_args.copy()
+        for c in self.extra_conds:
+            temp = cond.get(c, None)
+            if temp is not None:
+                extra[c] = temp.to(dtype)
        timestep = self.model_sampling_current.timestep(t)
        x_noisy = self.model_sampling_current.calculate_input(t, x_noisy)
-        control = self.control_model(x=x_noisy.to(dtype), hint=self.cond_hint, timesteps=timestep.float(), context=context.to(dtype), y=y, **self.extra_args)
+        control = self.control_model(x=x_noisy.to(dtype), hint=self.cond_hint, timesteps=timestep.to(dtype), context=context.to(dtype), **extra)
        return self.control_merge(control, control_prev, output_dtype)
    def copy(self):
@ -286,6 +324,7 @@ class ControlLora(ControlNet):
        ControlBase.__init__(self, device)
        self.control_weights = control_weights
        self.global_average_pooling = global_average_pooling
        self.extra_conds += ["y"]
    def pre_run(self, model, percent_to_timestep_function):
        super().pre_run(model, percent_to_timestep_function)
@ -338,12 +377,8 @@ class ControlLora(ControlNet):
    def inference_memory_requirements(self, dtype):
        return comfy.utils.calculate_parameters(self.control_weights) * comfy.model_management.dtype_size(dtype) + ControlBase.inference_memory_requirements(self, dtype)
-def load_controlnet_mmdit(sd):
-    new_sd = comfy.model_detection.convert_diffusers_mmdit(sd, "")
-    model_config = comfy.model_detection.model_config_from_unet(new_sd, "", True)
-    num_blocks = comfy.model_detection.count_blocks(new_sd, 'joint_blocks.{}.')
-    for k in sd:
-        new_sd[k] = sd[k]
+def controlnet_config(sd):
+    model_config = comfy.model_detection.model_config_from_unet(sd, "", True)
    supported_inference_dtypes = model_config.supported_inference_dtypes
@ -356,14 +391,28 @@ def load_controlnet_mmdit(sd):
    else:
        operations = comfy.ops.disable_weight_init
-    control_model = comfy.cldm.mmdit.ControlNet(num_blocks=num_blocks, operations=operations, device=load_device, dtype=unet_dtype, **controlnet_config)
-    missing, unexpected = control_model.load_state_dict(new_sd, strict=False)
+    offload_device = comfy.model_management.unet_offload_device()
+    return model_config, operations, load_device, unet_dtype, manual_cast_dtype, offload_device
 def controlnet_load_state_dict(control_model, sd):
    missing, unexpected = control_model.load_state_dict(sd, strict=False)
    if len(missing) > 0:
        logging.warning("missing controlnet keys: {}".format(missing))
    if len(unexpected) > 0:
        logging.debug("unexpected controlnet keys: {}".format(unexpected))
    return control_model
 def load_controlnet_mmdit(sd):
    new_sd = comfy.model_detection.convert_diffusers_mmdit(sd, "")
    model_config, operations, load_device, unet_dtype, manual_cast_dtype, offload_device = controlnet_config(new_sd)
    num_blocks = comfy.model_detection.count_blocks(new_sd, 'joint_blocks.{}.')
    for k in sd:
        new_sd[k] = sd[k]
    control_model = comfy.cldm.mmdit.ControlNet(num_blocks=num_blocks, operations=operations, device=offload_device, dtype=unet_dtype, **model_config.unet_config)
    control_model = controlnet_load_state_dict(control_model, new_sd)
    latent_format = comfy.latent_formats.SD3()
    latent_format.shift_factor = 0 #SD3 controlnet weirdness
@ -371,8 +420,31 @@ def load_controlnet_mmdit(sd):
    return control
 def load_controlnet_hunyuandit(controlnet_data):
    model_config, operations, load_device, unet_dtype, manual_cast_dtype, offload_device = controlnet_config(controlnet_data)
    control_model = comfy.ldm.hydit.controlnet.HunYuanControlNet(operations=operations, device=offload_device, dtype=unet_dtype)
    control_model = controlnet_load_state_dict(control_model, controlnet_data)
    latent_format = comfy.latent_formats.SDXL()
    extra_conds = ['text_embedding_mask', 'encoder_hidden_states_t5', 'text_embedding_mask_t5', 'image_meta_size', 'style', 'cos_cis_img', 'sin_cis_img']
    control = ControlNet(control_model, compression_ratio=1, latent_format=latent_format, load_device=load_device, manual_cast_dtype=manual_cast_dtype, extra_conds=extra_conds, strength_type=StrengthType.CONSTANT)
    return control
 def load_controlnet_flux_xlabs(sd):
    model_config, operations, load_device, unet_dtype, manual_cast_dtype, offload_device = controlnet_config(sd)
    control_model = comfy.ldm.flux.controlnet_xlabs.ControlNetFlux(operations=operations, device=offload_device, dtype=unet_dtype, **model_config.unet_config)
    control_model = controlnet_load_state_dict(control_model, sd)
    extra_conds = ['y', 'guidance']
    control = ControlNet(control_model, load_device=load_device, manual_cast_dtype=manual_cast_dtype, extra_conds=extra_conds)
    return control
 def load_controlnet(ckpt_path, model=None):
    controlnet_data = comfy.utils.load_torch_file(ckpt_path, safe_load=True)
    if 'after_proj_list.18.bias' in controlnet_data.keys(): #Hunyuan DiT
        return load_controlnet_hunyuandit(controlnet_data)
    if "lora_controlnet" in controlnet_data:
        return ControlLora(controlnet_data)
@ -430,7 +502,10 @@ def load_controlnet(ckpt_path, model=None):
            logging.warning("leftover keys: {}".format(leftover_keys))
        controlnet_data = new_sd
    elif "controlnet_blocks.0.weight" in controlnet_data: #SD3 diffusers format
-        return load_controlnet_mmdit(controlnet_data)
+        if "double_blocks.0.img_attn.norm.key_norm.scale" in controlnet_data:
            return load_controlnet_flux_xlabs(controlnet_data)
        else:
            return load_controlnet_mmdit(controlnet_data)
    pth_key = 'control_model.zero_convs.0.0.weight'
    pth = False
@ -462,6 +537,7 @@ def load_controlnet(ckpt_path, model=None):
    if manual_cast_dtype is not None:
        controlnet_config["operations"] = comfy.ops.manual_cast
    controlnet_config["dtype"] = unet_dtype
    controlnet_config["device"] = comfy.model_management.unet_offload_device()
    controlnet_config.pop("out_channels")
    controlnet_config["hint_channels"] = controlnet_data["{}input_hint_block.0.weight".format(prefix)].shape[1]
    control_model = comfy.cldm.cldm.ControlNet(**controlnet_config)
--- a/comfy/diffusers_load.py
+++ b/comfy/diffusers_load.py
@ -22,7 +22,7 @@ def load_diffusers(model_path, output_vae=True, output_clip=True, embedding_dire
    if text_encoder2_path is not None:
        text_encoder_paths.append(text_encoder2_path)
-    unet = comfy.sd.load_unet(unet_path)
+    unet = comfy.sd.load_diffusion_model(unet_path)
    clip = None
    if output_clip:
--- a/comfy/float.py
+++ b/comfy/float.py
@ -0,0 +1,62 @@
 import torch
 import math
 def calc_mantissa(abs_x, exponent, normal_mask, MANTISSA_BITS, EXPONENT_BIAS, generator=None):
    mantissa_scaled = torch.where(
        normal_mask,
        (abs_x / (2.0 ** (exponent - EXPONENT_BIAS)) - 1.0) * (2**MANTISSA_BITS),
        (abs_x / (2.0 ** (-EXPONENT_BIAS + 1 - MANTISSA_BITS)))
    )
    mantissa_scaled += torch.rand(mantissa_scaled.size(), dtype=mantissa_scaled.dtype, layout=mantissa_scaled.layout, device=mantissa_scaled.device, generator=generator)
    return mantissa_scaled.floor() / (2**MANTISSA_BITS)
 #Not 100% sure about this
 def manual_stochastic_round_to_float8(x, dtype, generator=None):
    if dtype == torch.float8_e4m3fn:
        EXPONENT_BITS, MANTISSA_BITS, EXPONENT_BIAS = 4, 3, 7
    elif dtype == torch.float8_e5m2:
        EXPONENT_BITS, MANTISSA_BITS, EXPONENT_BIAS = 5, 2, 15
    else:
        raise ValueError("Unsupported dtype")
    x = x.half()
    sign = torch.sign(x)
    abs_x = x.abs()
    sign = torch.where(abs_x == 0, 0, sign)
    # Combine exponent calculation and clamping
    exponent = torch.clamp(
        torch.floor(torch.log2(abs_x)) + EXPONENT_BIAS,
        0, 2**EXPONENT_BITS - 1
    )
    # Combine mantissa calculation and rounding
    normal_mask = ~(exponent == 0)
    abs_x[:] = calc_mantissa(abs_x, exponent, normal_mask, MANTISSA_BITS, EXPONENT_BIAS, generator=generator)
    sign *= torch.where(
        normal_mask,
        (2.0 ** (exponent - EXPONENT_BIAS)) * (1.0 + abs_x),
        (2.0 ** (-EXPONENT_BIAS + 1)) * abs_x
    )
    del abs_x
    return sign.to(dtype=dtype)
 def stochastic_rounding(value, dtype, seed=0):
    if dtype == torch.float32:
        return value.to(dtype=torch.float32)
    if dtype == torch.float16:
        return value.to(dtype=torch.float16)
    if dtype == torch.bfloat16:
        return value.to(dtype=torch.bfloat16)
    if dtype == torch.float8_e4m3fn or dtype == torch.float8_e5m2:
        generator = torch.Generator(device=value.device)
        generator.manual_seed(seed)
        return manual_stochastic_round_to_float8(value, dtype, generator=generator)
    return value.to(dtype=dtype)
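As a rough illustration of the idea behind `manual_stochastic_round_to_float8`, the sketch below rounds a Python float onto a grid with a fixed number of mantissa bits. It is a simplified stand-in, not the code from this commit: `stochastic_round` is a hypothetical helper, and it skips the exponent clamping and subnormal handling that the torch version performs. It only shows that adding uniform noise in [0, 1) before `floor` rounds up with probability equal to the fractional part, so the result is unbiased on average.

```python
import math
import random

def stochastic_round(x, mantissa_bits, rng):
    """Hypothetical helper: stochastically round x to `mantissa_bits` mantissa bits."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    abs_x = abs(x)
    exponent = math.floor(math.log2(abs_x))
    # Fractional mantissa in [0, 1), scaled so representable values are integers.
    scaled = (abs_x / 2.0 ** exponent - 1.0) * 2 ** mantissa_bits
    # Uniform noise before floor: rounds up with probability equal to the fraction.
    scaled = math.floor(scaled + rng.random())
    return sign * 2.0 ** exponent * (1.0 + scaled / 2 ** mantissa_bits)

rng = random.Random(0)
# 1.3 lies between the 3-bit-mantissa neighbours 1.25 and 1.375.
samples = [stochastic_round(1.3, 3, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
```

Every sample is one of the two neighbouring grid values, while the average over many draws stays close to the input.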
--- a/comfy/k_diffusion/sampling.py
+++ b/comfy/k_diffusion/sampling.py
@ -9,6 +9,7 @@ from tqdm.auto import trange, tqdm
 from . import utils
 from . import deis
 import comfy.model_patcher
 import comfy.model_sampling
 def append_zero(x):
    return torch.cat([x, x.new_zeros([1])])
@ -509,6 +510,9 @@ def sample_dpm_adaptive(model, x, sigma_min, sigma_max, extra_args=None, callbac
@torch.no_grad()
 def sample_dpmpp_2s_ancestral(model, x, sigmas, extra_args=None, callback=None, disable=None, eta=1., s_noise=1., noise_sampler=None):
    if isinstance(model.inner_model.inner_model.model_sampling, comfy.model_sampling.CONST):
        return sample_dpmpp_2s_ancestral_RF(model, x, sigmas, extra_args, callback, disable, eta, s_noise, noise_sampler)
    """Ancestral sampling with DPM-Solver++(2S) second-order steps."""
    extra_args = {} if extra_args is None else extra_args
    noise_sampler = default_noise_sampler(x) if noise_sampler is None else noise_sampler
@ -541,6 +545,55 @@ def sample_dpmpp_2s_ancestral(model, x, sigmas, extra_args=None, callback=None,
    return x
@torch.no_grad()
 def sample_dpmpp_2s_ancestral_RF(model, x, sigmas, extra_args=None, callback=None, disable=None, eta=1., s_noise=1., noise_sampler=None):
    """Ancestral sampling with DPM-Solver++(2S) second-order steps."""
    extra_args = {} if extra_args is None else extra_args
    noise_sampler = default_noise_sampler(x) if noise_sampler is None else noise_sampler
    s_in = x.new_ones([x.shape[0]])
    sigma_fn = lambda lbda: (lbda.exp() + 1) ** -1
    lambda_fn = lambda sigma: ((1-sigma)/sigma).log()
    # logged_x = x.unsqueeze(0)
    for i in trange(len(sigmas) - 1, disable=disable):
        denoised = model(x, sigmas[i] * s_in, **extra_args)
        downstep_ratio = 1 + (sigmas[i+1]/sigmas[i] - 1) * eta
        sigma_down = sigmas[i+1] * downstep_ratio
        alpha_ip1 = 1 - sigmas[i+1]
        alpha_down = 1 - sigma_down
        renoise_coeff = (sigmas[i+1]**2 - sigma_down**2*alpha_ip1**2/alpha_down**2)**0.5
        # sigma_down, sigma_up = get_ancestral_step(sigmas[i], sigmas[i + 1], eta=eta)
        if callback is not None:
            callback({'x': x, 'i': i, 'sigma': sigmas[i], 'sigma_hat': sigmas[i], 'denoised': denoised})
        if sigmas[i + 1] == 0:
            # Euler method
            d = to_d(x, sigmas[i], denoised)
            dt = sigma_down - sigmas[i]
            x = x + d * dt
        else:
            # DPM-Solver++(2S)
            if sigmas[i] == 1.0:
                sigma_s = 0.9999
            else:
                t_i, t_down = lambda_fn(sigmas[i]), lambda_fn(sigma_down)
                r = 1 / 2
                h = t_down - t_i
                s = t_i + r * h
                sigma_s = sigma_fn(s)
            # sigma_s = sigmas[i+1]
            sigma_s_i_ratio = sigma_s / sigmas[i]
            u = sigma_s_i_ratio * x + (1 - sigma_s_i_ratio) * denoised
            D_i = model(u, sigma_s * s_in, **extra_args)
            sigma_down_i_ratio = sigma_down / sigmas[i]
            x = sigma_down_i_ratio * x + (1 - sigma_down_i_ratio) * D_i
            # print("sigma_i", sigmas[i], "sigma_ip1", sigmas[i+1],"sigma_down", sigma_down, "sigma_down_i_ratio", sigma_down_i_ratio, "sigma_s_i_ratio", sigma_s_i_ratio, "renoise_coeff", renoise_coeff)
        # Noise addition
        if sigmas[i + 1] > 0 and eta > 0:
            x = (alpha_ip1/alpha_down) * x + noise_sampler(sigmas[i], sigmas[i + 1]) * s_noise * renoise_coeff
        # logged_x = torch.cat((logged_x, x.unsqueeze(0)), dim=0)
    return x
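The scalar bookkeeping in `sample_dpmpp_2s_ancestral_RF` can be checked with plain floats. The sketch below copies the per-step arithmetic into a hypothetical helper, `ancestral_rf_coeffs`, and assumes the parameterisation the code implies (`alpha = 1 - sigma`). It checks two things: `sigma_fn` and `lambda_fn` are inverses, and after rescaling by `alpha_ip1 / alpha_down` and adding fresh noise scaled by `renoise_coeff`, the total noise level equals `sigmas[i + 1]`.

```python
import math

def sigma_fn(lbda):
    return 1.0 / (math.exp(lbda) + 1.0)

def lambda_fn(sigma):
    return math.log((1.0 - sigma) / sigma)

def ancestral_rf_coeffs(sigma, sigma_next, eta):
    """Hypothetical helper: scalar arithmetic of one step, copied from the loop body."""
    downstep_ratio = 1 + (sigma_next / sigma - 1) * eta
    sigma_down = sigma_next * downstep_ratio
    alpha_ip1 = 1 - sigma_next
    alpha_down = 1 - sigma_down
    renoise_coeff = (sigma_next ** 2 - sigma_down ** 2 * alpha_ip1 ** 2 / alpha_down ** 2) ** 0.5
    return sigma_down, alpha_ip1 / alpha_down, renoise_coeff

sigma_down, rescale, renoise = ancestral_rf_coeffs(0.8, 0.6, eta=1.0)
# Noise std after rescaling the down-stepped sample and adding independent noise.
total_noise_std = math.sqrt((sigma_down * rescale) ** 2 + renoise ** 2)
round_trip = sigma_fn(lambda_fn(0.3))
```

With `eta=0` the down step lands exactly on `sigmas[i + 1]` and the renoise coefficient vanishes, which matches the `eta > 0` guard on the noise addition.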
@torch.no_grad()
 def sample_dpmpp_sde(model, x, sigmas, extra_args=None, callback=None, disable=None, eta=1., s_noise=1., noise_sampler=None, r=1 / 2):
    """DPM-Solver++ (stochastic)."""
--- a/comfy/latent_formats.py
+++ b/comfy/latent_formats.py
@ -141,6 +141,7 @@ class StableAudio1(LatentFormat):
    latent_channels = 64
 class Flux(SD3):
    latent_channels = 16
    def __init__(self):
        self.scale_factor = 0.3611
        self.shift_factor = 0.1159
@ -162,6 +163,7 @@ class Flux(SD3):
            [-0.0005, -0.0530, -0.0020],
            [-0.1273, -0.0932, -0.0680]
        ]
        self.taesd_decoder_name = "taef1_decoder"
    def process_in(self, latent):
        return (latent - self.shift_factor) * self.scale_factor
--- a/comfy/ldm/aura/mmdit.py
+++ b/comfy/ldm/aura/mmdit.py
@ -9,6 +9,7 @@ import torch.nn.functional as F
 from comfy.ldm.modules.attention import optimized_attention
 import comfy.ops
 import comfy.ldm.common_dit
 def modulate(x, shift, scale):
    return x * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1)
@ -407,10 +408,7 @@ class MMDiT(nn.Module):
    def patchify(self, x):
        B, C, H, W = x.size()
-        pad_h = (self.patch_size - H % self.patch_size) % self.patch_size
-        pad_w = (self.patch_size - W % self.patch_size) % self.patch_size
-        x = torch.nn.functional.pad(x, (0, pad_w, 0, pad_h), mode='circular')
+        x = comfy.ldm.common_dit.pad_to_patch_size(x, (self.patch_size, self.patch_size))
        x = x.view(
            B,
            C,
--- a/comfy/ldm/common_dit.py
+++ b/comfy/ldm/common_dit.py
@ -0,0 +1,8 @@
 import torch
 def pad_to_patch_size(img, patch_size=(2, 2), padding_mode="circular"):
    if padding_mode == "circular" and torch.jit.is_tracing() or torch.jit.is_scripting():
        padding_mode = "reflect"
    pad_h = (patch_size[0] - img.shape[-2] % patch_size[0]) % patch_size[0]
    pad_w = (patch_size[1] - img.shape[-1] % patch_size[1]) % patch_size[1]
    return torch.nn.functional.pad(img, (0, pad_w, 0, pad_h), mode=padding_mode)
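The padding arithmetic in `pad_to_patch_size` can be tried without torch. In the sketch below, `pad_amounts` is a hypothetical helper that repeats the two modulo expressions, and `parsed_as_written` only demonstrates how Python groups an `a and b or c` condition like the one in the function. Whether that grouping is what the commit intends is not stated in the diff.

```python
def pad_amounts(height, width, patch_size=(2, 2)):
    """Hypothetical helper: rows/columns needed to reach the next patch multiple."""
    pad_h = (patch_size[0] - height % patch_size[0]) % patch_size[0]
    pad_w = (patch_size[1] - width % patch_size[1]) % patch_size[1]
    return pad_h, pad_w

# Aligned sizes need no padding: the outer modulo maps "pad by a full patch" to 0.
aligned = pad_amounts(64, 64)
odd = pad_amounts(65, 63)
wide = pad_amounts(10, 10, patch_size=(4, 8))

def parsed_as_written(is_circular, tracing, scripting):
    # `and` binds tighter than `or`, so this is (is_circular and tracing) or scripting.
    return is_circular and tracing or scripting

# Under scripting the condition is true even for a non-circular mode.
scripting_only = parsed_as_written(False, False, True)
```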
--- a/comfy/ldm/flux/controlnet_xlabs.py
+++ b/comfy/ldm/flux/controlnet_xlabs.py
@ -0,0 +1,104 @@
 #Original code can be found on: https://github.com/XLabs-AI/x-flux/blob/main/src/flux/controlnet.py
 import torch
 from torch import Tensor, nn
 from einops import rearrange, repeat
 from .layers import (DoubleStreamBlock, EmbedND, LastLayer,
                                 MLPEmbedder, SingleStreamBlock,
                                 timestep_embedding)
 from .model import Flux
 import comfy.ldm.common_dit
 class ControlNetFlux(Flux):
    def __init__(self, image_model=None, dtype=None, device=None, operations=None, **kwargs):
        super().__init__(final_layer=False, dtype=dtype, device=device, operations=operations, **kwargs)
        # add ControlNet blocks
        self.controlnet_blocks = nn.ModuleList([])
        for _ in range(self.params.depth):
            controlnet_block = operations.Linear(self.hidden_size, self.hidden_size, dtype=dtype, device=device)
            # controlnet_block = zero_module(controlnet_block)
            self.controlnet_blocks.append(controlnet_block)
        self.pos_embed_input = operations.Linear(self.in_channels, self.hidden_size, bias=True, dtype=dtype, device=device)
        self.gradient_checkpointing = False
        self.input_hint_block = nn.Sequential(
            operations.Conv2d(3, 16, 3, padding=1, dtype=dtype, device=device),
            nn.SiLU(),
            operations.Conv2d(16, 16, 3, padding=1, dtype=dtype, device=device),
            nn.SiLU(),
            operations.Conv2d(16, 16, 3, padding=1, stride=2, dtype=dtype, device=device),
            nn.SiLU(),
            operations.Conv2d(16, 16, 3, padding=1, dtype=dtype, device=device),
            nn.SiLU(),
            operations.Conv2d(16, 16, 3, padding=1, stride=2, dtype=dtype, device=device),
            nn.SiLU(),
            operations.Conv2d(16, 16, 3, padding=1, dtype=dtype, device=device),
            nn.SiLU(),
            operations.Conv2d(16, 16, 3, padding=1, stride=2, dtype=dtype, device=device),
            nn.SiLU(),
            operations.Conv2d(16, 16, 3, padding=1, dtype=dtype, device=device)
        )
    def forward_orig(
        self,
        img: Tensor,
        img_ids: Tensor,
        controlnet_cond: Tensor,
        txt: Tensor,
        txt_ids: Tensor,
        timesteps: Tensor,
        y: Tensor,
        guidance: Tensor = None,
    ) -> Tensor:
        if img.ndim != 3 or txt.ndim != 3:
            raise ValueError("Input img and txt tensors must have 3 dimensions.")
        # running on sequences img
        img = self.img_in(img)
        controlnet_cond = self.input_hint_block(controlnet_cond)
        controlnet_cond = rearrange(controlnet_cond, "b c (h ph) (w pw) -> b (h w) (c ph pw)", ph=2, pw=2)
        controlnet_cond = self.pos_embed_input(controlnet_cond)
        img = img + controlnet_cond
        vec = self.time_in(timestep_embedding(timesteps, 256))
        if self.params.guidance_embed:
            vec = vec + self.guidance_in(timestep_embedding(guidance, 256))
        vec = vec + self.vector_in(y)
        txt = self.txt_in(txt)
        ids = torch.cat((txt_ids, img_ids), dim=1)
        pe = self.pe_embedder(ids)
        block_res_samples = ()
        for block in self.double_blocks:
            img, txt = block(img=img, txt=txt, vec=vec, pe=pe)
            block_res_samples = block_res_samples + (img,)
        controlnet_block_res_samples = ()
        for block_res_sample, controlnet_block in zip(block_res_samples, self.controlnet_blocks):
            block_res_sample = controlnet_block(block_res_sample)
            controlnet_block_res_samples = controlnet_block_res_samples + (block_res_sample,)
        return {"input": (controlnet_block_res_samples * 10)[:19]}
    def forward(self, x, timesteps, context, y, guidance=None, hint=None, **kwargs):
        hint = hint * 2.0 - 1.0
        bs, c, h, w = x.shape
        patch_size = 2
        x = comfy.ldm.common_dit.pad_to_patch_size(x, (patch_size, patch_size))
        img = rearrange(x, "b c (h ph) (w pw) -> b (h w) (c ph pw)", ph=patch_size, pw=patch_size)
        h_len = ((h + (patch_size // 2)) // patch_size)
        w_len = ((w + (patch_size // 2)) // patch_size)
        img_ids = torch.zeros((h_len, w_len, 3), device=x.device, dtype=x.dtype)
        img_ids[..., 1] = img_ids[..., 1] + torch.linspace(0, h_len - 1, steps=h_len, device=x.device, dtype=x.dtype)[:, None]
        img_ids[..., 2] = img_ids[..., 2] + torch.linspace(0, w_len - 1, steps=w_len, device=x.device, dtype=x.dtype)[None, :]
        img_ids = repeat(img_ids, "h w c -> b (h w) c", b=bs)
        txt_ids = torch.zeros((bs, context.shape[1], 3), device=x.device, dtype=x.dtype)
        return self.forward_orig(img, img_ids, hint, context, txt_ids, timesteps, y, guidance)
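The return expression `(controlnet_block_res_samples * 10)[:19]` relies on tuple repetition followed by slicing. The sketch below reproduces that with string labels in place of tensors. `spread_controlnet_outputs` is a hypothetical helper, and the values 10 and 19 are taken as-is from the source; that 19 matches the number of double blocks in the base model is an assumption.

```python
def spread_controlnet_outputs(block_outputs, repeats=10, num_targets=19):
    """Hypothetical helper mirroring `(controlnet_block_res_samples * 10)[:19]`."""
    return (tuple(block_outputs) * repeats)[:num_targets]

# Two ControlNet blocks, labelled instead of using real tensors.
spread = spread_controlnet_outputs(("r0", "r1"))
```

The residuals are reused cyclically, so a ControlNet with only a few blocks still supplies one entry per target position.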
--- a/comfy/ldm/flux/layers.py
+++ b/comfy/ldm/flux/layers.py
@ -2,12 +2,12 @@ import math
 from dataclasses import dataclass
 import torch
 from einops import rearrange
 from torch import Tensor, nn
 from .math import attention, rope
 import comfy.ops
 class EmbedND(nn.Module):
    def __init__(self, dim: int, theta: int, axes_dim: list):
        super().__init__()
@ -36,9 +36,7 @@ def timestep_embedding(t: Tensor, dim, max_period=10000, time_factor: float = 10
    """
    t = time_factor * t
    half = dim // 2
-    freqs = torch.exp(-math.log(max_period) * torch.arange(start=0, end=half, dtype=torch.float32) / half).to(
-        t.device
-    )
+    freqs = torch.exp(-math.log(max_period) * torch.arange(start=0, end=half, dtype=torch.float32, device=t.device) / half)
    args = t[:, None].float() * freqs[None]
    embedding = torch.cat([torch.cos(args), torch.sin(args)], dim=-1)
@ -48,7 +46,6 @@ def timestep_embedding(t: Tensor, dim, max_period=10000, time_factor: float = 10
        embedding = embedding.to(t)
    return embedding
 class MLPEmbedder(nn.Module):
    def __init__(self, in_dim: int, hidden_dim: int, dtype=None, device=None, operations=None):
        super().__init__()
@ -66,10 +63,8 @@ class RMSNorm(torch.nn.Module):
        self.scale = nn.Parameter(torch.empty((dim), dtype=dtype, device=device))
    def forward(self, x: Tensor):
        x_dtype = x.dtype
        x = x.float()
        rrms = torch.rsqrt(torch.mean(x**2, dim=-1, keepdim=True) + 1e-6)
-        return (x * rrms).to(dtype=x_dtype) * comfy.ops.cast_to(self.scale, dtype=x_dtype, device=x.device)
+        return (x * rrms) * comfy.ops.cast_to(self.scale, dtype=x.dtype, device=x.device)
 class QKNorm(torch.nn.Module):
@ -94,14 +89,6 @@ class SelfAttention(nn.Module):
        self.norm = QKNorm(head_dim, dtype=dtype, device=device, operations=operations)
        self.proj = operations.Linear(dim, dim, dtype=dtype, device=device)
-    def forward(self, x: Tensor, pe: Tensor) -> Tensor:
-        qkv = self.qkv(x)
-        q, k, v = rearrange(qkv, "B L (K H D) -> K B H L D", K=3, H=self.num_heads)
-        q, k = self.norm(q, k, v)
-        x = attention(q, k, v, pe=pe)
-        x = self.proj(x)
-        return x
@dataclass
 class ModulationOut:
@ -163,22 +150,21 @@ class DoubleStreamBlock(nn.Module):
        img_modulated = self.img_norm1(img)
        img_modulated = (1 + img_mod1.scale) * img_modulated + img_mod1.shift
        img_qkv = self.img_attn.qkv(img_modulated)
-        img_q, img_k, img_v = rearrange(img_qkv, "B L (K H D) -> K B H L D", K=3, H=self.num_heads)
+        img_q, img_k, img_v = img_qkv.view(img_qkv.shape[0], img_qkv.shape[1], 3, self.num_heads, -1).permute(2, 0, 3, 1, 4)
        img_q, img_k = self.img_attn.norm(img_q, img_k, img_v)
        # prepare txt for attention
        txt_modulated = self.txt_norm1(txt)
        txt_modulated = (1 + txt_mod1.scale) * txt_modulated + txt_mod1.shift
        txt_qkv = self.txt_attn.qkv(txt_modulated)
-        txt_q, txt_k, txt_v = rearrange(txt_qkv, "B L (K H D) -> K B H L D", K=3, H=self.num_heads)
+        txt_q, txt_k, txt_v = txt_qkv.view(txt_qkv.shape[0], txt_qkv.shape[1], 3, self.num_heads, -1).permute(2, 0, 3, 1, 4)
        txt_q, txt_k = self.txt_attn.norm(txt_q, txt_k, txt_v)
        # run actual attention
-        q = torch.cat((txt_q, img_q), dim=2)
-        k = torch.cat((txt_k, img_k), dim=2)
-        v = torch.cat((txt_v, img_v), dim=2)
-        attn = attention(q, k, v, pe=pe)
+        attn = attention(torch.cat((txt_q, img_q), dim=2),
+                         torch.cat((txt_k, img_k), dim=2),
+                         torch.cat((txt_v, img_v), dim=2), pe=pe)
        txt_attn, img_attn = attn[:, : txt.shape[1]], attn[:, txt.shape[1] :]
        # calculate the img bloks
@ -186,8 +172,12 @@ class DoubleStreamBlock(nn.Module):
        img = img + img_mod2.gate * self.img_mlp((1 + img_mod2.scale) * self.img_norm2(img) + img_mod2.shift)
        # calculate the txt bloks
-        txt = txt + txt_mod1.gate * self.txt_attn.proj(txt_attn)
+        txt += txt_mod1.gate * self.txt_attn.proj(txt_attn)
-        txt = txt + txt_mod2.gate * self.txt_mlp((1 + txt_mod2.scale) * self.txt_norm2(txt) + txt_mod2.shift)
+        txt += txt_mod2.gate * self.txt_mlp((1 + txt_mod2.scale) * self.txt_norm2(txt) + txt_mod2.shift)
        if txt.dtype == torch.float16:
            txt = torch.nan_to_num(txt, nan=0.0, posinf=65504, neginf=-65504)
        return img, txt
@ -232,14 +222,17 @@ class SingleStreamBlock(nn.Module):
        x_mod = (1 + mod.scale) * self.pre_norm(x) + mod.shift
        qkv, mlp = torch.split(self.linear1(x_mod), [3 * self.hidden_size, self.mlp_hidden_dim], dim=-1)
-        q, k, v = rearrange(qkv, "B L (K H D) -> K B H L D", K=3, H=self.num_heads)
+        q, k, v = qkv.view(qkv.shape[0], qkv.shape[1], 3, self.num_heads, -1).permute(2, 0, 3, 1, 4)
        q, k = self.norm(q, k, v)
        # compute attention
        attn = attention(q, k, v, pe=pe)
        # compute activation in mlp stream, cat again and run second linear layer
        output = self.linear2(torch.cat((attn, self.mlp_act(mlp)), 2))
-        return x + mod.gate * output
+        x += mod.gate * output
        if x.dtype == torch.float16:
            x = torch.nan_to_num(x, nan=0.0, posinf=65504, neginf=-65504)
        return x
 class LastLayer(nn.Module):
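The float16 guards added to `DoubleStreamBlock` and `SingleStreamBlock` replace NaN with 0 and clamp infinities to ±65504, the largest finite float16 value. The sketch below is a scalar stand-in for that `torch.nan_to_num` call using only the standard library. `clamp_like_nan_to_num` is a hypothetical helper and operates on a list of Python floats rather than a tensor.

```python
import math

# Largest finite float16: 10 mantissa bits, maximum exponent 15.
FP16_MAX = (2 - 2 ** -10) * 2.0 ** 15

def clamp_like_nan_to_num(values, limit=FP16_MAX):
    """Hypothetical scalar stand-in for nan_to_num(x, nan=0.0, posinf=limit, neginf=-limit)."""
    out = []
    for v in values:
        if math.isnan(v):
            out.append(0.0)
        elif math.isinf(v):
            out.append(limit if v > 0 else -limit)
        else:
            out.append(v)
    return out

cleaned = clamp_like_nan_to_num([1.5, float("nan"), float("inf"), float("-inf")])
```

Finite values pass through untouched, so the guard only changes activations that have already overflowed.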
--- a/comfy/ldm/flux/math.py
+++ b/comfy/ldm/flux/math.py
@ -14,7 +14,7 @@ def attention(q: Tensor, k: Tensor, v: Tensor, pe: Tensor) -> Tensor:
 def rope(pos: Tensor, dim: int, theta: int) -> Tensor:
    assert dim % 2 == 0
-    if comfy.model_management.is_device_mps(pos.device):
+    if comfy.model_management.is_device_mps(pos.device) or comfy.model_management.is_intel_xpu():
        device = torch.device("cpu")
    else:
        device = pos.device
--- a/comfy/ldm/flux/model.py
+++ b/comfy/ldm/flux/model.py
@ -15,6 +15,7 @@ from .layers import (
 )
 from einops import rearrange, repeat
 import comfy.ldm.common_dit
@dataclass
 class FluxParams:
@ -37,12 +38,12 @@ class Flux(nn.Module):
    Transformer model for flow matching on sequences.
    """
-    def __init__(self, image_model=None, dtype=None, device=None, operations=None, **kwargs):
+    def __init__(self, image_model=None, final_layer=True, dtype=None, device=None, operations=None, **kwargs):
        super().__init__()
        self.dtype = dtype
        params = FluxParams(**kwargs)
        self.params = params
-        self.in_channels = params.in_channels
+        self.in_channels = params.in_channels * 2 * 2
        self.out_channels = self.in_channels
        if params.hidden_size % params.num_heads != 0:
            raise ValueError(
@ -82,7 +83,8 @@ class Flux(nn.Module):
            ]
        )
-        self.final_layer = LastLayer(self.hidden_size, 1, self.out_channels, dtype=dtype, device=device, operations=operations)
+        if final_layer:
+            self.final_layer = LastLayer(self.hidden_size, 1, self.out_channels, dtype=dtype, device=device, operations=operations)
    def forward_orig(
        self,
@ -93,6 +95,7 @@ class Flux(nn.Module):
        timesteps: Tensor,
        y: Tensor,
        guidance: Tensor = None,
        control=None,
    ) -> Tensor:
        if img.ndim != 3 or txt.ndim != 3:
            raise ValueError("Input img and txt tensors must have 3 dimensions.")
@ -111,24 +114,37 @@ class Flux(nn.Module):
        ids = torch.cat((txt_ids, img_ids), dim=1)
        pe = self.pe_embedder(ids)
-        for block in self.double_blocks:
+        for i, block in enumerate(self.double_blocks):
            img, txt = block(img=img, txt=txt, vec=vec, pe=pe)
            if control is not None: # Controlnet
                control_i = control.get("input")
                if i < len(control_i):
                    add = control_i[i]
                    if add is not None:
                        img += add
        img = torch.cat((txt, img), 1)
-        for block in self.single_blocks:
+        for i, block in enumerate(self.single_blocks):
            img = block(img, vec=vec, pe=pe)
            if control is not None: # Controlnet
                control_o = control.get("output")
                if i < len(control_o):
                    add = control_o[i]
                    if add is not None:
                        img[:, txt.shape[1] :, ...] += add
        img = img[:, txt.shape[1] :, ...]
        img = self.final_layer(img, vec)  # (N, T, patch_size ** 2 * out_channels)
        return img
-    def forward(self, x, timestep, context, y, guidance, **kwargs):
+    def forward(self, x, timestep, context, y, guidance, control=None, **kwargs):
        bs, c, h, w = x.shape
        patch_size = 2
-        pad_h = (patch_size - h % 2) % patch_size
-        pad_w = (patch_size - w % 2) % patch_size
-        x = torch.nn.functional.pad(x, (0, pad_w, 0, pad_h), mode='circular')
+        x = comfy.ldm.common_dit.pad_to_patch_size(x, (patch_size, patch_size))
        img = rearrange(x, "b c (h ph) (w pw) -> b (h w) (c ph pw)", ph=patch_size, pw=patch_size)
@ -140,5 +156,5 @@ class Flux(nn.Module):
        img_ids = repeat(img_ids, "h w c -> b (h w) c", b=bs)
        txt_ids = torch.zeros((bs, context.shape[1], 3), device=x.device, dtype=x.dtype)
-        out = self.forward_orig(img, img_ids, context, txt_ids, timestep, y, guidance)
+        out = self.forward_orig(img, img_ids, context, txt_ids, timestep, y, guidance, control)
        return rearrange(out, "b (h w) (c ph pw) -> b c (h ph) (w pw)", h=h_len, w=w_len, ph=2, pw=2)[:,:,:h,:w]
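The ControlNet hook added to `Flux.forward_orig` follows a simple pattern: run block `i`, then add residual `i` if the control dict provides one. The sketch below shows that pattern with floats instead of tensors. `run_blocks` is a hypothetical helper, the blocks are stand-in lambdas, and the sketch assumes the requested key is always present in the control dict.

```python
def run_blocks(x, blocks, control=None, key="input"):
    """Hypothetical helper: apply blocks in order, adding per-block residuals when given."""
    for i, block in enumerate(blocks):
        x = block(x)
        if control is not None:
            residuals = control.get(key)
            # Residual lists may be shorter than the block list or contain None entries.
            if i < len(residuals):
                add = residuals[i]
                if add is not None:
                    x = x + add
    return x

blocks = [lambda v: v * 2] * 3
plain = run_blocks(1.0, blocks)
steered = run_blocks(1.0, blocks, control={"input": [0.5, None]})
```

Blocks without a matching residual run unchanged, which is why a short or partly empty residual list is safe in this pattern.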
--- a/comfy/ldm/hydit/attn_layers.py
+++ b/comfy/ldm/hydit/attn_layers.py
@ -47,7 +47,7 @@ def reshape_for_broadcast(freqs_cis: Union[torch.Tensor, Tuple[torch.Tensor]], x
 def rotate_half(x):
-    x_real, x_imag = x.float().reshape(*x.shape[:-1], -1, 2).unbind(-1)  # [B, S, H, D//2]
+    x_real, x_imag = x.reshape(*x.shape[:-1], -1, 2).unbind(-1)  # [B, S, H, D//2]
    return torch.stack([-x_imag, x_real], dim=-1).flatten(3)
@ -78,10 +78,9 @@ def apply_rotary_emb(
    xk_out = None
    if isinstance(freqs_cis, tuple):
        cos, sin = reshape_for_broadcast(freqs_cis, xq, head_first)    # [S, D]
-        cos, sin = cos.to(xq.device), sin.to(xq.device)
-        xq_out = (xq.float() * cos + rotate_half(xq.float()) * sin).type_as(xq)
+        xq_out = (xq * cos + rotate_half(xq) * sin)
        if xk is not None:
-            xk_out = (xk.float() * cos + rotate_half(xk.float()) * sin).type_as(xk)
+            xk_out = (xk * cos + rotate_half(xk) * sin)
    else:
        xq_ = torch.view_as_complex(xq.float().reshape(*xq.shape[:-1], -1, 2))  # [B, S, H, D//2]
        freqs_cis = reshape_for_broadcast(freqs_cis, xq_, head_first).to(xq.device)   # [S, D//2] --> [1, S, 1, D//2]
--- a/comfy/ldm/hydit/controlnet.py
+++ b/comfy/ldm/hydit/controlnet.py
@ -0,0 +1,321 @@
 from typing import Any, Optional
 import torch
 import torch.nn as nn
 import torch.nn.functional as F
 from torch.utils import checkpoint
 from comfy.ldm.modules.diffusionmodules.mmdit import (
    Mlp,
    TimestepEmbedder,
    PatchEmbed,
    RMSNorm,
 )
 from comfy.ldm.modules.diffusionmodules.util import timestep_embedding
 from .poolers import AttentionPool
 import comfy.latent_formats
 from .models import HunYuanDiTBlock, calc_rope
 from .posemb_layers import get_2d_rotary_pos_embed, get_fill_resize_and_crop
 class HunYuanControlNet(nn.Module):
    """
    HunYuanDiT: Diffusion model with a Transformer backbone.
    Inherit ModelMixin and ConfigMixin to be compatible with the sampler StableDiffusionPipeline of diffusers.
    Inherit PeftAdapterMixin to be compatible with the PEFT training pipeline.
    Parameters
    ----------
    args: argparse.Namespace
        The arguments parsed by argparse.
    input_size: tuple
        The size of the input image.
    patch_size: int
        The size of the patch.
    in_channels: int
        The number of input channels.
    hidden_size: int
        The hidden size of the transformer backbone.
    depth: int
        The number of transformer blocks.
    num_heads: int
        The number of attention heads.
    mlp_ratio: float
        The ratio of the hidden size of the MLP in the transformer block.
    log_fn: callable
        The logging function.
    """
    def __init__(
        self,
        input_size: tuple = 128,
        patch_size: int = 2,
        in_channels: int = 4,
        hidden_size: int = 1408,
        depth: int = 40,
        num_heads: int = 16,
        mlp_ratio: float = 4.3637,
        text_states_dim=1024,
        text_states_dim_t5=2048,
        text_len=77,
        text_len_t5=256,
        qk_norm=True,  # See http://arxiv.org/abs/2302.05442 for details.
        size_cond=False,
        use_style_cond=False,
        learn_sigma=True,
        norm="layer",
        log_fn: callable = print,
        attn_precision=None,
        dtype=None,
        device=None,
        operations=None,
        **kwargs,
    ):
        super().__init__()
        self.log_fn = log_fn
        self.depth = depth
        self.learn_sigma = learn_sigma
        self.in_channels = in_channels
        self.out_channels = in_channels * 2 if learn_sigma else in_channels
        self.patch_size = patch_size
        self.num_heads = num_heads
        self.hidden_size = hidden_size
        self.text_states_dim = text_states_dim
        self.text_states_dim_t5 = text_states_dim_t5
        self.text_len = text_len
        self.text_len_t5 = text_len_t5
        self.size_cond = size_cond
        self.use_style_cond = use_style_cond
        self.norm = norm
        self.dtype = dtype
        self.latent_format = comfy.latent_formats.SDXL
        self.mlp_t5 = nn.Sequential(
            nn.Linear(
                self.text_states_dim_t5,
                self.text_states_dim_t5 * 4,
                bias=True,
                dtype=dtype,
                device=device,
            ),
            nn.SiLU(),
            nn.Linear(
                self.text_states_dim_t5 * 4,
                self.text_states_dim,
                bias=True,
                dtype=dtype,
                device=device,
            ),
        )
        # learnable replace
        self.text_embedding_padding = nn.Parameter(
            torch.randn(
                self.text_len + self.text_len_t5,
                self.text_states_dim,
                dtype=dtype,
                device=device,
            )
        )
        # Attention pooling
        pooler_out_dim = 1024
        self.pooler = AttentionPool(
            self.text_len_t5,
            self.text_states_dim_t5,
            num_heads=8,
            output_dim=pooler_out_dim,
            dtype=dtype,
            device=device,
            operations=operations,
        )
        # Dimension of the extra input vectors
        self.extra_in_dim = pooler_out_dim
        if self.size_cond:
            # Image size and crop size conditions
            self.extra_in_dim += 6 * 256
        if self.use_style_cond:
            # Here we use a default learned embedder layer for future extension.
            self.style_embedder = nn.Embedding(
                1, hidden_size, dtype=dtype, device=device
            )
            self.extra_in_dim += hidden_size
        # Text embedding for `add`
        self.x_embedder = PatchEmbed(
            input_size,
            patch_size,
            in_channels,
            hidden_size,
            dtype=dtype,
            device=device,
            operations=operations,
        )
        self.t_embedder = TimestepEmbedder(
            hidden_size, dtype=dtype, device=device, operations=operations
        )
        self.extra_embedder = nn.Sequential(
            operations.Linear(
                self.extra_in_dim, hidden_size * 4, dtype=dtype, device=device
            ),
            nn.SiLU(),
            operations.Linear(
                hidden_size * 4, hidden_size, bias=True, dtype=dtype, device=device
            ),
        )
        # Image embedding
        num_patches = self.x_embedder.num_patches
        # HUnYuanDiT Blocks
        self.blocks = nn.ModuleList(
            [
                HunYuanDiTBlock(
                    hidden_size=hidden_size,
                    c_emb_size=hidden_size,
                    num_heads=num_heads,
                    mlp_ratio=mlp_ratio,
                    text_states_dim=self.text_states_dim,
                    qk_norm=qk_norm,
                    norm_type=self.norm,
                    skip=False,
                    attn_precision=attn_precision,
                    dtype=dtype,
                    device=device,
                    operations=operations,
                )
                for _ in range(19)
            ]
        )
        # Input zero linear for the first block
        self.before_proj = operations.Linear(self.hidden_size, self.hidden_size, dtype=dtype, device=device)
        # Output zero linear for the every block
        self.after_proj_list = nn.ModuleList(
            [
                    operations.Linear(
                        self.hidden_size, self.hidden_size, dtype=dtype, device=device
                    )
                for _ in range(len(self.blocks))
            ]
        )
    def forward(
        self,
        x,
        hint,
        timesteps,
        context,#encoder_hidden_states=None,
        text_embedding_mask=None,
        encoder_hidden_states_t5=None,
        text_embedding_mask_t5=None,
        image_meta_size=None,
        style=None,
        return_dict=False,
        **kwarg,
    ):
        """
        Forward pass of the encoder.
        Parameters
        ----------
        x: torch.Tensor
            (B, D, H, W)
        t: torch.Tensor
            (B)
        encoder_hidden_states: torch.Tensor
            CLIP text embedding, (B, L_clip, D)
        text_embedding_mask: torch.Tensor
            CLIP text embedding mask, (B, L_clip)
        encoder_hidden_states_t5: torch.Tensor
            T5 text embedding, (B, L_t5, D)
        text_embedding_mask_t5: torch.Tensor
            T5 text embedding mask, (B, L_t5)
        image_meta_size: torch.Tensor
            (B, 6)
        style: torch.Tensor
            (B)
        cos_cis_img: torch.Tensor
        sin_cis_img: torch.Tensor
        return_dict: bool
            Whether to return a dictionary.
        """
        condition = hint
        if condition.shape[0] == 1:
            condition = torch.repeat_interleave(condition, x.shape[0], dim=0)
        text_states = context  # 2,77,1024
        text_states_t5 = encoder_hidden_states_t5  # 2,256,2048
        text_states_mask = text_embedding_mask.bool()  # 2,77
        text_states_t5_mask = text_embedding_mask_t5.bool()  # 2,256
        b_t5, l_t5, c_t5 = text_states_t5.shape
        text_states_t5 = self.mlp_t5(text_states_t5.view(-1, c_t5)).view(b_t5, l_t5, -1)
        padding = comfy.ops.cast_to_input(self.text_embedding_padding, text_states)
        text_states[:, -self.text_len :] = torch.where(
            text_states_mask[:, -self.text_len :].unsqueeze(2),
            text_states[:, -self.text_len :],
            padding[: self.text_len],
        )
        text_states_t5[:, -self.text_len_t5 :] = torch.where(
            text_states_t5_mask[:, -self.text_len_t5 :].unsqueeze(2),
            text_states_t5[:, -self.text_len_t5 :],
            padding[self.text_len :],
        )
        text_states = torch.cat([text_states, text_states_t5], dim=1)  # 2,205，1024
        # _, _, oh, ow = x.shape
        # th, tw = oh // self.patch_size, ow // self.patch_size
        # Get image RoPE embedding according to `reso`lution.
        freqs_cis_img = calc_rope(
            x, self.patch_size, self.hidden_size // self.num_heads
        )  # (cos_cis_img, sin_cis_img)
        # ========================= Build time and image embedding =========================
        t = self.t_embedder(timesteps, dtype=self.dtype)
        x = self.x_embedder(x)
        # ========================= Concatenate all extra vectors =========================
        # Build text tokens with pooling
        extra_vec = self.pooler(encoder_hidden_states_t5)
        # Build image meta size tokens if applicable
        # if image_meta_size is not None:
        #     image_meta_size = timestep_embedding(image_meta_size.view(-1), 256)   # [B * 6, 256]
        #     if image_meta_size.dtype != self.dtype:
        #         image_meta_size = image_meta_size.half()
        #     image_meta_size = image_meta_size.view(-1, 6 * 256)
        #     extra_vec = torch.cat([extra_vec, image_meta_size], dim=1)  # [B, D + 6 * 256]
        # Build style tokens
        if style is not None:
            style_embedding = self.style_embedder(style)
            extra_vec = torch.cat([extra_vec, style_embedding], dim=1)
        # Concatenate all extra vectors
        c = t + self.extra_embedder(extra_vec)  # [B, D]
        # ========================= Deal with Condition =========================
        condition = self.x_embedder(condition)
        # ========================= Forward pass through HunYuanDiT blocks =========================
        controls = []
        x = x + self.before_proj(condition)  # add condition
        for layer, block in enumerate(self.blocks):
            x = block(x, c, text_states, freqs_cis_img)
            controls.append(self.after_proj_list[layer](x))  # zero linear for output
        return {"output": controls}
--- a/comfy/ldm/hydit/models.py
+++ b/comfy/ldm/hydit/models.py
@ -21,6 +21,7 @@ def calc_rope(x, patch_size, head_size):
    sub_args = [start, stop, (th, tw)]
    # head_size = HUNYUAN_DIT_CONFIG['DiT-g/2']['hidden_size'] // HUNYUAN_DIT_CONFIG['DiT-g/2']['num_heads']
    rope = get_2d_rotary_pos_embed(head_size, *sub_args)
    rope = (rope[0].to(x), rope[1].to(x))
    return rope
@ -91,6 +92,8 @@ class HunYuanDiTBlock(nn.Module):
        # Long Skip Connection
        if self.skip_linear is not None:
            cat = torch.cat([x, skip], dim=-1)
+            if cat.dtype != x.dtype:
+                cat = cat.to(x.dtype)
            cat = self.skip_norm(cat)
            x = self.skip_linear(cat)
@ -362,6 +365,8 @@ class HunYuanDiT(nn.Module):
        c = t + self.extra_embedder(extra_vec)  # [B, D]
        controls = None
        if control:
            controls = control.get("output", None)
        # ========================= Forward pass through HunYuanDiT blocks =========================
        skips = []
        for layer, block in enumerate(self.blocks):
--- a/comfy/ldm/modules/attention.py
+++ b/comfy/ldm/modules/attention.py
@ -358,7 +358,7 @@ def attention_xformers(q, k, v, heads, mask=None, attn_precision=None, skip_resh
            disabled_xformers = True
    if disabled_xformers:
-        return attention_pytorch(q, k, v, heads, mask)
+        return attention_pytorch(q, k, v, heads, mask, skip_reshape=skip_reshape)
    if skip_reshape:
         q, k, v = map(
--- a/comfy/ldm/modules/diffusionmodules/mmdit.py
+++ b/comfy/ldm/modules/diffusionmodules/mmdit.py
@ -9,6 +9,7 @@ from .. import attention
 from einops import rearrange, repeat
 from .util import timestep_embedding
 import comfy.ops
+import comfy.ldm.common_dit
 def default(x, y):
    if x is not None:
@ -111,9 +112,7 @@ class PatchEmbed(nn.Module):
        #             f"Input width ({W}) should be divisible by patch size ({self.patch_size[1]})."
        #         )
        if self.dynamic_img_pad:
-            pad_h = (self.patch_size[0] - H % self.patch_size[0]) % self.patch_size[0]
+            x = comfy.ldm.common_dit.pad_to_patch_size(x, self.patch_size, padding_mode=self.padding_mode)
-            pad_w = (self.patch_size[1] - W % self.patch_size[1]) % self.patch_size[1]
-            x = torch.nn.functional.pad(x, (0, pad_w, 0, pad_h), mode=self.padding_mode)
        x = self.proj(x)
        if self.flatten:
            x = x.flatten(2).transpose(1, 2)  # NCHW -> NLC
--- a/comfy/lora.py
+++ b/comfy/lora.py
@ -1,5 +1,27 @@
 """
    This file is part of ComfyUI.
    Copyright (C) 2024 Comfy
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.
    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <https://www.gnu.org/licenses/>.
 """
 from __future__ import annotations
 import comfy.utils
 import comfy.model_management
 import comfy.model_base
 import logging
 import torch
 LORA_CLIP_MAP = {
    "mlp.fc1": "mlp_fc1",
@ -218,11 +240,17 @@ def model_lora_keys_clip(model, key_map={}):
                    lora_key = "lora_prior_te_text_model_encoder_layers_{}_{}".format(b, LORA_CLIP_MAP[c]) #cascade lora: TODO put lora key prefix in the model config
                    key_map[lora_key] = k
-    for k in sdk: #OneTrainer SD3 lora
-        if k.startswith("t5xxl.transformer.") and k.endswith(".weight"):
-            l_key = k[len("t5xxl.transformer."):-len(".weight")]
-            lora_key = "lora_te3_{}".format(l_key.replace(".", "_"))
-            key_map[lora_key] = k
+    for k in sdk:
+        if k.endswith(".weight"):
+            if k.startswith("t5xxl.transformer."):#OneTrainer SD3 lora
+                l_key = k[len("t5xxl.transformer."):-len(".weight")]
+                lora_key = "lora_te3_{}".format(l_key.replace(".", "_"))
+                key_map[lora_key] = k
+            elif k.startswith("hydit_clip.transformer.bert."): #HunyuanDiT Lora
+                l_key = k[len("hydit_clip.transformer.bert."):-len(".weight")]
+                lora_key = "lora_te1_{}".format(l_key.replace(".", "_"))
+                key_map[lora_key] = k
    k = "clip_g.transformer.text_projection.weight"
    if k in sdk:
@ -245,6 +273,7 @@ def model_lora_keys_unet(model, key_map={}):
            key_lora = k[len("diffusion_model."):-len(".weight")].replace(".", "_")
            key_map["lora_unet_{}".format(key_lora)] = k
            key_map["lora_prior_unet_{}".format(key_lora)] = k #cascade lora: TODO put lora key prefix in the model config
            key_map["{}".format(k[:-len(".weight")])] = k #generic lora format without any weird key names
    diffusers_keys = comfy.utils.unet_to_diffusers(model.model_config.unet_config)
    for k in diffusers_keys:
@ -288,4 +317,240 @@ def model_lora_keys_unet(model, key_map={}):
                key_lora = k[len("diffusion_model."):-len(".weight")]
                key_map["base_model.model.{}".format(key_lora)] = k #official hunyuan lora format
    if isinstance(model, comfy.model_base.Flux): #Diffusers lora Flux
        diffusers_keys = comfy.utils.flux_to_diffusers(model.model_config.unet_config, output_prefix="diffusion_model.")
        for k in diffusers_keys:
            if k.endswith(".weight"):
                to = diffusers_keys[k]
                key_map["transformer.{}".format(k[:-len(".weight")])] = to #simpletrainer and probably regular diffusers flux lora format
                key_map["lycoris_{}".format(k[:-len(".weight")].replace(".", "_"))] = to #simpletrainer lycoris
    return key_map
 def weight_decompose(dora_scale, weight, lora_diff, alpha, strength, intermediate_dtype):
    dora_scale = comfy.model_management.cast_to_device(dora_scale, weight.device, intermediate_dtype)
    lora_diff *= alpha
    weight_calc = weight + lora_diff.type(weight.dtype)
    weight_norm = (
        weight_calc.transpose(0, 1)
        .reshape(weight_calc.shape[1], -1)
        .norm(dim=1, keepdim=True)
        .reshape(weight_calc.shape[1], *[1] * (weight_calc.dim() - 1))
        .transpose(0, 1)
    )
    weight_calc *= (dora_scale / weight_norm).type(weight.dtype)
    if strength != 1.0:
        weight_calc -= weight
        weight += strength * (weight_calc)
    else:
        weight[:] = weight_calc
    return weight
 def pad_tensor_to_shape(tensor: torch.Tensor, new_shape: list[int]) -> torch.Tensor:
    """
    Pad a tensor to a new shape with zeros.
    Args:
        tensor (torch.Tensor): The original tensor to be padded.
        new_shape (List[int]): The desired shape of the padded tensor.
    Returns:
        torch.Tensor: A new tensor padded with zeros to the specified shape.
    Note:
        If the new shape is smaller than the original tensor in any dimension,
        the original tensor will be truncated in that dimension.
    """
    if any([new_shape[i] < tensor.shape[i] for i in range(len(new_shape))]):
        raise ValueError("The new shape must be larger than the original tensor in all dimensions")
    if len(new_shape) != len(tensor.shape):
        raise ValueError("The new shape must have the same number of dimensions as the original tensor")
    # Create a new tensor filled with zeros
    padded_tensor = torch.zeros(new_shape, dtype=tensor.dtype, device=tensor.device)
    # Create slicing tuples for both tensors
    orig_slices = tuple(slice(0, dim) for dim in tensor.shape)
    new_slices = tuple(slice(0, dim) for dim in tensor.shape)
    # Copy the original tensor into the new tensor
    padded_tensor[new_slices] = tensor[orig_slices]
    return padded_tensor
 def calculate_weight(patches, weight, key, intermediate_dtype=torch.float32):
    for p in patches:
        strength = p[0]
        v = p[1]
        strength_model = p[2]
        offset = p[3]
        function = p[4]
        if function is None:
            function = lambda a: a
        old_weight = None
        if offset is not None:
            old_weight = weight
            weight = weight.narrow(offset[0], offset[1], offset[2])
        if strength_model != 1.0:
            weight *= strength_model
        if isinstance(v, list):
            v = (calculate_weight(v[1:], v[0].clone(), key, intermediate_dtype=intermediate_dtype), )
        if len(v) == 1:
            patch_type = "diff"
        elif len(v) == 2:
            patch_type = v[0]
            v = v[1]
        if patch_type == "diff":
            diff: torch.Tensor = v[0]
            # An extra flag to pad the weight if the diff's shape is larger than the weight
            do_pad_weight = len(v) > 1 and v[1]['pad_weight']
            if do_pad_weight and diff.shape != weight.shape:
                logging.info("Pad weight {} from {} to shape: {}".format(key, weight.shape, diff.shape))
                weight = pad_tensor_to_shape(weight, diff.shape)
            if strength != 0.0:
                if diff.shape != weight.shape:
                    logging.warning("WARNING SHAPE MISMATCH {} WEIGHT NOT MERGED {} != {}".format(key, diff.shape, weight.shape))
                else:
                    weight += function(strength * comfy.model_management.cast_to_device(diff, weight.device, weight.dtype))
        elif patch_type == "lora": #lora/locon
            mat1 = comfy.model_management.cast_to_device(v[0], weight.device, intermediate_dtype)
            mat2 = comfy.model_management.cast_to_device(v[1], weight.device, intermediate_dtype)
            dora_scale = v[4]
            if v[2] is not None:
                alpha = v[2] / mat2.shape[0]
            else:
                alpha = 1.0
            if v[3] is not None:
                #locon mid weights, hopefully the math is fine because I didn't properly test it
                mat3 = comfy.model_management.cast_to_device(v[3], weight.device, intermediate_dtype)
                final_shape = [mat2.shape[1], mat2.shape[0], mat3.shape[2], mat3.shape[3]]
                mat2 = torch.mm(mat2.transpose(0, 1).flatten(start_dim=1), mat3.transpose(0, 1).flatten(start_dim=1)).reshape(final_shape).transpose(0, 1)
            try:
                lora_diff = torch.mm(mat1.flatten(start_dim=1), mat2.flatten(start_dim=1)).reshape(weight.shape)
                if dora_scale is not None:
                    weight = function(weight_decompose(dora_scale, weight, lora_diff, alpha, strength, intermediate_dtype))
                else:
                    weight += function(((strength * alpha) * lora_diff).type(weight.dtype))
            except Exception as e:
                logging.error("ERROR {} {} {}".format(patch_type, key, e))
        elif patch_type == "lokr":
            w1 = v[0]
            w2 = v[1]
            w1_a = v[3]
            w1_b = v[4]
            w2_a = v[5]
            w2_b = v[6]
            t2 = v[7]
            dora_scale = v[8]
            dim = None
            if w1 is None:
                dim = w1_b.shape[0]
                w1 = torch.mm(comfy.model_management.cast_to_device(w1_a, weight.device, intermediate_dtype),
                                comfy.model_management.cast_to_device(w1_b, weight.device, intermediate_dtype))
            else:
                w1 = comfy.model_management.cast_to_device(w1, weight.device, intermediate_dtype)
            if w2 is None:
                dim = w2_b.shape[0]
                if t2 is None:
                    w2 = torch.mm(comfy.model_management.cast_to_device(w2_a, weight.device, intermediate_dtype),
                                    comfy.model_management.cast_to_device(w2_b, weight.device, intermediate_dtype))
                else:
                    w2 = torch.einsum('i j k l, j r, i p -> p r k l',
                                        comfy.model_management.cast_to_device(t2, weight.device, intermediate_dtype),
                                        comfy.model_management.cast_to_device(w2_b, weight.device, intermediate_dtype),
                                        comfy.model_management.cast_to_device(w2_a, weight.device, intermediate_dtype))
            else:
                w2 = comfy.model_management.cast_to_device(w2, weight.device, intermediate_dtype)
            if len(w2.shape) == 4:
                w1 = w1.unsqueeze(2).unsqueeze(2)
            if v[2] is not None and dim is not None:
                alpha = v[2] / dim
            else:
                alpha = 1.0
            try:
                lora_diff = torch.kron(w1, w2).reshape(weight.shape)
                if dora_scale is not None:
                    weight = function(weight_decompose(dora_scale, weight, lora_diff, alpha, strength, intermediate_dtype))
                else:
                    weight += function(((strength * alpha) * lora_diff).type(weight.dtype))
            except Exception as e:
                logging.error("ERROR {} {} {}".format(patch_type, key, e))
        elif patch_type == "loha":
            w1a = v[0]
            w1b = v[1]
            if v[2] is not None:
                alpha = v[2] / w1b.shape[0]
            else:
                alpha = 1.0
            w2a = v[3]
            w2b = v[4]
            dora_scale = v[7]
            if v[5] is not None: #cp decomposition
                t1 = v[5]
                t2 = v[6]
                m1 = torch.einsum('i j k l, j r, i p -> p r k l',
                                    comfy.model_management.cast_to_device(t1, weight.device, intermediate_dtype),
                                    comfy.model_management.cast_to_device(w1b, weight.device, intermediate_dtype),
                                    comfy.model_management.cast_to_device(w1a, weight.device, intermediate_dtype))
                m2 = torch.einsum('i j k l, j r, i p -> p r k l',
                                    comfy.model_management.cast_to_device(t2, weight.device, intermediate_dtype),
                                    comfy.model_management.cast_to_device(w2b, weight.device, intermediate_dtype),
                                    comfy.model_management.cast_to_device(w2a, weight.device, intermediate_dtype))
            else:
                m1 = torch.mm(comfy.model_management.cast_to_device(w1a, weight.device, intermediate_dtype),
                                comfy.model_management.cast_to_device(w1b, weight.device, intermediate_dtype))
                m2 = torch.mm(comfy.model_management.cast_to_device(w2a, weight.device, intermediate_dtype),
                                comfy.model_management.cast_to_device(w2b, weight.device, intermediate_dtype))
            try:
                lora_diff = (m1 * m2).reshape(weight.shape)
                if dora_scale is not None:
                    weight = function(weight_decompose(dora_scale, weight, lora_diff, alpha, strength, intermediate_dtype))
                else:
                    weight += function(((strength * alpha) * lora_diff).type(weight.dtype))
            except Exception as e:
                logging.error("ERROR {} {} {}".format(patch_type, key, e))
        elif patch_type == "glora":
            if v[4] is not None:
                alpha = v[4] / v[0].shape[0]
            else:
                alpha = 1.0
            dora_scale = v[5]
            a1 = comfy.model_management.cast_to_device(v[0].flatten(start_dim=1), weight.device, intermediate_dtype)
            a2 = comfy.model_management.cast_to_device(v[1].flatten(start_dim=1), weight.device, intermediate_dtype)
            b1 = comfy.model_management.cast_to_device(v[2].flatten(start_dim=1), weight.device, intermediate_dtype)
            b2 = comfy.model_management.cast_to_device(v[3].flatten(start_dim=1), weight.device, intermediate_dtype)
            try:
                lora_diff = (torch.mm(b2, b1) + torch.mm(torch.mm(weight.flatten(start_dim=1), a2), a1)).reshape(weight.shape)
                if dora_scale is not None:
                    weight = function(weight_decompose(dora_scale, weight, lora_diff, alpha, strength, intermediate_dtype))
                else:
                    weight += function(((strength * alpha) * lora_diff).type(weight.dtype))
            except Exception as e:
                logging.error("ERROR {} {} {}".format(patch_type, key, e))
        else:
            logging.warning("patch type not recognized {} {}".format(patch_type, key))
        if old_weight is not None:
            weight = old_weight
    return weight
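Reviewer note on the `"lora"` branch of `calculate_weight` above: for a plain LoRA patch (no DoRA scale, identity `function`) the update reduces to `weight += strength * (alpha / rank) * (mat1 @ mat2)`, where `rank` is `mat2.shape[0]`. The sketch below restates that arithmetic in pure Python on nested lists so it runs without torch. The helper names `matmul` and `apply_lora` are hypothetical and not part of ComfyUI.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, up, down, alpha, strength):
    """Return weight + strength * (alpha / rank) * (up @ down).

    Mirrors the non-DoRA path: alpha is divided by the rank
    (number of rows of `down`), or treated as 1.0 when it is None.
    """
    rank = len(down)
    scale = strength * (alpha / rank if alpha is not None else 1.0)
    diff = matmul(up, down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, diff)
    ]


weight = [[1.0, 0.0], [0.0, 1.0]]
up = [[1.0], [2.0]]    # shape (2, 1), plays the role of mat1
down = [[3.0, 4.0]]    # shape (1, 2), plays the role of mat2, rank 1
merged = apply_lora(weight, up, down, alpha=0.5, strength=1.0)
```

The `lokr`, `loha` and `glora` branches differ only in how `lora_diff` is built (Kronecker product, element-wise product, or the GLoRA sum); the final scaled addition is the same.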
--- a/comfy/model_base.py
+++ b/comfy/model_base.py
@ -1,3 +1,21 @@
 """
    This file is part of ComfyUI.
    Copyright (C) 2024 Comfy
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.
    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <https://www.gnu.org/licenses/>.
 """
 import torch
 import logging
 from comfy.ldm.modules.diffusionmodules.openaimodel import UNetModel, Timestep
@ -74,16 +92,18 @@ class BaseModel(torch.nn.Module):
        self.latent_format = model_config.latent_format
        self.model_config = model_config
        self.manual_cast_dtype = model_config.manual_cast_dtype
        self.device = device
        if not unet_config.get("disable_unet_model_creation", False):
-            if self.manual_cast_dtype is not None:
+            if model_config.custom_operations is None:
-                operations = comfy.ops.manual_cast
+                operations = comfy.ops.pick_operations(unet_config.get("dtype", None), self.manual_cast_dtype)
            else:
-                operations = comfy.ops.disable_weight_init
+                operations = model_config.custom_operations
            self.diffusion_model = unet_model(**unet_config, device=device, operations=operations)
            if comfy.model_management.force_channels_last():
                self.diffusion_model.to(memory_format=torch.channels_last)
                logging.debug("using channels last mode for diffusion model")
            logging.info("model weight dtype {}, manual cast: {}".format(self.get_dtype(), self.manual_cast_dtype))
        self.model_type = model_type
        self.model_sampling = model_sampling(model_config, model_type)
@ -94,6 +114,7 @@ class BaseModel(torch.nn.Module):
        self.concat_keys = ()
        logging.info("model_type {}".format(model_type.name))
        logging.debug("adm {}".format(self.adm_channels))
        self.memory_usage_factor = model_config.memory_usage_factor
    def apply_model(self, x, t, c_concat=None, c_crossattn=None, control=None, transformer_options={}, **kwargs):
        sigma = t
@ -252,11 +273,11 @@ class BaseModel(torch.nn.Module):
                dtype = self.manual_cast_dtype
            #TODO: this needs to be tweaked
            area = input_shape[0] * math.prod(input_shape[2:])
-            return (area * comfy.model_management.dtype_size(dtype) / 50) * (1024 * 1024)
+            return (area * comfy.model_management.dtype_size(dtype) * 0.01 * self.memory_usage_factor) * (1024 * 1024)
        else:
            #TODO: this formula might be too aggressive since I tweaked the sub-quad and split algorithms to use less memory.
            area = input_shape[0] * math.prod(input_shape[2:])
-            return (area * 0.3) * (1024 * 1024)
+            return (area * 0.15 * self.memory_usage_factor) * (1024 * 1024)
 def unclip_adm(unclip_conditioning, device, noise_augmentor, noise_augment_merge=0.0, seed=None):
@ -354,6 +375,7 @@ class SDXL(BaseModel):
        flat = torch.flatten(torch.cat(out)).unsqueeze(dim=0).repeat(clip_pooled.shape[0], 1)
        return torch.cat((clip_pooled.to(flat.device), flat), dim=1)
 class SVD_img2vid(BaseModel):
    def __init__(self, model_config, model_type=ModelType.V_PREDICTION_EDM, device=None):
        super().__init__(model_config, model_type, device=device)
@ -594,17 +616,6 @@ class SD3(BaseModel):
            out['c_crossattn'] = comfy.conds.CONDRegular(cross_attn)
        return out
-    def memory_required(self, input_shape):
-        if comfy.model_management.xformers_enabled() or comfy.model_management.pytorch_attention_flash_attention():
-            dtype = self.get_dtype()
-            if self.manual_cast_dtype is not None:
-                dtype = self.manual_cast_dtype
-            #TODO: this probably needs to be tweaked
-            area = input_shape[0] * input_shape[2] * input_shape[3]
-            return (area * comfy.model_management.dtype_size(dtype) * 0.012) * (1024 * 1024)
-        else:
-            area = input_shape[0] * input_shape[2] * input_shape[3]
-            return (area * 0.3) * (1024 * 1024)
 class AuraFlow(BaseModel):
    def __init__(self, model_config, model_type=ModelType.FLOW, device=None):
@ -702,15 +713,3 @@ class Flux(BaseModel):
            out['c_crossattn'] = comfy.conds.CONDRegular(cross_attn)
        out['guidance'] = comfy.conds.CONDRegular(torch.FloatTensor([kwargs.get("guidance", 3.5)]))
        return out
-    def memory_required(self, input_shape):
-        if comfy.model_management.xformers_enabled() or comfy.model_management.pytorch_attention_flash_attention():
-            dtype = self.get_dtype()
-            if self.manual_cast_dtype is not None:
-                dtype = self.manual_cast_dtype
-            #TODO: this probably needs to be tweaked
-            area = input_shape[0] * input_shape[2] * input_shape[3]
-            return (area * comfy.model_management.dtype_size(dtype) * 0.020) * (1024 * 1024)
-        else:
-            area = input_shape[0] * input_shape[2] * input_shape[3]
-            return (area * 0.3) * (1024 * 1024)
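Reviewer note on the `memory_required` changes in `comfy/model_base.py` above: the per-model overrides in `SD3` and `Flux` are dropped and `BaseModel` now scales both branches by `self.memory_usage_factor`. The sketch below restates the new formula as a standalone function. The function name and the explicit `flash_attention` flag are hypothetical; in the source the branch is chosen by `xformers_enabled()` or `pytorch_attention_flash_attention()`, and the per-model factor values are not shown in this section.

```python
import math

MIB = 1024 * 1024


def estimate_memory(input_shape, dtype_size, memory_usage_factor, flash_attention):
    """Restate the new BaseModel.memory_required arithmetic.

    input_shape is (batch, channels, *spatial); channels do not enter the area.
    """
    area = input_shape[0] * math.prod(input_shape[2:])
    if flash_attention:
        return (area * dtype_size * 0.01 * memory_usage_factor) * MIB
    return (area * 0.15 * memory_usage_factor) * MIB


shape = (1, 4, 128, 128)
area = 1 * 128 * 128

# A factor of 2.0 reproduces the constants that the diff removes.
new_flash = estimate_memory(shape, 2, 2.0, flash_attention=True)
old_flash = (area * 2 / 50) * MIB
new_fallback = estimate_memory(shape, 2, 2.0, flash_attention=False)
old_fallback = (area * 0.3) * MIB
```

By the same arithmetic, the removed `SD3` and `Flux` overrides (`0.012` and `0.020` on the flash-attention branch) would correspond to factors of 1.2 and 2.0 on that branch.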
--- a/comfy/model_detection.py
+++ b/comfy/model_detection.py
@ -131,14 +131,14 @@ def detect_unet_config(state_dict, key_prefix):
    if '{}double_blocks.0.img_attn.norm.key_norm.scale'.format(key_prefix) in state_dict_keys: #Flux
        dit_config = {}
        dit_config["image_model"] = "flux"
-        dit_config["in_channels"] = 64
+        dit_config["in_channels"] = 16
        dit_config["vec_in_dim"] = 768
        dit_config["context_in_dim"] = 4096
        dit_config["hidden_size"] = 3072
        dit_config["mlp_ratio"] = 4.0
        dit_config["num_heads"] = 24
-        dit_config["depth"] = 19
+        dit_config["depth"] = count_blocks(state_dict_keys, '{}double_blocks.'.format(key_prefix) + '{}.')
-        dit_config["depth_single_blocks"] = 38
+        dit_config["depth_single_blocks"] = count_blocks(state_dict_keys, '{}single_blocks.'.format(key_prefix) + '{}.')
        dit_config["axes_dim"] = [16, 56, 56]
        dit_config["theta"] = 10000
        dit_config["qkv_bias"] = True
@ -472,9 +472,15 @@ def unet_config_from_diffusers_unet(state_dict, dtype=None):
            'transformer_depth': [0, 1, 1], 'channel_mult': [1, 2, 4], 'transformer_depth_middle': -2, 'use_linear_in_transformer': False,
            'context_dim': 768, 'num_head_channels': 64, 'transformer_depth_output': [0, 0, 1, 1, 1, 1],
            'use_temporal_attention': False, 'use_temporal_resblock': False}
+    SD15_diffusers_inpaint = {'use_checkpoint': False, 'image_size': 32, 'out_channels': 4, 'use_spatial_transformer': True, 'legacy': False, 'adm_in_channels': None,
+            'dtype': dtype, 'in_channels': 9, 'model_channels': 320, 'num_res_blocks': [2, 2, 2, 2], 'transformer_depth': [1, 1, 1, 1, 1, 1, 0, 0],
+            'channel_mult': [1, 2, 4, 4], 'transformer_depth_middle': 1, 'use_linear_in_transformer': False, 'context_dim': 768, 'num_heads': 8,
+            'transformer_depth_output': [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
+            'use_temporal_attention': False, 'use_temporal_resblock': False}
-    supported_models = [SDXL, SDXL_refiner, SD21, SD15, SD21_uncliph, SD21_unclipl, SDXL_mid_cnet, SDXL_small_cnet, SDXL_diffusers_inpaint, SSD_1B, Segmind_Vega, KOALA_700M, KOALA_1B, SD09_XS, SD_XS, SDXL_diffusers_ip2p]
+    supported_models = [SDXL, SDXL_refiner, SD21, SD15, SD21_uncliph, SD21_unclipl, SDXL_mid_cnet, SDXL_small_cnet, SDXL_diffusers_inpaint, SSD_1B, Segmind_Vega, KOALA_700M, KOALA_1B, SD09_XS, SD_XS, SDXL_diffusers_ip2p, SD15_diffusers_inpaint]
    for unet_config in supported_models:
        matches = True
@ -495,7 +501,12 @@ def model_config_from_diffusers_unet(state_dict):
 def convert_diffusers_mmdit(state_dict, output_prefix=""):
    out_sd = {}
-    if 'transformer_blocks.0.attn.add_q_proj.weight' in state_dict: #SD3
+    if 'transformer_blocks.0.attn.norm_added_k.weight' in state_dict: #Flux
+        depth = count_blocks(state_dict, 'transformer_blocks.{}.')
+        depth_single_blocks = count_blocks(state_dict, 'single_transformer_blocks.{}.')
+        hidden_size = state_dict["x_embedder.bias"].shape[0]
+        sd_map = comfy.utils.flux_to_diffusers({"depth": depth, "depth_single_blocks": depth_single_blocks, "hidden_size": hidden_size}, output_prefix=output_prefix)
+    elif 'transformer_blocks.0.attn.add_q_proj.weight' in state_dict: #SD3
        num_blocks = count_blocks(state_dict, 'transformer_blocks.{}.')
        depth = state_dict["pos_embed.proj.weight"].shape[0] // 64
        sd_map = comfy.utils.mmdit_to_diffusers({"depth": depth, "num_blocks": num_blocks}, output_prefix=output_prefix)
@ -521,7 +532,12 @@ def convert_diffusers_mmdit(state_dict, output_prefix=""):
                    old_weight = out_sd.get(t[0], None)
                    if old_weight is None:
                        old_weight = torch.empty_like(weight)
-                        old_weight = old_weight.repeat([3] + [1] * (len(old_weight.shape) - 1))
+                    if old_weight.shape[offset[0]] < offset[1] + offset[2]:
+                        exp = list(weight.shape)
+                        exp[offset[0]] = offset[1] + offset[2]
+                        new = torch.empty(exp, device=weight.device, dtype=weight.dtype)
+                        new[:old_weight.shape[0]] = old_weight
+                        old_weight = new
                    w = old_weight.narrow(offset[0], offset[1], offset[2])
                else:
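Reviewer note on the Flux detection change in `comfy/model_detection.py` above: `depth` and `depth_single_blocks` are no longer hard-coded to 19 and 38 but derived from the checkpoint keys via `count_blocks`. The body of `count_blocks` is not part of this diff, so the version below is only an assumption inferred from its call sites: it counts consecutive block indices, starting at 0, for which at least one key matches the formatted prefix. The key names in the example are made up for illustration.

```python
def count_blocks(state_dict_keys, prefix_template):
    """Count consecutive indices i (from 0) with a key starting with prefix_template.format(i)."""
    count = 0
    while any(k.startswith(prefix_template.format(count)) for k in state_dict_keys):
        count += 1
    return count


key_prefix = "model.diffusion_model."
# Hypothetical key names; only the "<prefix>double_blocks.<i>." pattern matters here.
keys = [f"{key_prefix}double_blocks.{i}.img_attn.qkv.weight" for i in range(19)]
keys += [f"{key_prefix}single_blocks.{i}.linear1.weight" for i in range(38)]

depth = count_blocks(keys, '{}double_blocks.'.format(key_prefix) + '{}.')
depth_single_blocks = count_blocks(keys, '{}single_blocks.'.format(key_prefix) + '{}.')
```

The trailing `.` in the template keeps index `1` from also matching `10` through `18`, which is why the call sites append `'{}.'` rather than `'{}'`.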
--- a/comfy/model_management.py
+++ b/comfy/model_management.py
@ -1,3 +1,21 @@
 """
    This file is part of ComfyUI.
    Copyright (C) 2024 Comfy
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.
    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <https://www.gnu.org/licenses/>.
 """
 import psutil
 import logging
 from enum import Enum
@ -26,9 +44,14 @@ cpu_state = CPUState.GPU
 total_vram = 0
 lowvram_available = True
 xpu_available = False
 try:
    torch_version = torch.version.__version__
    xpu_available = (int(torch_version[0]) < 2 or (int(torch_version[0]) == 2 and int(torch_version[2]) <= 4)) and torch.xpu.is_available()
 except:
    pass
 lowvram_available = True
 if args.deterministic:
    logging.info("Using deterministic algorithms for pytorch")
    torch.use_deterministic_algorithms(True, warn_only=True)
@ -48,10 +71,10 @@ if args.directml is not None:
 try:
    import intel_extension_for_pytorch as ipex
-    if torch.xpu.is_available():
+    _ = torch.xpu.device_count()
-        xpu_available = True
+    xpu_available = torch.xpu.is_available()
 except:
-    pass
+    xpu_available = xpu_available or (hasattr(torch, "xpu") and torch.xpu.is_available())
 try:
    if torch.backends.mps.is_available():
@ -171,7 +194,6 @@ VAE_DTYPES = [torch.float32]
 try:
    if is_nvidia():
-        torch_version = torch.version.__version__
        if int(torch_version[0]) >= 2:
            if ENABLE_PYTORCH_ATTENTION == False and args.use_split_cross_attention == False and args.use_quad_cross_attention == False:
                ENABLE_PYTORCH_ATTENTION = True
@ -273,9 +295,12 @@ class LoadedModel:
    def model_memory(self):
        return self.model.model_size()
+    def model_offloaded_memory(self):
+        return self.model.model_size() - self.model.loaded_size()
    def model_memory_required(self, device):
-        if device == self.model.current_device:
+        if device == self.model.current_loaded_device():
-            return 0
+            return self.model_offloaded_memory()
        else:
            return self.model_memory()
@ -287,39 +312,77 @@ class LoadedModel:
        load_weights = not self.weights_loaded
-        try:
-            if lowvram_model_memory > 0 and load_weights:
-                self.real_model = self.model.patch_model_lowvram(device_to=patch_model_to, lowvram_model_memory=lowvram_model_memory, force_patch_weights=force_patch_weights)
-            else:
-                self.real_model = self.model.patch_model(device_to=patch_model_to, patch_weights=load_weights)
-        except Exception as e:
-            self.model.unpatch_model(self.model.offload_device)
-            self.model_unload()
-            raise e
+        if self.model.loaded_size() > 0:
+            use_more_vram = lowvram_model_memory
+            if use_more_vram == 0:
+                use_more_vram = 1e32
+            self.model_use_more_vram(use_more_vram)
+        else:
+            try:
+                self.real_model = self.model.patch_model(device_to=patch_model_to, lowvram_model_memory=lowvram_model_memory, load_weights=load_weights, force_patch_weights=force_patch_weights)
+            except Exception as e:
+                self.model.unpatch_model(self.model.offload_device)
+                self.model_unload()
+                raise e
-        if is_intel_xpu() and not args.disable_ipex_optimize:
-            self.real_model = ipex.optimize(self.real_model.eval(), graph_mode=True, concat_linear=True)
+        if is_intel_xpu() and not args.disable_ipex_optimize and self.real_model is not None:
+            with torch.no_grad():
+                self.real_model = ipex.optimize(self.real_model.eval(), inplace=True, graph_mode=True, concat_linear=True)
        self.weights_loaded = True
        return self.real_model
    def should_reload_model(self, force_patch_weights=False):
-        if force_patch_weights and self.model.lowvram_patch_counter > 0:
+        if force_patch_weights and self.model.lowvram_patch_counter() > 0:
            return True
        return False
-    def model_unload(self, unpatch_weights=True):
+    def model_unload(self, memory_to_free=None, unpatch_weights=True):
+        if memory_to_free is not None:
+            if memory_to_free < self.model.loaded_size():
+                freed = self.model.partially_unload(self.model.offload_device, memory_to_free)
+                if freed >= memory_to_free:
+                    return False
        self.model.unpatch_model(self.model.offload_device, unpatch_weights=unpatch_weights)
        self.model.model_patches_to(self.model.offload_device)
        self.weights_loaded = self.weights_loaded and not unpatch_weights
        self.real_model = None
        return True
    def model_use_more_vram(self, extra_memory):
        return self.model.partially_load(self.device, extra_memory)
    def __eq__(self, other):
        return self.model is other.model
 def use_more_memory(extra_memory, loaded_models, device):
    for m in loaded_models:
        if m.device == device:
            extra_memory -= m.model_use_more_vram(extra_memory)
            if extra_memory <= 0:
                break
 def offloaded_memory(loaded_models, device):
    offloaded_mem = 0
    for m in loaded_models:
        if m.device == device:
            offloaded_mem += m.model_offloaded_memory()
    return offloaded_mem
 def minimum_inference_memory():
    return (1024 * 1024 * 1024) * 1.2
 EXTRA_RESERVED_VRAM = 200 * 1024 * 1024
 if any(platform.win32_ver()):
    EXTRA_RESERVED_VRAM = 500 * 1024 * 1024 #Windows is higher because of the shared vram issue
 if args.reserve_vram is not None:
    EXTRA_RESERVED_VRAM = args.reserve_vram * 1024 * 1024 * 1024
    logging.debug("Reserving {}MB vram for other applications.".format(EXTRA_RESERVED_VRAM / (1024 * 1024)))
 def extra_reserved_memory():
    return EXTRA_RESERVED_VRAM
 def unload_model_clones(model, unload_weights_only=True, force_unload=True):
    to_unload = []
    for i in range(len(current_loaded_models)):
@ -342,6 +405,8 @@ def unload_model_clones(model, unload_weights_only=True, force_unload=True):
    if not force_unload:
        if unload_weights_only and unload_weight == False:
            return None
    else:
        unload_weight = True
    for i in to_unload:
        logging.debug("unload clone {} {}".format(i, unload_weight))
@ -352,6 +417,7 @@ def unload_model_clones(model, unload_weights_only=True, force_unload=True):
 def free_memory(memory_required, device, keep_loaded=[]):
    unloaded_model = []
    can_unload = []
    unloaded_models = []
    for i in range(len(current_loaded_models) -1, -1, -1):
        shift_model = current_loaded_models[i]
@ -362,14 +428,18 @@ def free_memory(memory_required, device, keep_loaded=[]):
    for x in sorted(can_unload):
        i = x[-1]
        memory_to_free = None
        if not DISABLE_SMART_MEMORY:
-            if get_free_memory(device) > memory_required:
+            free_mem = get_free_memory(device)
+            if free_mem > memory_required:
                break
-        current_loaded_models[i].model_unload()
-        unloaded_model.append(i)
+            memory_to_free = memory_required - free_mem
+        logging.debug(f"Unloading {current_loaded_models[i].model.model.__class__.__name__}")
+        if current_loaded_models[i].model_unload(memory_to_free):
+            unloaded_model.append(i)
    for i in sorted(unloaded_model, reverse=True):
-        current_loaded_models.pop(i)
+        unloaded_models.append(current_loaded_models.pop(i))
    if len(unloaded_model) > 0:
        soft_empty_cache()
@ -378,16 +448,17 @@ def free_memory(memory_required, device, keep_loaded=[]):
            mem_free_total, mem_free_torch = get_free_memory(device, torch_free_too=True)
            if mem_free_torch > mem_free_total * 0.25:
                soft_empty_cache()
    return unloaded_models
-def load_models_gpu(models, memory_required=0, force_patch_weights=False, minimum_memory_required=None):
+def load_models_gpu(models, memory_required=0, force_patch_weights=False, minimum_memory_required=None, force_full_load=False):
    global vram_state
    inference_memory = minimum_inference_memory()
-    extra_mem = max(inference_memory, memory_required)
+    extra_mem = max(inference_memory, memory_required + extra_reserved_memory())
    if minimum_memory_required is None:
        minimum_memory_required = extra_mem
    else:
-        minimum_memory_required = max(inference_memory, minimum_memory_required)
+        minimum_memory_required = max(inference_memory, minimum_memory_required + extra_reserved_memory())
    models = set(models)
@ -420,25 +491,36 @@ def load_models_gpu(models, memory_required=0, force_patch_weights=False, minimu
        devs = set(map(lambda a: a.device, models_already_loaded))
        for d in devs:
            if d != torch.device("cpu"):
-                free_memory(extra_mem, d, models_already_loaded)
-        return
+                free_memory(extra_mem + offloaded_memory(models_already_loaded, d), d, models_already_loaded)
+                free_mem = get_free_memory(d)
+                if free_mem < minimum_memory_required:
+                    logging.info("Unloading models for lowram load.") #TODO: partial model unloading when this case happens, also handle the opposite case where models can be unlowvramed.
+                    models_to_load = free_memory(minimum_memory_required, d)
+                    logging.info("{} models unloaded.".format(len(models_to_load)))
+                else:
+                    use_more_memory(free_mem - minimum_memory_required, models_already_loaded, d)
+        if len(models_to_load) == 0:
+            return
    logging.info(f"Loading {len(models_to_load)} new model{'s' if len(models_to_load) > 1 else ''}")
    total_memory_required = {}
    for loaded_model in models_to_load:
-        if unload_model_clones(loaded_model.model, unload_weights_only=True, force_unload=False) == True:#unload clones where the weights are different
-            total_memory_required[loaded_model.device] = total_memory_required.get(loaded_model.device, 0) + loaded_model.model_memory_required(loaded_model.device)
-    for device in total_memory_required:
-        if device != torch.device("cpu"):
-            free_memory(total_memory_required[device] * 1.3 + extra_mem, device, models_already_loaded)
+        unload_model_clones(loaded_model.model, unload_weights_only=True, force_unload=False) #unload clones where the weights are different
+        total_memory_required[loaded_model.device] = total_memory_required.get(loaded_model.device, 0) + loaded_model.model_memory_required(loaded_model.device)
+    for loaded_model in models_already_loaded:
+        total_memory_required[loaded_model.device] = total_memory_required.get(loaded_model.device, 0) + loaded_model.model_memory_required(loaded_model.device)
    for loaded_model in models_to_load:
        weights_unloaded = unload_model_clones(loaded_model.model, unload_weights_only=False, force_unload=False) #unload the rest of the clones where the weights can stay loaded
        if weights_unloaded is not None:
            loaded_model.weights_loaded = not weights_unloaded
    for device in total_memory_required:
        if device != torch.device("cpu"):
            free_memory(total_memory_required[device] * 1.1 + extra_mem, device, models_already_loaded)
    for loaded_model in models_to_load:
        model = loaded_model.model
        torch_dev = model.load_device
@ -447,10 +529,10 @@ def load_models_gpu(models, memory_required=0, force_patch_weights=False, minimu
        else:
            vram_set_state = vram_state
        lowvram_model_memory = 0
-        if lowvram_available and (vram_set_state == VRAMState.LOW_VRAM or vram_set_state == VRAMState.NORMAL_VRAM):
+        if lowvram_available and (vram_set_state == VRAMState.LOW_VRAM or vram_set_state == VRAMState.NORMAL_VRAM) and not force_full_load:
            model_size = loaded_model.model_memory_required(torch_dev)
            current_free_mem = get_free_memory(torch_dev)
-            lowvram_model_memory = int(max(64 * (1024 * 1024), (current_free_mem - minimum_memory_required)))
+            lowvram_model_memory = max(64 * (1024 * 1024), (current_free_mem - minimum_memory_required), min(current_free_mem * 0.4, current_free_mem - minimum_inference_memory()))
            if model_size <= lowvram_model_memory: #only switch to lowvram if really necessary
                lowvram_model_memory = 0
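Reviewer note on the new `lowvram_model_memory` expression: a minimal sketch of the budget arithmetic, assuming the same constants as the diff (64 MiB floor, 40% cap, 1.2 GiB minimum inference memory). The helper names `lowvram_budget` and `plan_load` are hypothetical and only isolate the formula; they are not functions in the repository. Since the `int(...)` cast was dropped, the result can be a float.

```python
MiB = 1024 * 1024
GiB = 1024 * MiB


def minimum_inference_memory():
    # Same constant as in the diff.
    return GiB * 1.2


def lowvram_budget(current_free_mem, minimum_memory_required):
    # Hypothetical helper: largest of the 64 MiB floor, the memory left after
    # the reserved minimum, and a capped share of the free memory.
    return max(
        64 * MiB,
        current_free_mem - minimum_memory_required,
        min(current_free_mem * 0.4, current_free_mem - minimum_inference_memory()),
    )


def plan_load(model_size, current_free_mem, minimum_memory_required):
    # Hypothetical helper: 0 means "load fully", mirroring the
    # "only switch to lowvram if really necessary" branch.
    budget = lowvram_budget(current_free_mem, minimum_memory_required)
    if model_size <= budget:
        return 0
    return budget


if __name__ == "__main__":
    # 8 GiB free, 2 GiB reserved: a 4 GiB model fits, so no lowvram budget.
    print(plan_load(4 * GiB, 8 * GiB, 2 * GiB))
    # 4 GiB free, 5 GiB reserved: the 40% term keeps the budget above the floor.
    print(plan_load(6 * GiB, 4 * GiB, 5 * GiB) / GiB)
```

The third `max` argument is what changes behavior compared with the removed line: when free memory is below `minimum_memory_required`, the budget no longer collapses straight to the 64 MiB floor.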
@ -459,6 +541,14 @@ def load_models_gpu(models, memory_required=0, force_patch_weights=False, minimu
        cur_loaded_model = loaded_model.model_load(lowvram_model_memory, force_patch_weights=force_patch_weights)
        current_loaded_models.insert(0, loaded_model)
    devs = set(map(lambda a: a.device, models_already_loaded))
    for d in devs:
        if d != torch.device("cpu"):
            free_mem = get_free_memory(d)
            if free_mem > minimum_memory_required:
                use_more_memory(free_mem - minimum_memory_required, models_already_loaded, d)
    return
@ -478,7 +568,9 @@ def loaded_models(only_currently_used=False):
 def cleanup_models(keep_clone_weights_loaded=False):
    to_delete = []
    for i in range(len(current_loaded_models)):
-        if sys.getrefcount(current_loaded_models[i].model) <= 2:
+        #TODO: very fragile function needs improvement
        num_refs = sys.getrefcount(current_loaded_models[i].model)
        if num_refs <= 2:
            if not keep_clone_weights_loaded:
                to_delete = [i] + to_delete
            #TODO: find a less fragile way to do this.
@ -527,6 +619,9 @@ def unet_inital_load_device(parameters, dtype):
    else:
        return cpu_dev
 def maximum_vram_for_weights(device=None):
    return (get_total_memory(device) * 0.88 - minimum_inference_memory())
 def unet_dtype(device=None, model_params=0, supported_dtypes=[torch.float16, torch.bfloat16, torch.float32]):
    if args.bf16_unet:
        return torch.bfloat16
@ -536,12 +631,37 @@ def unet_dtype(device=None, model_params=0, supported_dtypes=[torch.float16, tor
        return torch.float8_e4m3fn
    if args.fp8_e5m2_unet:
        return torch.float8_e5m2
-    if should_use_fp16(device=device, model_params=model_params, manual_cast=True):
-        if torch.float16 in supported_dtypes:
-            return torch.float16
-    if should_use_bf16(device, model_params=model_params, manual_cast=True):
-        if torch.bfloat16 in supported_dtypes:
-            return torch.bfloat16
+
+    fp8_dtype = None
+    try:
+        for dtype in [torch.float8_e4m3fn, torch.float8_e5m2]:
+            if dtype in supported_dtypes:
+                fp8_dtype = dtype
+                break
+    except:
+        pass
+    if fp8_dtype is not None:
+        free_model_memory = maximum_vram_for_weights(device)
+        if model_params * 2 > free_model_memory:
+            return fp8_dtype
+    for dt in supported_dtypes:
+        if dt == torch.float16 and should_use_fp16(device=device, model_params=model_params):
+            if torch.float16 in supported_dtypes:
+                return torch.float16
+        if dt == torch.bfloat16 and should_use_bf16(device, model_params=model_params):
+            if torch.bfloat16 in supported_dtypes:
+                return torch.bfloat16
+    for dt in supported_dtypes:
+        if dt == torch.float16 and should_use_fp16(device=device, model_params=model_params, manual_cast=True):
+            if torch.float16 in supported_dtypes:
+                return torch.float16
+        if dt == torch.bfloat16 and should_use_bf16(device, model_params=model_params, manual_cast=True):
+            if torch.bfloat16 in supported_dtypes:
+                return torch.bfloat16
    return torch.float32
 # None means no manual cast
@ -557,13 +677,14 @@ def unet_manual_cast(weight_dtype, inference_device, supported_dtypes=[torch.flo
    if bf16_supported and weight_dtype == torch.bfloat16:
        return None
-    if fp16_supported and torch.float16 in supported_dtypes:
-        return torch.float16
+    fp16_supported = should_use_fp16(inference_device, prioritize_performance=True)
+    for dt in supported_dtypes:
+        if dt == torch.float16 and fp16_supported:
+            return torch.float16
+        if dt == torch.bfloat16 and bf16_supported:
+            return torch.bfloat16
-    elif bf16_supported and torch.bfloat16 in supported_dtypes:
-        return torch.bfloat16
-    else:
-        return torch.float32
+    return torch.float32
 def text_encoder_offload_device():
    if args.gpu_only:
@ -582,6 +703,20 @@ def text_encoder_device():
    else:
        return torch.device("cpu")
 def text_encoder_initial_device(load_device, offload_device, model_size=0):
    if load_device == offload_device or model_size <= 1024 * 1024 * 1024:
        return offload_device
    if is_device_mps(load_device):
        return offload_device
    mem_l = get_free_memory(load_device)
    mem_o = get_free_memory(offload_device)
    if mem_l > (mem_o * 0.5) and model_size * 1.2 < mem_l:
        return load_device
    else:
        return offload_device
 def text_encoder_dtype(device=None):
    if args.fp8_e4m3fn_text_enc:
        return torch.float8_e4m3fn
@ -758,7 +893,8 @@ def pytorch_attention_flash_attention():
 def force_upcast_attention_dtype():
    upcast = args.force_upcast_attention
    try:
-        if platform.mac_ver()[0] in ['14.5']: #black image bug on OSX Sonoma 14.5
+        macos_version = tuple(int(n) for n in platform.mac_ver()[0].split("."))
        if (14, 5) <= macos_version < (14, 7):  # black image bug on recent versions of MacOS
            upcast = True
    except:
        pass
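Reviewer note on the macOS check in `force_upcast_attention_dtype`: a minimal sketch of the tuple comparison, assuming the version string has the dotted form returned by `platform.mac_ver()[0]`. The function name `needs_attention_upcast` is hypothetical; it only isolates the range test so it can be tried without a Mac.

```python
def needs_attention_upcast(version_string):
    """Return True when the version falls in the half-open range [14.5, 14.7)."""
    try:
        version = tuple(int(n) for n in version_string.split("."))
    except ValueError:
        # Off macOS the version string is empty, and int("") raises ValueError.
        # The diff relies on its surrounding try/except for the same case.
        return False
    # Tuples compare element by element, so (14, 6, 1) sorts between
    # (14, 5) and (14, 7).
    return (14, 5) <= version < (14, 7)


if __name__ == "__main__":
    for v in ["14.4.1", "14.5", "14.6.1", "14.7", ""]:
        print(repr(v), needs_attention_upcast(v))
```

Compared with the removed exact match against `'14.5'`, the range form also covers point releases such as 14.5.1 and 14.6.x.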
@ -854,24 +990,21 @@ def should_use_fp16(device=None, model_params=0, prioritize_performance=True, ma
    if torch.version.hip:
        return True
-    props = torch.cuda.get_device_properties("cuda")
+    props = torch.cuda.get_device_properties(device)
    if props.major >= 8:
        return True
    if props.major < 6:
        return False
-    fp16_works = False
-    #FP16 is confirmed working on a 1080 (GP104) but it's a bit slower than FP32 so it should only be enabled
-    #when the model doesn't actually fit on the card
-    #TODO: actually test if GP106 and others have the same type of behavior
+    #FP16 is confirmed working on a 1080 (GP104) and on latest pytorch actually seems faster than fp32
    nvidia_10_series = ["1080", "1070", "titan x", "p3000", "p3200", "p4000", "p4200", "p5000", "p5200", "p6000", "1060", "1050", "p40", "p100", "p6", "p4"]
    for x in nvidia_10_series:
        if x in props.name.lower():
-            fp16_works = True
+            return True
-    if fp16_works or manual_cast:
+    if manual_cast:
-        free_model_memory = (get_free_memory() * 0.9 - minimum_inference_memory())
+        free_model_memory = maximum_vram_for_weights(device)
        if (not prioritize_performance) or model_params * 4 > free_model_memory:
            return True
@ -910,9 +1043,6 @@ def should_use_bf16(device=None, model_params=0, prioritize_performance=True, ma
    if is_intel_xpu():
        return True
-    if device is None:
-        device = torch.device("cuda")
    props = torch.cuda.get_device_properties(device)
    if props.major >= 8:
        return True
@ -920,12 +1050,22 @@ def should_use_bf16(device=None, model_params=0, prioritize_performance=True, ma
    bf16_works = torch.cuda.is_bf16_supported()
    if bf16_works or manual_cast:
-        free_model_memory = (get_free_memory() * 0.9 - minimum_inference_memory())
+        free_model_memory = maximum_vram_for_weights(device)
        if (not prioritize_performance) or model_params * 4 > free_model_memory:
            return True
    return False
 def supports_fp8_compute(device=None):
    props = torch.cuda.get_device_properties(device)
    if props.major >= 9:
        return True
    if props.major < 8:
        return False
    if props.minor < 9:
        return False
    return True
 def soft_empty_cache(force=False):
    global cpu_state
    if cpu_state == CPUState.MPS:
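Reviewer note on `supports_fp8_compute`: the three branches amount to "CUDA compute capability of at least 8.9". The sketch below restates the check on plain integers so it can be run without a GPU; `fp8_capable` and `fp8_capable_tuple` are hypothetical names, and the `major`/`minor` arguments stand in for the `props.major` and `props.minor` fields read in the diff.

```python
def fp8_capable(major, minor):
    # Same branch structure as the function added in the diff.
    if major >= 9:
        return True
    if major < 8:
        return False
    if minor < 9:
        return False
    return True


def fp8_capable_tuple(major, minor):
    # Equivalent single comparison on (major, minor).
    return (major, minor) >= (8, 9)


if __name__ == "__main__":
    for cc in [(7, 5), (8, 0), (8, 6), (8, 9), (9, 0)]:
        print(cc, fp8_capable(*cc))
```

The tuple form is only a readability suggestion; both versions return the same result for every capability pair.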
--- a/comfy/model_patcher.py
+++ b/comfy/model_patcher.py
@ -1,34 +1,47 @@
 """
    This file is part of ComfyUI.
    Copyright (C) 2024 Comfy
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.
    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <https://www.gnu.org/licenses/>.
 """
 import torch
 import copy
 import inspect
 import logging
 import uuid
 import collections
 import math
 import comfy.utils
 import comfy.float
 import comfy.model_management
 import comfy.lora
 from comfy.types import UnetWrapperFunction
-
-def weight_decompose(dora_scale, weight, lora_diff, alpha, strength):
-    dora_scale = comfy.model_management.cast_to_device(dora_scale, weight.device, torch.float32)
-    lora_diff *= alpha
-    weight_calc = weight + lora_diff.type(weight.dtype)
-    weight_norm = (
-        weight_calc.transpose(0, 1)
-        .reshape(weight_calc.shape[1], -1)
-        .norm(dim=1, keepdim=True)
-        .reshape(weight_calc.shape[1], *[1] * (weight_calc.dim() - 1))
-        .transpose(0, 1)
-    )
-    weight_calc *= (dora_scale / weight_norm).type(weight.dtype)
-    if strength != 1.0:
-        weight_calc -= weight
-        weight += strength * (weight_calc)
-    else:
-        weight[:] = weight_calc
-    return weight
+def string_to_seed(data):
+    crc = 0xFFFFFFFF
+    for byte in data:
+        if isinstance(byte, str):
+            byte = ord(byte)
+        crc ^= byte
+        for _ in range(8):
+            if crc & 1:
+                crc = (crc >> 1) ^ 0xEDB88320
+            else:
+                crc >>= 1
+    return crc ^ 0xFFFFFFFF
 def set_model_options_patch_replace(model_options, patch, name, block_name, number, transformer_index=None):
    to = model_options["transformer_options"].copy()
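Reviewer note on `string_to_seed`: the loop added in this file looks like the standard reflected CRC-32 (initial value 0xFFFFFFFF, polynomial 0xEDB88320, final XOR with 0xFFFFFFFF), so for ASCII keys it should agree with `zlib.crc32`. The sketch below copies the function from the diff and checks that assumption; the sample key is made up for illustration.

```python
import zlib


def string_to_seed(data):
    # Copied from the diff: bitwise CRC-32 over bytes or single characters.
    crc = 0xFFFFFFFF
    for byte in data:
        if isinstance(byte, str):
            byte = ord(byte)
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    key = "diffusion_model.input_blocks.0.0.weight"  # hypothetical weight key
    print(string_to_seed(key) == zlib.crc32(key.encode("ascii")))
```

Because the seed depends only on the key, `stochastic_rounding` in `patch_weight_to_device` gets the same seed for the same weight on every run. For characters outside ASCII the function feeds code points rather than encoded bytes, so agreement with `zlib.crc32` is not expected there.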
@ -63,10 +76,30 @@ def set_model_options_pre_cfg_function(model_options, pre_cfg_function, disable_
        model_options["disable_cfg1_optimization"] = True
    return model_options
 def wipe_lowvram_weight(m):
    if hasattr(m, "prev_comfy_cast_weights"):
        m.comfy_cast_weights = m.prev_comfy_cast_weights
        del m.prev_comfy_cast_weights
    m.weight_function = None
    m.bias_function = None
 class LowVramPatch:
    def __init__(self, key, patches):
        self.key = key
        self.patches = patches
    def __call__(self, weight):
        return comfy.lora.calculate_weight(self.patches[self.key], weight, self.key, intermediate_dtype=weight.dtype)
 class ModelPatcher:
-    def __init__(self, model, load_device, offload_device, size=0, current_device=None, weight_inplace_update=False):
+    def __init__(self, model, load_device, offload_device, size=0, weight_inplace_update=False):
        self.size = size
        self.model = model
        if not hasattr(self.model, 'device'):
            logging.debug("Model doesn't have a device attribute.")
            self.model.device = offload_device
        elif self.model.device is None:
            self.model.device = offload_device
        self.patches = {}
        self.backup = {}
        self.object_patches = {}
@ -75,24 +108,32 @@ class ModelPatcher:
        self.model_size()
        self.load_device = load_device
        self.offload_device = offload_device
-        if current_device is None:
-            self.current_device = self.offload_device
-        else:
-            self.current_device = current_device
        self.weight_inplace_update = weight_inplace_update
-        self.model_lowvram = False
-        self.lowvram_patch_counter = 0
        self.patches_uuid = uuid.uuid4()
+        if not hasattr(self.model, 'model_loaded_weight_memory'):
+            self.model.model_loaded_weight_memory = 0
+        if not hasattr(self.model, 'lowvram_patch_counter'):
+            self.model.lowvram_patch_counter = 0
+        if not hasattr(self.model, 'model_lowvram'):
+            self.model.model_lowvram = False
    def model_size(self):
        if self.size > 0:
            return self.size
        self.size = comfy.model_management.module_size(self.model)
        return self.size
    def loaded_size(self):
        return self.model.model_loaded_weight_memory
    def lowvram_patch_counter(self):
        return self.model.lowvram_patch_counter
    def clone(self):
-        n = ModelPatcher(self.model, self.load_device, self.offload_device, self.size, self.current_device, weight_inplace_update=self.weight_inplace_update)
+        n = ModelPatcher(self.model, self.load_device, self.offload_device, self.size, weight_inplace_update=self.weight_inplace_update)
        n.patches = {}
        for k in self.patches:
            n.patches[k] = self.patches[k][:]
@ -264,67 +305,52 @@ class ModelPatcher:
                    sd.pop(k)
        return sd
-    def patch_weight_to_device(self, key, device_to=None):
+    def patch_weight_to_device(self, key, device_to=None, inplace_update=False):
        if key not in self.patches:
            return
        weight = comfy.utils.get_attr(self.model, key)
-        inplace_update = self.weight_inplace_update
+        inplace_update = self.weight_inplace_update or inplace_update
        if key not in self.backup:
-            self.backup[key] = weight.to(device=self.offload_device, copy=inplace_update)
+            self.backup[key] = collections.namedtuple('Dimension', ['weight', 'inplace_update'])(weight.to(device=self.offload_device, copy=inplace_update), inplace_update)
        if device_to is not None:
            temp_weight = comfy.model_management.cast_to_device(weight, device_to, torch.float32, copy=True)
        else:
            temp_weight = weight.to(torch.float32, copy=True)
-        out_weight = self.calculate_weight(self.patches[key], temp_weight, key).to(weight.dtype)
+        out_weight = comfy.lora.calculate_weight(self.patches[key], temp_weight, key)
        out_weight = comfy.float.stochastic_rounding(out_weight, weight.dtype, seed=string_to_seed(key))
        if inplace_update:
            comfy.utils.copy_to_param(self.model, key, out_weight)
        else:
            comfy.utils.set_attr_param(self.model, key, out_weight)
-    def patch_model(self, device_to=None, patch_weights=True):
+    def load(self, device_to=None, lowvram_model_memory=0, force_patch_weights=False, full_load=False):
        for k in self.object_patches:
            old = comfy.utils.set_attr(self.model, k, self.object_patches[k])
            if k not in self.object_patches_backup:
                self.object_patches_backup[k] = old
        if patch_weights:
            model_sd = self.model_state_dict()
            for key in self.patches:
                if key not in model_sd:
                    logging.warning("could not patch. key doesn't exist in model: {}".format(key))
                    continue
                self.patch_weight_to_device(key, device_to)
            if device_to is not None:
                self.model.to(device_to)
                self.current_device = device_to
        return self.model
    def patch_model_lowvram(self, device_to=None, lowvram_model_memory=0, force_patch_weights=False):
        self.patch_model(device_to, patch_weights=False)
        logging.info("loading in lowvram mode {}".format(lowvram_model_memory/(1024 * 1024)))
        class LowVramPatch:
            def __init__(self, key, model_patcher):
                self.key = key
                self.model_patcher = model_patcher
            def __call__(self, weight):
                return self.model_patcher.calculate_weight(self.model_patcher.patches[self.key], weight, self.key)
        mem_counter = 0
        patch_counter = 0
        lowvram_counter = 0
        loading = []
        for n, m in self.model.named_modules():
            if hasattr(m, "comfy_cast_weights") or hasattr(m, "weight"):
                loading.append((comfy.model_management.module_size(m), n, m))
        load_completely = []
        loading.sort(reverse=True)
        for x in loading:
            n = x[1]
            m = x[2]
            module_mem = x[0]
            lowvram_weight = False
-            if hasattr(m, "comfy_cast_weights"):
+
-                module_mem = comfy.model_management.module_size(m)
+            if not full_load and hasattr(m, "comfy_cast_weights"):
                if mem_counter + module_mem >= lowvram_model_memory:
                    lowvram_weight = True
                    lowvram_counter += 1
                    if hasattr(m, "prev_comfy_cast_weights"): #Already lowvramed
                        continue
            weight_key = "{}.weight".format(n)
            bias_key = "{}.bias".format(n)
@ -334,227 +360,173 @@ class ModelPatcher:
                    if force_patch_weights:
                        self.patch_weight_to_device(weight_key)
                    else:
-                        m.weight_function = LowVramPatch(weight_key, self)
+                        m.weight_function = LowVramPatch(weight_key, self.patches)
                        patch_counter += 1
                if bias_key in self.patches:
                    if force_patch_weights:
                        self.patch_weight_to_device(bias_key)
                    else:
-                        m.bias_function = LowVramPatch(bias_key, self)
+                        m.bias_function = LowVramPatch(bias_key, self.patches)
                        patch_counter += 1
                m.prev_comfy_cast_weights = m.comfy_cast_weights
                m.comfy_cast_weights = True
            else:
                if hasattr(m, "comfy_cast_weights"):
                    if m.comfy_cast_weights:
                        wipe_lowvram_weight(m)
                if hasattr(m, "weight"):
-                    self.patch_weight_to_device(weight_key, device_to)
-                    self.patch_weight_to_device(bias_key, device_to)
-                    m.to(device_to)
-                    mem_counter += comfy.model_management.module_size(m)
-                    logging.debug("lowvram: loaded module regularly {} {}".format(n, m))
+                    mem_counter += module_mem
+                    load_completely.append((module_mem, n, m))
-        self.model_lowvram = True
-        self.lowvram_patch_counter = patch_counter
+        load_completely.sort(reverse=True)
+        for x in load_completely:
            n = x[1]
            m = x[2]
            weight_key = "{}.weight".format(n)
            bias_key = "{}.bias".format(n)
            if hasattr(m, "comfy_patched_weights"):
                if m.comfy_patched_weights == True:
                    continue
            self.patch_weight_to_device(weight_key, device_to=device_to)
            self.patch_weight_to_device(bias_key, device_to=device_to)
            logging.debug("lowvram: loaded module regularly {} {}".format(n, m))
            m.comfy_patched_weights = True
        for x in load_completely:
            x[2].to(device_to)
        if lowvram_counter > 0:
            logging.info("loaded partially {} {} {}".format(lowvram_model_memory / (1024 * 1024), mem_counter / (1024 * 1024), patch_counter))
            self.model.model_lowvram = True
        else:
            logging.info("loaded completely {} {} {}".format(lowvram_model_memory / (1024 * 1024), mem_counter / (1024 * 1024), full_load))
            self.model.model_lowvram = False
            if full_load:
                self.model.to(device_to)
                mem_counter = self.model_size()
        self.model.lowvram_patch_counter += patch_counter
        self.model.device = device_to
        self.model.model_loaded_weight_memory = mem_counter
    def patch_model(self, device_to=None, lowvram_model_memory=0, load_weights=True, force_patch_weights=False):
        for k in self.object_patches:
            old = comfy.utils.set_attr(self.model, k, self.object_patches[k])
            if k not in self.object_patches_backup:
                self.object_patches_backup[k] = old
        if lowvram_model_memory == 0:
            full_load = True
        else:
            full_load = False
        if load_weights:
            self.load(device_to, lowvram_model_memory=lowvram_model_memory, force_patch_weights=force_patch_weights, full_load=full_load)
        return self.model
    def calculate_weight(self, patches, weight, key):
        for p in patches:
            strength = p[0]
            v = p[1]
            strength_model = p[2]
            offset = p[3]
            function = p[4]
            if function is None:
                function = lambda a: a
            old_weight = None
            if offset is not None:
                old_weight = weight
                weight = weight.narrow(offset[0], offset[1], offset[2])
            if strength_model != 1.0:
                weight *= strength_model
            if isinstance(v, list):
                v = (self.calculate_weight(v[1:], v[0].clone(), key), )
            if len(v) == 1:
                patch_type = "diff"
            elif len(v) == 2:
                patch_type = v[0]
                v = v[1]
            if patch_type == "diff":
                w1 = v[0]
                if strength != 0.0:
                    if w1.shape != weight.shape:
                        logging.warning("WARNING SHAPE MISMATCH {} WEIGHT NOT MERGED {} != {}".format(key, w1.shape, weight.shape))
                    else:
                        weight += function(strength * comfy.model_management.cast_to_device(w1, weight.device, weight.dtype))
            elif patch_type == "lora": #lora/locon
                mat1 = comfy.model_management.cast_to_device(v[0], weight.device, torch.float32)
                mat2 = comfy.model_management.cast_to_device(v[1], weight.device, torch.float32)
                dora_scale = v[4]
                if v[2] is not None:
                    alpha = v[2] / mat2.shape[0]
                else:
                    alpha = 1.0
                if v[3] is not None:
                    #locon mid weights, hopefully the math is fine because I didn't properly test it
                    mat3 = comfy.model_management.cast_to_device(v[3], weight.device, torch.float32)
                    final_shape = [mat2.shape[1], mat2.shape[0], mat3.shape[2], mat3.shape[3]]
                    mat2 = torch.mm(mat2.transpose(0, 1).flatten(start_dim=1), mat3.transpose(0, 1).flatten(start_dim=1)).reshape(final_shape).transpose(0, 1)
                try:
                    lora_diff = torch.mm(mat1.flatten(start_dim=1), mat2.flatten(start_dim=1)).reshape(weight.shape)
                    if dora_scale is not None:
                        weight = function(weight_decompose(dora_scale, weight, lora_diff, alpha, strength))
                    else:
                        weight += function(((strength * alpha) * lora_diff).type(weight.dtype))
                except Exception as e:
                    logging.error("ERROR {} {} {}".format(patch_type, key, e))
            elif patch_type == "lokr":
                w1 = v[0]
                w2 = v[1]
                w1_a = v[3]
                w1_b = v[4]
                w2_a = v[5]
                w2_b = v[6]
                t2 = v[7]
                dora_scale = v[8]
                dim = None
                if w1 is None:
                    dim = w1_b.shape[0]
                    w1 = torch.mm(comfy.model_management.cast_to_device(w1_a, weight.device, torch.float32),
                                  comfy.model_management.cast_to_device(w1_b, weight.device, torch.float32))
                else:
                    w1 = comfy.model_management.cast_to_device(w1, weight.device, torch.float32)
                if w2 is None:
                    dim = w2_b.shape[0]
                    if t2 is None:
                        w2 = torch.mm(comfy.model_management.cast_to_device(w2_a, weight.device, torch.float32),
                                      comfy.model_management.cast_to_device(w2_b, weight.device, torch.float32))
                    else:
                        w2 = torch.einsum('i j k l, j r, i p -> p r k l',
                                          comfy.model_management.cast_to_device(t2, weight.device, torch.float32),
                                          comfy.model_management.cast_to_device(w2_b, weight.device, torch.float32),
                                          comfy.model_management.cast_to_device(w2_a, weight.device, torch.float32))
                else:
                    w2 = comfy.model_management.cast_to_device(w2, weight.device, torch.float32)
                if len(w2.shape) == 4:
                    w1 = w1.unsqueeze(2).unsqueeze(2)
                if v[2] is not None and dim is not None:
                    alpha = v[2] / dim
                else:
                    alpha = 1.0
                try:
                    lora_diff = torch.kron(w1, w2).reshape(weight.shape)
                    if dora_scale is not None:
                        weight = function(weight_decompose(dora_scale, weight, lora_diff, alpha, strength))
                    else:
                        weight += function(((strength * alpha) * lora_diff).type(weight.dtype))
                except Exception as e:
                    logging.error("ERROR {} {} {}".format(patch_type, key, e))
            elif patch_type == "loha":
                w1a = v[0]
                w1b = v[1]
                if v[2] is not None:
                    alpha = v[2] / w1b.shape[0]
                else:
                    alpha = 1.0
                w2a = v[3]
                w2b = v[4]
                dora_scale = v[7]
                if v[5] is not None: #cp decomposition
                    t1 = v[5]
                    t2 = v[6]
                    m1 = torch.einsum('i j k l, j r, i p -> p r k l',
                                      comfy.model_management.cast_to_device(t1, weight.device, torch.float32),
                                      comfy.model_management.cast_to_device(w1b, weight.device, torch.float32),
                                      comfy.model_management.cast_to_device(w1a, weight.device, torch.float32))
                    m2 = torch.einsum('i j k l, j r, i p -> p r k l',
                                      comfy.model_management.cast_to_device(t2, weight.device, torch.float32),
                                      comfy.model_management.cast_to_device(w2b, weight.device, torch.float32),
                                      comfy.model_management.cast_to_device(w2a, weight.device, torch.float32))
                else:
                    m1 = torch.mm(comfy.model_management.cast_to_device(w1a, weight.device, torch.float32),
                                  comfy.model_management.cast_to_device(w1b, weight.device, torch.float32))
                    m2 = torch.mm(comfy.model_management.cast_to_device(w2a, weight.device, torch.float32),
                                  comfy.model_management.cast_to_device(w2b, weight.device, torch.float32))
                try:
                    lora_diff = (m1 * m2).reshape(weight.shape)
                    if dora_scale is not None:
                        weight = function(weight_decompose(dora_scale, weight, lora_diff, alpha, strength))
                    else:
                        weight += function(((strength * alpha) * lora_diff).type(weight.dtype))
                except Exception as e:
                    logging.error("ERROR {} {} {}".format(patch_type, key, e))
            elif patch_type == "glora":
                if v[4] is not None:
                    alpha = v[4] / v[0].shape[0]
                else:
                    alpha = 1.0
                dora_scale = v[5]
                a1 = comfy.model_management.cast_to_device(v[0].flatten(start_dim=1), weight.device, torch.float32)
                a2 = comfy.model_management.cast_to_device(v[1].flatten(start_dim=1), weight.device, torch.float32)
                b1 = comfy.model_management.cast_to_device(v[2].flatten(start_dim=1), weight.device, torch.float32)
                b2 = comfy.model_management.cast_to_device(v[3].flatten(start_dim=1), weight.device, torch.float32)
                try:
                    lora_diff = (torch.mm(b2, b1) + torch.mm(torch.mm(weight.flatten(start_dim=1), a2), a1)).reshape(weight.shape)
                    if dora_scale is not None:
                        weight = function(weight_decompose(dora_scale, weight, lora_diff, alpha, strength))
                    else:
                        weight += function(((strength * alpha) * lora_diff).type(weight.dtype))
                except Exception as e:
                    logging.error("ERROR {} {} {}".format(patch_type, key, e))
            else:
                logging.warning("patch type not recognized {} {}".format(patch_type, key))
            if old_weight is not None:
                weight = old_weight
        return weight
    def unpatch_model(self, device_to=None, unpatch_weights=True):
        if unpatch_weights:
-            if self.model_lowvram:
+            if self.model.model_lowvram:
                for m in self.model.modules():
-                    if hasattr(m, "prev_comfy_cast_weights"):
-                        m.comfy_cast_weights = m.prev_comfy_cast_weights
-                        del m.prev_comfy_cast_weights
-                    m.weight_function = None
-                    m.bias_function = None
+                    wipe_lowvram_weight(m)
-                self.model_lowvram = False
-                self.lowvram_patch_counter = 0
+                self.model.model_lowvram = False
+                self.model.lowvram_patch_counter = 0
            keys = list(self.backup.keys())
-            if self.weight_inplace_update:
-                for k in keys:
-                    comfy.utils.copy_to_param(self.model, k, self.backup[k])
-            else:
-                for k in keys:
-                    comfy.utils.set_attr_param(self.model, k, self.backup[k])
+            for k in keys:
+                bk = self.backup[k]
+                if bk.inplace_update:
+                    comfy.utils.copy_to_param(self.model, k, bk.weight)
+                else:
+                    comfy.utils.set_attr_param(self.model, k, bk.weight)
            self.backup.clear()
            if device_to is not None:
                self.model.to(device_to)
-                self.current_device = device_to
+                self.model.device = device_to
            self.model.model_loaded_weight_memory = 0
            for m in self.model.modules():
                if hasattr(m, "comfy_patched_weights"):
                    del m.comfy_patched_weights
        keys = list(self.object_patches_backup.keys())
        for k in keys:
            comfy.utils.set_attr(self.model, k, self.object_patches_backup[k])
        self.object_patches_backup.clear()
    def partially_unload(self, device_to, memory_to_free=0):
        memory_freed = 0
        patch_counter = 0
        unload_list = []
        for n, m in self.model.named_modules():
            shift_lowvram = False
            if hasattr(m, "comfy_cast_weights"):
                module_mem = comfy.model_management.module_size(m)
                unload_list.append((module_mem, n, m))
        unload_list.sort()
        for unload in unload_list:
            if memory_to_free < memory_freed:
                break
            module_mem = unload[0]
            n = unload[1]
            m = unload[2]
            weight_key = "{}.weight".format(n)
            bias_key = "{}.bias".format(n)
            if hasattr(m, "comfy_patched_weights") and m.comfy_patched_weights == True:
                for key in [weight_key, bias_key]:
                    bk = self.backup.get(key, None)
                    if bk is not None:
                        if bk.inplace_update:
                            comfy.utils.copy_to_param(self.model, key, bk.weight)
                        else:
                            comfy.utils.set_attr_param(self.model, key, bk.weight)
                        self.backup.pop(key)
                m.to(device_to)
                if weight_key in self.patches:
                    m.weight_function = LowVramPatch(weight_key, self.patches)
                    patch_counter += 1
                if bias_key in self.patches:
                    m.bias_function = LowVramPatch(bias_key, self.patches)
                    patch_counter += 1
                m.prev_comfy_cast_weights = m.comfy_cast_weights
                m.comfy_cast_weights = True
                m.comfy_patched_weights = False
                memory_freed += module_mem
                logging.debug("freed {}".format(n))
        self.model.model_lowvram = True
        self.model.lowvram_patch_counter += patch_counter
        self.model.model_loaded_weight_memory -= memory_freed
        return memory_freed
    def partially_load(self, device_to, extra_memory=0):
        self.unpatch_model(unpatch_weights=False)
        self.patch_model(load_weights=False)
        full_load = False
        if self.model.model_lowvram == False:
            return 0
        if self.model.model_loaded_weight_memory + extra_memory > self.model_size():
            full_load = True
        current_used = self.model.model_loaded_weight_memory
        self.load(device_to, lowvram_model_memory=current_used + extra_memory, full_load=full_load)
        return self.model.model_loaded_weight_memory - current_used
    def current_loaded_device(self):
        return self.model.device
    def calculate_weight(self, patches, weight, key, intermediate_dtype=torch.float32):
        print("WARNING the ModelPatcher.calculate_weight function is deprecated, please use: comfy.lora.calculate_weight instead")
        return comfy.lora.calculate_weight(patches, weight, key, intermediate_dtype=intermediate_dtype)
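Review note: `partially_unload` above sorts the candidate modules by size and offloads them smallest-first until the freed total exceeds `memory_to_free`. The selection logic alone can be sketched in pure Python. This is a minimal sketch, not the ComfyUI implementation: `pick_modules_to_unload` and the `(size, name)` pairs are hypothetical stand-ins for the `(module_mem, n, m)` tuples, and the `comfy_patched_weights` check is left out.

```python
def pick_modules_to_unload(module_sizes, memory_to_free):
    """Choose modules smallest-first until more than memory_to_free is freed.

    module_sizes: iterable of (size_in_bytes, name) pairs. These are a
    hypothetical stand-in for the (module_mem, n, m) tuples in the diff.
    """
    unload_list = sorted(module_sizes)  # ascending by size, ties broken by name
    freed = 0
    names = []
    for size, name in unload_list:
        # Same strict comparison as the diff: stop once freed exceeds the target.
        if memory_to_free < freed:
            break
        names.append(name)
        freed += size
    return names, freed


names, freed = pick_modules_to_unload([(300, "c"), (100, "a"), (200, "b")], 250)
print(names, freed)
```

Because the comparison is strict, the sketch always selects at least one module, even when `memory_to_free` is 0, and it can free more than was requested.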
--- a/comfy/ops.py
+++ b/comfy/ops.py
@ -18,29 +18,42 @@
 import torch
 import comfy.model_management
 from comfy.cli_args import args
-def cast_to(weight, dtype=None, device=None, non_blocking=False):
-    return weight.to(device=device, dtype=dtype, non_blocking=non_blocking)
+def cast_to(weight, dtype=None, device=None, non_blocking=False, copy=False):
+    if device is None or weight.device == device:
+        if not copy:
+            if dtype is None or weight.dtype == dtype:
+                return weight
+        return weight.to(dtype=dtype, copy=copy)
+    r = torch.empty_like(weight, dtype=dtype, device=device)
+    r.copy_(weight, non_blocking=non_blocking)
+    return r
-def cast_to_input(weight, input, non_blocking=False):
-    return cast_to(weight, input.dtype, input.device, non_blocking=non_blocking)
+def cast_to_input(weight, input, non_blocking=False, copy=True):
+    return cast_to(weight, input.dtype, input.device, non_blocking=non_blocking, copy=copy)
-def cast_bias_weight(s, input=None, dtype=None, device=None):
+def cast_bias_weight(s, input=None, dtype=None, device=None, bias_dtype=None):
    if input is not None:
        if dtype is None:
            dtype = input.dtype
+        if bias_dtype is None:
+            bias_dtype = dtype
        if device is None:
            device = input.device
    bias = None
-    non_blocking = comfy.model_management.device_should_use_non_blocking(device)
+    non_blocking = comfy.model_management.device_supports_non_blocking(device)
    if s.bias is not None:
-        bias = cast_to(s.bias, dtype, device, non_blocking=non_blocking)
-        if s.bias_function is not None:
+        has_function = s.bias_function is not None
+        bias = cast_to(s.bias, bias_dtype, device, non_blocking=non_blocking, copy=has_function)
+        if has_function:
            bias = s.bias_function(bias)
-    weight = cast_to(s.weight, dtype, device, non_blocking=non_blocking)
-    if s.weight_function is not None:
+    has_function = s.weight_function is not None
+    weight = cast_to(s.weight, dtype, device, non_blocking=non_blocking, copy=has_function)
+    if has_function:
        weight = s.weight_function(weight)
    return weight, bias
@ -238,3 +251,59 @@ class manual_cast(disable_weight_init):
    class Embedding(disable_weight_init.Embedding):
        comfy_cast_weights = True
+def fp8_linear(self, input):
+    dtype = self.weight.dtype
+    if dtype not in [torch.float8_e4m3fn]:
+        return None
+    if len(input.shape) == 3:
+        inn = input.reshape(-1, input.shape[2]).to(dtype)
+        non_blocking = comfy.model_management.device_supports_non_blocking(input.device)
+        w, bias = cast_bias_weight(self, input, dtype=dtype, bias_dtype=input.dtype)
+        w = w.t()
+        scale_weight = self.scale_weight
+        scale_input = self.scale_input
+        if scale_weight is None:
+            scale_weight = torch.ones((1), device=input.device, dtype=torch.float32)
+            if scale_input is None:
+                scale_input = scale_weight
+        if scale_input is None:
+            scale_input = torch.ones((1), device=input.device, dtype=torch.float32)
+        if bias is not None:
+            o = torch._scaled_mm(inn, w, out_dtype=input.dtype, bias=bias, scale_a=scale_input, scale_b=scale_weight)
+        else:
+            o = torch._scaled_mm(inn, w, out_dtype=input.dtype, scale_a=scale_input, scale_b=scale_weight)
+        if isinstance(o, tuple):
+            o = o[0]
+        return o.reshape((-1, input.shape[1], self.weight.shape[0]))
+    return None
+class fp8_ops(manual_cast):
+    class Linear(manual_cast.Linear):
+        def reset_parameters(self):
+            self.scale_weight = None
+            self.scale_input = None
+            return None
+        def forward_comfy_cast_weights(self, input):
+            out = fp8_linear(self, input)
+            if out is not None:
+                return out
+            weight, bias = cast_bias_weight(self, input)
+            return torch.nn.functional.linear(input, weight, bias)
+def pick_operations(weight_dtype, compute_dtype, load_device=None):
+    if compute_dtype is None or weight_dtype == compute_dtype:
+        return disable_weight_init
+    if args.fast:
+        if comfy.model_management.supports_fp8_compute(load_device):
+            return fp8_ops
+    return manual_cast
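Review note: the three-way choice made by `pick_operations` can be read as a small decision table. The sketch below mirrors only the branching and returns names as strings. It assumes hypothetical boolean parameters: `fast` stands in for `args.fast` and `fp8_supported` for the result of `comfy.model_management.supports_fp8_compute(load_device)`.

```python
def pick_operations_sketch(weight_dtype, compute_dtype, fast=False, fp8_supported=False):
    """Return the name of the ops class the diff's pick_operations would choose.

    `fast` and `fp8_supported` are illustrative stand-ins for args.fast and
    supports_fp8_compute(load_device); dtypes are plain strings here.
    """
    # No casting needed when there is no compute dtype or it matches the weights.
    if compute_dtype is None or weight_dtype == compute_dtype:
        return "disable_weight_init"
    # The fp8 path needs both the opt-in flag and device support.
    if fast and fp8_supported:
        return "fp8_ops"
    return "manual_cast"


print(pick_operations_sketch("float8_e4m3fn", "bfloat16", fast=True, fp8_supported=True))
```

Note that the fp8 path is opt-in twice over: without the flag, or on a device without fp8 compute, the function falls back to `manual_cast`.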
--- a/comfy/samplers.py
+++ b/comfy/samplers.py
@ -171,7 +171,7 @@ def calc_cond_batch(model, conds, x_in, timestep, model_options):
        for i in range(1, len(to_batch_temp) + 1):
            batch_amount = to_batch_temp[:len(to_batch_temp)//i]
            input_shape = [len(batch_amount) * first_shape[0]] + list(first_shape)[1:]
-            if model.memory_required(input_shape) < free_memory:
+            if model.memory_required(input_shape) * 1.5 < free_memory:
                to_batch = batch_amount
                break
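Review note: the `* 1.5` factor makes the batching loop more conservative, so a batch is accepted only if its estimated memory leaves 50% headroom. The effect can be reproduced with a self-contained sketch. Everything outside the loop body is an assumption made for illustration: `choose_batch`, the `headroom` parameter, the single-item fallback, and the toy `memory_required` callable are not taken from the diff.

```python
from math import prod


def choose_batch(to_batch_temp, first_shape, memory_required, free_memory, headroom=1.5):
    """Pick the largest prefix of to_batch_temp whose padded estimate fits.

    memory_required is a hypothetical callable taking an input shape;
    the fallback of a single item is assumed for this sketch.
    """
    to_batch = to_batch_temp[:1]
    for i in range(1, len(to_batch_temp) + 1):
        # Try all items, then half, then a third, and so on.
        batch_amount = to_batch_temp[:len(to_batch_temp) // i]
        input_shape = [len(batch_amount) * first_shape[0]] + list(first_shape)[1:]
        if memory_required(input_shape) * headroom < free_memory:
            to_batch = batch_amount
            break
    return to_batch


# Toy estimate: one unit of memory per tensor element.
print(choose_batch([0, 1, 2, 3], (1, 4, 8, 8), prod, free_memory=600))
print(choose_batch([0, 1, 2, 3], (1, 4, 8, 8), prod, free_memory=600, headroom=1.0))
```

With the toy estimate, the old behaviour (`headroom=1.0`) batches two items, while the new factor drops to one for the same amount of free memory.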
--- a/comfy/sd.py
+++ b/comfy/sd.py
@ -24,6 +24,7 @@ import comfy.text_encoders.sa_t5
 import comfy.text_encoders.aura_t5
 import comfy.text_encoders.hydit
 import comfy.text_encoders.flux
+import comfy.text_encoders.long_clipl
 import comfy.model_patcher
 import comfy.lora
@ -62,7 +63,7 @@ def load_lora_for_models(model, clip, lora, strength_model, strength_clip):
 class CLIP:
-    def __init__(self, target=None, embedding_directory=None, no_init=False, tokenizer_data={}):
+    def __init__(self, target=None, embedding_directory=None, no_init=False, tokenizer_data={}, parameters=0, model_options={}):
        if no_init:
            return
        params = target.params.copy()
@ -71,20 +72,29 @@ class CLIP:
        load_device = model_management.text_encoder_device()
        offload_device = model_management.text_encoder_offload_device()
-        params['device'] = offload_device
-        dtype = model_management.text_encoder_dtype(load_device)
+        dtype = model_options.get("dtype", None)
+        if dtype is None:
+            dtype = model_management.text_encoder_dtype(load_device)
        params['dtype'] = dtype
+        params['device'] = model_management.text_encoder_initial_device(load_device, offload_device, parameters * model_management.dtype_size(dtype))
+        params['model_options'] = model_options
        self.cond_stage_model = clip(**(params))
        for dt in self.cond_stage_model.dtypes:
            if not model_management.supports_cast(load_device, dt):
                load_device = offload_device
+                if params['device'] != offload_device:
+                    self.cond_stage_model.to(offload_device)
+                    logging.warning("Had to shift TE back.")
        self.tokenizer = tokenizer(embedding_directory=embedding_directory, tokenizer_data=tokenizer_data)
        self.patcher = comfy.model_patcher.ModelPatcher(self.cond_stage_model, load_device=load_device, offload_device=offload_device)
+        if params['device'] == load_device:
+            model_management.load_models_gpu([self.patcher], force_full_load=True)
        self.layer_idx = None
-        logging.debug("CLIP model load device: {}, offload device: {}".format(load_device, offload_device))
+        logging.debug("CLIP model load device: {}, offload device: {}, current: {}".format(load_device, offload_device, params['device']))
    def clone(self):
        n = CLIP(no_init=True)
@ -390,11 +400,14 @@ class CLIPType(Enum):
    HUNYUAN_DIT = 5
    FLUX = 6
-def load_clip(ckpt_paths, embedding_directory=None, clip_type=CLIPType.STABLE_DIFFUSION):
+def load_clip(ckpt_paths, embedding_directory=None, clip_type=CLIPType.STABLE_DIFFUSION, model_options={}):
    clip_data = []
    for p in ckpt_paths:
        clip_data.append(comfy.utils.load_torch_file(p, safe_load=True))
+    return load_text_encoder_state_dicts(clip_data, embedding_directory=embedding_directory, clip_type=clip_type, model_options=model_options)
+def load_text_encoder_state_dicts(state_dicts=[], embedding_directory=None, clip_type=CLIPType.STABLE_DIFFUSION, model_options={}):
+    clip_data = state_dicts
    class EmptyClass:
        pass
@ -431,8 +444,13 @@ def load_clip(ckpt_paths, embedding_directory=None, clip_type=CLIPType.STABLE_DI
            clip_target.clip = comfy.text_encoders.sa_t5.SAT5Model
            clip_target.tokenizer = comfy.text_encoders.sa_t5.SAT5Tokenizer
        else:
-            clip_target.clip = sd1_clip.SD1ClipModel
-            clip_target.tokenizer = sd1_clip.SD1Tokenizer
+            w = clip_data[0].get("text_model.embeddings.position_embedding.weight", None)
+            if w is not None and w.shape[0] == 248:
+                clip_target.clip = comfy.text_encoders.long_clipl.LongClipModel
+                clip_target.tokenizer = comfy.text_encoders.long_clipl.LongClipTokenizer
+            else:
+                clip_target.clip = sd1_clip.SD1ClipModel
+                clip_target.tokenizer = sd1_clip.SD1Tokenizer
    elif len(clip_data) == 2:
        if clip_type == CLIPType.SD3:
            clip_target.clip = comfy.text_encoders.sd3_clip.sd3_clip(clip_l=True, clip_g=True, t5=False)
@ -456,7 +474,11 @@ def load_clip(ckpt_paths, embedding_directory=None, clip_type=CLIPType.STABLE_DI
        clip_target.clip = comfy.text_encoders.sd3_clip.SD3ClipModel
        clip_target.tokenizer = comfy.text_encoders.sd3_clip.SD3Tokenizer
-    clip = CLIP(clip_target, embedding_directory=embedding_directory)
+    parameters = 0
+    for c in clip_data:
+        parameters += comfy.utils.calculate_parameters(c)
+    clip = CLIP(clip_target, embedding_directory=embedding_directory, parameters=parameters, model_options=model_options)
    for c in clip_data:
        m, u = clip.load_sd(c)
        if len(m) > 0:
@ -498,25 +520,39 @@ def load_checkpoint(config_path=None, ckpt_path=None, output_vae=True, output_cl
    return (model, clip, vae)
-def load_checkpoint_guess_config(ckpt_path, output_vae=True, output_clip=True, output_clipvision=False, embedding_directory=None, output_model=True):
+def load_checkpoint_guess_config(ckpt_path, output_vae=True, output_clip=True, output_clipvision=False, embedding_directory=None, output_model=True, model_options={}, te_model_options={}):
    sd = comfy.utils.load_torch_file(ckpt_path)
-    sd_keys = sd.keys()
+    out = load_state_dict_guess_config(sd, output_vae, output_clip, output_clipvision, embedding_directory, output_model, model_options, te_model_options=te_model_options)
+    if out is None:
+        raise RuntimeError("ERROR: Could not detect model type of: {}".format(ckpt_path))
+    return out
+def load_state_dict_guess_config(sd, output_vae=True, output_clip=True, output_clipvision=False, embedding_directory=None, output_model=True, model_options={}, te_model_options={}):
    clip = None
    clipvision = None
    vae = None
    model = None
    model_patcher = None
    clip_target = None
    diffusion_model_prefix = model_detection.unet_prefix_from_state_dict(sd)
    parameters = comfy.utils.calculate_parameters(sd, diffusion_model_prefix)
+    weight_dtype = comfy.utils.weight_dtype(sd, diffusion_model_prefix)
    load_device = model_management.get_torch_device()
    model_config = model_detection.model_config_from_unet(sd, diffusion_model_prefix)
    if model_config is None:
-        raise RuntimeError("ERROR: Could not detect model type of: {}".format(ckpt_path))
+        return None
-    unet_dtype = model_management.unet_dtype(model_params=parameters, supported_dtypes=model_config.supported_inference_dtypes)
+    unet_weight_dtype = list(model_config.supported_inference_dtypes)
+    if weight_dtype is not None:
+        unet_weight_dtype.append(weight_dtype)
+    model_config.custom_operations = model_options.get("custom_operations", None)
+    unet_dtype = model_options.get("weight_dtype", None)
+    if unet_dtype is None:
+        unet_dtype = model_management.unet_dtype(model_params=parameters, supported_dtypes=unet_weight_dtype)
    manual_cast_dtype = model_management.unet_manual_cast(unet_dtype, load_device, model_config.supported_inference_dtypes)
    model_config.set_inference_dtype(unet_dtype, manual_cast_dtype)
@ -540,7 +576,8 @@ def load_checkpoint_guess_config(ckpt_path, output_vae=True, output_clip=True, o
        if clip_target is not None:
            clip_sd = model_config.process_clip_state_dict(sd)
            if len(clip_sd) > 0:
-                clip = CLIP(clip_target, embedding_directory=embedding_directory, tokenizer_data=clip_sd)
+                parameters = comfy.utils.calculate_parameters(clip_sd)
+                clip = CLIP(clip_target, embedding_directory=embedding_directory, tokenizer_data=clip_sd, parameters=parameters, model_options=te_model_options)
                m, u = clip.load_sd(clip_sd, full_model=True)
                if len(m) > 0:
                    m_filter = list(filter(lambda a: ".logit_scale" not in a and ".transformer.text_projection.weight" not in a, m))
@ -559,15 +596,16 @@ def load_checkpoint_guess_config(ckpt_path, output_vae=True, output_clip=True, o
        logging.debug("left over keys: {}".format(left_over))
    if output_model:
-        model_patcher = comfy.model_patcher.ModelPatcher(model, load_device=load_device, offload_device=model_management.unet_offload_device(), current_device=inital_load_device)
+        model_patcher = comfy.model_patcher.ModelPatcher(model, load_device=load_device, offload_device=model_management.unet_offload_device())
        if inital_load_device != torch.device("cpu"):
            logging.info("loaded straight to GPU")
-            model_management.load_model_gpu(model_patcher)
+            model_management.load_models_gpu([model_patcher], force_full_load=True)
    return (model_patcher, clip, vae, clipvision)
-def load_unet_state_dict(sd, dtype=None): #load unet in diffusers or regular format
+def load_diffusion_model_state_dict(sd, model_options={}): #load unet in diffusers or regular format
+    dtype = model_options.get("dtype", None)
    #Allow loading unets from checkpoint files
    diffusion_model_prefix = model_detection.unet_prefix_from_state_dict(sd)
@ -609,6 +647,7 @@ def load_unet_state_dict(sd, dtype=None): #load unet in diffusers or regular for
    manual_cast_dtype = model_management.unet_manual_cast(unet_dtype, load_device, model_config.supported_inference_dtypes)
    model_config.set_inference_dtype(unet_dtype, manual_cast_dtype)
    model_config.custom_operations = model_options.get("custom_operations", None)
    model = model_config.get_model(new_sd, "")
    model = model.to(offload_device)
    model.load_model_weights(new_sd, "")
@ -617,24 +656,36 @@ def load_unet_state_dict(sd, dtype=None): #load unet in diffusers or regular for
        logging.info("left over keys in unet: {}".format(left_over))
    return comfy.model_patcher.ModelPatcher(model, load_device=load_device, offload_device=offload_device)
-def load_unet(unet_path, dtype=None):
+def load_diffusion_model(unet_path, model_options={}):
    sd = comfy.utils.load_torch_file(unet_path)
-    model = load_unet_state_dict(sd, dtype=dtype)
+    model = load_diffusion_model_state_dict(sd, model_options=model_options)
    if model is None:
        logging.error("ERROR UNSUPPORTED UNET {}".format(unet_path))
        raise RuntimeError("ERROR: Could not detect model type of: {}".format(unet_path))
    return model
+def load_unet(unet_path, dtype=None):
+    print("WARNING: the load_unet function has been deprecated and will be removed please switch to: load_diffusion_model")
+    return load_diffusion_model(unet_path, model_options={"dtype": dtype})
+def load_unet_state_dict(sd, dtype=None):
+    print("WARNING: the load_unet_state_dict function has been deprecated and will be removed please switch to: load_diffusion_model_state_dict")
+    return load_diffusion_model_state_dict(sd, model_options={"dtype": dtype})
 def save_checkpoint(output_path, model, clip=None, vae=None, clip_vision=None, metadata=None, extra_keys={}):
    clip_sd = None
    load_models = [model]
    if clip is not None:
        load_models.append(clip.load_model())
        clip_sd = clip.get_sd()
+    vae_sd = None
+    if vae is not None:
+        vae_sd = vae.get_sd()
    model_management.load_models_gpu(load_models, force_patch_weights=True)
    clip_vision_sd = clip_vision.get_sd() if clip_vision is not None else None
-    sd = model.model.state_dict_for_saving(clip_sd, vae.get_sd(), clip_vision_sd)
+    sd = model.model.state_dict_for_saving(clip_sd, vae_sd, clip_vision_sd)
    for k in extra_keys:
        sd[k] = extra_keys[k]
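Review note: `load_unet` and `load_unet_state_dict` are kept in this file as thin wrappers that forward the old `dtype` argument through the new `model_options` dict. A minimal sketch of that compatibility pattern follows. It uses hypothetical `*_sketch` names and a dummy return value, and it swaps the diff's `print` for `warnings.warn`, which is a substitution made here rather than what the commit does.

```python
import warnings


def load_diffusion_model_sketch(unet_path, model_options=None):
    """Hypothetical stand-in for the new entry point; returns a plain dict."""
    model_options = model_options or {}
    return {"path": unet_path, "dtype": model_options.get("dtype")}


def load_unet_sketch(unet_path, dtype=None):
    """Deprecated wrapper: keeps the old signature and forwards to the new one."""
    warnings.warn(
        "load_unet is deprecated; switch to load_diffusion_model",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this wrapper
    )
    return load_diffusion_model_sketch(unet_path, model_options={"dtype": dtype})


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = load_unet_sketch("model.safetensors", dtype="fp16")
print(result, len(caught))
```

Old call sites keep working unchanged, while new options such as `custom_operations` are only reachable through `model_options`.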
--- a/comfy/sd1_clip.py
+++ b/comfy/sd1_clip.py
@ -75,7 +75,6 @@ class ClipTokenWeightEncoder:
        return r
 class SDClipModel(torch.nn.Module, ClipTokenWeightEncoder):
    """Uses the CLIP transformer encoder for text (from huggingface)"""
    LAYERS = [
        "last",
        "pooled",
@ -84,7 +83,7 @@ class SDClipModel(torch.nn.Module, ClipTokenWeightEncoder):
    def __init__(self, version="openai/clip-vit-large-patch14", device="cpu", max_length=77,
                 freeze=True, layer="last", layer_idx=None, textmodel_json_config=None, dtype=None, model_class=comfy.clip_model.CLIPTextModel,
                 special_tokens={"start": 49406, "end": 49407, "pad": 49407}, layer_norm_hidden_state=True, enable_attention_masks=False, zero_out_masked=False,
-                 return_projected_pooled=True, return_attention_masks=False):  # clip-vit-base-patch32
+                 return_projected_pooled=True, return_attention_masks=False, model_options={}):  # clip-vit-base-patch32
        super().__init__()
        assert layer in self.LAYERS
@ -94,7 +93,11 @@ class SDClipModel(torch.nn.Module, ClipTokenWeightEncoder):
        with open(textmodel_json_config) as f:
            config = json.load(f)
-        self.operations = comfy.ops.manual_cast
+        operations = model_options.get("custom_operations", None)
+        if operations is None:
+            operations = comfy.ops.manual_cast
+        self.operations = operations
        self.transformer = model_class(config, dtype, device, self.operations)
        self.num_layers = self.transformer.num_layers
@ -313,6 +316,17 @@ def expand_directory_list(directories):
            dirs.add(root)
    return list(dirs)
+def bundled_embed(embed, prefix, suffix): #bundled embedding in lora format
+    i = 0
+    out_list = []
+    for k in embed:
+        if k.startswith(prefix) and k.endswith(suffix):
+            out_list.append(embed[k])
+    if len(out_list) == 0:
+        return None
+    return torch.cat(out_list, dim=0)
 def load_embed(embedding_name, embedding_directory, embedding_size, embed_key=None):
    if isinstance(embedding_directory, str):
        embedding_directory = [embedding_directory]
@ -379,8 +393,12 @@ def load_embed(embedding_name, embedding_directory, embedding_size, embed_key=No
        elif embed_key is not None and embed_key in embed:
            embed_out = embed[embed_key]
        else:
-            values = embed.values()
-            embed_out = next(iter(values))
+            embed_out = bundled_embed(embed, 'bundle_emb.', '.string_to_param.*')
+            if embed_out is None:
+                embed_out = bundled_embed(embed, 'bundle_emb.', '.{}'.format(embed_key))
+            if embed_out is None:
+                values = embed.values()
+                embed_out = next(iter(values))
    return embed_out
 class SDTokenizer:
@ -537,8 +555,12 @@ class SD1Tokenizer:
    def state_dict(self):
        return {}
+class SD1CheckpointClipModel(SDClipModel):
+    def __init__(self, device="cpu", dtype=None, model_options={}):
+        super().__init__(device=device, return_projected_pooled=False, dtype=dtype, model_options=model_options)
 class SD1ClipModel(torch.nn.Module):
-    def __init__(self, device="cpu", dtype=None, clip_name="l", clip_model=SDClipModel, name=None, **kwargs):
+    def __init__(self, device="cpu", dtype=None, model_options={}, clip_name="l", clip_model=SD1CheckpointClipModel, name=None, **kwargs):
        super().__init__()
        if name is not None:
@ -548,7 +570,7 @@ class SD1ClipModel(torch.nn.Module):
            self.clip_name = clip_name
            self.clip = "clip_{}".format(self.clip_name)
-        setattr(self, self.clip, clip_model(device=device, dtype=dtype, **kwargs))
+        setattr(self, self.clip, clip_model(device=device, dtype=dtype, model_options=model_options, **kwargs))
        self.dtypes = set()
        if dtype is not None:
--- a/comfy/sdxl_clip.py
+++ b/comfy/sdxl_clip.py
@ -3,14 +3,14 @@ import torch
 import os
 class SDXLClipG(sd1_clip.SDClipModel):
-    def __init__(self, device="cpu", max_length=77, freeze=True, layer="penultimate", layer_idx=None, dtype=None):
+    def __init__(self, device="cpu", max_length=77, freeze=True, layer="penultimate", layer_idx=None, dtype=None, model_options={}):
        if layer == "penultimate":
            layer="hidden"
            layer_idx=-2
        textmodel_json_config = os.path.join(os.path.dirname(os.path.realpath(__file__)), "clip_config_bigg.json")
        super().__init__(device=device, freeze=freeze, layer=layer, layer_idx=layer_idx, textmodel_json_config=textmodel_json_config, dtype=dtype,
-                         special_tokens={"start": 49406, "end": 49407, "pad": 0}, layer_norm_hidden_state=False)
+                         special_tokens={"start": 49406, "end": 49407, "pad": 0}, layer_norm_hidden_state=False, return_projected_pooled=True, model_options=model_options)
    def load_sd(self, sd):
        return super().load_sd(sd)
@ -38,10 +38,10 @@ class SDXLTokenizer:
        return {}
 class SDXLClipModel(torch.nn.Module):
-    def __init__(self, device="cpu", dtype=None):
+    def __init__(self, device="cpu", dtype=None, model_options={}):
        super().__init__()
-        self.clip_l = sd1_clip.SDClipModel(layer="hidden", layer_idx=-2, device=device, dtype=dtype, layer_norm_hidden_state=False)
+        self.clip_l = sd1_clip.SDClipModel(layer="hidden", layer_idx=-2, device=device, dtype=dtype, layer_norm_hidden_state=False, model_options=model_options)
-        self.clip_g = SDXLClipG(device=device, dtype=dtype)
+        self.clip_g = SDXLClipG(device=device, dtype=dtype, model_options=model_options)
        self.dtypes = set([dtype])
    def set_clip_options(self, options):
@ -66,8 +66,8 @@ class SDXLClipModel(torch.nn.Module):
            return self.clip_l.load_sd(sd)
 class SDXLRefinerClipModel(sd1_clip.SD1ClipModel):
-    def __init__(self, device="cpu", dtype=None):
+    def __init__(self, device="cpu", dtype=None, model_options={}):
-        super().__init__(device=device, dtype=dtype, clip_name="g", clip_model=SDXLClipG)
+        super().__init__(device=device, dtype=dtype, clip_name="g", clip_model=SDXLClipG, model_options=model_options)
 class StableCascadeClipGTokenizer(sd1_clip.SDTokenizer):
@ -79,14 +79,14 @@ class StableCascadeTokenizer(sd1_clip.SD1Tokenizer):
        super().__init__(embedding_directory=embedding_directory, tokenizer_data=tokenizer_data, clip_name="g", tokenizer=StableCascadeClipGTokenizer)
 class StableCascadeClipG(sd1_clip.SDClipModel):
-    def __init__(self, device="cpu", max_length=77, freeze=True, layer="hidden", layer_idx=-1, dtype=None):
+    def __init__(self, device="cpu", max_length=77, freeze=True, layer="hidden", layer_idx=-1, dtype=None, model_options={}):
        textmodel_json_config = os.path.join(os.path.dirname(os.path.realpath(__file__)), "clip_config_bigg.json")
        super().__init__(device=device, freeze=freeze, layer=layer, layer_idx=layer_idx, textmodel_json_config=textmodel_json_config, dtype=dtype,
-                         special_tokens={"start": 49406, "end": 49407, "pad": 49407}, layer_norm_hidden_state=False, enable_attention_masks=True)
+                         special_tokens={"start": 49406, "end": 49407, "pad": 49407}, layer_norm_hidden_state=False, enable_attention_masks=True, return_projected_pooled=True, model_options=model_options)
    def load_sd(self, sd):
        return super().load_sd(sd)
 class StableCascadeClipModel(sd1_clip.SD1ClipModel):
-    def __init__(self, device="cpu", dtype=None):
+    def __init__(self, device="cpu", dtype=None, model_options={}):
-        super().__init__(device=device, dtype=dtype, clip_name="g", clip_model=StableCascadeClipG)
+        super().__init__(device=device, dtype=dtype, clip_name="g", clip_model=StableCascadeClipG, model_options=model_options)
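Review note on the `model_options={}` parameter threaded through the constructors above: a dict default is created once and shared by every call that omits the argument. That is harmless as long as callees only read the dict, which appears to be the case in the hunks shown here. A minimal sketch of a more defensive variant follows; `Inner`, `Outer`, and the `example_key` option are hypothetical stand-ins, not ComfyUI names.

```python
# Hypothetical stand-ins for a wrapped text encoder and its container;
# these are NOT ComfyUI classes.
class Inner:
    def __init__(self, device="cpu", dtype=None, model_options=None):
        # Copy the options so later mutation cannot leak back to the
        # caller or across instances.
        self.model_options = dict(model_options or {})
        self.device = device
        self.dtype = dtype


class Outer:
    def __init__(self, device="cpu", dtype=None, model_options=None):
        # Forward the same options to each wrapped encoder, mirroring
        # how the diff passes model_options to clip_l and clip_g.
        self.clip_l = Inner(device=device, dtype=dtype, model_options=model_options)
        self.clip_g = Inner(device=device, dtype=dtype, model_options=model_options)


opts = {"example_key": 1}  # illustrative option name only
model = Outer(model_options=opts)
model.clip_l.model_options["extra"] = True  # mutate one encoder's copy
```

With the copy in place, mutating one encoder's options leaves both the caller's dict and the sibling encoder untouched.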
--- a/comfy/supported_models.py
+++ b/comfy/supported_models.py
@ -31,6 +31,7 @@ class SD15(supported_models_base.BASE):
    }
    latent_format = latent_formats.SD15
    memory_usage_factor = 1.0
    def process_clip_state_dict(self, state_dict):
        k = list(state_dict.keys())
@ -77,6 +78,7 @@ class SD20(supported_models_base.BASE):
    }
    latent_format = latent_formats.SD15
    memory_usage_factor = 1.0
    def model_type(self, state_dict, prefix=""):
        if self.unet_config["in_channels"] == 4: #SD2.0 inpainting models are not v prediction
@ -140,6 +142,7 @@ class SDXLRefiner(supported_models_base.BASE):
    }
    latent_format = latent_formats.SDXL
    memory_usage_factor = 1.0
    def get_model(self, state_dict, prefix="", device=None):
        return model_base.SDXLRefiner(self, device=device)
@ -178,6 +181,8 @@ class SDXL(supported_models_base.BASE):
    latent_format = latent_formats.SDXL
    memory_usage_factor = 0.8
    def model_type(self, state_dict, prefix=""):
        if 'edm_mean' in state_dict and 'edm_std' in state_dict: #Playground V2.5
            self.latent_format = latent_formats.SDXL_Playground_2_5()
@ -505,6 +510,9 @@ class SD3(supported_models_base.BASE):
    unet_extra_config = {}
    latent_format = latent_formats.SD3
    memory_usage_factor = 1.2
    text_encoder_key_prefix = ["text_encoders."]
    def get_model(self, state_dict, prefix="", device=None):
@ -631,7 +639,10 @@ class Flux(supported_models_base.BASE):
    unet_extra_config = {}
    latent_format = latent_formats.Flux
-    supported_inference_dtypes = [torch.bfloat16, torch.float32]
+
+    memory_usage_factor = 2.8
+    supported_inference_dtypes = [torch.bfloat16, torch.float16, torch.float32]
    vae_key_prefix = ["vae."]
    text_encoder_key_prefix = ["text_encoders."]
@ -641,7 +652,12 @@ class Flux(supported_models_base.BASE):
        return out
    def clip_target(self, state_dict={}):
-        return supported_models_base.ClipTarget(comfy.text_encoders.flux.FluxTokenizer, comfy.text_encoders.flux.FluxClipModel)
+        pref = self.text_encoder_key_prefix[0]
+        t5_key = "{}t5xxl.transformer.encoder.final_layer_norm.weight".format(pref)
+        dtype_t5 = None
+        if t5_key in state_dict:
+            dtype_t5 = state_dict[t5_key].dtype
+        return supported_models_base.ClipTarget(comfy.text_encoders.flux.FluxTokenizer, comfy.text_encoders.flux.flux_clip(dtype_t5=dtype_t5))
 class FluxSchnell(Flux):
    unet_config = {
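Review note on the new `Flux.clip_target` body: it probes the checkpoint for one known T5 weight and, if the key is present, reuses that tensor's dtype for the T5 encoder. The lookup can be sketched in isolation as below. `probe_dtype` is a hypothetical helper name, and the tensor is faked with `SimpleNamespace` (dtype as a plain string) so the snippet runs without torch.

```python
from types import SimpleNamespace


def probe_dtype(state_dict, key):
    """Return the dtype of state_dict[key], or None when the key is absent."""
    if key in state_dict:
        return state_dict[key].dtype
    return None


# Key layout copied from the diff; the prefix is the first entry of
# text_encoder_key_prefix for Flux.
pref = "text_encoders."
t5_key = "{}t5xxl.transformer.encoder.final_layer_norm.weight".format(pref)

# Stand-in for a checkpoint that bundles a T5 encoder.
sd = {t5_key: SimpleNamespace(dtype="float16")}
```

When the checkpoint carries no T5 weights the probe yields `None`, which matches the `dtype_t5 = None` fallback in the diff.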
--- a/comfy/supported_models_base.py
+++ b/comfy/supported_models_base.py
@ -1,3 +1,21 @@
 """
    This file is part of ComfyUI.
    Copyright (C) 2024 Comfy
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.
    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <https://www.gnu.org/licenses/>.
 """
 import torch
 from . import model_base
 from . import utils
@ -27,7 +45,10 @@ class BASE:
    text_encoder_key_prefix = ["cond_stage_model."]
    supported_inference_dtypes = [torch.float16, torch.bfloat16, torch.float32]
    memory_usage_factor = 2.0
    manual_cast_dtype = None
    custom_operations = None
    @classmethod
    def matches(s, unet_config, state_dict=None):
--- a/comfy/text_encoders/aura_t5.py
+++ b/comfy/text_encoders/aura_t5.py
@ -4,9 +4,9 @@ import comfy.text_encoders.t5
 import os
 class PT5XlModel(sd1_clip.SDClipModel):
-    def __init__(self, device="cpu", layer="last", layer_idx=None, dtype=None):
+    def __init__(self, device="cpu", layer="last", layer_idx=None, dtype=None, model_options={}):
        textmodel_json_config = os.path.join(os.path.dirname(os.path.realpath(__file__)), "t5_pile_config_xl.json")
-        super().__init__(device=device, layer=layer, layer_idx=layer_idx, textmodel_json_config=textmodel_json_config, dtype=dtype, special_tokens={"end": 2, "pad": 1}, model_class=comfy.text_encoders.t5.T5, enable_attention_masks=True, zero_out_masked=True)
+        super().__init__(device=device, layer=layer, layer_idx=layer_idx, textmodel_json_config=textmodel_json_config, dtype=dtype, special_tokens={"end": 2, "pad": 1}, model_class=comfy.text_encoders.t5.T5, enable_attention_masks=True, zero_out_masked=True, model_options=model_options)
 class PT5XlTokenizer(sd1_clip.SDTokenizer):
    def __init__(self, embedding_directory=None, tokenizer_data={}):
@ -18,5 +18,5 @@ class AuraT5Tokenizer(sd1_clip.SD1Tokenizer):
        super().__init__(embedding_directory=embedding_directory, tokenizer_data=tokenizer_data, clip_name="pile_t5xl", tokenizer=PT5XlTokenizer)
 class AuraT5Model(sd1_clip.SD1ClipModel):
-    def __init__(self, device="cpu", dtype=None, **kwargs):
+    def __init__(self, device="cpu", dtype=None, model_options={}, **kwargs):
-        super().__init__(device=device, dtype=dtype, name="pile_t5xl", clip_model=PT5XlModel, **kwargs)
+        super().__init__(device=device, dtype=dtype, model_options=model_options, name="pile_t5xl", clip_model=PT5XlModel, **kwargs)
--- a/comfy/text_encoders/flux.py
+++ b/comfy/text_encoders/flux.py
@ -6,9 +6,9 @@ import torch
 import os
 class T5XXLModel(sd1_clip.SDClipModel):
-    def __init__(self, device="cpu", layer="last", layer_idx=None, dtype=None):
+    def __init__(self, device="cpu", layer="last", layer_idx=None, dtype=None, model_options={}):
        textmodel_json_config = os.path.join(os.path.dirname(os.path.realpath(__file__)), "t5_config_xxl.json")
-        super().__init__(device=device, layer=layer, layer_idx=layer_idx, textmodel_json_config=textmodel_json_config, dtype=dtype, special_tokens={"end": 1, "pad": 0}, model_class=comfy.text_encoders.t5.T5)
+        super().__init__(device=device, layer=layer, layer_idx=layer_idx, textmodel_json_config=textmodel_json_config, dtype=dtype, special_tokens={"end": 1, "pad": 0}, model_class=comfy.text_encoders.t5.T5, model_options=model_options)
 class T5XXLTokenizer(sd1_clip.SDTokenizer):
    def __init__(self, embedding_directory=None, tokenizer_data={}):
@ -35,11 +35,11 @@ class FluxTokenizer:
 class FluxClipModel(torch.nn.Module):
-    def __init__(self, dtype_t5=None, device="cpu", dtype=None):
+    def __init__(self, dtype_t5=None, device="cpu", dtype=None, model_options={}):
        super().__init__()
        dtype_t5 = comfy.model_management.pick_weight_dtype(dtype_t5, dtype, device)
-        self.clip_l = sd1_clip.SDClipModel(device=device, dtype=dtype, return_projected_pooled=False)
+        self.clip_l = sd1_clip.SDClipModel(device=device, dtype=dtype, return_projected_pooled=False, model_options=model_options)
-        self.t5xxl = T5XXLModel(device=device, dtype=dtype_t5)
+        self.t5xxl = T5XXLModel(device=device, dtype=dtype_t5, model_options=model_options)
        self.dtypes = set([dtype, dtype_t5])
    def set_clip_options(self, options):
@ -52,9 +52,9 @@ class FluxClipModel(torch.nn.Module):
    def encode_token_weights(self, token_weight_pairs):
        token_weight_pairs_l = token_weight_pairs["l"]
-        token_weight_pars_t5 = token_weight_pairs["t5xxl"]
+        token_weight_pairs_t5 = token_weight_pairs["t5xxl"]
-        t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pars_t5)
+        t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
        l_out, l_pooled = self.clip_l.encode_token_weights(token_weight_pairs_l)
        return t5_out, l_pooled
@ -66,6 +66,6 @@ class FluxClipModel(torch.nn.Module):
 def flux_clip(dtype_t5=None):
    class FluxClipModel_(FluxClipModel):
-        def __init__(self, device="cpu", dtype=None):
+        def __init__(self, device="cpu", dtype=None, model_options={}):
-            super().__init__(dtype_t5=dtype_t5, device=device, dtype=dtype)
+            super().__init__(dtype_t5=dtype_t5, device=device, dtype=dtype, model_options=model_options)
    return FluxClipModel_
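Review note on `flux_clip`: the factory returns a subclass with `dtype_t5` bound through a closure, so whatever instantiates the class can keep calling it with only `device`, `dtype`, and `model_options`. A minimal sketch of that pattern, using hypothetical names (`Encoder`, `make_encoder`) and a simplified dtype fallback in place of `comfy.model_management.pick_weight_dtype`:

```python
class Encoder:
    def __init__(self, dtype_t5=None, device="cpu", dtype=None, model_options=None):
        # Simplified fallback: use the generic dtype when no T5 dtype is given.
        # (The real code delegates this decision to pick_weight_dtype.)
        self.dtype_t5 = dtype_t5 if dtype_t5 is not None else dtype
        self.device = device
        self.dtype = dtype
        self.model_options = model_options or {}


def make_encoder(dtype_t5=None):
    # dtype_t5 is captured by the closure, so the returned class exposes
    # the uniform (device, dtype, model_options) constructor signature.
    class Encoder_(Encoder):
        def __init__(self, device="cpu", dtype=None, model_options=None):
            super().__init__(dtype_t5=dtype_t5, device=device, dtype=dtype,
                             model_options=model_options)
    return Encoder_


cls = make_encoder(dtype_t5="float8")
enc = cls(device="cpu", dtype="float16")
```

The caller never needs to know that a T5-specific dtype exists; it is fixed at the moment the class is created.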
--- a/comfy/text_encoders/hydit.py
+++ b/comfy/text_encoders/hydit.py
@ -7,9 +7,9 @@ import os
 import torch
 class HyditBertModel(sd1_clip.SDClipModel):
-    def __init__(self, device="cpu", layer="last", layer_idx=None, dtype=None):
+    def __init__(self, device="cpu", layer="last", layer_idx=None, dtype=None, model_options={}):
        textmodel_json_config = os.path.join(os.path.dirname(os.path.realpath(__file__)), "hydit_clip.json")
-        super().__init__(device=device, layer=layer, layer_idx=layer_idx, textmodel_json_config=textmodel_json_config, dtype=dtype, special_tokens={"start": 101, "end": 102, "pad": 0}, model_class=BertModel, enable_attention_masks=True, return_attention_masks=True)
+        super().__init__(device=device, layer=layer, layer_idx=layer_idx, textmodel_json_config=textmodel_json_config, dtype=dtype, special_tokens={"start": 101, "end": 102, "pad": 0}, model_class=BertModel, enable_attention_masks=True, return_attention_masks=True, model_options=model_options)
 class HyditBertTokenizer(sd1_clip.SDTokenizer):
    def __init__(self, embedding_directory=None, tokenizer_data={}):
@ -18,9 +18,9 @@ class HyditBertTokenizer(sd1_clip.SDTokenizer):
 class MT5XLModel(sd1_clip.SDClipModel):
-    def __init__(self, device="cpu", layer="last", layer_idx=None, dtype=None):
+    def __init__(self, device="cpu", layer="last", layer_idx=None, dtype=None, model_options={}):
        textmodel_json_config = os.path.join(os.path.dirname(os.path.realpath(__file__)), "mt5_config_xl.json")
-        super().__init__(device=device, layer=layer, layer_idx=layer_idx, textmodel_json_config=textmodel_json_config, dtype=dtype, special_tokens={"end": 1, "pad": 0}, model_class=comfy.text_encoders.t5.T5, enable_attention_masks=True, return_attention_masks=True)
+        super().__init__(device=device, layer=layer, layer_idx=layer_idx, textmodel_json_config=textmodel_json_config, dtype=dtype, special_tokens={"end": 1, "pad": 0}, model_class=comfy.text_encoders.t5.T5, enable_attention_masks=True, return_attention_masks=True, model_options=model_options)
 class MT5XLTokenizer(sd1_clip.SDTokenizer):
    def __init__(self, embedding_directory=None, tokenizer_data={}):
@ -50,10 +50,10 @@ class HyditTokenizer:
        return {"mt5xl.spiece_model": self.mt5xl.state_dict()["spiece_model"]}
 class HyditModel(torch.nn.Module):
-    def __init__(self, device="cpu", dtype=None):
+    def __init__(self, device="cpu", dtype=None, model_options={}):
        super().__init__()
-        self.hydit_clip = HyditBertModel(dtype=dtype)
+        self.hydit_clip = HyditBertModel(dtype=dtype, model_options=model_options)
-        self.mt5xl = MT5XLModel(dtype=dtype)
+        self.mt5xl = MT5XLModel(dtype=dtype, model_options=model_options)
        self.dtypes = set()
        if dtype is not None:
--- a/comfy/text_encoders/long_clipl.json
+++ b/comfy/text_encoders/long_clipl.json
@ -0,0 +1,25 @@
 {
  "_name_or_path": "openai/clip-vit-large-patch14",
  "architectures": [
    "CLIPTextModel"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 0,
  "dropout": 0.0,
  "eos_token_id": 49407,
  "hidden_act": "quick_gelu",
  "hidden_size": 768,
  "initializer_factor": 1.0,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 248,
  "model_type": "clip_text_model",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "projection_dim": 768,
  "torch_dtype": "float32",
  "transformers_version": "4.24.0",
  "vocab_size": 49408
 }
--- a/comfy/text_encoders/long_clipl.py
+++ b/comfy/text_encoders/long_clipl.py
@ -0,0 +1,19 @@
 from comfy import sd1_clip
 import os
 class LongClipTokenizer_(sd1_clip.SDTokenizer):
    def __init__(self, embedding_directory=None, tokenizer_data={}):
        super().__init__(max_length=248, embedding_directory=embedding_directory, tokenizer_data=tokenizer_data)
 class LongClipModel_(sd1_clip.SDClipModel):
    def __init__(self, device="cpu", dtype=None, model_options={}):
        textmodel_json_config = os.path.join(os.path.dirname(os.path.realpath(__file__)), "long_clipl.json")
        super().__init__(device=device, textmodel_json_config=textmodel_json_config, return_projected_pooled=False, dtype=dtype, model_options=model_options)
 class LongClipTokenizer(sd1_clip.SD1Tokenizer):
    def __init__(self, embedding_directory=None, tokenizer_data={}):
        super().__init__(embedding_directory=embedding_directory, tokenizer_data=tokenizer_data, tokenizer=LongClipTokenizer_)
 class LongClipModel(sd1_clip.SD1ClipModel):
    def __init__(self, device="cpu", dtype=None, model_options={}, **kwargs):
        super().__init__(device=device, dtype=dtype, model_options=model_options, clip_model=LongClipModel_, **kwargs)
--- a/comfy/text_encoders/sa_t5.py
+++ b/comfy/text_encoders/sa_t5.py
@ -4,9 +4,9 @@ import comfy.text_encoders.t5
 import os
 class T5BaseModel(sd1_clip.SDClipModel):
-    def __init__(self, device="cpu", layer="last", layer_idx=None, dtype=None):
+    def __init__(self, device="cpu", layer="last", layer_idx=None, dtype=None, model_options={}):
        textmodel_json_config = os.path.join(os.path.dirname(os.path.realpath(__file__)), "t5_config_base.json")
-        super().__init__(device=device, layer=layer, layer_idx=layer_idx, textmodel_json_config=textmodel_json_config, dtype=dtype, special_tokens={"end": 1, "pad": 0}, model_class=comfy.text_encoders.t5.T5, enable_attention_masks=True, zero_out_masked=True)
+        super().__init__(device=device, layer=layer, layer_idx=layer_idx, textmodel_json_config=textmodel_json_config, dtype=dtype, model_options=model_options, special_tokens={"end": 1, "pad": 0}, model_class=comfy.text_encoders.t5.T5, enable_attention_masks=True, zero_out_masked=True)
 class T5BaseTokenizer(sd1_clip.SDTokenizer):
    def __init__(self, embedding_directory=None, tokenizer_data={}):
@ -18,5 +18,5 @@ class SAT5Tokenizer(sd1_clip.SD1Tokenizer):
        super().__init__(embedding_directory=embedding_directory, tokenizer_data=tokenizer_data, clip_name="t5base", tokenizer=T5BaseTokenizer)
 class SAT5Model(sd1_clip.SD1ClipModel):
-    def __init__(self, device="cpu", dtype=None, **kwargs):
+    def __init__(self, device="cpu", dtype=None, model_options={}, **kwargs):
-        super().__init__(device=device, dtype=dtype, name="t5base", clip_model=T5BaseModel, **kwargs)
+        super().__init__(device=device, dtype=dtype, model_options=model_options, name="t5base", clip_model=T5BaseModel, **kwargs)
--- a/comfy/text_encoders/sd2_clip.py
+++ b/comfy/text_encoders/sd2_clip.py
@ -2,13 +2,13 @@ from comfy import sd1_clip
 import os
 class SD2ClipHModel(sd1_clip.SDClipModel):
-    def __init__(self, arch="ViT-H-14", device="cpu", max_length=77, freeze=True, layer="penultimate", layer_idx=None, dtype=None):
+    def __init__(self, arch="ViT-H-14", device="cpu", max_length=77, freeze=True, layer="penultimate", layer_idx=None, dtype=None, model_options={}):
        if layer == "penultimate":
            layer="hidden"
            layer_idx=-2
        textmodel_json_config = os.path.join(os.path.dirname(os.path.realpath(__file__)), "sd2_clip_config.json")
-        super().__init__(device=device, freeze=freeze, layer=layer, layer_idx=layer_idx, textmodel_json_config=textmodel_json_config, dtype=dtype, special_tokens={"start": 49406, "end": 49407, "pad": 0})
+        super().__init__(device=device, freeze=freeze, layer=layer, layer_idx=layer_idx, textmodel_json_config=textmodel_json_config, dtype=dtype, special_tokens={"start": 49406, "end": 49407, "pad": 0}, return_projected_pooled=True, model_options=model_options)
 class SD2ClipHTokenizer(sd1_clip.SDTokenizer):
    def __init__(self, tokenizer_path=None, embedding_directory=None, tokenizer_data={}):
@ -19,5 +19,5 @@ class SD2Tokenizer(sd1_clip.SD1Tokenizer):
        super().__init__(embedding_directory=embedding_directory, tokenizer_data=tokenizer_data, clip_name="h", tokenizer=SD2ClipHTokenizer)
 class SD2ClipModel(sd1_clip.SD1ClipModel):
-    def __init__(self, device="cpu", dtype=None, **kwargs):
+    def __init__(self, device="cpu", dtype=None, model_options={}, **kwargs):
-        super().__init__(device=device, dtype=dtype, clip_name="h", clip_model=SD2ClipHModel, **kwargs)
+        super().__init__(device=device, dtype=dtype, model_options=model_options, clip_name="h", clip_model=SD2ClipHModel, **kwargs)
--- a/comfy/text_encoders/sd3_clip.py
+++ b/comfy/text_encoders/sd3_clip.py
@ -8,14 +8,14 @@ import comfy.model_management
 import logging
 class T5XXLModel(sd1_clip.SDClipModel):
-    def __init__(self, device="cpu", layer="last", layer_idx=None, dtype=None):
+    def __init__(self, device="cpu", layer="last", layer_idx=None, dtype=None, model_options={}):
        textmodel_json_config = os.path.join(os.path.dirname(os.path.realpath(__file__)), "t5_config_xxl.json")
-        super().__init__(device=device, layer=layer, layer_idx=layer_idx, textmodel_json_config=textmodel_json_config, dtype=dtype, special_tokens={"end": 1, "pad": 0}, model_class=comfy.text_encoders.t5.T5)
+        super().__init__(device=device, layer=layer, layer_idx=layer_idx, textmodel_json_config=textmodel_json_config, dtype=dtype, special_tokens={"end": 1, "pad": 0}, model_class=comfy.text_encoders.t5.T5, model_options=model_options)
 class T5XXLTokenizer(sd1_clip.SDTokenizer):
    def __init__(self, embedding_directory=None, tokenizer_data={}):
        tokenizer_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "t5_tokenizer")
-        super().__init__(tokenizer_path, pad_with_end=False, embedding_size=4096, embedding_key='t5xxl', tokenizer_class=T5TokenizerFast, has_start_token=False, pad_to_max_length=False, max_length=99999999, min_length=77)
+        super().__init__(tokenizer_path, embedding_directory=embedding_directory, pad_with_end=False, embedding_size=4096, embedding_key='t5xxl', tokenizer_class=T5TokenizerFast, has_start_token=False, pad_to_max_length=False, max_length=99999999, min_length=77)
 class SD3Tokenizer:
@ -38,24 +38,24 @@ class SD3Tokenizer:
        return {}
 class SD3ClipModel(torch.nn.Module):
-    def __init__(self, clip_l=True, clip_g=True, t5=True, dtype_t5=None, device="cpu", dtype=None):
+    def __init__(self, clip_l=True, clip_g=True, t5=True, dtype_t5=None, device="cpu", dtype=None, model_options={}):
        super().__init__()
        self.dtypes = set()
        if clip_l:
-            self.clip_l = sd1_clip.SDClipModel(layer="hidden", layer_idx=-2, device=device, dtype=dtype, layer_norm_hidden_state=False, return_projected_pooled=False)
+            self.clip_l = sd1_clip.SDClipModel(layer="hidden", layer_idx=-2, device=device, dtype=dtype, layer_norm_hidden_state=False, return_projected_pooled=False, model_options=model_options)
            self.dtypes.add(dtype)
        else:
            self.clip_l = None
        if clip_g:
-            self.clip_g = sdxl_clip.SDXLClipG(device=device, dtype=dtype)
+            self.clip_g = sdxl_clip.SDXLClipG(device=device, dtype=dtype, model_options=model_options)
            self.dtypes.add(dtype)
        else:
            self.clip_g = None
        if t5:
            dtype_t5 = comfy.model_management.pick_weight_dtype(dtype_t5, dtype, device)
-            self.t5xxl = T5XXLModel(device=device, dtype=dtype_t5)
+            self.t5xxl = T5XXLModel(device=device, dtype=dtype_t5, model_options=model_options)
            self.dtypes.add(dtype_t5)
        else:
            self.t5xxl = None
@ -81,7 +81,7 @@ class SD3ClipModel(torch.nn.Module):
    def encode_token_weights(self, token_weight_pairs):
        token_weight_pairs_l = token_weight_pairs["l"]
        token_weight_pairs_g = token_weight_pairs["g"]
-        token_weight_pars_t5 = token_weight_pairs["t5xxl"]
+        token_weight_pairs_t5 = token_weight_pairs["t5xxl"]
        lg_out = None
        pooled = None
        out = None
@ -108,7 +108,7 @@ class SD3ClipModel(torch.nn.Module):
            pooled = torch.cat((l_pooled, g_pooled), dim=-1)
        if self.t5xxl is not None:
-            t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pars_t5)
+            t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
            if lg_out is not None:
                out = torch.cat([lg_out, t5_out], dim=-2)
            else:
@ -132,6 +132,6 @@ class SD3ClipModel(torch.nn.Module):
 def sd3_clip(clip_l=True, clip_g=True, t5=True, dtype_t5=None):
    class SD3ClipModel_(SD3ClipModel):
-        def __init__(self, device="cpu", dtype=None):
+        def __init__(self, device="cpu", dtype=None, model_options={}):
-            super().__init__(clip_l=clip_l, clip_g=clip_g, t5=t5, dtype_t5=dtype_t5, device=device, dtype=dtype)
+            super().__init__(clip_l=clip_l, clip_g=clip_g, t5=t5, dtype_t5=dtype_t5, device=device, dtype=dtype, model_options=model_options)
    return SD3ClipModel_
--- a/comfy/utils.py
+++ b/comfy/utils.py
@ -1,3 +1,22 @@
 """
    This file is part of ComfyUI.
    Copyright (C) 2024 Comfy
    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.
    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.
    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <https://www.gnu.org/licenses/>.
 """
 import torch
 import math
 import struct
@ -40,9 +59,22 @@ def calculate_parameters(sd, prefix=""):
    params = 0
    for k in sd.keys():
        if k.startswith(prefix):
-            params += sd[k].nelement()
+            w = sd[k]
+            params += w.nelement()
    return params
+def weight_dtype(sd, prefix=""):
+    dtypes = {}
+    for k in sd.keys():
+        if k.startswith(prefix):
+            w = sd[k]
+            dtypes[w.dtype] = dtypes.get(w.dtype, 0) + 1
+    if len(dtypes) == 0:
+        return None
+    return max(dtypes, key=dtypes.get)
 def state_dict_key_replace(state_dict, keys_to_replace):
    for x in keys_to_replace:
        if x in state_dict:
@ -402,6 +434,110 @@ def auraflow_to_diffusers(mmdit_config, output_prefix=""):
    return key_map
 def flux_to_diffusers(mmdit_config, output_prefix=""):
    n_double_layers = mmdit_config.get("depth", 0)
    n_single_layers = mmdit_config.get("depth_single_blocks", 0)
    hidden_size = mmdit_config.get("hidden_size", 0)
    key_map = {}
    for index in range(n_double_layers):
        prefix_from = "transformer_blocks.{}".format(index)
        prefix_to = "{}double_blocks.{}".format(output_prefix, index)
        for end in ("weight", "bias"):
            k = "{}.attn.".format(prefix_from)
            qkv = "{}.img_attn.qkv.{}".format(prefix_to, end)
            key_map["{}to_q.{}".format(k, end)] = (qkv, (0, 0, hidden_size))
            key_map["{}to_k.{}".format(k, end)] = (qkv, (0, hidden_size, hidden_size))
            key_map["{}to_v.{}".format(k, end)] = (qkv, (0, hidden_size * 2, hidden_size))
            k = "{}.attn.".format(prefix_from)
            qkv = "{}.txt_attn.qkv.{}".format(prefix_to, end)
            key_map["{}add_q_proj.{}".format(k, end)] = (qkv, (0, 0, hidden_size))
            key_map["{}add_k_proj.{}".format(k, end)] = (qkv, (0, hidden_size, hidden_size))
            key_map["{}add_v_proj.{}".format(k, end)] = (qkv, (0, hidden_size * 2, hidden_size))
        block_map = {
                        "attn.to_out.0.weight": "img_attn.proj.weight",
                        "attn.to_out.0.bias": "img_attn.proj.bias",
                        "norm1.linear.weight": "img_mod.lin.weight",
                        "norm1.linear.bias": "img_mod.lin.bias",
                        "norm1_context.linear.weight": "txt_mod.lin.weight",
                        "norm1_context.linear.bias": "txt_mod.lin.bias",
                        "attn.to_add_out.weight": "txt_attn.proj.weight",
                        "attn.to_add_out.bias": "txt_attn.proj.bias",
                        "ff.net.0.proj.weight": "img_mlp.0.weight",
                        "ff.net.0.proj.bias": "img_mlp.0.bias",
                        "ff.net.2.weight": "img_mlp.2.weight",
                        "ff.net.2.bias": "img_mlp.2.bias",
                        "ff_context.net.0.proj.weight": "txt_mlp.0.weight",
                        "ff_context.net.0.proj.bias": "txt_mlp.0.bias",
                        "ff_context.net.2.weight": "txt_mlp.2.weight",
                        "ff_context.net.2.bias": "txt_mlp.2.bias",
                        "attn.norm_q.weight": "img_attn.norm.query_norm.scale",
                        "attn.norm_k.weight": "img_attn.norm.key_norm.scale",
                        "attn.norm_added_q.weight": "txt_attn.norm.query_norm.scale",
                        "attn.norm_added_k.weight": "txt_attn.norm.key_norm.scale",
                    }
        for k in block_map:
            key_map["{}.{}".format(prefix_from, k)] = "{}.{}".format(prefix_to, block_map[k])
    for index in range(n_single_layers):
        prefix_from = "single_transformer_blocks.{}".format(index)
        prefix_to = "{}single_blocks.{}".format(output_prefix, index)
        for end in ("weight", "bias"):
            k = "{}.attn.".format(prefix_from)
            qkv = "{}.linear1.{}".format(prefix_to, end)
            key_map["{}to_q.{}".format(k, end)] = (qkv, (0, 0, hidden_size))
            key_map["{}to_k.{}".format(k, end)] = (qkv, (0, hidden_size, hidden_size))
            key_map["{}to_v.{}".format(k, end)] = (qkv, (0, hidden_size * 2, hidden_size))
            key_map["{}.proj_mlp.{}".format(prefix_from, end)] = (qkv, (0, hidden_size * 3, hidden_size * 4))
        block_map = {
                        "norm.linear.weight": "modulation.lin.weight",
                        "norm.linear.bias": "modulation.lin.bias",
                        "proj_out.weight": "linear2.weight",
                        "proj_out.bias": "linear2.bias",
                        "attn.norm_q.weight": "norm.query_norm.scale",
                        "attn.norm_k.weight": "norm.key_norm.scale",
                    }
        for k in block_map:
            key_map["{}.{}".format(prefix_from, k)] = "{}.{}".format(prefix_to, block_map[k])
    MAP_BASIC = {
        ("final_layer.linear.bias", "proj_out.bias"),
        ("final_layer.linear.weight", "proj_out.weight"),
        ("img_in.bias", "x_embedder.bias"),
        ("img_in.weight", "x_embedder.weight"),
        ("time_in.in_layer.bias", "time_text_embed.timestep_embedder.linear_1.bias"),
        ("time_in.in_layer.weight", "time_text_embed.timestep_embedder.linear_1.weight"),
        ("time_in.out_layer.bias", "time_text_embed.timestep_embedder.linear_2.bias"),
        ("time_in.out_layer.weight", "time_text_embed.timestep_embedder.linear_2.weight"),
        ("txt_in.bias", "context_embedder.bias"),
        ("txt_in.weight", "context_embedder.weight"),
        ("vector_in.in_layer.bias", "time_text_embed.text_embedder.linear_1.bias"),
        ("vector_in.in_layer.weight", "time_text_embed.text_embedder.linear_1.weight"),
        ("vector_in.out_layer.bias", "time_text_embed.text_embedder.linear_2.bias"),
        ("vector_in.out_layer.weight", "time_text_embed.text_embedder.linear_2.weight"),
        ("guidance_in.in_layer.bias", "time_text_embed.guidance_embedder.linear_1.bias"),
        ("guidance_in.in_layer.weight", "time_text_embed.guidance_embedder.linear_1.weight"),
        ("guidance_in.out_layer.bias", "time_text_embed.guidance_embedder.linear_2.bias"),
        ("guidance_in.out_layer.weight", "time_text_embed.guidance_embedder.linear_2.weight"),
        ("final_layer.adaLN_modulation.1.bias", "norm_out.linear.bias", swap_scale_shift),
        ("final_layer.adaLN_modulation.1.weight", "norm_out.linear.weight", swap_scale_shift),
    }
    for k in MAP_BASIC:
        if len(k) > 2:
            key_map[k[1]] = ("{}{}".format(output_prefix, k[0]), None, k[2])
        else:
            key_map[k[1]] = "{}{}".format(output_prefix, k[0])
    return key_map
 def repeat_to_batch_size(tensor, batch_size, dim=0):
    if tensor.shape[dim] > batch_size:
        return tensor.narrow(dim, 0, batch_size)
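Review note on the `weight_dtype` helper added to `comfy/utils.py` in this commit: it counts tensors per dtype under a key prefix and returns the most common one, or `None` when nothing matches. The function body below is copied from the diff. The fake tensors (`SimpleNamespace` objects with a string `dtype`) and the sample keys are assumptions made so the sketch runs without torch.

```python
from types import SimpleNamespace


def weight_dtype(sd, prefix=""):
    dtypes = {}
    for k in sd.keys():
        if k.startswith(prefix):
            w = sd[k]
            # Tally how many tensors use each dtype.
            dtypes[w.dtype] = dtypes.get(w.dtype, 0) + 1
    if len(dtypes) == 0:
        return None
    # The dtype with the highest tensor count wins.
    return max(dtypes, key=dtypes.get)


def fake(dtype):
    # Stand-in for a tensor: only the .dtype attribute is needed here.
    return SimpleNamespace(dtype=dtype)


sd = {
    "model.a": fake("float16"),
    "model.b": fake("float16"),
    "model.c": fake("float32"),
    "vae.x": fake("float32"),
}
```

Note that the vote is by tensor count, not by parameter count, so one large tensor in a minority dtype does not change the result.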
--- a/comfy_execution/caching.py
+++ b/comfy_execution/caching.py
@ -0,0 +1,308 @@
 import itertools
 from typing import Sequence, Mapping
 from comfy_execution.graph import DynamicPrompt
 import nodes
 from comfy_execution.graph_utils import is_link
 class CacheKeySet:
    def __init__(self, dynprompt, node_ids, is_changed_cache):
        self.keys = {}
        self.subcache_keys = {}
    def add_keys(self, node_ids):
        raise NotImplementedError()
    def all_node_ids(self):
        return set(self.keys.keys())
    def get_used_keys(self):
        return self.keys.values()
    def get_used_subcache_keys(self):
        return self.subcache_keys.values()
    def get_data_key(self, node_id):
        return self.keys.get(node_id, None)
    def get_subcache_key(self, node_id):
        return self.subcache_keys.get(node_id, None)
 class Unhashable:
    def __init__(self):
        self.value = float("NaN")
 def to_hashable(obj):
    # So that we don't infinitely recurse since frozenset and tuples
    # are Sequences.
    if isinstance(obj, (int, float, str, bool, type(None))):
        return obj
    elif isinstance(obj, Mapping):
        return frozenset([(to_hashable(k), to_hashable(v)) for k, v in sorted(obj.items())])
    elif isinstance(obj, Sequence):
        return frozenset(zip(itertools.count(), [to_hashable(i) for i in obj]))
    else:
        # TODO - Support other objects like tensors?
        return Unhashable()
 class CacheKeySetID(CacheKeySet):
    def __init__(self, dynprompt, node_ids, is_changed_cache):
        super().__init__(dynprompt, node_ids, is_changed_cache)
        self.dynprompt = dynprompt
        self.add_keys(node_ids)
    def add_keys(self, node_ids):
        for node_id in node_ids:
            if node_id in self.keys:
                continue
            if not self.dynprompt.has_node(node_id):
                continue
            node = self.dynprompt.get_node(node_id)
            self.keys[node_id] = (node_id, node["class_type"])
            self.subcache_keys[node_id] = (node_id, node["class_type"])
 class CacheKeySetInputSignature(CacheKeySet):
    def __init__(self, dynprompt, node_ids, is_changed_cache):
        super().__init__(dynprompt, node_ids, is_changed_cache)
        self.dynprompt = dynprompt
        self.is_changed_cache = is_changed_cache
        self.add_keys(node_ids)
    def include_node_id_in_input(self) -> bool:
        return False
    def add_keys(self, node_ids):
        for node_id in node_ids:
            if node_id in self.keys:
                continue
            if not self.dynprompt.has_node(node_id):
                continue
            node = self.dynprompt.get_node(node_id)
            self.keys[node_id] = self.get_node_signature(self.dynprompt, node_id)
            self.subcache_keys[node_id] = (node_id, node["class_type"])
    def get_node_signature(self, dynprompt, node_id):
        signature = []
        ancestors, order_mapping = self.get_ordered_ancestry(dynprompt, node_id)
        signature.append(self.get_immediate_node_signature(dynprompt, node_id, order_mapping))
        for ancestor_id in ancestors:
            signature.append(self.get_immediate_node_signature(dynprompt, ancestor_id, order_mapping))
        return to_hashable(signature)
    def get_immediate_node_signature(self, dynprompt, node_id, ancestor_order_mapping):
        if not dynprompt.has_node(node_id):
            # This node doesn't exist -- we can't cache it.
            return [float("NaN")]
        node = dynprompt.get_node(node_id)
        class_type = node["class_type"]
        class_def = nodes.NODE_CLASS_MAPPINGS[class_type]
        signature = [class_type, self.is_changed_cache.get(node_id)]
        if self.include_node_id_in_input() or (hasattr(class_def, "NOT_IDEMPOTENT") and class_def.NOT_IDEMPOTENT):
            signature.append(node_id)
        inputs = node["inputs"]
        for key in sorted(inputs.keys()):
            if is_link(inputs[key]):
                (ancestor_id, ancestor_socket) = inputs[key]
                ancestor_index = ancestor_order_mapping[ancestor_id]
                signature.append((key,("ANCESTOR", ancestor_index, ancestor_socket)))
            else:
                signature.append((key, inputs[key]))
        return signature
    # This function returns a list of all ancestors of the given node. The order of the list is
    # deterministic based on which specific inputs the ancestor is connected by.
    def get_ordered_ancestry(self, dynprompt, node_id):
        ancestors = []
        order_mapping = {}
        self.get_ordered_ancestry_internal(dynprompt, node_id, ancestors, order_mapping)
        return ancestors, order_mapping
    def get_ordered_ancestry_internal(self, dynprompt, node_id, ancestors, order_mapping):
        if not dynprompt.has_node(node_id):
            return
        inputs = dynprompt.get_node(node_id)["inputs"]
        input_keys = sorted(inputs.keys())
        for key in input_keys:
            if is_link(inputs[key]):
                ancestor_id = inputs[key][0]
                if ancestor_id not in order_mapping:
                    ancestors.append(ancestor_id)
                    order_mapping[ancestor_id] = len(ancestors) - 1
                    self.get_ordered_ancestry_internal(dynprompt, ancestor_id, ancestors, order_mapping)
 class BasicCache:
    def __init__(self, key_class):
        self.key_class = key_class
        self.initialized = False
        self.dynprompt: DynamicPrompt
        self.cache_key_set: CacheKeySet
        self.cache = {}
        self.subcaches = {}
    def set_prompt(self, dynprompt, node_ids, is_changed_cache):
        self.dynprompt = dynprompt
        self.cache_key_set = self.key_class(dynprompt, node_ids, is_changed_cache)
        self.is_changed_cache = is_changed_cache
        self.initialized = True
    def all_node_ids(self):
        assert self.initialized
        node_ids = self.cache_key_set.all_node_ids()
        for subcache in self.subcaches.values():
            node_ids = node_ids.union(subcache.all_node_ids())
        return node_ids
    def _clean_cache(self):
        preserve_keys = set(self.cache_key_set.get_used_keys())
        to_remove = []
        for key in self.cache:
            if key not in preserve_keys:
                to_remove.append(key)
        for key in to_remove:
            del self.cache[key]
    def _clean_subcaches(self):
        preserve_subcaches = set(self.cache_key_set.get_used_subcache_keys())
        to_remove = []
        for key in self.subcaches:
            if key not in preserve_subcaches:
                to_remove.append(key)
        for key in to_remove:
            del self.subcaches[key]
    def clean_unused(self):
        assert self.initialized
        self._clean_cache()
        self._clean_subcaches()
    def _set_immediate(self, node_id, value):
        assert self.initialized
        cache_key = self.cache_key_set.get_data_key(node_id)
        self.cache[cache_key] = value
    def _get_immediate(self, node_id):
        if not self.initialized:
            return None
        cache_key = self.cache_key_set.get_data_key(node_id)
        if cache_key in self.cache:
            return self.cache[cache_key]
        else:
            return None
    def _ensure_subcache(self, node_id, children_ids):
        subcache_key = self.cache_key_set.get_subcache_key(node_id)
        subcache = self.subcaches.get(subcache_key, None)
        if subcache is None:
            subcache = BasicCache(self.key_class)
            self.subcaches[subcache_key] = subcache
        subcache.set_prompt(self.dynprompt, children_ids, self.is_changed_cache)
        return subcache
    def _get_subcache(self, node_id):
        assert self.initialized
        subcache_key = self.cache_key_set.get_subcache_key(node_id)
        if subcache_key in self.subcaches:
            return self.subcaches[subcache_key]
        else:
            return None
    def recursive_debug_dump(self):
        result = []
        for key in self.cache:
            result.append({"key": key, "value": self.cache[key]})
        for key in self.subcaches:
            result.append({"subcache_key": key, "subcache": self.subcaches[key].recursive_debug_dump()})
        return result
 class HierarchicalCache(BasicCache):
    def __init__(self, key_class):
        super().__init__(key_class)
    def _get_cache_for(self, node_id):
        assert self.dynprompt is not None
        parent_id = self.dynprompt.get_parent_node_id(node_id)
        if parent_id is None:
            return self
        hierarchy = []
        while parent_id is not None:
            hierarchy.append(parent_id)
            parent_id = self.dynprompt.get_parent_node_id(parent_id)
        cache = self
        for parent_id in reversed(hierarchy):
            cache = cache._get_subcache(parent_id)
            if cache is None:
                return None
        return cache
    def get(self, node_id):
        cache = self._get_cache_for(node_id)
        if cache is None:
            return None
        return cache._get_immediate(node_id)
    def set(self, node_id, value):
        cache = self._get_cache_for(node_id)
        assert cache is not None
        cache._set_immediate(node_id, value)
    def ensure_subcache_for(self, node_id, children_ids):
        cache = self._get_cache_for(node_id)
        assert cache is not None
        return cache._ensure_subcache(node_id, children_ids)
 class LRUCache(BasicCache):
    def __init__(self, key_class, max_size=100):
        super().__init__(key_class)
        self.max_size = max_size
        self.min_generation = 0
        self.generation = 0
        self.used_generation = {}
        self.children = {}
    def set_prompt(self, dynprompt, node_ids, is_changed_cache):
        super().set_prompt(dynprompt, node_ids, is_changed_cache)
        self.generation += 1
        for node_id in node_ids:
            self._mark_used(node_id)
    def clean_unused(self):
        while len(self.cache) > self.max_size and self.min_generation < self.generation:
            self.min_generation += 1
            to_remove = [key for key in self.cache if self.used_generation[key] < self.min_generation]
            for key in to_remove:
                del self.cache[key]
                del self.used_generation[key]
                if key in self.children:
                    del self.children[key]
        self._clean_subcaches()
    def get(self, node_id):
        self._mark_used(node_id)
        return self._get_immediate(node_id)
    def _mark_used(self, node_id):
        cache_key = self.cache_key_set.get_data_key(node_id)
        if cache_key is not None:
            self.used_generation[cache_key] = self.generation
    def set(self, node_id, value):
        self._mark_used(node_id)
        return self._set_immediate(node_id, value)
    def ensure_subcache_for(self, node_id, children_ids):
        # Just uses subcaches for tracking 'live' nodes
        super()._ensure_subcache(node_id, children_ids)
        self.cache_key_set.add_keys(children_ids)
        self._mark_used(node_id)
        cache_key = self.cache_key_set.get_data_key(node_id)
        self.children[cache_key] = []
        for child_id in children_ids:
            self._mark_used(child_id)
            self.children[cache_key].append(self.cache_key_set.get_data_key(child_id))
        return self
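The input-signature keys above depend on `to_hashable` turning nested prompt data into something that can be used as a dictionary key. The following is a minimal standalone sketch of that behaviour, not the module itself: it copies the function's logic, imports the ABCs from `collections.abc` instead of `typing`, and uses made-up sample values (`"KSampler"`, `steps`, `cfg`) purely for illustration.

```python
import itertools
from collections.abc import Mapping, Sequence


class Unhashable:
    """Placeholder for values that cannot be turned into a stable key."""
    def __init__(self):
        self.value = float("NaN")


def to_hashable(obj):
    # Primitives first: str is itself a Sequence and would otherwise recurse forever.
    if isinstance(obj, (int, float, str, bool, type(None))):
        return obj
    elif isinstance(obj, Mapping):
        return frozenset([(to_hashable(k), to_hashable(v)) for k, v in sorted(obj.items())])
    elif isinstance(obj, Sequence):
        # Pair each element with its index so that element order still matters.
        return frozenset(zip(itertools.count(), [to_hashable(i) for i in obj]))
    else:
        # Each call creates a fresh instance, which compares unequal to any other.
        return Unhashable()


# Illustrative values only; these are not taken from a real prompt.
a = to_hashable(["KSampler", {"steps": 20, "cfg": 8.0}])
b = to_hashable(["KSampler", {"cfg": 8.0, "steps": 20}])
same_despite_dict_order = (a == b)

order_matters = to_hashable([1, 2]) != to_hashable([2, 1])

opaque = object()
never_matches = to_hashable([opaque]) != to_hashable([opaque])
```

Under these assumptions, two signatures that differ only in dictionary ordering compare equal, while any signature containing an unsupported object gets a fresh `Unhashable` on every call and so should never produce a cache hit.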
--- a/comfy_execution/graph.py
+++ b/comfy_execution/graph.py
@ -0,0 +1,259 @@
 import nodes
 from comfy_execution.graph_utils import is_link
 class DependencyCycleError(Exception):
    pass
 class NodeInputError(Exception):
    pass
 class NodeNotFoundError(Exception):
    pass
 class DynamicPrompt:
    def __init__(self, original_prompt):
        # The original prompt provided by the user
        self.original_prompt = original_prompt
        # Any extra pieces of the graph created during execution
        self.ephemeral_prompt = {}
        self.ephemeral_parents = {}
        self.ephemeral_display = {}
    def get_node(self, node_id):
        if node_id in self.ephemeral_prompt:
            return self.ephemeral_prompt[node_id]
        if node_id in self.original_prompt:
            return self.original_prompt[node_id]
        raise NodeNotFoundError(f"Node {node_id} not found")
    def has_node(self, node_id):
        return node_id in self.original_prompt or node_id in self.ephemeral_prompt
    def add_ephemeral_node(self, node_id, node_info, parent_id, display_id):
        self.ephemeral_prompt[node_id] = node_info
        self.ephemeral_parents[node_id] = parent_id
        self.ephemeral_display[node_id] = display_id
    def get_real_node_id(self, node_id):
        while node_id in self.ephemeral_parents:
            node_id = self.ephemeral_parents[node_id]
        return node_id
    def get_parent_node_id(self, node_id):
        return self.ephemeral_parents.get(node_id, None)
    def get_display_node_id(self, node_id):
        while node_id in self.ephemeral_display:
            node_id = self.ephemeral_display[node_id]
        return node_id
    def all_node_ids(self):
        return set(self.original_prompt.keys()).union(set(self.ephemeral_prompt.keys()))
    def get_original_prompt(self):
        return self.original_prompt
 def get_input_info(class_def, input_name):
    valid_inputs = class_def.INPUT_TYPES()
    input_info = None
    input_category = None
    if "required" in valid_inputs and input_name in valid_inputs["required"]:
        input_category = "required"
        input_info = valid_inputs["required"][input_name]
    elif "optional" in valid_inputs and input_name in valid_inputs["optional"]:
        input_category = "optional"
        input_info = valid_inputs["optional"][input_name]
    elif "hidden" in valid_inputs and input_name in valid_inputs["hidden"]:
        input_category = "hidden"
        input_info = valid_inputs["hidden"][input_name]
    if input_info is None:
        return None, None, None
    input_type = input_info[0]
    if len(input_info) > 1:
        extra_info = input_info[1]
    else:
        extra_info = {}
    return input_type, input_category, extra_info
 class TopologicalSort:
    def __init__(self, dynprompt):
        self.dynprompt = dynprompt
        self.pendingNodes = {}
        self.blockCount = {} # Number of nodes this node is directly blocked by
        self.blocking = {} # Which nodes are blocked by this node
    def get_input_info(self, unique_id, input_name):
        class_type = self.dynprompt.get_node(unique_id)["class_type"]
        class_def = nodes.NODE_CLASS_MAPPINGS[class_type]
        return get_input_info(class_def, input_name)
    def make_input_strong_link(self, to_node_id, to_input):
        inputs = self.dynprompt.get_node(to_node_id)["inputs"]
        if to_input not in inputs:
            raise NodeInputError(f"Node {to_node_id} says it needs input {to_input}, but there is no input to that node at all")
        value = inputs[to_input]
        if not is_link(value):
            raise NodeInputError(f"Node {to_node_id} says it needs input {to_input}, but that value is a constant")
        from_node_id, from_socket = value
        self.add_strong_link(from_node_id, from_socket, to_node_id)
    def add_strong_link(self, from_node_id, from_socket, to_node_id):
        self.add_node(from_node_id)
        if to_node_id not in self.blocking[from_node_id]:
            self.blocking[from_node_id][to_node_id] = {}
            self.blockCount[to_node_id] += 1
        self.blocking[from_node_id][to_node_id][from_socket] = True
    def add_node(self, unique_id, include_lazy=False, subgraph_nodes=None):
        if unique_id in self.pendingNodes:
            return
        self.pendingNodes[unique_id] = True
        self.blockCount[unique_id] = 0
        self.blocking[unique_id] = {}
        inputs = self.dynprompt.get_node(unique_id)["inputs"]
        for input_name in inputs:
            value = inputs[input_name]
            if is_link(value):
                from_node_id, from_socket = value
                if subgraph_nodes is not None and from_node_id not in subgraph_nodes:
                    continue
                input_type, input_category, input_info = self.get_input_info(unique_id, input_name)
                is_lazy = input_info is not None and "lazy" in input_info and input_info["lazy"]
                if include_lazy or not is_lazy:
                    self.add_strong_link(from_node_id, from_socket, unique_id)
    def get_ready_nodes(self):
        return [node_id for node_id in self.pendingNodes if self.blockCount[node_id] == 0]
    def pop_node(self, unique_id):
        del self.pendingNodes[unique_id]
        for blocked_node_id in self.blocking[unique_id]:
            self.blockCount[blocked_node_id] -= 1
        del self.blocking[unique_id]
    def is_empty(self):
        return len(self.pendingNodes) == 0
 class ExecutionList(TopologicalSort):
    """
    ExecutionList implements a topological dissolve of the graph. After a node is staged for execution,
    it can still be returned to the graph after having further dependencies added.
    """
    def __init__(self, dynprompt, output_cache):
        super().__init__(dynprompt)
        self.output_cache = output_cache
        self.staged_node_id = None
    def add_strong_link(self, from_node_id, from_socket, to_node_id):
        if self.output_cache.get(from_node_id) is not None:
            # Nothing to do
            return
        super().add_strong_link(from_node_id, from_socket, to_node_id)
    def stage_node_execution(self):
        assert self.staged_node_id is None
        if self.is_empty():
            return None, None, None
        available = self.get_ready_nodes()
        if len(available) == 0:
            cycled_nodes = self.get_nodes_in_cycle()
            # Because cycles composed entirely of static nodes are caught during initial validation,
            # we will 'blame' the first node in the cycle that is not a static node.
            blamed_node = cycled_nodes[0]
            for node_id in cycled_nodes:
                display_node_id = self.dynprompt.get_display_node_id(node_id)
                if display_node_id != node_id:
                    blamed_node = display_node_id
                    break
            ex = DependencyCycleError("Dependency cycle detected")
            error_details = {
                "node_id": blamed_node,
                "exception_message": str(ex),
                "exception_type": "graph.DependencyCycleError",
                "traceback": [],
                "current_inputs": []
            }
            return None, error_details, ex
        self.staged_node_id = self.ux_friendly_pick_node(available)
        return self.staged_node_id, None, None
    def ux_friendly_pick_node(self, node_list):
        # If an output node is available, do that first.
        # Technically this has no effect on the overall length of execution, but it feels better as a user
        # for a PreviewImage to display a result as soon as it can
        # Some other heuristics could probably be used here to improve the UX further.
        def is_output(node_id):
            class_type = self.dynprompt.get_node(node_id)["class_type"]
            class_def = nodes.NODE_CLASS_MAPPINGS[class_type]
            if hasattr(class_def, 'OUTPUT_NODE') and class_def.OUTPUT_NODE == True:
                return True
            return False
        for node_id in node_list:
            if is_output(node_id):
                return node_id
        #This should handle the VAEDecode -> preview case
        for node_id in node_list:
            for blocked_node_id in self.blocking[node_id]:
                if is_output(blocked_node_id):
                    return node_id
        #This should handle the VAELoader -> VAEDecode -> preview case
        for node_id in node_list:
            for blocked_node_id in self.blocking[node_id]:
                for blocked_node_id1 in self.blocking[blocked_node_id]:
                    if is_output(blocked_node_id1):
                        return node_id
        #TODO: this function should be improved
        return node_list[0]
    def unstage_node_execution(self):
        assert self.staged_node_id is not None
        self.staged_node_id = None
    def complete_node_execution(self):
        node_id = self.staged_node_id
        self.pop_node(node_id)
        self.staged_node_id = None
    def get_nodes_in_cycle(self):
        # We'll dissolve the graph in reverse topological order to leave only the nodes in the cycle.
        # We're skipping some of the performance optimizations from the original TopologicalSort to keep
        # the code simple (and because having a cycle in the first place is a catastrophic error)
        blocked_by = { node_id: {} for node_id in self.pendingNodes }
        for from_node_id in self.blocking:
            for to_node_id in self.blocking[from_node_id]:
                if True in self.blocking[from_node_id][to_node_id].values():
                    blocked_by[to_node_id][from_node_id] = True
        to_remove = [node_id for node_id in blocked_by if len(blocked_by[node_id]) == 0]
        while len(to_remove) > 0:
            for node_id in to_remove:
                for to_node_id in blocked_by:
                    if node_id in blocked_by[to_node_id]:
                        del blocked_by[to_node_id][node_id]
                del blocked_by[node_id]
            to_remove = [node_id for node_id in blocked_by if len(blocked_by[node_id]) == 0]
        return list(blocked_by.keys())
 class ExecutionBlocker:
    """
    Return this from a node and any users will be blocked with the given error message.
    If the message is None, execution will be blocked silently instead.
    Generally, you should avoid using this functionality unless absolutely necessary. Whenever it's
    possible, a lazy input will be more efficient and have a better user experience.
    This functionality is useful in two cases:
    1. You want to conditionally prevent an output node from executing. (Particularly a built-in node
       like SaveImage. For your own output nodes, I would recommend just adding a BOOL input and using
       lazy evaluation to let it conditionally disable itself.)
    2. You have a node with multiple possible outputs, some of which are invalid and should not be used.
       (I would recommend not making nodes like this in the future -- instead, make multiple nodes with
       different outputs. Unfortunately, there are several popular existing nodes using this pattern.)
    """
    def __init__(self, message):
        self.message = message
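The `blockCount` / `blocking` bookkeeping used by `TopologicalSort` and `ExecutionList` above can be illustrated in isolation. The sketch below is a simplified, hypothetical version: the function name `dissolve` and the toy graph format (node id mapped to a list of dependency ids) are assumptions made for the example, and it ignores lazy inputs, caching, and the output-node heuristics. It also reports every node left stuck rather than isolating the exact cycle as `get_nodes_in_cycle` does.

```python
def dissolve(graph):
    """Return (execution_order, stuck_nodes) for a toy dependency graph.

    graph maps node_id -> list of node_ids it depends on. Every dependency
    is assumed to also appear as a key (hypothetical format for this sketch).
    """
    block_count = {n: 0 for n in graph}    # how many nodes each node waits on
    blocking = {n: set() for n in graph}   # which nodes each node is holding up
    for node_id, deps in graph.items():
        for dep in set(deps):
            blocking[dep].add(node_id)
            block_count[node_id] += 1

    order = []
    pending = set(graph)
    while pending:
        ready = sorted(n for n in pending if block_count[n] == 0)
        if not ready:
            # Nodes remain but none can run: they sit on a cycle or behind one.
            return order, sorted(pending)
        node_id = ready[0]
        order.append(node_id)
        pending.remove(node_id)
        for blocked in blocking[node_id]:
            block_count[blocked] -= 1
    return order, []


linear = {"loader": [], "encode": ["loader"], "sample": ["loader", "encode"], "decode": ["sample"]}
order, stuck = dissolve(linear)

cyclic = {"a": ["b"], "b": ["a"], "c": []}
cyc_order, cyc_stuck = dissolve(cyclic)
```

Picking `ready[0]` after sorting is only there to make the toy deterministic; the real class delegates that choice to `ux_friendly_pick_node`.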
--- a/comfy_execution/graph_utils.py
+++ b/comfy_execution/graph_utils.py
@ -0,0 +1,139 @@
 def is_link(obj):
    if not isinstance(obj, list):
        return False
    if len(obj) != 2:
        return False
    if not isinstance(obj[0], str):
        return False
    if not isinstance(obj[1], int) and not isinstance(obj[1], float):
        return False
    return True
 # The GraphBuilder is just a utility class that outputs graphs in the form expected by the ComfyUI back-end
 class GraphBuilder:
    _default_prefix_root = ""
    _default_prefix_call_index = 0
    _default_prefix_graph_index = 0
    def __init__(self, prefix = None):
        if prefix is None:
            self.prefix = GraphBuilder.alloc_prefix()
        else:
            self.prefix = prefix
        self.nodes = {}
        self.id_gen = 1
    @classmethod
    def set_default_prefix(cls, prefix_root, call_index, graph_index = 0):
        cls._default_prefix_root = prefix_root
        cls._default_prefix_call_index = call_index
        cls._default_prefix_graph_index = graph_index
    @classmethod
    def alloc_prefix(cls, root=None, call_index=None, graph_index=None):
        if root is None:
            root = GraphBuilder._default_prefix_root
        if call_index is None:
            call_index = GraphBuilder._default_prefix_call_index
        if graph_index is None:
            graph_index = GraphBuilder._default_prefix_graph_index
        result = f"{root}.{call_index}.{graph_index}."
        GraphBuilder._default_prefix_graph_index += 1
        return result
    def node(self, class_type, id=None, **kwargs):
        if id is None:
            id = str(self.id_gen)
            self.id_gen += 1
        id = self.prefix + id
        if id in self.nodes:
            return self.nodes[id]
        node = Node(id, class_type, kwargs)
        self.nodes[id] = node
        return node
    def lookup_node(self, id):
        id = self.prefix + id
        return self.nodes.get(id)
    def finalize(self):
        output = {}
        for node_id, node in self.nodes.items():
            output[node_id] = node.serialize()
        return output
    def replace_node_output(self, node_id, index, new_value):
        node_id = self.prefix + node_id
        to_remove = []
        for node in self.nodes.values():
            for key, value in node.inputs.items():
                if is_link(value) and value[0] == node_id and value[1] == index:
                    if new_value is None:
                        to_remove.append((node, key))
                    else:
                        node.inputs[key] = new_value
        for node, key in to_remove:
            del node.inputs[key]
    def remove_node(self, id):
        id = self.prefix + id
        del self.nodes[id]
 class Node:
    def __init__(self, id, class_type, inputs):
        self.id = id
        self.class_type = class_type
        self.inputs = inputs
        self.override_display_id = None
    def out(self, index):
        return [self.id, index]
    def set_input(self, key, value):
        if value is None:
            if key in self.inputs:
                del self.inputs[key]
        else:
            self.inputs[key] = value
    def get_input(self, key):
        return self.inputs.get(key)
    def set_override_display_id(self, override_display_id):
        self.override_display_id = override_display_id
    def serialize(self):
        serialized = {
            "class_type": self.class_type,
            "inputs": self.inputs
        }
        if self.override_display_id is not None:
            serialized["override_display_id"] = self.override_display_id
        return serialized
 def add_graph_prefix(graph, outputs, prefix):
    # Change the node IDs and any internal links
    new_graph = {}
    for node_id, node_info in graph.items():
        # Make sure the added nodes have unique IDs
        new_node_id = prefix + node_id
        new_node = { "class_type": node_info["class_type"], "inputs": {} }
        for input_name, input_value in node_info.get("inputs", {}).items():
            if is_link(input_value):
                new_node["inputs"][input_name] = [prefix + input_value[0], input_value[1]]
            else:
                new_node["inputs"][input_name] = input_value
        new_graph[new_node_id] = new_node
    # Change the node IDs in the outputs
    new_outputs = []
    for n in range(len(outputs)):
        output = outputs[n]
        if is_link(output):
            new_outputs.append([prefix + output[0], output[1]])
        else:
            new_outputs.append(output)
    return new_graph, tuple(new_outputs)
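To see what `add_graph_prefix` does to a small graph, here is a self-contained sketch that re-implements `is_link` and `add_graph_prefix` in condensed form so it can run without the repository. The two-node subgraph, the class names in it, and the prefix `"7.0.0."` (shaped like the `root.call_index.graph_index.` strings produced by `alloc_prefix`) are illustrative assumptions.

```python
def is_link(obj):
    # A link is a two-element list: [source_node_id, output_index].
    return (isinstance(obj, list) and len(obj) == 2
            and isinstance(obj[0], str)
            and isinstance(obj[1], (int, float)))


def add_graph_prefix(graph, outputs, prefix):
    # Condensed re-implementation for illustration; same logic as the diff above.
    new_graph = {}
    for node_id, node_info in graph.items():
        new_node = {"class_type": node_info["class_type"], "inputs": {}}
        for name, value in node_info.get("inputs", {}).items():
            if is_link(value):
                # Rewrite the link so it points at the renamed source node.
                new_node["inputs"][name] = [prefix + value[0], value[1]]
            else:
                new_node["inputs"][name] = value
        new_graph[prefix + node_id] = new_node
    new_outputs = tuple([prefix + o[0], o[1]] if is_link(o) else o for o in outputs)
    return new_graph, new_outputs


# Hypothetical subgraph: node "2" consumes output 0 of node "1".
subgraph = {
    "1": {"class_type": "LoadImage", "inputs": {"image": "example.png"}},
    "2": {"class_type": "ImageInvert", "inputs": {"image": ["1", 0]}},
}
prefixed, outs = add_graph_prefix(subgraph, [["2", 0], 42], "7.0.0.")
```

Note that `is_link` is purely structural: a tuple such as `("1", 0)` is not treated as a link, while any two-element list of the form `[str, number]` is, so a constant input with that shape would also be rewritten.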
--- a/comfy_extras/nodes_hunyuan.py
+++ b/comfy_extras/nodes_hunyuan.py
@ -19,6 +19,7 @@ class CLIPTextEncodeHunyuanDiT:
        cond = output.pop("cond")
        return ([[cond, output]], )
 NODE_CLASS_MAPPINGS = {
    "CLIPTextEncodeHunyuanDiT": CLIPTextEncodeHunyuanDiT,
 }
--- a/comfy_extras/nodes_model_advanced.py
+++ b/comfy_extras/nodes_model_advanced.py
@ -2,6 +2,7 @@ import folder_paths
 import comfy.sd
 import comfy.model_sampling
 import comfy.latent_formats
 import nodes
 import torch
 class LCM(comfy.model_sampling.EPS):
@ -170,6 +171,42 @@ class ModelSamplingAuraFlow(ModelSamplingSD3):
    def patch_aura(self, model, shift):
        return self.patch(model, shift, multiplier=1.0)
 class ModelSamplingFlux:
    @classmethod
    def INPUT_TYPES(s):
        return {"required": { "model": ("MODEL",),
                              "max_shift": ("FLOAT", {"default": 1.15, "min": 0.0, "max": 100.0, "step":0.01}),
                              "base_shift": ("FLOAT", {"default": 0.5, "min": 0.0, "max": 100.0, "step":0.01}),
                              "width": ("INT", {"default": 1024, "min": 16, "max": nodes.MAX_RESOLUTION, "step": 8}),
                              "height": ("INT", {"default": 1024, "min": 16, "max": nodes.MAX_RESOLUTION, "step": 8}),
                              }}
    RETURN_TYPES = ("MODEL",)
    FUNCTION = "patch"
    CATEGORY = "advanced/model"
    def patch(self, model, max_shift, base_shift, width, height):
        m = model.clone()
        x1 = 256
        x2 = 4096
        mm = (max_shift - base_shift) / (x2 - x1)
        b = base_shift - mm * x1
        shift = (width * height / (8 * 8 * 2 * 2)) * mm + b
        sampling_base = comfy.model_sampling.ModelSamplingFlux
        sampling_type = comfy.model_sampling.CONST
        class ModelSamplingAdvanced(sampling_base, sampling_type):
            pass
        model_sampling = ModelSamplingAdvanced(model.model.model_config)
        model_sampling.set_parameters(shift=shift)
        m.add_object_patch("model_sampling", model_sampling)
        return (m, )
 class ModelSamplingContinuousEDM:
    @classmethod
    def INPUT_TYPES(s):
@ -284,5 +321,6 @@ NODE_CLASS_MAPPINGS = {
    "ModelSamplingStableCascade": ModelSamplingStableCascade,
    "ModelSamplingSD3": ModelSamplingSD3,
    "ModelSamplingAuraFlow": ModelSamplingAuraFlow,
    "ModelSamplingFlux": ModelSamplingFlux,
    "RescaleCFG": RescaleCFG,
 }
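The shift computed in `ModelSamplingFlux.patch` above is a straight line through two fixed points: `base_shift` at 256 and `max_shift` at 4096 on the x axis. The sketch below pulls that arithmetic into a standalone function so the end points can be checked; the function name `flux_shift` is hypothetical, and reading the divisor `8 * 8 * 2 * 2` as "8x latent downscale times 2x2 patches" is an assumption, since the diff does not say so.

```python
def flux_shift(width, height, max_shift=1.15, base_shift=0.5):
    """Linear interpolation of the sampling shift, mirroring the arithmetic in patch()."""
    x1, x2 = 256, 4096
    slope = (max_shift - base_shift) / (x2 - x1)
    intercept = base_shift - slope * x1
    # x value fed into the line; assumed here to be an image token count.
    tokens = width * height / (8 * 8 * 2 * 2)
    return tokens * slope + intercept


shift_1024 = flux_shift(1024, 1024)   # x = 4096, the upper anchor
shift_256 = flux_shift(256, 256)      # x = 256, the lower anchor
shift_512 = flux_shift(512, 512)      # x = 1024, in between
```

With the node's default values, a 1024x1024 image lands exactly on `max_shift` and a 256x256 image on `base_shift`; larger images extrapolate past `max_shift` because the line is not clamped.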
--- a/comfy_extras/nodes_model_merging.py
+++ b/comfy_extras/nodes_model_merging.py
@ -264,6 +264,7 @@ class CLIPSave:
        metadata = {}
        if not args.disable_metadata:
            metadata["format"] = "pt"
            metadata["prompt"] = prompt_info
            if extra_pnginfo is not None:
                for x in extra_pnginfo:
@ -332,6 +333,25 @@ class VAESave:
        comfy.utils.save_torch_file(vae.get_sd(), output_checkpoint, metadata=metadata)
        return {}
 class ModelSave:
    def __init__(self):
        self.output_dir = folder_paths.get_output_directory()
    @classmethod
    def INPUT_TYPES(s):
        return {"required": { "model": ("MODEL",),
                              "filename_prefix": ("STRING", {"default": "diffusion_models/ComfyUI"}),},
                "hidden": {"prompt": "PROMPT", "extra_pnginfo": "EXTRA_PNGINFO"},}
    RETURN_TYPES = ()
    FUNCTION = "save"
    OUTPUT_NODE = True
    CATEGORY = "advanced/model_merging"
    def save(self, model, filename_prefix, prompt=None, extra_pnginfo=None):
        save_checkpoint(model, filename_prefix=filename_prefix, output_dir=self.output_dir, prompt=prompt, extra_pnginfo=extra_pnginfo)
        return {}
 NODE_CLASS_MAPPINGS = {
    "ModelMergeSimple": ModelMergeSimple,
    "ModelMergeBlocks": ModelMergeBlocks,
@ -343,4 +363,9 @@ NODE_CLASS_MAPPINGS = {
    "CLIPMergeAdd": CLIPAdd,
    "CLIPSave": CLIPSave,
    "VAESave": VAESave,
    "ModelSave": ModelSave,
 }
 NODE_DISPLAY_NAME_MAPPINGS = {
    "CheckpointSave": "Save Checkpoint",
 }
--- a/comfy_extras/nodes_model_merging_model_specific.py
+++ b/comfy_extras/nodes_model_merging_model_specific.py
@ -75,9 +75,36 @@ class ModelMergeSD3_2B(comfy_extras.nodes_model_merging.ModelMergeBlocks):
        return {"required": arg_dict}
 class ModelMergeFlux1(comfy_extras.nodes_model_merging.ModelMergeBlocks):
    CATEGORY = "advanced/model_merging/model_specific"
    @classmethod
    def INPUT_TYPES(s):
        arg_dict = { "model1": ("MODEL",),
                              "model2": ("MODEL",)}
        argument = ("FLOAT", {"default": 1.0, "min": 0.0, "max": 1.0, "step": 0.01})
        arg_dict["img_in."] = argument
        arg_dict["time_in."] = argument
        arg_dict["guidance_in"] = argument
        arg_dict["vector_in."] = argument
        arg_dict["txt_in."] = argument
        for i in range(19):
            arg_dict["double_blocks.{}.".format(i)] = argument
        for i in range(38):
            arg_dict["single_blocks.{}.".format(i)] = argument
        arg_dict["final_layer."] = argument
        return {"required": arg_dict}
 NODE_CLASS_MAPPINGS = {
    "ModelMergeSD1": ModelMergeSD1,
    "ModelMergeSD2": ModelMergeSD1, #SD1 and SD2 have the same blocks
    "ModelMergeSDXL": ModelMergeSDXL,
    "ModelMergeSD3_2B": ModelMergeSD3_2B,
    "ModelMergeFlux1": ModelMergeFlux1,
 }
--- a/comfy_extras/nodes_sd3.py
+++ b/comfy_extras/nodes_sd3.py
@ -27,8 +27,8 @@ class EmptySD3LatentImage:
    @classmethod
    def INPUT_TYPES(s):
-        return {"required": { "width": ("INT", {"default": 1024, "min": 16, "max": nodes.MAX_RESOLUTION, "step": 8}),
+        return {"required": { "width": ("INT", {"default": 1024, "min": 16, "max": nodes.MAX_RESOLUTION, "step": 16}),
-                              "height": ("INT", {"default": 1024, "min": 16, "max": nodes.MAX_RESOLUTION, "step": 8}),
+                              "height": ("INT", {"default": 1024, "min": 16, "max": nodes.MAX_RESOLUTION, "step": 16}),
                              "batch_size": ("INT", {"default": 1, "min": 1, "max": 4096})}}
    RETURN_TYPES = ("LATENT",)
    FUNCTION = "generate"
@ -100,3 +100,8 @@ NODE_CLASS_MAPPINGS = {
    "CLIPTextEncodeSD3": CLIPTextEncodeSD3,
    "ControlNetApplySD3": ControlNetApplySD3,
 }
 NODE_DISPLAY_NAME_MAPPINGS = {
    # Sampling
    "ControlNetApplySD3": "ControlNetApply SD3 and HunyuanDiT",
 }
--- a/custom_nodes/example_node.py.example
+++ b/custom_nodes/example_node.py.example
@ -4,14 +4,14 @@ class Example:
    Class methods
    -------------
-    INPUT_TYPES (dict): 
+    INPUT_TYPES (dict):
        Tell the main program input parameters of nodes.
    IS_CHANGED:
        optional method to control when the node is re executed.
    Attributes
    ----------
-    RETURN_TYPES (`tuple`): 
+    RETURN_TYPES (`tuple`):
        The type of each element in the output tuple.
    RETURN_NAMES (`tuple`):
        Optional: The name of each output in the output tuple.
@ -23,13 +23,19 @@ class Example:
        Assumed to be False if not present.
    CATEGORY (`str`):
        The category the node should appear in the UI.
    DEPRECATED (`bool`):
        Indicates whether the node is deprecated. Deprecated nodes are hidden by default in the UI, but remain
        functional in existing workflows that use them.
    EXPERIMENTAL (`bool`):
        Indicates whether the node is experimental. Experimental nodes are marked as such in the UI and may be subject to
        significant changes or removal in future versions. Use with caution in production workflows.
    execute(s) -> tuple || None:
        The entry point method. The name of this method must be the same as the value of property `FUNCTION`.
        For example, if `FUNCTION = "execute"` then this method's name must be `execute`, if `FUNCTION = "foo"` then it must be `foo`.
    """
    def __init__(self):
        pass
-    
+
    @classmethod
    def INPUT_TYPES(s):
        """
@ -54,7 +60,8 @@ class Example:
                    "min": 0, #Minimum value
                    "max": 4096, #Maximum value
                    "step": 64, #Slider's step
-                    "display": "number" # Cosmetic only: display as "number" or "slider"
+                    "display": "number", # Cosmetic only: display as "number" or "slider"
                    "lazy": True # Will only be evaluated if check_lazy_status requires it
                }),
                "float_field": ("FLOAT", {
                    "default": 1.0,
@ -62,11 +69,14 @@ class Example:
                    "max": 10.0,
                    "step": 0.01,
                    "round": 0.001, #The value representing the precision to round to, will be set to the step value by default. Can be set to False to disable rounding.
-                    "display": "number"}),
+                    "display": "number",
                    "lazy": True
                }),
                "print_to_screen": (["enable", "disable"],),
                "string_field": ("STRING", {
                    "multiline": False, #True if you want the field to look like the one on the ClipTextEncode node
-                    "default": "Hello World!"
+                    "default": "Hello World!",
                    "lazy": True
                }),
            },
        }
@ -80,6 +90,23 @@ class Example:
    CATEGORY = "Example"
    def check_lazy_status(self, image, string_field, int_field, float_field, print_to_screen):
        """
            Return a list of input names that need to be evaluated.
            This function will be called if there are any lazy inputs which have not yet been
            evaluated. As long as you return at least one field which has not yet been evaluated
            (and more exist), this function will be called again once the value of the requested
            field is available.
            Any evaluated inputs will be passed as arguments to this function. Any unevaluated
            inputs will have the value None.
        """
        if print_to_screen == "enable":
            return ["int_field", "float_field", "string_field"]
        else:
            return []
    def test(self, image, string_field, int_field, float_field, print_to_screen):
        if print_to_screen == "enable":
            print(f"""Your input contains:
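Reviewer note: the `check_lazy_status` docstring above describes a request-and-retry protocol: unevaluated lazy inputs arrive as `None`, and the method names the ones it still needs. The sketch below imitates that protocol outside ComfyUI. `LazySwitch` and `resolve` are hypothetical names; `resolve` is a toy driver, not the real executor, and it assumes a producer never returns `None`.

```python
class LazySwitch:
    """Hypothetical node: returns one of two inputs and only requests the one it needs."""

    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {
            "use_a": ("BOOLEAN", {"default": True}),
            "a": ("INT", {"default": 0, "lazy": True}),
            "b": ("INT", {"default": 0, "lazy": True}),
        }}

    RETURN_TYPES = ("INT",)
    FUNCTION = "pick"
    CATEGORY = "Example"

    def check_lazy_status(self, use_a, a=None, b=None):
        # Unevaluated lazy inputs are passed as None; ask only for the selected branch.
        wanted = "a" if use_a else "b"
        value = a if use_a else b
        return [wanted] if value is None else []

    def pick(self, use_a, a=None, b=None):
        return (a if use_a else b,)


def resolve(node, known, producers):
    """Toy driver (not ComfyUI's executor): evaluate lazy inputs only when requested."""
    while True:
        needed = [n for n in node.check_lazy_status(**known) if known.get(n) is None]
        if not needed:
            return known
        for name in needed:
            known[name] = producers[name]()


calls = []
producers = {
    "a": lambda: calls.append("a") or 1,
    "b": lambda: calls.append("b") or 2,
}
known = resolve(LazySwitch(), {"use_a": False, "a": None, "b": None}, producers)
print(calls, LazySwitch().pick(**known))
```

Only the producer for `b` runs here, which is the saving lazy inputs are meant to provide.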
--- a/execution.py
+++ b/execution.py
@ -5,6 +5,7 @@ import threading
 import heapq
 import time
 import traceback
 from enum import Enum
 import inspect
 from typing import List, Literal, NamedTuple, Optional
@ -12,102 +13,219 @@ import torch
 import nodes
 import comfy.model_management
 from comfy_execution.graph import get_input_info, ExecutionList, DynamicPrompt, ExecutionBlocker
 from comfy_execution.graph_utils import is_link, GraphBuilder
 from comfy_execution.caching import HierarchicalCache, LRUCache, CacheKeySetInputSignature, CacheKeySetID
 from comfy.cli_args import args
-def get_input_data(inputs, class_def, unique_id, outputs={}, prompt={}, extra_data={}):
+class ExecutionResult(Enum):
    SUCCESS = 0
    FAILURE = 1
    PENDING = 2
 class DuplicateNodeError(Exception):
    pass
 class IsChangedCache:
    def __init__(self, dynprompt, outputs_cache):
        self.dynprompt = dynprompt
        self.outputs_cache = outputs_cache
        self.is_changed = {}
    def get(self, node_id):
        if node_id in self.is_changed:
            return self.is_changed[node_id]
        node = self.dynprompt.get_node(node_id)
        class_type = node["class_type"]
        class_def = nodes.NODE_CLASS_MAPPINGS[class_type]
        if not hasattr(class_def, "IS_CHANGED"):
            self.is_changed[node_id] = False
            return self.is_changed[node_id]
        if "is_changed" in node:
            self.is_changed[node_id] = node["is_changed"]
            return self.is_changed[node_id]
        # Intentionally do not use cached outputs here. We only want constants in IS_CHANGED
        input_data_all, _ = get_input_data(node["inputs"], class_def, node_id, None)
        try:
            is_changed = _map_node_over_list(class_def, input_data_all, "IS_CHANGED")
            node["is_changed"] = [None if isinstance(x, ExecutionBlocker) else x for x in is_changed]
        except Exception as e:
            logging.warning("WARNING: {}".format(e))
            node["is_changed"] = float("NaN")
        finally:
            self.is_changed[node_id] = node["is_changed"]
        return self.is_changed[node_id]
 class CacheSet:
    def __init__(self, lru_size=None):
        if lru_size is None or lru_size == 0:
            self.init_classic_cache() 
        else:
            self.init_lru_cache(lru_size)
        self.all = [self.outputs, self.ui, self.objects]
    # Useful for those with ample RAM/VRAM -- allows experimenting without
    # blowing away the cache every time
    def init_lru_cache(self, cache_size):
        self.outputs = LRUCache(CacheKeySetInputSignature, max_size=cache_size)
        self.ui = LRUCache(CacheKeySetInputSignature, max_size=cache_size)
        self.objects = HierarchicalCache(CacheKeySetID)
    # Performs like the old cache -- dump data ASAP
    def init_classic_cache(self):
        self.outputs = HierarchicalCache(CacheKeySetInputSignature)
        self.ui = HierarchicalCache(CacheKeySetInputSignature)
        self.objects = HierarchicalCache(CacheKeySetID)
    def recursive_debug_dump(self):
        result = {
            "outputs": self.outputs.recursive_debug_dump(),
            "ui": self.ui.recursive_debug_dump(),
        }
        return result
 def get_input_data(inputs, class_def, unique_id, outputs=None, dynprompt=None, extra_data={}):
    valid_inputs = class_def.INPUT_TYPES()
    input_data_all = {}
    missing_keys = {}
    for x in inputs:
        input_data = inputs[x]
-        if isinstance(input_data, list):
+        input_type, input_category, input_info = get_input_info(class_def, x)
        def mark_missing():
            missing_keys[x] = True
            input_data_all[x] = (None,)
        if is_link(input_data) and (not input_info or not input_info.get("rawLink", False)):
            input_unique_id = input_data[0]
            output_index = input_data[1]
-            if input_unique_id not in outputs:
+            if outputs is None:
-                input_data_all[x] = (None,)
+                mark_missing()
                continue # This might be a lazily-evaluated input
            cached_output = outputs.get(input_unique_id)
            if cached_output is None:
                mark_missing()
                continue
-            obj = outputs[input_unique_id][output_index]
+            if output_index >= len(cached_output):
                mark_missing()
                continue
            obj = cached_output[output_index]
            input_data_all[x] = obj
-        else:
+        elif input_category is not None:
-            if ("required" in valid_inputs and x in valid_inputs["required"]) or ("optional" in valid_inputs and x in valid_inputs["optional"]):
+            input_data_all[x] = [input_data]
-                input_data_all[x] = [input_data]
    if "hidden" in valid_inputs:
        h = valid_inputs["hidden"]
        for x in h:
            if h[x] == "PROMPT":
-                input_data_all[x] = [prompt]
+                input_data_all[x] = [dynprompt.get_original_prompt() if dynprompt is not None else {}]
            if h[x] == "DYNPROMPT":
                input_data_all[x] = [dynprompt]
            if h[x] == "EXTRA_PNGINFO":
                input_data_all[x] = [extra_data.get('extra_pnginfo', None)]
            if h[x] == "UNIQUE_ID":
                input_data_all[x] = [unique_id]
-    return input_data_all
+    return input_data_all, missing_keys
-def map_node_over_list(obj, input_data_all, func, allow_interrupt=False):
+map_node_over_list = None #Don't hook this please
 def _map_node_over_list(obj, input_data_all, func, allow_interrupt=False, execution_block_cb=None, pre_execute_cb=None):
    # check if node wants the lists
-    input_is_list = False
+    input_is_list = getattr(obj, "INPUT_IS_LIST", False)
-    if hasattr(obj, "INPUT_IS_LIST"):
-        input_is_list = obj.INPUT_IS_LIST
    if len(input_data_all) == 0:
        max_len_input = 0
    else:
-        max_len_input = max([len(x) for x in input_data_all.values()])
+        max_len_input = max(len(x) for x in input_data_all.values())
    # get a slice of inputs, repeat last input when list isn't long enough
    def slice_dict(d, i):
-        d_new = dict()
+        return {k: v[i if len(v) > i else -1] for k, v in d.items()}
-        for k,v in d.items():
-            d_new[k] = v[i if len(v) > i else -1]
-        return d_new
    results = []
    def process_inputs(inputs, index=None):
        if allow_interrupt:
            nodes.before_node_execution()
        execution_block = None
        for k, v in inputs.items():
            if isinstance(v, ExecutionBlocker):
                execution_block = execution_block_cb(v) if execution_block_cb else v
                break
        if execution_block is None:
            if pre_execute_cb is not None and index is not None:
                pre_execute_cb(index)
            results.append(getattr(obj, func)(**inputs))
        else:
            results.append(execution_block)
    if input_is_list:
-        if allow_interrupt:
+        process_inputs(input_data_all, 0)
-            nodes.before_node_execution()
-        results.append(getattr(obj, func)(**input_data_all))
    elif max_len_input == 0:
-        if allow_interrupt:
+        process_inputs({})
-            nodes.before_node_execution()
+    else: 
-        results.append(getattr(obj, func)())
-    else:
        for i in range(max_len_input):
-            if allow_interrupt:
+            input_dict = slice_dict(input_data_all, i)
-                nodes.before_node_execution()
+            process_inputs(input_dict, i)
-            results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
    return results
-def get_output_data(obj, input_data_all):
+def merge_result_data(results, obj):
    # check which outputs need concatenating
    output = []
    output_is_list = [False] * len(results[0])
    if hasattr(obj, "OUTPUT_IS_LIST"):
        output_is_list = obj.OUTPUT_IS_LIST
    # merge node execution results
    for i, is_list in zip(range(len(results[0])), output_is_list):
        if is_list:
            output.append([x for o in results for x in o[i]])
        else:
            output.append([o[i] for o in results])
    return output
 def get_output_data(obj, input_data_all, execution_block_cb=None, pre_execute_cb=None):
    results = []
    uis = []
-    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
+    subgraph_results = []
-
+    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
-    for r in return_values:
+    has_subgraph = False
    for i in range(len(return_values)):
        r = return_values[i]
        if isinstance(r, dict):
            if 'ui' in r:
                uis.append(r['ui'])
-            if 'result' in r:
+            if 'expand' in r:
-                results.append(r['result'])
+                # Perform an expansion, but do not append results
                has_subgraph = True
                new_graph = r['expand']
                result = r.get("result", None)
                if isinstance(result, ExecutionBlocker):
                    result = tuple([result] * len(obj.RETURN_TYPES))
                subgraph_results.append((new_graph, result))
            elif 'result' in r:
                result = r.get("result", None)
                if isinstance(result, ExecutionBlocker):
                    result = tuple([result] * len(obj.RETURN_TYPES))
                results.append(result)
                subgraph_results.append((None, result))
        else:
            if isinstance(r, ExecutionBlocker):
                r = tuple([r] * len(obj.RETURN_TYPES))
            results.append(r)
            subgraph_results.append((None, r))
-    output = []
+    if has_subgraph:
-    if len(results) > 0:
+        output = subgraph_results
-        # check which outputs need concatenating
+    elif len(results) > 0:
-        output_is_list = [False] * len(results[0])
+        output = merge_result_data(results, obj)
-        if hasattr(obj, "OUTPUT_IS_LIST"):
+    else:
-            output_is_list = obj.OUTPUT_IS_LIST
+        output = []
-        # merge node execution results
-        for i, is_list in zip(range(len(results[0])), output_is_list):
-            if is_list:
-                output.append([x for o in results for x in o[i]])
-            else:
-                output.append([o[i] for o in results])
    ui = dict()    
    if len(uis) > 0:
        ui = {k: [y for x in uis for y in x[k]] for k in uis[0].keys()}
-    return output, ui
+    return output, ui, has_subgraph
 def format_value(x):
    if x is None:
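Reviewer note: the hunk above broadcasts inputs across list positions (repeating the last element of short lists) and then merges the per-call result tuples column by column. The sketch below reproduces just those two steps. It is a simplified stand-in that omits `ExecutionBlocker`, the callbacks, and `INPUT_IS_LIST`; `map_over_lists`, `merge_results`, and `scale` are hypothetical names.

```python
def slice_inputs(inputs, i):
    """Take element i of every input list, repeating the last element when a list is short."""
    return {k: v[i if len(v) > i else -1] for k, v in inputs.items()}


def map_over_lists(func, inputs):
    """Simplified stand-in for _map_node_over_list: one call per list position."""
    n = max((len(v) for v in inputs.values()), default=0)
    if n == 0:
        return [func()]
    return [func(**slice_inputs(inputs, i)) for i in range(n)]


def merge_results(results, output_is_list):
    """Column-wise merge in the style of merge_result_data."""
    merged = []
    for i, is_list in enumerate(output_is_list):
        if is_list:
            # List outputs are concatenated across calls.
            merged.append([x for r in results for x in r[i]])
        else:
            # Scalar outputs are collected, one entry per call.
            merged.append([r[i] for r in results])
    return merged


def scale(value, factor):
    # Each call returns a tuple with one entry per output slot.
    return (value * factor, [value, factor])


results = map_over_lists(scale, {"value": [1, 2, 3], "factor": [10]})
print(merge_results(results, [False, True]))
```

The single-element `factor` list is reused for all three calls, so a node written for one value still works when another input carries a batch.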
@ -117,53 +235,145 @@ def format_value(x):
    else:
        return str(x)
-def recursive_execute(server, prompt, outputs, current_item, extra_data, executed, prompt_id, outputs_ui, object_storage):
+def execute(server, dynprompt, caches, current_item, extra_data, executed, prompt_id, execution_list, pending_subgraph_results):
    unique_id = current_item
-    inputs = prompt[unique_id]['inputs']
+    real_node_id = dynprompt.get_real_node_id(unique_id)
-    class_type = prompt[unique_id]['class_type']
+    display_node_id = dynprompt.get_display_node_id(unique_id)
    parent_node_id = dynprompt.get_parent_node_id(unique_id)
    inputs = dynprompt.get_node(unique_id)['inputs']
    class_type = dynprompt.get_node(unique_id)['class_type']
    class_def = nodes.NODE_CLASS_MAPPINGS[class_type]
-    if unique_id in outputs:
+    if caches.outputs.get(unique_id) is not None:
-        return (True, None, None)
+        if server.client_id is not None:
-
+            cached_output = caches.ui.get(unique_id) or {}
-    for x in inputs:
+            server.send_sync("executed", { "node": unique_id, "display_node": display_node_id, "output": cached_output.get("output",None), "prompt_id": prompt_id }, server.client_id)
-        input_data = inputs[x]
+        return (ExecutionResult.SUCCESS, None, None)
-        if isinstance(input_data, list):
-            input_unique_id = input_data[0]
-            output_index = input_data[1]
-            if input_unique_id not in outputs:
-                result = recursive_execute(server, prompt, outputs, input_unique_id, extra_data, executed, prompt_id, outputs_ui, object_storage)
-                if result[0] is not True:
-                    # Another node failed further upstream
-                    return result
    input_data_all = None
    try:
-        input_data_all = get_input_data(inputs, class_def, unique_id, outputs, prompt, extra_data)
+        if unique_id in pending_subgraph_results:
-        if server.client_id is not None:
+            cached_results = pending_subgraph_results[unique_id]
-            server.last_node_id = unique_id
+            resolved_outputs = []
-            server.send_sync("executing", { "node": unique_id, "prompt_id": prompt_id }, server.client_id)
+            for is_subgraph, result in cached_results:
                if not is_subgraph:
                    resolved_outputs.append(result)
                else:
                    resolved_output = []
                    for r in result:
                        if is_link(r):
                            source_node, source_output = r[0], r[1]
                            node_output = caches.outputs.get(source_node)[source_output]
                            for o in node_output:
                                resolved_output.append(o)
-        obj = object_storage.get((unique_id, class_type), None)
+                        else:
-        if obj is None:
+                            resolved_output.append(r)
-            obj = class_def()
+                    resolved_outputs.append(tuple(resolved_output))
-            object_storage[(unique_id, class_type)] = obj
+            output_data = merge_result_data(resolved_outputs, class_def)
-
+            output_ui = []
-        output_data, output_ui = get_output_data(obj, input_data_all)
+            has_subgraph = False
-        outputs[unique_id] = output_data
+        else:
-        if len(output_ui) > 0:
+            input_data_all, missing_keys = get_input_data(inputs, class_def, unique_id, caches.outputs, dynprompt, extra_data)
-            outputs_ui[unique_id] = output_ui
            if server.client_id is not None:
-                server.send_sync("executed", { "node": unique_id, "output": output_ui, "prompt_id": prompt_id }, server.client_id)
+                server.last_node_id = display_node_id
                server.send_sync("executing", { "node": unique_id, "display_node": display_node_id, "prompt_id": prompt_id }, server.client_id)
            obj = caches.objects.get(unique_id)
            if obj is None:
                obj = class_def()
                caches.objects.set(unique_id, obj)
            if hasattr(obj, "check_lazy_status"):
                required_inputs = _map_node_over_list(obj, input_data_all, "check_lazy_status", allow_interrupt=True)
                required_inputs = set(sum([r for r in required_inputs if isinstance(r,list)], []))
                required_inputs = [x for x in required_inputs if isinstance(x,str) and (
                    x not in input_data_all or x in missing_keys
                )]
                if len(required_inputs) > 0:
                    for i in required_inputs:
                        execution_list.make_input_strong_link(unique_id, i)
                    return (ExecutionResult.PENDING, None, None)
            def execution_block_cb(block):
                if block.message is not None:
                    mes = {
                        "prompt_id": prompt_id,
                        "node_id": unique_id,
                        "node_type": class_type,
                        "executed": list(executed),
                        "exception_message": f"Execution Blocked: {block.message}",
                        "exception_type": "ExecutionBlocked",
                        "traceback": [],
                        "current_inputs": [],
                        "current_outputs": [],
                    }
                    server.send_sync("execution_error", mes, server.client_id)
                    return ExecutionBlocker(None)
                else:
                    return block
            def pre_execute_cb(call_index):
                GraphBuilder.set_default_prefix(unique_id, call_index, 0)
            output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
        if len(output_ui) > 0:
            caches.ui.set(unique_id, {
                "meta": {
                    "node_id": unique_id,
                    "display_node": display_node_id,
                    "parent_node": parent_node_id,
                    "real_node_id": real_node_id,
                },
                "output": output_ui
            })
            if server.client_id is not None:
                server.send_sync("executed", { "node": unique_id, "display_node": display_node_id, "output": output_ui, "prompt_id": prompt_id }, server.client_id)
        if has_subgraph:
            cached_outputs = []
            new_node_ids = []
            new_output_ids = []
            new_output_links = []
            for i in range(len(output_data)):
                new_graph, node_outputs = output_data[i]
                if new_graph is None:
                    cached_outputs.append((False, node_outputs))
                else:
                    # Check for conflicts
                    for node_id in new_graph.keys():
                        if dynprompt.has_node(node_id):
                            raise DuplicateNodeError(f"Attempt to add duplicate node {node_id}. Ensure node ids are unique and deterministic or use graph_utils.GraphBuilder.")
                    for node_id, node_info in new_graph.items():
                        new_node_ids.append(node_id)
                        display_id = node_info.get("override_display_id", unique_id)
                        dynprompt.add_ephemeral_node(node_id, node_info, unique_id, display_id)
                        # Figure out if the newly created node is an output node
                        class_type = node_info["class_type"]
                        class_def = nodes.NODE_CLASS_MAPPINGS[class_type]
                        if hasattr(class_def, 'OUTPUT_NODE') and class_def.OUTPUT_NODE == True:
                            new_output_ids.append(node_id)
                    for i in range(len(node_outputs)):
                        if is_link(node_outputs[i]):
                            from_node_id, from_socket = node_outputs[i][0], node_outputs[i][1]
                            new_output_links.append((from_node_id, from_socket))
                    cached_outputs.append((True, node_outputs))
            new_node_ids = set(new_node_ids)
            for cache in caches.all:
                cache.ensure_subcache_for(unique_id, new_node_ids).clean_unused()
            for node_id in new_output_ids:
                execution_list.add_node(node_id)
            for link in new_output_links:
                execution_list.add_strong_link(link[0], link[1], unique_id)
            pending_subgraph_results[unique_id] = cached_outputs
            return (ExecutionResult.PENDING, None, None)
        caches.outputs.set(unique_id, output_data)
    except comfy.model_management.InterruptProcessingException as iex:
        logging.info("Processing interrupted")
        # skip formatting inputs/outputs
        error_details = {
-            "node_id": unique_id,
+            "node_id": real_node_id,
        }
-        return (False, error_details, iex)
+        return (ExecutionResult.FAILURE, error_details, iex)
    except Exception as ex:
        typ, _, tb = sys.exc_info()
        exception_type = full_type_name(typ)
@ -173,116 +383,36 @@ def recursive_execute(server, prompt, outputs, current_item, extra_data, execute
            for name, inputs in input_data_all.items():
                input_data_formatted[name] = [format_value(x) for x in inputs]
-        output_data_formatted = {}
+        logging.error(f"!!! Exception during processing !!! {ex}")
-        for node_id, node_outputs in outputs.items():
-            output_data_formatted[node_id] = [[format_value(x) for x in l] for l in node_outputs]
-        logging.error(f"!!! Exception during processing!!! {ex}")
        logging.error(traceback.format_exc())
        error_details = {
-            "node_id": unique_id,
+            "node_id": real_node_id,
            "exception_message": str(ex),
            "exception_type": exception_type,
            "traceback": traceback.format_tb(tb),
-            "current_inputs": input_data_formatted,
+            "current_inputs": input_data_formatted
-            "current_outputs": output_data_formatted
        }
-        return (False, error_details, ex)
+        if isinstance(ex, comfy.model_management.OOM_EXCEPTION):
            logging.error("Got an OOM, unloading all loaded models.")
            comfy.model_management.unload_all_models()
        return (ExecutionResult.FAILURE, error_details, ex)
    executed.add(unique_id)
-    return (True, None, None)
+    return (ExecutionResult.SUCCESS, None, None)
-def recursive_will_execute(prompt, outputs, current_item, memo={}):
-    unique_id = current_item
-    if unique_id in memo:
-        return memo[unique_id]
-    inputs = prompt[unique_id]['inputs']
-    will_execute = []
-    if unique_id in outputs:
-        return []
-    for x in inputs:
-        input_data = inputs[x]
-        if isinstance(input_data, list):
-            input_unique_id = input_data[0]
-            output_index = input_data[1]
-            if input_unique_id not in outputs:
-                will_execute += recursive_will_execute(prompt, outputs, input_unique_id, memo)
-    memo[unique_id] = will_execute + [unique_id]
-    return memo[unique_id]
-def recursive_output_delete_if_changed(prompt, old_prompt, outputs, current_item):
-    unique_id = current_item
-    inputs = prompt[unique_id]['inputs']
-    class_type = prompt[unique_id]['class_type']
-    class_def = nodes.NODE_CLASS_MAPPINGS[class_type]
-    is_changed_old = ''
-    is_changed = ''
-    to_delete = False
-    if hasattr(class_def, 'IS_CHANGED'):
-        if unique_id in old_prompt and 'is_changed' in old_prompt[unique_id]:
-            is_changed_old = old_prompt[unique_id]['is_changed']
-        if 'is_changed' not in prompt[unique_id]:
-            input_data_all = get_input_data(inputs, class_def, unique_id, outputs)
-            if input_data_all is not None:
-                try:
-                    #is_changed = class_def.IS_CHANGED(**input_data_all)
-                    is_changed = map_node_over_list(class_def, input_data_all, "IS_CHANGED")
-                    prompt[unique_id]['is_changed'] = is_changed
-                except:
-                    to_delete = True
-        else:
-            is_changed = prompt[unique_id]['is_changed']
-    if unique_id not in outputs:
-        return True
-    if not to_delete:
-        if is_changed != is_changed_old:
-            to_delete = True
-        elif unique_id not in old_prompt:
-            to_delete = True
-        elif class_type != old_prompt[unique_id]['class_type']:
-            to_delete = True
-        elif inputs == old_prompt[unique_id]['inputs']:
-            for x in inputs:
-                input_data = inputs[x]
-                if isinstance(input_data, list):
-                    input_unique_id = input_data[0]
-                    output_index = input_data[1]
-                    if input_unique_id in outputs:
-                        to_delete = recursive_output_delete_if_changed(prompt, old_prompt, outputs, input_unique_id)
-                    else:
-                        to_delete = True
-                    if to_delete:
-                        break
-        else:
-            to_delete = True
-    if to_delete:
-        d = outputs.pop(unique_id)
-        del d
-    return to_delete
 class PromptExecutor:
-    def __init__(self, server):
+    def __init__(self, server, lru_size=None):
        self.lru_size = lru_size
        self.server = server
        self.reset()
    def reset(self):
-        self.outputs = {}
+        self.caches = CacheSet(self.lru_size)
-        self.object_storage = {}
-        self.outputs_ui = {}
        self.status_messages = []
        self.success = True
-        self.old_prompt = {}
    def add_message(self, event, data: dict, broadcast: bool):
        data = {
@ -313,26 +443,13 @@ class PromptExecutor:
                "node_id": node_id,
                "node_type": class_type,
                "executed": list(executed),
                "exception_message": error["exception_message"],
                "exception_type": error["exception_type"],
                "traceback": error["traceback"],
                "current_inputs": error["current_inputs"],
-                "current_outputs": error["current_outputs"],
+                "current_outputs": list(current_outputs),
            }
            self.add_message("execution_error", mes, broadcast=False)
-        # Next, remove the subsequent outputs since they will not be executed
-        to_delete = []
-        for o in self.outputs:
-            if (o not in current_outputs) and (o not in executed):
-                to_delete += [o]
-                if o in self.old_prompt:
-                    d = self.old_prompt.pop(o)
-                    del d
-        for o in to_delete:
-            d = self.outputs.pop(o)
-            del d
    def execute(self, prompt, prompt_id, extra_data={}, execute_outputs=[]):
        nodes.interrupt_processing(False)
@ -346,65 +463,59 @@ class PromptExecutor:
        self.add_message("execution_start", { "prompt_id": prompt_id}, broadcast=False)
        with torch.inference_mode():
-            #delete cached outputs if nodes don't exist for them
+            dynamic_prompt = DynamicPrompt(prompt)
-            to_delete = []
+            is_changed_cache = IsChangedCache(dynamic_prompt, self.caches.outputs)
-            for o in self.outputs:
+            for cache in self.caches.all:
-                if o not in prompt:
+                cache.set_prompt(dynamic_prompt, prompt.keys(), is_changed_cache)
-                    to_delete += [o]
+                cache.clean_unused()
-            for o in to_delete:
-                d = self.outputs.pop(o)
-                del d
-            to_delete = []
-            for o in self.object_storage:
-                if o[0] not in prompt:
-                    to_delete += [o]
-                else:
-                    p = prompt[o[0]]
-                    if o[1] != p['class_type']:
-                        to_delete += [o]
-            for o in to_delete:
-                d = self.object_storage.pop(o)
-                del d
-            for x in prompt:
+            cached_nodes = []
-                recursive_output_delete_if_changed(prompt, self.old_prompt, self.outputs, x)
+            for node_id in prompt:
-
+                if self.caches.outputs.get(node_id) is not None:
-            current_outputs = set(self.outputs.keys())
+                    cached_nodes.append(node_id)
-            for x in list(self.outputs_ui.keys()):
-                if x not in current_outputs:
-                    d = self.outputs_ui.pop(x)
-                    del d
            comfy.model_management.cleanup_models(keep_clone_weights_loaded=True)
            self.add_message("execution_cached",
-                          { "nodes": list(current_outputs) , "prompt_id": prompt_id},
+                          { "nodes": cached_nodes, "prompt_id": prompt_id},
                          broadcast=False)
+            pending_subgraph_results = {}
            executed = set()
-            output_node_id = None
+            execution_list = ExecutionList(dynamic_prompt, self.caches.outputs)
-            to_execute = []
+            current_outputs = self.caches.outputs.all_node_ids()
            for node_id in list(execute_outputs):
-                to_execute += [(0, node_id)]
+                execution_list.add_node(node_id)
-            while len(to_execute) > 0:
+            while not execution_list.is_empty():
-                #always execute the output that depends on the least amount of unexecuted nodes first
+                node_id, error, ex = execution_list.stage_node_execution()
-                memo = {}
+                if error is not None:
-                to_execute = sorted(list(map(lambda a: (len(recursive_will_execute(prompt, self.outputs, a[-1], memo)), a[-1]), to_execute)))
+                    self.handle_execution_error(prompt_id, dynamic_prompt.original_prompt, current_outputs, executed, error, ex)
-                output_node_id = to_execute.pop(0)[-1]
-                # This call shouldn't raise anything if there's an error deep in
-                # the actual SD code, instead it will report the node where the
-                # error was raised
-                self.success, error, ex = recursive_execute(self.server, prompt, self.outputs, output_node_id, extra_data, executed, prompt_id, self.outputs_ui, self.object_storage)
-                if self.success is not True:
-                    self.handle_execution_error(prompt_id, prompt, current_outputs, executed, error, ex)
                    break
+                result, error, ex = execute(self.server, dynamic_prompt, self.caches, node_id, extra_data, executed, prompt_id, execution_list, pending_subgraph_results)
+                self.success = result != ExecutionResult.FAILURE
+                if result == ExecutionResult.FAILURE:
+                    self.handle_execution_error(prompt_id, dynamic_prompt.original_prompt, current_outputs, executed, error, ex)
+                    break
+                elif result == ExecutionResult.PENDING:
+                    execution_list.unstage_node_execution()
+                else: # result == ExecutionResult.SUCCESS:
+                    execution_list.complete_node_execution()
            else:
                # Only execute when the while-loop ends without break
                self.add_message("execution_success", { "prompt_id": prompt_id }, broadcast=False)
-            for x in executed:
+            ui_outputs = {}
-                self.old_prompt[x] = copy.deepcopy(prompt[x])
+            meta_outputs = {}
+            all_node_ids = self.caches.ui.all_node_ids()
+            for node_id in all_node_ids:
+                ui_info = self.caches.ui.get(node_id)
+                if ui_info is not None:
+                    ui_outputs[node_id] = ui_info["output"]
+                    meta_outputs[node_id] = ui_info["meta"]
+            self.history_result = {
+                "outputs": ui_outputs,
+                "meta": meta_outputs,
+            }
            self.server.last_node_id = None
            if comfy.model_management.DISABLE_SMART_MEMORY:
                comfy.model_management.unload_all_models()
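
The scheduling loop in the hunk above relies on Python's `while ... else`: the `else` branch runs only when the loop ends without `break`. The following is a minimal sketch of that control flow with a three-state result. `Result`, `run_queue`, and the `step` callable are hypothetical stand-ins for illustration only; they are not ComfyUI's `ExecutionResult`, `ExecutionList`, or `execute`.

```python
from collections import deque
from enum import Enum


class Result(Enum):
    # Hypothetical stand-in for the three outcomes used in the hunk above.
    SUCCESS = 0
    FAILURE = 1
    PENDING = 2


def run_queue(node_ids, step):
    """Run step(node_id) for each queued node; return (success, messages)."""
    queue = deque(node_ids)
    messages = []
    success = True
    while queue:
        node_id = queue.popleft()
        result = step(node_id)
        success = result != Result.FAILURE
        if result == Result.FAILURE:
            messages.append(f"error:{node_id}")
            break
        elif result == Result.PENDING:
            # Not ready yet: put the node back so it is retried later.
            queue.append(node_id)
        # SUCCESS: the node is finished and stays out of the queue.
    else:
        # Reached only when the loop ended without `break`.
        messages.append("execution_success")
    return success, messages
```

In this sketch a failing node stops the loop immediately, so the success message is never emitted for a failed run, while a pending node is simply retried after the rest of the queue.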
@ -421,31 +532,37 @@ def validate_inputs(prompt, item, validated):
    obj_class = nodes.NODE_CLASS_MAPPINGS[class_type]
    class_inputs = obj_class.INPUT_TYPES()
-    required_inputs = class_inputs['required']
+    valid_inputs = set(class_inputs.get('required',{})).union(set(class_inputs.get('optional',{})))
    errors = []
    valid = True
    validate_function_inputs = []
    validate_has_kwargs = False
    if hasattr(obj_class, "VALIDATE_INPUTS"):
-        validate_function_inputs = inspect.getfullargspec(obj_class.VALIDATE_INPUTS).args
+        argspec = inspect.getfullargspec(obj_class.VALIDATE_INPUTS)
+        validate_function_inputs = argspec.args
+        validate_has_kwargs = argspec.varkw is not None
    received_types = {}
-    for x in required_inputs:
+    for x in valid_inputs:
+        type_input, input_category, extra_info = get_input_info(obj_class, x)
+        assert extra_info is not None
        if x not in inputs:
-            error = {
+            if input_category == "required":
-                "type": "required_input_missing",
+                error = {
-                "message": "Required input is missing",
+                    "type": "required_input_missing",
-                "details": f"{x}",
+                    "message": "Required input is missing",
-                "extra_info": {
+                    "details": f"{x}",
-                    "input_name": x
+                    "extra_info": {
                        "input_name": x
                    }
                }
-            }
+                errors.append(error)
-            errors.append(error)
            continue
        val = inputs[x]
-        info = required_inputs[x]
+        info = (type_input, extra_info)
-        type_input = info[0]
        if isinstance(val, list):
            if len(val) != 2:
                error = {
@ -464,8 +581,9 @@ def validate_inputs(prompt, item, validated):
            o_id = val[0]
            o_class_type = prompt[o_id]['class_type']
            r = nodes.NODE_CLASS_MAPPINGS[o_class_type].RETURN_TYPES
-            if r[val[1]] != type_input:
+            received_type = r[val[1]]
-                received_type = r[val[1]]
+            received_types[x] = received_type
+            if 'input_types' not in validate_function_inputs and received_type != type_input:
                details = f"{x}, {received_type} != {type_input}"
                error = {
                    "type": "return_type_mismatch",
@ -516,6 +634,9 @@ def validate_inputs(prompt, item, validated):
                if type_input == "STRING":
                    val = str(val)
                    inputs[x] = val
+                if type_input == "BOOLEAN":
+                    val = bool(val)
+                    inputs[x] = val
            except Exception as ex:
                error = {
                    "type": "invalid_input_type",
@ -531,11 +652,11 @@ def validate_inputs(prompt, item, validated):
                errors.append(error)
                continue
-            if len(info) > 1:
+            if x not in validate_function_inputs and not validate_has_kwargs:
-                if "min" in info[1] and val < info[1]["min"]:
+                if "min" in extra_info and val < extra_info["min"]:
                    error = {
                        "type": "value_smaller_than_min",
-                        "message": "Value {} smaller than min of {}".format(val, info[1]["min"]),
+                        "message": "Value {} smaller than min of {}".format(val, extra_info["min"]),
                        "details": f"{x}",
                        "extra_info": {
                            "input_name": x,
@ -545,10 +666,10 @@ def validate_inputs(prompt, item, validated):
                    }
                    errors.append(error)
                    continue
-                if "max" in info[1] and val > info[1]["max"]:
+                if "max" in extra_info and val > extra_info["max"]:
                    error = {
                        "type": "value_bigger_than_max",
-                        "message": "Value {} bigger than max of {}".format(val, info[1]["max"]),
+                        "message": "Value {} bigger than max of {}".format(val, extra_info["max"]),
                        "details": f"{x}",
                        "extra_info": {
                            "input_name": x,
@ -559,7 +680,6 @@ def validate_inputs(prompt, item, validated):
                    errors.append(error)
                    continue
-            if x not in validate_function_inputs:
                if isinstance(type_input, list):
                    if val not in type_input:
                        input_config = info
@ -586,18 +706,20 @@ def validate_inputs(prompt, item, validated):
                        errors.append(error)
                        continue
-    if len(validate_function_inputs) > 0:
+    if len(validate_function_inputs) > 0 or validate_has_kwargs:
-        input_data_all = get_input_data(inputs, obj_class, unique_id)
+        input_data_all, _ = get_input_data(inputs, obj_class, unique_id)
        input_filtered = {}
        for x in input_data_all:
-            if x in validate_function_inputs:
+            if x in validate_function_inputs or validate_has_kwargs:
                input_filtered[x] = input_data_all[x]
        if 'input_types' in validate_function_inputs:
            input_filtered['input_types'] = [received_types]
        #ret = obj_class.VALIDATE_INPUTS(**input_filtered)
-        ret = map_node_over_list(obj_class, input_filtered, "VALIDATE_INPUTS")
+        ret = _map_node_over_list(obj_class, input_filtered, "VALIDATE_INPUTS")
        for x in input_filtered:
            for i, r in enumerate(ret):
-                if r is not True:
+                if r is not True and not isinstance(r, ExecutionBlocker):
                    details = f"{x}"
                    if r is not False:
                        details += f" - {str(r)}"
@ -608,8 +730,6 @@ def validate_inputs(prompt, item, validated):
                        "details": details,
                        "extra_info": {
                            "input_name": x,
-                            "input_config": info,
-                            "received_value": val,
                        }
                    }
                    errors.append(error)
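
The `validate_has_kwargs` flag in the hunks above comes from `inspect.getfullargspec`, whose `varkw` field is `None` unless the callable declares `**kwargs`. The sketch below shows how that flag changes which inputs are forwarded to a validator. `describe_validator`, `filter_inputs`, and the two validators are hypothetical names used only for this illustration.

```python
import inspect


def describe_validator(func):
    """Return (named_args, has_kwargs) for a validation callable."""
    argspec = inspect.getfullargspec(func)
    return argspec.args, argspec.varkw is not None


def filter_inputs(inputs, named_args, has_kwargs):
    """Keep inputs the validator names explicitly, or all of them if it accepts **kwargs."""
    return {k: v for k, v in inputs.items() if k in named_args or has_kwargs}


def validate_named(width, height):  # hypothetical validator with explicit arguments
    return True


def validate_any(**kwargs):  # hypothetical validator that accepts any input
    return True
```

With `validate_named`, only `width` and `height` would be forwarded; with `validate_any`, every input is forwarded, which mirrors the `or validate_has_kwargs` condition added in the diff.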
@ -775,7 +895,7 @@ class PromptQueue:
        completed: bool
        messages: List[str]
-    def task_done(self, item_id, outputs,
+    def task_done(self, item_id, history_result,
                  status: Optional['PromptQueue.ExecutionStatus']):
        with self.mutex:
            prompt = self.currently_running.pop(item_id)
@ -788,9 +908,10 @@ class PromptQueue:
            self.history[prompt[1]] = {
                "prompt": prompt,
-                "outputs": copy.deepcopy(outputs),
+                "outputs": {},
                'status': status_dict,
            }
+            self.history[prompt[1]].update(history_result)
            self.server.queue_updated()
    def get_current_queue(self):
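
In the `task_done` hunk above, the history entry starts with an empty `"outputs"` dict and is then updated with `history_result`, which the executor builds with `"outputs"` and `"meta"` keys. A minimal sketch of that merge follows; `build_history_entry` and the sample data in the usage note are hypothetical and only illustrate how `dict.update` overwrites the placeholder.

```python
def build_history_entry(prompt, status, history_result):
    """Assemble a history record, letting history_result override the defaults."""
    entry = {
        "prompt": prompt,
        "outputs": {},  # placeholder, replaced if history_result has "outputs"
        "status": status,
    }
    # dict.update overwrites existing keys and adds new ones such as "meta".
    entry.update(history_result)
    return entry
```

For example, passing `{"outputs": {"9": {"images": []}}, "meta": {"9": {}}}` replaces the empty placeholder and adds a `"meta"` key, while passing `{}` leaves `"outputs"` as an empty dict.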
--- a/folder_paths.py
+++ b/folder_paths.py
@ -1,13 +1,13 @@
 from __future__ import annotations
 import os
 import time
 import logging
-from typing import Set, List, Dict, Tuple
+from collections.abc import Collection
-supported_pt_extensions: Set[str] = set(['.ckpt', '.pt', '.bin', '.pth', '.safetensors', '.pkl', '.sft'])
+supported_pt_extensions: set[str] = {'.ckpt', '.pt', '.bin', '.pth', '.safetensors', '.pkl', '.sft'}
-SupportedFileExtensionsType = Set[str]
+folder_names_and_paths: dict[str, tuple[list[str], set[str]]] = {}
-ScanPathType = List[str]
-folder_names_and_paths: Dict[str, Tuple[ScanPathType, SupportedFileExtensionsType]] = {}
 base_path = os.path.dirname(os.path.realpath(__file__))
 models_dir = os.path.join(base_path, "models")
@ -17,7 +17,7 @@ folder_names_and_paths["configs"] = ([os.path.join(models_dir, "configs")], [".y
 folder_names_and_paths["loras"] = ([os.path.join(models_dir, "loras")], supported_pt_extensions)
 folder_names_and_paths["vae"] = ([os.path.join(models_dir, "vae")], supported_pt_extensions)
 folder_names_and_paths["clip"] = ([os.path.join(models_dir, "clip")], supported_pt_extensions)
-folder_names_and_paths["unet"] = ([os.path.join(models_dir, "unet")], supported_pt_extensions)
+folder_names_and_paths["diffusion_models"] = ([os.path.join(models_dir, "unet"), os.path.join(models_dir, "diffusion_models")], supported_pt_extensions)
 folder_names_and_paths["clip_vision"] = ([os.path.join(models_dir, "clip_vision")], supported_pt_extensions)
 folder_names_and_paths["style_models"] = ([os.path.join(models_dir, "style_models")], supported_pt_extensions)
 folder_names_and_paths["embeddings"] = ([os.path.join(models_dir, "embeddings")], supported_pt_extensions)
@ -42,7 +42,11 @@ temp_directory = os.path.join(os.path.dirname(os.path.realpath(__file__)), "temp
 input_directory = os.path.join(os.path.dirname(os.path.realpath(__file__)), "input")
 user_directory = os.path.join(os.path.dirname(os.path.realpath(__file__)), "user")
-filename_list_cache = {}
+filename_list_cache: dict[str, tuple[list[str], dict[str, float], float]] = {}
+def map_legacy(folder_name: str) -> str:
+    legacy = {"unet": "diffusion_models"}
+    return legacy.get(folder_name, folder_name)
 if not os.path.exists(input_directory):
    try:
@ -50,33 +54,33 @@ if not os.path.exists(input_directory):
    except:
        logging.error("Failed to create input directory")
-def set_output_directory(output_dir):
+def set_output_directory(output_dir: str) -> None:
    global output_directory
    output_directory = output_dir
-def set_temp_directory(temp_dir):
+def set_temp_directory(temp_dir: str) -> None:
    global temp_directory
    temp_directory = temp_dir
-def set_input_directory(input_dir):
+def set_input_directory(input_dir: str) -> None:
    global input_directory
    input_directory = input_dir
-def get_output_directory():
+def get_output_directory() -> str:
    global output_directory
    return output_directory
-def get_temp_directory():
+def get_temp_directory() -> str:
    global temp_directory
    return temp_directory
-def get_input_directory():
+def get_input_directory() -> str:
    global input_directory
    return input_directory
 #NOTE: used in http server so don't put folders that should not be accessed remotely
-def get_directory_by_type(type_name):
+def get_directory_by_type(type_name: str) -> str | None:
    if type_name == "output":
        return get_output_directory()
    if type_name == "temp":
@ -88,7 +92,7 @@ def get_directory_by_type(type_name):
 # determine base_dir rely on annotation if name is 'filename.ext [annotation]' format
 # otherwise use default_path as base_dir
-def annotated_filepath(name):
+def annotated_filepath(name: str) -> tuple[str, str | None]:
    if name.endswith("[output]"):
        base_dir = get_output_directory()
        name = name[:-9]
@ -104,7 +108,7 @@ def annotated_filepath(name):
    return name, base_dir
-def get_annotated_filepath(name, default_dir=None):
+def get_annotated_filepath(name: str, default_dir: str | None=None) -> str:
    name, base_dir = annotated_filepath(name)
    if base_dir is None:
@ -116,7 +120,7 @@ def get_annotated_filepath(name, default_dir=None):
    return os.path.join(base_dir, name)
-def exists_annotated_filepath(name):
+def exists_annotated_filepath(name) -> bool:
    name, base_dir = annotated_filepath(name)
    if base_dir is None:
@ -126,17 +130,19 @@ def exists_annotated_filepath(name):
    return os.path.exists(filepath)
-def add_model_folder_path(folder_name, full_folder_path):
+def add_model_folder_path(folder_name: str, full_folder_path: str) -> None:
    global folder_names_and_paths
+    folder_name = map_legacy(folder_name)
    if folder_name in folder_names_and_paths:
        folder_names_and_paths[folder_name][0].append(full_folder_path)
    else:
        folder_names_and_paths[folder_name] = ([full_folder_path], set())
-def get_folder_paths(folder_name):
+def get_folder_paths(folder_name: str) -> list[str]:
+    folder_name = map_legacy(folder_name)
    return folder_names_and_paths[folder_name][0][:]
-def recursive_search(directory, excluded_dir_names=None):
+def recursive_search(directory: str, excluded_dir_names: list[str] | None=None) -> tuple[list[str], dict[str, float]]:
    if not os.path.isdir(directory):
        return [], {}
@ -153,6 +159,10 @@ def recursive_search(directory, excluded_dir_names=None):
        logging.warning(f"Warning: Unable to access {directory}. Skipping this path.")
    logging.debug("recursive file list on directory {}".format(directory))
+    dirpath: str
+    subdirs: list[str]
+    filenames: list[str]
    for dirpath, subdirs, filenames in os.walk(directory, followlinks=True, topdown=True):
        subdirs[:] = [d for d in subdirs if d not in excluded_dir_names]
        for file_name in filenames:
@ -160,7 +170,7 @@ def recursive_search(directory, excluded_dir_names=None):
            result.append(relative_path)
        for d in subdirs:
-            path = os.path.join(dirpath, d)
+            path: str = os.path.join(dirpath, d)
            try:
                dirs[path] = os.path.getmtime(path)
            except FileNotFoundError:
@ -169,13 +179,14 @@ def recursive_search(directory, excluded_dir_names=None):
    logging.debug("found {} files".format(len(result)))
    return result, dirs
-def filter_files_extensions(files, extensions):
+def filter_files_extensions(files: Collection[str], extensions: Collection[str]) -> list[str]:
    return sorted(list(filter(lambda a: os.path.splitext(a)[-1].lower() in extensions or len(extensions) == 0, files)))
-def get_full_path(folder_name, filename):
+def get_full_path(folder_name: str, filename: str) -> str | None:
    global folder_names_and_paths
+    folder_name = map_legacy(folder_name)
    if folder_name not in folder_names_and_paths:
        return None
    folders = folder_names_and_paths[folder_name]
@ -189,7 +200,8 @@ def get_full_path(folder_name, filename):
    return None
-def get_filename_list_(folder_name):
+def get_filename_list_(folder_name: str) -> tuple[list[str], dict[str, float], float]:
+    folder_name = map_legacy(folder_name)
    global folder_names_and_paths
    output_list = set()
    folders = folder_names_and_paths[folder_name]
@ -199,11 +211,12 @@ def get_filename_list_(folder_name):
        output_list.update(filter_files_extensions(files, folders[1]))
        output_folders = {**output_folders, **folders_all}
-    return (sorted(list(output_list)), output_folders, time.perf_counter())
+    return sorted(list(output_list)), output_folders, time.perf_counter()
-def cached_filename_list_(folder_name):
+def cached_filename_list_(folder_name: str) -> tuple[list[str], dict[str, float], float] | None:
    global filename_list_cache
    global folder_names_and_paths
+    folder_name = map_legacy(folder_name)
    if folder_name not in filename_list_cache:
        return None
    out = filename_list_cache[folder_name]
@ -222,7 +235,8 @@ def cached_filename_list_(folder_name):
    return out
-def get_filename_list(folder_name):
+def get_filename_list(folder_name: str) -> list[str]:
+    folder_name = map_legacy(folder_name)
    out = cached_filename_list_(folder_name)
    if out is None:
        out = get_filename_list_(folder_name)
@ -230,17 +244,17 @@ def get_filename_list(folder_name):
        filename_list_cache[folder_name] = out
    return list(out[0])
-def get_save_image_path(filename_prefix, output_dir, image_width=0, image_height=0):
+def get_save_image_path(filename_prefix: str, output_dir: str, image_width=0, image_height=0) -> tuple[str, str, int, str, str]:
-    def map_filename(filename):
+    def map_filename(filename: str) -> tuple[int, str]:
        prefix_len = len(os.path.basename(filename_prefix))
        prefix = filename[:prefix_len + 1]
        try:
            digits = int(filename[prefix_len + 1:].split('_')[0])
        except:
            digits = 0
-        return (digits, prefix)
+        return digits, prefix
-    def compute_vars(input, image_width, image_height):
+    def compute_vars(input: str, image_width: int, image_height: int) -> str:
        input = input.replace("%width%", str(image_width))
        input = input.replace("%height%", str(image_height))
        return input
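
The `map_legacy` calls added throughout this file make the old `"unet"` folder name an alias for `"diffusion_models"`, so both names resolve to the same registry entry. The sketch below reproduces that idea in isolation; `registry`, `add_path`, and `get_paths` are simplified, hypothetical counterparts of `folder_names_and_paths`, `add_model_folder_path`, and `get_folder_paths`.

```python
# Simplified stand-in for folder_names_and_paths: name -> (paths, extensions).
registry = {}


def map_legacy(folder_name):
    """Translate a legacy folder name to its current name."""
    legacy = {"unet": "diffusion_models"}
    return legacy.get(folder_name, folder_name)


def add_path(folder_name, path):
    """Register a search path under the current (non-legacy) folder name."""
    folder_name = map_legacy(folder_name)
    if folder_name in registry:
        registry[folder_name][0].append(path)
    else:
        registry[folder_name] = ([path], set())


def get_paths(folder_name):
    """Return a copy of the search paths, accepting legacy names too."""
    return registry[map_legacy(folder_name)][0][:]
```

Because every public entry point normalizes the name first, callers that still pass `"unet"` keep working, and only the new key is ever stored.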
--- a/main.py
+++ b/main.py
@ -101,7 +101,7 @@ def cuda_malloc_warning():
            logging.warning("\nWARNING: this card most likely does not support cuda-malloc, if you get \"CUDA error\" please run ComfyUI with: --disable-cuda-malloc\n")
 def prompt_worker(q, server):
-    e = execution.PromptExecutor(server)
+    e = execution.PromptExecutor(server, lru_size=args.cache_lru)
    last_gc_collect = 0
    need_gc = False
    gc_collect_interval = 10.0
@ -121,7 +121,7 @@ def prompt_worker(q, server):
            e.execute(item[2], prompt_id, item[3], item[4])
            need_gc = True
            q.task_done(item_id,
-                        e.outputs_ui,
+                        e.history_result,
                        status=execution.PromptQueue.ExecutionStatus(
                            status_str='success' if e.success else 'error',
                            completed=e.success,
@ -242,6 +242,7 @@ if __name__ == "__main__":
    folder_paths.add_model_folder_path("checkpoints", os.path.join(folder_paths.get_output_directory(), "checkpoints"))
    folder_paths.add_model_folder_path("clip", os.path.join(folder_paths.get_output_directory(), "clip"))
    folder_paths.add_model_folder_path("vae", os.path.join(folder_paths.get_output_directory(), "vae"))
+    folder_paths.add_model_folder_path("diffusion_models", os.path.join(folder_paths.get_output_directory(), "diffusion_models"))
    if args.input_directory:
        input_dir = os.path.abspath(args.input_directory)
@ -261,6 +262,7 @@ if __name__ == "__main__":
        call_on_start = startup_server
    try:
+        loop.run_until_complete(server.setup())
        loop.run_until_complete(run(server, address=args.listen, port=args.port, verbose=not args.dont_print_server, call_on_start=call_on_start))
    except KeyboardInterrupt:
        logging.info("\nStopped server")
--- a/model_filemanager/__init__.py
+++ b/model_filemanager/__init__.py
@ -0,0 +1,2 @@
 # model_manager/__init__.py
 from .download_models import download_model, DownloadModelStatus, DownloadStatusType, create_model_path, check_file_exists, track_download_progress, validate_model_subdirectory, validate_filename
--- a/model_filemanager/download_models.py
+++ b/model_filemanager/download_models.py
@ -0,0 +1,240 @@
 from __future__ import annotations
 import aiohttp
 import os
 import traceback
 import logging
 from folder_paths import models_dir
 import re
 from typing import Callable, Any, Optional, Awaitable, Dict
 from enum import Enum
 import time
 from dataclasses import dataclass
 class DownloadStatusType(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    ERROR = "error"
@dataclass
 class DownloadModelStatus():
    status: str
    progress_percentage: float
    message: str
    already_existed: bool = False
    def __init__(self, status: DownloadStatusType, progress_percentage: float, message: str, already_existed: bool):
        self.status = status.value  # Store the string value of the Enum
        self.progress_percentage = progress_percentage
        self.message = message
        self.already_existed = already_existed
    def to_dict(self) -> Dict[str, Any]:
        return {
            "status": self.status,
            "progress_percentage": self.progress_percentage,
            "message": self.message,
            "already_existed": self.already_existed
        }
 async def download_model(model_download_request: Callable[[str], Awaitable[aiohttp.ClientResponse]],
                         model_name: str,  
                         model_url: str, 
                         model_sub_directory: str,
                         progress_callback: Callable[[str, DownloadModelStatus], Awaitable[Any]],
                         progress_interval: float = 1.0) -> DownloadModelStatus:
    """
    Download a model file from a given URL into the models directory.
    Args:
        model_download_request (Callable[[str], Awaitable[aiohttp.ClientResponse]]): 
            A function that makes an HTTP request. This makes it easier to mock in unit tests.
        model_name (str): 
            The name of the model file to be downloaded. This will be the filename on disk.
        model_url (str): 
            The URL from which to download the model.
        model_sub_directory (str): 
            The subdirectory within the main models directory where the model 
            should be saved (e.g., 'checkpoints', 'loras', etc.).
        progress_callback (Callable[[str, DownloadModelStatus], Awaitable[Any]]): 
            An asynchronous function to call with progress updates.
    Returns:
        DownloadModelStatus: The result of the download operation.
    """
    if not validate_model_subdirectory(model_sub_directory):
        return DownloadModelStatus(
            DownloadStatusType.ERROR, 
            0,
            "Invalid model subdirectory", 
            False
        )
    if not validate_filename(model_name):
        return DownloadModelStatus(
            DownloadStatusType.ERROR, 
            0,
            "Invalid model name", 
            False
        )
    file_path, relative_path = create_model_path(model_name, model_sub_directory, models_dir)
    existing_file = await check_file_exists(file_path, model_name, progress_callback, relative_path)
    if existing_file:
        return existing_file
    try:
        status = DownloadModelStatus(DownloadStatusType.PENDING, 0, f"Starting download of {model_name}", False)
        await progress_callback(relative_path, status)
        response = await model_download_request(model_url)
        if response.status != 200:
            error_message = f"Failed to download {model_name}. Status code: {response.status}"
            logging.error(error_message)
            status = DownloadModelStatus(DownloadStatusType.ERROR, 0, error_message, False)
            await progress_callback(relative_path, status)
            return DownloadModelStatus(DownloadStatusType.ERROR, 0, error_message, False)
        return await track_download_progress(response, file_path, model_name, progress_callback, relative_path, progress_interval)
    except Exception as e:
        logging.error(f"Error in downloading model: {e}")
        return await handle_download_error(e, model_name, progress_callback, relative_path)
 def create_model_path(model_name: str, model_directory: str, models_base_dir: str) -> tuple[str, str]:
    full_model_dir = os.path.join(models_base_dir, model_directory)
    os.makedirs(full_model_dir, exist_ok=True)
    file_path = os.path.join(full_model_dir, model_name)
    # Ensure the resulting path is still within the base directory
    abs_file_path = os.path.abspath(file_path)
    abs_base_dir = os.path.abspath(str(models_base_dir))
    if os.path.commonprefix([abs_file_path, abs_base_dir]) != abs_base_dir:
        raise Exception(f"Invalid model directory: {model_directory}/{model_name}")
    relative_path = '/'.join([model_directory, model_name])
    return file_path, relative_path
 async def check_file_exists(file_path: str, 
                            model_name: str, 
                            progress_callback: Callable[[str, DownloadModelStatus], Awaitable[Any]], 
                            relative_path: str) -> Optional[DownloadModelStatus]:
    if os.path.exists(file_path):
        status = DownloadModelStatus(DownloadStatusType.COMPLETED, 100, f"{model_name} already exists", True)
        await progress_callback(relative_path, status)
        return status
    return None
 async def track_download_progress(response: aiohttp.ClientResponse, 
                                  file_path: str, 
                                  model_name: str, 
                                  progress_callback: Callable[[str, DownloadModelStatus], Awaitable[Any]], 
                                  relative_path: str, 
                                  interval: float = 1.0) -> DownloadModelStatus:
    try:
        total_size = int(response.headers.get('Content-Length', 0))
        downloaded = 0
        last_update_time = time.time()
        async def update_progress():
            nonlocal last_update_time
            progress = (downloaded / total_size) * 100 if total_size > 0 else 0
            status = DownloadModelStatus(DownloadStatusType.IN_PROGRESS, progress, f"Downloading {model_name}", False)
            await progress_callback(relative_path, status)
            last_update_time = time.time()
        with open(file_path, 'wb') as f:
            chunk_iterator = response.content.iter_chunked(8192)
            while True:
                try:
                    chunk = await chunk_iterator.__anext__()
                except StopAsyncIteration:
                    break
                f.write(chunk)
                downloaded += len(chunk)
                if time.time() - last_update_time >= interval:
                    await update_progress()
        await update_progress()
        logging.info(f"Successfully downloaded {model_name}. Total downloaded: {downloaded}")
        status = DownloadModelStatus(DownloadStatusType.COMPLETED, 100, f"Successfully downloaded {model_name}", False)
        await progress_callback(relative_path, status)
        return status
    except Exception as e:
        logging.error(f"Error in track_download_progress: {e}")
        logging.error(traceback.format_exc())
        return await handle_download_error(e, model_name, progress_callback, relative_path)
 async def handle_download_error(e: Exception, 
                                model_name: str, 
                                progress_callback: Callable[[str, DownloadModelStatus], Any], 
                                relative_path: str) -> DownloadModelStatus:
    error_message = f"Error downloading {model_name}: {str(e)}"
    status = DownloadModelStatus(DownloadStatusType.ERROR, 0, error_message, False)
    await progress_callback(relative_path, status)
    return status
 def validate_model_subdirectory(model_subdirectory: str) -> bool:
    """
    Validate that the model subdirectory is safe to install into. 
    Must not contain relative paths, nested paths or special characters
    other than underscores and hyphens.
    Args:
        model_subdirectory (str): The subdirectory for the specific model type.
    Returns:
        bool: True if the subdirectory is safe, False otherwise.
    """
    if len(model_subdirectory) > 50:
        return False
    if '..' in model_subdirectory or '/' in model_subdirectory:
        return False
    if not re.match(r'^[a-zA-Z0-9_-]+$', model_subdirectory):
        return False
    return True
 def validate_filename(filename: str)-> bool:
    """
    Validate a filename to ensure it's safe and doesn't contain any path traversal attempts.
    Args:
    filename (str): The filename to validate
    Returns:
    bool: True if the filename is valid, False otherwise
    """
    if not filename.lower().endswith(('.sft', '.safetensors')):
        return False
    # Check if the filename is empty, None, or just whitespace
    if not filename or not filename.strip():
        return False
    # Check for any directory traversal attempts or invalid characters
    if any(char in filename for char in ['..', '/', '\\', '\n', '\r', '\t', '\0']):
        return False
    # Check if the filename starts with a dot (hidden file)
    if filename.startswith('.'):
        return False
    # Use a whitelist of allowed characters
    if not re.match(r'^[a-zA-Z0-9_\-. ]+$', filename):
        return False
    # Ensure the filename isn't too long
    if len(filename) > 255:
        return False
    return True
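
A note on the containment check in `create_model_path`: `os.path.commonprefix` compares strings character by character, not path component by path component. Since the subdirectory and filename are validated before the path is built, this may never matter in practice here; the sketch below only illustrates the difference from `os.path.commonpath`. It uses `posixpath` so the result does not depend on the host platform, and the helper names and example paths are hypothetical.

```python
import posixpath


def is_within_prefix(path, base):
    """Character-wise check, in the style of the diff above."""
    return posixpath.commonprefix([path, base]) == base


def is_within_path(path, base):
    """Component-wise check using commonpath."""
    return posixpath.commonpath([path, base]) == base


base = "/srv/models"
inside = "/srv/models/loras/a.safetensors"
sibling = "/srv/models_backup/a.safetensors"
```

Both helpers accept `inside`, but only `is_within_path` rejects `sibling`, because `"/srv/models"` is a string prefix of `"/srv/models_backup"` without being a parent directory of it.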
--- a/models/diffusion_models/put_diffusion_model_files_here
+++ b/models/diffusion_models/put_diffusion_model_files_here
--- a/nodes.py
+++ b/nodes.py
@ -47,11 +47,18 @@ MAX_RESOLUTION=16384
 class CLIPTextEncode:
    @classmethod
    def INPUT_TYPES(s):
-        return {"required": {"text": ("STRING", {"multiline": True, "dynamicPrompts": True}), "clip": ("CLIP", )}}
+        return {
+            "required": {
+                "text": ("STRING", {"multiline": True, "dynamicPrompts": True, "tooltip": "The text to be encoded."}),
+                "clip": ("CLIP", {"tooltip": "The CLIP model used for encoding the text."})
+            }
+        }
    RETURN_TYPES = ("CONDITIONING",)
+    OUTPUT_TOOLTIPS = ("A conditioning containing the embedded text used to guide the diffusion model.",)
    FUNCTION = "encode"
    CATEGORY = "conditioning"
+    DESCRIPTION = "Encodes a text prompt using a CLIP model into an embedding that can be used to guide the diffusion model towards generating specific images."
    def encode(self, clip, text):
        tokens = clip.tokenize(text)
@ -260,11 +267,18 @@ class ConditioningSetTimestepRange:
 class VAEDecode:
    @classmethod
    def INPUT_TYPES(s):
-        return {"required": { "samples": ("LATENT", ), "vae": ("VAE", )}}
+        return {
            "required": { 
                "samples": ("LATENT", {"tooltip": "The latent to be decoded."}), 
                "vae": ("VAE", {"tooltip": "The VAE model used for decoding the latent."})
            }
        }
    RETURN_TYPES = ("IMAGE",)
    OUTPUT_TOOLTIPS = ("The decoded image.",)
    FUNCTION = "decode"
    CATEGORY = "latent"
    DESCRIPTION = "Decodes latent images back into pixel space images."
    def decode(self, vae, samples):
        return (vae.decode(samples["samples"]), )
@ -506,12 +520,19 @@ class CheckpointLoader:
 class CheckpointLoaderSimple:
    @classmethod
    def INPUT_TYPES(s):
-        return {"required": { "ckpt_name": (folder_paths.get_filename_list("checkpoints"), ),
+        return {
-                             }}
+            "required": { 
                "ckpt_name": (folder_paths.get_filename_list("checkpoints"), {"tooltip": "The name of the checkpoint (model) to load."}),
            }
        }
    RETURN_TYPES = ("MODEL", "CLIP", "VAE")
    OUTPUT_TOOLTIPS = ("The model used for denoising latents.", 
                       "The CLIP model used for encoding text prompts.", 
                       "The VAE model used for encoding and decoding images to and from latent space.")
    FUNCTION = "load_checkpoint"
    CATEGORY = "loaders"
    DESCRIPTION = "Loads a diffusion model checkpoint, diffusion models are used to denoise latents."
    def load_checkpoint(self, ckpt_name):
        ckpt_path = folder_paths.get_full_path("checkpoints", ckpt_name)
@ -582,16 +603,22 @@ class LoraLoader:
    @classmethod
    def INPUT_TYPES(s):
-        return {"required": { "model": ("MODEL",),
+        return {
-                              "clip": ("CLIP", ),
+            "required": { 
-                              "lora_name": (folder_paths.get_filename_list("loras"), ),
+                "model": ("MODEL", {"tooltip": "The diffusion model the LoRA will be applied to."}),
-                              "strength_model": ("FLOAT", {"default": 1.0, "min": -100.0, "max": 100.0, "step": 0.01}),
+                "clip": ("CLIP", {"tooltip": "The CLIP model the LoRA will be applied to."}),
-                              "strength_clip": ("FLOAT", {"default": 1.0, "min": -100.0, "max": 100.0, "step": 0.01}),
+                "lora_name": (folder_paths.get_filename_list("loras"), {"tooltip": "The name of the LoRA."}),
-                              }}
+                "strength_model": ("FLOAT", {"default": 1.0, "min": -100.0, "max": 100.0, "step": 0.01, "tooltip": "How strongly to modify the diffusion model. This value can be negative."}),
                "strength_clip": ("FLOAT", {"default": 1.0, "min": -100.0, "max": 100.0, "step": 0.01, "tooltip": "How strongly to modify the CLIP model. This value can be negative."}),
            }
        }
    RETURN_TYPES = ("MODEL", "CLIP")
    OUTPUT_TOOLTIPS = ("The modified diffusion model.", "The modified CLIP model.")
    FUNCTION = "load_lora"
    CATEGORY = "loaders"
    DESCRIPTION = "LoRAs are used to modify diffusion and CLIP models, altering the way in which latents are denoised such as applying styles. Multiple LoRA nodes can be linked together."
    def load_lora(self, model, clip, lora_name, strength_model, strength_clip):
        if strength_model == 0 and strength_clip == 0:
@ -638,6 +665,8 @@ class VAELoader:
        sd1_taesd_dec = False
        sd3_taesd_enc = False
        sd3_taesd_dec = False
        f1_taesd_enc = False
        f1_taesd_dec = False
        for v in approx_vaes:
            if v.startswith("taesd_decoder."):
@ -652,12 +681,18 @@ class VAELoader:
                sd3_taesd_dec = True
            elif v.startswith("taesd3_encoder."):
                sd3_taesd_enc = True
            elif v.startswith("taef1_encoder."):
                f1_taesd_enc = True
            elif v.startswith("taef1_decoder."):
                f1_taesd_dec = True
        if sd1_taesd_dec and sd1_taesd_enc:
            vaes.append("taesd")
        if sdxl_taesd_dec and sdxl_taesd_enc:
            vaes.append("taesdxl")
        if sd3_taesd_dec and sd3_taesd_enc:
            vaes.append("taesd3")
        if f1_taesd_dec and f1_taesd_enc:
            vaes.append("taef1")
        return vaes
    @staticmethod
@ -685,6 +720,9 @@ class VAELoader:
        elif name == "taesd3":
            sd["vae_scale"] = torch.tensor(1.5305)
            sd["vae_shift"] = torch.tensor(0.0609)
        elif name == "taef1":
            sd["vae_scale"] = torch.tensor(0.3611)
            sd["vae_shift"] = torch.tensor(0.1159)
        return sd
    @classmethod
@ -697,7 +735,7 @@ class VAELoader:
    #TODO: scale factor?
    def load_vae(self, vae_name):
-        if vae_name in ["taesd", "taesdxl", "taesd3"]:
+        if vae_name in ["taesd", "taesdxl", "taesd3", "taef1"]:
            sd = self.load_taesd(vae_name)
        else:
            vae_path = folder_paths.get_full_path("vae", vae_name)
@ -817,7 +855,7 @@ class ControlNetApplyAdvanced:
 class UNETLoader:
    @classmethod
    def INPUT_TYPES(s):
-        return {"required": { "unet_name": (folder_paths.get_filename_list("unet"), ),
+        return {"required": { "unet_name": (folder_paths.get_filename_list("diffusion_models"), ),
                              "weight_dtype": (["default", "fp8_e4m3fn", "fp8_e5m2"],)
                             }}
    RETURN_TYPES = ("MODEL",)
@ -826,14 +864,14 @@ class UNETLoader:
    CATEGORY = "advanced/loaders"
    def load_unet(self, unet_name, weight_dtype):
-        dtype = None
+        model_options = {}
        if weight_dtype == "fp8_e4m3fn":
-            dtype = torch.float8_e4m3fn
+            model_options["dtype"] = torch.float8_e4m3fn
        elif weight_dtype == "fp8_e5m2":
-            dtype = torch.float8_e5m2
+            model_options["dtype"] = torch.float8_e5m2
-        unet_path = folder_paths.get_full_path("unet", unet_name)
+        unet_path = folder_paths.get_full_path("diffusion_models", unet_name)
-        model = comfy.sd.load_unet(unet_path, dtype=dtype)
+        model = comfy.sd.load_diffusion_model(unet_path, model_options=model_options)
        return (model,)
 class CLIPLoader:
@ -1033,13 +1071,19 @@ class EmptyLatentImage:
    @classmethod
    def INPUT_TYPES(s):
-        return {"required": { "width": ("INT", {"default": 512, "min": 16, "max": MAX_RESOLUTION, "step": 8}),
+        return {
-                              "height": ("INT", {"default": 512, "min": 16, "max": MAX_RESOLUTION, "step": 8}),
+            "required": { 
-                              "batch_size": ("INT", {"default": 1, "min": 1, "max": 4096})}}
+                "width": ("INT", {"default": 512, "min": 16, "max": MAX_RESOLUTION, "step": 8, "tooltip": "The width of the latent images in pixels."}),
                "height": ("INT", {"default": 512, "min": 16, "max": MAX_RESOLUTION, "step": 8, "tooltip": "The height of the latent images in pixels."}),
                "batch_size": ("INT", {"default": 1, "min": 1, "max": 4096, "tooltip": "The number of latent images in the batch."})
            }
        }
    RETURN_TYPES = ("LATENT",)
    OUTPUT_TOOLTIPS = ("The empty latent image batch.",)
    FUNCTION = "generate"
    CATEGORY = "latent"
    DESCRIPTION = "Create a new batch of empty latent images to be denoised via sampling."
    def generate(self, width, height, batch_size=1):
        latent = torch.zeros([batch_size, 4, height // 8, width // 8], device=self.device)
@ -1359,24 +1403,27 @@ def common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive,
 class KSampler:
    @classmethod
    def INPUT_TYPES(s):
-        return {"required":
+        return {
-                    {"model": ("MODEL",),
+            "required": {
-                    "seed": ("INT", {"default": 0, "min": 0, "max": 0xffffffffffffffff}),
+                "model": ("MODEL", {"tooltip": "The model used for denoising the input latent."}),
-                    "steps": ("INT", {"default": 20, "min": 1, "max": 10000}),
+                "seed": ("INT", {"default": 0, "min": 0, "max": 0xffffffffffffffff, "tooltip": "The random seed used for creating the noise."}),
-                    "cfg": ("FLOAT", {"default": 8.0, "min": 0.0, "max": 100.0, "step":0.1, "round": 0.01}),
+                "steps": ("INT", {"default": 20, "min": 1, "max": 10000, "tooltip": "The number of steps used in the denoising process."}),
-                    "sampler_name": (comfy.samplers.KSampler.SAMPLERS, ),
+                "cfg": ("FLOAT", {"default": 8.0, "min": 0.0, "max": 100.0, "step":0.1, "round": 0.01, "tooltip": "The Classifier-Free Guidance scale balances creativity and adherence to the prompt. Higher values result in images more closely matching the prompt however too high values will negatively impact quality."}),
-                    "scheduler": (comfy.samplers.KSampler.SCHEDULERS, ),
+                "sampler_name": (comfy.samplers.KSampler.SAMPLERS, {"tooltip": "The algorithm used when sampling, this can affect the quality, speed, and style of the generated output."}),
-                    "positive": ("CONDITIONING", ),
+                "scheduler": (comfy.samplers.KSampler.SCHEDULERS, {"tooltip": "The scheduler controls how noise is gradually removed to form the image."}),
-                    "negative": ("CONDITIONING", ),
+                "positive": ("CONDITIONING", {"tooltip": "The conditioning describing the attributes you want to include in the image."}),
-                    "latent_image": ("LATENT", ),
+                "negative": ("CONDITIONING", {"tooltip": "The conditioning describing the attributes you want to exclude from the image."}),
-                    "denoise": ("FLOAT", {"default": 1.0, "min": 0.0, "max": 1.0, "step": 0.01}),
+                "latent_image": ("LATENT", {"tooltip": "The latent image to denoise."}),
-                     }
+                "denoise": ("FLOAT", {"default": 1.0, "min": 0.0, "max": 1.0, "step": 0.01, "tooltip": "The amount of denoising applied, lower values will maintain the structure of the initial image allowing for image to image sampling."}),
-                }
+            }
        }
    RETURN_TYPES = ("LATENT",)
    OUTPUT_TOOLTIPS = ("The denoised latent.",)
    FUNCTION = "sample"
    CATEGORY = "sampling"
    DESCRIPTION = "Uses the provided model, positive and negative conditioning to denoise the latent image."
    def sample(self, model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=1.0):
        return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
@ -1424,11 +1471,15 @@ class SaveImage:
    @classmethod
    def INPUT_TYPES(s):
-        return {"required": 
+        return {
-                    {"images": ("IMAGE", ),
+            "required": {
-                     "filename_prefix": ("STRING", {"default": "ComfyUI"})},
+                "images": ("IMAGE", {"tooltip": "The images to save."}),
-                "hidden": {"prompt": "PROMPT", "extra_pnginfo": "EXTRA_PNGINFO"},
+                "filename_prefix": ("STRING", {"default": "ComfyUI", "tooltip": "The prefix for the file to save. This may include formatting information such as %date:yyyy-MM-dd% or %Empty Latent Image.width% to include values from nodes."})
-                }
+            },
            "hidden": {
                "prompt": "PROMPT", "extra_pnginfo": "EXTRA_PNGINFO"
            },
        }
    RETURN_TYPES = ()
    FUNCTION = "save_images"
@ -1436,6 +1487,7 @@ class SaveImage:
    OUTPUT_NODE = True
    CATEGORY = "image"
    DESCRIPTION = "Saves the input images to your ComfyUI output directory."
    def save_images(self, images, filename_prefix="ComfyUI", prompt=None, extra_pnginfo=None):
        filename_prefix += self.prefix_append
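
The tooltip changes in `nodes.py` follow one pattern: a `"tooltip"` key in each input's options dict, an `OUTPUT_TOOLTIPS` tuple parallel to `RETURN_TYPES`, and a `DESCRIPTION` string on the class. A minimal sketch of a node written in that style is below; `ScaleFloatExample` and its inputs are hypothetical and do not exist in ComfyUI.

```python
# Hypothetical example node; the class name and behaviour are illustrative only.
class ScaleFloatExample:
    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "value": ("FLOAT", {"default": 1.0, "step": 0.01, "tooltip": "The number to scale."}),
                "factor": ("FLOAT", {"default": 2.0, "step": 0.01, "tooltip": "The multiplier applied to the value."}),
            }
        }

    RETURN_TYPES = ("FLOAT",)
    # One tooltip per entry in RETURN_TYPES, in the same order
    OUTPUT_TOOLTIPS = ("The scaled number.",)
    FUNCTION = "scale"
    CATEGORY = "utils"
    DESCRIPTION = "Multiplies a number by a factor."

    def scale(self, value, factor):
        # Node functions return a tuple with one entry per RETURN_TYPES item
        return (value * factor,)
```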
--- a/pytest.ini
+++ b/pytest.ini
@ -1,6 +1,7 @@
 [pytest]
 markers = 
  inference: mark as inference test (deselect with '-m "not inference"')
  execution: mark as execution test (deselect with '-m "not execution"')
 testpaths =
  tests
  tests-unit
--- a/server.py
+++ b/server.py
@ -12,7 +12,6 @@ import json
 import glob
 import struct
 import ssl
 import hashlib
 from PIL import Image, ImageOps
 from PIL.PngImagePlugin import PngInfo
 from io import BytesIO
@ -28,6 +27,9 @@ import comfy.model_management
 import node_helpers
 from app.frontend_management import FrontendManager
 from app.user_manager import UserManager
 from model_filemanager import download_model, DownloadModelStatus
 from typing import Optional
 from api_server.routes.internal.internal_routes import InternalRoutes
 class BinaryEventTypes:
@ -72,10 +74,12 @@ class PromptServer():
        mimetypes.types_map['.js'] = 'application/javascript; charset=utf-8'
        self.user_manager = UserManager()
        self.internal_routes = InternalRoutes()
        self.supports = ["custom_nodes_from_web"]
        self.prompt_queue = None
        self.loop = loop
        self.messages = asyncio.Queue()
        self.client_session:Optional[aiohttp.ClientSession] = None
        self.number = 0
        middlewares = [cache_control]
@ -127,13 +131,25 @@ class PromptServer():
        @routes.get("/")
        async def get_root(request):
-            return web.FileResponse(os.path.join(self.web_root, "index.html"))
+            response = web.FileResponse(os.path.join(self.web_root, "index.html"))
            response.headers['Cache-Control'] = 'no-cache'
            response.headers["Pragma"] = "no-cache"
            response.headers["Expires"] = "0"
            return response
        @routes.get("/embeddings")
        def get_embeddings(self):
            embeddings = folder_paths.get_filename_list("embeddings")
            return web.json_response(list(map(lambda a: os.path.splitext(a)[0], embeddings)))
        @routes.get("/models/{folder}")
        async def get_models(request):
            folder = request.match_info.get("folder", None)
            if folder not in folder_paths.folder_names_and_paths:
                return web.Response(status=404)
            files = folder_paths.get_filename_list(folder)
            return web.json_response(files)
        @routes.get("/extensions")
        async def get_extensions(request):
            files = glob.glob(os.path.join(
@ -418,6 +434,7 @@ class PromptServer():
            obj_class = nodes.NODE_CLASS_MAPPINGS[node_class]
            info = {}
            info['input'] = obj_class.INPUT_TYPES()
            info['input_order'] = {key: list(value.keys()) for (key, value) in obj_class.INPUT_TYPES().items()}
            info['output'] = obj_class.RETURN_TYPES
            info['output_is_list'] = obj_class.OUTPUT_IS_LIST if hasattr(obj_class, 'OUTPUT_IS_LIST') else [False] * len(obj_class.RETURN_TYPES)
            info['output_name'] = obj_class.RETURN_NAMES if hasattr(obj_class, 'RETURN_NAMES') else info['output']
@ -433,6 +450,14 @@ class PromptServer():
            if hasattr(obj_class, 'CATEGORY'):
                info['category'] = obj_class.CATEGORY
            if hasattr(obj_class, 'OUTPUT_TOOLTIPS'):
                info['output_tooltips'] = obj_class.OUTPUT_TOOLTIPS
            if getattr(obj_class, "DEPRECATED", False):
                info['deprecated'] = True
            if getattr(obj_class, "EXPERIMENTAL", False):
                info['experimental'] = True
            return info
        @routes.get("/object_info")
@ -555,9 +580,42 @@ class PromptServer():
                    self.prompt_queue.delete_history_item(id_to_delete)
            return web.Response(status=200)
        # Internal route. Should not be depended upon and is subject to change at any time.
        # TODO(robinhuang): Move to internal route table class once we refactor PromptServer to pass around Websocket.
        @routes.post("/internal/models/download")
        async def download_handler(request):
            async def report_progress(filename: str, status: DownloadModelStatus):
                payload = status.to_dict()
                payload['download_path'] = filename
                await self.send_json("download_progress", payload)
            data = await request.json()
            url = data.get('url')
            model_directory = data.get('model_directory')
            model_filename = data.get('model_filename')
            progress_interval = data.get('progress_interval', 1.0) # In seconds, how often to report download progress.
            if not url or not model_directory or not model_filename:
                return web.json_response({"status": "error", "message": "Missing URL or folder path or filename"}, status=400)
            session = self.client_session
            if session is None:
                logging.error("Client session is not initialized")
                return web.Response(status=500)
            task = asyncio.create_task(download_model(lambda url: session.get(url), model_filename, url, model_directory, report_progress, progress_interval))
            await task
            return web.json_response(task.result().to_dict())
    async def setup(self):
        timeout = aiohttp.ClientTimeout(total=None) # no timeout
        self.client_session = aiohttp.ClientSession(timeout=timeout)
    def add_routes(self):
        self.user_manager.add_routes(self.routes)
        self.app.add_subapp('/internal', self.internal_routes.get_app())
        # Prefix every route with /api for easier matching for delegation.
        # This is very useful for frontend dev server, which need to forward
@ -676,6 +734,9 @@ class PromptServer():
        site = web.TCPSite(runner, address, port, ssl_context=ssl_ctx)
        await site.start()
        self.address = address
        self.port = port
        if verbose:
            logging.info("Starting server\n")
            logging.info("To see the GUI go to: {}://{}:{}".format(scheme, address, port))
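
The `input_order` field added to the node info in this file is derived from `INPUT_TYPES()` with a single dict comprehension, which presumably lets a client recover the declared input order without relying on object key order. A small sketch with a hypothetical input definition (the dict below is illustrative, not taken from a real node):

```python
# Hypothetical input definition shaped like a node's INPUT_TYPES() result.
input_types = {
    "required": {
        "images": ("IMAGE", {"tooltip": "The images to save."}),
        "filename_prefix": ("STRING", {"default": "ComfyUI"}),
    },
    "hidden": {"prompt": "PROMPT", "extra_pnginfo": "EXTRA_PNGINFO"},
}

# Same comprehension as in the node info code: keep only the input names, in declaration order
input_order = {key: list(value.keys()) for (key, value) in input_types.items()}
```

Python dicts preserve insertion order, so each list reflects the order in which the inputs were declared.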
--- a/tests-unit/app_test/frontend_manager_test.py
+++ b/tests-unit/app_test/frontend_manager_test.py
@ -1,6 +1,7 @@
 import argparse
 import pytest
 from requests.exceptions import HTTPError
 from unittest.mock import patch
 from app.frontend_management import (
    FrontendManager,
@ -83,6 +84,35 @@ def test_init_frontend_invalid_provider():
    with pytest.raises(HTTPError):
        FrontendManager.init_frontend_unsafe(version_string)
@pytest.fixture
 def mock_os_functions():
    with patch('app.frontend_management.os.makedirs') as mock_makedirs, \
         patch('app.frontend_management.os.listdir') as mock_listdir, \
         patch('app.frontend_management.os.rmdir') as mock_rmdir:
        mock_listdir.return_value = []  # Simulate empty directory
        yield mock_makedirs, mock_listdir, mock_rmdir
@pytest.fixture
 def mock_download():
    with patch('app.frontend_management.download_release_asset_zip') as mock:
        mock.side_effect = Exception("Download failed")  # Simulate download failure
        yield mock
 def test_finally_block(mock_os_functions, mock_download, mock_provider):
    # Arrange
    mock_makedirs, mock_listdir, mock_rmdir = mock_os_functions
    version_string = 'test-owner/test-repo@1.0.0'
    # Act & Assert
    with pytest.raises(Exception):
        FrontendManager.init_frontend_unsafe(version_string, mock_provider)
    # Assert
    mock_makedirs.assert_called_once()
    mock_download.assert_called_once()
    mock_listdir.assert_called_once()
    mock_rmdir.assert_called_once()
 def test_parse_version_string():
    version_string = "owner/repo@1.0.0"
--- a/tests-unit/prompt_server_test/__init__.py
+++ b/tests-unit/prompt_server_test/__init__.py
--- a/tests-unit/prompt_server_test/download_models_test.py
+++ b/tests-unit/prompt_server_test/download_models_test.py
@ -0,0 +1,321 @@
 import pytest
 import aiohttp
 from aiohttp import ClientResponse
 import itertools
 import os 
 from unittest.mock import AsyncMock, patch, MagicMock
 from model_filemanager import download_model, validate_model_subdirectory, track_download_progress, create_model_path, check_file_exists, DownloadStatusType, DownloadModelStatus, validate_filename
 class AsyncIteratorMock:
    """
    A mock class that simulates an asynchronous iterator.
    This is used to mimic the behavior of aiohttp's content iterator.
    """
    def __init__(self, seq):
        # Convert the input sequence into an iterator
        self.iter = iter(seq)
    def __aiter__(self):
        # This method is called when 'async for' is used
        return self
    async def __anext__(self):
        # This method is called for each iteration in an 'async for' loop
        try:
            return next(self.iter)
        except StopIteration:
            # This is the asynchronous equivalent of StopIteration
            raise StopAsyncIteration
 class ContentMock:
    """
    A mock class that simulates the content attribute of an aiohttp ClientResponse.
    This class provides the iter_chunked method which returns an async iterator of chunks.
    """
    def __init__(self, chunks):
        # Store the chunks that will be returned by the iterator
        self.chunks = chunks
    def iter_chunked(self, chunk_size):
        # This method mimics aiohttp's content.iter_chunked()
        # For simplicity in testing, we ignore chunk_size and just return our predefined chunks
        return AsyncIteratorMock(self.chunks)
@pytest.mark.asyncio
 async def test_download_model_success():
    mock_response = AsyncMock(spec=aiohttp.ClientResponse)
    mock_response.status = 200
    mock_response.headers = {'Content-Length': '1000'}
    # Create a mock for content that returns an async iterator directly
    chunks = [b'a' * 500, b'b' * 300, b'c' * 200]
    mock_response.content = ContentMock(chunks)
    mock_make_request = AsyncMock(return_value=mock_response)
    mock_progress_callback = AsyncMock()
    # Mock file operations
    mock_open = MagicMock()
    mock_file = MagicMock()
    mock_open.return_value.__enter__.return_value = mock_file
    time_values = itertools.count(0, 0.1)
    with patch('model_filemanager.create_model_path', return_value=('models/checkpoints/model.sft', 'checkpoints/model.sft')), \
         patch('model_filemanager.check_file_exists', return_value=None), \
         patch('builtins.open', mock_open), \
         patch('time.time', side_effect=time_values):  # Simulate time passing
        result = await download_model(
            mock_make_request,
            'model.sft',
            'http://example.com/model.sft',
            'checkpoints',
            mock_progress_callback
        )
    # Assert the result
    assert isinstance(result, DownloadModelStatus)
    assert result.message == 'Successfully downloaded model.sft'
    assert result.status == 'completed'
    assert result.already_existed is False
    # Check progress callback calls
    assert mock_progress_callback.call_count >= 3  # At least start, one progress update, and completion
    # Check initial call
    mock_progress_callback.assert_any_call(
        'checkpoints/model.sft',
        DownloadModelStatus(DownloadStatusType.PENDING, 0, "Starting download of model.sft", False)
    )
    # Check final call
    mock_progress_callback.assert_any_call(
        'checkpoints/model.sft',
        DownloadModelStatus(DownloadStatusType.COMPLETED, 100, "Successfully downloaded model.sft", False)
    )
    # Verify file writing
    mock_file.write.assert_any_call(b'a' * 500)
    mock_file.write.assert_any_call(b'b' * 300)
    mock_file.write.assert_any_call(b'c' * 200)
    # Verify request was made
    mock_make_request.assert_called_once_with('http://example.com/model.sft')
@pytest.mark.asyncio
 async def test_download_model_url_request_failure():
    # Mock dependencies
    mock_response = AsyncMock(spec=ClientResponse)
    mock_response.status = 404  # Simulate a "Not Found" error
    mock_get = AsyncMock(return_value=mock_response)
    mock_progress_callback = AsyncMock()
    # Mock the create_model_path function
    with patch('model_filemanager.create_model_path', return_value=('/mock/path/model.safetensors', 'mock/path/model.safetensors')):
        # Mock the check_file_exists function to return None (file doesn't exist)
        with patch('model_filemanager.check_file_exists', return_value=None):
            # Call the function
            result = await download_model(
                mock_get,
                'model.safetensors',
                'http://example.com/model.safetensors',
                'mock_directory',
                mock_progress_callback
            )
    # Assert the expected behavior
    assert isinstance(result, DownloadModelStatus)
    assert result.status == 'error'
    assert result.message == 'Failed to download model.safetensors. Status code: 404'
    assert result.already_existed is False
    # Check that progress_callback was called with the correct arguments
    mock_progress_callback.assert_any_call(
        'mock_directory/model.safetensors',
        DownloadModelStatus(
            status=DownloadStatusType.PENDING,
            progress_percentage=0,
            message='Starting download of model.safetensors',
            already_existed=False
        )
    )
    mock_progress_callback.assert_called_with(
        'mock_directory/model.safetensors',
        DownloadModelStatus(
            status=DownloadStatusType.ERROR,
            progress_percentage=0,
            message='Failed to download model.safetensors. Status code: 404',
            already_existed=False
        )
    )
    # Verify that the get method was called with the correct URL
    mock_get.assert_called_once_with('http://example.com/model.safetensors')
@pytest.mark.asyncio
 async def test_download_model_invalid_model_subdirectory():
    mock_make_request = AsyncMock()
    mock_progress_callback = AsyncMock()
    result = await download_model(
        mock_make_request,
        'model.sft',
        'http://example.com/model.sft',
        '../bad_path',
        mock_progress_callback
    )
    # Assert the result
    assert isinstance(result, DownloadModelStatus)
    assert result.message == 'Invalid model subdirectory'
    assert result.status == 'error'
    assert result.already_existed is False
 # For create_model_path function
 def test_create_model_path(tmp_path, monkeypatch):
    mock_models_dir = tmp_path / "models"
    monkeypatch.setattr('folder_paths.models_dir', str(mock_models_dir))
    model_name = "test_model.sft"
    model_directory = "test_dir"
    file_path, relative_path = create_model_path(model_name, model_directory, mock_models_dir)
    assert file_path == str(mock_models_dir / model_directory / model_name)
    assert relative_path == f"{model_directory}/{model_name}"
    assert os.path.exists(os.path.dirname(file_path))
@pytest.mark.asyncio
 async def test_check_file_exists_when_file_exists(tmp_path):
    file_path = tmp_path / "existing_model.sft"
    file_path.touch()  # Create an empty file
    mock_callback = AsyncMock()
    result = await check_file_exists(str(file_path), "existing_model.sft", mock_callback, "test/existing_model.sft")
    assert result is not None
    assert result.status == "completed"
    assert result.message == "existing_model.sft already exists"
    assert result.already_existed is True
    mock_callback.assert_called_once_with(
        "test/existing_model.sft",
        DownloadModelStatus(DownloadStatusType.COMPLETED, 100, "existing_model.sft already exists", already_existed=True)
    )
@pytest.mark.asyncio
 async def test_check_file_exists_when_file_does_not_exist(tmp_path):
    file_path = tmp_path / "non_existing_model.sft"
    mock_callback = AsyncMock()
    result = await check_file_exists(str(file_path), "non_existing_model.sft", mock_callback, "test/non_existing_model.sft")
    assert result is None
    mock_callback.assert_not_called()
@pytest.mark.asyncio
 async def test_track_download_progress_no_content_length():
    mock_response = AsyncMock(spec=aiohttp.ClientResponse)
    mock_response.headers = {}  # No Content-Length header
    mock_response.content.iter_chunked.return_value = AsyncIteratorMock([b'a' * 500, b'b' * 500])
    mock_callback = AsyncMock()
    mock_open = MagicMock(return_value=MagicMock())
    with patch('builtins.open', mock_open):
        result = await track_download_progress(
            mock_response, '/mock/path/model.sft', 'model.sft',
            mock_callback, 'models/model.sft', interval=0.1
        )
    assert result.status == "completed"
    # Check that progress was reported even without knowing the total size
    mock_callback.assert_any_call(
        'models/model.sft',
        DownloadModelStatus(DownloadStatusType.IN_PROGRESS, 0, "Downloading model.sft", already_existed=False)
    )
@pytest.mark.asyncio
 async def test_track_download_progress_interval():
    mock_response = AsyncMock(spec=aiohttp.ClientResponse)
    mock_response.headers = {'Content-Length': '1000'}
    mock_response.content.iter_chunked.return_value = AsyncIteratorMock([b'a' * 100] * 10)
    mock_callback = AsyncMock()
    mock_open = MagicMock(return_value=MagicMock())
    # Create a mock time function that returns incremental float values
    mock_time = MagicMock()
    mock_time.side_effect = [i * 0.5 for i in range(30)]  # This should be enough for 10 chunks
    with patch('builtins.open', mock_open), \
         patch('time.time', mock_time):
        await track_download_progress(
            mock_response, '/mock/path/model.sft', 'model.sft',
            mock_callback, 'models/model.sft', interval=1.0
        )
    # Print out the actual call count and the arguments of each call for debugging
    print(f"mock_callback was called {mock_callback.call_count} times")
    for i, call in enumerate(mock_callback.call_args_list):
        args, kwargs = call
        print(f"Call {i + 1}: {args[1].status}, Progress: {args[1].progress_percentage:.2f}%")
    # Assert that progress was updated at least 3 times (start, at least one interval, and end)
    assert mock_callback.call_count >= 3, f"Expected at least 3 calls, but got {mock_callback.call_count}"
    # Verify the first and last calls
    first_call = mock_callback.call_args_list[0]
    assert first_call[0][1].status == "in_progress"
    # Allow for some initial progress, but it should be less than 50%
    assert 0 <= first_call[0][1].progress_percentage < 50, f"First call progress was {first_call[0][1].progress_percentage}%"
    last_call = mock_callback.call_args_list[-1]
    assert last_call[0][1].status == "completed"
    assert last_call[0][1].progress_percentage == 100
 def test_valid_subdirectory():
    assert validate_model_subdirectory("valid-model123") is True
 def test_subdirectory_too_long():
    assert validate_model_subdirectory("a" * 51) is False
 def test_subdirectory_with_double_dots():
    assert validate_model_subdirectory("model/../unsafe") is False
 def test_subdirectory_with_slash():
    assert validate_model_subdirectory("model/unsafe") is False
 def test_subdirectory_with_special_characters():
    assert validate_model_subdirectory("model@unsafe") is False
 def test_subdirectory_with_underscore_and_dash():
    assert validate_model_subdirectory("valid_model-name") is True
 def test_empty_subdirectory():
    assert validate_model_subdirectory("") is False
@pytest.mark.parametrize("filename, expected", [
    ("valid_model.safetensors", True),
    ("valid_model.sft", True),
    ("valid model.safetensors", True), # Test with space
    ("UPPERCASE_MODEL.SAFETENSORS", True),
    ("model_with.multiple.dots.pt", False),
    ("", False),  # Empty string
    ("../../../etc/passwd", False),  # Path traversal attempt
    ("/etc/passwd", False),  # Absolute path
    ("\\windows\\system32\\config\\sam", False),  # Windows path
    (".hidden_file.pt", False),  # Hidden file
    ("invalid<char>.ckpt", False),  # Invalid character
    ("invalid?.ckpt", False),  # Another invalid character
    ("very" * 100 + ".safetensors", False),  # Too long filename
    ("\nmodel_with_newline.pt", False),  # Newline character
    ("model_with_emoji😊.pt", False),  # Emoji in filename
 ])
 def test_validate_filename(filename, expected):
    assert validate_filename(filename) == expected
--- a/tests-unit/requirements.txt
+++ b/tests-unit/requirements.txt
@ -1 +1,3 @@
 pytest>=7.8.0
 pytest-aiohttp
 pytest-asyncio
--- a/tests-unit/server/routes/internal_routes_test.py
+++ b/tests-unit/server/routes/internal_routes_test.py
@ -0,0 +1,115 @@
 import pytest
 from aiohttp import web
 from unittest.mock import MagicMock, patch
 from api_server.routes.internal.internal_routes import InternalRoutes
 from api_server.services.file_service import FileService
 from folder_paths import models_dir, user_directory, output_directory
@pytest.fixture
 def internal_routes():
    return InternalRoutes()
@pytest.fixture
 def aiohttp_client_factory(aiohttp_client, internal_routes):
    async def _get_client():
        app = internal_routes.get_app()
        return await aiohttp_client(app)
    return _get_client
@pytest.mark.asyncio
 async def test_list_files_valid_directory(aiohttp_client_factory, internal_routes):
    mock_file_list = [
        {"name": "file1.txt", "path": "file1.txt", "type": "file", "size": 100},
        {"name": "dir1", "path": "dir1", "type": "directory"}
    ]
    internal_routes.file_service.list_files = MagicMock(return_value=mock_file_list)
    client = await aiohttp_client_factory()
    resp = await client.get('/files?directory=models')
    assert resp.status == 200
    data = await resp.json()
    assert 'files' in data
    assert len(data['files']) == 2
    assert data['files'] == mock_file_list
    # Check other valid directories
    resp = await client.get('/files?directory=user')
    assert resp.status == 200
    resp = await client.get('/files?directory=output')
    assert resp.status == 200
@pytest.mark.asyncio
 async def test_list_files_invalid_directory(aiohttp_client_factory, internal_routes):
    internal_routes.file_service.list_files = MagicMock(side_effect=ValueError("Invalid directory key"))
    client = await aiohttp_client_factory()
    resp = await client.get('/files?directory=invalid')
    assert resp.status == 400
    data = await resp.json()
    assert 'error' in data
    assert data['error'] == "Invalid directory key"
@pytest.mark.asyncio
 async def test_list_files_exception(aiohttp_client_factory, internal_routes):
    internal_routes.file_service.list_files = MagicMock(side_effect=Exception("Unexpected error"))
    client = await aiohttp_client_factory()
    resp = await client.get('/files?directory=models')
    assert resp.status == 500
    data = await resp.json()
    assert 'error' in data
    assert data['error'] == "Unexpected error"
@pytest.mark.asyncio
 async def test_list_files_no_directory_param(aiohttp_client_factory, internal_routes):
    mock_file_list = []
    internal_routes.file_service.list_files = MagicMock(return_value=mock_file_list)
    client = await aiohttp_client_factory()
    resp = await client.get('/files')
    assert resp.status == 200
    data = await resp.json()
    assert 'files' in data
    assert len(data['files']) == 0
 def test_setup_routes(internal_routes):
    internal_routes.setup_routes()
    routes = internal_routes.routes
    assert any(route.method == 'GET' and str(route.path) == '/files' for route in routes)
 def test_get_app(internal_routes):
    app = internal_routes.get_app()
    assert isinstance(app, web.Application)
    assert internal_routes._app is not None
 def test_get_app_reuse(internal_routes):
    app1 = internal_routes.get_app()
    app2 = internal_routes.get_app()
    assert app1 is app2
@pytest.mark.asyncio
 async def test_routes_added_to_app(aiohttp_client_factory, internal_routes):
    client = await aiohttp_client_factory()
    try:
        resp = await client.get('/files')
        print(f"Response received: status {resp.status}")
    except Exception as e:
        print(f"Exception occurred during GET request: {e}")
        raise
    assert resp.status != 404, "Route /files does not exist"
@pytest.mark.asyncio
 async def test_file_service_initialization():
    with patch('api_server.routes.internal.internal_routes.FileService') as MockFileService:
        # Create a mock instance
        mock_file_service_instance = MagicMock(spec=FileService)
        MockFileService.return_value = mock_file_service_instance
        internal_routes = InternalRoutes()
        # Check if FileService was initialized with the correct parameters
        MockFileService.assert_called_once_with({
            "models": models_dir,
            "user": user_directory,
            "output": output_directory
        })
        # Verify that the file_service attribute of InternalRoutes is set
        assert internal_routes.file_service == mock_file_service_instance
--- a/tests-unit/server/services/file_service_test.py
+++ b/tests-unit/server/services/file_service_test.py
@ -0,0 +1,54 @@
 import pytest
 from unittest.mock import MagicMock
 from api_server.services.file_service import FileService
@pytest.fixture
 def mock_file_system_ops():
    return MagicMock()
@pytest.fixture
 def file_service(mock_file_system_ops):
    allowed_directories = {
        "models": "/path/to/models",
        "user": "/path/to/user",
        "output": "/path/to/output"
    }
    return FileService(allowed_directories, file_system_ops=mock_file_system_ops)
 def test_list_files_valid_directory(file_service, mock_file_system_ops):
    mock_file_system_ops.walk_directory.return_value = [
        {"name": "file1.txt", "path": "file1.txt", "type": "file", "size": 100},
        {"name": "dir1", "path": "dir1", "type": "directory"}
    ]
    result = file_service.list_files("models")
    assert len(result) == 2
    assert result[0]["name"] == "file1.txt"
    assert result[1]["name"] == "dir1"
    mock_file_system_ops.walk_directory.assert_called_once_with("/path/to/models")
 def test_list_files_invalid_directory(file_service):
    # Does not support walking directories outside of the allowed directories
    with pytest.raises(ValueError, match="Invalid directory key"):
        file_service.list_files("invalid_key")
 def test_list_files_empty_directory(file_service, mock_file_system_ops):
    mock_file_system_ops.walk_directory.return_value = []
    result = file_service.list_files("models")
    assert len(result) == 0
    mock_file_system_ops.walk_directory.assert_called_once_with("/path/to/models")
@pytest.mark.parametrize("directory_key", ["models", "user", "output"])
 def test_list_files_all_allowed_directories(file_service, mock_file_system_ops, directory_key):
    mock_file_system_ops.walk_directory.return_value = [
        {"name": f"file_{directory_key}.txt", "path": f"file_{directory_key}.txt", "type": "file", "size": 100}
    ]
    result = file_service.list_files(directory_key)
    assert len(result) == 1
    assert result[0]["name"] == f"file_{directory_key}.txt"
    mock_file_system_ops.walk_directory.assert_called_once_with(f"/path/to/{directory_key}")
--- a/tests-unit/server/utils/file_operations_test.py
+++ b/tests-unit/server/utils/file_operations_test.py
@ -0,0 +1,42 @@
 import pytest
 from typing import List
 from api_server.utils.file_operations import FileSystemOperations, FileSystemItem, is_file_info
@pytest.fixture
 def temp_directory(tmp_path):
    # Create a temporary directory structure
    dir1 = tmp_path / "dir1"
    dir2 = tmp_path / "dir2"
    dir1.mkdir()
    dir2.mkdir()
    (dir1 / "file1.txt").write_text("content1")
    (dir2 / "file2.txt").write_text("content2")
    (tmp_path / "file3.txt").write_text("content3")
    return tmp_path
 def test_walk_directory(temp_directory):
    result: List[FileSystemItem] = FileSystemOperations.walk_directory(str(temp_directory))
    assert len(result) == 5  # 2 directories and 3 files
    files = [item for item in result if item['type'] == 'file']
    dirs = [item for item in result if item['type'] == 'directory']
    assert len(files) == 3
    assert len(dirs) == 2
    file_names = {file['name'] for file in files}
    assert file_names == {'file1.txt', 'file2.txt', 'file3.txt'}
    dir_names = {dir['name'] for dir in dirs}
    assert dir_names == {'dir1', 'dir2'}
 def test_walk_directory_empty(tmp_path):
    result = FileSystemOperations.walk_directory(str(tmp_path))
    assert len(result) == 0
 def test_walk_directory_file_size(temp_directory):
    result: List[FileSystemItem] = FileSystemOperations.walk_directory(str(temp_directory))
    files = [item for item in result if is_file_info(item)]
    for file in files:
        assert file['size'] > 0  # Assuming all files have some content
--- a/tests/inference/extra_model_paths.yaml
+++ b/tests/inference/extra_model_paths.yaml
@ -0,0 +1,4 @@
 # Config for testing nodes
 testing:
    custom_nodes: tests/inference/testing_nodes
--- a/tests/inference/test_execution.py
+++ b/tests/inference/test_execution.py
@ -0,0 +1,499 @@
 from io import BytesIO
 import numpy
 from PIL import Image
 import pytest
 from pytest import fixture
 import time
 import torch
 from typing import Union, Dict
 import json
 import subprocess
 import websocket #NOTE: websocket-client (https://github.com/websocket-client/websocket-client)
 import uuid
 import urllib.request
 import urllib.parse
 import urllib.error
 from comfy_execution.graph_utils import GraphBuilder, Node
 class RunResult:
    def __init__(self, prompt_id: str):
        self.outputs: Dict[str,Dict] = {}
        self.runs: Dict[str,bool] = {}
        self.prompt_id: str = prompt_id
    def get_output(self, node: Node):
        return self.outputs.get(node.id, None)
    def did_run(self, node: Node):
        return self.runs.get(node.id, False)
    def get_images(self, node: Node):
        output = self.get_output(node)
        if output is None:
            return []
        return output.get('image_objects', [])
    def get_prompt_id(self):
        return self.prompt_id
 class ComfyClient:
    def __init__(self):
        self.test_name = ""
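    def connect(self,
                    listen:str = '127.0.0.1',
                    port:Union[str,int] = 8188,
                    client_id: Union[str,None] = None
                    ):
        # Generate the default ID per call; a `str(uuid.uuid4())` default would be
        # evaluated once at definition time and shared by every connection.
        self.client_id = client_id if client_id is not None else str(uuid.uuid4())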
        self.server_address = f"{listen}:{port}"
        ws = websocket.WebSocket()
        ws.connect("ws://{}/ws?clientId={}".format(self.server_address, self.client_id))
        self.ws = ws
    def queue_prompt(self, prompt):
        p = {"prompt": prompt, "client_id": self.client_id}
        data = json.dumps(p).encode('utf-8')
        req = urllib.request.Request("http://{}/prompt".format(self.server_address), data=data)
        return json.loads(urllib.request.urlopen(req).read())
    def get_image(self, filename, subfolder, folder_type):
        data = {"filename": filename, "subfolder": subfolder, "type": folder_type}
        url_values = urllib.parse.urlencode(data)
        with urllib.request.urlopen("http://{}/view?{}".format(self.server_address, url_values)) as response:
            return response.read()
    def get_history(self, prompt_id):
        with urllib.request.urlopen("http://{}/history/{}".format(self.server_address, prompt_id)) as response:
            return json.loads(response.read())
    def set_test_name(self, name):
        self.test_name = name
    def run(self, graph):
        prompt = graph.finalize()
        for node in graph.nodes.values():
            if node.class_type == 'SaveImage':
                node.inputs['filename_prefix'] = self.test_name
        prompt_id = self.queue_prompt(prompt)['prompt_id']
        result = RunResult(prompt_id)
        while True:
            out = self.ws.recv()
            if isinstance(out, str):
                message = json.loads(out)
                if message['type'] == 'executing':
                    data = message['data']
                    if data['prompt_id'] != prompt_id:
                        continue
                    if data['node'] is None:
                        break
                    result.runs[data['node']] = True
                elif message['type'] == 'execution_error':
                    raise Exception(message['data'])
                elif message['type'] == 'execution_cached':
                    pass # Probably want to store this off for testing
        history = self.get_history(prompt_id)[prompt_id]
        # Visit each output node once (the previous outer loop re-fetched every image once per node)
        for node_id in history['outputs']:
            node_output = history['outputs'][node_id]
            result.outputs[node_id] = node_output
            if 'images' in node_output:
                images_output = []
                for image in node_output['images']:
                    image_data = self.get_image(image['filename'], image['subfolder'], image['type'])
                    image_obj = Image.open(BytesIO(image_data))
                    images_output.append(image_obj)
                node_output['image_objects'] = images_output
        return result
 #
 # Loop through these variables
 #
@pytest.mark.execution
 class TestExecution:
    #
    # Initialize server and client
    #
    @fixture(scope="class", autouse=True, params=[
        # (use_lru, lru_size)
        (False, 0),
        (True, 0),
        (True, 100),
    ])
    def _server(self, args_pytest, request):
        # Start server
        pargs = [
            'python','main.py', 
            '--output-directory', args_pytest["output_dir"],
            '--listen', args_pytest["listen"],
            '--port', str(args_pytest["port"]),
            '--extra-model-paths-config', 'tests/inference/extra_model_paths.yaml',
        ]
        use_lru, lru_size = request.param
        if use_lru:
            pargs += ['--cache-lru', str(lru_size)]
        print("Running server with args:", pargs)
        p = subprocess.Popen(pargs)
        yield
        p.kill()
        torch.cuda.empty_cache()
    def start_client(self, listen:str, port:int):
        # Start client
        comfy_client = ComfyClient()
        # Connect to server (with retries)
        n_tries = 5
        for i in range(n_tries):
            time.sleep(4)
            try:
                comfy_client.connect(listen=listen, port=port)
            except ConnectionRefusedError as e:
                print(e)
                print(f"({i+1}/{n_tries}) Retrying...")
            else:
                break
        return comfy_client
    @fixture(scope="class", autouse=True)
    def shared_client(self, args_pytest, _server):
        client = self.start_client(args_pytest["listen"], args_pytest["port"])
        yield client
        del client
        torch.cuda.empty_cache()
    @fixture
    def client(self, shared_client, request):
        shared_client.set_test_name(f"execution[{request.node.name}]")
        yield shared_client
    @fixture
    def builder(self, request):
        yield GraphBuilder(prefix=request.node.name)
    def test_lazy_input(self, client: ComfyClient, builder: GraphBuilder):
        g = builder
        input1 = g.node("StubImage", content="BLACK", height=512, width=512, batch_size=1)
        input2 = g.node("StubImage", content="WHITE", height=512, width=512, batch_size=1)
        mask = g.node("StubMask", value=0.0, height=512, width=512, batch_size=1)
        lazy_mix = g.node("TestLazyMixImages", image1=input1.out(0), image2=input2.out(0), mask=mask.out(0))
        output = g.node("SaveImage", images=lazy_mix.out(0))
        result = client.run(g)
        result_image = result.get_images(output)[0]
        assert not numpy.array(result_image).any(), "Image should be black"
        assert result.did_run(input1)
        assert not result.did_run(input2)
        assert result.did_run(mask)
        assert result.did_run(lazy_mix)
    def test_full_cache(self, client: ComfyClient, builder: GraphBuilder):
        g = builder
        input1 = g.node("StubImage", content="BLACK", height=512, width=512, batch_size=1)
        input2 = g.node("StubImage", content="NOISE", height=512, width=512, batch_size=1)
        mask = g.node("StubMask", value=0.5, height=512, width=512, batch_size=1)
        lazy_mix = g.node("TestLazyMixImages", image1=input1.out(0), image2=input2.out(0), mask=mask.out(0))
        g.node("SaveImage", images=lazy_mix.out(0))
        client.run(g)
        result2 = client.run(g)
        for node_id, node in g.nodes.items():
            assert not result2.did_run(node), f"Node {node_id} ran, but should have been cached"
    def test_partial_cache(self, client: ComfyClient, builder: GraphBuilder):
        g = builder
        input1 = g.node("StubImage", content="BLACK", height=512, width=512, batch_size=1)
        input2 = g.node("StubImage", content="NOISE", height=512, width=512, batch_size=1)
        mask = g.node("StubMask", value=0.5, height=512, width=512, batch_size=1)
        lazy_mix = g.node("TestLazyMixImages", image1=input1.out(0), image2=input2.out(0), mask=mask.out(0))
        g.node("SaveImage", images=lazy_mix.out(0))
        client.run(g)
        mask.inputs['value'] = 0.4
        result2 = client.run(g)
        assert not result2.did_run(input1), "Input1 should have been cached"
        assert not result2.did_run(input2), "Input2 should have been cached"
    def test_error(self, client: ComfyClient, builder: GraphBuilder):
        g = builder
        input1 = g.node("StubImage", content="BLACK", height=512, width=512, batch_size=1)
        # Different size of the two images
        input2 = g.node("StubImage", content="NOISE", height=256, width=256, batch_size=1)
        mask = g.node("StubMask", value=0.5, height=512, width=512, batch_size=1)
        lazy_mix = g.node("TestLazyMixImages", image1=input1.out(0), image2=input2.out(0), mask=mask.out(0))
        g.node("SaveImage", images=lazy_mix.out(0))
        try:
            client.run(g)
            assert False, "Should have raised an error"
        except Exception as e:
            assert 'prompt_id' in e.args[0], f"Did not get back a proper error message: {e}"
    @pytest.mark.parametrize("test_value, expect_error", [
        (5, True),
        ("foo", True),
        (5.0, False),
    ])
    def test_validation_error_literal(self, test_value, expect_error, client: ComfyClient, builder: GraphBuilder):
        g = builder
        validation1 = g.node("TestCustomValidation1", input1=test_value, input2=3.0)
        g.node("SaveImage", images=validation1.out(0))
        if expect_error:
            with pytest.raises(urllib.error.HTTPError):
                client.run(g)
        else:
            client.run(g)
    @pytest.mark.parametrize("test_type, test_value", [
        ("StubInt", 5),
        ("StubFloat", 5.0)
    ])
    def test_validation_error_edge1(self, test_type, test_value, client: ComfyClient, builder: GraphBuilder):
        g = builder
        stub = g.node(test_type, value=test_value)
        validation1 = g.node("TestCustomValidation1", input1=stub.out(0), input2=3.0)
        g.node("SaveImage", images=validation1.out(0))
        with pytest.raises(urllib.error.HTTPError):
            client.run(g)
    @pytest.mark.parametrize("test_type, test_value, expect_error", [
        ("StubInt", 5, True),
        ("StubFloat", 5.0, False)
    ])
    def test_validation_error_edge2(self, test_type, test_value, expect_error, client: ComfyClient, builder: GraphBuilder):
        g = builder
        stub = g.node(test_type, value=test_value)
        validation2 = g.node("TestCustomValidation2", input1=stub.out(0), input2=3.0)
        g.node("SaveImage", images=validation2.out(0))
        if expect_error:
            with pytest.raises(urllib.error.HTTPError):
                client.run(g)
        else:
            client.run(g)
    @pytest.mark.parametrize("test_type, test_value, expect_error", [
        ("StubInt", 5, True),
        ("StubFloat", 5.0, False)
    ])
    def test_validation_error_edge3(self, test_type, test_value, expect_error, client: ComfyClient, builder: GraphBuilder):
        g = builder
        stub = g.node(test_type, value=test_value)
        validation3 = g.node("TestCustomValidation3", input1=stub.out(0), input2=3.0)
        g.node("SaveImage", images=validation3.out(0))
        if expect_error:
            with pytest.raises(urllib.error.HTTPError):
                client.run(g)
        else:
            client.run(g)
    @pytest.mark.parametrize("test_type, test_value, expect_error", [
        ("StubInt", 5, True),
        ("StubFloat", 5.0, False)
    ])
    def test_validation_error_edge4(self, test_type, test_value, expect_error, client: ComfyClient, builder: GraphBuilder):
        g = builder
        stub = g.node(test_type, value=test_value)
        validation4 = g.node("TestCustomValidation4", input1=stub.out(0), input2=3.0)
        g.node("SaveImage", images=validation4.out(0))
        if expect_error:
            with pytest.raises(urllib.error.HTTPError):
                client.run(g)
        else:
            client.run(g)
    @pytest.mark.parametrize("test_value1, test_value2, expect_error", [
        (0.0, 0.5, False),
        (0.0, 5.0, False),
        (0.0, 7.0, True)
    ])
    def test_validation_error_kwargs(self, test_value1, test_value2, expect_error, client: ComfyClient, builder: GraphBuilder):
        g = builder
        validation5 = g.node("TestCustomValidation5", input1=test_value1, input2=test_value2)
        g.node("SaveImage", images=validation5.out(0))
        if expect_error:
            with pytest.raises(urllib.error.HTTPError):
                client.run(g)
        else:
            client.run(g)
    def test_cycle_error(self, client: ComfyClient, builder: GraphBuilder):
        g = builder
        input1 = g.node("StubImage", content="BLACK", height=512, width=512, batch_size=1)
        input2 = g.node("StubImage", content="WHITE", height=512, width=512, batch_size=1)
        mask = g.node("StubMask", value=0.5, height=512, width=512, batch_size=1)
        lazy_mix1 = g.node("TestLazyMixImages", image1=input1.out(0), mask=mask.out(0))
        lazy_mix2 = g.node("TestLazyMixImages", image1=lazy_mix1.out(0), image2=input2.out(0), mask=mask.out(0))
        g.node("SaveImage", images=lazy_mix2.out(0))
        # When the cycle exists on initial submission, it should raise a validation error
        with pytest.raises(urllib.error.HTTPError):
            client.run(g)
    def test_dynamic_cycle_error(self, client: ComfyClient, builder: GraphBuilder):
        g = builder
        input1 = g.node("StubImage", content="BLACK", height=512, width=512, batch_size=1)
        input2 = g.node("StubImage", content="WHITE", height=512, width=512, batch_size=1)
        generator = g.node("TestDynamicDependencyCycle", input1=input1.out(0), input2=input2.out(0))
        g.node("SaveImage", images=generator.out(0))
        # When the cycle is in a graph that is generated dynamically, it should raise a runtime error
        try:
            client.run(g)
            assert False, "Should have raised an error"
        except Exception as e:
            assert 'prompt_id' in e.args[0], f"Did not get back a proper error message: {e}"
            assert e.args[0]['node_id'] == generator.id, "Error should have been on the generator node"
    def test_missing_node_error(self, client: ComfyClient, builder: GraphBuilder):
        g = builder
        input1 = g.node("StubImage", content="BLACK", height=512, width=512, batch_size=1)
        input2 = g.node("StubImage", id="removeme", content="WHITE", height=512, width=512, batch_size=1)
        input3 = g.node("StubImage", content="WHITE", height=512, width=512, batch_size=1)
        mask = g.node("StubMask", value=0.5, height=512, width=512, batch_size=1)
        mix1 = g.node("TestLazyMixImages", image1=input1.out(0), image2=input2.out(0), mask=mask.out(0))
        mix2 = g.node("TestLazyMixImages", image1=input1.out(0), image2=input3.out(0), mask=mask.out(0))
        # We have multiple outputs. The first is invalid, but the second is valid
        g.node("SaveImage", images=mix1.out(0))
        g.node("SaveImage", images=mix2.out(0))
        g.remove_node("removeme")
        client.run(g)
        # Add back in the missing node to make sure the error doesn't break the server
        input2 = g.node("StubImage", id="removeme", content="WHITE", height=512, width=512, batch_size=1)
        client.run(g)
    def test_custom_is_changed(self, client: ComfyClient, builder: GraphBuilder):
        g = builder
        # Creating the nodes in this specific order previously caused a bug
        save = g.node("SaveImage")
        is_changed = g.node("TestCustomIsChanged", should_change=False)
        input1 = g.node("StubImage", content="BLACK", height=512, width=512, batch_size=1)
        save.set_input('images', is_changed.out(0))
        is_changed.set_input('image', input1.out(0))
        result1 = client.run(g)
        result2 = client.run(g)
        is_changed.set_input('should_change', True)
        result3 = client.run(g)
        result4 = client.run(g)
        assert result1.did_run(is_changed), "is_changed should have been run"
        assert not result2.did_run(is_changed), "is_changed should have been cached"
        assert result3.did_run(is_changed), "is_changed should have been re-run"
        assert result4.did_run(is_changed), "is_changed should not have been cached"
    def test_undeclared_inputs(self, client: ComfyClient, builder: GraphBuilder):
        g = builder
        input1 = g.node("StubImage", content="BLACK", height=512, width=512, batch_size=1)
        input2 = g.node("StubImage", content="WHITE", height=512, width=512, batch_size=1)
        input3 = g.node("StubImage", content="BLACK", height=512, width=512, batch_size=1)
        input4 = g.node("StubImage", content="BLACK", height=512, width=512, batch_size=1)
        average = g.node("TestVariadicAverage", input1=input1.out(0), input2=input2.out(0), input3=input3.out(0), input4=input4.out(0))
        output = g.node("SaveImage", images=average.out(0))
        result = client.run(g)
        result_image = result.get_images(output)[0]
        expected = 255 // 4
        assert numpy.array(result_image).min() == expected and numpy.array(result_image).max() == expected, "Image should be grey"
    def test_for_loop(self, client: ComfyClient, builder: GraphBuilder):
        g = builder
        iterations = 4
        input1 = g.node("StubImage", content="BLACK", height=512, width=512, batch_size=1)
        input2 = g.node("StubImage", content="WHITE", height=512, width=512, batch_size=1)
        is_changed = g.node("TestCustomIsChanged", should_change=True, image=input2.out(0))
        for_open = g.node("TestForLoopOpen", remaining=iterations, initial_value1=is_changed.out(0))
        average = g.node("TestVariadicAverage", input1=input1.out(0), input2=for_open.out(2))
        for_close = g.node("TestForLoopClose", flow_control=for_open.out(0), initial_value1=average.out(0))
        output = g.node("SaveImage", images=for_close.out(0))
        for iterations in range(1, 5):
            for_open.set_input('remaining', iterations)
            result = client.run(g)
            result_image = result.get_images(output)[0]
            expected = 255 // (2 ** iterations)
            assert numpy.array(result_image).min() == expected and numpy.array(result_image).max() == expected, "Image should be grey"
            assert result.did_run(is_changed)
    def test_mixed_expansion_returns(self, client: ComfyClient, builder: GraphBuilder):
        g = builder
        val_list = g.node("TestMakeListNode", value1=0.1, value2=0.2, value3=0.3)
        mixed = g.node("TestMixedExpansionReturns", input1=val_list.out(0))
        output_dynamic = g.node("SaveImage", images=mixed.out(0))
        output_literal = g.node("SaveImage", images=mixed.out(1))
        result = client.run(g)
        images_dynamic = result.get_images(output_dynamic)
        assert len(images_dynamic) == 3, "Should have 3 images"
        assert numpy.array(images_dynamic[0]).min() == 25 and numpy.array(images_dynamic[0]).max() == 25, "First image should be 0.1"
        assert numpy.array(images_dynamic[1]).min() == 51 and numpy.array(images_dynamic[1]).max() == 51, "Second image should be 0.2"
        assert numpy.array(images_dynamic[2]).min() == 76 and numpy.array(images_dynamic[2]).max() == 76, "Third image should be 0.3"
        images_literal = result.get_images(output_literal)
        assert len(images_literal) == 3, "Should have 3 images"
        for i in range(3):
            assert numpy.array(images_literal[i]).min() == 255 and numpy.array(images_literal[i]).max() == 255, "All images should be white"
    def test_mixed_lazy_results(self, client: ComfyClient, builder: GraphBuilder):
        g = builder
        val_list = g.node("TestMakeListNode", value1=0.0, value2=0.5, value3=1.0)
        mask = g.node("StubMask", value=val_list.out(0), height=512, width=512, batch_size=1)
        input1 = g.node("StubImage", content="BLACK", height=512, width=512, batch_size=1)
        input2 = g.node("StubImage", content="WHITE", height=512, width=512, batch_size=1)
        mix = g.node("TestLazyMixImages", image1=input1.out(0), image2=input2.out(0), mask=mask.out(0))
        rebatch = g.node("RebatchImages", images=mix.out(0), batch_size=3)
        output = g.node("SaveImage", images=rebatch.out(0))
        result = client.run(g)
        images = result.get_images(output)
        assert len(images) == 3, "Should have 3 images"
        assert numpy.array(images[0]).min() == 0 and numpy.array(images[0]).max() == 0, "First image should be 0.0"
        assert numpy.array(images[1]).min() == 127 and numpy.array(images[1]).max() == 127, "Second image should be 0.5"
        assert numpy.array(images[2]).min() == 255 and numpy.array(images[2]).max() == 255, "Third image should be 1.0"
    def test_output_reuse(self, client: ComfyClient, builder: GraphBuilder):
        g = builder
        input1 = g.node("StubImage", content="BLACK", height=512, width=512, batch_size=1)
        output1 = g.node("SaveImage", images=input1.out(0))
        output2 = g.node("SaveImage", images=input1.out(0))
        result = client.run(g)
        images1 = result.get_images(output1)
        images2 = result.get_images(output2)
        assert len(images1) == 1, "Should have 1 image"
        assert len(images2) == 1, "Should have 1 image"
    # This tests that only constant outputs are used in the call to `IS_CHANGED`
    def test_is_changed_with_outputs(self, client: ComfyClient, builder: GraphBuilder):
        g = builder
        input1 = g.node("StubConstantImage", value=0.5, height=512, width=512, batch_size=1)
        test_node = g.node("TestIsChangedWithConstants", image=input1.out(0), value=0.5)
        output = g.node("PreviewImage", images=test_node.out(0))
        result = client.run(g)
        images = result.get_images(output)
        assert len(images) == 1, "Should have 1 image"
        assert numpy.array(images[0]).min() == 63 and numpy.array(images[0]).max() == 63, "Image should have value 0.25"
        result = client.run(g)
        images = result.get_images(output)
        assert len(images) == 1, "Should have 1 image"
        assert numpy.array(images[0]).min() == 63 and numpy.array(images[0]).max() == 63, "Image should have value 0.25"
        assert not result.did_run(test_node), "The execution should have been cached"
--- a/tests/inference/testing_nodes/testing-pack/__init__.py
+++ b/tests/inference/testing_nodes/testing-pack/__init__.py
@ -0,0 +1,23 @@
 from .specific_tests import TEST_NODE_CLASS_MAPPINGS, TEST_NODE_DISPLAY_NAME_MAPPINGS
 from .flow_control import FLOW_CONTROL_NODE_CLASS_MAPPINGS, FLOW_CONTROL_NODE_DISPLAY_NAME_MAPPINGS
 from .util import UTILITY_NODE_CLASS_MAPPINGS, UTILITY_NODE_DISPLAY_NAME_MAPPINGS
 from .conditions import CONDITION_NODE_CLASS_MAPPINGS, CONDITION_NODE_DISPLAY_NAME_MAPPINGS
 from .stubs import TEST_STUB_NODE_CLASS_MAPPINGS, TEST_STUB_NODE_DISPLAY_NAME_MAPPINGS
 # NODE_CLASS_MAPPINGS = GENERAL_NODE_CLASS_MAPPINGS.update(COMPONENT_NODE_CLASS_MAPPINGS)
 # NODE_DISPLAY_NAME_MAPPINGS = GENERAL_NODE_DISPLAY_NAME_MAPPINGS.update(COMPONENT_NODE_DISPLAY_NAME_MAPPINGS)
 NODE_CLASS_MAPPINGS = {}
 NODE_CLASS_MAPPINGS.update(TEST_NODE_CLASS_MAPPINGS)
 NODE_CLASS_MAPPINGS.update(FLOW_CONTROL_NODE_CLASS_MAPPINGS)
 NODE_CLASS_MAPPINGS.update(UTILITY_NODE_CLASS_MAPPINGS)
 NODE_CLASS_MAPPINGS.update(CONDITION_NODE_CLASS_MAPPINGS)
 NODE_CLASS_MAPPINGS.update(TEST_STUB_NODE_CLASS_MAPPINGS)
 NODE_DISPLAY_NAME_MAPPINGS = {}
 NODE_DISPLAY_NAME_MAPPINGS.update(TEST_NODE_DISPLAY_NAME_MAPPINGS)
 NODE_DISPLAY_NAME_MAPPINGS.update(FLOW_CONTROL_NODE_DISPLAY_NAME_MAPPINGS)
 NODE_DISPLAY_NAME_MAPPINGS.update(UTILITY_NODE_DISPLAY_NAME_MAPPINGS)
 NODE_DISPLAY_NAME_MAPPINGS.update(CONDITION_NODE_DISPLAY_NAME_MAPPINGS)
 NODE_DISPLAY_NAME_MAPPINGS.update(TEST_STUB_NODE_DISPLAY_NAME_MAPPINGS)
--- a/tests/inference/testing_nodes/testing-pack/conditions.py
+++ b/tests/inference/testing_nodes/testing-pack/conditions.py
@ -0,0 +1,194 @@
 import re
 import torch
 class TestIntConditions:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "a": ("INT", {"default": 0, "min": -0xffffffffffffffff, "max": 0xffffffffffffffff, "step": 1}),
                "b": ("INT", {"default": 0, "min": -0xffffffffffffffff, "max": 0xffffffffffffffff, "step": 1}),
                "operation": (["==", "!=", "<", ">", "<=", ">="],),
            },
        }
    RETURN_TYPES = ("BOOLEAN",)
    FUNCTION = "int_condition"
    CATEGORY = "Testing/Logic"
    def int_condition(self, a, b, operation):
        if operation == "==":
            return (a == b,)
        elif operation == "!=":
            return (a != b,)
        elif operation == "<":
            return (a < b,)
        elif operation == ">":
            return (a > b,)
        elif operation == "<=":
            return (a <= b,)
        elif operation == ">=":
            return (a >= b,)
 class TestFloatConditions:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "a": ("FLOAT", {"default": 0, "min": -999999999999.0, "max": 999999999999.0, "step": 1}),
                "b": ("FLOAT", {"default": 0, "min": -999999999999.0, "max": 999999999999.0, "step": 1}),
                "operation": (["==", "!=", "<", ">", "<=", ">="],),
            },
        }
    RETURN_TYPES = ("BOOLEAN",)
    FUNCTION = "float_condition"
    CATEGORY = "Testing/Logic"
    def float_condition(self, a, b, operation):
        if operation == "==":
            return (a == b,)
        elif operation == "!=":
            return (a != b,)
        elif operation == "<":
            return (a < b,)
        elif operation == ">":
            return (a > b,)
        elif operation == "<=":
            return (a <= b,)
        elif operation == ">=":
            return (a >= b,)
 class TestStringConditions:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "a": ("STRING", {"multiline": False}),
                "b": ("STRING", {"multiline": False}),
                "operation": (["a == b", "a != b", "a IN b", "a MATCH REGEX(b)", "a BEGINSWITH b", "a ENDSWITH b"],),
                "case_sensitive": ("BOOLEAN", {"default": True}),
            },
        }
    RETURN_TYPES = ("BOOLEAN",)
    FUNCTION = "string_condition"
    CATEGORY = "Testing/Logic"
    def string_condition(self, a, b, operation, case_sensitive):
        if not case_sensitive:
            a = a.lower()
            b = b.lower()
        if operation == "a == b":
            return (a == b,)
        elif operation == "a != b":
            return (a != b,)
        elif operation == "a IN b":
            return (a in b,)
        elif operation == "a MATCH REGEX(b)":
            try:
                return (re.match(b, a) is not None,)
            except re.error:
                return (False,)
        elif operation == "a BEGINSWITH b":
            return (a.startswith(b),)
        elif operation == "a ENDSWITH b":
            return (a.endswith(b),)
 class TestToBoolNode:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "value": ("*",),
            },
            "optional": {
                "invert": ("BOOLEAN", {"default": False}),
            },
        }
    RETURN_TYPES = ("BOOLEAN",)
    FUNCTION = "to_bool"
    CATEGORY = "Testing/Logic"
    def to_bool(self, value, invert = False):
        if isinstance(value, torch.Tensor):
            if value.max().item() == 0 and value.min().item() == 0:
                result = False
            else:
                result = True
        else:
            try:
                result = bool(value)
            except Exception:
                # Can't convert it? Well then it's something or other. I dunno, I'm not a Python programmer.
                result = True
        if invert:
            result = not result
        return (result,)
 class TestBoolOperationNode:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "a": ("BOOLEAN",),
                "b": ("BOOLEAN",),
                "op": (["a AND b", "a OR b", "a XOR b", "NOT a"],),
            },
        }
    RETURN_TYPES = ("BOOLEAN",)
    FUNCTION = "bool_operation"
    CATEGORY = "Testing/Logic"
    def bool_operation(self, a, b, op):
        if op == "a AND b":
            return (a and b,)
        elif op == "a OR b":
            return (a or b,)
        elif op == "a XOR b":
            return (a ^ b,)
        elif op == "NOT a":
            return (not a,)
 CONDITION_NODE_CLASS_MAPPINGS = {
    "TestIntConditions": TestIntConditions,
    "TestFloatConditions": TestFloatConditions,
    "TestStringConditions": TestStringConditions,
    "TestToBoolNode": TestToBoolNode,
    "TestBoolOperationNode": TestBoolOperationNode,
 }
 CONDITION_NODE_DISPLAY_NAME_MAPPINGS = {
    "TestIntConditions": "Int Condition",
    "TestFloatConditions": "Float Condition",
    "TestStringConditions": "String Condition",
    "TestToBoolNode": "To Bool",
    "TestBoolOperationNode": "Bool Operation",
 }
--- a/tests/inference/testing_nodes/testing-pack/flow_control.py
+++ b/tests/inference/testing_nodes/testing-pack/flow_control.py
@ -0,0 +1,173 @@
 from comfy_execution.graph_utils import GraphBuilder, is_link
 from comfy_execution.graph import ExecutionBlocker
 from .tools import VariantSupport
 NUM_FLOW_SOCKETS = 5
@VariantSupport()
 class TestWhileLoopOpen:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        inputs = {
            "required": {
                "condition": ("BOOLEAN", {"default": True}),
            },
            "optional": {
            },
        }
        for i in range(NUM_FLOW_SOCKETS):
            inputs["optional"][f"initial_value{i}"] = ("*",)
        return inputs
    RETURN_TYPES = tuple(["FLOW_CONTROL"] + ["*"] * NUM_FLOW_SOCKETS)
    RETURN_NAMES = tuple(["FLOW_CONTROL"] + [f"value{i}" for i in range(NUM_FLOW_SOCKETS)])
    FUNCTION = "while_loop_open"
    CATEGORY = "Testing/Flow"
    def while_loop_open(self, condition, **kwargs):
        values = []
        for i in range(NUM_FLOW_SOCKETS):
            values.append(kwargs.get(f"initial_value{i}", None))
        return tuple(["stub"] + values)
@VariantSupport()
 class TestWhileLoopClose:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        inputs = {
            "required": {
                "flow_control": ("FLOW_CONTROL", {"rawLink": True}),
                "condition": ("BOOLEAN", {"forceInput": True}),
            },
            "optional": {
            },
            "hidden": {
                "dynprompt": "DYNPROMPT",
                "unique_id": "UNIQUE_ID",
            }
        }
        for i in range(NUM_FLOW_SOCKETS):
            inputs["optional"][f"initial_value{i}"] = ("*",)
        return inputs
    RETURN_TYPES = tuple(["*"] * NUM_FLOW_SOCKETS)
    RETURN_NAMES = tuple([f"value{i}" for i in range(NUM_FLOW_SOCKETS)])
    FUNCTION = "while_loop_close"
    CATEGORY = "Testing/Flow"
    def explore_dependencies(self, node_id, dynprompt, upstream):
        node_info = dynprompt.get_node(node_id)
        if "inputs" not in node_info:
            return
        for k, v in node_info["inputs"].items():
            if is_link(v):
                parent_id = v[0]
                if parent_id not in upstream:
                    upstream[parent_id] = []
                    self.explore_dependencies(parent_id, dynprompt, upstream)
                upstream[parent_id].append(node_id)
    def collect_contained(self, node_id, upstream, contained):
        if node_id not in upstream:
            return
        for child_id in upstream[node_id]:
            if child_id not in contained:
                contained[child_id] = True
                self.collect_contained(child_id, upstream, contained)
    def while_loop_close(self, flow_control, condition, dynprompt=None, unique_id=None, **kwargs):
        assert dynprompt is not None
        if not condition:
            # We're done with the loop
            values = []
            for i in range(NUM_FLOW_SOCKETS):
                values.append(kwargs.get(f"initial_value{i}", None))
            return tuple(values)
        # We want to loop
        upstream = {}
        # Get the list of all nodes between the open and close nodes
        self.explore_dependencies(unique_id, dynprompt, upstream)
        contained = {}
        open_node = flow_control[0]
        self.collect_contained(open_node, upstream, contained)
        contained[unique_id] = True
        contained[open_node] = True
        # We'll use the default prefix, but to avoid having node names grow exponentially in size,
        # we'll use "Recurse" for the name of the recursively-generated copy of this node.
        graph = GraphBuilder()
        for node_id in contained:
            original_node = dynprompt.get_node(node_id)
            node = graph.node(original_node["class_type"], "Recurse" if node_id == unique_id else node_id)
            node.set_override_display_id(node_id)
        for node_id in contained:
            original_node = dynprompt.get_node(node_id)
            node = graph.lookup_node("Recurse" if node_id == unique_id else node_id)
            assert node is not None
            for k, v in original_node["inputs"].items():
                if is_link(v) and v[0] in contained:
                    parent = graph.lookup_node(v[0])
                    assert parent is not None
                    node.set_input(k, parent.out(v[1]))
                else:
                    node.set_input(k, v)
        new_open = graph.lookup_node(open_node)
        assert new_open is not None
        for i in range(NUM_FLOW_SOCKETS):
            key = f"initial_value{i}"
            new_open.set_input(key, kwargs.get(key, None))
        my_clone = graph.lookup_node("Recurse")
        assert my_clone is not None
        result = map(lambda x: my_clone.out(x), range(NUM_FLOW_SOCKETS))
        return {
            "result": tuple(result),
            "expand": graph.finalize(),
        }
@VariantSupport()
 class TestExecutionBlockerNode:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        inputs = {
            "required": {
                "input": ("*",),
                "block": ("BOOLEAN",),
                "verbose": ("BOOLEAN", {"default": False}),
            },
        }
        return inputs
    RETURN_TYPES = ("*",)
    RETURN_NAMES = ("output",)
    FUNCTION = "execution_blocker"
    CATEGORY = "Testing/Flow"
    def execution_blocker(self, input, block, verbose):
        if block:
            return (ExecutionBlocker("Blocked Execution" if verbose else None),)
        return (input,)
 FLOW_CONTROL_NODE_CLASS_MAPPINGS = {
    "TestWhileLoopOpen": TestWhileLoopOpen,
    "TestWhileLoopClose": TestWhileLoopClose,
    "TestExecutionBlocker": TestExecutionBlockerNode,
 }
 FLOW_CONTROL_NODE_DISPLAY_NAME_MAPPINGS = {
    "TestWhileLoopOpen": "While Loop Open",
    "TestWhileLoopClose": "While Loop Close",
    "TestExecutionBlocker": "Execution Blocker",
 }
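The loop-body discovery in `TestWhileLoopClose` (walk upstream from the close node, then collect everything downstream of the open node) can be sketched on a plain dictionary. This is a minimal illustration, not the executor's code: the `prompt` dictionary, the node ids, and `is_link_sketch` are hypothetical stand-ins for `dynprompt` and `is_link`, and the sketch assumes a link is a two-element list `[parent_id, output_index]`.

```python
def is_link_sketch(value):
    # Assumption: a link is a two-element list [parent_node_id, output_index].
    return isinstance(value, list) and len(value) == 2 and isinstance(value[0], str)


def explore_dependencies(node_id, prompt, upstream):
    # Record, for every ancestor of node_id, which nodes consume its outputs.
    for value in prompt[node_id].get("inputs", {}).values():
        if is_link_sketch(value):
            parent_id = value[0]
            if parent_id not in upstream:
                upstream[parent_id] = []
                explore_dependencies(parent_id, prompt, upstream)
            upstream[parent_id].append(node_id)


def collect_contained(node_id, upstream, contained):
    # Follow the consumer lists downstream, starting from node_id.
    for child_id in upstream.get(node_id, []):
        if child_id not in contained:
            contained[child_id] = True
            collect_contained(child_id, upstream, contained)


# Hypothetical graph: "seed" feeds the loop from outside, "body" and "cond" sit inside it.
prompt = {
    "seed": {"inputs": {}},
    "open": {"inputs": {"initial_value0": ["seed", 0]}},
    "body": {"inputs": {"x": ["open", 1]}},
    "cond": {"inputs": {"a": ["body", 0]}},
    "close": {"inputs": {
        "flow_control": ["open", 0],
        "condition": ["cond", 0],
        "initial_value0": ["body", 0],
    }},
}

upstream = {}
explore_dependencies("close", prompt, upstream)
contained = {}
collect_contained("open", upstream, contained)
contained["open"] = True
contained["close"] = True
loop_nodes = sorted(contained)
print(loop_nodes)
```

Because the consumer lists are built only from ancestors of the close node, a node that merely feeds the loop from outside (here `seed`) is never collected, so only the loop body would be cloned on each iteration.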
--- a/tests/inference/testing_nodes/testing-pack/specific_tests.py
+++ b/tests/inference/testing_nodes/testing-pack/specific_tests.py
@ -0,0 +1,362 @@
 import torch
 from .tools import VariantSupport
 from comfy_execution.graph_utils import GraphBuilder
 class TestLazyMixImages:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "image1": ("IMAGE",{"lazy": True}),
                "image2": ("IMAGE",{"lazy": True}),
                "mask": ("MASK",),
            },
        }
    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "mix"
    CATEGORY = "Testing/Nodes"
    def check_lazy_status(self, mask, image1, image2):
        mask_min = mask.min()
        mask_max = mask.max()
        needed = []
        if image1 is None and (mask_min != 1.0 or mask_max != 1.0):
            needed.append("image1")
        if image2 is None and (mask_min != 0.0 or mask_max != 0.0):
            needed.append("image2")
        return needed
    # Not trying to handle different batch sizes here just to keep the demo simple
    def mix(self, mask, image1, image2):
        mask_min = mask.min()
        mask_max = mask.max()
        if mask_min == 0.0 and mask_max == 0.0:
            return (image1,)
        elif mask_min == 1.0 and mask_max == 1.0:
            return (image2,)
        if len(mask.shape) == 2:
            mask = mask.unsqueeze(0)
        if len(mask.shape) == 3:
            mask = mask.unsqueeze(3)
        if mask.shape[3] < image1.shape[3]:
            mask = mask.repeat(1, 1, 1, image1.shape[3])
        result = image1 * (1. - mask) + image2 * mask
        return (result,)
 class TestVariadicAverage:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "input1": ("IMAGE",),
            },
        }
    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "variadic_average"
    CATEGORY = "Testing/Nodes"
    def variadic_average(self, input1, **kwargs):
        inputs = [input1]
        while 'input' + str(len(inputs) + 1) in kwargs:
            inputs.append(kwargs['input' + str(len(inputs) + 1)])
        return (torch.stack(inputs).mean(dim=0),)
 class TestCustomIsChanged:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "image": ("IMAGE",),
            },
            "optional": {
                "should_change": ("BOOLEAN", {"default": False}),
            },
        }
    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "custom_is_changed"
    CATEGORY = "Testing/Nodes"
    def custom_is_changed(self, image, should_change=False):
        return (image,)
    @classmethod
    def IS_CHANGED(cls, should_change=False, *args, **kwargs):
        if should_change:
            return float("NaN")
        else:
            return False
 class TestIsChangedWithConstants:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "image": ("IMAGE",),
                "value": ("FLOAT", {"default": 1.0, "min": 0.0, "max": 10.0}),
            },
        }
    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "custom_is_changed"
    CATEGORY = "Testing/Nodes"
    def custom_is_changed(self, image, value):
        return (image * value,)
    @classmethod
    def IS_CHANGED(cls, image, value):
        if image is None:
            return value
        else:
            return image.mean().item() * value
 class TestCustomValidation1:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "input1": ("IMAGE,FLOAT",),
                "input2": ("IMAGE,FLOAT",),
            },
        }
    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "custom_validation1"
    CATEGORY = "Testing/Nodes"
    def custom_validation1(self, input1, input2):
        if isinstance(input1, float) and isinstance(input2, float):
            result = torch.ones([1, 512, 512, 3]) * input1 * input2
        else:
            result = input1 * input2
        return (result,)
    @classmethod
    def VALIDATE_INPUTS(cls, input1=None, input2=None):
        if input1 is not None:
            if not isinstance(input1, (torch.Tensor, float)):
                return f"Invalid type of input1: {type(input1)}"
        if input2 is not None:
            if not isinstance(input2, (torch.Tensor, float)):
                return f"Invalid type of input2: {type(input2)}"
        return True
 class TestCustomValidation2:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "input1": ("IMAGE,FLOAT",),
                "input2": ("IMAGE,FLOAT",),
            },
        }
    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "custom_validation2"
    CATEGORY = "Testing/Nodes"
    def custom_validation2(self, input1, input2):
        if isinstance(input1, float) and isinstance(input2, float):
            result = torch.ones([1, 512, 512, 3]) * input1 * input2
        else:
            result = input1 * input2
        return (result,)
    @classmethod
    def VALIDATE_INPUTS(cls, input_types, input1=None, input2=None):
        if input1 is not None:
            if not isinstance(input1, (torch.Tensor, float)):
                return f"Invalid type of input1: {type(input1)}"
        if input2 is not None:
            if not isinstance(input2, (torch.Tensor, float)):
                return f"Invalid type of input2: {type(input2)}"
        if 'input1' in input_types:
            if input_types['input1'] not in ["IMAGE", "FLOAT"]:
                return f"Invalid type of input1: {input_types['input1']}"
        if 'input2' in input_types:
            if input_types['input2'] not in ["IMAGE", "FLOAT"]:
                return f"Invalid type of input2: {input_types['input2']}"
        return True
@VariantSupport()
 class TestCustomValidation3:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "input1": ("IMAGE,FLOAT",),
                "input2": ("IMAGE,FLOAT",),
            },
        }
    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "custom_validation3"
    CATEGORY = "Testing/Nodes"
    def custom_validation3(self, input1, input2):
        if isinstance(input1, float) and isinstance(input2, float):
            result = torch.ones([1, 512, 512, 3]) * input1 * input2
        else:
            result = input1 * input2
        return (result,)
 class TestCustomValidation4:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "input1": ("FLOAT",),
                "input2": ("FLOAT",),
            },
        }
    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "custom_validation4"
    CATEGORY = "Testing/Nodes"
    def custom_validation4(self, input1, input2):
        result = torch.ones([1, 512, 512, 3]) * input1 * input2
        return (result,)
    @classmethod
    def VALIDATE_INPUTS(cls, input1, input2):
        if input1 is not None:
            if not isinstance(input1, float):
                return f"Invalid type of input1: {type(input1)}"
        if input2 is not None:
            if not isinstance(input2, float):
                return f"Invalid type of input2: {type(input2)}"
        return True
 class TestCustomValidation5:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "input1": ("FLOAT", {"min": 0.0, "max": 1.0}),
                "input2": ("FLOAT", {"min": 0.0, "max": 1.0}),
            },
        }
    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "custom_validation5"
    CATEGORY = "Testing/Nodes"
    def custom_validation5(self, input1, input2):
        value = input1 * input2
        return (torch.ones([1, 512, 512, 3]) * value,)
    @classmethod
    def VALIDATE_INPUTS(cls, **kwargs):
        if kwargs['input2'] == 7.0:
            return "7s are not allowed. I've never liked 7s."
        return True
 class TestDynamicDependencyCycle:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "input1": ("IMAGE",),
                "input2": ("IMAGE",),
            },
        }
    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "dynamic_dependency_cycle"
    CATEGORY = "Testing/Nodes"
    def dynamic_dependency_cycle(self, input1, input2):
        g = GraphBuilder()
        mask = g.node("StubMask", value=0.5, height=512, width=512, batch_size=1)
        mix1 = g.node("TestLazyMixImages", image1=input1, mask=mask.out(0))
        mix2 = g.node("TestLazyMixImages", image1=mix1.out(0), image2=input2, mask=mask.out(0))
        # Create the cycle
        mix1.set_input("image2", mix2.out(0))
        return {
            "result": (mix2.out(0),),
            "expand": g.finalize(),
        }
 class TestMixedExpansionReturns:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "input1": ("FLOAT",),
            },
        }
    RETURN_TYPES = ("IMAGE","IMAGE")
    FUNCTION = "mixed_expansion_returns"
    CATEGORY = "Testing/Nodes"
    def mixed_expansion_returns(self, input1):
        white_image = torch.ones([1, 512, 512, 3])
        if input1 <= 0.1:
            return (torch.ones([1, 512, 512, 3]) * 0.1, white_image)
        elif input1 <= 0.2:
            return {
                "result": (torch.ones([1, 512, 512, 3]) * 0.2, white_image),
            }
        else:
            g = GraphBuilder()
            mask = g.node("StubMask", value=0.3, height=512, width=512, batch_size=1)
            black = g.node("StubImage", content="BLACK", height=512, width=512, batch_size=1)
            white = g.node("StubImage", content="WHITE", height=512, width=512, batch_size=1)
            mix = g.node("TestLazyMixImages", image1=black.out(0), image2=white.out(0), mask=mask.out(0))
            return {
                "result": (mix.out(0), white_image),
                "expand": g.finalize(),
            }
 TEST_NODE_CLASS_MAPPINGS = {
    "TestLazyMixImages": TestLazyMixImages,
    "TestVariadicAverage": TestVariadicAverage,
    "TestCustomIsChanged": TestCustomIsChanged,
    "TestIsChangedWithConstants": TestIsChangedWithConstants,
    "TestCustomValidation1": TestCustomValidation1,
    "TestCustomValidation2": TestCustomValidation2,
    "TestCustomValidation3": TestCustomValidation3,
    "TestCustomValidation4": TestCustomValidation4,
    "TestCustomValidation5": TestCustomValidation5,
    "TestDynamicDependencyCycle": TestDynamicDependencyCycle,
    "TestMixedExpansionReturns": TestMixedExpansionReturns,
 }
 TEST_NODE_DISPLAY_NAME_MAPPINGS = {
    "TestLazyMixImages": "Lazy Mix Images",
    "TestVariadicAverage": "Variadic Average",
    "TestCustomIsChanged": "Custom IsChanged",
    "TestIsChangedWithConstants": "IsChanged With Constants",
    "TestCustomValidation1": "Custom Validation 1",
    "TestCustomValidation2": "Custom Validation 2",
    "TestCustomValidation3": "Custom Validation 3",
    "TestCustomValidation4": "Custom Validation 4",
    "TestCustomValidation5": "Custom Validation 5",
    "TestDynamicDependencyCycle": "Dynamic Dependency Cycle",
    "TestMixedExpansionReturns": "Mixed Expansion Returns",
 }
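`TestCustomIsChanged.IS_CHANGED` returns `float("NaN")` to force re-execution. The reason this works can be sketched under one assumption: that the caller reuses a cached result only when the new `IS_CHANGED` value compares equal to the previous one. The helper names below are hypothetical and are not part of the executor.

```python
import math


def is_changed_sketch(should_change=False):
    # Same return values as TestCustomIsChanged.IS_CHANGED.
    return float("NaN") if should_change else False


def needs_rerun_sketch(previous, current):
    # Assumption: a node is re-run whenever the two values differ.
    return previous != current


stable_a = is_changed_sketch(False)
stable_b = is_changed_sketch(False)
volatile_a = is_changed_sketch(True)
volatile_b = is_changed_sketch(True)

print(needs_rerun_sketch(stable_a, stable_b))      # False: False == False
print(needs_rerun_sketch(volatile_a, volatile_b))  # True: NaN never equals NaN
```

NaN is the one float value that is unequal to itself, so it never matches a stored value, whereas a constant such as `False` always matches.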
--- a/tests/inference/testing_nodes/testing-pack/stubs.py
+++ b/tests/inference/testing_nodes/testing-pack/stubs.py
@ -0,0 +1,129 @@
 import torch
 class StubImage:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "content": (['WHITE', 'BLACK', 'NOISE'],),
                "height": ("INT", {"default": 512, "min": 1, "max": 1024 ** 3, "step": 1}),
                "width": ("INT", {"default": 512, "min": 1, "max": 4096 ** 3, "step": 1}),
                "batch_size": ("INT", {"default": 1, "min": 1, "max": 1024 ** 3, "step": 1}),
            },
        }
    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "stub_image"
    CATEGORY = "Testing/Stub Nodes"
    def stub_image(self, content, height, width, batch_size):
        if content == "WHITE":
            return (torch.ones(batch_size, height, width, 3),)
        elif content == "BLACK":
            return (torch.zeros(batch_size, height, width, 3),)
        elif content == "NOISE":
            return (torch.rand(batch_size, height, width, 3),)
 class StubConstantImage:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "value": ("FLOAT", {"default": 0.5, "min": 0.0, "max": 1.0, "step": 0.01}),
                "height": ("INT", {"default": 512, "min": 1, "max": 1024 ** 3, "step": 1}),
                "width": ("INT", {"default": 512, "min": 1, "max": 4096 ** 3, "step": 1}),
                "batch_size": ("INT", {"default": 1, "min": 1, "max": 1024 ** 3, "step": 1}),
            },
        }
    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "stub_constant_image"
    CATEGORY = "Testing/Stub Nodes"
    def stub_constant_image(self, value, height, width, batch_size):
        return (torch.ones(batch_size, height, width, 3) * value,)
 class StubMask:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "value": ("FLOAT", {"default": 0.5, "min": 0.0, "max": 1.0, "step": 0.01}),
                "height": ("INT", {"default": 512, "min": 1, "max": 1024 ** 3, "step": 1}),
                "width": ("INT", {"default": 512, "min": 1, "max": 4096 ** 3, "step": 1}),
                "batch_size": ("INT", {"default": 1, "min": 1, "max": 1024 ** 3, "step": 1}),
            },
        }
    RETURN_TYPES = ("MASK",)
    FUNCTION = "stub_mask"
    CATEGORY = "Testing/Stub Nodes"
    def stub_mask(self, value, height, width, batch_size):
        return (torch.ones(batch_size, height, width) * value,)
 class StubInt:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "value": ("INT", {"default": 0, "min": -0xffffffff, "max": 0xffffffff, "step": 1}),
            },
        }
    RETURN_TYPES = ("INT",)
    FUNCTION = "stub_int"
    CATEGORY = "Testing/Stub Nodes"
    def stub_int(self, value):
        return (value,)
 class StubFloat:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "value": ("FLOAT", {"default": 0.0, "min": -1.0e38, "max": 1.0e38, "step": 0.01}),
            },
        }
    RETURN_TYPES = ("FLOAT",)
    FUNCTION = "stub_float"
    CATEGORY = "Testing/Stub Nodes"
    def stub_float(self, value):
        return (value,)
 TEST_STUB_NODE_CLASS_MAPPINGS = {
    "StubImage": StubImage,
    "StubConstantImage": StubConstantImage,
    "StubMask": StubMask,
    "StubInt": StubInt,
    "StubFloat": StubFloat,
 }
 TEST_STUB_NODE_DISPLAY_NAME_MAPPINGS = {
    "StubImage": "Stub Image",
    "StubConstantImage": "Stub Constant Image",
    "StubMask": "Stub Mask",
    "StubInt": "Stub Int",
    "StubFloat": "Stub Float",
 }
--- a/tests/inference/testing_nodes/testing-pack/tools.py
+++ b/tests/inference/testing_nodes/testing-pack/tools.py
@ -0,0 +1,53 @@
 def MakeSmartType(t):
    if isinstance(t, str):
        return SmartType(t)
    return t
 class SmartType(str):
    def __ne__(self, other):
        if self == "*" or other == "*":
            return False
        selfset = set(self.split(','))
        otherset = set(other.split(','))
        return not selfset.issubset(otherset)
 def VariantSupport():
    def decorator(cls):
        if hasattr(cls, "INPUT_TYPES"):
            old_input_types = getattr(cls, "INPUT_TYPES")
            def new_input_types(*args, **kwargs):
                types = old_input_types(*args, **kwargs)
                for category in ["required", "optional"]:
                    if category not in types:
                        continue
                    for key, value in types[category].items():
                        if isinstance(value, tuple):
                            types[category][key] = (MakeSmartType(value[0]),) + value[1:]
                return types
            setattr(cls, "INPUT_TYPES", new_input_types)
        if hasattr(cls, "RETURN_TYPES"):
            old_return_types = cls.RETURN_TYPES
            setattr(cls, "RETURN_TYPES", tuple(MakeSmartType(x) for x in old_return_types))
        if hasattr(cls, "VALIDATE_INPUTS"):
            # Reflection is used to determine what the function signature is, so we can't just change the function signature
            raise NotImplementedError("VariantSupport does not support VALIDATE_INPUTS yet")
        else:
            def validate_inputs(input_types):
                inputs = cls.INPUT_TYPES()
                for key, value in input_types.items():
                    if isinstance(value, SmartType):
                        continue
                    if "required" in inputs and key in inputs["required"]:
                        expected_type = inputs["required"][key][0]
                    elif "optional" in inputs and key in inputs["optional"]:
                        expected_type = inputs["optional"][key][0]
                    else:
                        expected_type = None
                    if expected_type is not None and MakeSmartType(value) != expected_type:
                        return f"Invalid type of {key}: {value} (expected {expected_type})"
                return True
            setattr(cls, "VALIDATE_INPUTS", validate_inputs)
        return cls
    return decorator
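The effect of the `SmartType.__ne__` override can be seen in isolation. The class body below is copied from `tools.py`; the variable names and the type strings chosen for the comparison are illustrative only.

```python
class SmartType(str):
    def __ne__(self, other):
        # A wildcard on either side is compatible with everything.
        if self == "*" or other == "*":
            return False
        selfset = set(self.split(","))
        otherset = set(other.split(","))
        # "Not equal" here means: self offers a type that other does not accept.
        return not selfset.issubset(otherset)


wildcard = SmartType("*")
image_only = SmartType("IMAGE")
image_or_float = SmartType("IMAGE,FLOAT")

wildcard_mismatch = wildcard != "LATENT"        # False: "*" matches anything
subset_mismatch = image_only != "IMAGE,FLOAT"   # False: IMAGE is one of the accepted types
superset_mismatch = image_or_float != "IMAGE"   # True: FLOAT is not accepted by "IMAGE"
print(wildcard_mismatch, subset_mismatch, superset_mismatch)
```

Only `__ne__` is overridden, so `==` and hashing still behave as for a plain `str`. The comparison is also asymmetric: a narrower type fits a broader one, but not the reverse.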
--- a/tests/inference/testing_nodes/testing-pack/util.py
+++ b/tests/inference/testing_nodes/testing-pack/util.py
@ -0,0 +1,364 @@
 from comfy_execution.graph_utils import GraphBuilder
 from .tools import VariantSupport
@VariantSupport()
 class TestAccumulateNode:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "to_add": ("*",),
            },
            "optional": {
                "accumulation": ("ACCUMULATION",),
            },
        }
    RETURN_TYPES = ("ACCUMULATION",)
    FUNCTION = "accumulate"
    CATEGORY = "Testing/Lists"
    def accumulate(self, to_add, accumulation = None):
        if accumulation is None:
            value = [to_add]
        else:
            value = accumulation["accum"] + [to_add]
        return ({"accum": value},)
@VariantSupport()
 class TestAccumulationHeadNode:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "accumulation": ("ACCUMULATION",),
            },
        }
    RETURN_TYPES = ("ACCUMULATION", "*",)
    FUNCTION = "accumulation_head"
    CATEGORY = "Testing/Lists"
    def accumulation_head(self, accumulation):
        accum = accumulation["accum"]
        if len(accum) == 0:
            return (accumulation, None)
        else:
            return ({"accum": accum[1:]}, accum[0])
 class TestAccumulationTailNode:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "accumulation": ("ACCUMULATION",),
            },
        }
    RETURN_TYPES = ("ACCUMULATION", "*",)
    FUNCTION = "accumulation_tail"
    CATEGORY = "Testing/Lists"
    def accumulation_tail(self, accumulation):
        accum = accumulation["accum"]
        if len(accum) == 0:
            return (accumulation, None)
        else:
            return ({"accum": accum[:-1]}, accum[-1])
@VariantSupport()
 class TestAccumulationToListNode:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "accumulation": ("ACCUMULATION",),
            },
        }
    RETURN_TYPES = ("*",)
    OUTPUT_IS_LIST = (True,)
    FUNCTION = "accumulation_to_list"
    CATEGORY = "Testing/Lists"
    def accumulation_to_list(self, accumulation):
        return (accumulation["accum"],)
@VariantSupport()
 class TestListToAccumulationNode:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "list": ("*",),
            },
        }
    RETURN_TYPES = ("ACCUMULATION",)
    INPUT_IS_LIST = (True,)
    FUNCTION = "list_to_accumulation"
    CATEGORY = "Testing/Lists"
    def list_to_accumulation(self, list):
        return ({"accum": list},)
@VariantSupport()
 class TestAccumulationGetLengthNode:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "accumulation": ("ACCUMULATION",),
            },
        }
    RETURN_TYPES = ("INT",)
    FUNCTION = "accumlength"
    CATEGORY = "Testing/Lists"
    def accumlength(self, accumulation):
        return (len(accumulation['accum']),)
@VariantSupport()
 class TestAccumulationGetItemNode:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "accumulation": ("ACCUMULATION",),
                "index": ("INT", {"default":0, "step":1})
            },
        }
    RETURN_TYPES = ("*",)
    FUNCTION = "get_item"
    CATEGORY = "Testing/Lists"
    def get_item(self, accumulation, index):
        return (accumulation['accum'][index],)
@VariantSupport()
 class TestAccumulationSetItemNode:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "accumulation": ("ACCUMULATION",),
                "index": ("INT", {"default":0, "step":1}),
                "value": ("*",),
            },
        }
    RETURN_TYPES = ("ACCUMULATION",)
    FUNCTION = "set_item"
    CATEGORY = "Testing/Lists"
    def set_item(self, accumulation, index, value):
        new_accum = accumulation['accum'][:]
        new_accum[index] = value
        return ({"accum": new_accum},)
 class TestIntMathOperation:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "a": ("INT", {"default": 0, "min": -0xffffffffffffffff, "max": 0xffffffffffffffff, "step": 1}),
                "b": ("INT", {"default": 0, "min": -0xffffffffffffffff, "max": 0xffffffffffffffff, "step": 1}),
                "operation": (["add", "subtract", "multiply", "divide", "modulo", "power"],),
            },
        }
    RETURN_TYPES = ("INT",)
    FUNCTION = "int_math_operation"
    CATEGORY = "Testing/Logic"
    def int_math_operation(self, a, b, operation):
        if operation == "add":
            return (a + b,)
        elif operation == "subtract":
            return (a - b,)
        elif operation == "multiply":
            return (a * b,)
        elif operation == "divide":
            return (a // b,)
        elif operation == "modulo":
            return (a % b,)
        elif operation == "power":
            return (a ** b,)
 from .flow_control import NUM_FLOW_SOCKETS
@VariantSupport()
 class TestForLoopOpen:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "remaining": ("INT", {"default": 1, "min": 0, "max": 100000, "step": 1}),
            },
            "optional": {
                f"initial_value{i}": ("*",) for i in range(1, NUM_FLOW_SOCKETS)
            },
            "hidden": {
                "initial_value0": ("*",)
            }
        }
    RETURN_TYPES = tuple(["FLOW_CONTROL", "INT",] + ["*"] * (NUM_FLOW_SOCKETS-1))
    RETURN_NAMES = tuple(["flow_control", "remaining"] + [f"value{i}" for i in range(1, NUM_FLOW_SOCKETS)])
    FUNCTION = "for_loop_open"
    CATEGORY = "Testing/Flow"
    def for_loop_open(self, remaining, **kwargs):
        graph = GraphBuilder()
        if "initial_value0" in kwargs:
            remaining = kwargs["initial_value0"]
        while_open = graph.node("TestWhileLoopOpen", condition=remaining, initial_value0=remaining, **{(f"initial_value{i}"): kwargs.get(f"initial_value{i}", None) for i in range(1, NUM_FLOW_SOCKETS)})
        outputs = [kwargs.get(f"initial_value{i}", None) for i in range(1, NUM_FLOW_SOCKETS)]
        return {
            "result": tuple(["stub", remaining] + outputs),
            "expand": graph.finalize(),
        }
@VariantSupport()
 class TestForLoopClose:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "flow_control": ("FLOW_CONTROL", {"rawLink": True}),
            },
            "optional": {
                f"initial_value{i}": ("*",{"rawLink": True}) for i in range(1, NUM_FLOW_SOCKETS)
            },
        }
    RETURN_TYPES = tuple(["*"] * (NUM_FLOW_SOCKETS-1))
    RETURN_NAMES = tuple([f"value{i}" for i in range(1, NUM_FLOW_SOCKETS)])
    FUNCTION = "for_loop_close"
    CATEGORY = "Testing/Flow"
    def for_loop_close(self, flow_control, **kwargs):
        graph = GraphBuilder()
        while_open = flow_control[0]
        sub = graph.node("TestIntMathOperation", operation="subtract", a=[while_open,1], b=1)
        cond = graph.node("TestToBoolNode", value=sub.out(0))
        input_values = {f"initial_value{i}": kwargs.get(f"initial_value{i}", None) for i in range(1, NUM_FLOW_SOCKETS)}
        while_close = graph.node("TestWhileLoopClose",
                flow_control=flow_control,
                condition=cond.out(0),
                initial_value0=sub.out(0),
                **input_values)
        return {
            "result": tuple([while_close.out(i) for i in range(1, NUM_FLOW_SOCKETS)]),
            "expand": graph.finalize(),
        }
 NUM_LIST_SOCKETS = 10
@VariantSupport()
 class TestMakeListNode:
    def __init__(self):
        pass
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "value1": ("*",),
            },
            "optional": {
                f"value{i}": ("*",) for i in range(1, NUM_LIST_SOCKETS)
            },
        }
    RETURN_TYPES = ("*",)
    FUNCTION = "make_list"
    OUTPUT_IS_LIST = (True,)
    CATEGORY = "Testing/Lists"
    def make_list(self, **kwargs):
        result = []
        for i in range(1, NUM_LIST_SOCKETS):
            if f"value{i}" in kwargs:
                result.append(kwargs[f"value{i}"])
        return (result,)
 UTILITY_NODE_CLASS_MAPPINGS = {
    "TestAccumulateNode": TestAccumulateNode,
    "TestAccumulationHeadNode": TestAccumulationHeadNode,
    "TestAccumulationTailNode": TestAccumulationTailNode,
    "TestAccumulationToListNode": TestAccumulationToListNode,
    "TestListToAccumulationNode": TestListToAccumulationNode,
    "TestAccumulationGetLengthNode": TestAccumulationGetLengthNode,
    "TestAccumulationGetItemNode": TestAccumulationGetItemNode,
    "TestAccumulationSetItemNode": TestAccumulationSetItemNode,
    "TestForLoopOpen": TestForLoopOpen,
    "TestForLoopClose": TestForLoopClose,
    "TestIntMathOperation": TestIntMathOperation,
    "TestMakeListNode": TestMakeListNode,
 }
 UTILITY_NODE_DISPLAY_NAME_MAPPINGS = {
    "TestAccumulateNode": "Accumulate",
    "TestAccumulationHeadNode": "Accumulation Head",
    "TestAccumulationTailNode": "Accumulation Tail",
    "TestAccumulationToListNode": "Accumulation to List",
    "TestListToAccumulationNode": "List to Accumulation",
    "TestAccumulationGetLengthNode": "Accumulation Get Length",
    "TestAccumulationGetItemNode": "Accumulation Get Item",
    "TestAccumulationSetItemNode": "Accumulation Set Item",
    "TestForLoopOpen": "For Loop Open",
    "TestForLoopClose": "For Loop Close",
    "TestIntMathOperation": "Int Math Operation",
    "TestMakeListNode": "Make List",
 }
--- a/web/assets/index-CaD4RONs.js
+++ b/web/assets/index-CaD4RONs.js
--- a/web/assets/index-CaD4RONs.js.map
+++ b/web/assets/index-CaD4RONs.js.map
--- a/web/assets/index-DAK31IJJ.css
+++ b/web/assets/index-DAK31IJJ.css
--- a/web/assets/index-DjWyclij.css
+++ b/web/assets/index-DjWyclij.css
@ -0,0 +1,149 @@
 .comfy-group-manage {
 	background: var(--bg-color);
 	color: var(--fg-color);
 	padding: 0;
 	font-family: Arial, Helvetica, sans-serif;
 	border-color: black;
 	margin: 20vh auto;
 	max-height: 60vh;
 }
 .comfy-group-manage-outer {
 	max-height: 60vh;
 	min-width: 500px;
 	display: flex;
 	flex-direction: column;
 }
 .comfy-group-manage-outer > header {
 	display: flex;
 	align-items: center;
 	gap: 10px;
 	justify-content: space-between;
 	background: var(--comfy-menu-bg);
 	padding: 15px 20px;
 }
 .comfy-group-manage-outer > header select {
 	background: var(--comfy-input-bg);
 	border: 1px solid var(--border-color);
 	color: var(--input-text);
 	padding: 5px 10px;
 	border-radius: 5px;
 }
 .comfy-group-manage h2 {
 	margin: 0;
 	font-weight: normal;
 }
 .comfy-group-manage main {
 	display: flex;
 	overflow: hidden;
 }
 .comfy-group-manage .drag-handle {
 	font-weight: bold;
 }
 .comfy-group-manage-list {
 	border-right: 1px solid var(--comfy-menu-bg);
 }
 .comfy-group-manage-list ul {
 	margin: 40px 0 0;
 	padding: 0;
 	list-style: none;
 }
 .comfy-group-manage-list-items {
 	max-height: calc(100% - 40px);
 	overflow-y: scroll;
 	overflow-x: hidden;
 }
 .comfy-group-manage-list li {
 	display: flex;
 	padding: 10px 20px 10px 10px;
 	cursor: pointer;
 	align-items: center;
 	gap: 5px;
 }
 .comfy-group-manage-list div {
 	display: flex;
 	flex-direction: column;
 }
 .comfy-group-manage-list li:not(.selected):hover div {
 	text-decoration: underline;
 }
 .comfy-group-manage-list li.selected {
 	background: var(--border-color);
 }
 .comfy-group-manage-list li span {
 	opacity: 0.7;
 	font-size: smaller;
 }
 .comfy-group-manage-node {
 	flex: auto;
 	background: var(--border-color);
 	display: flex;
 	flex-direction: column;
 }
 .comfy-group-manage-node > div {
 	overflow: auto;
 }
 .comfy-group-manage-node header {
 	display: flex;
 	background: var(--bg-color);
 	height: 40px;
 }
 .comfy-group-manage-node header a {
 	text-align: center;
 	flex: auto;
 	border-right: 1px solid var(--comfy-menu-bg);
 	border-bottom: 1px solid var(--comfy-menu-bg);
 	padding: 10px;
 	cursor: pointer;
 	font-size: 15px;
 }
 .comfy-group-manage-node header a:last-child {
    border-right: none;
 }
 .comfy-group-manage-node header a:not(.active):hover {
 	text-decoration: underline;
 }
 .comfy-group-manage-node header a.active {
 	background: var(--border-color);
 	border-bottom: none;
 }
 .comfy-group-manage-node-page {
 	display: none;
 	overflow: auto;
 }
 .comfy-group-manage-node-page.active {
 	display: block;
 }
 .comfy-group-manage-node-page div {
 	padding: 10px;
 	display: flex;
 	align-items: center;
 	gap: 10px;
 }
 .comfy-group-manage-node-page input {
 	border: none;
 	color: var(--input-text);
 	background: var(--comfy-input-bg);
 	padding: 5px 10px;
 }
 .comfy-group-manage-node-page input[type="text"] {
 	flex: auto;
 }
 .comfy-group-manage-node-page label {
 	display: flex;
 	gap: 5px;
 	align-items: center;
 }
 .comfy-group-manage footer {
 	border-top: 1px solid var(--comfy-menu-bg);
 	padding: 10px;
 	display: flex;
 	gap: 10px;
 }
 .comfy-group-manage footer button {
 	font-size: 14px;
 	padding: 5px 10px;
 	border-radius: 0;
 }
 .comfy-group-manage footer button:first-child {
 	margin-right: auto;
 }
--- a/web/assets/index-DkvOTKox.js
+++ b/web/assets/index-DkvOTKox.js
--- a/web/assets/index-DkvOTKox.js.map
+++ b/web/assets/index-DkvOTKox.js.map
--- a/web/assets/primeicons-C6QP2o4f.woff2
+++ b/web/assets/primeicons-C6QP2o4f.woff2
--- a/web/assets/primeicons-DMOk5skT.eot
+++ b/web/assets/primeicons-DMOk5skT.eot
--- a/web/assets/primeicons-Dr5RGzOO.svg
+++ b/web/assets/primeicons-Dr5RGzOO.svg
--- a/web/assets/primeicons-MpK4pl85.ttf
+++ b/web/assets/primeicons-MpK4pl85.ttf
--- a/web/assets/primeicons-WjwUDZjB.woff
+++ b/web/assets/primeicons-WjwUDZjB.woff
--- a/web/assets/userSelection-BGzn1LuN.css
+++ b/web/assets/userSelection-BGzn1LuN.css
@ -0,0 +1,136 @@
 .comfy-user-selection {
    width: 100vw;
    height: 100vh;
    position: absolute;
    top: 0;
    left: 0;
    z-index: 999;
    display: flex;
    align-items: center;
    justify-content: center;
    font-family: sans-serif;
    background: linear-gradient(var(--tr-even-bg-color), var(--tr-odd-bg-color));
 }
 .comfy-user-selection-inner {
    background: var(--comfy-menu-bg);
    margin-top: -30vh;
    padding: 20px 40px;
    border-radius: 10px;
    min-width: 365px;
    position: relative;
    box-shadow: 0 0 20px rgba(0, 0, 0, 0.3);
 }
 .comfy-user-selection-inner form {
    width: 100%;
    display: flex;
    flex-direction: column;
    align-items: center;
 }
 .comfy-user-selection-inner h1 {
    margin: 10px 0 30px 0;
    font-weight: normal;
 }
 .comfy-user-selection-inner label {
    display: flex;
    flex-direction: column;
    width: 100%;
 }
 .comfy-user-selection input,
 .comfy-user-selection select {
    background-color: var(--comfy-input-bg);
    color: var(--input-text);
    border: 0;
    border-radius: 5px;
    padding: 5px;
    margin-top: 10px;
 }
 .comfy-user-selection input::-moz-placeholder {
    color: var(--descrip-text);
    opacity: 1;
 }
 .comfy-user-selection input::placeholder {
    color: var(--descrip-text);
    opacity: 1;
 }
 .comfy-user-existing {
    width: 100%;
 }
 .no-users .comfy-user-existing {
    display: none;
 }
 .comfy-user-selection-inner .or-separator {
    margin: 10px 0;
    padding: 10px;
    display: block;
    width: 100%;
    color: var(--descrip-text);
    overflow: hidden;
    text-align: center;
    margin-left: -10px;
 }
 .comfy-user-selection-inner .or-separator::before,
 .comfy-user-selection-inner .or-separator::after {
    content: "";
    background-color: var(--border-color);
    position: relative;
    height: 1px;
    vertical-align: middle;
    display: inline-block;
    width: calc(50% - 20px);
    top: -1px;
 }
 .comfy-user-selection-inner .or-separator::before {
    right: 10px;
    margin-left: -50%;
 }
 .comfy-user-selection-inner .or-separator::after {
    left: 10px;
    margin-right: -50%;
 }
 .comfy-user-selection-inner section {
    width: 100%;
    padding: 10px;
    margin: -10px;
    transition: background-color 0.2s;
 }
 .comfy-user-selection-inner section.selected {
    background: var(--border-color);
    border-radius: 5px;
 }
 .comfy-user-selection-inner footer {
    display: flex;
    flex-direction: column;
    align-items: center;
    margin-top: 20px;
 }
 .comfy-user-selection-inner .comfy-user-error {
    color: var(--error-text);
    margin-bottom: 10px;
 }
 .comfy-user-button-next {
    font-size: 16px;
    padding: 6px 10px;
    width: 100px;
    display: flex;
    gap: 5px;
    align-items: center;
    justify-content: center;
 }
--- a/web/assets/userSelection-GRU1gtOt.js
+++ b/web/assets/userSelection-GRU1gtOt.js
@ -0,0 +1,142 @@
 var __defProp = Object.defineProperty;
 var __name = (target, value) => __defProp(target, "name", { value, configurable: true });
 var __async = (__this, __arguments, generator) => {
  return new Promise((resolve, reject) => {
    var fulfilled = (value) => {
      try {
        step(generator.next(value));
      } catch (e) {
        reject(e);
      }
    };
    var rejected = (value) => {
      try {
        step(generator.throw(value));
      } catch (e) {
        reject(e);
      }
    };
    var step = (x) => x.done ? resolve(x.value) : Promise.resolve(x.value).then(fulfilled, rejected);
    step((generator = generator.apply(__this, __arguments)).next());
  });
 };
 import { j as createSpinner, g as api, $ as $el } from "./index-CaD4RONs.js";
 const _UserSelectionScreen = class _UserSelectionScreen {
  show(users, user) {
    return __async(this, null, function* () {
      const userSelection = document.getElementById("comfy-user-selection");
      userSelection.style.display = "";
      return new Promise((resolve) => {
        const input = userSelection.getElementsByTagName("input")[0];
        const select = userSelection.getElementsByTagName("select")[0];
        const inputSection = input.closest("section");
        const selectSection = select.closest("section");
        const form = userSelection.getElementsByTagName("form")[0];
        const error = userSelection.getElementsByClassName("comfy-user-error")[0];
        const button = userSelection.getElementsByClassName(
          "comfy-user-button-next"
        )[0];
        let inputActive = null;
        input.addEventListener("focus", () => {
          inputSection.classList.add("selected");
          selectSection.classList.remove("selected");
          inputActive = true;
        });
        select.addEventListener("focus", () => {
          inputSection.classList.remove("selected");
          selectSection.classList.add("selected");
          inputActive = false;
          select.style.color = "";
        });
        select.addEventListener("blur", () => {
          if (!select.value) {
            select.style.color = "var(--descrip-text)";
          }
        });
        form.addEventListener("submit", (e) => __async(this, null, function* () {
          var _a, _b, _c;
          e.preventDefault();
          if (inputActive == null) {
            error.textContent = "Please enter a username or select an existing user.";
          } else if (inputActive) {
            const username = input.value.trim();
            if (!username) {
              error.textContent = "Please enter a username.";
              return;
            }
            input.disabled = select.disabled = // @ts-expect-error
            input.readonly = // @ts-expect-error
            select.readonly = true;
            const spinner = createSpinner();
            button.prepend(spinner);
            try {
              const resp = yield api.createUser(username);
              if (resp.status >= 300) {
                let message = "Error creating user: " + resp.status + " " + resp.statusText;
                try {
                  const res = yield resp.json();
                  if (res.error) {
                    message = res.error;
                  }
                } catch (error2) {
                }
                throw new Error(message);
              }
              resolve({ username, userId: yield resp.json(), created: true });
            } catch (err) {
              spinner.remove();
              error.textContent = (_c = (_b = (_a = err.message) != null ? _a : err.statusText) != null ? _b : err) != null ? _c : "An unknown error occurred.";
              input.disabled = select.disabled = // @ts-expect-error
              input.readonly = // @ts-expect-error
              select.readonly = false;
              return;
            }
          } else if (!select.value) {
            error.textContent = "Please select an existing user.";
            return;
          } else {
            resolve({
              username: users[select.value],
              userId: select.value,
              created: false
            });
          }
        }));
        if (user) {
          const name = localStorage["Comfy.userName"];
          if (name) {
            input.value = name;
          }
        }
        if (input.value) {
          input.focus();
        }
        const userIds = Object.keys(users != null ? users : {});
        if (userIds.length) {
          for (const u of userIds) {
            $el("option", { textContent: users[u], value: u, parent: select });
          }
          select.style.color = "var(--descrip-text)";
          if (select.value) {
            select.focus();
          }
        } else {
          userSelection.classList.add("no-users");
          input.focus();
        }
      }).then((r) => {
        userSelection.remove();
        return r;
      });
    });
  }
 };
 __name(_UserSelectionScreen, "UserSelectionScreen");
 let UserSelectionScreen = _UserSelectionScreen;
 window.comfyAPI = window.comfyAPI || {};
 window.comfyAPI.userSelection = window.comfyAPI.userSelection || {};
 window.comfyAPI.userSelection.UserSelectionScreen = UserSelectionScreen;
 export {
  UserSelectionScreen
 };
 //# sourceMappingURL=userSelection-GRU1gtOt.js.map
--- a/web/assets/userSelection-GRU1gtOt.js.map
+++ b/web/assets/userSelection-GRU1gtOt.js.map
--- a/web/extensions/core/clipspace.js
+++ b/web/extensions/core/clipspace.js
@ -1,166 +1,2 @@
-import { app } from "../../scripts/app.js";
-import { ComfyDialog, $el } from "../../scripts/ui.js";
+// Shim for extensions\core\clipspace.ts
+export const ClipspaceDialog = window.comfyAPI.clipspace.ClipspaceDialog;
 import { ComfyApp } from "../../scripts/app.js";
 export class ClipspaceDialog extends ComfyDialog {
 	static items = [];
 	static instance = null;
 	static registerButton(name, contextPredicate, callback) {
 		const item =
 			$el("button", {
 				type: "button",
 				textContent: name,
 				contextPredicate: contextPredicate,
 				onclick: callback
 			})
 		ClipspaceDialog.items.push(item);
 	}
 	static invalidatePreview() {
 		if(ComfyApp.clipspace && ComfyApp.clipspace.imgs && ComfyApp.clipspace.imgs.length > 0) {
 			const img_preview = document.getElementById("clipspace_preview");
 			if(img_preview) {
 				img_preview.src = ComfyApp.clipspace.imgs[ComfyApp.clipspace['selectedIndex']].src;
 				img_preview.style.maxHeight = "100%";
 				img_preview.style.maxWidth = "100%";
 			}
 		}
 	}
 	static invalidate() {
 		if(ClipspaceDialog.instance) {
 			const self = ClipspaceDialog.instance;
 			// allow reconstruct controls when copying from non-image to image content.
 			const children = $el("div.comfy-modal-content", [ self.createImgSettings(), ...self.createButtons() ]);
 			if(self.element) {
 				// update
 				self.element.removeChild(self.element.firstChild);
 				self.element.appendChild(children);
 			}
 			else {
 				// new
 				self.element = $el("div.comfy-modal", { parent: document.body }, [children,]);
 			}
 			if(self.element.children[0].children.length <= 1) {
 				self.element.children[0].appendChild($el("p", {}, ["Unable to find the features to edit content of a format stored in the current Clipspace."]));
 			}
 			ClipspaceDialog.invalidatePreview();
 		}
 	}
 	constructor() {
 		super();
 	}
 	createButtons(self) {
 		const buttons = [];
 		for(let idx in ClipspaceDialog.items) {
 			const item = ClipspaceDialog.items[idx];
 			if(!item.contextPredicate || item.contextPredicate())
 				buttons.push(ClipspaceDialog.items[idx]);
 		}
 		buttons.push(
 			$el("button", {
 				type: "button",
 				textContent: "Close",
 				onclick: () => { this.close(); }
 			})
 		);
 		return buttons;
 	}
 	createImgSettings() {
 		if(ComfyApp.clipspace.imgs) {
 			const combo_items = [];
 			const imgs = ComfyApp.clipspace.imgs;
 			for(let i=0; i < imgs.length; i++) {
 				combo_items.push($el("option", {value:i}, [`${i}`]));
 			}
 			const combo1 = $el("select",
 				{id:"clipspace_img_selector", onchange:(event) => {
 					ComfyApp.clipspace['selectedIndex'] = event.target.selectedIndex;
 					ClipspaceDialog.invalidatePreview();
 				} }, combo_items);
 			const row1 =
 				$el("tr", {},
 						[
 							$el("td", {}, [$el("font", {color:"white"}, ["Select Image"])]),
 							$el("td", {}, [combo1])
 						]);
 			const combo2 = $el("select",
 								{id:"clipspace_img_paste_mode", onchange:(event) => {
 									ComfyApp.clipspace['img_paste_mode'] = event.target.value;
 								} },
 								[
 									$el("option", {value:'selected'}, 'selected'),
 									$el("option", {value:'all'}, 'all')
 								]);
 			combo2.value = ComfyApp.clipspace['img_paste_mode'];
 			const row2 =
 				$el("tr", {},
 						[
 							$el("td", {}, [$el("font", {color:"white"}, ["Paste Mode"])]),
 							$el("td", {}, [combo2])
 						]);
 			const td = $el("td", {align:'center', width:'100px', height:'100px', colSpan:'2'},
 								[ $el("img",{id:"clipspace_preview", ondragstart:() => false},[]) ]);
 			const row3 =
 				$el("tr", {}, [td]);
 			return $el("table", {}, [row1, row2, row3]);
 		}
 		else {
 			return [];
 		}
 	}
 	createImgPreview() {
 		if(ComfyApp.clipspace.imgs) {
 			return $el("img",{id:"clipspace_preview", ondragstart:() => false});
 		}
 		else
 			return [];
 	}
 	show() {
 		const img_preview = document.getElementById("clipspace_preview");
 		ClipspaceDialog.invalidate();
 		this.element.style.display = "block";
 	}
 }
 app.registerExtension({
 	name: "Comfy.Clipspace",
 	init(app) {
 		app.openClipspace =
 			function () {
 				if(!ClipspaceDialog.instance) {
 					ClipspaceDialog.instance = new ClipspaceDialog(app);
 					ComfyApp.clipspace_invalidate_handler = ClipspaceDialog.invalidate;
 				}
 				if(ComfyApp.clipspace) {
 					ClipspaceDialog.instance.show();
 				}
 				else
 					app.ui.dialog.show("Clipspace is Empty!");
 			};
 	}
 });
--- a/web/extensions/core/groupNode.js
+++ b/web/extensions/core/groupNode.js
--- a/web/extensions/core/groupNodeManage.js
+++ b/web/extensions/core/groupNodeManage.js
@ -1,422 +1,2 @@
-import { $el, ComfyDialog } from "../../scripts/ui.js";
-import { DraggableList } from "../../scripts/ui/draggableList.js";
+// Shim for extensions\core\groupNodeManage.ts
+export const ManageGroupDialog = window.comfyAPI.groupNodeManage.ManageGroupDialog;
 import { addStylesheet } from "../../scripts/utils.js";
 import { GroupNodeConfig, GroupNodeHandler } from "./groupNode.js";
 addStylesheet(import.meta.url);
 const ORDER = Symbol();
 function merge(target, source) {
 	if (typeof target === "object" && typeof source === "object") {
 		for (const key in source) {
 			const sv = source[key];
 			if (typeof sv === "object") {
 				let tv = target[key];
 				if (!tv) tv = target[key] = {};
 				merge(tv, source[key]);
 			} else {
 				target[key] = sv;
 			}
 		}
 	}
 	return target;
 }
 export class ManageGroupDialog extends ComfyDialog {
 	/** @type { Record<"Inputs" | "Outputs" | "Widgets", {tab: HTMLAnchorElement, page: HTMLElement}> } */
 	tabs = {};
 	/** @type { number | null | undefined } */
 	selectedNodeIndex;
 	/** @type { keyof ManageGroupDialog["tabs"] } */
 	selectedTab = "Inputs";
 	/** @type { string | undefined } */
 	selectedGroup;
 	/** @type { Record<string, Record<string, Record<string, { name?: string | undefined, visible?: boolean | undefined }>>> } */
 	modifications = {};
 	get selectedNodeInnerIndex() {
 		return +this.nodeItems[this.selectedNodeIndex].dataset.nodeindex;
 	}
 	constructor(app) {
 		super();
 		this.app = app;
 		this.element = $el("dialog.comfy-group-manage", {
 			parent: document.body,
 		});
 	}
 	changeTab(tab) {
 		this.tabs[this.selectedTab].tab.classList.remove("active");
 		this.tabs[this.selectedTab].page.classList.remove("active");
 		this.tabs[tab].tab.classList.add("active");
 		this.tabs[tab].page.classList.add("active");
 		this.selectedTab = tab;
 	}
 	changeNode(index, force) {
 		if (!force && this.selectedNodeIndex === index) return;
 		if (this.selectedNodeIndex != null) {
 			this.nodeItems[this.selectedNodeIndex].classList.remove("selected");
 		}
 		this.nodeItems[index].classList.add("selected");
 		this.selectedNodeIndex = index;
 		if (!this.buildInputsPage() && this.selectedTab === "Inputs") {
 			this.changeTab("Widgets");
 		}
 		if (!this.buildWidgetsPage() && this.selectedTab === "Widgets") {
 			this.changeTab("Outputs");
 		}
 		if (!this.buildOutputsPage() && this.selectedTab === "Outputs") {
 			this.changeTab("Inputs");
 		}
 		this.changeTab(this.selectedTab);
 	}
 	getGroupData() {
 		this.groupNodeType = LiteGraph.registered_node_types["workflow/" + this.selectedGroup];
 		this.groupNodeDef = this.groupNodeType.nodeData;
 		this.groupData = GroupNodeHandler.getGroupData(this.groupNodeType);
 	}
 	changeGroup(group, reset = true) {
 		this.selectedGroup = group;
 		this.getGroupData();
 		const nodes = this.groupData.nodeData.nodes;
 		this.nodeItems = nodes.map((n, i) =>
 			$el(
 				"li.draggable-item",
 				{
 					dataset: {
 						nodeindex: n.index + "",
 					},
 					onclick: () => {
 						this.changeNode(i);
 					},
 				},
 				[
 					$el("span.drag-handle"),
 					$el(
 						"div",
 						{
 							textContent: n.title ?? n.type,
 						},
 						n.title
 							? $el("span", {
 									textContent: n.type,
 							  })
 							: []
 					),
 				]
 			)
 		);
 		this.innerNodesList.replaceChildren(...this.nodeItems);
 		if (reset) {
 			this.selectedNodeIndex = null;
 			this.changeNode(0);
 		} else {
 			const items = this.draggable.getAllItems();
 			let index = items.findIndex(item => item.classList.contains("selected"));
 			if(index === -1) index = this.selectedNodeIndex;
 			this.changeNode(index, true);
 		}
 		const ordered = [...nodes];
 		this.draggable?.dispose();
 		this.draggable = new DraggableList(this.innerNodesList, "li");
 		this.draggable.addEventListener("dragend", ({ detail: { oldPosition, newPosition } }) => {
 			if (oldPosition === newPosition) return;
 			ordered.splice(newPosition, 0, ordered.splice(oldPosition, 1)[0]);
 			for (let i = 0; i < ordered.length; i++) {
 				this.storeModification({ nodeIndex: ordered[i].index, section: ORDER, prop: "order", value: i });
 			}
 		});
 	}
 	storeModification({ nodeIndex, section, prop, value }) {
 		const groupMod = (this.modifications[this.selectedGroup] ??= {});
 		const nodesMod = (groupMod.nodes ??= {});
 		const nodeMod = (nodesMod[nodeIndex ?? this.selectedNodeInnerIndex] ??= {});
 		const typeMod = (nodeMod[section] ??= {});
 		if (typeof value === "object") {
 			const objMod = (typeMod[prop] ??= {});
 			Object.assign(objMod, value);
 		} else {
 			typeMod[prop] = value;
 		}
 	}
 	getEditElement(section, prop, value, placeholder, checked, checkable = true) {
 		if (value === placeholder) value = "";
 		const mods = this.modifications[this.selectedGroup]?.nodes?.[this.selectedNodeInnerIndex]?.[section]?.[prop];
 		if (mods) {
 			if (mods.name != null) {
 				value = mods.name;
 			}
 			if (mods.visible != null) {
 				checked = mods.visible;
 			}
 		}
 		return $el("div", [
 			$el("input", {
 				value,
 				placeholder,
 				type: "text",
 				onchange: (e) => {
 					this.storeModification({ section, prop, value: { name: e.target.value } });
 				},
 			}),
 			$el("label", { textContent: "Visible" }, [
 				$el("input", {
 					type: "checkbox",
 					checked,
 					disabled: !checkable,
 					onchange: (e) => {
 						this.storeModification({ section, prop, value: { visible: !!e.target.checked } });
 					},
 				}),
 			]),
 		]);
 	}
 	buildWidgetsPage() {
 		const widgets = this.groupData.oldToNewWidgetMap[this.selectedNodeInnerIndex];
 		const items = Object.keys(widgets ?? {});
 		const type = app.graph.extra.groupNodes[this.selectedGroup];
 		const config = type.config?.[this.selectedNodeInnerIndex]?.input;
 		this.widgetsPage.replaceChildren(
 			...items.map((oldName) => {
 				return this.getEditElement("input", oldName, widgets[oldName], oldName, config?.[oldName]?.visible !== false);
 			})
 		);
 		return !!items.length;
 	}
 	buildInputsPage() {
 		const inputs = this.groupData.nodeInputs[this.selectedNodeInnerIndex];
 		const items = Object.keys(inputs ?? {});
 		const type = app.graph.extra.groupNodes[this.selectedGroup];
 		const config = type.config?.[this.selectedNodeInnerIndex]?.input;
 		this.inputsPage.replaceChildren(
 			...items
 				.map((oldName) => {
 					let value = inputs[oldName];
 					if (!value) {
 						return;
 					}
 					return this.getEditElement("input", oldName, value, oldName, config?.[oldName]?.visible !== false);
 				})
 				.filter(Boolean)
 		);
 		return !!items.length;
 	}
 	buildOutputsPage() {
 		const nodes = this.groupData.nodeData.nodes;
 		const innerNodeDef = this.groupData.getNodeDef(nodes[this.selectedNodeInnerIndex]);
 		const outputs = innerNodeDef?.output ?? [];
 		const groupOutputs = this.groupData.oldToNewOutputMap[this.selectedNodeInnerIndex];
 		const type = app.graph.extra.groupNodes[this.selectedGroup];
 		const config = type.config?.[this.selectedNodeInnerIndex]?.output;
 		const node = this.groupData.nodeData.nodes[this.selectedNodeInnerIndex];
 		const checkable = node.type !== "PrimitiveNode";
 		this.outputsPage.replaceChildren(
 			...outputs
 				.map((type, slot) => {
 					const groupOutputIndex = groupOutputs?.[slot];
 					const oldName = innerNodeDef.output_name?.[slot] ?? type;
 					let value = config?.[slot]?.name;
 					const visible = config?.[slot]?.visible || groupOutputIndex != null;
 					if (!value || value === oldName) {
 						value = "";
 					}
 					return this.getEditElement("output", slot, value, oldName, visible, checkable);
 				})
 				.filter(Boolean)
 		);
 		return !!outputs.length;
 	}
 	show(type) {
 		const groupNodes = Object.keys(app.graph.extra?.groupNodes ?? {}).sort((a, b) => a.localeCompare(b));
 		this.innerNodesList = $el("ul.comfy-group-manage-list-items");
 		this.widgetsPage = $el("section.comfy-group-manage-node-page");
 		this.inputsPage = $el("section.comfy-group-manage-node-page");
 		this.outputsPage = $el("section.comfy-group-manage-node-page");
 		const pages = $el("div", [this.widgetsPage, this.inputsPage, this.outputsPage]);
 		this.tabs = [
 			["Inputs", this.inputsPage],
 			["Widgets", this.widgetsPage],
 			["Outputs", this.outputsPage],
 		].reduce((p, [name, page]) => {
 			p[name] = {
 				tab: $el("a", {
 					onclick: () => {
 						this.changeTab(name);
 					},
 					textContent: name,
 				}),
 				page,
 			};
 			return p;
 		}, {});
 		const outer = $el("div.comfy-group-manage-outer", [
 			$el("header", [
 				$el("h2", "Group Nodes"),
 				$el(
 					"select",
 					{
 						onchange: (e) => {
 							this.changeGroup(e.target.value);
 						},
 					},
 					groupNodes.map((g) =>
 						$el("option", {
 							textContent: g,
 							selected: "workflow/" + g === type,
 							value: g,
 						})
 					)
 				),
 			]),
 			$el("main", [
 				$el("section.comfy-group-manage-list", this.innerNodesList),
 				$el("section.comfy-group-manage-node", [
 					$el(
 						"header",
 						Object.values(this.tabs).map((t) => t.tab)
 					),
 					pages,
 				]),
 			]),
 			$el("footer", [
 				$el(
 					"button.comfy-btn",
 					{
 						onclick: (e) => {
 							const node = app.graph._nodes.find((n) => n.type === "workflow/" + this.selectedGroup);
 							if (node) {
 								alert("This group node is in use in the current workflow. Please remove its instances first.");
 								return;
 							}
 							if (confirm(`Are you sure you want to remove the node: "${this.selectedGroup}"?`)) {
 								delete app.graph.extra.groupNodes[this.selectedGroup];
 								LiteGraph.unregisterNodeType("workflow/" + this.selectedGroup);
 							}
 							this.show();
 						},
 					},
 					"Delete Group Node"
 				),
 				$el(
 					"button.comfy-btn",
 					{
 						onclick: async () => {
 							let nodesByType;
 							let recreateNodes = [];
 							const types = {};
 							for (const g in this.modifications) {
 								const type = app.graph.extra.groupNodes[g];
 								let config = (type.config ??= {});
 								let nodeMods = this.modifications[g]?.nodes;
 								if (nodeMods) {
 									const keys = Object.keys(nodeMods);
 									if (nodeMods[keys[0]][ORDER]) {
 										// If any node is reordered, they will all need sequencing
 										const orderedNodes = [];
 										const orderedMods = {};
 										const orderedConfig = {};
 										for (const n of keys) {
 											const order = nodeMods[n][ORDER].order;
 											orderedNodes[order] = type.nodes[+n];
 											orderedMods[order] = nodeMods[n];
 											orderedNodes[order].index = order;
 										}
 										// Rewrite links
 										for (const l of type.links) {
 											if (l[0] != null) l[0] = type.nodes[l[0]].index;
 											if (l[2] != null) l[2] = type.nodes[l[2]].index;
 										}
 										// Rewrite externals
 										if (type.external) {
 											for (const ext of type.external) {
 												ext[0] = type.nodes[ext[0]].index;
 											}
 										}
 										// Rewrite modifications
 										for (const id of keys) {
 											if (config[id]) {
 												orderedConfig[type.nodes[id].index] = config[id];
 											}
 											delete config[id];
 										}
 										type.nodes = orderedNodes;
 										nodeMods = orderedMods;
 										type.config = config = orderedConfig;
 									}
 									merge(config, nodeMods);
 								}
 								types[g] = type;
 								if (!nodesByType) {
 									nodesByType = app.graph._nodes.reduce((p, n) => {
 										p[n.type] ??= [];
 										p[n.type].push(n);
 										return p;
 									}, {});
 								}
 								const nodes = nodesByType["workflow/" + g];
 								if (nodes) recreateNodes.push(...nodes);
 							}
 							await GroupNodeConfig.registerFromWorkflow(types, {});
 							for (const node of recreateNodes) {
 								node.recreate();
 							}
 							this.modifications = {};
 							this.app.graph.setDirtyCanvas(true, true);
 							this.changeGroup(this.selectedGroup, false);
 						},
 					},
 					"Save"
 				),
 				$el("button.comfy-btn", { onclick: () => this.element.close() }, "Close"),
 			]),
 		]);
 		this.element.replaceChildren(outer);
 		this.changeGroup(type ? groupNodes.find((g) => "workflow/" + g === type) : groupNodes[0]);
 		this.element.showModal();
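 		// `once` removes the listener after it fires, so handlers do not accumulate across repeated show() calls.
 		this.element.addEventListener("close", () => {
 			this.draggable?.dispose();
 		}, { once: true });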
 	}
 }
		`@ -0,0 +1,3 @@`
							`# ComfyUI Internal Routes`

							All routes under the `/internal` path are designated for internal use by ComfyUI only. These routes are not intended for use by external applications and may change at any time without notice.
		`@ -0,0 +1,2 @@`
							`# model_manager/__init__.py`
							`from .download_models import download_model, DownloadModelStatus, DownloadStatusType, create_model_path, check_file_exists, track_download_progress, validate_model_subdirectory, validate_filename`