Install ComfyUI Locally with LM Downloader

ComfyUI is a visual tool for image and video generation, allowing you to easily modify and use AI painting and video models—just like building blocks. It supports popular models such as FLUX.1 Dev, Stable Diffusion 3.5, Hunyuan Video, Framepack, WAN Video, Qwen Image, and LTX-2 Video.

If you have some development experience, you can download the source code from GitHub and deploy it locally. Alternatively, you can directly download the client version from the official website, which is simpler and perfect for beginners to get started quickly.

Note

  • For Windows users: An NVIDIA GPU with updated CUDA drivers and at least 4GB VRAM is recommended.
  • For macOS users: M-series chips are preferred for optimal performance.
  • Ensure at least 10 GB of free disk space for a smooth ComfyUI experience. For advanced usage with multiple models, 100 GB+ of storage is advised (see the quick check below).
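
If you want to verify the free-space guidance before installing, here is a minimal sketch using Python's standard library. The path "." is an assumption; point it at whichever drive will hold ComfyUI and your models.

```python
# Minimal free-space check for the requirements above.
# The path "." is an assumption: point it at the drive that will
# hold ComfyUI and your downloaded models.
import shutil

free_gb = shutil.disk_usage(".").free / 1024**3
print(f"Free space: {free_gb:.1f} GB")
if free_gb < 10:
    print("Below the 10 GB minimum for a smooth ComfyUI experience.")
elif free_gb < 100:
    print("Fine for a basic install; 100 GB+ is advised for multiple models.")
else:
    print("Plenty of room, including for multiple large models.")
```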

Find ComfyUI in LM Downloader

Open LM Downloader, then click "Local Apps" in the left menu. You will see ComfyUI in the app list; click the ComfyUI icon to go to its introduction page.

Click the Install button; the install window opens. If you already have ComfyUI installed, don't worry: this can be treated as an update to ComfyUI and won't affect the models you've previously downloaded.

Close this window after the installation is complete.

Run ComfyUI

On the application details page, click the Run button on the right to open the execution window. Upon successful launch, your browser will open automatically.


How to Choose Your "Startup Options"

If you are using Windows with an NVIDIA GPU, you will see startup options on the right side that let you select different execution environments. Choosing torch 2.7.1 ensures maximum stability, while the newer versions are tuned to unlock more of your hardware's performance. Users with other GPU brands or on macOS will not see these options.

Our coverage goes beyond consumer-grade cards to a full-dimensional view, ranging from the classic Pascal architecture to the flagship Rubin architecture and including workstation GPUs (RTX Ada/A-series) and data-center GPUs (H/B/R-series).

🚀 ComfyUI Environment Guide: Unleashing VRAM Potential from the Architecture Level

ComfyUI performance depends on the "Trinity" of PyTorch (Backend), CUDA (Instruction Set), and GPU Architecture (Hardware Core). We have preset three environments designed to cover everything from decade-old classics to top-tier chips of the next five years.
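
If you want to see where your own machine sits in this trinity, the following sketch prints all three pieces. It assumes nothing beyond a standard PyTorch install.

```python
# Inspect the "Trinity": PyTorch build, the CUDA version it was
# compiled against, and the GPU architecture (compute capability).
import torch

print("PyTorch:", torch.__version__)
print("CUDA (build):", torch.version.cuda)
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print("GPU:", torch.cuda.get_device_name(0))
    print(f"Compute capability: sm_{major}{minor}")
else:
    print("No CUDA-capable GPU detected (CPU or non-NVIDIA setup).")
```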


📊 Hardware Architecture Matching Matrix

| Environment Option | Core Architecture | Typical GPU Models | Key Advantage |
| --- | --- | --- | --- |
| torch 2.7.1 | Pascal / Turing / Ampere | GTX 10 series, RTX 20/30 series, Quadro P/RTX series, A100/A800 | Stability First: Rock-solid drivers for older architectures. |
| torch 2.8.0 | Ada Lovelace / Blackwell | RTX 40/50 series, RTX 6000 Ada, L40S, H100/H200/B100 | Performance Hub: Native support for FP8 inference. |
| torch 2.9.1 | Blackwell / Rubin (Next Gen) | RTX 50 series, B200, GB200, Rubin R100/R200 | Arch Overclocking: Supports NVFP4 quantization. |

🔍 Deep Dive: Which One Should You Choose?

1️⃣ Default Environment: torch 2.7.1 (Compatibility & Stability)

  • Target Architectures: sm_60 to sm_86.
  • Core Specs: PyTorch 2.7.1 + CUDA 12.8. This is currently the most mature ecosystem: almost all custom nodes (especially those involving C++ compilation, such as face-swapping or video-streaming nodes) are developed and tested on this version.
  • Best For: Legacy workstation upgrades or A100/A800 compute clusters. If your priority is "running every workflow without errors," choose this.

2️⃣ torch 2.8.0 (Ada/Blackwell Optimized)

  • Target Architectures: sm_89 (Ada) / sm_100 (Blackwell Initial).
  • Core Specs: PyTorch 2.8.0 + CUDA 12.8. Tailored specifically for the RTX 40 series and H100/B100 data-center cards.
  • Key Improvements: Significantly optimizes FP8 (8-bit floating point) matrix operations. When running Flux.1 (Schnell/Dev) or large-scale SD3.5 models, VRAM recycling efficiency improves by roughly 20%, with generation speeds noticeably faster than the default environment (a quick readiness check follows this list).
  • Best For: RTX 40 series owners or H100 server users who need a balance of performance and node compatibility.
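
To check whether your own setup can take the FP8 path, here is a hedged sketch: the `torch.float8_e4m3fn` dtype ships with recent PyTorch builds, and the hardware-accelerated path needs Ada (sm_89) or newer. Treat the threshold as this guide's assumption, not an official support matrix.

```python
# Hedged FP8 readiness check: dtype availability in this PyTorch build,
# plus whether the GPU is Ada (sm_89) or newer for hardware acceleration.
import torch

has_fp8_dtype = hasattr(torch, "float8_e4m3fn")
print("FP8 dtype in this PyTorch build:", has_fp8_dtype)
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(f"Compute capability: sm_{major}{minor}")
    print("Hardware FP8 path:", (major, minor) >= (8, 9))
else:
    print("No CUDA device detected.")
```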

3️⃣ torch 2.9.1 (Blackwell / Rubin Experimental)

  • Target Architectures: sm_120 (Blackwell) and the forward-looking sm_130/140 (Rubin).
  • Core Specs: PyTorch 2.9.1 + CUDA 12.8. Cutting-edge adaptation for the RTX 50 series (5060 Ti to 5090) and NVIDIA’s next-gen Rubin (R100) architecture.
  • Architectural Benefits:
      • Native NVFP4 Support: Leverages Blackwell's hardware-level 4-bit floating-point acceleration to potentially double inference efficiency.
      • Deep Optimization: Reserves higher memory-exchange bandwidth for the HBM4 VRAM of the Rubin architecture.
  • Best For: Top-tier personal studios or B200/GB200 compute nodes. Choose this if you are processing 10B+ parameter models and demand "instant" generation.


💡 Tech Primer: Where does my GPU sit?

| Category | Architecture | Representative Models |
| --- | --- | --- |
| Future / Cutting-Edge | Rubin | R100, R200 (mainstream after 2026) |
| Current Flagship | Blackwell | RTX 50 series, B100, B200, GB200 |
| Modern Mainstream | Ada Lovelace | RTX 40 series, RTX 6000 Ada, L40S |
| Classic High-Perf | Ampere | RTX 30 series, A100, A10, A30 |
| Legacy/Basic | Turing / Pascal | RTX 20 series, GTX 10 series, P100, T4 |
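
As a rough programmatic version of this table, the sketch below maps a CUDA compute capability to the tiers above. The cut-offs are this guide's simplification (Volta is folded into the legacy tier), and the Rubin values are forward-looking placeholders, not confirmed NVIDIA specifications.

```python
# Illustrative mapping from compute capability to the tiers in the
# table above. Cut-offs are a simplification; Rubin sm values are
# speculative placeholders, not confirmed NVIDIA specifications.
import torch

def arch_tier(major: int, minor: int) -> str:
    sm = major * 10 + minor
    if sm < 75:
        return "Legacy/Basic (Pascal/Volta)"
    if sm < 80:
        return "Legacy/Basic (Turing)"
    if sm < 89:
        return "Classic High-Perf (Ampere)"
    if sm == 89:
        return "Modern Mainstream (Ada Lovelace)"
    if sm == 90:
        return "Data Center (Hopper)"
    if sm < 130:
        return "Current Flagship (Blackwell)"
    return "Future / Cutting-Edge (Rubin, speculative)"

if torch.cuda.is_available():
    print(arch_tier(*torch.cuda.get_device_capability(0)))
else:
    print("No CUDA device detected.")
```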

⚠️ torch 2.9.1 Risks & Maintenance Tips

  1. Driver Requirements: If selecting "torch 2.9.1," ensure your GPU driver is updated to the latest version (v580.xx or higher), or the CUDA instruction set may fail to initialize.
  2. Node Breakage: Newer versions may cause errors in custom nodes that haven't been updated since before 2024. If an environment fails to load, check the custom_nodes directory for outdated plugins.
  3. VRAM Management: torch 2.9.1 features enhanced async offloading, ideal for devices with high bandwidth or large VRAM such as the 5060 Ti 16G or B100.
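
For the driver requirement in point 1, you can verify the installed version programmatically before picking torch 2.9.1. A minimal sketch, assuming the nvidia-ml-py package is installed (`pip install nvidia-ml-py`) and using the v580.xx floor quoted above:

```python
# Pre-flight driver check for the "torch 2.9.1" environment.
# Assumes nvidia-ml-py is installed; the 580 floor comes from the
# maintenance note above, not from an official NVIDIA matrix.
import pynvml

pynvml.nvmlInit()
version = pynvml.nvmlSystemGetDriverVersion()
if isinstance(version, bytes):  # older bindings return bytes
    version = version.decode()
major = int(version.split(".")[0])
status = "OK for torch 2.9.1" if major >= 580 else "update the driver first"
print(f"Driver {version}: {status}")
pynvml.nvmlShutdown()
```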

❓ FAQ

Q: I have an RTX 5060 Ti 16G. Which should I choose?
A: For stability, go with torch 2.7.1. However, since this card uses the new Blackwell architecture, you need at least torch 2.8.0 to run FP4 quantized models. For maximum performance, try torch 2.9.1.

Q: I have an RTX 4060 Ti 16G. Which should I choose?
A: torch 2.7.1 is recommended for stability; for higher performance, torch 2.8.0 is a good fit, running stably in most scenarios and supporting FP8 quantized models. However, this GPU gains no speedup from FP4 quantized models (hardware FP4 acceleration requires Blackwell), so their use is not advised.

Q: I have an RTX 3060 8GB. Which should I choose?
A: Stick with torch 2.7.1. While torch 2.8.0 might be faster in some cases, it could also be slower on this older architecture.

Q: Can I switch environments later?
A: Yes. Choosing one environment does not affect your ability to pick another next time. However, switching requires a dependency re-check, which is time-consuming. Staying with the same environment results in faster startups.

Q: Is there a difference in image quality between the three?
A: No. The environment only affects generation speed, temperature control, and VRAM usage. It does not change the mathematical logic of the final image output.


If you're using ComfyUI for the first time, you'll see the simplest example provided by the official team. You can try this to quickly experience what ComfyUI can do.

While the default example gives you a basic idea, we recommend exploring other workflows and models that offer richer features and much better results.

LAN Access

If you want other computers on the local network to access this ComfyUI service, you can choose "Allow LAN Access" to enable remote access. Be aware, however, that this carries risks of data leakage and attacks, so make sure your network is secure. If you're unsure, do not enable this feature.
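
Once LAN access is enabled, you can confirm the service is reachable from another machine. A minimal sketch, assuming the host runs ComfyUI on its default port 8188 and has the LAN address 192.168.1.50 (a placeholder; replace with your own):

```python
# Reachability check from another machine on the LAN.
# 192.168.1.50 is a placeholder; ComfyUI's default port is 8188.
import urllib.request

url = "http://192.168.1.50:8188/"
try:
    with urllib.request.urlopen(url, timeout=5) as resp:
        print("ComfyUI reachable, HTTP status:", resp.status)
except OSError as exc:
    print("Could not reach ComfyUI:", exc)
```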


If you still encounter issues, please contact our technical support team at tech@daiyl.com.