[Flux.1] It Runs Smoothly with 6GB VRAM! A Recommendation for Using --novram

Columbus 4-frame cartoon_result.png (1133×1600)
  • Flux.1 runs smoothly with 16GB VRAM
  • With --novram, it works with just 6GB VRAM
  • Ideally, have 32GB system RAM

Introduction

Hello, I'm Easygoing.

I write articles about AI image generation, and recently, my main focus has been on Flux.1.

Flux.1 is a high-quality AI image generation model, but its heavy VRAM usage is a drawback.

Today, I will discuss ways to run Flux.1 smoothly on low-spec PCs.

Conclusion: Use the --novram Option

To get straight to the point, Flux.1 runs smoothly with 6GB of VRAM when using the --novram option!

Illustration of a sailing ship navigating the rough Atlantic Ocean_result.png (1600×1600)

Test Environment

The testing was done in the following environment:

  • ComfyUI
  • System RAM: 32GB
  • GPU: RTX 4060 Ti (16GB VRAM)

Workflow: Rendering (1440 × 1440) → Upscale (× 1.8) → Rerender (2592 × 2592)

The workflow I used involved generating images followed by upscaling to create high-resolution illustrations.

I measured VRAM usage at each step and plotted the results in graphs.
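
For reference, VRAM usage on NVIDIA GPUs can be sampled from the command line while a workflow runs. A minimal sketch of one way to do it (my own illustration, not necessarily how the graphs below were produced):

```
# Print the GPU's used VRAM once per second until interrupted
nvidia-smi --query-gpu=memory.used --format=csv,noheader --loop=1
```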

Results

First, I checked the VRAM usage in the original FP16 format.

  • T5xxl-fp16: 9.6 GB
  • blue_pencil-flux1_v001-fp16: 33 GB

VRAM usage graph_result.png (800×1131)

The first generation is shown by the blue graph, and the second by the green graph.

In both cases, VRAM usage reached 15 GB, meaning that on a GPU with less than 12 GB of VRAM, generation time will increase dramatically.

Lowering Resolution Doesn’t Help Save VRAM?

To explore ways to generate images using less VRAM, I tried lowering the resolution.

VRAM usage graph at a resolution of 128 x 128_result.png (800×1131)

When the resolution was set to 128 x 128 (1/64th the original size), the image generation time was reduced, but the VRAM usage remained largely unchanged.

This shows that lowering the resolution does not significantly reduce VRAM usage.
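
To see why, compare the size of the model weights with the size of the latent image. A back-of-the-envelope sketch in Python (my own illustration; it assumes Flux.1's roughly 12-billion-parameter transformer, an 8x VAE downscale, and 16 latent channels, and it ignores intermediate activations):

```python
# Model weights dwarf the latent image, so resolution barely moves VRAM.
def latent_mib(pixels: int) -> float:
    side = pixels // 8                   # VAE downscales 8x per side
    return side * side * 16 * 2 / 2**20  # 16 channels, 2 bytes (fp16)

weights_gib = 12e9 * 2 / 2**30           # ~12B params, 2 bytes each
print(f"fp16 weights:   ~{weights_gib:.0f} GiB")
print(f"latent 1440 px: ~{latent_mib(1440):.2f} MiB")
print(f"latent 128 px:  ~{latent_mib(128):.4f} MiB")
```

Even at 1440 × 1440 the latent is around 1 MiB, against roughly 22 GiB of fp16 weights, so shrinking the image to 128 × 128 saves almost nothing.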

I’ll explore other ways to save VRAM.

--lowvram and --novram Options

ComfyUI offers two launch options for reducing VRAM usage on low-VRAM systems.

--lowvram

The --lowvram option transfers part of the model to system RAM to reduce VRAM usage.

It can save VRAM but slows down the overall process.

--novram

The --novram option is more aggressive in saving VRAM.

Although VRAM usage becomes extremely low, the processing speed slows down significantly.
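
Both are flags passed to ComfyUI at startup. A quick sketch of the two launch commands, assuming a manual ComfyUI install started with python main.py (Stability Matrix users can instead use the checkbox shown at the end of this article):

```
python main.py --lowvram   # offload part of the model to system RAM
python main.py --novram    # keep essentially all model data in system RAM
```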

--lowvram is Barely Effective!

First, I checked VRAM usage with the --lowvram option.

  • T5xxl-fp16: 9.6 GB
  • blue_pencil-flux1_v001-fp16: 33 GB

VRAM usage graph for --lowvram and --novram conditions_result.png (800×1131)

With the --lowvram setting, there was hardly any change in VRAM usage.

When I tested --lowvram with SDXL, I did see some VRAM savings, so the option may simply not be well optimized for Flux.1 yet, perhaps because the model is still new.

--novram is Highly Effective!

In the same graph, the --novram setting significantly reduced VRAM usage, keeping it almost within 6GB.

The only time VRAM usage exceeded 6GB was during the upscaling step, which is brief, and which can be skipped entirely if 1440 x 1440 output is sufficient.

Sailors on watch on a sailing ship_result.png (1600×1600)

However, since --novram moves model data out of VRAM and into system RAM, system RAM was fully occupied at 32GB.

By optimizing system RAM usage, it might be possible to generate images even faster.

Changing the Checkpoint to gguf Format

To save system RAM, I tried switching to a smaller gguf checkpoint.

  • T5xxl-fp16: 9.6 GB
  • blue_pencil-flux1_v001-Q_8.gguf: 12.5 GB (-20 GB)
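
Note that gguf checkpoints are not read by ComfyUI's standard checkpoint loader; a loader extension is required. One widely used option is city96's ComfyUI-GGUF custom node (an assumption on my part; the article doesn't state which loader was used):

```
cd ComfyUI/custom_nodes
git clone https://github.com/city96/ComfyUI-GGUF
```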

By using a smaller model, system RAM usage stayed below 32GB, and generation time was reduced by 2 minutes.

Illustration of a sailing ship in FP16 format_result.png (1600×1600)
FP16
Illustration of a sailing ship in Q_8.gguf format_result.png (1600×1600)
Q_8

The generation time was almost the same as with 16GB VRAM, and there was little to no degradation in image quality.

The gguf format is very effective in saving RAM.

Reducing System RAM to 16GB

Next, I tested it under conditions where system RAM was reduced.

I removed one stick of RAM, reducing system RAM to 16GB while keeping the --novram setting.

VRAM usage graph for Q_8 checkpoint with --novram and 16GB RAM_result.png (800×1131)

With system RAM maxed out at 16GB, generation time increased by 2 minutes due to memory shortages.

This suggests that further reducing the model size could speed things up again.

Further Model Optimization

To further reduce system RAM usage, I tried lightening the models even more.

  • T5xxl-Q_5_K_M.gguf: 3.4 GB (-6.2 GB)
  • blue_pencil-flux1_v001-Q_4_K_M.gguf: 6.8 GB (-5.7 GB)

VRAM usage graph for Q_5 and Q_4 models with --novram and 16GB RAM_result.png (800×1131)

By lightening the model, the generation time returned to what it was originally.

However, reducing it to Q_4 resulted in less detailed images.

Illustration of a sailing ship at anchor in Q_8.gguf format_result.png (1600×1600)
Q_8
Illustration of a sailing ship at anchor in Q_4_K_M.gguf format_result.png (1600×1600)
Q_4

In a system with 16GB RAM, using Q_4 for normal use and Q_8 for higher quality seems like a good balance.

Summary of Test Results

Here’s a summary of the test results in a table.

| Installed RAM | t5xxl | Flux.1 checkpoint | Option | Render VRAM (GB) | Upscale VRAM (GB) | Rerender VRAM (GB) | System RAM (GB) | Time (min:sec) |
|---|---|---|---|---|---|---|---|---|
| 32 GB | FP16 | FP16 | First draw | 15.1 | 11.8 | 14.1 | 24.4 | 11:01 |
| 32 GB | FP16 | FP16 | Second draw | 14.8 | 11.6 | 14.1 | 19.1 | 09:22 |
| 32 GB | FP16 | FP16 | 128 x 128 | 14.6 | 14.6 | 14.6 | 22.6 | 03:28 |
| 32 GB | FP16 | FP16 | --lowvram | 15.0 | 12.0 | 14.2 | 24.4 | 11:19 |
| 32 GB | FP16 | FP16 | --novram | 2.6 | 6.3 | 5.8 | 30.4 | 12:21 |
| 32 GB | FP16 | Q_8 | --novram | 2.4 | 6.3 | 4.9 | 26.1 | 10:13 |
| 16 GB | FP16 | Q_8 | --novram | 2.8 | 6.4 | 5.2 | 15.1 | 12:08 |
| 16 GB | Q_5_K_M | Q_4_K_M | --novram | 2.5 | 6.5 | 5.8 | 11.5 | 09:28 |

Flux.1 can run sufficiently with 6GB VRAM by using --novram and gguf format.

While 16GB system RAM is enough to generate images, 32GB is ideal for balancing quality and speed.

The tests revealed that --novram offers significant benefits with almost no downsides.

By using --novram and adjusting model sizes, it’s possible to maintain nearly original speeds.

Recommended Settings

Based on the test results, here are the recommended settings.

VRAM 16GB or More

  • T5xxl-fp16
  • Checkpoint fp16

VRAM 6GB to 12GB

  • --novram setting
  • T5xxl-fp16
  • Checkpoint Q_8.gguf

System RAM Less than 16GB

  • T5xxl-Q_5_K_M.gguf
  • Checkpoint Q_4_K_M.gguf
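
If you use a gguf loader such as the ComfyUI-GGUF node mentioned above, the quantized files typically go into the following folders (an assumption based on that extension's conventions; other loaders may expect different paths):

```
ComfyUI/models/unet/   <- Flux.1 checkpoint (.gguf)
ComfyUI/models/clip/   <- t5xxl text encoder (.gguf)
```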
Illustration of a sailing ship returning

--novram Settings

Here’s how to enable the --novram setting in Stability Matrix, which I’m using.

Stability Matrix Settings Screen_result.png (1600×1091)
Stability Matrix's --novram setting Commented_result.png (1600×1091)

After launching Stability Matrix, press the settings button for ComfyUI and scroll down to find the --novram checkbox.

Once you check this box and launch ComfyUI, you’re good to go!
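
If you run ComfyUI directly rather than through Stability Matrix, the same setting is simply a launch flag (assuming a standard manual install):

```
cd ComfyUI
python main.py --novram
```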

Conclusion

  • Flux.1 runs smoothly with 16GB VRAM
  • With --novram, it works with just 6GB VRAM
  • Ideally, have 32GB system RAM
Illustration of a ship in a gulf

Thank you for reading to the end!