FLUX.2-dev: Attention-only INT8 Weight-Only Transformer (ROCm)

This repository provides an INT8 weight-only quantized transformer for
black-forest-labs/FLUX.2-dev.

It is designed to be:

  • ✅ ROCm-compatible
  • ✅ Stable on AMD Instinct MI210
  • ✅ Image-quality preserving

Only the attention Linear layers (Q/K/V and output projections) are quantized; all other components remain in BF16.
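
For reference, this kind of attention-only quantization can be reproduced with TorchAO's quantize_ API and a filter that matches only attention projections. The sketch below is illustrative, not necessarily the exact script used to produce this checkpoint; the module-name patterns assume diffusers-style attention naming and may need adjusting.

import torch
from diffusers import AutoModel
from torchao.quantization import quantize_, int8_weight_only

transformer = AutoModel.from_pretrained(
    "black-forest-labs/FLUX.2-dev",
    subfolder="transformer",
    torch_dtype=torch.bfloat16,
)

# Hypothetical name patterns for Q/K/V and output projections.
ATTN_KEYS = ("to_q", "to_k", "to_v", "to_out")

def is_attention_linear(module, fqn):
    # Quantize a module only if it is a Linear inside an attention block.
    return isinstance(module, torch.nn.Linear) and any(k in fqn for k in ATTN_KEYS)

# INT8 weight-only: weights are stored as INT8 and dequantized on the fly;
# activations stay in BF16.
quantize_(transformer, int8_weight_only(), filter_fn=is_attention_linear)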


🔍 What is included

  • ✅ Transformer with attention-only INT8 weight-only quantization
  • ✅ TorchAO-based quantization (no bitsandbytes)
  • ✅ Compatible with standard Diffusers pipelines

❌ What is NOT included

  • ❌ VAE
  • ❌ Text encoders
  • ❌ Scheduler

These components are automatically loaded from the base FLUX.2 model.


💡 Why attention-only INT8?

Full INT8 quantization of FLUX.2 introduces visible artifacts on ROCm. Quantizing only attention layers provides:

  • Significant VRAM reduction (INT8 weights take half the bytes of BF16; see the sketch below)
  • Stable generation
  • No "confetti noise" artifacts
  • Safe inference on MI210 (64 GB)
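
As a rough back-of-the-envelope check (the width below is a hypothetical stand-in for the real FLUX.2 attention width, not a measured value), INT8 storage takes 1 byte per weight versus 2 bytes for BF16:

# Illustrative arithmetic only; per-channel scales add a small overhead.
hidden = 3072                # hypothetical attention width
w = hidden * hidden          # one projection matrix
print(f"BF16: {w * 2 / 2**20:.0f} MiB  INT8: {w * 1 / 2**20:.0f} MiB")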

🚀 Usage (Diffusers)

import torch
from diffusers import Flux2Pipeline, AutoModel

BASE_MODEL = "black-forest-labs/FLUX.2-dev"
ATTN_INT8 = "AmdGoose/FLUX.2-dev-transformer-attn-int8wo"

dtype = torch.bfloat16
device = "cuda"  # ROCm uses "cuda" in PyTorch

transformer = AutoModel.from_pretrained(
    ATTN_INT8,
    subfolder="transformer_attn_int8wo",
    torch_dtype=dtype,
    # TorchAO tensor subclasses are serialized with pickle, which safetensors
    # does not support, hence use_safetensors=False.
    use_safetensors=False,
).to(device)

pipe = Flux2Pipeline.from_pretrained(
    BASE_MODEL,
    transformer=transformer,
    torch_dtype=dtype,
)

# Optional memory savers for single-GPU use on MI210.
pipe.enable_attention_slicing()
pipe.vae.enable_tiling()
pipe.enable_model_cpu_offload()

image = pipe(
    prompt="A realistic starter pack figurine in a blister box, studio lighting",
    num_inference_steps=28,
    guidance_scale=4,
    height=1024,
    width=1024,
).images[0]

image.save("out.png")
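
To sanity-check that only the attention projections were quantized, you can inspect the weight types. TorchAO replaces quantized weights with tensor subclasses; the exact class name depends on your torchao version.

# Attention projections should report a TorchAO tensor subclass,
# everything else a plain BF16 Parameter.
for name, module in transformer.named_modules():
    if isinstance(module, torch.nn.Linear):
        print(name, type(module.weight).__name__)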