---
license: apache-2.0
library_name: realesrgan
pipeline_tag: image-to-image
tags:
- image-upscaling
- super-resolution
- realesrgan
- esrgan
- post-processing
- image-enhancement
---
FLUX Upscale Models Collection v1.3
This repository contains Real-ESRGAN upscale models for post-processing and enhancing generated images. These models can upscale images by 2x or 4x while adding fine details and improving sharpness.
Model Description
Real-ESRGAN (Real Enhanced Super-Resolution Generative Adversarial Networks) is a family of models for high-quality image upscaling. They are commonly used as a post-processing step for AI-generated images to increase resolution and enhance detail.
Key Capabilities:
- 2x and 4x image upscaling
- Detail enhancement and sharpening
- Noise reduction and artifact removal
- Optimized for AI-generated images
- CPU and GPU compatible
Repository Contents
Total Size: ~192MB
Upscale Models (upscale_models/)
- `4x-UltraSharp.pth` - 64MB - 4x upscaling with ultra-sharp detail enhancement
- `RealESRGAN_x2plus.pth` - 64MB - 2x upscaling model
- `RealESRGAN_x4plus.pth` - 64MB - 4x upscaling model
Hardware Requirements
- VRAM: 4GB+ recommended for GPU inference
- Disk Space: 192MB
- Memory: 8GB+ system RAM recommended
- Compatible with: CPU or GPU inference (CUDA, ROCm, or CPU)
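As a minimal device-selection sketch for CPU or GPU inference (the relative model path is an assumption about where this repository was downloaded, and recent realesrgan releases accept a device argument):
```python
import torch
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

# Prefer CUDA when available, otherwise fall back to CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64,
                num_block=23, num_grow_ch=32, scale=4)
upsampler = RealESRGANer(
    scale=4,
    model_path="upscale_models/4x-UltraSharp.pth",  # assumed local path
    model=model,
    device=device,
    half=(device.type == "cuda"),  # FP16 only helps on GPU
)
```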
Usage Examples
Basic Usage with Real-ESRGAN
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer
import cv2
# Load the upscaler model
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32)
upsampler = RealESRGANer(
scale=4,
model_path="E:\\huggingface\\flux-upscale\\upscale_models\\4x-UltraSharp.pth",
model=model,
tile=0,
tile_pad=10,
pre_pad=0,
half=True # Use FP16 for faster inference on GPU
)
# Load and upscale an image
img = cv2.imread("input.png", cv2.IMREAD_COLOR)
output, _ = upsampler.enhance(img, outscale=4)
cv2.imwrite("output_upscaled.png", output)
Using with FLUX Pipeline
from diffusers import FluxPipeline
from realesrgan import RealESRGANer
from basicsr.archs.rrdbnet_arch import RRDBNet
import torch
import numpy as np
import cv2
# Generate image with FLUX
pipe = FluxPipeline.from_pretrained(
"E:\\huggingface\\flux-dev-fp16",
torch_dtype=torch.float16
)
pipe.to("cuda")
image = pipe(
prompt="a beautiful landscape with mountains",
num_inference_steps=30
).images[0]
# Convert the PIL image (RGB) to OpenCV's BGR layout so colors stay correct when saving with cv2.imwrite
img_array = cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR)
# Initialize upscaler
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32)
upsampler = RealESRGANer(
scale=4,
model_path="E:\\huggingface\\flux-upscale\\upscale_models\\4x-UltraSharp.pth",
model=model,
half=True
)
# Upscale the generated image
upscaled, _ = upsampler.enhance(img_array, outscale=4)
# Save result
cv2.imwrite("flux_upscaled_4x.png", upscaled)
Tiled Processing for Large Images
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer
import cv2
# Configure for large images with tiling
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32)
upsampler = RealESRGANer(
scale=4,
model_path="E:\\huggingface\\flux-upscale\\upscale_models\\RealESRGAN_x4plus.pth",
model=model,
tile=512, # Process in 512x512 tiles
tile_pad=10, # Padding to avoid seams
pre_pad=0,
half=True
)
# Process large image
img = cv2.imread("large_image.png", cv2.IMREAD_COLOR)
output, _ = upsampler.enhance(img, outscale=4)
cv2.imwrite("large_upscaled.png", output)
Model Comparison
| Model | Scale | Best For | File Size | Speed |
|---|---|---|---|---|
| 4x-UltraSharp | 4x | Sharp details, AI-generated images | 64MB | Moderate |
| RealESRGAN_x2plus | 2x | Moderate upscaling, faster processing | 64MB | Fast |
| RealESRGAN_x4plus | 4x | General purpose 4x upscaling | 64MB | Moderate |
Model Selection Guide:
- 4x-UltraSharp: Best for AI-generated images needing maximum sharpness
- RealESRGAN_x2plus: Quick 2x upscaling with balanced quality
- RealESRGAN_x4plus: General-purpose 4x upscaling for various image types
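To keep the network configuration in sync with the chosen checkpoint, a small helper can map the target scale to the matching file (a sketch; the helper name and relative paths are illustrative). Note that RealESRGAN_x2plus is a scale-2 network, so both `RRDBNet` and `RealESRGANer` must be built with `scale=2`:
```python
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

# scale -> checkpoint path; paths assume a local download of this repository
MODEL_PATHS = {
    2: "upscale_models/RealESRGAN_x2plus.pth",
    4: "upscale_models/RealESRGAN_x4plus.pth",
}

def build_upsampler(scale: int) -> RealESRGANer:
    # The RRDBNet scale must match the checkpoint it will load
    model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64,
                    num_block=23, num_grow_ch=32, scale=scale)
    return RealESRGANer(scale=scale, model_path=MODEL_PATHS[scale],
                        model=model, half=True)

upsampler_2x = build_upsampler(2)  # RealESRGAN_x2plus
upsampler_4x = build_upsampler(4)  # RealESRGAN_x4plus
```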
Model Specifications
- Architecture: RRDB (Residual in Residual Dense Block)
- Input Channels: 3 (RGB)
- Output Channels: 3 (RGB)
- Feature Dimensions: 64
- Network Blocks: 23 (standard configuration)
- Growth Channels: 32
- Format: PyTorch `.pth` files
- Precision: FP32 (supports FP16 inference)
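These specifications can be checked against a downloaded checkpoint by inspecting its state dict (a sketch with an assumed local path; official Real-ESRGAN releases typically nest the weights under a `params_ema` or `params` key, while some community checkpoints are plain state dicts):
```python
import torch

ckpt = torch.load("upscale_models/RealESRGAN_x4plus.pth", map_location="cpu")

# Unwrap the weights regardless of how the checkpoint is packaged
state_dict = ckpt.get("params_ema", ckpt.get("params", ckpt))

print(len(state_dict), "tensors")
# For the standard RRDBNet configuration this is expected to be (64, 3, 3, 3):
# 64 features, 3 input channels, 3x3 kernels
print("conv_first.weight:", tuple(state_dict["conv_first.weight"].shape))
```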
Performance Tips
- GPU Acceleration: Use `half=True` for FP16 inference on compatible GPUs (approximately 2x faster)
- Tiling for VRAM: Enable tiling with `tile=512` to reduce VRAM usage for large images
- Tile Padding: Use `tile_pad=10` to minimize visible seams between tiles
- Batch Processing: Process multiple images sequentially to amortize model loading time (see the sketch after this list)
- CPU Fallback: Models work on CPU but will be significantly slower (~10-20x)
- Optimal Scale: Use 2x for faster processing, 4x for maximum detail enhancement
- Input Quality: Better input images produce better upscaling results
- File Formats: Use lossless formats (PNG) for best quality preservation
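A minimal batch-processing sketch for the tip above (folder names are illustrative): the model is loaded once and reused for every image.
```python
import glob
import os

import cv2
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

# Load the model once, then reuse it across the whole folder
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64,
                num_block=23, num_grow_ch=32, scale=4)
upsampler = RealESRGANer(scale=4, model_path="upscale_models/4x-UltraSharp.pth",
                         model=model, tile=512, tile_pad=10, half=True)

os.makedirs("upscaled", exist_ok=True)
for path in sorted(glob.glob("inputs/*.png")):
    img = cv2.imread(path, cv2.IMREAD_COLOR)
    output, _ = upsampler.enhance(img, outscale=4)
    cv2.imwrite(os.path.join("upscaled", os.path.basename(path)), output)
```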
Use Cases
- Post-processing AI-generated images from FLUX.1, Stable Diffusion, etc.
- Enhancing FLUX.1-dev outputs for high-resolution prints
- Increasing resolution of generated artwork for commercial use
- Adding fine details to synthetic images
- Print preparation for generated images (posters, canvas prints)
- Upscaling video frames for AI video generation pipelines
- Restoring and enhancing low-resolution generated content
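For the video-frame use case above, a sketch (file names are placeholders) reads frames with OpenCV, upscales each one independently, and writes the result. Because frames are processed independently, temporal consistency is not enforced and some flicker is possible.
```python
import cv2
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

# The 2x model keeps per-frame latency lower than 4x
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64,
                num_block=23, num_grow_ch=32, scale=2)
upsampler = RealESRGANer(scale=2, model_path="upscale_models/RealESRGAN_x2plus.pth",
                         model=model, tile=512, half=True)

cap = cv2.VideoCapture("generated_clip.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)) * 2
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) * 2
writer = cv2.VideoWriter("upscaled_clip.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    upscaled, _ = upsampler.enhance(frame, outscale=2)
    writer.write(upscaled)

cap.release()
writer.release()
```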
Installation
pip install realesrgan basicsr
Dependencies:
- Python 3.8+
- PyTorch 1.7+
- basicsr
- realesrgan
- opencv-python
- numpy
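After installing the dependencies above, a quick sanity check confirms that the stack imports cleanly and whether a GPU is visible (a sketch; it assumes the packages expose `__version__`, which current releases do):
```python
import cv2
import torch
import basicsr
import realesrgan

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("basicsr:", basicsr.__version__)
print("realesrgan:", realesrgan.__version__)
print("opencv-python:", cv2.__version__)
```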
License
These models are released under the Apache 2.0 license.
Citation
@InProceedings{wang2021realesrgan,
author = {Xintao Wang and Liangbin Xie and Chao Dong and Ying Shan},
title = {Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data},
booktitle = {International Conference on Computer Vision Workshops (ICCVW)},
year = {2021}
}
Links and Resources
- Real-ESRGAN Paper: arXiv:2107.10833
- Official Repository: xinntao/Real-ESRGAN
- BasicSR Library: xinntao/BasicSR
- Hugging Face: Real-ESRGAN Models
- Model Downloads: Available through official Real-ESRGAN releases
Model Card Contact
For questions about Real-ESRGAN models, refer to the official Real-ESRGAN repository and documentation at https://github.com/xinntao/Real-ESRGAN