---
language: en
library_name: diffusers
pipeline_tag: text-to-image
tags:
  - satellite
  - controlnet
  - diffusers
  - text-to-image
widget:
  - prompt: satellite image of farmland
    output:
      url: demo_images/readme_text2img.jpeg
---

> [!NOTE]
> If you encounter a pipeline loading failure or unexpected output, please contact bili_sakura@zju.edu.cn.

# DiffusionSat Custom Pipelines

Custom community pipelines for loading DiffusionSat checkpoints directly with `diffusers.DiffusionPipeline.from_pretrained()`.

> See the [Diffusers Community Pipeline Documentation](https://huggingface.co/docs/diffusers/using-diffusers/custom_pipeline_overview).

## Model Index

`model_index.json` points to the default text-to-image pipeline (`DiffusionSatPipeline`), so `DiffusionPipeline.from_pretrained()` works out of the box. The ControlNet variant is loaded via `custom_pipeline` plus the `controlnet` subfolder, as shown below.
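
To confirm which pipeline class a checkpoint will load by default, you can read `model_index.json` yourself. The following is a minimal sketch, not part of this repository: the helper name `default_pipeline_class` is hypothetical, and it assumes the file sits at the root of the checkpoint folder and uses the `_class_name` key that diffusers writes.

```python
import json
from pathlib import Path


def default_pipeline_class(checkpoint_dir):
    """Return the pipeline class name recorded in model_index.json.

    Hypothetical helper for illustration; assumes model_index.json is at
    the root of `checkpoint_dir`.
    """
    index_path = Path(checkpoint_dir) / "model_index.json"
    with index_path.open(encoding="utf-8") as f:
        index = json.load(f)
    # diffusers records the pipeline class under the "_class_name" key.
    return index["_class_name"]
```

Calling `default_pipeline_class("path/to/ckpt/diffusionsat")` should return `DiffusionSatPipeline` if the checkpoint matches the description above.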

## Available Pipelines

This directory contains two custom pipelines:

1. **`pipeline_diffusionsat.py`**: Standard text-to-image pipeline with DiffusionSat metadata support.
2. **`pipeline_diffusionsat_controlnet.py`**: ControlNet pipeline with DiffusionSat metadata and conditional metadata support.

## Setup

The checkpoint folder (`ckpt/diffusionsat/`) should contain the standard diffusers components (`unet`, `vae`, `scheduler`, etc.). You can reference the pipeline files directly from this directory or copy them into your checkpoint folder.
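
Before loading, it can help to check that the expected component subfolders are present. The sketch below is illustrative only: `missing_components` is a hypothetical helper, and its default list covers just the three components named above (`unet`, `vae`, `scheduler`). The full set depends on your checkpoint, so pass your own list if it differs.

```python
from pathlib import Path


def missing_components(checkpoint_dir, expected=("unet", "vae", "scheduler")):
    """Return the names in `expected` that are not subfolders of `checkpoint_dir`.

    Hypothetical helper; the default list is an assumption and only
    includes the components this README names explicitly.
    """
    root = Path(checkpoint_dir)
    return [name for name in expected if not (root / name).is_dir()]
```

An empty list means every listed component folder was found. Any names returned are missing and will likely cause `from_pretrained()` to fail.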

## Usage

### 1. Text-to-Image Pipeline

Use `pipeline_diffusionsat.py` for standard generation.

```python
import torch
from diffusers import DiffusionPipeline

# Load pipeline
pipe = DiffusionPipeline.from_pretrained(
    "path/to/ckpt/diffusionsat",
    custom_pipeline="./pipeline_diffusionsat.py",  # Path to this file
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
pipe = pipe.to("cuda")

# Optional: metadata vector of normalized values (lat, lon, timestamp, GSD, etc.)
# metadata = [0.5, -0.3, 0.7, 0.2, 0.1, 0.0, 0.5]

# Generate
image = pipe(
    "satellite image of farmland",
    metadata=None,  # Optional
    height=256,
    width=256,
    num_inference_steps=30,
).images[0]
```