rajtiwariee committed · Commit 13b079d · verified · 1 Parent(s): 98a1582

Upload LumiTrace model

.gitattributes CHANGED
@@ -33,3 +33,8 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ samples/compare_1.png filter=lfs diff=lfs merge=lfs -text
+ samples/compare_111.png filter=lfs diff=lfs merge=lfs -text
+ samples/compare_22.png filter=lfs diff=lfs merge=lfs -text
+ samples/compare_23.png filter=lfs diff=lfs merge=lfs -text
+ samples/compare_55.png filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,134 @@
+ ---
+ license: apache-2.0
+ tags:
+ - low-light-enhancement
+ - video-processing
+ - temporal-consistency
+ - computer-vision
+ - image-enhancement
+ library_name: pytorch
+ pipeline_tag: image-to-image
+ ---
+
+ # LumiTrace: Temporal Low-Light Video Enhancement
+
+ **LumiTrace** is a temporal video enhancement model that brightens low-light videos while maintaining temporal consistency and preserving fine detail.
+
+ ## Model Description
+
+ LumiTrace combines **RetinexFormer** (a Retinex-based image enhancement architecture) with custom **temporal modules** to process video sequences. A two-stage training strategy yields strong performance on challenging low-light scenarios.
+
+ ### Key Features
+
+ - 🎬 **Temporal Consistency**: Processes 3-frame sequences to eliminate flickering
+ - 🌟 **High Quality**: Reaches up to 22.7 dB PSNR and 0.84 SSIM on the LOL benchmarks
+ - ⚡ **Memory Efficient**: Supports high-resolution inference via tiled processing
+ - 🔧 **Production Ready**: Includes a video processing pipeline with automatic resolution standardization
+
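The tiled-processing idea behind the memory-efficiency claim can be sketched independently of the model. This is a generic illustration, not LumiTrace's actual pipeline: it uses non-overlapping tiles and no seam blending, both simplifying assumptions.

```python
import torch

def enhance_tiled(frame: torch.Tensor, enhance_fn, tile: int = 128) -> torch.Tensor:
    """Run enhance_fn tile by tile and stitch the results, so peak memory
    scales with the tile size rather than the full frame.
    frame: (C, H, W); H and W divisible by `tile` for simplicity."""
    c, h, w = frame.shape
    out = torch.empty_like(frame)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            out[:, y:y + tile, x:x + tile] = enhance_fn(frame[:, y:y + tile, x:x + tile])
    return out

# A stand-in "enhancer" that just brightens; the real model would go here.
frame = torch.rand(3, 256, 384)
brightened = enhance_tiled(frame, lambda t: (t * 1.5).clamp(0, 1))
print(brightened.shape)  # torch.Size([3, 256, 384])
```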
+ ### Architecture
+
+ - **Base**: RetinexFormer (2.2M parameters)
+ - **Temporal Modules**: Custom 3D convolution + attention (0.8M parameters)
+ - **Total Parameters**: ~3M
+ - **Input**: 3-frame sequences (RGB, normalized to [0, 1])
+ - **Output**: Enhanced center frame
+
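The input/output contract above can be sanity-checked with a toy stand-in module. `TemporalStub` is a hypothetical sketch with the same shapes, not the released architecture:

```python
import torch
import torch.nn as nn

class TemporalStub(nn.Module):
    """Hypothetical stand-in with LumiTrace's I/O contract:
    a 3-frame RGB clip in, the enhanced center frame out."""
    def __init__(self, num_frames: int = 3):
        super().__init__()
        # Collapse the temporal axis with a single 3D conv (toy layer).
        self.fuse = nn.Conv3d(3, 3, kernel_size=(num_frames, 3, 3),
                              padding=(0, 1, 1))

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (B, T, C, H, W) with T=3, values in [0, 1]
        x = clip.permute(0, 2, 1, 3, 4)   # -> (B, C, T, H, W)
        out = self.fuse(x).squeeze(2)     # -> (B, C, H, W), center-frame estimate
        return torch.sigmoid(out)         # keep output in [0, 1]

clip = torch.rand(1, 3, 3, 128, 128)      # one 3-frame clip
enhanced = TemporalStub()(clip)
print(enhanced.shape)                     # torch.Size([1, 3, 128, 128])
```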
+ ## Training Data
+
+ The model was trained on:
+ - **LOL-v1**: 485 training pairs, 15 test pairs
+ - **LOL-v2-Real**: 689 training pairs, 100 test pairs
+
+ Training used synthetic temporal sequences generated from static image pairs via brightness and spatial augmentation.
+
+ ## Training Procedure
+
+ ### Two-Stage Training Strategy
+
+ **Stage 1** (50 epochs):
+ - Freeze the RetinexFormer backbone
+ - Train only the temporal modules
+ - Learning rate: 1e-4
+ - Loss: L2 reconstruction + temporal consistency
+
+ **Stage 2** (60 epochs):
+ - Unfreeze all parameters
+ - Discriminative learning rates:
+   - RetinexFormer: 1e-6
+   - Temporal modules: 1e-4
+ - Loss: L2 + temporal + perceptual (VGG)
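The temporal-consistency term is not defined in this card. One common formulation, shown here purely as an assumption, penalizes the mismatch between frame-to-frame changes in the enhanced clip and in the ground truth:

```python
import torch

def temporal_consistency_loss(enhanced: torch.Tensor,
                              target: torch.Tensor) -> torch.Tensor:
    """MSE between temporal differences of the enhanced clip and the
    ground-truth clip; both tensors are (B, T, C, H, W).
    (Assumed formulation, not confirmed by this model card.)"""
    d_enh = enhanced[:, 1:] - enhanced[:, :-1]   # frame-to-frame changes
    d_gt = target[:, 1:] - target[:, :-1]
    return torch.mean((d_enh - d_gt) ** 2)

clip = torch.rand(2, 3, 3, 32, 32)
print(temporal_consistency_loss(clip, clip).item())  # 0.0 for identical clips
```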
+
+ ### Performance
+
+ | Dataset | PSNR | SSIM |
+ |---------|------|------|
+ | LOL-v1 | 22.70 dB | 0.8389 |
+ | LOL-v2-Real | 21.72 dB | 0.8199 |
+
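For reference, PSNR figures like those in the table follow directly from mean squared error on [0, 1]-normalized images:

```python
import torch

def psnr(pred: torch.Tensor, target: torch.Tensor, max_val: float = 1.0) -> float:
    """PSNR in dB for images normalized to [0, max_val]."""
    mse = torch.mean((pred - target) ** 2)
    return float(10 * torch.log10(max_val ** 2 / mse))

a = torch.full((3, 8, 8), 0.5)
b = torch.full((3, 8, 8), 0.6)   # uniform 0.1 error -> MSE = 0.01
print(round(psnr(a, b), 2))      # 20.0
```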
+ ## Usage
+
+ ### Installation
+
+ ```bash
+ git clone https://github.com/yourusername/LumiTrace
+ cd LumiTrace
+ pip install -r requirements.txt
+ ```
+
+ ### Inference (Python)
+
+ ```python
+ from lumitrace.inference import VideoEnhancer
+ import yaml
+
+ # Load config
+ with open('configs/lol_v1_temporal.yml', 'r') as f:
+     config = yaml.safe_load(f)
+
+ # Initialize enhancer
+ enhancer = VideoEnhancer(
+     model_path='checkpoints/lol_v1/stage2/best.pth',
+     config=config
+ )
+
+ # Process video
+ enhancer.enhance_video(
+     input_path='input.mp4',
+     output_path='enhanced.mp4'
+ )
+ ```
+
+ ### Inference (CLI)
+
+ ```bash
+ ./scripts/enhance_video.sh input.mp4 output.mp4
+ ```
+
+ ## Limitations
+
+ - Trained primarily on indoor/static scenes (LOL datasets)
+ - May struggle with extreme motion or outdoor dynamic lighting
+ - Best performance on videos with resolution ≤ 720p
+ - Requires a GPU for real-time processing
+
+ ## Citation
+
+ If you use this model, please cite:
+
+ ```bibtex
+ @software{lumitrace2024,
+   title={LumiTrace: Temporal Low-Light Video Enhancement},
+   author={Your Name},
+   year={2024},
+   url={https://github.com/yourusername/LumiTrace}
+ }
+ ```
+
+ ## Acknowledgments
+
+ - Based on [RetinexFormer](https://github.com/caiyuanhao1998/Retinexformer)
+ - Trained on the [LOL datasets](https://daooshee.github.io/BMVC2018website/)
+
+ ## License
+
+ Apache 2.0
config.yml ADDED
@@ -0,0 +1,84 @@
+ # Training configuration for LOL-v1 dataset with temporal consistency
+
+ # Dataset settings
+ dataset:
+   name: "LOL-v1"
+   root_dir: "data/LOLv1"
+   num_frames: 3
+   patch_size: 128  # Reduced from 256 for GPU memory
+
+ # Model settings
+ model:
+   name: "TemporalRetinexFormer"
+   # Retinexformer parameters
+   in_channels: 3
+   out_channels: 3
+   n_feat: 40
+   stage: 2
+   num_blocks: [1, 1, 1]
+   # Temporal parameters
+   temporal_feat_channels: 64
+   # Training control
+   freeze_retinex: false  # Set true for Stage 1
+
+ # Training settings
+ training:
+   # Stage 1: Frozen Retinexformer
+   stage1:
+     epochs: 50
+     batch_size: 4  # Reduced from 8 for GPU memory
+     learning_rate: 1.0e-4
+     freeze_retinex: true
+     use_perceptual_loss: false
+
+   # Stage 2: Joint fine-tuning
+   stage2:
+     epochs: 60
+     batch_size: 4  # Reduced from 8 for GPU memory
+     # Discriminative learning rates
+     retinex_lr: 1.0e-6  # Very low for Retinexformer
+     temporal_lr: 1.0e-4  # Normal for temporal modules
+     freeze_retinex: false
+     use_perceptual_loss: true
+
+ # Loss settings
+ loss:
+   reconstruction_weight: 1.0
+   temporal_weight: 0.3
+   perceptual_weight: 0.1
+   reconstruction_type: "l2"  # "l1" or "l2"
+
+ # Optimizer settings
+ optimizer:
+   type: "AdamW"
+   weight_decay: 1.0e-4
+   betas: [0.9, 0.999]
+
+ # Scheduler settings
+ scheduler:
+   type: "CosineAnnealingLR"
+   T_max: 50  # Set to the epoch count of the stage being trained
+   eta_min: 1.0e-6
+
+ # Logging and checkpoints
+ logging:
+   wandb: true  # Use Weights & Biases (falls back to TensorBoard if false)
+   log_interval: 10  # Print every N batches
+   checkpoint_interval: 5  # Save every N epochs
+   save_dir: "checkpoints/lol_v1"
+
+ # Hardware
+ hardware:
+   num_workers: 4
+   pin_memory: true
+   mixed_precision: true  # Use automatic mixed precision
+
+ # Pretrained weights
+ pretrained:
+   retinexformer: "pretrained_weights/LOL_v1.pth"
+
+ # Evaluation
+ evaluation:
+   metrics: ["psnr", "ssim", "temporal_consistency"]
+   save_images: true
+   num_vis_samples: 5
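The `retinex_lr`/`temporal_lr` split in `stage2` translates to optimizer parameter groups; a minimal sketch, where the submodule names below are illustrative assumptions rather than LumiTrace's actual attribute names:

```python
import torch
import torch.nn as nn

# Toy model standing in for TemporalRetinexFormer (submodule names assumed).
model = nn.ModuleDict({
    "retinexformer": nn.Conv2d(3, 3, 3, padding=1),
    "temporal": nn.Conv3d(3, 3, 3, padding=1),
})

# One AdamW optimizer, two learning rates, matching the config's
# retinex_lr / temporal_lr, weight_decay, and betas.
optimizer = torch.optim.AdamW(
    [
        {"params": model["retinexformer"].parameters(), "lr": 1e-6},
        {"params": model["temporal"].parameters(), "lr": 1e-4},
    ],
    weight_decay=1e-4,
    betas=(0.9, 0.999),
)
print([g["lr"] for g in optimizer.param_groups])  # [1e-06, 0.0001]
```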
model.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9cf977bd3f2d5fc8d6a10013d2f59db6f3e154cd57efe77a703563424b728bd9
+ size 36600526
requirements.txt ADDED
@@ -0,0 +1,47 @@
+ torch>=2.0.0
+ torchvision>=0.15.0
+ torchaudio>=2.0.0
+
+ # BasicSR dependencies
+ matplotlib>=3.3.0
+ scikit-learn>=0.24.0
+ scikit-image>=0.18.0
+ opencv-python>=4.5.0
+ yacs>=0.1.8
+ joblib>=1.0.0
+ natsort>=7.1.0
+ h5py>=3.1.0
+ tqdm>=4.64.0
+ tensorboard>=2.11.0
+
+ # Transformer & DL utilities
+ einops>=0.4.0
+ timm>=0.6.0
+ addict>=2.4.0
+ future>=0.18.0
+ lmdb>=1.2.0
+ numpy>=1.21.0
+ pyyaml>=5.4.0
+ requests>=2.25.0
+ scipy>=1.7.0
+ yapf>=0.31.0
+
+ # Perceptual losses
+ lpips>=0.1.4
+
+ # Video processing
+ imageio>=2.9.0
+ imageio-ffmpeg>=0.4.5
+
+ # Logging and experiment tracking
+ wandb>=0.12.0
+ torchmetrics>=1.0.0
+
+ # Utilities
+ python-dotenv>=0.19.0
+ Pillow>=8.0.0
+
+ # HuggingFace Hub (for model upload)
+ huggingface_hub>=0.16.0
samples/compare_1.png ADDED

Git LFS Details

  • SHA256: 3907936acf496c7227b8b2eadaee58a57c291cd19f534582cfe60833498508c5
  • Pointer size: 131 Bytes
  • Size of remote file: 734 kB
samples/compare_111.png ADDED

Git LFS Details

  • SHA256: 02f4ec2e72868f0241ca3feb192d366a583a4a056fa3afcbade478abf1256fe2
  • Pointer size: 131 Bytes
  • Size of remote file: 621 kB
samples/compare_22.png ADDED

Git LFS Details

  • SHA256: 3d52f02d4e2f7fc7a65e927e32158d478fee62ff9eaa0bef9047d9c9a52495f7
  • Pointer size: 131 Bytes
  • Size of remote file: 644 kB
samples/compare_23.png ADDED

Git LFS Details

  • SHA256: f73c13f093bc86ad9dcdc9d10c4c0957de37c1b53e15ca71736c1a877eb28b0a
  • Pointer size: 131 Bytes
  • Size of remote file: 627 kB
samples/compare_55.png ADDED

Git LFS Details

  • SHA256: ff8c783d8ebc1d76e5ec1f032430c6d506f2b36f5d7dba60063a4a05daad1933
  • Pointer size: 131 Bytes
  • Size of remote file: 634 kB