	---
	datasets:
	- vita-video-gen/svi-benchmark
	language:
	- en
	tags:
	- video generation
	pipeline_tag: image-to-video
	library_name: diffusers
	license: mit
	project_page: https://stable-video-infinity.github.io/homepage/
	papers:
	- title: 'Stable Video Infinity: Infinite-Length Video Generation with Error Recycling'
	  authors:
	  - Wuyang Li
	  - Wentao Pan
	  - Po-Chien Luan
	  - Yang Gao
	  - Alexandre Alahi
	  url: https://huggingface.co/papers/2510.09212
	  conference: arXiv preprint, 2025
	---

	<div align="center">

	<h1>Stable Video Infinity: Infinite-Length Video Generation with Error Recycling</h1>

	<p align="center">
	<a href="https://huggingface.co/papers/2510.09212"> <img src="https://img.shields.io/badge/Paper-HuggingFace-red?logo=huggingface&logoColor=yellow" alt="Paper on Hugging Face"/> </a>
	<a href="https://stable-video-infinity.github.io/homepage/"> <img src="https://img.shields.io/badge/Project-Page-green" alt="Project Page"/> </a>
	<a href="https://github.com/vita-epfl/Stable-Video-Infinity"> <img src="https://img.shields.io/badge/SVI-GitHub-black?logo=github&logoColor=white" alt="SVI on GitHub"/> </a>
	<a href="https://huggingface.co/datasets/vita-video-gen/svi-benchmark"> <img src="https://img.shields.io/badge/SVI_Dataset-Hugging%20Face-orange?logo=huggingface&logoColor=yellow" alt="SVI Dataset"/> </a>
	<a href="https://huggingface.co/vita-video-gen/svi-model"> <img src="https://img.shields.io/badge/SVI_models-Hugging%20Face-FFCC00?logo=huggingface&logoColor=yellow" alt="SVI Models"/> </a> </p> </div>

	## 🎯 About This Repository
	Stable-Video-Infinity (SVI) generates videos of ANY length with high temporal consistency, plausible scene transitions, and controllable streaming storylines in ANY domain.
	This repository contains the model weights of the SVI family.

	## 🌟 Key Highlights
	- OpenSVI: Everything is open-sourced: training & evaluation scripts, datasets, and more.
	- Infinite Length: No inherent limit on video duration; generate arbitrarily long stories (see the 10‑minute “Tom and Jerry” demo).
	- Versatile: Supports diverse in-the-wild generation tasks: multi-scene short films, single‑scene animations, skeleton-/audio-conditioned generation, cartoons, and more.
	- Efficient: Only LoRA adapters are tuned, requiring very little training data: anyone can make their own SVI easily.

	## 📦 Resources
	| Model | Task | Input | Output | Hugging Face Link | Comments |
	|-------|------|-------|--------|-------------------|----------|
	| ALL | Infinite possibility | Image + X | X video | [🤗 Folder](https://huggingface.co/vita-video-gen/svi-model/tree/main/version-1.0) | Family bucket! I want to play with them all! |
	| SVI-Shot | Single-scene generation | Image + Text prompt | Long video | [🤗 Model](https://huggingface.co/vita-video-gen/svi-model/resolve/main/version-1.0/svi-shot.safetensors?download=true) | Generate a consistent long video from 1 text prompt. (This will never drift) |
	| SVI-Film | Multi-scene generation | Image + Text prompt stream | Film-style video | [🤗 Model](https://huggingface.co/vita-video-gen/svi-model/resolve/main/version-1.0/svi-film.safetensors?download=true) | Generate creative long videos from 1 text prompt stream (5 seconds per prompt). |
	| SVI-Film (Transition) | Multi-scene generation | Image + Text prompt stream | Film-style video | [🤗 Model](https://huggingface.co/vita-video-gen/svi-model/resolve/main/version-1.0/svi-film-transitions.safetensors?download=true) | Generate creative long videos from 1 text prompt stream. (More scene transitions, due to the training data) |
	| SVI-Tom&Jerry | Cartoon animation | Image | Cartoon video | [🤗 Model](https://huggingface.co/vita-video-gen/svi-model/resolve/main/version-1.0/svi-tom.safetensors?download=true) | Generate creative long cartoon videos from 1 text prompt stream. (No drift in our 20-minute test) |
	| SVI-Talk | Talking head | Image + Audio | Talking video | [🤗 Model](https://huggingface.co/vita-video-gen/svi-model/resolve/main/version-1.0/svi-talk.safetensors?download=true) | Generate long talking-head videos conditioned on audio. |
	| SVI-Dance | Dancing animation | Image + Skeleton | Dance video | [🤗 Model](https://huggingface.co/vita-video-gen/svi-model/resolve/main/version-1.0/svi-dance.safetensors?download=true) | Generate long dancing videos conditioned on skeletons. |

	Note: If you want to play with text-to-video (T2V), you can use SVI directly with an image generated by any text-to-image (T2I) model!
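
The checkpoints in the table can also be fetched programmatically. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; the repository ID and file names are taken from the download links above, while the helper names (`checkpoint_path`, `download_checkpoint`) and the `weights` target directory are hypothetical and not part of the SVI codebase.

```python
from pathlib import Path

# Repository ID and version folder, as they appear in the download links above.
REPO_ID = "vita-video-gen/svi-model"
VERSION_DIR = "version-1.0"

# File names copied from the table's download links.
CHECKPOINTS = {
    "SVI-Shot": "svi-shot.safetensors",
    "SVI-Film": "svi-film.safetensors",
    "SVI-Film (Transition)": "svi-film-transitions.safetensors",
    "SVI-Tom&Jerry": "svi-tom.safetensors",
    "SVI-Talk": "svi-talk.safetensors",
    "SVI-Dance": "svi-dance.safetensors",
}


def checkpoint_path(model_name: str) -> str:
    """Return the path of a checkpoint inside the Hub repository."""
    return f"{VERSION_DIR}/{CHECKPOINTS[model_name]}"


def download_checkpoint(model_name: str, local_dir: str = "weights") -> Path:
    """Download one checkpoint and return its local path (needs network access)."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return Path(
        hf_hub_download(
            repo_id=REPO_ID,
            filename=checkpoint_path(model_name),
            local_dir=local_dir,  # hypothetical target directory
        )
    )
```

For example, `download_checkpoint("SVI-Shot")` would place the file under the `weights` directory and return its local path. How the downloaded weights are loaded for inference is not covered here; see the GitHub repository linked above for the actual scripts.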

	## 📝 Citation
	If you find our work helpful for your research, please consider citing our paper. Thank you so much!

	```bibtex
	@article{li2025stable,
	title={Stable Video Infinity: Infinite-Length Video Generation with Error Recycling},
	author={Wuyang Li and Wentao Pan and Po-Chien Luan and Yang Gao and Alexandre Alahi},
	journal={arXiv preprint arXiv:2510.09212},
	year={2025},
	url={https://huggingface.co/papers/2510.09212},
	}
	```