---
language:
- en
- zh
pipeline_tag: text-to-audio
library_name: tencent-song-generation
---

# SongGeneration

<p align="center"><img src="img/logo.jpg" width="40%"></p>
<p align="center">
    <a href="https://levo-demo.github.io/">Demo</a> &nbsp;|&nbsp; <a href="https://arxiv.org/abs/2506.07520">Paper</a>  &nbsp;|&nbsp; <a href="https://github.com/tencent-ailab/songgeneration">Code</a>  &nbsp;|&nbsp; <a href="https://huggingface.co/spaces/tencent/SongGeneration">Space Demo</a>
</p>


This is the official weight repository for LeVo: High-Quality Song Generation with Multi-Preference Alignment. It provides the SongGeneration model, inference scripts, and the checkpoint trained on the Million Song Dataset.

## Model Versions

| Model                    | Max Length |       Language       | GPU Memory | RTF(H20) | Download Link                                                |
| ------------------------ | :--------: | :------------------: | :--------: | :------: | ------------------------------------------------------------ |
| SongGeneration-base      |   2m30s    |          zh          |  10G/16G   |   0.67   | [Huggingface](https://huggingface.co/tencent/SongGeneration/tree/main/ckpt/songgeneration_base) |
| SongGeneration-base-new  |   2m30s    |        zh, en        |  10G/16G   |   0.67   | [Huggingface](https://huggingface.co/lglg666/SongGeneration-base-new) |
| SongGeneration-base-full |   4m30s    |        zh, en        |  12G/18G   |   0.69   | [Huggingface](https://huggingface.co/lglg666/SongGeneration-base-full) |
| SongGeneration-large     |   4m30s    |        zh, en        |  22G/28G   |   0.82   | [Huggingface](https://huggingface.co/lglg666/SongGeneration-large) |
| SongGeneration-v2-large  |   4m30s    | zh, en, es, ja, etc. |  22G/28G   |   0.82   | [Huggingface](https://huggingface.co/lglg666/SongGeneration-v2-large) |
| SongGeneration-v2-medium |   4m30s    | zh, en, es, ja, etc. |  12G/18G   |   0.69   | Coming soon                                                  |
| SongGeneration-v2-fast   |   4m30s    | zh, en, es, ja, etc. |     -      |    -     | Coming soon                                                  |
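
The released rows of the table can be captured in a small lookup to fetch a checkpoint and get a rough idea of run time. The following is a minimal sketch, not part of the official inference scripts. It assumes that RTF means generation time divided by audio duration (the table does not define it), and that `huggingface_hub` is installed for the download step. `ModelInfo`, `MODELS`, `estimate_generation_seconds`, and `download_checkpoint` are hypothetical names introduced for illustration; the repo IDs, the `ckpt/songgeneration_base` path, and the numbers come from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelInfo:
    repo_id: str              # Hugging Face repo from the Download Link column
    subfolder: Optional[str]  # path inside the repo when weights sit in a subfolder
    max_seconds: int          # Max Length column, converted to seconds
    rtf_h20: float            # RTF(H20) column


# Only rows with a published download link are listed here.
MODELS = {
    "SongGeneration-base": ModelInfo(
        "tencent/SongGeneration", "ckpt/songgeneration_base", 150, 0.67),
    "SongGeneration-base-new": ModelInfo(
        "lglg666/SongGeneration-base-new", None, 150, 0.67),
    "SongGeneration-base-full": ModelInfo(
        "lglg666/SongGeneration-base-full", None, 270, 0.69),
    "SongGeneration-large": ModelInfo(
        "lglg666/SongGeneration-large", None, 270, 0.82),
    "SongGeneration-v2-large": ModelInfo(
        "lglg666/SongGeneration-v2-large", None, 270, 0.82),
}


def estimate_generation_seconds(model: str, audio_seconds: float) -> float:
    """Rough H20 wall-clock estimate, ASSUMING RTF = generation time / audio time."""
    info = MODELS[model]
    if not 0 < audio_seconds <= info.max_seconds:
        raise ValueError(
            f"{model} supports clips up to {info.max_seconds}s, got {audio_seconds}s")
    return info.rtf_h20 * audio_seconds


def download_checkpoint(model: str, local_dir: str) -> str:
    """Download the files for one model and return the local folder path."""
    # Imported lazily so the lookup helpers work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    info = MODELS[model]
    # Restrict the download to the subfolder when the repo hosts several things.
    patterns = [f"{info.subfolder}/*"] if info.subfolder else None
    return snapshot_download(
        repo_id=info.repo_id, allow_patterns=patterns, local_dir=local_dir)


if __name__ == "__main__":
    name = "SongGeneration-v2-large"
    seconds = estimate_generation_seconds(name, 270)
    print(f"{name}: about {seconds:.0f}s to generate a 4m30s song on an H20")
    # Uncomment to fetch the weights (requires network access and disk space):
    # print(download_checkpoint(name, "./ckpt/songgeneration_v2_large"))
```

The estimate is only as good as the RTF assumption, and actual timings depend on the GPU and settings. How the downloaded folder should be wired into the inference scripts is described in the code repository, not here.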

## Overview

🚀 We introduce LeVo 2 (SongGeneration 2), an open-source music foundation model designed to shatter the ceiling of open-source AI music by achieving true commercial-grade generation.

Through a large-scale, rigorous expert evaluation (20 industry professionals, 6 core dimensions, 100 songs per model), LeVo 2 has proven its superiority:

- ๐Ÿ† Commercial-Grade Musicality: Comprehensively outperforms all open-source baselines across Overall Quality, Melody, Arrangement, Sound Quality, and Structure. Its subjective generation quality successfully rivals top-tier closed-source commercial systems (e.g., MiniMax 2.5).
- ๐ŸŽฏ Precise Lyric Accuracy: Achieves an outstanding Phoneme Error Rate (PER) of 8.55%, effectively solving the lyrical hallucination problem. This remarkable accuracy significantly outperforms top commercial models like Suno v5 (12.4%) and Mureka v8 (9.96%).
- ๐ŸŽ›๏ธ Exceptional Controllability: Highly responsive to multi-modal instructions, including text descriptions and audio prompts, allowing for precise control over the generated music.

📊 *For detailed experimental setups and comprehensive metrics, please refer to the [Evaluation Performance](#Evaluation-Performance) section below or our upcoming technical report.*

<img src="img/output.png" alt="img" style="zoom:100%;" />

## License

The code and weights in this repository are released under the terms of the [LICENSE](LICENSE) file.