tokyotech-llm/GPT-OSS-Swallow-20B-RL-v0.1
Text Generation • 21B • Updated • 821 • 20
tokyotech-llm/GPT-OSS-Swallow-120B-RL-v0.1
Text Generation • 117B • Updated • 2.53k • 16
tokyotech-llm/GPT-OSS-Swallow-20B-SFT-v0.1
Text Generation • 21B • Updated • 97 • 5
tokyotech-llm/GPT-OSS-Swallow-120B-SFT-v0.1
Text Generation • 117B • Updated • 79 • 2
Organization Card
Swallow LLM
Research and development of large language models, conducted mainly by members of the Okazaki Laboratory and Yokota Laboratory at the Institute of Science Tokyo (formerly Tokyo Institute of Technology).
- From Okazaki Laboratory, Institute of Science Tokyo, the following members:
  - Naoaki Okazaki
  - Sakae Mizuki
  - Youmi Ma
  - Koki Maeda
  - Masanari Ohi
  - Koshiro Saito
  - Tatsuya Ichinose
  - Naoya Matsushita
  - Sora Miyamoto
  - Nguyen Tien Dung
  - Yuta Katayama
  - Takaya Hiratsuka
- From YOKOTA Laboratory, Institute of Science Tokyo, the following members:
  - Rio Yokota
  - Kazuki Fujii
  - Taishi Nakamura
  - Shigeki Ishida
  - Masaki Kawamura
  - Yukito Tajima
  - Daisuke Nohara
- From Artificial Intelligence Research Center, AIST, Japan, the following members:

tokyotech-llm/Qwen3-Swallow-8B-RL-v0.2
Text Generation • 8B • Updated • 2.42k • 10
tokyotech-llm/Qwen3-Swallow-30B-A3B-RL-v0.2
Text Generation • 31B • Updated • 210 • 7
tokyotech-llm/Qwen3-Swallow-32B-RL-v0.2
Text Generation • 33B • Updated • 3.88k • 2
tokyotech-llm/Qwen3-Swallow-8B-SFT-v0.2
Text Generation • 8B • Updated • 1.69k • 5
models 138
tokyotech-llm/Medical-GPT-OSS-Swallow-120B
Text Generation • 117B • Updated
tokyotech-llm/Medical-Qwen3-Swallow-8B
Text Generation • 8B • Updated
tokyotech-llm/Medical-Qwen3-Swallow-30B-A3B
Text Generation • 31B • Updated
tokyotech-llm/Medical-Qwen3-Swallow-32B
Text Generation • 33B • Updated
tokyotech-llm/GPT-OSS-Swallow-20B-RL-v0.1-MXFP4
Text Generation • 22B • Updated • 179
tokyotech-llm/GPT-OSS-Swallow-120B-RL-v0.1-MXFP4
Text Generation • 120B • Updated • 410 • 1
tokyotech-llm/Qwen3-Swallow-8B-SFT-v0.2
Text Generation • 8B • Updated • 1.69k • 5
tokyotech-llm/Qwen3-Swallow-32B-CPT-v0.2
Text Generation • 33B • Updated • 165 • 2
tokyotech-llm/Qwen3-Swallow-30B-A3B-CPT-v0.2
Text Generation • 31B • Updated • 157
tokyotech-llm/Qwen3-Swallow-8B-CPT-v0.2
Text Generation • 8B • Updated • 146 • 1
datasets 19
tokyotech-llm/swallow-math
Viewer • Updated • 4.33M • 965 • 48
tokyotech-llm/swallow-code
Viewer • Updated • 129M • 959 • 66
tokyotech-llm/Swallow-Nemotron-Post-Training-Dataset-v1
Viewer • Updated • 8.84M • 499 • 6
tokyotech-llm/lmsys-chat-1m-synth
Updated • 519 • 21
tokyotech-llm/s1-test-time-scaling-synth-public
Viewer • Updated • 59k • 51
tokyotech-llm/swallow-code-v2
Viewer • Updated • 147M • 47k • 38
tokyotech-llm/swallow-math-v2
Viewer • Updated • 17.4M • 15.1k • 31
tokyotech-llm/swallow_english_mt_bench
Viewer • Updated • 80 • 76
tokyotech-llm/MMLU-ProX-English
Updated • 162
tokyotech-llm/MMLU-Pro-English
Updated • 296