HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-1e-4-gamma-share-experts-2nd-epoch-lr-5e-6 Text Generation • 16B • Updated Sep 19 • 11
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-1e-4-gamma-share-experts-2nd-epoch Text Generation • 16B • Updated Sep 19 • 13 • 1
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-remove-aux-only Text Generation • 126k • Updated Sep 18 • 8
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-1e-3-gamma Text Generation • 126k • Updated Sep 18 • 11
HectorHe/DeepSeek-V2-Lite-aux-free-sft-math7k-1epoch-1e-4-gamma Text Generation • 16B • Updated Sep 17 • 8
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-aux-free-sft-math7k-1epoch-bs4-4.51 Text Generation • 16B • Updated Sep 17 • 17
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-aux-free-sft-math7k-1epoch-bs4 Text Generation • 126k • Updated Sep 17 • 8
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-3-gamma-part2-run1 Text Generation • 14B • Updated Sep 17 • 9
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-3-gamma-1epoch Text Generation • 14B • Updated Sep 17 • 8
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-4-gamma-3epoch Text Generation • 14B • Updated Sep 15 • 6
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-remov-aux-only Text Generation • 14B • Updated Sep 15 • 10 • 1
HectorHe/Deepseek-V2-13B-Math7K-Expert-Enhance-Subset-Expert-MoE-32-experts Text Generation • 16B • Updated Aug 18 • 9 • 1