HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-sft-math7k Text Generation • 16B • Updated Aug 17 • 5 • 2
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-sft-math14k Text Generation • 16B • Updated Aug 17 • 3 • 1
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-sft-nemotron-code Text Generation • 126k • Updated Aug 17 • 7 • 1
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-diff-info-Distill-mixture-new 16B • Updated Jul 23 • 7
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-diff-info-Distill-forward-kl-new 16B • Updated Jul 22 • 8
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-Distill-6-experts-test-may 3B • Updated Jul 14 • 5
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-diff-info-Distill-mixture 16B • Updated Jul 10 • 8
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-diff-info-Distill-forward-kl 16B • Updated Jul 10 • 7
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-diff-info-Distill-token-specific 16B • Updated Jul 10 • 7
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-diff-info-Distill-token-specific-scale 16B • Updated Jul 10 • 7
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-Distill-6-experts-token-specific 3B • Updated Jul 1 • 7
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-Distill-6-experts-token-specific-3-scaled 3B • Updated Jul 1 • 8
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-Math10K-Distill-6-experts-test-token-specific-5-epoch 3B • Updated Jun 23 • 14