lhl616/Qwen3-4B-Base-axon-error-aware-128-8-dense-0.5-0.8-split-new-step2 4B • Updated 17 days ago • 15
lhl616/Qwen3-4B-Base-axon-error-aware-128-8-dense-0.5-0.8-normal-fixed-denominator 4B • Updated 17 days ago • 18
lhl616/Llama-3.2-3B-Instruct-axon-error-aware-128-8-dense-std-relu-0.5-0.8-start 3B • Updated 17 days ago • 16
lhl616/Llama-3.2-3B-Instruct-axon-error-aware-128-8-dense-std-0.5-0.8-start 3B • Updated 17 days ago • 17
lhl616/Llama-3.2-3B-Instruct-axon-error-aware-128-8-dense-nstd-0.5-0.8-start 3B • Updated 17 days ago • 15
lhl616/Llama-3.2-3B-Instruct-axon-error-aware-128-8-dense-nstd-0.5-0.8-new-relu 3B • Updated 17 days ago • 16