env CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=0 python3 main.py \
--model_name facebook/opt-125M \
--device 0 \
--group_size 128 \
--bits 4 \
--seqlen 2048 \
--iters 1000 \
--use_quant_input \
--disable_eval \
--n_blocks 22 \
--sym \
--deployment_device 'gpu' \
--disable_low_gpu_mem_usage \
--output_dir "/monster/data/zx/opt-125M-quant_lm_head_false"

Quantized model path:

/monster/data/zx/opt-125M-quant_lm_head_false/opt-125m-autoround-w4g128-gpu
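The directory name above appears to be composed from the flags passed in the command: the lowercased model basename, then `w<bits>g<group_size>`, then the deployment device. The sketch below rebuilds the path under that assumption. `quantized_model_dir` is a hypothetical helper written for illustration, not part of AutoRound, and the naming pattern is inferred from this single example only.

```python
from pathlib import PurePosixPath


def quantized_model_dir(output_dir: str, model_name: str, bits: int,
                        group_size: int, deployment_device: str) -> str:
    """Hypothetical helper: rebuild the export path from the CLI flags."""
    # Assumption (inferred from the one path shown in this card): the suffix is
    # "<lowercased model basename>-autoround-w<bits>g<group_size>-<device>".
    base = model_name.split("/")[-1].lower()
    suffix = f"{base}-autoround-w{bits}g{group_size}-{deployment_device}"
    return str(PurePosixPath(output_dir) / suffix)


path = quantized_model_dir(
    "/monster/data/zx/opt-125M-quant_lm_head_false",
    "facebook/opt-125M",
    bits=4,
    group_size=128,
    deployment_device="gpu",
)
print(path)
```

With the values from the command, this yields the same path as listed above. Other flag combinations may produce a different layout, so check the actual output directory rather than relying on this pattern.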

Safetensors · Model size: 0.2B params · Tensor types: I32, F16