---
license: mit
base_model: fredzzp/open-dcoder-0.5B
tags:
- code-generation
- diffusion-model
- masked-diffusion
- code-correction
- python
datasets:
- code
language:
- code
pipeline_tag: text-generation
---

# CDLM-0.5B

## Model Description

**CDLM-0.5B** is a fine-tuned version of [fredzzp/open-dcoder-0.5B](https://huggingface.co/fredzzp/open-dcoder-0.5B), trained with the **error-aware training** mixture objective proposed in our paper on Corrective Diffusion Language Models. The model is designed to improve error-aware confidence estimation and targeted refinement in code generation tasks.

### Key Features

- **Base Model**: [fredzzp/open-dcoder-0.5B](https://huggingface.co/fredzzp/open-dcoder-0.5B) (a masked diffusion language model based on Qwen2)
- **Training Method**: Error-aware training with a mixture objective that explicitly supervises visible incorrect tokens
- **Architecture**: Masked Diffusion Language Model (MDLM)
- **Parameters**: ~0.5B

## Training Details

This model was fine-tuned from `fredzzp/open-dcoder-0.5B` using error-aware training with a mixture objective. For detailed information on the training methodology, please refer to our paper: [Corrective Diffusion Language Models](https://arxiv.org/pdf/2512.15596).
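For intuition, the mixture objective can be sketched on toy numbers: the usual masked-diffusion cross-entropy on masked positions, plus a corrective term on tokens that are visible but incorrect. This is an illustrative sketch only — the weighting `lam`, the helper names, and the exact combination are assumptions, not the paper's formulation:

```python
import math

def cross_entropy(probs, target):
    # Negative log-likelihood of the target token under the model's distribution.
    return -math.log(probs[target])

def mixture_loss(probs, targets, is_masked, is_incorrect, lam=0.5):
    """Toy sketch of an error-aware mixture objective (hypothetical form):
    standard MDLM cross-entropy on masked positions, plus a corrective
    term on visible-but-incorrect tokens, weighted by lam."""
    masked = [cross_entropy(p, t)
              for p, t, m in zip(probs, targets, is_masked) if m]
    visible_bad = [cross_entropy(p, t)
                   for p, t, m, bad in zip(probs, targets, is_masked, is_incorrect)
                   if not m and bad]
    masked_term = sum(masked) / max(1, len(masked))
    corrective_term = sum(visible_bad) / max(1, len(visible_bad))
    return masked_term + lam * corrective_term

# Three positions over a 4-token vocabulary: position 0 is masked,
# position 2 is visible but currently holds an incorrect token.
probs = [[0.7, 0.1, 0.1, 0.1],
         [0.25, 0.25, 0.25, 0.25],
         [0.1, 0.6, 0.2, 0.1]]
targets = [0, 1, 1]
loss = mixture_loss(probs, targets,
                    is_masked=[True, False, False],
                    is_incorrect=[False, False, True])
print(round(loss, 4))  # → 0.6121
```

Setting `lam=0` recovers the plain masked-diffusion loss, so the corrective term is an additive supervision signal on visible errors rather than a replacement objective.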
## Usage

### Installation

```bash
pip install torch transformers
```

### Code Generation

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Shuibai12138/CDLM-0.5B"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).to(device)

# Generate code
prompt = "def fibonacci(n):"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

# Use diffusion generation
outputs = model.diffusion_generate(
    inputs=input_ids,
    max_new_tokens=100,
    steps=16,
    temperature=0.8,
)

prompt_len = input_ids.shape[1]
generated_text = tokenizer.decode(outputs.sequences[0][prompt_len:], skip_special_tokens=True)
print("Generated Code:")
print(generated_text)
```

**Note**: This model uses a custom `diffusion_generate` method, so `trust_remote_code=True` is required when loading the model.

### Iterative Refinement

The model supports iterative refinement for code correction. See the [CDLM repository](https://github.com/zhangshuibai/CDLM) for examples of using the model on code correction tasks.

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{zhang2025correctivediffusionlanguagemodels,
  title={Corrective Diffusion Language Models},
  author={Shuibai Zhang and Fred Zhangzhi Peng and Yiheng Zhang and Jin Pan and Grigorios G.
Chrysos},
  year={2025},
  eprint={2512.15596},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2512.15596},
}
```

## Related Resources

- **Paper**: [Corrective Diffusion Language Models](https://arxiv.org/pdf/2512.15596)
- **Code Repository**: [zhangshuibai/CDLM](https://github.com/zhangshuibai/CDLM)
- **Collection**: [HuggingFace Collection](https://huggingface.co/collections/Shuibai12138/cdlm)
- **Base Model**: [fredzzp/open-dcoder-0.5B](https://huggingface.co/fredzzp/open-dcoder-0.5B)

## License

This model is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Contact

For questions and issues, please contact: **Shuibai Zhang**
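As a closing illustration of the iterative refinement workflow mentioned under Usage, the remask-and-refill idea can be sketched with stand-in functions. The `confidence_fn`/`fill_fn` helpers and the loop below are hypothetical toys, not the model's actual API — the real CDLM pipeline is in the repository linked above:

```python
MASK = "<mask>"

def refine(tokens, confidence_fn, fill_fn, threshold=0.5, max_rounds=3):
    """Toy remask-and-refill loop: each round, tokens whose confidence
    falls below `threshold` are re-masked and re-predicted. A stand-in
    for the model's diffusion-based refinement, not its actual API."""
    for _ in range(max_rounds):
        scores = [confidence_fn(tok) for tok in tokens]
        if all(s >= threshold for s in scores):
            break  # every token is trusted; stop refining
        masked = [MASK if s < threshold else tok
                  for tok, s in zip(tokens, scores)]
        tokens = fill_fn(masked)
    return tokens

# Toy scorer flags the misspelled identifier; toy filler fixes masked slots.
confidence = lambda tok: 0.2 if tok == "pritn" else 0.9
fill = lambda toks: ["print" if t == MASK else t for t in toks]
print(refine(["pritn", "(", "x", ")"], confidence, fill))
# → ['print', '(', 'x', ')']
```

In the real setting, the confidence scores would come from the model's error-aware token probabilities and the refill step from `diffusion_generate`; the loop structure is what carries over.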