---
base_model: Kwaipilot/KAT-Dev
tags:
- rust
- Hyperswitch
- LoRA
- CPT
- Causal-LM
- code-generation
- phased-training
pipeline_tag: text-generation
language:
- en
datasets:
- AdityaNarayan/HS-Repo-Curriculum-Learning
---

# KAT-Dev-CPT-LoRA-HS-32K-maxToken-v1

A specialized **LoRA fine-tuned adapter** built on top of **Kwaipilot/KAT-Dev (32B)**, designed for deep understanding of the **Hyperswitch** (Rust) payment orchestration codebase.

The adapter was trained with a **3-phase curriculum pipeline** that progressively deepens the model's grasp of Rust patterns, PR changes, repository structure, and payment processing logic.

---

# 🚀 Overview

This LoRA adapter was trained with a **phased CPT (continual pre-training) strategy**:

### **Phase 1 — Foundation**
Learns core repository structure, Rust syntax, basic modules, and Hyperswitch architectural patterns.

### **Phase 2 — Evolution**
Exposes the model to progressively complex components, multi-file interactions, workflows, and feature evolution.

### **Phase 3 — PR Mastery**
Specializes in real PR changes, diffs, refactors, and reasoning across multi-module changes.

The final result is a **high-signal, Rust-aware, Hyperswitch-specialized LoRA adapter** ideal for:

- Code generation
- Code explanation
- PR reasoning
- Diff summarization
- Documentation generation
- Rust workflow automation

---

# 🔧 Training Details

## LoRA Configuration

```yaml
r: 128
alpha: 256
dropout: 0.05
target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - up_proj
  - down_proj
```

### Hyperparameters

```yaml
learning_rate: 1e-4
micro_batch_size: 1
gradient_accumulation_steps: 6
sequence_length: 32768
train_val_split: 95/5
precision: bf16
```

### Hardware

```yaml
num_gpus: 8
gpu_name: NVIDIA H200
```

---

# 📊 Phased Training Metrics

Eval perplexity below is `exp(eval_loss)`.

## **Phase 1 — Foundation**

**Dataset:** `phase1_foundation.jsonl`
**Epochs:** 3

| Metric | Train | Eval |
|--------|-------|------|
| Loss | 0.2918 | 0.2434 |
| Entropy | 0.2052 | 0.2355 |
| Mean Token Accuracy | 0.9505 | 0.9331 |
| Perplexity | — | **1.2756** |
| Tokens | 8.88M | 8.88M |

---

## **Phase 2 — Evolution**

**Dataset:** `phase2_evolution.jsonl`
**Epochs:** 2

| Metric | Train | Eval |
|--------|-------|------|
| Loss | 0.7255 | 0.7661 |
| Entropy | 0.5080 | 0.7210 |
| Mean Token Accuracy | 0.8641 | 0.8110 |
| Perplexity | — | **2.1514** |
| Tokens | 23.48M | 23.48M |

---

## **Phase 3 — PR Mastery**

**Dataset:** `phase3_pr_mastery.jsonl`
**Epochs:** 2

| Metric | Train | Eval |
|--------|-------|------|
| Loss | 0.5378 | 0.5606 |
| Entropy | 0.4781 | 0.5254 |
| Mean Token Accuracy | 0.8749 | 0.8569 |
| Perplexity | — | **1.7516** |
| Tokens | 15.45M | 15.45M |

---

# 📈 Summary Across All Phases

Loss and perplexity rise after Phase 1 because each phase introduces progressively harder data, so cross-phase values are not directly comparable.

```yaml
total_epochs: 7
total_phases: 3
initial_train_loss: 0.2918
final_train_loss: 0.5378
initial_eval_loss: 0.2434
final_eval_loss: 0.5606
initial_perplexity: 1.2756
final_perplexity: 1.7516
```

---

# 🙏 Acknowledgments

- **Kwaipilot Team** — For the excellent KAT-Dev 32B base model
- **Juspay / Hyperswitch** — For the rich open-source Rust codebase
- **Hugging Face** — For PEFT, TRL, and Transformers

---

# 📚 Citation

```bibtex
@misc{katdev-hyperswitch-phasedlora-2025,
  title={KAT-Dev-CPT-LoRA-HS-32K-maxToken-v1},
  author={Aditya Narayan},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/AdityaNarayan/KAT-Dev-CPT-LoRA-HS-32K-maxToken-v1}
}
```
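
---

# 💻 Usage

Load the base model and apply this adapter with `transformers` and `peft`. A minimal sketch, assuming a standard PEFT adapter layout on the Hub; the prompt and generation settings are illustrative, not prescribed by this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "Kwaipilot/KAT-Dev"
ADAPTER_ID = "AdityaNarayan/KAT-Dev-CPT-LoRA-HS-32K-maxToken-v1"

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_ID,
    torch_dtype=torch.bfloat16,  # matches the bf16 training precision
    device_map="auto",
)

# Attach the LoRA adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
model.eval()

# Illustrative prompt: any Hyperswitch/Rust question or code context works.
prompt = "Explain how connector integrations are organized in the Hyperswitch codebase."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

For lower-latency serving, `model.merge_and_unload()` folds the LoRA weights into the base model so no adapter indirection remains at inference time.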
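
## Reproducing the LoRA Configuration

The LoRA configuration listed under Training Details maps onto `peft.LoraConfig` roughly as follows. A sketch only: `bias` and `task_type` are assumptions, since the card does not state them:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

lora_config = LoraConfig(
    r=128,                  # rank, as listed in the card
    lora_alpha=256,         # alpha
    lora_dropout=0.05,      # dropout
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",            # assumption: not specified in the card
    task_type="CAUSAL_LM",  # assumption: causal-LM CPT objective
)

base_model = AutoModelForCausalLM.from_pretrained("Kwaipilot/KAT-Dev")
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()  # sanity-check the trainable adapter size
```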