
## 💻 Source Code & Training Details

The complete training pipeline, SFT scripts, and inference code for this model are available on GitHub: 👉 **GitHub Repository: SIREN AI**

# SIREN: Turkish Knowledge-Rich Assistant

SIREN (Platinum Final Assistant) is a state-of-the-art Turkish language model developed using a Hybrid Data Strategy. It is designed not just for casual conversation, but for providing deep, encyclopedic knowledge and performing complex instruction-following tasks.

Built upon the robust Cosmos Platinum foundation, SIREN bridges the gap between a friendly chatbot and a knowledgeable expert.

## 🧠 Model Architecture & Foundation

- **Base Model:** Cosmos Platinum (Custom Turkish Pre-training)
- **Architecture:** GPT-Based Decoder-Only Transformer
- **Tokenizer:** Custom Turkish Unigram Tokenizer (32k Vocab)
- **Training Method:** Supervised Fine-Tuning (SFT) + Knowledge Injection
- **Language:** Turkish (Native Proficiency)

## 📊 Datasets & Training Strategy

SIREN was fine-tuned on a carefully curated mix of ~158,000 high-quality samples, combining three distinct data sources to achieve a balanced personality:

### 1. Instruction Following & Chat (33%)

- **Source:** TFLai/Turkish-Alpaca
- **Volume:** ~52,000 samples
- **Purpose:** To enable natural, human-like interaction, reasoning, and task execution (e.g., writing emails, summarizing text, coding).

### 2. Deep Knowledge Injection (63%)

- **Source:** selmanbaysan/turkish_embedding_model_training_data (filtered)
- **Volume:** 100,000+ selected samples
- **Purpose:** Unlike standard chatbots that may hallucinate on facts, SIREN was exposed to vast amounts of encyclopedic data (history, science, geography) to provide accurate and context-rich answers.

### 3. Identity Alignment (4%)

- **Method:** Synthetic Identity Injection (oversampled)
- **Volume:** 6,000+ weighted samples
- **Purpose:** The model possesses strong self-awareness.
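The three-way mixture described above, with the small identity set oversampled to reach its target weight, can be sketched roughly as follows. This is a minimal illustration, not the repo's actual pipeline; `build_mixture` and `identity_weight` are hypothetical names.

```python
import random

def build_mixture(chat, knowledge, identity, identity_weight=3, seed=0):
    """Combine the three SFT sources into one training list.

    The small synthetic identity set is oversampled by repeating it
    identity_weight times, mirroring the "weighted samples" idea above.
    (identity_weight is an illustrative value, not the actual ratio used.)
    """
    mix = list(chat) + list(knowledge) + list(identity) * identity_weight
    random.Random(seed).shuffle(mix)  # fixed seed for a reproducible order
    return mix
```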
The model recognizes itself as SIREN, understands its purpose, and refuses to hallucinate a different identity.

## 🚀 Capabilities

| Feature | Description |
| --- | --- |
| 🤖 Identity Aware | Consistently identifies itself as SIREN and understands its developer context. |
| 📚 Knowledge Expert | Provides detailed, encyclopedia-style explanations for factual queries (e.g., "Who is Genghis Khan?"). |
| 💬 Fluent Turkish | Masters Turkish morphology, idioms, and cultural nuances better than many multilingual models. |
| 🛠️ Task Oriented | Capable of following complex instructions for creative writing and logical reasoning. |

## 💻 Usage (Python)

You can use the model directly with the transformers library or via the custom app.py interface provided in the Space.

```python
import torch
```

Note: This model uses custom architecture code provided in the repo.

Please refer to the 'app.py' or 'model.py' file for the specific GPT class definition.

Example prompt format:

```
Soru: [Your Question]

Cevap:
```
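A small pair of helpers for wrapping a question in that Soru/Cevap format and stripping the generated text back down to the answer might look like this. These are hypothetical convenience functions for illustration, not part of the repo's code:

```python
def build_prompt(question: str) -> str:
    """Wrap a user question in the Soru/Cevap format the model expects."""
    return f"Soru: {question}\n\nCevap:"

def extract_answer(generated: str) -> str:
    """Return only the text after the last 'Cevap:' marker in the output."""
    return generated.split("Cevap:")[-1].strip()
```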

## ⚙️ Technical Specifications

- **Epochs:** 1 (to prevent overfitting and preserve generalization)
- **Batch Size:** 4 (Gradient Accumulation: 16)
- **Optimizer:** AdamW
- **Learning Rate:** 1e-5
- **Loss Function:** Cross Entropy (Final Loss: ~2.95)

## ⚖️ Disclaimer

This model is released for research and development purposes. While it aims for high accuracy, large language models can produce incorrect information. Users should verify critical information.

Developed by Deniz Kaya & The SIREN Team
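As a rough illustration of the update schedule in the technical specifications (micro-batch 4 with 16 gradient-accumulation steps, AdamW at 1e-5, cross-entropy loss), here is a minimal PyTorch sketch. The tiny linear model and random data are stand-ins for the actual GPT architecture and SFT dataset:

```python
import torch
import torch.nn as nn

# Stand-in model: the real SIREN is a GPT-based decoder-only transformer.
torch.manual_seed(0)
vocab_size, hidden = 32, 16
model = nn.Linear(hidden, vocab_size)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
loss_fn = nn.CrossEntropyLoss()

accum_steps = 16  # effective batch = 4 * 16 = 64
optimizer.zero_grad()
for step in range(accum_steps):
    x = torch.randn(4, hidden)              # micro-batch of 4 (random stand-in data)
    y = torch.randint(0, vocab_size, (4,))  # token targets
    loss = loss_fn(model(x), y) / accum_steps  # scale so gradients average out
    loss.backward()                         # gradients accumulate across micro-batches
optimizer.step()                            # one optimizer update per 16 micro-batches
optimizer.zero_grad()
```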
