---
language: he
license: apache-2.0
datasets:
- custom
tags:
- text-classification
- intent-classification
- hebrew
- nlp
- bert
- customer-service
widget:
- text: "שכחתי את הסיסמה שלי"
  example_title: "Password Reset"
- text: "רוצה לבטל את המנוי"
  example_title: "Cancel Subscription"
- text: "כמה עולה החבילה"
  example_title: "General Question"
- text: "האתר לא עובד"
  example_title: "Technical Support"
---

# Hebrew Intent Classification Model

## Model Description

This model is a fine-tuned BERT model for Hebrew intent classification, specifically designed for customer service scenarios. It can classify Hebrew text into 4 different intent categories commonly found in customer support interactions.

## Supported Intent Classes

1. **ביטול מנוי** (Cancel Subscription) - Requests to cancel or terminate services
2. **שאלה כללית** (General Question) - General inquiries about services, pricing, or account management
3. **שכחת סיסמה** (Password Reset) - Issues related to forgotten passwords or login problems
4. **תמיכה טכנית** (Technical Support) - Technical issues, bugs, or system problems

## Usage

```python
from transformers import pipeline

# Load the model
classifier = pipeline("text-classification", model="Huggingm1r@n/hebrew-intent-classifier")

# Make predictions
result = classifier("שכחתי את הסיסמה שלי")
print(result)
# [{'label': 'שכחת סיסמה', 'score': 0.95}]

# Test other examples
examples = [
    "רוצה לבטל את המנוי",
    "כמה עולה החבילה",
    "האתר לא עובד"
]

for text in examples:
    result = classifier(text)
    print(f"'{text}' -> {result[0]['label']} ({result[0]['score']:.2%})")
```

## Direct Usage with Transformers

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Huggingm1r@n/hebrew-intent-classifier")
model = AutoModelForSequenceClassification.from_pretrained("Huggingm1r@n/hebrew-intent-classifier")

def predict_intent(text):
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)

    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probabilities = torch.softmax(logits, dim=-1)
        predicted_id = torch.argmax(logits, dim=-1).item()

    predicted_label = model.config.id2label[predicted_id]
    confidence = probabilities[0][predicted_id].item()

    return predicted_label, confidence

# Example
intent, confidence = predict_intent("שכחתי את הסיסמה")
print(f"Intent: {intent}, Confidence: {confidence:.2%}")
```

## Training Details

- **Base Model**: bert-base-multilingual-cased
- **Training Data**: 135 Hebrew customer service examples (augmented from 12 original)
- **Data Augmentation**: Manual variations, formal/informal styles, polite forms
- **Performance**: >90% accuracy on validation set

## Example Predictions

| Hebrew Text | Predicted Intent | English Translation |
|------------|------------------|-------------------|
| שכחתי את הסיסמה שלי | שכחת סיסמה | I forgot my password |
| רוצה לבטל את המנוי | ביטול מנוי | Want to cancel subscription |
| כמה עולה החבילה | שאלה כללית | How much does the package cost |
| האתר לא עובד | תמיכה טכנית | The website doesn't work |

## Use Cases

- **Customer Service Chatbots**: Route Hebrew customer queries automatically
- **Support Ticket Classification**: Categorize support requests by intent
- **Voice of Customer Analysis**: Analyze Hebrew customer feedback
- **Automated Response Systems**: Trigger appropriate responses based on intent

## Limitations

- Designed for customer service domain specifically
- Limited to 4 predefined intent classes
- May not work well with very informal Hebrew or slang
- Requires Hebrew text input

## Model Files

- Uses `safetensors` format for secure model storage
- Compatible with latest Transformers library
- Includes comprehensive tokenizer configuration

## Citation

```bibtex
@misc{hebrew-intent-classifier-2025,
  title={Hebrew Intent Classification Model for Customer Service},
  author={Huggingm1r@n},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/Huggingm1r@n/hebrew-intent-classifier}
}
```

## License

This model is released under the Apache 2.0 License.
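
## Appendix: Routing Sketch

The chatbot routing use case listed under Use Cases could look roughly like the sketch below. It only consumes output in the shape the `pipeline` example prints (a list of dicts with `label` and `score`), so it runs without loading the model. The queue names, the fallback queue, and the 0.7 threshold are all hypothetical and are not part of this model card; a real threshold would need tuning on held-out data.

```python
from typing import Dict, List

# Hypothetical support queues, keyed by the model's four intent labels.
QUEUE_BY_INTENT: Dict[str, str] = {
    "ביטול מנוי": "retention",
    "שאלה כללית": "general",
    "שכחת סיסמה": "account-recovery",
    "תמיכה טכנית": "tech-support",
}

FALLBACK_QUEUE = "human-review"  # hypothetical queue for uncertain cases
MIN_CONFIDENCE = 0.7             # illustrative value, not tuned


def route(predictions: List[dict], min_confidence: float = MIN_CONFIDENCE) -> str:
    """Choose a queue from pipeline-style output: [{'label': ..., 'score': ...}]."""
    if not predictions:
        return FALLBACK_QUEUE
    # Take the highest-scoring prediction.
    top = max(predictions, key=lambda p: p["score"])
    # Low-confidence or unrecognized labels go to a human.
    if top["score"] < min_confidence:
        return FALLBACK_QUEUE
    return QUEUE_BY_INTENT.get(top["label"], FALLBACK_QUEUE)
```

With the classifier from the Usage section loaded, this would be called as `route(classifier("האתר לא עובד"))`. Sending low-confidence predictions to a person is one way to soften the limitation around informal Hebrew and slang, where the model may be less reliable.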