|
|
---
license: apache-2.0
language:
- zh
- en
- fr
- de
base_model:
- Qwen/Qwen2.5-1.5B
tags:
- electronic-information
- education
- ascend
- mindspore
- edge-computing
library_name: transformers
pipeline_tag: text-generation
---
|
|
|
|
|
# EE-Model-1.5B (Electronic Information Discipline Model) |
|
|
|
|
|
<div align="center"> |
|
|
|
|
|
[English](README_EN.md) | [简体中文](README.md) |
|
|
|
|
|
**The World's First Lightweight Large Language Model for the Electronic Information Discipline**
|
|
|
|
|
*Empowering Higher Education with Ascend Edge-Cloud Collaborative Architecture* |
|
|
|
|
|
</div> |
|
|
|
|
|
## Model Overview |
|
|
|
|
|
**EE-Model-1.5B**, a lightweight language model designed specifically for the electronic information discipline, is one of the core components of the "Honghu" Electronic Information Professional Engine.
|
|
|
|
|
To address this field's challenges of dense knowledge points, rapid technological iteration, and strong interdisciplinarity, we constructed the **EE-Bench** evaluation benchmark, which contains over 30,000 high-quality data entries. The model was trained and fine-tuned on the complete **Ascend** technology stack (MindSpore + MindIE + CANN).
|
|
|
|
|
EE-Model-1.5B is designed for **edge-side deployment** (such as Ascend OrangePi AIPro), featuring extremely fast inference speed and minimal resource consumption. It can work collaboratively with cloud-based large models (EE-Model-72B) to achieve intelligent routing and efficient load balancing. |
|
|
|
|
|
## Key Features |
|
|
|
|
|
- **Domain Expertise**: Fills the gap in discipline-specific large models for electronic information, covering eight core courses including "Signals and Systems", "Communication Principles", and "Digital Signal Processing". |
|
|
- **Edge Optimization**: The 1.5B parameter model has been optimized through knowledge distillation and pruning, suitable for resource-constrained edge devices, supporting fast response for simple tasks and routing decisions. |
|
|
- **Full-Stack Domestic Solution**: From training (Ascend 910B + MindSpore) to inference (MindIE), built entirely on domestic computing infrastructure, ensuring security and controllability. |
|
|
- **Multilingual Support**: Supports Chinese, English, French, and German, targeting global academic exchange scenarios. |
|
|
|
|
|
## Training Data |
|
|
|
|
|
This model is fine-tuned on **EE-Bench**, the world's first capability evaluation benchmark for the electronic information discipline.
|
|
|
|
|
- **Data Scale**: 30,000+ high-quality professional instruction data entries. |
|
|
- **Coverage**: Over 20 core knowledge point systems in electronic information. |
|
|
- **Question Types**: Nine major question types: programming, single-choice, multiple-choice, calculation, short-answer, true/false, fill-in-the-blank, proof, and comprehensive questions.
|
|
- **Construction Method**: Automated extraction from textbooks and exam papers using MinerU, followed by manual cleaning and expert verification (an illustrative data entry is sketched below).
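
EE-Bench itself is not reproduced in this card, so the entry below is only a sketch of what one instruction-tuning record might look like; the Alpaca-style field names and the question content are illustrative assumptions, not the published schema:

```python
# Hypothetical EE-Bench-style record (Alpaca-style schema assumed;
# field names and content are illustrative, not the published format).
example_entry = {
    "instruction": "A continuous-time LTI system has impulse response "
                   "h(t) = e^(-2t) * u(t). Determine whether the system is "
                   "BIBO stable and justify your answer.",
    "input": "",
    "output": "The system is BIBO stable: the impulse response is absolutely "
              "integrable, since the integral of e^(-2t) over t >= 0 equals "
              "1/2, which is finite.",
    "question_type": "short-answer",
    "knowledge_point": "Signals and Systems / System Stability",
}
```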
|
|
|
|
|
## Model Training |
|
|
|
|
|
EE-Model-1.5B was trained on an **Ascend 910B** cluster, with fine-tuning run through the **LLaMA-Factory** unified training framework.
|
|
|
|
|
- **Base Model**: Qwen2.5-1.5B (Base) / DeepSeek (Distilled) |
|
|
- **Hardware Environment**: Huawei Ascend 910B (NPU) |
|
|
- **Training Framework**: MindSpore, MindSpeed |
|
|
- **Training Methods** (the distillation objective is sketched after this list):
  - SFT (Supervised Fine-Tuning)
  - DPO (Direct Preference Optimization)
  - Knowledge Distillation (from EE-Model-72B)
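
The exact distillation recipe is not published here, but the standard soft-label objective gives the general shape; in the sketch below, the temperature and the mixing weight `alpha` are illustrative assumptions, and the teacher (EE-Model-72B) is assumed frozen:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with a KL term that pulls the student
    toward the teacher's temperature-softened distribution (a sketch, not
    the published recipe)."""
    # Hard-label loss on the next-token targets; -100 marks padded positions.
    ce = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )
    # Soft-label loss against the frozen teacher, scaled by T^2 as usual.
    kl = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    return alpha * ce + (1 - alpha) * kl
```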
|
|
|
|
|
## Inference |
|
|
|
|
|
This model is fully compatible with the **MindIE** inference acceleration framework and can also be run with the Hugging Face `transformers` library for general-purpose inference.
|
|
|
|
|
### Quick Start (Python) |
|
|
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and model (fp16, automatic device placement)
model_path = "HongHuTeam/EE-Model-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype=torch.float16
)

# Example question from the electronic information domain
prompt = "Please explain the physical significance of Maxwell's equations in electromagnetic field theory."
messages = [
    {"role": "system", "content": "You are a professional teaching assistant for electronic information discipline."},
    {"role": "user", "content": prompt}
]

# Build the chat-formatted prompt and run generation
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)

# Strip the prompt tokens so only the newly generated answer is decoded
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
|
|
|
|
|
### Ascend MindIE Deployment |
|
|
|
|
|
For Ascend hardware (such as OrangePi AIPro), it is recommended to use MindIE for high-performance inference: |
|
|
|
|
|
```bash
# MindIE service startup example
# Coming soon: detailed MindIE configuration files and startup scripts will be provided in a future update
```
|
|
|
|
|
## Model Performance |
|
|
|
|
|
EE-Model-1.5B demonstrates excellent performance on **EE-Bench**, especially in handling basic concept Q&A and routing discrimination tasks. |
|
|
|
|
|
| Model | Parameters | EE-Bench Score |
| :--- | :---: | :---: |
| **EE-Model-72B** | 72B | **94.70%** |
| **EE-Model-1.5B** | 1.5B | **68.35%** |
| GPT-4o | - | 71.00% |
| Qwen2.5-72B-Instruct | 72B | 70.12% |
| Qwen2.5-1.5B-Instruct | 1.5B | 45.28% |
|
|
|
|
|
*Note: EE-Model-1.5B significantly outperforms its base model at the same parameter scale (68.35% vs. 45.28%) and approaches GPT-4o. The 1.5B version is aimed primarily at lightweight deployment and fast edge-side response.*
|
|
|
|
|
## Applications |
|
|
|
|
|
Within the **Edge-Cloud Collaborative Intelligent Routing Architecture**, EE-Model-1.5B primarily takes on the following responsibilities (a minimal routing sketch follows the list):
|
|
|
|
|
1. **L3-Routing Decision**: Accurately determines the text complexity of user queries, deciding whether tasks should be processed locally or uploaded to the cloud. |
|
|
2. **Simple Task Quick Response**: Rapidly handles low-compute tasks such as concept queries and terminology explanations. |
|
|
3. **Local Privacy Protection**: Processes sensitive data at the edge, reducing the need for cloud uploads. |
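
The routing implementation itself is not published in this card; the sketch below shows one plausible shape of the L3 decision, where the edge model first labels query complexity and then either answers locally or defers to the cloud. The function names, the classification prompt, and the keyword check are illustrative assumptions:

```python
# Hypothetical edge-side router; names, prompt, and decision rule are
# illustrative, not the published Honghu routing implementation.
def route_query(query: str, edge_generate, cloud_generate) -> str:
    """Answer `query` on the edge if it looks simple, otherwise defer to the cloud.

    `edge_generate` / `cloud_generate` are callables wrapping EE-Model-1.5B
    (local, e.g. the `transformers` snippet above) and EE-Model-72B (cloud).
    """
    # Ask the edge model for a coarse complexity label.
    verdict = edge_generate(
        "Classify the following question as SIMPLE (a definition or concept "
        "lookup) or COMPLEX (a multi-step derivation, design, or proof). "
        f"Answer with one word.\n\nQuestion: {query}"
    )
    if "SIMPLE" in verdict.upper():
        # Low-compute task: answer on-device for a fast, private response.
        return edge_generate(query)
    # Otherwise upload the task to the 72B cloud model.
    return cloud_generate(query)
```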
|
|
|
|
|
## License & Citation |
|
|
|
|
|
This project is licensed under the Apache 2.0 open source license. |
|
|
|
|
|
If you use this model or dataset in your research, please cite as follows: |
|
|
|
|
|
```bibtex
@misc{honghu2025eemodel,
  title={EE-Model: Electronic Information Professional Engine based on Ascend},
  author={Honghu Team},
  year={2025},
  publisher={GitHub/HuggingFace},
  howpublished={\url{https://huggingface.co/HongHuTeam/EE-Model-1.5B}}
}
```
|
|
|
|
|
## Disclaimer |
|
|
|
|
|
Although EE-Model has been extensively fine-tuned on electronic information domain data, it remains a language model and may still produce hallucinations or incorrect information. For critical decisions involving circuit design, safety specifications, and similar matters, please consult professionals or authoritative textbooks.
|
|
|
|
|
--- |
|
|
|
|
|
*Last Update: 2025-12-07* |
|
|
|
|
|
*Created by the Honghu Team (鸿斛战队)* |
|
|
|