|
|
---
license: apache-2.0
language:
- zh
- en
- fr
- de
base_model:
- Qwen/Qwen2.5-1.5B
tags:
- electronic-information
- education
- ascend
- mindspore
- edge-computing
library_name: transformers
pipeline_tag: text-generation
---
|
|
|
|
|
# EE-Model-1.5B (Electronic Information Discipline Model) |
|
|
|
|
|
<div align="center"> |
|
|
|
|
|
[English](README_EN.md) | [简体中文](README.md) |
|
|
|
|
|
**The World's First Lightweight Large Language Model for the Electronic Information Discipline**
|
|
|
|
|
*Empowering Higher Education with Ascend Edge-Cloud Collaborative Architecture* |
|
|
|
|
|
</div> |
|
|
|
|
|
## Model Overview |
|
|
|
|
|
**EE-Model-1.5B**, a lightweight language model designed specifically for the electronic information discipline, is one of the core components of the "Honghu" Electronic Information Professional Engine.
|
|
|
|
|
To address this field's challenges of dense knowledge points, rapid technological iteration, and strong interdisciplinarity, we constructed the **EE-Bench** evaluation benchmark, which contains over 30,000 high-quality data entries. The model was trained and fine-tuned on the complete **Ascend** technology stack (MindSpore + MindIE + CANN).
|
|
|
|
|
EE-Model-1.5B is designed for **edge-side deployment** (such as Ascend OrangePi AIPro), featuring extremely fast inference speed and minimal resource consumption. It can work collaboratively with cloud-based large models (EE-Model-72B) to achieve intelligent routing and efficient load balancing. |
|
|
|
|
|
## Key Features |
|
|
|
|
|
- **Domain Expertise**: Fills the gap in discipline-specific large models for electronic information, covering eight core courses including "Signals and Systems", "Communication Principles", and "Digital Signal Processing". |
|
|
- **Edge Optimization**: The 1.5B parameter model has been optimized through knowledge distillation and pruning, suitable for resource-constrained edge devices, supporting fast response for simple tasks and routing decisions. |
|
|
- **Full-Stack Domestic Solution**: From training (Ascend 910B + MindSpore) to inference (MindIE), built entirely on domestic computing infrastructure, ensuring security and controllability. |
|
|
- **Multilingual Support**: Supports Chinese, English, French, and German, targeting global academic exchange scenarios. |
|
|
|
|
|
## Training Data |
|
|
|
|
|
This model is fine-tuned on **EE-Bench**, the world's first capability evaluation benchmark for the electronic information discipline.
|
|
|
|
|
- **Data Scale**: 30,000+ high-quality professional instruction data entries. |
|
|
- **Coverage**: Over 20 core knowledge point systems in electronic information. |
|
|
- **Question Types**: Nine major question types: programming, single-choice, multiple-choice, calculation, short-answer, true/false, fill-in-the-blank, proof, and comprehensive questions.
|
|
- **Construction Method**: Automated extraction from textbooks and exam papers using MinerU, followed by manual cleaning and expert verification (an illustrative data entry is sketched below).
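
EE-Bench itself is not reproduced in this card, so the entry below is only a sketch of what one instruction-tuning record might look like; the Alpaca-style field names and the question content are illustrative assumptions, not the published schema:

```python
# Hypothetical EE-Bench-style record (Alpaca-style schema assumed;
# field names and content are illustrative, not the published format).
example_entry = {
    "instruction": "A continuous-time LTI system has impulse response "
                   "h(t) = e^(-2t) * u(t). Determine whether the system is "
                   "BIBO stable and justify your answer.",
    "input": "",
    "output": "The system is BIBO stable: the impulse response is absolutely "
              "integrable, since the integral of e^(-2t) over t >= 0 equals "
              "1/2, which is finite.",
    "question_type": "short-answer",
    "knowledge_point": "Signals and Systems / System Stability",
}
```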
|
|
|
|
|
## Model Training |
|
|
|
|
|
EE-Model-1.5B was trained on an **Ascend 910B** cluster, with fine-tuning run through the **LLaMA-Factory** unified training framework.
|
|
|
|
|
- **Base Model**: Qwen2.5-1.5B (Base) / DeepSeek (Distilled) |
|
|
- **Hardware Environment**: Huawei Ascend 910B (NPU) |
|
|
- **Training Framework**: MindSpore, MindSpeed |
|
|
- **Training Methods** (the distillation objective is sketched after this list):
  - SFT (Supervised Fine-Tuning)
  - DPO (Direct Preference Optimization)
  - Knowledge Distillation (from EE-Model-72B)
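
The exact distillation recipe is not published here, but the standard soft-label objective gives the general shape; in the sketch below, the temperature and the mixing weight `alpha` are illustrative assumptions, and the teacher (EE-Model-72B) is assumed frozen:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with a KL term that pulls the student
    toward the teacher's temperature-softened distribution (a sketch, not
    the published recipe)."""
    # Hard-label loss on the next-token targets; -100 marks padded positions.
    ce = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )
    # Soft-label loss against the frozen teacher, scaled by T^2 as usual.
    kl = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    return alpha * ce + (1 - alpha) * kl
```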
|
|
|
|
|
## Inference |
|
|
|
|
|
This model is fully compatible with the **MindIE** inference acceleration framework and can also be run with the Hugging Face `transformers` library for general-purpose inference.
|
|
|
|
|
### Quick Start (Python) |
|
|
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and model (fp16, automatic device placement)
model_path = "HongHuTeam/EE-Model-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype=torch.float16
)

# Example question from the electronic information domain
prompt = "Please explain the physical significance of Maxwell's equations in electromagnetic field theory."
messages = [
    {"role": "system", "content": "You are a professional teaching assistant for electronic information discipline."},
    {"role": "user", "content": prompt}
]

# Build the chat-formatted prompt and run generation
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)

# Strip the prompt tokens so only the newly generated answer is decoded
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
|
|
|
|
|
### Ascend MindIE Deployment |
|
|
|
|
|
For Ascend hardware (such as OrangePi AIPro), it is recommended to use MindIE for high-performance inference: |
|
|
|
|
|
```bash
# MindIE service startup example
# Coming soon: detailed MindIE configuration files and startup scripts will be provided in a future update
```
|
|
|
|
|
## Model Performance |
|
|
|
|
|
EE-Model-1.5B demonstrates excellent performance on **EE-Bench**, especially in handling basic concept Q&A and routing discrimination tasks. |
|
|
|
|
|
| Model | Parameters | EE-Bench Score |
| :--- | :---: | :---: |
| **EE-Model-72B** | 72B | **94.70%** |
| **EE-Model-1.5B** | 1.5B | **68.35%** |
| GPT-4o | - | 71.00% |
| Qwen2.5-72B-Instruct | 72B | 70.12% |
| Qwen2.5-1.5B-Instruct | 1.5B | 45.28% |
|
|
|
|
|
*Note: EE-Model-1.5B significantly outperforms its base model at the same parameter scale (68.35% vs. 45.28%) and approaches GPT-4o. The 1.5B version is aimed primarily at lightweight deployment and fast edge-side response.*
|
|
|
|
|
## Applications |
|
|
|
|
|
Within the **Edge-Cloud Collaborative Intelligent Routing Architecture**, EE-Model-1.5B primarily takes on the following responsibilities (a minimal routing sketch follows the list):
|
|
|
|
|
1. **L3-Routing Decision**: Accurately determines the text complexity of user queries, deciding whether tasks should be processed locally or uploaded to the cloud. |
|
|
2. **Simple Task Quick Response**: Rapidly handles low-compute tasks such as concept queries and terminology explanations. |
|
|
3. **Local Privacy Protection**: Processes sensitive data at the edge, reducing the need for cloud uploads. |
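
The routing implementation itself is not published in this card; the sketch below shows one plausible shape of the L3 decision, where the edge model first labels query complexity and then either answers locally or defers to the cloud. The function names, the classification prompt, and the keyword check are illustrative assumptions:

```python
# Hypothetical edge-side router; names, prompt, and decision rule are
# illustrative, not the published Honghu routing implementation.
def route_query(query: str, edge_generate, cloud_generate) -> str:
    """Answer `query` on the edge if it looks simple, otherwise defer to the cloud.

    `edge_generate` / `cloud_generate` are callables wrapping EE-Model-1.5B
    (local, e.g. the `transformers` snippet above) and EE-Model-72B (cloud).
    """
    # Ask the edge model for a coarse complexity label.
    verdict = edge_generate(
        "Classify the following question as SIMPLE (a definition or concept "
        "lookup) or COMPLEX (a multi-step derivation, design, or proof). "
        f"Answer with one word.\n\nQuestion: {query}"
    )
    if "SIMPLE" in verdict.upper():
        # Low-compute task: answer on-device for a fast, private response.
        return edge_generate(query)
    # Otherwise upload the task to the 72B cloud model.
    return cloud_generate(query)
```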
|
|
|
|
|
## License & Citation |
|
|
|
|
|
This project is licensed under the Apache 2.0 open source license. |
|
|
|
|
|
If you use this model or dataset in your research, please cite as follows: |
|
|
|
|
|
```bibtex
@misc{honghu2025eemodel,
  title={EE-Model: Electronic Information Professional Engine based on Ascend},
  author={Honghu Team},
  year={2025},
  publisher={GitHub/HuggingFace},
  howpublished={\url{https://huggingface.co/HongHuTeam/EE-Model-1.5B}}
}
```
|
|
|
|
|
## Disclaimer |
|
|
|
|
|
Although EE-Model has been extensively fine-tuned on electronic information domain data, it remains a language model and may still produce hallucinations or incorrect information. For critical decisions involving circuit design, safety specifications, and similar matters, please consult professionals or authoritative textbooks.
|
|
|
|
|
--- |
|
|
|
|
|
*Last Update: 2025-12-07* |
|
|
|
|
|
*Created by the Honghu Team (鸿斛战队)* |
|
|
|