Instructions for using zpm/Llama-3.1-PersianQA with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use zpm/Llama-3.1-PersianQA with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("question-answering", model="zpm/Llama-3.1-PersianQA")

# Load the model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("zpm/Llama-3.1-PersianQA", dtype="auto")
```
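Because the checkpoint is a Llama-style causal language model, you can also load it for free-form generation. A minimal sketch, assuming the repo contains standard causal-LM weights and a tokenizer; the prompt format below is illustrative, not prescribed by the model card:

```python
# Hedged sketch: generative Persian QA with a causal LM head.
# Assumes standard Llama-style weights; the prompt format is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("zpm/Llama-3.1-PersianQA")
model = AutoModelForCausalLM.from_pretrained("zpm/Llama-3.1-PersianQA", dtype="auto")

# Context + question in a single Persian prompt
prompt = "متن: شرکت فولاد مبارکۀ اصفهان بزرگترین مجتمع تولید فولاد در خاورمیانه است.\nسؤال: شرکت فولاد مبارکه در کجا واقع شده است؟\nپاسخ:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```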
- llama-cpp-python
How to use zpm/Llama-3.1-PersianQA with llama-cpp-python:
```python
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="zpm/Llama-3.1-PersianQA",
    filename="unsloth.F16.gguf",
)
```
```python
output = llm(
    "Once upon a time,",
    max_tokens=512,
    echo=True
)
print(output)
```
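For QA-style use, a chat-format request is usually more reliable than raw completion. A minimal sketch using llama-cpp-python's create_chat_completion, reusing the llm object loaded above; the Persian prompts are illustrative:

```python
# Chat-style Persian QA; `llm` is the Llama instance loaded above.
# The system/user prompts are illustrative examples, not from the model card.
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "بر اساس متن داده‌شده به فارسی پاسخ بده."},
        {"role": "user", "content": "شرکت فولاد مبارکه در کجا واقع شده است؟"},
    ],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```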
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use zpm/Llama-3.1-PersianQA with llama.cpp:
Install from brew
```bash
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf zpm/Llama-3.1-PersianQA:F16

# Run inference directly in the terminal:
llama-cli -hf zpm/Llama-3.1-PersianQA:F16
```
Install from WinGet (Windows)
```bash
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf zpm/Llama-3.1-PersianQA:F16

# Run inference directly in the terminal:
llama-cli -hf zpm/Llama-3.1-PersianQA:F16
```
Use pre-built binary
```bash
# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf zpm/Llama-3.1-PersianQA:F16

# Run inference directly in the terminal:
./llama-cli -hf zpm/Llama-3.1-PersianQA:F16
```
Build from source code
```bash
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf zpm/Llama-3.1-PersianQA:F16

# Run inference directly in the terminal:
./build/bin/llama-cli -hf zpm/Llama-3.1-PersianQA:F16
```
Use Docker
```bash
docker model run hf.co/zpm/Llama-3.1-PersianQA:F16
```
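However you start it, llama-server exposes an OpenAI-compatible HTTP API (port 8080 by default). A minimal sketch of querying it from Python with requests; adjust the host and port to your setup:

```python
# Query a running llama-server via its OpenAI-compatible endpoint.
# Assumes the server was started as shown above on the default port 8080.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "شرکت فولاد مبارکه در کجا واقع شده است؟"}
        ],
        "max_tokens": 256,
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```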
- LM Studio
- Jan
- Ollama
How to use zpm/Llama-3.1-PersianQA with Ollama:
```bash
ollama run hf.co/zpm/Llama-3.1-PersianQA:F16
```
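Once Ollama has pulled the model, you can also call it programmatically. A minimal sketch using the official ollama Python package (pip install ollama); the question is an illustrative example:

```python
# Chat with the model through a local Ollama instance.
# Assumes Ollama is running and the model was pulled as shown above.
import ollama

response = ollama.chat(
    model="hf.co/zpm/Llama-3.1-PersianQA:F16",
    messages=[{"role": "user", "content": "شرکت فولاد مبارکه در کجا واقع شده است؟"}],
)
print(response["message"]["content"])
```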
- Unsloth Studio
How to use zpm/Llama-3.1-PersianQA with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```bash
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for zpm/Llama-3.1-PersianQA to start chatting
```
Install Unsloth Studio (Windows)
```powershell
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for zpm/Llama-3.1-PersianQA to start chatting
```
Use Hugging Face Spaces for Unsloth
```bash
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for zpm/Llama-3.1-PersianQA to start chatting
```
- Docker Model Runner
How to use zpm/Llama-3.1-PersianQA with Docker Model Runner:
```bash
docker model run hf.co/zpm/Llama-3.1-PersianQA:F16
```
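Docker Model Runner also serves an OpenAI-compatible API. A minimal sketch using the openai client; the base URL below (TCP on port 12434 with the /engines/v1 path) is the documented default but should be treated as an assumption and checked against your Docker setup:

```python
# Query Docker Model Runner's OpenAI-compatible endpoint.
# The base URL is an assumption (default TCP port 12434, /engines/v1 path);
# verify it against your Docker Model Runner configuration.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:12434/engines/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="hf.co/zpm/Llama-3.1-PersianQA:F16",
    messages=[{"role": "user", "content": "شرکت فولاد مبارکه در کجا واقع شده است؟"}],
)
print(resp.choices[0].message.content)
```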
- Lemonade
How to use zpm/Llama-3.1-PersianQA with Lemonade:
Pull the model
```bash
# Download Lemonade from https://lemonade-server.ai/
lemonade pull zpm/Llama-3.1-PersianQA:F16
```
Run and chat with the model
```bash
lemonade run user.Llama-3.1-PersianQA-F16
```
List all available models
```bash
lemonade list
```
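Lemonade also runs a local server with an OpenAI-compatible API. A minimal sketch using the openai client; the base URL and model name are assumptions based on Lemonade's defaults, so verify them against the Lemonade docs for your version:

```python
# Query a local Lemonade server via its OpenAI-compatible API.
# Base URL (default http://localhost:8000/api/v1) and model name are
# assumptions; check the Lemonade documentation for your installation.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/api/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="user.Llama-3.1-PersianQA-F16",
    messages=[{"role": "user", "content": "شرکت فولاد مبارکه در کجا واقع شده است؟"}],
)
print(resp.choices[0].message.content)
```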
---
language: fa
tags:
- question-answering
- llama3
- Persian
- QA
license: apache-2.0
model_name: Llama-3.1-PersianQA
---
Model Card for Llama-3.1-PersianQA
Model Description
Llama-3.1-PersianQA is a fine-tuned version of Llama 3.1 for Persian question-answering tasks. The model is designed to provide accurate answers to questions posed in Persian, based on a provided context, and was fine-tuned on a Persian-language QA dataset to improve its ability to understand and generate Persian responses.
Intended Use
This model is intended for use in applications requiring Persian language question answering. It can be integrated into chatbots, virtual assistants, and other systems where users interact in Persian and need accurate responses to their questions based on a given context.
Use Cases
- Customer Support: Automate responses to customer queries in Persian.
- Educational Tools: Provide assistance and answers to questions on Persian educational platforms.
- Content Retrieval: Extract relevant information from Persian texts based on user queries.
Training Data
The model was fine-tuned on a Persian question-answering dataset, which includes various domains and topics to ensure generalization across different types of questions. The dataset used for training contains question-context pairs and corresponding answers in Persian.
Model Architecture
- Base Model: Llama 3.1
- Task: Question Answering
- Language: Persian
Performance
The model has been evaluated on a set of Persian QA benchmarks and performs well across various metrics. Performance may vary depending on the specific domain and nature of the questions.
How to Use
You can use the Llama-3.1-PersianQA model with the Hugging Face transformers library. Here is sample code to get started:
```python
from transformers import pipeline

# Load the model
qa_pipeline = pipeline("question-answering", model="zpm/Llama-3.1-PersianQA")

# Example usage
# Context: "Mobarakeh Steel Company of Isfahan is the largest private
# industrial unit in Iran and the largest steel complex in the Middle East."
context = "شرکت فولاد مبارکۀ اصفهان، بزرگترین واحد صنعتی خصوصی در ایران و بزرگترین مجتمع تولید فولاد در خاورمیانه است."
# Question: "Where is Mobarakeh Steel Company located?"
question = "شرکت فولاد مبارکه در کجا واقع شده است؟"

result = qa_pipeline(question=question, context=context)
print(result)
```
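The question-answering pipeline returns a dictionary with the predicted answer span and a confidence score, so you can extract just the answer text:

```python
# The QA pipeline result is a dict with keys: answer, score, start, end.
print(result["answer"])                      # the extracted answer text
print(f"confidence: {result['score']:.3f}")  # model confidence in [0, 1]
```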