Model Card for LINCE-ZERO-7B-GGUF
LINCE-ZERO is an instruction-following LLM fine-tuned from Falcon 7B. The fine-tune was led by Clibrain, using the Alpaca and Dolly datasets, both translated into Spanish and augmented to 80k examples (as Clibrain states in its model card).
This repository contains quantized variants of the model in the GGUF format, introduced by the llama.cpp team.
Some may ask: why not just use TheBloke/lince-zero-GGUF? You can use those files via llama.cpp to run inference over LINCE-ZERO on low resources, but if you want to use the model via LM Studio on macOS you will run into issues, as LM Studio may only work with the q4_k_s, q4_k_m, q5_k_s, and q5_k_m quantization formats, and those are not included in TheBloke's repository.
Model Details
Model Description
- Model type: Falcon
- Fine-tuned from model: Falcon 7B
- Created by: TIIUAE
- Fine-tuned by: Clibrain
- Quantized by: alvarobartt
- Language(s) (NLP): Spanish
- License: Apache 2.0 (disclaimer: there may be a licensing mismatch; see https://huggingface.co/clibrain/lince-zero/discussions/5)
Model Sources
- Repository: LINCE-ZERO
Model Files
| Name | Quant method | Bits | Size | Max RAM required | Use case |
|---|---|---|---|---|---|
| lince-zero-7b-q4_k_s.gguf | Q4_K_S | 4 | 7.41 GB | 9.91 GB | small, greater quality loss |
| lince-zero-7b-q4_k_m.gguf | Q4_K_M | 4 | 7.87 GB | 10.37 GB | medium, balanced quality - recommended |
| lince-zero-7b-q5_k_s.gguf | Q5_K_S | 5 | 8.97 GB | 11.47 GB | large, low quality loss - recommended |
| lince-zero-7b-q5_k_m.gguf | Q5_K_M | 5 | 9.23 GB | 11.73 GB | large, very low quality loss - recommended |
Note: the RAM figures above assume no GPU offloading. Offloading layers to the GPU reduces RAM usage and uses VRAM instead.
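The table can be turned into a quick check of which file fits on a given machine. The sketch below copies the file sizes from the table and assumes a fixed overhead of 2.5 GB on top of the file size. That overhead is inferred from the table (every "Max RAM required" value is the file size plus 2.5 GB) and is not a figure stated elsewhere in this card. The helper names are hypothetical.

```python
# File sizes in GB, copied from the Model Files table.
MODEL_FILES = {
    "lince-zero-7b-q4_k_s.gguf": 7.41,
    "lince-zero-7b-q4_k_m.gguf": 7.87,
    "lince-zero-7b-q5_k_s.gguf": 8.97,
    "lince-zero-7b-q5_k_m.gguf": 9.23,
}

# Assumption: overhead inferred from the table (max RAM minus file size),
# valid only without GPU offloading.
OVERHEAD_GB = 2.5


def max_ram_gb(filename):
    """Estimated peak RAM in GB for a file, with no layers offloaded to the GPU."""
    return round(MODEL_FILES[filename] + OVERHEAD_GB, 2)


def largest_fitting_file(available_ram_gb):
    """Return the largest listed file whose estimated RAM fits, or None."""
    fitting = [name for name in MODEL_FILES if max_ram_gb(name) <= available_ram_gb]
    return max(fitting, key=MODEL_FILES.get) if fitting else None
```

For example, `largest_fitting_file(11.5)` selects the Q5_K_S file, because the Q5_K_M estimate of 11.73 GB exceeds the budget.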
Uses
Direct Use
[More Information Needed]
Downstream Use [optional]
[More Information Needed]
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
How to Get Started with the Model
Use the code below to get started with the model.
[More Information Needed]
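As a starting point, here is a minimal sketch using llama-cpp-python, assuming the `llama-cpp-python` and `huggingface_hub` packages are installed. It downloads one of the files listed under Model Files and runs a plain text completion. The choice of the Q4_K_M file, the helper names, and the Spanish example prompt are illustrative assumptions, and the sketch does not apply any instruction prompt template that LINCE-ZERO may expect.

```python
# pip install llama-cpp-python huggingface_hub

REPO_ID = "alvarobartt/lince-zero-7b-GGUF"
# Assumption: any file from the Model Files table can be used here.
FILENAME = "lince-zero-7b-q4_k_m.gguf"


def load_model(filename=FILENAME, n_gpu_layers=0):
    """Download the GGUF file from the Hub (if not cached) and load it.

    n_gpu_layers=0 keeps every layer on the CPU; a larger value offloads
    that many layers to the GPU, trading RAM for VRAM.
    """
    from llama_cpp import Llama  # imported lazily so the helpers load without it

    return Llama.from_pretrained(
        repo_id=REPO_ID,
        filename=filename,
        n_gpu_layers=n_gpu_layers,
    )


def complete(llm, prompt, max_tokens=128):
    """Run a plain completion and return only the generated text."""
    output = llm(prompt, max_tokens=max_tokens, echo=False)
    return output["choices"][0]["text"]


def main():
    llm = load_model()
    print(complete(llm, "Había una vez"))
```

Calling `main()` downloads the file on first use, which takes several GB of disk space, and then prints the completion.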
Training Details
The training details can be found in the Falcon 7B model card (Training Details), and the fine-tuning details in the LINCE-ZERO model card (Training Details).