Instructions for using alvarobartt/lince-zero-7b-GGUF with libraries and local apps.

Libraries

How to use alvarobartt/lince-zero-7b-GGUF with llama-cpp-python:

# Llama.from_pretrained downloads from the Hub, which requires huggingface-hub
# !pip install llama-cpp-python huggingface-hub

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="alvarobartt/lince-zero-7b-GGUF",
	filename="lince-zero-7b-q4_0.gguf",
)

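# Generate a completion; echo=True includes the prompt in the returned text
output = llm(
	"Once upon a time,",
	max_tokens=512,
	echo=True
)
# The result is an OpenAI-style completion dict; print only the generated text
print(output["choices"][0]["text"])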

Local Apps

llama.cpp

How to use alvarobartt/lince-zero-7b-GGUF with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf alvarobartt/lince-zero-7b-GGUF:Q4_0
# Run inference directly in the terminal:
llama-cli -hf alvarobartt/lince-zero-7b-GGUF:Q4_0

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf alvarobartt/lince-zero-7b-GGUF:Q4_0
# Run inference directly in the terminal:
llama-cli -hf alvarobartt/lince-zero-7b-GGUF:Q4_0

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf alvarobartt/lince-zero-7b-GGUF:Q4_0
# Run inference directly in the terminal:
./llama-cli -hf alvarobartt/lince-zero-7b-GGUF:Q4_0

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf alvarobartt/lince-zero-7b-GGUF:Q4_0
# Run inference directly in the terminal:
./build/bin/llama-cli -hf alvarobartt/lince-zero-7b-GGUF:Q4_0

Use Docker

docker model run hf.co/alvarobartt/lince-zero-7b-GGUF:Q4_0

vLLM

How to use alvarobartt/lince-zero-7b-GGUF with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "alvarobartt/lince-zero-7b-GGUF"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "alvarobartt/lince-zero-7b-GGUF",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
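
For readers who prefer Python over curl, the same request can be sketched with the standard library alone. This is an illustrative sketch rather than anything prescribed by vLLM: the helper names `build_completion_request` and `send_completion` are hypothetical, and it assumes the server started above is listening on localhost:8000 and returns an OpenAI-style completion body.

```python
import json
import urllib.request

MODEL_ID = "alvarobartt/lince-zero-7b-GGUF"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build the same POST request as the curl command (hypothetical helper)."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_completion(request):
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: the body follows the OpenAI completions schema.
    return body["choices"][0]["text"]


# Usage, once the server is running:
# print(send_completion(build_completion_request("Once upon a time,")))
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.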

Use Docker

docker model run hf.co/alvarobartt/lince-zero-7b-GGUF:Q4_0

Ollama

How to use alvarobartt/lince-zero-7b-GGUF with Ollama:
```
ollama run hf.co/alvarobartt/lince-zero-7b-GGUF:Q4_0
```

Unsloth Studio

How to use alvarobartt/lince-zero-7b-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for alvarobartt/lince-zero-7b-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for alvarobartt/lince-zero-7b-GGUF to start chatting

Use Hugging Face Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for alvarobartt/lince-zero-7b-GGUF to start chatting

Docker Model Runner

How to use alvarobartt/lince-zero-7b-GGUF with Docker Model Runner:
```
docker model run hf.co/alvarobartt/lince-zero-7b-GGUF:Q4_0
```

Lemonade

How to use alvarobartt/lince-zero-7b-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull alvarobartt/lince-zero-7b-GGUF:Q4_0

Run and chat with the model

lemonade run user.lince-zero-7b-GGUF-Q4_0

List all available models

lemonade list

Model Card for LINCE-ZERO-7B-GGUF

LINCE-ZERO is an instruction-following LLM fine-tuned from Falcon 7B. The fine-tune was led by Clibrain, using the Alpaca and Dolly datasets, both translated into Spanish and augmented to 80k examples (as Clibrain states in its model card).

This repository contains quantized variants of LINCE-ZERO in the GGUF format, introduced by the llama.cpp team.

Some may ask: why not just use TheBloke/lince-zero-GGUF? Those files work well with llama.cpp for running LINCE-ZERO inference on limited hardware. LM Studio on macOS, however, may only work with the q4_k_s, q4_k_m, q5_k_s, and q5_k_m quantization formats, and TheBloke's repository does not include them.

Model Details

Model Description

Model type: Falcon
Fine-tuned from model: Falcon 7B
Created by: TIIUAE
Fine-tuned by: Clibrain
Quantized by: alvarobartt
Language(s) (NLP): Spanish
License: Apache 2.0 (disclaimer: there may be a licensing mismatch; see https://huggingface.co/clibrain/lince-zero/discussions/5)

Model Sources

Repository: LINCE-ZERO

Model Files

| Name | Quant method | Bits | Size | Max RAM required | Use case |
| --- | --- | --- | --- | --- | --- |
| lince-zero-7b-q4_k_s.gguf | Q4_K_S | 4 | 7.41 GB | 9.91 GB | small, greater quality loss |
| lince-zero-7b-q4_k_m.gguf | Q4_K_M | 4 | 7.87 GB | 10.37 GB | medium, balanced quality - recommended |
| lince-zero-7b-q5_k_s.gguf | Q5_K_S | 5 | 8.97 GB | 11.47 GB | large, low quality loss - recommended |
| lince-zero-7b-q5_k_m.gguf | Q5_K_M | 5 | 9.23 GB | 11.73 GB | large, very low quality loss - recommended |

Note: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
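
In the table, each "Max RAM required" figure is the file size plus 2.5 GB. As a rough aid for choosing a file, the sketch below picks the largest listed file that fits a given RAM budget without GPU offloading. The helper names `largest_file_that_fits` and `ram_overhead_gb` are hypothetical, and the figures are copied from the table rather than measured.

```python
# (file name, size in GB, max RAM required in GB), copied from the Model Files table.
MODEL_FILES = [
    ("lince-zero-7b-q4_k_s.gguf", 7.41, 9.91),
    ("lince-zero-7b-q4_k_m.gguf", 7.87, 10.37),
    ("lince-zero-7b-q5_k_s.gguf", 8.97, 11.47),
    ("lince-zero-7b-q5_k_m.gguf", 9.23, 11.73),
]


def ram_overhead_gb(size_gb, max_ram_gb):
    """Difference between the RAM requirement and the file size, in GB."""
    return round(max_ram_gb - size_gb, 2)


def largest_file_that_fits(available_ram_gb):
    """Return the largest listed file whose RAM requirement fits, or None."""
    fitting = [row for row in MODEL_FILES if row[2] <= available_ram_gb]
    if not fitting:
        return None
    # In this table, larger files correspond to lower quality loss,
    # so the biggest file that fits is preferred.
    return max(fitting, key=lambda row: row[1])[0]


if __name__ == "__main__":
    for budget in (8, 11, 16):
        print(budget, "GB ->", largest_file_that_fits(budget))
```

With layers offloaded to the GPU, the real RAM requirement is lower than these figures, so the result is a conservative choice.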

Uses

Direct Use

[More Information Needed]

Downstream Use

[More Information Needed]

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]
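
One possible starting point is a minimal sketch with llama-cpp-python, using the file names from the Model Files table. The helpers `gguf_filename`, `load_model`, and `complete` are hypothetical names introduced for illustration, and both the `n_ctx=2048` context size and the Spanish prompt in the usage note are assumptions rather than settings documented by this model card.

```python
REPO_ID = "alvarobartt/lince-zero-7b-GGUF"


def gguf_filename(quant_method):
    """Map a quant method from the Model Files table (e.g. "Q4_K_M") to its file name."""
    return f"lince-zero-7b-{quant_method.lower()}.gguf"


def load_model(quant_method="Q4_K_M", n_ctx=2048):
    """Download the chosen file from the Hub and load it with llama-cpp-python."""
    # Imported lazily so the other helpers work without llama-cpp-python installed.
    # Requires: pip install llama-cpp-python huggingface-hub
    from llama_cpp import Llama

    return Llama.from_pretrained(
        repo_id=REPO_ID,
        filename=gguf_filename(quant_method),
        n_ctx=n_ctx,  # assumed context size, forwarded to the Llama constructor
    )


def complete(llm, prompt, max_tokens=256):
    """Run one completion and return only the generated text."""
    output = llm(prompt, max_tokens=max_tokens)
    # llama-cpp-python returns an OpenAI-style completion dict.
    return output["choices"][0]["text"]


# Usage (downloads several GB on first run):
# llm = load_model("Q4_K_M")
# print(complete(llm, "Escribe un poema corto sobre el mar."))
```

Keeping the file-name mapping separate from loading makes it easy to switch between the quantization variants listed in the table.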

Training Details

The pre-training details can be found at Falcon 7B - Training Details, and the fine-tuning details at LINCE-ZERO - Training Details.
