How to use from the llama-cpp-python library
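# !pip install llama-cpp-python huggingface-hub

from llama_cpp import Llama

# Download the GGUF file from the Hugging Face Hub (cached after the first run) and load it.
llm = Llama.from_pretrained(
	repo_id="Clevyby/DarkSapling-v2.0-7B-Q5_K_S-GGUF-iMatrix",
	filename="DarkSapling-7B-v2.0-Q5_K_S-imat.gguf",
)

# Generate up to 512 new tokens; echo=True includes the prompt in the returned text.
output = llm(
	"Once upon a time,",
	max_tokens=512,
	echo=True,
)

# The result is a completion dict; the generated text is under output["choices"][0]["text"].
print(output)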
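
The call above returns a completion dictionary, not a plain string. As a minimal sketch, assuming the OpenAI-style layout that llama-cpp-python uses for completions (a `choices` list whose entries carry a `text` field), the generated text could be pulled out as shown here. The helper name `completion_text` and the sample dictionary are hypothetical and only stand in for real model output.

```python
from typing import Any, Mapping


def completion_text(output: Mapping[str, Any]) -> str:
    """Return the text of the first choice, or an empty string if there is none."""
    choices = output.get("choices") or []
    if not choices:
        return ""
    return choices[0].get("text", "")


# Hypothetical stand-in for the dictionary returned by llm(...); not real model output.
sample_output = {
    "choices": [
        {"text": "Once upon a time, there was a sapling.", "finish_reason": "length"}
    ],
}

print(completion_text(sample_output))
```

With the model loaded, `print(completion_text(output))` would print only the text. Because the example passes `echo=True`, that text starts with the prompt itself.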

Note:

This repo hosts only a Q5_K_S iMatrix quant of Dark Sapling v2.0 7B. The GGUF quant comes from Lewdiculous/DarkSapling-7B-v2.0-GGUF-IQ-Imatrix. The additional files in this GGUF repo are for personal use with Text Gen Webui and its llamacpp_hf loader.

GGUF
Model size: 7B params
Architecture: llama
Quantization: 5-bit
