Upload folder using huggingface_hub

Files changed (4)

.gitattributes +2 -0
Qwen-3-4b-Text_to_SQL-q4_k_s.gguf +3 -0
Qwen-3-4b-Text_to_SQL-q6_k.gguf +3 -0
README.md +7 -142

.gitattributes CHANGED

@@ -39,3 +39,5 @@ Qwen-3-4b-Text_to_SQL-q3_k_m.gguf filter=lfs diff=lfs merge=lfs -text
 Qwen-3-4b-Text_to_SQL-q4_k_m.gguf filter=lfs diff=lfs merge=lfs -text
 Qwen-3-4b-Text_to_SQL-q5_k_m.gguf filter=lfs diff=lfs merge=lfs -text
 Qwen-3-4b-Text_to_SQL-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+Qwen-3-4b-Text_to_SQL-q4_k_s.gguf filter=lfs diff=lfs merge=lfs -text
+Qwen-3-4b-Text_to_SQL-q6_k.gguf filter=lfs diff=lfs merge=lfs -text
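
The two added `.gitattributes` lines route the new files through Git LFS. As a rough illustration only, the sketch below collects the patterns that carry `filter=lfs` and checks a filename against them. The helper names `lfs_patterns` and `is_lfs_tracked` are hypothetical, and `fnmatch` is used as an approximation: real gitattributes matching follows gitignore-style rules that this does not fully reproduce.

```python
from fnmatch import fnmatch

# The two lines added in this commit, copied from the hunk above.
SAMPLE = """\
Qwen-3-4b-Text_to_SQL-q4_k_s.gguf filter=lfs diff=lfs merge=lfs -text
Qwen-3-4b-Text_to_SQL-q6_k.gguf filter=lfs diff=lfs merge=lfs -text
"""


def lfs_patterns(gitattributes_text: str) -> list[str]:
    """Return the path patterns whose attributes include filter=lfs."""
    patterns = []
    for raw in gitattributes_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *attrs = line.split()
        if "filter=lfs" in attrs:
            patterns.append(pattern)
    return patterns


def is_lfs_tracked(filename: str, patterns: list[str]) -> bool:
    """Approximate check (assumption: fnmatch is close enough for plain names)."""
    return any(fnmatch(filename, p) for p in patterns)
```

For the exact filenames in this commit the approximation is harmless, since the patterns contain no wildcard characters.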

Qwen-3-4b-Text_to_SQL-q4_k_s.gguf ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8a1ad044b8d87cb9ddd30fb6c336ef599d47b24b7289c363f394d2f68b777f09
+size 2382739104

Qwen-3-4b-Text_to_SQL-q6_k.gguf ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d7b057d6a3415d48231a9dd64c3efcf808275818a0b85dd8870636b92a9200ae
+size 3305690784
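
Each LFS pointer above records a SHA-256 digest and a byte size for the real file. A minimal sketch of how a downloaded `.gguf` could be checked against those two fields follows. The function names `parse_lfs_pointer` and `verify_file` are hypothetical, and the parser assumes the simple `key value` layout shown in the pointers above (a leading diff `+` is tolerated).

```python
import hashlib
from pathlib import Path

# Pointer for the q6_k file, copied from the diff above.
POINTER_Q6_K = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d7b057d6a3415d48231a9dd64c3efcf808275818a0b85dd8870636b92a9200ae
size 3305690784
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse 'key value' lines of a pointer into algo, digest, and size."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip().lstrip("+")  # tolerate a diff '+' prefix
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_file(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and hash both match the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first, avoids hashing a truncated file
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in chunks so a multi-gigabyte file is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

Usage would look like `verify_file("Qwen-3-4b-Text_to_SQL-q6_k.gguf", parse_lfs_pointer(POINTER_Q6_K))`, assuming the file has been downloaded to the working directory.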

README.md CHANGED

@@ -1,146 +1,11 @@
----
-library_name: gguf
-license: apache-2.0
-base_model:
-- Ellbendls/Qwen-3-4b-Text_to_SQL
-- Qwen/Qwen3-4B-Instruct-2507
-tags:
-- gguf
-- llama.cpp
-- qwen
-- text-to-sql
-- sql
-- instruct
-language:
-- eng
-- zho
-- fra
-- spa
-- por
-- deu
-- ita
-- rus
-- jpn
-- kor
-- vie
-- tha
-- ara
-pipeline_tag: text-generation
----
 # Ellbendls/Qwen-3-4b-Text_to_SQL-GGUF
-Quantized GGUF builds of `Ellbendls/Qwen-3-4b-Text_to_SQL` for fast CPU/GPU inference with llama.cpp-compatible runtimes.
-- **Base model**. Fine-tuned from **Qwen/Qwen3-4B-Instruct-2507** for Text-to-SQL.
-- **License**. Apache-2.0 (inherits from base). Keep attribution.
-- **Purpose**. Turn natural language into SQL. When schema is missing, the model can infer a simple schema then produce SQL.
-## Files
-Base and quantized variants:
-- `Qwen-3-4b-Text_to_SQL-F16.gguf`  — reference float16 export
-- `Qwen-3-4b-Text_to_SQL-q2_k.gguf`
-- `Qwen-3-4b-Text_to_SQL-q3_k_m.gguf`
-- `Qwen-3-4b-Text_to_SQL-q4_k_m.gguf`  ← good default
-- `Qwen-3-4b-Text_to_SQL-q5_k_m.gguf`
-- `Qwen-3-4b-Text_to_SQL-q8_0.gguf`    ← near-lossless, larger
-Conversion and quantization done with `llama.cpp`.
-## Recommended pick
-- **Q4_K_M**. Best balance of speed and quality for laptops and small servers.
-- **Q5_K_M**. Higher quality, a bit more RAM/VRAM.
-- **Q8_0**. Highest quality among quants. Use if you have headroom.
-## Approximate memory needs
-These are ballpark for a 4B model. Real usage varies by runtime and context length.
-- Q4_K_M: 3–4 GB RAM/VRAM
-- Q5_K_M: 4–5 GB
-- Q8_0: 6–8 GB
-- F16: 10–12 GB
-## Quick start
-### llama.cpp (CLI)
-CPU only:
-```bash
-./llama-cli -m Qwen-3-4b-Text_to_SQL-q4_k_m.gguf \
-  -p "Generate SQL to get average salary by department in 2024." \
-  -n 256 -t 6
-````
-NVIDIA GPU offload (build with `-DLLAMA_CUBLAS=ON`):
-```bash
-./llama-cli -m Qwen-3-4b-Text_to_SQL-q4_k_m.gguf \
-  -p "Generate SQL to get average salary by department in 2024." \
-  -n 256 -ngl 999 -t 6
-```
-### Python (llama-cpp-python)
-```python
-from llama_cpp import Llama
-llm = Llama(model_path="Qwen-3-4b-Text_to_SQL-q4_k_m.gguf", n_ctx=4096, n_gpu_layers=35)  # set 0 for CPU-only
-prompt = "Generate SQL to list total orders and revenue by month for 2024."
-out = llm(prompt, max_tokens=256, temperature=0.2, top_p=0.9)
-print(out["choices"][0]["text"].strip())
-```
-### LM Studio / Kobold / text-generation-webui
-* Select the `.gguf` file and load.
-* Set temperature 0.1–0.3 for deterministic SQL.
-* Use a system prompt to anchor behavior.
-## Model details
-* **Base**. `Qwen/Qwen3-4B-Instruct-2507` (32k context, multilingual).
-* **Fine-tune**. Trained on `gretelai/synthetic_text_to_sql`.
-* **Task**. NL → SQL. Capable of simple schema inference when needed.
-* **Languages**. Works best in English. Can follow prompts in several languages from the base model.
-## Conversion reproducibility
-Export used:
-```bash
-python convert_hf_to_gguf.py /path/to/hf_model --outtype f16 --outfile Qwen-3-4b-Text_to_SQL-F16.gguf
-```
-Quantization used:
-```bash
-./llama-quantize Qwen-3-4b-Text_to_SQL-F16.gguf Qwen-3-4b-Text_to_SQL-q4_k_m.gguf Q4_K_M
-# likewise for q2_k, q3_k_m, q5_k_m, q8_0
-```
-## Intended use and limits
-* **Use**. Analytics, reporting, dashboards, data exploration, SQL prototyping.
-* **Limits**. No database connectivity. It only generates SQL text. Validate and test queries before use in production. Provide real schema for best accuracy.
-## Attribution
-* Base model: [`Qwen/Qwen3-4B-Instruct-2507`](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507)
-* Fine-tuned model: [`Ellbendls/Qwen-3-4b-Text_to_SQL`](https://huggingface.co/Ellbendls/Qwen-3-4b-Text_to_SQL)
-## License
-Apache-2.0. Include license and NOTICE from upstream when redistributing the weights. Do not imply endorsement from Qwen or original authors.
-## Changelog
-* 2025-09-17. Initial GGUF release. Added q2\_k, q3\_k\_m, q4\_k\_m, q5\_k\_m, q8\_0, and F16.
-```
-::contentReference[oaicite:0]{index=0}
-```
+Derived GGUF exports of `Ellbendls/Qwen-3-4b-Text_to_SQL`
+Files:
+- Base: `Qwen-3-4b-Text_to_SQL-F16.gguf`
+- Quants: Qwen-3-4b-Text_to_SQL-q6_k.gguf
+Converted and quantized with llama.cpp.
+Attribution: base model `Ellbendls/Qwen-3-4b-Text_to_SQL` (see original license).
+Generated: 2025-09-17T07:58:42.430503Z
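
Since the new README no longer carries a quick start, here is a hedged sketch of fetching one of the quants added in this commit with `huggingface_hub`. The repo id is taken from the README heading and the filenames from the file list above; `quant_filename` and `fetch_quant` are hypothetical helper names, and the sketch assumes `huggingface_hub` is installed and the repository is publicly readable.

```python
REPO_ID = "Ellbendls/Qwen-3-4b-Text_to_SQL-GGUF"  # from the README heading
NEW_QUANTS = ("q4_k_s", "q6_k")  # the two files added in this commit


def quant_filename(quant: str) -> str:
    """Build the GGUF filename for a quant tag, following the names in the diff."""
    return f"Qwen-3-4b-Text_to_SQL-{quant}.gguf"


def fetch_quant(quant: str) -> str:
    """Download one quant into the local Hugging Face cache and return its path."""
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=quant_filename(quant))
```

Calling `fetch_quant("q6_k")` would download roughly 3.3 GB, per the `size` field in the pointer above, so it is worth confirming free disk space first.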