Commit 956bd39 by Ellbendls (verified) · 1 parent: bb0e3e2

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -39,3 +39,5 @@ Qwen-3-4b-Text_to_SQL-q3_k_m.gguf filter=lfs diff=lfs merge=lfs -text
39
  Qwen-3-4b-Text_to_SQL-q4_k_m.gguf filter=lfs diff=lfs merge=lfs -text
40
  Qwen-3-4b-Text_to_SQL-q5_k_m.gguf filter=lfs diff=lfs merge=lfs -text
41
  Qwen-3-4b-Text_to_SQL-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
42
+ Qwen-3-4b-Text_to_SQL-q4_k_s.gguf filter=lfs diff=lfs merge=lfs -text
43
+ Qwen-3-4b-Text_to_SQL-q6_k.gguf filter=lfs diff=lfs merge=lfs -text
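The two added `.gitattributes` entries route the new quant files through Git LFS. As a rough illustration only, the sketch below lists which path patterns in a `.gitattributes` text carry `filter=lfs`. The helper name `lfs_tracked_patterns` is hypothetical, and the parsing is simplified: it splits on whitespace and does not handle quoted patterns.

```python
def lfs_tracked_patterns(gitattributes_text):
    """Return the path patterns whose attributes include filter=lfs."""
    patterns = []
    for line in gitattributes_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # First token is the path pattern, the rest are attributes.
        pattern, *attrs = line.split()
        if "filter=lfs" in attrs:
            patterns.append(pattern)
    return patterns


# Sample text copied from the hunk above (context line plus the two additions).
sample = """\
Qwen-3-4b-Text_to_SQL-q8_0.gguf filter=lfs diff=lfs merge=lfs -text
Qwen-3-4b-Text_to_SQL-q4_k_s.gguf filter=lfs diff=lfs merge=lfs -text
Qwen-3-4b-Text_to_SQL-q6_k.gguf filter=lfs diff=lfs merge=lfs -text
"""

for pattern in lfs_tracked_patterns(sample):
    print(pattern)
```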
Qwen-3-4b-Text_to_SQL-q4_k_s.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8a1ad044b8d87cb9ddd30fb6c336ef599d47b24b7289c363f394d2f68b777f09
3
+ size 2382739104
Qwen-3-4b-Text_to_SQL-q6_k.gguf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d7b057d6a3415d48231a9dd64c3efcf808275818a0b85dd8870636b92a9200ae
3
+ size 3305690784
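Each added `.gguf` entry is a Git LFS pointer: three lines giving the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of checking a downloaded file against such a pointer follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any Git LFS tooling; the pointer text is copied from the q6_k hunk above.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and hash."""
    # Cheap check first: a size mismatch means the hash cannot match.
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so multi-gigabyte files are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


POINTER_Q6_K = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d7b057d6a3415d48231a9dd64c3efcf808275818a0b85dd8870636b92a9200ae
size 3305690784
"""

pointer = parse_lfs_pointer(POINTER_Q6_K)
print(f"expected size: {pointer['size'] / 2**30:.2f} GiB")
```

With a local copy of the file, `matches_pointer("Qwen-3-4b-Text_to_SQL-q6_k.gguf", pointer)` would report whether the download is intact, assuming the file sits in the current directory.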
README.md CHANGED
@@ -1,146 +1,11 @@
1
-
2
- ---
3
- library_name: gguf
4
- license: apache-2.0
5
- base_model:
6
- - Ellbendls/Qwen-3-4b-Text_to_SQL
7
- - Qwen/Qwen3-4B-Instruct-2507
8
- tags:
9
- - gguf
10
- - llama.cpp
11
- - qwen
12
- - text-to-sql
13
- - sql
14
- - instruct
15
- language:
16
- - eng
17
- - zho
18
- - fra
19
- - spa
20
- - por
21
- - deu
22
- - ita
23
- - rus
24
- - jpn
25
- - kor
26
- - vie
27
- - tha
28
- - ara
29
- pipeline_tag: text-generation
30
- ---
31
-
32
  # Ellbendls/Qwen-3-4b-Text_to_SQL-GGUF
33
 
34
- Quantized GGUF builds of `Ellbendls/Qwen-3-4b-Text_to_SQL` for fast CPU/GPU inference with llama.cpp-compatible runtimes.
35
-
36
- - **Base model**. Fine-tuned from **Qwen/Qwen3-4B-Instruct-2507** for Text-to-SQL.
37
- - **License**. Apache-2.0 (inherits from base). Keep attribution.
38
- - **Purpose**. Turn natural language into SQL. When schema is missing, the model can infer a simple schema then produce SQL.
39
-
40
- ## Files
41
-
42
- Base and quantized variants:
43
-
44
- - `Qwen-3-4b-Text_to_SQL-F16.gguf` — reference float16 export
45
- - `Qwen-3-4b-Text_to_SQL-q2_k.gguf`
46
- - `Qwen-3-4b-Text_to_SQL-q3_k_m.gguf`
47
- - `Qwen-3-4b-Text_to_SQL-q4_k_m.gguf` ← good default
48
- - `Qwen-3-4b-Text_to_SQL-q5_k_m.gguf`
49
- - `Qwen-3-4b-Text_to_SQL-q8_0.gguf` ← near-lossless, larger
50
-
51
- Conversion and quantization done with `llama.cpp`.
52
-
53
- ## Recommended pick
54
-
55
- - **Q4_K_M**. Best balance of speed and quality for laptops and small servers.
56
- - **Q5_K_M**. Higher quality, a bit more RAM/VRAM.
57
- - **Q8_0**. Highest quality among quants. Use if you have headroom.
58
-
59
- ## Approximate memory needs
60
-
61
- These are ballpark for a 4B model. Real usage varies by runtime and context length.
62
-
63
- - Q4_K_M: 3–4 GB RAM/VRAM
64
- - Q5_K_M: 4–5 GB
65
- - Q8_0: 6–8 GB
66
- - F16: 10–12 GB
67
-
68
- ## Quick start
69
-
70
- ### llama.cpp (CLI)
71
-
72
- CPU only:
73
- ```bash
74
- ./llama-cli -m Qwen-3-4b-Text_to_SQL-q4_k_m.gguf \
75
- -p "Generate SQL to get average salary by department in 2024." \
76
- -n 256 -t 6
77
- ````
78
-
79
- NVIDIA GPU offload (build with `-DLLAMA_CUBLAS=ON`):
80
-
81
- ```bash
82
- ./llama-cli -m Qwen-3-4b-Text_to_SQL-q4_k_m.gguf \
83
- -p "Generate SQL to get average salary by department in 2024." \
84
- -n 256 -ngl 999 -t 6
85
- ```
86
-
87
- ### Python (llama-cpp-python)
88
-
89
- ```python
90
- from llama_cpp import Llama
91
-
92
- llm = Llama(model_path="Qwen-3-4b-Text_to_SQL-q4_k_m.gguf", n_ctx=4096, n_gpu_layers=35) # set 0 for CPU-only
93
- prompt = "Generate SQL to list total orders and revenue by month for 2024."
94
- out = llm(prompt, max_tokens=256, temperature=0.2, top_p=0.9)
95
- print(out["choices"][0]["text"].strip())
96
- ```
97
-
98
- ### LM Studio / Kobold / text-generation-webui
99
-
100
- * Select the `.gguf` file and load.
101
- * Set temperature 0.1–0.3 for deterministic SQL.
102
- * Use a system prompt to anchor behavior.
103
-
104
- ## Model details
105
-
106
- * **Base**. `Qwen/Qwen3-4B-Instruct-2507` (32k context, multilingual).
107
- * **Fine-tune**. Trained on `gretelai/synthetic_text_to_sql`.
108
- * **Task**. NL → SQL. Capable of simple schema inference when needed.
109
- * **Languages**. Works best in English. Can follow prompts in several languages from the base model.
110
-
111
- ## Conversion reproducibility
112
-
113
- Export used:
114
-
115
- ```bash
116
- python convert_hf_to_gguf.py /path/to/hf_model --outtype f16 --outfile Qwen-3-4b-Text_to_SQL-F16.gguf
117
- ```
118
-
119
- Quantization used:
120
-
121
- ```bash
122
- ./llama-quantize Qwen-3-4b-Text_to_SQL-F16.gguf Qwen-3-4b-Text_to_SQL-q4_k_m.gguf Q4_K_M
123
- # likewise for q2_k, q3_k_m, q5_k_m, q8_0
124
- ```
125
-
126
- ## Intended use and limits
127
-
128
- * **Use**. Analytics, reporting, dashboards, data exploration, SQL prototyping.
129
- * **Limits**. No database connectivity. It only generates SQL text. Validate and test queries before use in production. Provide real schema for best accuracy.
130
-
131
- ## Attribution
132
-
133
- * Base model: [`Qwen/Qwen3-4B-Instruct-2507`](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507)
134
- * Fine-tuned model: [`Ellbendls/Qwen-3-4b-Text_to_SQL`](https://huggingface.co/Ellbendls/Qwen-3-4b-Text_to_SQL)
135
-
136
- ## License
137
-
138
- Apache-2.0. Include license and NOTICE from upstream when redistributing the weights. Do not imply endorsement from Qwen or original authors.
139
-
140
- ## Changelog
141
 
142
- * 2025-09-17. Initial GGUF release. Added q2\_k, q3\_k\_m, q4\_k\_m, q5\_k\_m, q8\_0, and F16.
 
 
143
 
144
- ```
145
- ::contentReference[oaicite:0]{index=0}
146
- ```
1
  # Ellbendls/Qwen-3-4b-Text_to_SQL-GGUF
2
 
3
+ Derived GGUF exports of `Ellbendls/Qwen-3-4b-Text_to_SQL`
4
 
5
+ Files:
6
+ - Base: `Qwen-3-4b-Text_to_SQL-F16.gguf`
7
+ - Quants: Qwen-3-4b-Text_to_SQL-q6_k.gguf
8
 
9
+ Converted and quantized with llama.cpp.
10
+ Attribution: base model `Ellbendls/Qwen-3-4b-Text_to_SQL` (see original license).
11
+ Generated: 2025-09-17T07:58:42.430503Z