---
base_model: openchat/openchat-3.5-0106
inference: false
library_name: transformers
license: apache-2.0
model_creator: OpenChat
model_name: Openchat 3.5 0106
model_type: mistral
pipeline_tag: text-generation
quantized_by: Second State Inc.
tags:
- openchat
---

<!-- header start -->
<!-- 200823 -->
<div style="width: auto; margin-left: auto; margin-right: auto">
<img src="https://github.com/LlamaEdge/LlamaEdge/raw/dev/assets/logo.svg" style="width: 100%; min-width: 400px; display: block; margin: auto;">
</div>
<hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
<!-- header end -->

# OpenChat-3.5-0106-GGUF

## Original Model

[openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106)

## Run with LlamaEdge

- LlamaEdge version: [v0.2.8](https://github.com/LlamaEdge/LlamaEdge/releases/tag/0.2.8) and above

- Prompt template

  - Prompt type: `openchat`
  
  - Prompt string

    ```text
    GPT4 User: {prompt}<|end_of_turn|>GPT4 Assistant:
    ```

  - Reverse prompt: `<|end_of_turn|>`

- Context size: `4096`

- Run as LlamaEdge service

  ```bash
  wasmedge --dir .:. --nn-preload default:GGML:AUTO:openchat-3.5-0106-Q5_K_M.gguf \
    llama-api-server.wasm \
    --model-name openchat \
    --prompt-template openchat \
    --reverse-prompt '<|end_of_turn|>' \
    --ctx-size 4096
  ```

- Run as LlamaEdge command app

  ```bash
  wasmedge --dir .:. --nn-preload default:GGML:AUTO:openchat-3.5-0106-Q5_K_M.gguf \
    llama-chat.wasm \
    --prompt-template openchat \
    --reverse-prompt '<|end_of_turn|>' \
    --ctx-size 4096
  ```
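For use outside LlamaEdge (e.g. when calling a GGUF runtime directly), the `openchat` prompt string above can be assembled programmatically. The helper below is an illustrative sketch only, not part of LlamaEdge; it concatenates conversation turns using the template and the `<|end_of_turn|>` reverse prompt shown above, leaving the final assistant slot open for the model to complete.

```python
def build_prompt(turns):
    """Build an openchat-style prompt from (user, assistant) turns.

    Follows the template shown above:
        GPT4 User: {prompt}<|end_of_turn|>GPT4 Assistant:
    Completed assistant replies are terminated with the
    <|end_of_turn|> reverse prompt; pass None as the assistant
    text of the last turn to leave it open for generation.
    """
    parts = []
    for user, assistant in turns:
        parts.append(f"GPT4 User: {user}<|end_of_turn|>GPT4 Assistant:")
        if assistant is not None:
            parts.append(f" {assistant}<|end_of_turn|>")
    return "".join(parts)

# Single-turn prompt, ready to send to the model:
print(build_prompt([("What is GGUF?", None)]))
```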

## Quantized GGUF Models

| Name | Quant method | Bits | Size | Use case |
| ---- | ---- | ---- | ---- | ----- |
| [openchat-3.5-0106-Q2_K.gguf](https://huggingface.co/second-state/OpenChat-3.5-0106-GGUF/blob/main/openchat-3.5-0106-Q2_K.gguf)     | Q2_K   | 2 | 3.08 GB| smallest, significant quality loss - not recommended for most purposes |
| [openchat-3.5-0106-Q3_K_L.gguf](https://huggingface.co/second-state/OpenChat-3.5-0106-GGUF/blob/main/openchat-3.5-0106-Q3_K_L.gguf) | Q3_K_L | 3 | 3.82 GB| small, substantial quality loss |
| [openchat-3.5-0106-Q3_K_M.gguf](https://huggingface.co/second-state/OpenChat-3.5-0106-GGUF/blob/main/openchat-3.5-0106-Q3_K_M.gguf) | Q3_K_M | 3 | 3.52 GB| very small, high quality loss |
| [openchat-3.5-0106-Q3_K_S.gguf](https://huggingface.co/second-state/OpenChat-3.5-0106-GGUF/blob/main/openchat-3.5-0106-Q3_K_S.gguf) | Q3_K_S | 3 | 3.16 GB| very small, high quality loss |
| [openchat-3.5-0106-Q4_0.gguf](https://huggingface.co/second-state/OpenChat-3.5-0106-GGUF/blob/main/openchat-3.5-0106-Q4_0.gguf)     | Q4_0   | 4 | 4.11 GB| legacy; small, very high quality loss - prefer using Q3_K_M |
| [openchat-3.5-0106-Q4_K_M.gguf](https://huggingface.co/second-state/OpenChat-3.5-0106-GGUF/blob/main/openchat-3.5-0106-Q4_K_M.gguf) | Q4_K_M | 4 | 4.37 GB| medium, balanced quality - recommended |
| [openchat-3.5-0106-Q4_K_S.gguf](https://huggingface.co/second-state/OpenChat-3.5-0106-GGUF/blob/main/openchat-3.5-0106-Q4_K_S.gguf) | Q4_K_S | 4 | 4.14 GB| small, greater quality loss |
| [openchat-3.5-0106-Q5_0.gguf](https://huggingface.co/second-state/OpenChat-3.5-0106-GGUF/blob/main/openchat-3.5-0106-Q5_0.gguf)     | Q5_0   | 5 | 5.00 GB| legacy; medium, balanced quality - prefer using Q4_K_M |
| [openchat-3.5-0106-Q5_K_M.gguf](https://huggingface.co/second-state/OpenChat-3.5-0106-GGUF/blob/main/openchat-3.5-0106-Q5_K_M.gguf) | Q5_K_M | 5 | 5.13 GB| large, very low quality loss - recommended |
| [openchat-3.5-0106-Q5_K_S.gguf](https://huggingface.co/second-state/OpenChat-3.5-0106-GGUF/blob/main/openchat-3.5-0106-Q5_K_S.gguf) | Q5_K_S | 5 | 5.00 GB| large, low quality loss - recommended |
| [openchat-3.5-0106-Q6_K.gguf](https://huggingface.co/second-state/OpenChat-3.5-0106-GGUF/blob/main/openchat-3.5-0106-Q6_K.gguf)     | Q6_K   | 6 | 5.94 GB| very large, extremely low quality loss |
| [openchat-3.5-0106-Q8_0.gguf](https://huggingface.co/second-state/OpenChat-3.5-0106-GGUF/blob/main/openchat-3.5-0106-Q8_0.gguf)     | Q8_0   | 8 | 7.70 GB| very large, extremely low quality loss - not recommended |
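
When choosing a quant, a rough rule of thumb is that the file must fit in available RAM/VRAM with headroom left for the KV cache at the `4096`-token context size and for runtime overhead. The helper below is purely illustrative (it is not part of LlamaEdge or this repository); it encodes the file sizes from the table above and picks the largest quant that fits under a memory budget.

```python
# File sizes in GB, taken from the table above. Illustrative only.
QUANTS = {
    "Q2_K": 3.08, "Q3_K_S": 3.16, "Q3_K_M": 3.52, "Q3_K_L": 3.82,
    "Q4_0": 4.11, "Q4_K_S": 4.14, "Q4_K_M": 4.37, "Q5_0": 5.00,
    "Q5_K_S": 5.00, "Q5_K_M": 5.13, "Q6_K": 5.94, "Q8_0": 7.70,
}

def pick_quant(budget_gb, headroom_gb=1.0):
    """Return the largest quant whose file fits under budget_gb,
    reserving headroom_gb for KV cache and runtime overhead.
    Returns None if nothing fits."""
    fitting = {name: size for name, size in QUANTS.items()
               if size <= budget_gb - headroom_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)

print(pick_quant(8))   # on a machine with ~8 GB free
```

The 1 GB headroom default is an assumption for illustration; actual KV-cache and runtime overhead depends on the backend and context length actually used.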