eacortes committed
Commit 12172b7 · 1 Parent(s): c82c645
Upload folder using huggingface_hub
.gitattributes CHANGED
@@ -33,3 +33,14 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ model.rknn filter=lfs diff=lfs merge=lfs -text
+ model_b1_s1024.rknn filter=lfs diff=lfs merge=lfs -text
+ model_b1_s256.rknn filter=lfs diff=lfs merge=lfs -text
+ rknn/model_b1_s1024_o1.rknn filter=lfs diff=lfs merge=lfs -text
+ rknn/model_b1_s1024_o2.rknn filter=lfs diff=lfs merge=lfs -text
+ rknn/model_b1_s1024_o3.rknn filter=lfs diff=lfs merge=lfs -text
+ rknn/model_b1_s1024_w8a8.rknn filter=lfs diff=lfs merge=lfs -text
+ rknn/model_o1.rknn filter=lfs diff=lfs merge=lfs -text
+ rknn/model_o2.rknn filter=lfs diff=lfs merge=lfs -text
+ rknn/model_o3.rknn filter=lfs diff=lfs merge=lfs -text
+ rknn/model_w8a8.rknn filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,254 @@
---
library_name: rk-transformers
license: apache-2.0
language:
- en
tags:
- fill-mask
- masked-lm
- long-context
- modernbert
- rknn
- rockchip
- npu
- rk-transformers
- rk3588
pipeline_tag: fill-mask
inference: false
datasets:
- sentence-transformers/natural-questions
base_model: answerdotai/ModernBERT-base
model_name: ModernBERT-base
---
# ModernBERT-base (RKNN2)

> This is an RKNN-compatible version of the [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) model. It has been optimized for Rockchip NPUs using the [rk-transformers](https://github.com/emapco/rk-transformers) library.

<details><summary>Click to see the RKNN model details and usage examples</summary>

## Model Details

- **Original Model:** [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)
- **Target Platform:** rk3588
- **rknn-toolkit2 Version:** 2.3.2
- **rk-transformers Version:** 0.3.1

### Available Model Files

| Model File | Optimization Level | Quantization | File Size |
| :--------- | :----------------- | :----------- | :-------- |
| [model.rknn](./model.rknn) | 0 | float16 | 316.2 MB |
| [model_b1_s1024.rknn](./model_b1_s1024.rknn) | 0 | float16 | 370.2 MB |
| [model_b1_s256.rknn](./model_b1_s256.rknn) | 0 | float16 | 301.4 MB |
| [rknn/model_b1_s1024_o1.rknn](./rknn/model_b1_s1024_o1.rknn) | 1 | float16 | 370.2 MB |
| [rknn/model_b1_s1024_o2.rknn](./rknn/model_b1_s1024_o2.rknn) | 2 | float16 | 370.2 MB |
| [rknn/model_b1_s1024_o3.rknn](./rknn/model_b1_s1024_o3.rknn) | 3 | float16 | 370.2 MB |
| [rknn/model_b1_s1024_w8a8.rknn](./rknn/model_b1_s1024_w8a8.rknn) | 0 | w8a8 | 193.0 MB |
| [rknn/model_o1.rknn](./rknn/model_o1.rknn) | 1 | float16 | 316.2 MB |
| [rknn/model_o2.rknn](./rknn/model_o2.rknn) | 2 | float16 | 316.2 MB |
| [rknn/model_o3.rknn](./rknn/model_o3.rknn) | 3 | float16 | 316.2 MB |
| [rknn/model_w8a8.rknn](./rknn/model_w8a8.rknn) | 0 | w8a8 | 164.9 MB |

## Usage

### Installation

Install `rk-transformers` with inference dependencies to use this model:

```bash
pip install rk-transformers[inference]
```

#### RK-Transformers API

```python
from rktransformers import RKModelForFeatureExtraction
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("rk-transformers/ModernBERT-base")
model = RKModelForFeatureExtraction.from_pretrained(
    "rk-transformers/ModernBERT-base",
    platform="rk3588",
    core_mask="auto",
)

inputs = tokenizer("My name is Philipp and I live in Germany.", return_tensors="np")
outputs = model(**inputs)
last_hidden_state = outputs.last_hidden_state
print(last_hidden_state.shape)

# Load a specific optimized/quantized model file
model = RKModelForFeatureExtraction.from_pretrained(
    "rk-transformers/ModernBERT-base",
    platform="rk3588",
    file_name="rknn/model_b1_s1024_w8a8.rknn"
)
```

## Configuration

The full configuration for all exported RKNN models is available in the [config.json](./config.json) file.

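If you want to check which export settings produced a given `.rknn` file before loading it, the per-file metadata is stored under the `rknn` key of that config. Below is a minimal sketch (not part of the rk-transformers API) that assumes only `huggingface_hub` and the repository id used in the example above; the key names mirror the `config.json` in this repository.

```python
# Minimal sketch: inspect the per-file RKNN export metadata recorded in config.json.
import json

from huggingface_hub import hf_hub_download

config_path = hf_hub_download("rk-transformers/ModernBERT-base", "config.json")
with open(config_path) as f:
    config = json.load(f)

# Each entry under "rknn" describes one exported .rknn file.
for file_name, export in config["rknn"].items():
    opt_level = export["optimization"]["optimization_level"]
    quant = export["quantization"]
    dtype = quant["quantized_dtype"] if quant["do_quantization"] else export["float_dtype"]
    print(f"{file_name}: seq={export['max_seq_length']} opt={opt_level} dtype={dtype}")
```
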
</details>

---

# ModernBERT

## Table of Contents
1. [Model Summary](#model-summary)
2. [Usage](#usage)
3. [Evaluation](#evaluation)
4. [Limitations](#limitations)
5. [Training](#training)
6. [License](#license)
7. [Citation](#citation)

## Model Summary

ModernBERT is a modernized bidirectional encoder-only Transformer model (BERT-style) pre-trained on 2 trillion tokens of English and code data with a native context length of up to 8,192 tokens. ModernBERT leverages recent architectural improvements such as:

- **Rotary Positional Embeddings (RoPE)** for long-context support.
- **Local-Global Alternating Attention** for efficiency on long inputs.
- **Unpadding and Flash Attention** for efficient inference.

ModernBERT’s native long context length makes it ideal for tasks that require processing long documents, such as retrieval, classification, and semantic search within large corpora. The model was trained on a large corpus of text and code, making it suitable for a wide range of downstream tasks, including code retrieval and hybrid (text + code) semantic search.

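For the retrieval and semantic-search use cases above, the encoder's token embeddings are usually pooled into one vector per text. The snippet below is an illustrative sketch and not part of the original model card: it uses plain `transformers` with attention-mask-aware mean pooling over `last_hidden_state`, and in practice you would fine-tune the encoder (for example with a contrastive objective) before relying on the resulting embeddings.

```python
# Illustrative sketch: turn ModernBERT token embeddings into sentence embeddings
# with attention-mask-aware mean pooling.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

texts = ["def bubble_sort(xs): ...", "How do I sort a list in Python?"]
batch = tokenizer(texts, padding=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, seq_len, hidden)

mask = batch["attention_mask"].unsqueeze(-1).float()
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # mean over real tokens
similarity = torch.nn.functional.cosine_similarity(embeddings[0], embeddings[1], dim=0)
print(similarity.item())
```
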
It is available in the following sizes:

- [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) - 22 layers, 149 million parameters
- [ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) - 28 layers, 395 million parameters

For more information about ModernBERT, we recommend our [release blog post](https://huggingface.co/blog/modernbert) for a high-level overview, and our [arXiv pre-print](https://arxiv.org/abs/2412.13663) for in-depth information.

*ModernBERT is a collaboration between [Answer.AI](https://answer.ai), [LightOn](https://lighton.ai), and friends.*

## Usage

You can use these models directly with the `transformers` library starting from v4.48.0:

```sh
pip install -U "transformers>=4.48.0"
```

Since ModernBERT is a Masked Language Model (MLM), you can use the `fill-mask` pipeline or load it via `AutoModelForMaskedLM`. To use ModernBERT for downstream tasks like classification, retrieval, or QA, fine-tune it following standard BERT fine-tuning recipes (a minimal starting point is sketched below).

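As a hedged sketch of that fine-tuning path (not taken from the upstream model card), the snippet below loads the base checkpoint with a sequence-classification head; `num_labels` and the example inputs are placeholders you would replace with your own task and training loop.

```python
# Sketch only: attach a classification head to ModernBERT for fine-tuning.
# num_labels=2 and the example texts are illustrative placeholders.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# Tokenize your own (texts, labels) pairs, then train with the standard
# transformers Trainer or a plain PyTorch loop, as with any BERT-style encoder.
batch = tokenizer(["great movie", "terrible movie"], padding=True, return_tensors="pt")
outputs = model(**batch)
print(outputs.logits.shape)  # (2, num_labels)
```
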
**⚠️ If your GPU supports it, we recommend using ModernBERT with Flash Attention 2 to reach the highest efficiency. To do so, install Flash Attention as follows, then use the model as normal:**

```bash
pip install flash-attn
```

Using `AutoModelForMaskedLM`:

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

text = "The capital of France is [MASK]."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)

# To get predictions for the mask:
masked_index = inputs["input_ids"][0].tolist().index(tokenizer.mask_token_id)
predicted_token_id = outputs.logits[0, masked_index].argmax(axis=-1)
predicted_token = tokenizer.decode(predicted_token_id)
print("Predicted token:", predicted_token)
# Predicted token: Paris
```

Using a pipeline:

```python
import torch
from transformers import pipeline
from pprint import pprint

pipe = pipeline(
    "fill-mask",
    model="answerdotai/ModernBERT-base",
    torch_dtype=torch.bfloat16,
)

input_text = "He walked to the [MASK]."
results = pipe(input_text)
pprint(results)
```

**Note:** ModernBERT does not use token type IDs, unlike some earlier BERT models. Most downstream usage is identical to standard BERT models on the Hugging Face Hub, except you can omit the `token_type_ids` parameter.

## Evaluation

We evaluate ModernBERT across a range of tasks, including natural language understanding (GLUE), general retrieval (BEIR), long-context retrieval (MLDR), and code retrieval (CodeSearchNet and StackQA).

**Key highlights:**
- On GLUE, ModernBERT-base surpasses other similarly-sized encoder models, and ModernBERT-large is second only to DeBERTa-v3-large.
- For general retrieval tasks, ModernBERT performs well on BEIR in both single-vector (DPR-style) and multi-vector (ColBERT-style) settings.
- Thanks to the inclusion of code data in its training mixture, ModernBERT as a backbone also achieves new state-of-the-art code retrieval results on CodeSearchNet and StackQA.

### Base Models

| Model | IR (DPR) | IR (DPR) | IR (DPR) | IR (ColBERT) | IR (ColBERT) | NLU | Code | Code |
|------------|----------|----------|----------|--------------|--------------|------|------|------|
| | BEIR | MLDR_OOD | MLDR_ID | BEIR | MLDR_OOD | GLUE | CSN | SQA |
| BERT | 38.9 | 23.9 | 32.2 | 49.0 | 28.1 | 84.7 | 41.2 | 59.5 |
| RoBERTa | 37.7 | 22.9 | 32.8 | 48.7 | 28.2 | 86.4 | 44.3 | 59.6 |
| DeBERTaV3 | 20.2 | 5.4 | 13.4 | 47.1 | 21.9 | 88.1 | 17.5 | 18.6 |
| NomicBERT | 41.0 | 26.7 | 30.3 | 49.9 | 61.3 | 84.0 | 41.6 | 61.4 |
| GTE-en-MLM | 41.4 | **34.3** | **44.4** | 48.2 | 69.3 | 85.6 | 44.9 | 71.4 |
| ModernBERT | **41.6** | 27.4 | 44.0 | **51.3** | **80.2** | **88.4** | **56.4** | **73.6** |

---

### Large Models

| Model | IR (DPR) | IR (DPR) | IR (DPR) | IR (ColBERT) | IR (ColBERT) | NLU | Code | Code |
|------------|----------|----------|----------|--------------|--------------|------|------|------|
| | BEIR | MLDR_OOD | MLDR_ID | BEIR | MLDR_OOD | GLUE | CSN | SQA |
| BERT | 38.9 | 23.3 | 31.7 | 49.5 | 28.5 | 85.2 | 41.6 | 60.8 |
| RoBERTa | 41.4 | 22.6 | 36.1 | 49.8 | 28.8 | 88.9 | 47.3 | 68.1 |
| DeBERTaV3 | 25.6 | 7.1 | 19.2 | 46.7 | 23.0 | **91.4** | 21.2 | 19.7 |
| GTE-en-MLM | 42.5 | **36.4** | **48.9** | 50.7 | 71.3 | 87.6 | 40.5 | 66.9 |
| ModernBERT | **44.0** | 34.3 | 48.6 | **52.4** | **80.4** | 90.4 | **59.5** | **83.9** |

*Table 1: Overview of results for all models across all tasks. CSN refers to CodeSearchNet and SQA to StackQA. MLDR_ID refers to in-domain evaluation (fine-tuned on the training set) and MLDR_OOD to out-of-domain evaluation.*

ModernBERT’s strong results, coupled with its efficient runtime on long-context inputs, demonstrate that encoder-only models can be significantly improved through modern architectural choices and extensive pretraining on diversified data sources.

## Limitations

ModernBERT’s training data is primarily English and code, so performance may be lower for other languages. While it can handle long sequences efficiently, using the full 8,192-token window may be slower than short-context inference. Like any large language model, ModernBERT may produce representations that reflect biases present in its training data. Verify critical or sensitive outputs before relying on them.

## Training

- Architecture: Encoder-only, Pre-Norm Transformer with GeGLU activations.
- Sequence Length: Pre-trained up to 1,024 tokens, then extended to 8,192 tokens.
- Data: 2 trillion tokens of English text and code.
- Optimizer: StableAdamW with trapezoidal LR scheduling and 1-sqrt decay.
- Hardware: Trained on 8x H100 GPUs.

See the paper for more details.

## License

We release the ModernBERT model architectures, model weights, and training codebase under the Apache 2.0 license.

## Citation

If you use ModernBERT in your work, please cite:

```bibtex
@misc{modernbert,
  title={Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference},
  author={Benjamin Warner and Antoine Chaffin and Benjamin Clavié and Orion Weller and Oskar Hallström and Said Taghadouini and Alexis Gallagher and Raja Biswas and Faisal Ladhak and Tom Aarsen and Nathan Cooper and Griffin Adams and Jeremy Howard and Iacopo Poli},
  year={2024},
  eprint={2412.13663},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2412.13663},
}
```
config.json ADDED
@@ -0,0 +1,539 @@
1
+ {
2
+ "architectures": [
3
+ "ModernBertForMaskedLM"
4
+ ],
5
+ "attention_bias": false,
6
+ "attention_dropout": 0.0,
7
+ "bos_token_id": 50281,
8
+ "classifier_activation": "gelu",
9
+ "classifier_bias": false,
10
+ "classifier_dropout": 0.0,
11
+ "classifier_pooling": "mean",
12
+ "cls_token_id": 50281,
13
+ "decoder_bias": true,
14
+ "deterministic_flash_attn": false,
15
+ "embedding_dropout": 0.0,
16
+ "eos_token_id": 50282,
17
+ "global_attn_every_n_layers": 3,
18
+ "global_rope_theta": 160000.0,
19
+ "gradient_checkpointing": false,
20
+ "hidden_activation": "gelu",
21
+ "hidden_size": 768,
22
+ "initializer_cutoff_factor": 2.0,
23
+ "initializer_range": 0.02,
24
+ "intermediate_size": 1152,
25
+ "layer_norm_eps": 1e-05,
26
+ "local_attention": 128,
27
+ "local_rope_theta": 10000.0,
28
+ "max_position_embeddings": 8192,
29
+ "mlp_bias": false,
30
+ "mlp_dropout": 0.0,
31
+ "model_type": "modernbert",
32
+ "norm_bias": false,
33
+ "norm_eps": 1e-05,
34
+ "num_attention_heads": 12,
35
+ "num_hidden_layers": 22,
36
+ "pad_token_id": 50283,
37
+ "position_embedding_type": "absolute",
38
+ "repad_logits_with_grad": false,
39
+ "rknn": {
40
+ "model.rknn": {
41
+ "batch_size": 1,
42
+ "custom_string": null,
43
+ "dynamic_input": null,
44
+ "float_dtype": "float16",
45
+ "inputs_yuv_fmt": null,
46
+ "max_seq_length": 512,
47
+ "mean_values": null,
48
+ "model_input_names": [
49
+ "input_ids",
50
+ "attention_mask"
51
+ ],
52
+ "opset": 19,
53
+ "optimization": {
54
+ "compress_weight": false,
55
+ "enable_flash_attention": true,
56
+ "model_pruning": false,
57
+ "optimization_level": 0,
58
+ "remove_reshape": false,
59
+ "remove_weight": false,
60
+ "sparse_infer": false
61
+ },
62
+ "quantization": {
63
+ "auto_hybrid_cos_thresh": 0.98,
64
+ "auto_hybrid_euc_thresh": null,
65
+ "dataset_columns": null,
66
+ "dataset_name": null,
67
+ "dataset_size": 128,
68
+ "dataset_split": null,
69
+ "dataset_subset": null,
70
+ "do_quantization": false,
71
+ "quant_img_RGB2BGR": false,
72
+ "quantized_algorithm": "normal",
73
+ "quantized_dtype": "w8a8",
74
+ "quantized_hybrid_level": 0,
75
+ "quantized_method": "channel"
76
+ },
77
+ "rktransformers_version": "0.3.1",
78
+ "single_core_mode": false,
79
+ "std_values": null,
80
+ "target_platform": "rk3588",
81
+ "task": "feature-extraction",
82
+ "task_kwargs": null
83
+ },
84
+ "model_b1_s1024.rknn": {
85
+ "batch_size": 1,
86
+ "custom_string": null,
87
+ "dynamic_input": null,
88
+ "float_dtype": "float16",
89
+ "inputs_yuv_fmt": null,
90
+ "max_seq_length": 1024,
91
+ "mean_values": null,
92
+ "model_input_names": [
93
+ "input_ids",
94
+ "attention_mask"
95
+ ],
96
+ "opset": 19,
97
+ "optimization": {
98
+ "compress_weight": false,
99
+ "enable_flash_attention": true,
100
+ "model_pruning": false,
101
+ "optimization_level": 0,
102
+ "remove_reshape": false,
103
+ "remove_weight": false,
104
+ "sparse_infer": false
105
+ },
106
+ "quantization": {
107
+ "auto_hybrid_cos_thresh": 0.98,
108
+ "auto_hybrid_euc_thresh": null,
109
+ "dataset_columns": null,
110
+ "dataset_name": null,
111
+ "dataset_size": 128,
112
+ "dataset_split": null,
113
+ "dataset_subset": null,
114
+ "do_quantization": false,
115
+ "quant_img_RGB2BGR": false,
116
+ "quantized_algorithm": "normal",
117
+ "quantized_dtype": "w8a8",
118
+ "quantized_hybrid_level": 0,
119
+ "quantized_method": "channel"
120
+ },
121
+ "rktransformers_version": "0.3.1",
122
+ "single_core_mode": false,
123
+ "std_values": null,
124
+ "target_platform": "rk3588",
125
+ "task": "feature-extraction",
126
+ "task_kwargs": null
127
+ },
128
+ "model_b1_s256.rknn": {
129
+ "batch_size": 1,
130
+ "custom_string": null,
131
+ "dynamic_input": null,
132
+ "float_dtype": "float16",
133
+ "inputs_yuv_fmt": null,
134
+ "max_seq_length": 256,
135
+ "mean_values": null,
136
+ "model_input_names": [
137
+ "input_ids",
138
+ "attention_mask"
139
+ ],
140
+ "opset": 19,
141
+ "optimization": {
142
+ "compress_weight": false,
143
+ "enable_flash_attention": true,
144
+ "model_pruning": false,
145
+ "optimization_level": 0,
146
+ "remove_reshape": false,
147
+ "remove_weight": false,
148
+ "sparse_infer": false
149
+ },
150
+ "quantization": {
151
+ "auto_hybrid_cos_thresh": 0.98,
152
+ "auto_hybrid_euc_thresh": null,
153
+ "dataset_columns": null,
154
+ "dataset_name": null,
155
+ "dataset_size": 128,
156
+ "dataset_split": null,
157
+ "dataset_subset": null,
158
+ "do_quantization": false,
159
+ "quant_img_RGB2BGR": false,
160
+ "quantized_algorithm": "normal",
161
+ "quantized_dtype": "w8a8",
162
+ "quantized_hybrid_level": 0,
163
+ "quantized_method": "channel"
164
+ },
165
+ "rktransformers_version": "0.3.1",
166
+ "single_core_mode": false,
167
+ "std_values": null,
168
+ "target_platform": "rk3588",
169
+ "task": "feature-extraction",
170
+ "task_kwargs": null
171
+ },
172
+ "rknn/model_b1_s1024_o1.rknn": {
173
+ "batch_size": 1,
174
+ "custom_string": null,
175
+ "dynamic_input": null,
176
+ "float_dtype": "float16",
177
+ "inputs_yuv_fmt": null,
178
+ "max_seq_length": 1024,
179
+ "mean_values": null,
180
+ "model_input_names": [
181
+ "input_ids",
182
+ "attention_mask"
183
+ ],
184
+ "opset": 19,
185
+ "optimization": {
186
+ "compress_weight": false,
187
+ "enable_flash_attention": true,
188
+ "model_pruning": false,
189
+ "optimization_level": 1,
190
+ "remove_reshape": false,
191
+ "remove_weight": false,
192
+ "sparse_infer": false
193
+ },
194
+ "quantization": {
195
+ "auto_hybrid_cos_thresh": 0.98,
196
+ "auto_hybrid_euc_thresh": null,
197
+ "dataset_columns": null,
198
+ "dataset_name": null,
199
+ "dataset_size": 128,
200
+ "dataset_split": null,
201
+ "dataset_subset": null,
202
+ "do_quantization": false,
203
+ "quant_img_RGB2BGR": false,
204
+ "quantized_algorithm": "normal",
205
+ "quantized_dtype": "w8a8",
206
+ "quantized_hybrid_level": 0,
207
+ "quantized_method": "channel"
208
+ },
209
+ "rktransformers_version": "0.3.1",
210
+ "single_core_mode": false,
211
+ "std_values": null,
212
+ "target_platform": "rk3588",
213
+ "task": "feature-extraction",
214
+ "task_kwargs": null
215
+ },
216
+ "rknn/model_b1_s1024_o2.rknn": {
217
+ "batch_size": 1,
218
+ "custom_string": null,
219
+ "dynamic_input": null,
220
+ "float_dtype": "float16",
221
+ "inputs_yuv_fmt": null,
222
+ "max_seq_length": 1024,
223
+ "mean_values": null,
224
+ "model_input_names": [
225
+ "input_ids",
226
+ "attention_mask"
227
+ ],
228
+ "opset": 19,
229
+ "optimization": {
230
+ "compress_weight": false,
231
+ "enable_flash_attention": true,
232
+ "model_pruning": false,
233
+ "optimization_level": 2,
234
+ "remove_reshape": false,
235
+ "remove_weight": false,
236
+ "sparse_infer": false
237
+ },
238
+ "quantization": {
239
+ "auto_hybrid_cos_thresh": 0.98,
240
+ "auto_hybrid_euc_thresh": null,
241
+ "dataset_columns": null,
242
+ "dataset_name": null,
243
+ "dataset_size": 128,
244
+ "dataset_split": null,
245
+ "dataset_subset": null,
246
+ "do_quantization": false,
247
+ "quant_img_RGB2BGR": false,
248
+ "quantized_algorithm": "normal",
249
+ "quantized_dtype": "w8a8",
250
+ "quantized_hybrid_level": 0,
251
+ "quantized_method": "channel"
252
+ },
253
+ "rktransformers_version": "0.3.1",
254
+ "single_core_mode": false,
255
+ "std_values": null,
256
+ "target_platform": "rk3588",
257
+ "task": "feature-extraction",
258
+ "task_kwargs": null
259
+ },
260
+ "rknn/model_b1_s1024_o3.rknn": {
261
+ "batch_size": 1,
262
+ "custom_string": null,
263
+ "dynamic_input": null,
264
+ "float_dtype": "float16",
265
+ "inputs_yuv_fmt": null,
266
+ "max_seq_length": 1024,
267
+ "mean_values": null,
268
+ "model_input_names": [
269
+ "input_ids",
270
+ "attention_mask"
271
+ ],
272
+ "opset": 19,
273
+ "optimization": {
274
+ "compress_weight": false,
275
+ "enable_flash_attention": true,
276
+ "model_pruning": false,
277
+ "optimization_level": 3,
278
+ "remove_reshape": false,
279
+ "remove_weight": false,
280
+ "sparse_infer": false
281
+ },
282
+ "quantization": {
283
+ "auto_hybrid_cos_thresh": 0.98,
284
+ "auto_hybrid_euc_thresh": null,
285
+ "dataset_columns": null,
286
+ "dataset_name": null,
287
+ "dataset_size": 128,
288
+ "dataset_split": null,
289
+ "dataset_subset": null,
290
+ "do_quantization": false,
291
+ "quant_img_RGB2BGR": false,
292
+ "quantized_algorithm": "normal",
293
+ "quantized_dtype": "w8a8",
294
+ "quantized_hybrid_level": 0,
295
+ "quantized_method": "channel"
296
+ },
297
+ "rktransformers_version": "0.3.1",
298
+ "single_core_mode": false,
299
+ "std_values": null,
300
+ "target_platform": "rk3588",
301
+ "task": "feature-extraction",
302
+ "task_kwargs": null
303
+ },
304
+ "rknn/model_b1_s1024_w8a8.rknn": {
305
+ "batch_size": 1,
306
+ "custom_string": null,
307
+ "dynamic_input": null,
308
+ "float_dtype": "float16",
309
+ "inputs_yuv_fmt": null,
310
+ "max_seq_length": 1024,
311
+ "mean_values": null,
312
+ "model_input_names": [
313
+ "input_ids",
314
+ "attention_mask"
315
+ ],
316
+ "opset": 19,
317
+ "optimization": {
318
+ "compress_weight": false,
319
+ "enable_flash_attention": true,
320
+ "model_pruning": false,
321
+ "optimization_level": 0,
322
+ "remove_reshape": false,
323
+ "remove_weight": false,
324
+ "sparse_infer": false
325
+ },
326
+ "quantization": {
327
+ "auto_hybrid_cos_thresh": 0.98,
328
+ "auto_hybrid_euc_thresh": null,
329
+ "dataset_columns": [
330
+ "answer"
331
+ ],
332
+ "dataset_name": "sentence-transformers/natural-questions",
333
+ "dataset_size": 256,
334
+ "dataset_split": [
335
+ "train"
336
+ ],
337
+ "dataset_subset": null,
338
+ "do_quantization": true,
339
+ "quant_img_RGB2BGR": false,
340
+ "quantized_algorithm": "normal",
341
+ "quantized_dtype": "w8a8",
342
+ "quantized_hybrid_level": 0,
343
+ "quantized_method": "channel"
344
+ },
345
+ "rktransformers_version": "0.3.1",
346
+ "single_core_mode": false,
347
+ "std_values": null,
348
+ "target_platform": "rk3588",
349
+ "task": "feature-extraction",
350
+ "task_kwargs": null
351
+ },
352
+ "rknn/model_o1.rknn": {
353
+ "batch_size": 1,
354
+ "custom_string": null,
355
+ "dynamic_input": null,
356
+ "float_dtype": "float16",
357
+ "inputs_yuv_fmt": null,
358
+ "max_seq_length": 512,
359
+ "mean_values": null,
360
+ "model_input_names": [
361
+ "input_ids",
362
+ "attention_mask"
363
+ ],
364
+ "opset": 19,
365
+ "optimization": {
366
+ "compress_weight": false,
367
+ "enable_flash_attention": true,
368
+ "model_pruning": false,
369
+ "optimization_level": 1,
370
+ "remove_reshape": false,
371
+ "remove_weight": false,
372
+ "sparse_infer": false
373
+ },
374
+ "quantization": {
375
+ "auto_hybrid_cos_thresh": 0.98,
376
+ "auto_hybrid_euc_thresh": null,
377
+ "dataset_columns": null,
378
+ "dataset_name": null,
379
+ "dataset_size": 128,
380
+ "dataset_split": null,
381
+ "dataset_subset": null,
382
+ "do_quantization": false,
383
+ "quant_img_RGB2BGR": false,
384
+ "quantized_algorithm": "normal",
385
+ "quantized_dtype": "w8a8",
386
+ "quantized_hybrid_level": 0,
387
+ "quantized_method": "channel"
388
+ },
389
+ "rktransformers_version": "0.3.1",
390
+ "single_core_mode": false,
391
+ "std_values": null,
392
+ "target_platform": "rk3588",
393
+ "task": "feature-extraction",
394
+ "task_kwargs": null
395
+ },
396
+ "rknn/model_o2.rknn": {
397
+ "batch_size": 1,
398
+ "custom_string": null,
399
+ "dynamic_input": null,
400
+ "float_dtype": "float16",
401
+ "inputs_yuv_fmt": null,
402
+ "max_seq_length": 512,
403
+ "mean_values": null,
404
+ "model_input_names": [
405
+ "input_ids",
406
+ "attention_mask"
407
+ ],
408
+ "opset": 19,
409
+ "optimization": {
410
+ "compress_weight": false,
411
+ "enable_flash_attention": true,
412
+ "model_pruning": false,
413
+ "optimization_level": 2,
414
+ "remove_reshape": false,
415
+ "remove_weight": false,
416
+ "sparse_infer": false
417
+ },
418
+ "quantization": {
419
+ "auto_hybrid_cos_thresh": 0.98,
420
+ "auto_hybrid_euc_thresh": null,
421
+ "dataset_columns": null,
422
+ "dataset_name": null,
423
+ "dataset_size": 128,
424
+ "dataset_split": null,
425
+ "dataset_subset": null,
426
+ "do_quantization": false,
427
+ "quant_img_RGB2BGR": false,
428
+ "quantized_algorithm": "normal",
429
+ "quantized_dtype": "w8a8",
430
+ "quantized_hybrid_level": 0,
431
+ "quantized_method": "channel"
432
+ },
433
+ "rktransformers_version": "0.3.1",
434
+ "single_core_mode": false,
435
+ "std_values": null,
436
+ "target_platform": "rk3588",
437
+ "task": "feature-extraction",
438
+ "task_kwargs": null
439
+ },
440
+ "rknn/model_o3.rknn": {
441
+ "batch_size": 1,
442
+ "custom_string": null,
443
+ "dynamic_input": null,
444
+ "float_dtype": "float16",
445
+ "inputs_yuv_fmt": null,
446
+ "max_seq_length": 512,
447
+ "mean_values": null,
448
+ "model_input_names": [
449
+ "input_ids",
450
+ "attention_mask"
451
+ ],
452
+ "opset": 19,
453
+ "optimization": {
454
+ "compress_weight": false,
455
+ "enable_flash_attention": true,
456
+ "model_pruning": false,
457
+ "optimization_level": 3,
458
+ "remove_reshape": false,
459
+ "remove_weight": false,
460
+ "sparse_infer": false
461
+ },
462
+ "quantization": {
463
+ "auto_hybrid_cos_thresh": 0.98,
464
+ "auto_hybrid_euc_thresh": null,
465
+ "dataset_columns": null,
466
+ "dataset_name": null,
467
+ "dataset_size": 128,
468
+ "dataset_split": null,
469
+ "dataset_subset": null,
470
+ "do_quantization": false,
471
+ "quant_img_RGB2BGR": false,
472
+ "quantized_algorithm": "normal",
473
+ "quantized_dtype": "w8a8",
474
+ "quantized_hybrid_level": 0,
475
+ "quantized_method": "channel"
476
+ },
477
+ "rktransformers_version": "0.3.1",
478
+ "single_core_mode": false,
479
+ "std_values": null,
480
+ "target_platform": "rk3588",
481
+ "task": "feature-extraction",
482
+ "task_kwargs": null
483
+ },
484
+ "rknn/model_w8a8.rknn": {
485
+ "batch_size": 1,
486
+ "custom_string": null,
487
+ "dynamic_input": null,
488
+ "float_dtype": "float16",
489
+ "inputs_yuv_fmt": null,
490
+ "max_seq_length": 512,
491
+ "mean_values": null,
492
+ "model_input_names": [
493
+ "input_ids",
494
+ "attention_mask"
495
+ ],
496
+ "opset": 19,
497
+ "optimization": {
498
+ "compress_weight": false,
499
+ "enable_flash_attention": true,
500
+ "model_pruning": false,
501
+ "optimization_level": 0,
502
+ "remove_reshape": false,
503
+ "remove_weight": false,
504
+ "sparse_infer": false
505
+ },
506
+ "quantization": {
507
+ "auto_hybrid_cos_thresh": 0.98,
508
+ "auto_hybrid_euc_thresh": null,
509
+ "dataset_columns": [
510
+ "answer"
511
+ ],
512
+ "dataset_name": "sentence-transformers/natural-questions",
513
+ "dataset_size": 256,
514
+ "dataset_split": [
515
+ "train"
516
+ ],
517
+ "dataset_subset": null,
518
+ "do_quantization": true,
519
+ "quant_img_RGB2BGR": false,
520
+ "quantized_algorithm": "normal",
521
+ "quantized_dtype": "w8a8",
522
+ "quantized_hybrid_level": 0,
523
+ "quantized_method": "channel"
524
+ },
525
+ "rktransformers_version": "0.3.1",
526
+ "single_core_mode": false,
527
+ "std_values": null,
528
+ "target_platform": "rk3588",
529
+ "task": "feature-extraction",
530
+ "task_kwargs": null
531
+ }
532
+ },
533
+ "sep_token_id": 50282,
534
+ "sparse_pred_ignore_index": -100,
535
+ "sparse_prediction": false,
536
+ "torch_dtype": "float32",
537
+ "transformers_version": "4.55.4",
538
+ "vocab_size": 50368
539
+ }
model.rknn ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:59480bb74c8f27fe5f0971a8513e0b9fa211a8fe3f3349fd7f5b22bde5f02115
size 331520806
model_b1_s1024.rknn ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:47734deb70c567d2f7c1535714bac263a5a52fb7b97c77363e01bfbcc598a2f4
size 388226153
model_b1_s256.rknn ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b794689c544ed093e2a13b1ad5bfeb909b677fa4bd18c9f8b26d4da51fed8cc5
size 316003110
rknn/model_b1_s1024_o1.rknn ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f597f7a3a1e24201c410dcd7b5919a0c5a024172564bd271fe892b3808768566
size 388226153
rknn/model_b1_s1024_o2.rknn ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b82757fa7a3c28a117b940a467cf1692ce63bcc56c5858c87cabbfb5eb7df637
size 388226153
rknn/model_b1_s1024_o3.rknn ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cd3a6817885ad3723bcb2b93f26fd191dce3a4bf78355d806c51864fcf4e5863
size 388226153
rknn/model_o1.rknn ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3968978004632e0c32cafc7345fec5b8996826920985a658b70090acba9186cc
size 331520806
rknn/model_o2.rknn ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c1af2713e47768b9ce96c94010da6d057279ebe80b84b6c826722bc6d812982b
size 331520806
rknn/model_o3.rknn ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c4cb89f9b310ee6c3741a43caa45427944501f91fb451813191e542230ee0f1a
size 331520806
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": true,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,945 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "|||IP_ADDRESS|||",
5
+ "lstrip": false,
6
+ "normalized": true,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": false
10
+ },
11
+ "1": {
12
+ "content": "<|padding|>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "50254": {
20
+ "content": " ",
21
+ "lstrip": false,
22
+ "normalized": true,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": false
26
+ },
27
+ "50255": {
28
+ "content": " ",
29
+ "lstrip": false,
30
+ "normalized": true,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": false
34
+ },
35
+ "50256": {
36
+ "content": " ",
37
+ "lstrip": false,
38
+ "normalized": true,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": false
42
+ },
43
+ "50257": {
44
+ "content": " ",
45
+ "lstrip": false,
46
+ "normalized": true,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": false
50
+ },
51
+ "50258": {
52
+ "content": " ",
53
+ "lstrip": false,
54
+ "normalized": true,
55
+ "rstrip": false,
56
+ "single_word": false,
57
+ "special": false
58
+ },
59
+ "50259": {
60
+ "content": " ",
61
+ "lstrip": false,
62
+ "normalized": true,
63
+ "rstrip": false,
64
+ "single_word": false,
65
+ "special": false
66
+ },
67
+ "50260": {
68
+ "content": " ",
69
+ "lstrip": false,
70
+ "normalized": true,
71
+ "rstrip": false,
72
+ "single_word": false,
73
+ "special": false
74
+ },
75
+ "50261": {
76
+ "content": " ",
77
+ "lstrip": false,
78
+ "normalized": true,
79
+ "rstrip": false,
80
+ "single_word": false,
81
+ "special": false
82
+ },
83
+ "50262": {
84
+ "content": " ",
85
+ "lstrip": false,
86
+ "normalized": true,
87
+ "rstrip": false,
88
+ "single_word": false,
89
+ "special": false
90
+ },
91
+ "50263": {
92
+ "content": " ",
93
+ "lstrip": false,
94
+ "normalized": true,
95
+ "rstrip": false,
96
+ "single_word": false,
97
+ "special": false
98
+ },
99
+ "50264": {
100
+ "content": " ",
101
+ "lstrip": false,
102
+ "normalized": true,
103
+ "rstrip": false,
104
+ "single_word": false,
105
+ "special": false
106
+ },
107
+ "50265": {
108
+ "content": " ",
109
+ "lstrip": false,
110
+ "normalized": true,
111
+ "rstrip": false,
112
+ "single_word": false,
113
+ "special": false
114
+ },
115
+ "50266": {
116
+ "content": " ",
117
+ "lstrip": false,
118
+ "normalized": true,
119
+ "rstrip": false,
120
+ "single_word": false,
121
+ "special": false
122
+ },
123
+ "50267": {
124
+ "content": " ",
125
+ "lstrip": false,
126
+ "normalized": true,
127
+ "rstrip": false,
128
+ "single_word": false,
129
+ "special": false
130
+ },
131
+ "50268": {
132
+ "content": " ",
133
+ "lstrip": false,
134
+ "normalized": true,
135
+ "rstrip": false,
136
+ "single_word": false,
137
+ "special": false
138
+ },
139
+ "50269": {
140
+ "content": " ",
141
+ "lstrip": false,
142
+ "normalized": true,
143
+ "rstrip": false,
144
+ "single_word": false,
145
+ "special": false
146
+ },
147
+ "50270": {
148
+ "content": " ",
149
+ "lstrip": false,
150
+ "normalized": true,
151
+ "rstrip": false,
152
+ "single_word": false,
153
+ "special": false
154
+ },
155
+ "50271": {
156
+ "content": " ",
157
+ "lstrip": false,
158
+ "normalized": true,
159
+ "rstrip": false,
160
+ "single_word": false,
161
+ "special": false
162
+ },
163
+ "50272": {
164
+ "content": " ",
165
+ "lstrip": false,
166
+ "normalized": true,
167
+ "rstrip": false,
168
+ "single_word": false,
169
+ "special": false
170
+ },
171
+ "50273": {
172
+ "content": " ",
173
+ "lstrip": false,
174
+ "normalized": true,
175
+ "rstrip": false,
176
+ "single_word": false,
177
+ "special": false
178
+ },
179
+ "50274": {
180
+ "content": " ",
181
+ "lstrip": false,
182
+ "normalized": true,
183
+ "rstrip": false,
184
+ "single_word": false,
185
+ "special": false
186
+ },
187
+ "50275": {
188
+ "content": " ",
189
+ "lstrip": false,
190
+ "normalized": true,
191
+ "rstrip": false,
192
+ "single_word": false,
193
+ "special": false
194
+ },
195
+ "50276": {
196
+ "content": " ",
197
+ "lstrip": false,
198
+ "normalized": true,
199
+ "rstrip": false,
200
+ "single_word": false,
201
+ "special": false
202
+ },
203
+ "50277": {
204
+ "content": "|||EMAIL_ADDRESS|||",
205
+ "lstrip": false,
206
+ "normalized": true,
207
+ "rstrip": false,
208
+ "single_word": false,
209
+ "special": false
210
+ },
211
+ "50278": {
212
+ "content": "|||PHONE_NUMBER|||",
213
+ "lstrip": false,
214
+ "normalized": true,
215
+ "rstrip": false,
216
+ "single_word": false,
217
+ "special": false
218
+ },
219
+ "50279": {
220
+ "content": "<|endoftext|>",
221
+ "lstrip": false,
222
+ "normalized": false,
223
+ "rstrip": false,
224
+ "single_word": false,
225
+ "special": true
226
+ },
227
+ "50280": {
228
+ "content": "[UNK]",
229
+ "lstrip": false,
230
+ "normalized": false,
231
+ "rstrip": false,
232
+ "single_word": false,
233
+ "special": true
234
+ },
235
+ "50281": {
236
+ "content": "[CLS]",
237
+ "lstrip": false,
238
+ "normalized": false,
239
+ "rstrip": false,
240
+ "single_word": false,
241
+ "special": true
242
+ },
243
+ "50282": {
244
+ "content": "[SEP]",
245
+ "lstrip": false,
246
+ "normalized": false,
247
+ "rstrip": false,
248
+ "single_word": false,
249
+ "special": true
250
+ },
251
+ "50283": {
252
+ "content": "[PAD]",
253
+ "lstrip": false,
254
+ "normalized": false,
255
+ "rstrip": false,
256
+ "single_word": false,
257
+ "special": true
258
+ },
259
+ "50284": {
260
+ "content": "[MASK]",
261
+ "lstrip": true,
262
+ "normalized": false,
263
+ "rstrip": false,
264
+ "single_word": false,
265
+ "special": true
266
+ },
267
+ "50285": {
268
+ "content": "[unused0]",
269
+ "lstrip": false,
270
+ "normalized": true,
271
+ "rstrip": false,
272
+ "single_word": false,
273
+ "special": false
274
+ },
275
+ "50286": {
276
+ "content": "[unused1]",
277
+ "lstrip": false,
278
+ "normalized": true,
279
+ "rstrip": false,
280
+ "single_word": false,
281
+ "special": false
282
+ },
283
+ "50287": {
284
+ "content": "[unused2]",
285
+ "lstrip": false,
286
+ "normalized": true,
287
+ "rstrip": false,
288
+ "single_word": false,
289
+ "special": false
290
+ },
291
+ "50288": {
292
+ "content": "[unused3]",
293
+ "lstrip": false,
294
+ "normalized": true,
295
+ "rstrip": false,
296
+ "single_word": false,
297
+ "special": false
298
+ },
299
+ "50289": {
300
+ "content": "[unused4]",
301
+ "lstrip": false,
302
+ "normalized": true,
303
+ "rstrip": false,
304
+ "single_word": false,
305
+ "special": false
306
+ },
307
+ "50290": {
308
+ "content": "[unused5]",
309
+ "lstrip": false,
310
+ "normalized": true,
311
+ "rstrip": false,
312
+ "single_word": false,
313
+ "special": false
314
+ },
315
+ "50291": {
316
+ "content": "[unused6]",
317
+ "lstrip": false,
318
+ "normalized": true,
319
+ "rstrip": false,
320
+ "single_word": false,
321
+ "special": false
322
+ },
323
+ "50292": {
324
+ "content": "[unused7]",
325
+ "lstrip": false,
326
+ "normalized": true,
327
+ "rstrip": false,
328
+ "single_word": false,
329
+ "special": false
330
+ },
331
+ "50293": {
332
+ "content": "[unused8]",
333
+ "lstrip": false,
334
+ "normalized": true,
335
+ "rstrip": false,
336
+ "single_word": false,
337
+ "special": false
338
+ },
339
+ "50294": {
340
+ "content": "[unused9]",
341
+ "lstrip": false,
342
+ "normalized": true,
343
+ "rstrip": false,
344
+ "single_word": false,
345
+ "special": false
346
+ },
347
+ "50295": {
348
+ "content": "[unused10]",
349
+ "lstrip": false,
350
+ "normalized": true,
351
+ "rstrip": false,
352
+ "single_word": false,
353
+ "special": false
354
+ },
355
+ "50296": {
356
+ "content": "[unused11]",
357
+ "lstrip": false,
358
+ "normalized": true,
359
+ "rstrip": false,
360
+ "single_word": false,
361
+ "special": false
362
+ },
363
+ "50297": {
364
+ "content": "[unused12]",
365
+ "lstrip": false,
366
+ "normalized": true,
367
+ "rstrip": false,
368
+ "single_word": false,
369
+ "special": false
370
+ },
371
+ "50298": {
372
+ "content": "[unused13]",
373
+ "lstrip": false,
374
+ "normalized": true,
375
+ "rstrip": false,
376
+ "single_word": false,
377
+ "special": false
378
+ },
379
+ "50299": {
380
+ "content": "[unused14]",
381
+ "lstrip": false,
382
+ "normalized": true,
383
+ "rstrip": false,
384
+ "single_word": false,
385
+ "special": false
386
+ },
387
+ "50300": {
388
+ "content": "[unused15]",
389
+ "lstrip": false,
390
+ "normalized": true,
391
+ "rstrip": false,
392
+ "single_word": false,
393
+ "special": false
394
+ },
395
+ "50301": {
396
+ "content": "[unused16]",
397
+ "lstrip": false,
398
+ "normalized": true,
399
+ "rstrip": false,
400
+ "single_word": false,
401
+ "special": false
402
+ },
403
+ "50302": {
404
+ "content": "[unused17]",
405
+ "lstrip": false,
406
+ "normalized": true,
407
+ "rstrip": false,
408
+ "single_word": false,
409
+ "special": false
410
+ },
411
+ "50303": {
412
+ "content": "[unused18]",
413
+ "lstrip": false,
414
+ "normalized": true,
415
+ "rstrip": false,
416
+ "single_word": false,
417
+ "special": false
418
+ },
419
+ "50304": {
420
+ "content": "[unused19]",
421
+ "lstrip": false,
422
+ "normalized": true,
423
+ "rstrip": false,
424
+ "single_word": false,
425
+ "special": false
426
+ },
427
+ "50305": {
428
+ "content": "[unused20]",
429
+ "lstrip": false,
430
+ "normalized": true,
431
+ "rstrip": false,
432
+ "single_word": false,
433
+ "special": false
434
+ },
435
+ "50306": {
436
+ "content": "[unused21]",
437
+ "lstrip": false,
438
+ "normalized": true,
439
+ "rstrip": false,
440
+ "single_word": false,
441
+ "special": false
442
+ },
443
+ "50307": {
444
+ "content": "[unused22]",
445
+ "lstrip": false,
446
+ "normalized": true,
447
+ "rstrip": false,
448
+ "single_word": false,
449
+ "special": false
450
+ },
451
+ "50308": {
452
+ "content": "[unused23]",
453
+ "lstrip": false,
454
+ "normalized": true,
455
+ "rstrip": false,
456
+ "single_word": false,
457
+ "special": false
458
+ },
459
+ "50309": {
460
+ "content": "[unused24]",
461
+ "lstrip": false,
462
+ "normalized": true,
463
+ "rstrip": false,
464
+ "single_word": false,
465
+ "special": false
466
+ },
467
+ "50310": {
468
+ "content": "[unused25]",
469
+ "lstrip": false,
470
+ "normalized": true,
471
+ "rstrip": false,
472
+ "single_word": false,
473
+ "special": false
474
+ },
475
+ "50311": {
476
+ "content": "[unused26]",
477
+ "lstrip": false,
478
+ "normalized": true,
479
+ "rstrip": false,
480
+ "single_word": false,
481
+ "special": false
482
+ },
483
+ "50312": {
484
+ "content": "[unused27]",
485
+ "lstrip": false,
486
+ "normalized": true,
487
+ "rstrip": false,
488
+ "single_word": false,
489
+ "special": false
490
+ },
491
+ "50313": {
492
+ "content": "[unused28]",
493
+ "lstrip": false,
494
+ "normalized": true,
495
+ "rstrip": false,
496
+ "single_word": false,
497
+ "special": false
498
+ },
499
+ "50314": {
500
+ "content": "[unused29]",
501
+ "lstrip": false,
502
+ "normalized": true,
503
+ "rstrip": false,
504
+ "single_word": false,
505
+ "special": false
506
+ },
507
+ "50315": {
508
+ "content": "[unused30]",
509
+ "lstrip": false,
510
+ "normalized": true,
511
+ "rstrip": false,
512
+ "single_word": false,
513
+ "special": false
514
+ },
515
+ "50316": {
516
+ "content": "[unused31]",
517
+ "lstrip": false,
518
+ "normalized": true,
519
+ "rstrip": false,
520
+ "single_word": false,
521
+ "special": false
522
+ },
523
+ "50317": {
524
+ "content": "[unused32]",
525
+ "lstrip": false,
526
+ "normalized": true,
527
+ "rstrip": false,
528
+ "single_word": false,
529
+ "special": false
530
+ },
531
+ "50318": {
532
+ "content": "[unused33]",
533
+ "lstrip": false,
534
+ "normalized": true,
535
+ "rstrip": false,
536
+ "single_word": false,
537
+ "special": false
538
+ },
539
+ "50319": {
540
+ "content": "[unused34]",
541
+ "lstrip": false,
542
+ "normalized": true,
543
+ "rstrip": false,
544
+ "single_word": false,
545
+ "special": false
546
+ },
547
+ "50320": {
548
+ "content": "[unused35]",
549
+ "lstrip": false,
550
+ "normalized": true,
551
+ "rstrip": false,
552
+ "single_word": false,
553
+ "special": false
554
+ },
555
+ "50321": {
556
+ "content": "[unused36]",
557
+ "lstrip": false,
558
+ "normalized": true,
559
+ "rstrip": false,
560
+ "single_word": false,
561
+ "special": false
562
+ },
563
+ "50322": {
564
+ "content": "[unused37]",
565
+ "lstrip": false,
566
+ "normalized": true,
567
+ "rstrip": false,
568
+ "single_word": false,
569
+ "special": false
570
+ },
571
+ "50323": {
572
+ "content": "[unused38]",
573
+ "lstrip": false,
574
+ "normalized": true,
575
+ "rstrip": false,
576
+ "single_word": false,
577
+ "special": false
578
+ },
579
+ "50324": {
580
+ "content": "[unused39]",
581
+ "lstrip": false,
582
+ "normalized": true,
583
+ "rstrip": false,
584
+ "single_word": false,
585
+ "special": false
586
+ },
587
+ "50325": {
588
+ "content": "[unused40]",
589
+ "lstrip": false,
590
+ "normalized": true,
591
+ "rstrip": false,
592
+ "single_word": false,
593
+ "special": false
594
+ },
595
+ "50326": {
596
+ "content": "[unused41]",
597
+ "lstrip": false,
598
+ "normalized": true,
599
+ "rstrip": false,
600
+ "single_word": false,
601
+ "special": false
602
+ },
603
+ "50327": {
604
+ "content": "[unused42]",
605
+ "lstrip": false,
606
+ "normalized": true,
607
+ "rstrip": false,
608
+ "single_word": false,
609
+ "special": false
610
+ },
611
+ "50328": {
612
+ "content": "[unused43]",
613
+ "lstrip": false,
614
+ "normalized": true,
615
+ "rstrip": false,
616
+ "single_word": false,
617
+ "special": false
618
+ },
619
+ "50329": {
620
+ "content": "[unused44]",
621
+ "lstrip": false,
622
+ "normalized": true,
623
+ "rstrip": false,
624
+ "single_word": false,
625
+ "special": false
626
+ },
627
+ "50330": {
628
+ "content": "[unused45]",
629
+ "lstrip": false,
630
+ "normalized": true,
631
+ "rstrip": false,
632
+ "single_word": false,
633
+ "special": false
634
+ },
635
+ "50331": {
636
+ "content": "[unused46]",
637
+ "lstrip": false,
638
+ "normalized": true,
639
+ "rstrip": false,
640
+ "single_word": false,
641
+ "special": false
642
+ },
643
+ "50332": {
644
+ "content": "[unused47]",
645
+ "lstrip": false,
646
+ "normalized": true,
647
+ "rstrip": false,
648
+ "single_word": false,
649
+ "special": false
650
+ },
651
+ "50333": {
652
+ "content": "[unused48]",
653
+ "lstrip": false,
654
+ "normalized": true,
655
+ "rstrip": false,
656
+ "single_word": false,
657
+ "special": false
658
+ },
659
+ "50334": {
660
+ "content": "[unused49]",
661
+ "lstrip": false,
662
+ "normalized": true,
663
+ "rstrip": false,
664
+ "single_word": false,
665
+ "special": false
666
+ },
667
+ "50335": {
668
+ "content": "[unused50]",
669
+ "lstrip": false,
670
+ "normalized": true,
671
+ "rstrip": false,
672
+ "single_word": false,
673
+ "special": false
674
+ },
675
+ "50336": {
676
+ "content": "[unused51]",
677
+ "lstrip": false,
678
+ "normalized": true,
679
+ "rstrip": false,
680
+ "single_word": false,
681
+ "special": false
682
+ },
683
+ "50337": {
684
+ "content": "[unused52]",
685
+ "lstrip": false,
686
+ "normalized": true,
687
+ "rstrip": false,
688
+ "single_word": false,
689
+ "special": false
690
+ },
691
+ "50338": {
692
+ "content": "[unused53]",
693
+ "lstrip": false,
694
+ "normalized": true,
695
+ "rstrip": false,
696
+ "single_word": false,
697
+ "special": false
698
+ },
699
+ "50339": {
700
+ "content": "[unused54]",
701
+ "lstrip": false,
702
+ "normalized": true,
703
+ "rstrip": false,
704
+ "single_word": false,
705
+ "special": false
706
+ },
707
+ "50340": {
708
+ "content": "[unused55]",
709
+ "lstrip": false,
710
+ "normalized": true,
711
+ "rstrip": false,
712
+ "single_word": false,
713
+ "special": false
714
+ },
715
+ "50341": {
716
+ "content": "[unused56]",
717
+ "lstrip": false,
718
+ "normalized": true,
719
+ "rstrip": false,
720
+ "single_word": false,
721
+ "special": false
722
+ },
723
+ "50342": {
724
+ "content": "[unused57]",
725
+ "lstrip": false,
726
+ "normalized": true,
727
+ "rstrip": false,
728
+ "single_word": false,
729
+ "special": false
730
+ },
731
+ "50343": {
732
+ "content": "[unused58]",
733
+ "lstrip": false,
734
+ "normalized": true,
735
+ "rstrip": false,
736
+ "single_word": false,
737
+ "special": false
738
+ },
739
+ "50344": {
740
+ "content": "[unused59]",
741
+ "lstrip": false,
742
+ "normalized": true,
743
+ "rstrip": false,
744
+ "single_word": false,
745
+ "special": false
746
+ },
747
+ "50345": {
748
+ "content": "[unused60]",
749
+ "lstrip": false,
750
+ "normalized": true,
751
+ "rstrip": false,
752
+ "single_word": false,
753
+ "special": false
754
+ },
755
+ "50346": {
756
+ "content": "[unused61]",
757
+ "lstrip": false,
758
+ "normalized": true,
759
+ "rstrip": false,
760
+ "single_word": false,
761
+ "special": false
762
+ },
763
+ "50347": {
764
+ "content": "[unused62]",
765
+ "lstrip": false,
766
+ "normalized": true,
767
+ "rstrip": false,
768
+ "single_word": false,
769
+ "special": false
770
+ },
771
+ "50348": {
772
+ "content": "[unused63]",
773
+ "lstrip": false,
774
+ "normalized": true,
775
+ "rstrip": false,
776
+ "single_word": false,
777
+ "special": false
778
+ },
779
+ "50349": {
780
+ "content": "[unused64]",
781
+ "lstrip": false,
782
+ "normalized": true,
783
+ "rstrip": false,
784
+ "single_word": false,
785
+ "special": false
786
+ },
787
+ "50350": {
788
+ "content": "[unused65]",
789
+ "lstrip": false,
790
+ "normalized": true,
791
+ "rstrip": false,
792
+ "single_word": false,
793
+ "special": false
794
+ },
795
+ "50351": {
796
+ "content": "[unused66]",
797
+ "lstrip": false,
798
+ "normalized": true,
799
+ "rstrip": false,
800
+ "single_word": false,
801
+ "special": false
802
+ },
803
+ "50352": {
804
+ "content": "[unused67]",
805
+ "lstrip": false,
806
+ "normalized": true,
807
+ "rstrip": false,
808
+ "single_word": false,
809
+ "special": false
810
+ },
811
+ "50353": {
812
+ "content": "[unused68]",
813
+ "lstrip": false,
814
+ "normalized": true,
815
+ "rstrip": false,
816
+ "single_word": false,
817
+ "special": false
818
+ },
819
+ "50354": {
820
+ "content": "[unused69]",
821
+ "lstrip": false,
822
+ "normalized": true,
823
+ "rstrip": false,
824
+ "single_word": false,
825
+ "special": false
826
+ },
827
+ "50355": {
828
+ "content": "[unused70]",
829
+ "lstrip": false,
830
+ "normalized": true,
831
+ "rstrip": false,
832
+ "single_word": false,
833
+ "special": false
834
+ },
835
+ "50356": {
836
+ "content": "[unused71]",
837
+ "lstrip": false,
838
+ "normalized": true,
839
+ "rstrip": false,
840
+ "single_word": false,
841
+ "special": false
842
+ },
843
+ "50357": {
844
+ "content": "[unused72]",
845
+ "lstrip": false,
846
+ "normalized": true,
847
+ "rstrip": false,
848
+ "single_word": false,
849
+ "special": false
850
+ },
851
+ "50358": {
852
+ "content": "[unused73]",
853
+ "lstrip": false,
854
+ "normalized": true,
855
+ "rstrip": false,
856
+ "single_word": false,
857
+ "special": false
858
+ },
859
+ "50359": {
860
+ "content": "[unused74]",
861
+ "lstrip": false,
862
+ "normalized": true,
863
+ "rstrip": false,
864
+ "single_word": false,
865
+ "special": false
866
+ },
867
+ "50360": {
868
+ "content": "[unused75]",
869
+ "lstrip": false,
870
+ "normalized": true,
871
+ "rstrip": false,
872
+ "single_word": false,
873
+ "special": false
874
+ },
875
+ "50361": {
876
+ "content": "[unused76]",
877
+ "lstrip": false,
878
+ "normalized": true,
879
+ "rstrip": false,
880
+ "single_word": false,
881
+ "special": false
882
+ },
883
+ "50362": {
884
+ "content": "[unused77]",
885
+ "lstrip": false,
886
+ "normalized": true,
887
+ "rstrip": false,
888
+ "single_word": false,
889
+ "special": false
890
+ },
891
+ "50363": {
892
+ "content": "[unused78]",
893
+ "lstrip": false,
894
+ "normalized": true,
895
+ "rstrip": false,
896
+ "single_word": false,
897
+ "special": false
898
+ },
899
+ "50364": {
900
+ "content": "[unused79]",
901
+ "lstrip": false,
902
+ "normalized": true,
903
+ "rstrip": false,
904
+ "single_word": false,
905
+ "special": false
906
+ },
907
+ "50365": {
908
+ "content": "[unused80]",
909
+ "lstrip": false,
910
+ "normalized": true,
911
+ "rstrip": false,
912
+ "single_word": false,
913
+ "special": false
914
+ },
915
+ "50366": {
916
+ "content": "[unused81]",
917
+ "lstrip": false,
918
+ "normalized": true,
919
+ "rstrip": false,
920
+ "single_word": false,
921
+ "special": false
922
+ },
923
+ "50367": {
924
+ "content": "[unused82]",
925
+ "lstrip": false,
926
+ "normalized": true,
927
+ "rstrip": false,
928
+ "single_word": false,
929
+ "special": false
930
+ }
931
+ },
932
+ "clean_up_tokenization_spaces": true,
933
+ "cls_token": "[CLS]",
934
+ "extra_special_tokens": {},
935
+ "mask_token": "[MASK]",
936
+ "model_input_names": [
937
+ "input_ids",
938
+ "attention_mask"
939
+ ],
940
+ "model_max_length": 8192,
941
+ "pad_token": "[PAD]",
942
+ "sep_token": "[SEP]",
943
+ "tokenizer_class": "PreTrainedTokenizerFast",
944
+ "unk_token": "[UNK]"
945
+ }