ONNX
DaniAffCH committed
Commit 1ad7c2a · Parent(s): 2d662f4

[GSoC] Add block quantized models (#270)


* Gemm and MatMul block quantization support

* refactoring

* fix indentation

* make quantization node-name independent

* Block quantization tool:
  - support the constant weight category
  - add data type saturation
  - handle the case in which all elements within a block are the same

  Modified the benchmark script to support block-quantized models.

  Block-quantized some models.

* add missing block quantized models

* formatting

* add block-quantized models to the eval script; evaluate YuNet

* Add SFace and PPHumanSeg evaluation, fix the block quantization tool, fix the block-quantized handpose model, remove the block-quantized CRNN EN model

* change the evaluation metric in the block_quantize script and add a verbose mode

* Add evaluation for PP-ResNet and Mobilenet

* change the file suffix and update READMEs

* renamed int8bq

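The tool referenced above quantizes weights in fixed-size blocks, each with its own scale, and per the commit notes it saturates values to the target data type and handles blocks whose elements are all the same. A minimal sketch of blockwise int8 quantization along those lines (my own illustration under a symmetric-scale assumption, not the actual block_quantize.py implementation; all names here are hypothetical):

```python
import numpy as np

def block_quantize(weights: np.ndarray, block_size: int = 64):
    """Quantize a float32 weight array to int8 in fixed-size blocks.

    Each block gets its own scale, so an outlier in one block does not
    destroy precision elsewhere. Returns (int8 blocks, per-block scales).
    """
    flat = weights.astype(np.float32).ravel()
    pad = (-len(flat)) % block_size              # pad so length divides evenly
    blocks = np.pad(flat, (0, pad)).reshape(-1, block_size)

    max_abs = np.abs(blocks).max(axis=1, keepdims=True)
    # Edge case: a block whose max magnitude is 0 (e.g. all zeros) would
    # yield a zero scale; use scale 1 so dequantization stays exact.
    scales = np.where(max_abs == 0.0, 1.0, max_abs / 127.0)

    # Saturate to the int8 range before casting (data type saturation).
    q = np.clip(np.round(blocks / scales), -128, 127).astype(np.int8)
    return q, scales.astype(np.float32)

def block_dequantize(q: np.ndarray, scales: np.ndarray, orig_len: int):
    """Reconstruct the float32 weights; error is at most half a scale step."""
    return (q.astype(np.float32) * scales).ravel()[:orig_len]
```

With `block_size=64` (the value used for the int8bq models in this commit), a weight tensor is split into 64-element blocks and each block stores 64 int8 values plus one float32 scale, trading a small storage overhead for much better per-block precision than a single global scale.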
Files changed (1)
  1. README.md +3 -0
README.md CHANGED
@@ -8,15 +8,18 @@ Note:
  - Model files encode MobileFaceNet instances trained on the SFace loss function, see the [SFace paper](https://arxiv.org/abs/2205.12010) for reference.
  - ONNX file conversions from [original code base](https://github.com/zhongyy/SFace) thanks to [Chengrui Wang](https://github.com/crywang).
  - (As of Sep 2021) Supporting 5-landmark warping for now, see below for details.
+ - `face_recognition_sface_2021dec_int8bq.onnx` represents the block-quantized version in int8 precision and is generated using [block_quantize.py](../../tools/quantize/block_quantize.py) with `block_size=64`.

  Results of accuracy evaluation with [tools/eval](../../tools/eval).

  | Models | Accuracy |
  | ----------- | -------- |
  | SFace | 0.9940 |
+ | SFace block | 0.9942 |
  | SFace quant | 0.9932 |

  \*: 'quant' stands for 'quantized'.
+ \*\*: 'block' stands for 'blockwise quantized'.

  ## Demo