[GSoC] Add block quantized models (#270)
* Gemm and MatMul block quantization support
* refactoring
* fix indentation
* node name independent
* Block quantization tool (an illustrative sketch of the per-block scheme follows this list):
  - constant weight category supported
  - added data type saturation
  - handled the case in which all the elements within a block are the same
  - modified the benchmark script to support block-quantized models
  - block-quantized some models
* add missing block-quantized models
* formatting
* add block-quantized models to the eval script; evaluate YuNet
* add SFace and PP-HumanSeg evaluation, block quantization tool fix, handpose block-quantized model fix; removed block-quantized CRNN EN
* changed evaluation metric in the block_quantize script and added verbose mode
* add evaluation for PP-ResNet and MobileNet
* changed file suffix and updated READMEs
* renamed int8bq
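The block quantization tool commits above mention data-type saturation and special handling of blocks whose elements are all identical. The sketch below is a minimal, illustrative per-block int8 scheme that shows why both are needed; it is not the actual tools/quantize/block_quantize.py implementation, and the function names, the asymmetric scale/zero-point mapping, and the flattening of the weight tensor are assumptions made for the example.

```python
# Illustrative per-block int8 quantization; NOT the actual block_quantize.py code.
import numpy as np

def block_quantize(weights: np.ndarray, block_size: int = 64):
    """Quantize a weight tensor block by block to int8 with a per-block scale/zero-point."""
    w = weights.astype(np.float32).ravel()
    pad = (-w.size) % block_size              # pad so the length divides evenly into blocks
    blocks = np.pad(w, (0, pad)).reshape(-1, block_size)

    w_min = blocks.min(axis=1, keepdims=True)
    w_max = blocks.max(axis=1, keepdims=True)

    # Degenerate case: every element in a block is the same value.
    # w_max == w_min would give a zero scale (division by zero), so force scale = 1.
    scale = np.where(w_max > w_min, (w_max - w_min) / 255.0, 1.0)
    zero_point = np.round(-128.0 - w_min / scale)

    # Quantize, then saturate to the int8 range before casting (data type saturation).
    q = np.round(blocks / scale + zero_point)
    q = np.clip(q, -128, 127).astype(np.int8)
    return q, scale, zero_point

def block_dequantize(q: np.ndarray, scale: np.ndarray, zero_point: np.ndarray) -> np.ndarray:
    """Reconstruct approximate float weights from the block-quantized representation."""
    return (q.astype(np.float32) - zero_point) * scale
```

The clip guards against rounding pushing a value one step outside the int8 range, and the constant-block guard avoids a division by zero while still reproducing the block's value after dequantization.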
README.md CHANGED

```diff
@@ -8,15 +8,18 @@ Note:
 - Model files encode MobileFaceNet instances trained on the SFace loss function, see the [SFace paper](https://arxiv.org/abs/2205.12010) for reference.
 - ONNX file conversions from [original code base](https://github.com/zhongyy/SFace) thanks to [Chengrui Wang](https://github.com/crywang).
 - (As of Sep 2021) Supporting 5-landmark warping for now, see below for details.
+- `face_recognition_sface_2021dec_int8bq.onnx` represents the block-quantized version in int8 precision and is generated using [block_quantize.py](../../tools/quantize/block_quantize.py) with `block_size=64`.
 
 Results of accuracy evaluation with [tools/eval](../../tools/eval).
 
 | Models      | Accuracy |
 | ----------- | -------- |
 | SFace       | 0.9940   |
+| SFace block | 0.9942   |
 | SFace quant | 0.9932   |
 
 \*: 'quant' stands for 'quantized'.
+\*\*: 'block' stands for 'blockwise quantized'.
 
 ## Demo
 
```
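The block-quantized file named in the README diff is still a plain ONNX model, so it loads through OpenCV's dnn module the same way as the float and int8 variants. The snippet below is a minimal smoke-test sketch; the 112x112 aligned-face input size, the 128-dimensional output embedding, and the requirement of an OpenCV build recent enough to import the block-quantized graph are assumptions, and the preprocessing shown is deliberately bare.

```python
# Smoke-test sketch for the block-quantized SFace model (see assumptions above).
import cv2
import numpy as np

net = cv2.dnn.readNet("face_recognition_sface_2021dec_int8bq.onnx")

# Stand-in for an aligned face crop; real usage feeds the 5-landmark-warped face.
face = np.random.randint(0, 256, (112, 112, 3), dtype=np.uint8)
blob = cv2.dnn.blobFromImage(face)   # 1x3x112x112 float blob, no resize or mean subtraction here

net.setInput(blob)
embedding = net.forward()
print(embedding.shape)               # expected (1, 128) if the assumptions hold
```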