hadim committed on
Commit 5dbdc6d · verified · 1 Parent(s): b1ca31f

Release 0.2.0

README.md CHANGED
@@ -28,10 +28,11 @@ These models are converted from the official KataGo PyTorch checkpoints to ONNX
  | `kata1-b28c512nbt-adam-s11165M-d5387M` | 28 blocks, 512 channels |
  | `kata1-b28c512nbt-s12043015936-d5616446734` | 28 blocks, 512 channels |

- Each model is available in two versions:
+ Each model is available in three versions:

- - **`.onnx`** - Full precision (FP32)
- - **`.quant.onnx`** - Quantized (INT8) - ~4x smaller, suitable for web/edge deployment
+ - **`.fp32.onnx`** - Full precision (FP32) - Recommended for browser/WASM
+ - **`.fp16.onnx`** - Half precision (FP16) - For native apps (CoreML, CUDA, WebGPU)
+ - **`.uint8.onnx`** - Quantized (UINT8) - ~4x smaller, for memory-constrained devices

  ## Usage

@@ -41,8 +42,8 @@ Each model is available in two versions:
  import onnxruntime as ort
  import numpy as np

- # Load the model
- session = ort.InferenceSession("kata1-b28c512nbt-adam-s11165M-d5387M.quant.onnx")
+ # Load the model (use .fp32.onnx for browser/WASM, .fp16.onnx for native apps)
+ session = ort.InferenceSession("kata1-b28c512nbt-adam-s11165M-d5387M.fp32.onnx")

  # Prepare inputs (batch_size, channels, height, width)
  bin_input = np.random.randn(1, 22, 19, 19).astype(np.float32)
@@ -62,8 +63,9 @@ policy, value, miscvalue, moremiscvalue, ownership, scoring, futurepos, seki, sc
  ```javascript
  import * as ort from "onnxruntime-web";

+ // Use .fp32.onnx for WASM backend, or .uint8.onnx for smaller download size
  const session = await ort.InferenceSession.create(
- "kata1-b28c512nbt-adam-s11165M-d5387M.quant.onnx"
+ "kata1-b28c512nbt-adam-s11165M-d5387M.fp32.onnx"
  );

  const binInput = new ort.Tensor(
@@ -135,7 +137,8 @@ If you use these models, please cite the original KataGo paper:

  - **Conversion Tool**: [katago-onnx](https://github.com/kaya-go/katago-onnx)
  - **ONNX Opset**: 17
- - **Quantization**: Dynamic quantization (QUInt8 weights)
+ - **FP16 Conversion**: Internal computations in FP16, I/O remains FP32 for compatibility
+ - **UINT8 Quantization**: Dynamic quantization with QUInt8 weights
  - **Dynamic Axes**: Batch size, board height/width are dynamic

  ## Acknowledgments
kata1-b28c512nbt-adam-s11165M-d5387M/kata1-b28c512nbt-adam-s11165M-d5387M.fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:190537d99c0df828be79e0c429f19e57fd91a7cbf51c3be1c7586fd58a93db6f
+ size 146968796
kata1-b28c512nbt-adam-s11165M-d5387M/{kata1-b28c512nbt-adam-s11165M-d5387M.onnx → kata1-b28c512nbt-adam-s11165M-d5387M.fp32.onnx} RENAMED
File without changes
kata1-b28c512nbt-adam-s11165M-d5387M/{kata1-b28c512nbt-adam-s11165M-d5387M.quant.onnx → kata1-b28c512nbt-adam-s11165M-d5387M.uint8.onnx} RENAMED
File without changes
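
The `.fp16.onnx` files added in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, the object hash, and the size in bytes. As a rough illustration of how to read one, here is a minimal Python sketch. The `LfsPointer` class and `parse_lfs_pointer` function are hypothetical names introduced for this example, not part of Git LFS or katago-onnx, and the parser assumes a well-formed pointer with exactly these keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LfsPointer:
    """Hypothetical container for the fields of a Git LFS pointer file."""
    version: str
    oid_algo: str
    oid: str
    size: int


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the `key value` lines of a pointer (assumes version, oid, size)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid line has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return LfsPointer(fields["version"], algo, digest, int(fields["size"]))


# Pointer content copied from the first .fp16.onnx file added in this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:190537d99c0df828be79e0c429f19e57fd91a7cbf51c3be1c7586fd58a93db6f
size 146968796
"""

pointer = parse_lfs_pointer(POINTER)
size_mib = pointer.size / (1024 * 1024)
print(f"{pointer.oid_algo} {pointer.oid[:12]}... {size_mib:.1f} MiB")
```

Both FP16 pointers report the same size, 146968796 bytes, which is about 140 MiB.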
kata1-b28c512nbt-s12043015936-d5616446734/kata1-b28c512nbt-s12043015936-d5616446734.fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8340f1b31fb33c00bd368da4801711c1053a0c414b677797efe0bef68ca8dfbc
+ size 146968796
kata1-b28c512nbt-s12043015936-d5616446734/{kata1-b28c512nbt-s12043015936-d5616446734.onnx → kata1-b28c512nbt-s12043015936-d5616446734.fp32.onnx} RENAMED
File without changes
kata1-b28c512nbt-s12043015936-d5616446734/{kata1-b28c512nbt-s12043015936-d5616446734.quant.onnx → kata1-b28c512nbt-s12043015936-d5616446734.uint8.onnx} RENAMED
File without changes
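
Because this commit renames `.onnx` to `.fp32.onnx` and `.quant.onnx` to `.uint8.onnx`, code that hard-codes the old file names will no longer find the files. One possible way to update such names is sketched below in Python. The `migrate_filename` helper is a hypothetical name introduced for this example, not part of katago-onnx, and it assumes that the only renames are the two suffix changes listed in this commit.

```python
# Suffix renames introduced by this commit (old -> new). The more specific
# ".quant.onnx" is checked first because it also ends with ".onnx".
RENAMES = ((".quant.onnx", ".uint8.onnx"), (".onnx", ".fp32.onnx"))
NEW_SUFFIXES = (".fp32.onnx", ".fp16.onnx", ".uint8.onnx")


def migrate_filename(name: str) -> str:
    """Map a pre-0.2.0 file name to its 0.2.0 name (hypothetical helper).

    Names that already use a 0.2.0 suffix, and names that are not ONNX
    files at all, are returned unchanged.
    """
    if name.endswith(NEW_SUFFIXES):
        return name
    for old, new in RENAMES:
        if name.endswith(old):
            return name[: -len(old)] + new
    return name


model = "kata1-b28c512nbt-adam-s11165M-d5387M"
print(migrate_filename(f"{model}.onnx"))
print(migrate_filename(f"{model}.quant.onnx"))
print(migrate_filename(f"{model}.fp16.onnx"))
```

Checking the new suffixes first makes the helper safe to apply twice: a name that has already been migrated passes through untouched.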