Commit 0c0575b (verified) by dleemiller · 1 parent: 1329b4b

Update README.md

Files changed (1)
  1. README.md +9 -11
README.md CHANGED
@@ -42,17 +42,15 @@ remaining layers with a new classification head.
  # NLI Evaluation Results
 
  F1-Micro scores (equivalent to accuracy) for each dataset.
-
- | Model | finecat | mnli | mnli_mismatched | snli | anli_r1 | anli_r2 | anli_r3 | wanli | lingnli |
- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
- | `dleemiller/finecat-nli-l` | **0.8152** | **0.9088** | <u>0.9217</u> | <u>0.9259</u> | **0.7400** | **0.5230** | **0.5150** | **0.7424** | **0.8689** |
- | `tasksource/ModernBERT-large-nli` | 0.7959 | 0.8983 | **0.9229** | 0.9188 | <u>0.7260</u> | <u>0.5110</u> | </u>0.4925</u> | <u>0.6978</u> | 0.8504 |
- | `dleemiller/ModernCE-large-nli` | 0.7811 | **0.9088** | 0.9205 | **0.9273** | 0.6630 | 0.4860 | 0.4408 | 0.6576 | <u>0.8566</u> |
- | `tasksource/ModernBERT-base-nli` | 0.7595 | 0.8685 | 0.8979 | 0.8915 | 0.6300 | 0.4820 | 0.4192 | 0.6632 | 0.8118 |
- | `dleemiller/ModernCE-base-nli` | 0.7533 | 0.8923 | 0.9035 | 0.9187 | 0.5240 | 0.3950 | 0.3333 | 0.6464 | 0.8282 |
- | `dleemiller/EttinX-nli-s` | 0.7251 | 0.8765 | 0.8798 | 0.9128 | 0.3360 | 0.2790 | 0.3083 | 0.6234 | 0.8012 |
- | `dleemiller/EttinX-nli-xs` | 0.7013 | 0.8376 | 0.8380 | 0.8979 | 0.2780 | 0.2840 | 0.2800 | 0.5838 | 0.7521 |
- | `dleemiller/EttinX-nli-xxs` | 0.6842 | 0.7988 | 0.8047 | 0.8851 | 0.2590 | 0.3060 | 0.2992 | 0.5426 | 0.7018 |
+ Performance was measured at bs=32 using a Nvidia Blackwell PRO 6000 Max-Q.
+
+ | Model | finecat | mnli | mnli_mismatched | snli | anli_r1 | anli_r2 | anli_r3 | wanli | lingnli | Throughput (samples/s) | Peak GPU Mem (MB) |
+ | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+ | `MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli` | **0.8233** | <u>0.9121</u> | 0.9079 | 0.8898 | **0.7960** | **0.6830** | **0.6400** | <u>0.7700</u> | **0.8821** | 454.96 | 3250.44 |
+ | `dleemiller/finecat-nli-l` | <u>0.8227</u> | **0.9152** | **0.9265** | 0.9162 | <u>0.7480</u> | <u>0.5700</u> | <u>0.5433</u> | **0.7706** | <u>0.8742</u> | 539.04 | 1838.06 |
+ | `tasksource/ModernBERT-large-nli` | 0.7959 | 0.8983 | <u>0.9229</u> | 0.9188 | 0.7260 | 0.5110 | 0.4925 | 0.6978 | 0.8504 | 543.44 | 1838.06 |
+ | `dleemiller/ModernCE-large-nli` | 0.7811 | 0.9088 | 0.9205 | **0.9273** | 0.6630 | 0.4860 | 0.4408 | 0.6576 | 0.8566 | 540.74 | 1838.06 |
+ | `cross-encoder/nli-deberta-v3-large` | 0.7618 | 0.9019 | 0.9049 | <u>0.9220</u> | 0.5300 | 0.4170 | 0.3758 | 0.6548 | 0.8466 | 448.35 | 3250.44 |
 
 
  ---
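
Note on the "F1-Micro (equivalent to accuracy)" wording in the README: for single-label classification such as NLI, where each example gets exactly one gold label and one predicted label, micro-averaged F1 reduces to plain accuracy. The sketch below illustrates this with standard-library Python only. The function names and the toy labels are made up for illustration; this is not the evaluation code used to produce the table.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all labels, then compute F1 once."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1  # predicted this label and it was correct
            elif p == label and t != label:
                fp += 1  # predicted this label but the gold label differs
            elif p != label and t == label:
                fn += 1  # gold is this label but something else was predicted
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (hypothetical): three NLI classes, five examples, three correct.
y_true = ["entailment", "neutral", "contradiction", "neutral", "entailment"]
y_pred = ["entailment", "contradiction", "contradiction", "neutral", "neutral"]

print(accuracy(y_true, y_pred), micro_f1(y_true, y_pred))
```

Every misclassified example counts once as a false positive (for the predicted label) and once as a false negative (for the gold label), so pooled FP equals pooled FN and the F1 formula collapses to TP divided by the number of examples. The equivalence does not hold for multi-label tasks or when some classes are excluded from the average.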