AI & ML interests

Make all hub models available for conversion to ONNX format.

Recent Activity

MaziyarPanahiย 
posted an update 3 days ago
view post
Post
304
Training mRNA Language Models Across 25 Species for $165

We built an end-to-end protein AI pipeline covering structure prediction, sequence design, and codon optimization. After comparing multiple transformer architectures for codon-level language modeling, CodonRoBERTa-large-v2 emerged as the clear winner with a perplexity of 4.10 and a Spearman CAI correlation of 0.40, significantly outperforming ModernBERT. We then scaled to 25 species, trained 4 production models in 55 GPU-hours, and built a species-conditioned system that no other open-source project offers. Complete results, architectural decisions, and runnable code below.

https://huggingface.co/blog/OpenMed/training-mrna-models-25-species
MaziyarPanahiย 
posted an update 10 days ago
view post
Post
2126
We annotated 119K medical images with two frontier VLMs (Qwen 3.5, Kimi K2.5), cross-validated at 93% agreement, and produced 110K training records, all for under $500. Fine-tuning 3 small models (2-3B params) improved all benchmarks: best model reaches +15.0% average exact match.

Everything is open-sourced: datasets, adapters, and code.

https://huggingface.co/blog/OpenMed/synthvision
  • 2 replies
ยท
MaziyarPanahiย 
posted an update about 1 month ago
view post
Post
4760
DNA, mRNA, proteins, AI. I spent the last year going deep into computational biology as an ML engineer. This is Part I of what I found. ๐Ÿงฌ

In 2024, AlphaFold won the Nobel Prize in Chemistry.

By 2026, the open-source community had built alternatives that outperform it.

That's the story I find most interesting about protein AI right now. Not just the science (which is incredible), but the speed at which open-source caught up. Multiple teams, independently, reproduced and then exceeded AlphaFold 3's accuracy with permissive licenses. The field went from prediction to generation: we're not just modeling known proteins anymore, we're designing new ones.

I spent months mapping this landscape for ML engineers. What the architectures actually are (spoiler: transformers and diffusion models), which tools to use for what, and which ones you can actually ship commercially.

New post on the Hugging Face blog: https://huggingface.co/blog/MaziyarPanahi/protein-ai-landscape

Hope you all enjoy! ๐Ÿค—
  • 2 replies
ยท
MaziyarPanahiย 
posted an update about 2 months ago
view post
Post
2331
Announcing: OpenMed Multilingual PII Detection Models

Today I am releasing 105 open-source models for Personally Identifiable Information (PII) detection in French, German, and Italian.

All Apache 2.0 licensed. Free for commercial use. No restrictions.

Performance:

- French: 97.97% F1 (top model)
- German: 97.61% F1 (top model)
- Italian: 97.28% F1 (top model)

All top-10 models per language exceed 96% F1

Coverage:

55+ PII entity types per language
Native ID formats: NSS (French), Sozialversicherungsnummer (German), Codice Fiscale (Italian)
Language-specific address, phone, and name patterns

Training Data:

French: 49,580 samples
German: 42,250 samples
Italian: 40,944 samples

Why Multilingual?

European healthcare operates in European languages. Clinical notes, patient records, and medical documents are generated in French, German, Italian, and other languages.

Effective de-identification requires:

- Native language understanding โ€” not translation
- Local ID format recognition โ€” each country has unique patterns
- Cultural context awareness โ€” names, addresses, and formats vary
- These models deliver production-ready accuracy without requiring data to leave your infrastructure or language.

HIPAA & GDPR Compliance
Built for US and European privacy regulations:

- On-premise deployment: Process data locally with zero external dependencies
- Data sovereignty: No API calls, no cloud services, no cross-border transfers
- Air-gapped capable: Deploy in fully isolated environments if required
- Regulatory-grade accuracy: Supporting Expert Determination standards
- HIPAA and GDPR compliance across languages, without compliance gaps.

Use Cases
- Hospital EHR systems: Automated patient record de-identification
- Clinical research: Multilingual dataset preparation for studies
- Insurance companies: Claims processing across

https://huggingface.co/collections/OpenMed/multilingual-pii-and-de-identification
  • 1 reply
ยท
MaziyarPanahiย 
posted an update about 2 months ago
view post
Post
1286
From Golden Gate Bridge to Broken JSON: Why Anthropic's SAE Steering Fails for Structured Output

I ran 6 experiments trying to use Anthropic's SAE steering for JSON generation.

- Base model: 86.8% valid JSON
- Steering only: 24.4%
- Fine-tuned: 96.6%
- FSM constrained: 100%

Steering is for semantics, not syntax.

https://huggingface.co/blog/MaziyarPanahi/sae-steering-json
MaziyarPanahiย 
posted an update about 2 months ago
view post
Post
4033
๐Ÿšจ Day 8/8: OpenMed Medical Reasoning Dataset Release - THE GRAND FINALE

Today I complete my 8-day release series with Medical-Reasoning-SFT-Mega.
The largest open medical reasoning dataset, combining 7 state-of-the-art AI models with fair distribution deduplication.

THE 7 SOURCE MODELS (Original Sample Counts):

1. Trinity-Mini: 810,284 samples
2. Qwen3-Next-80B: 604,249 samples
3. GPT-OSS-120B: 506,150 samples
4. Nemotron-Nano-30B: 444,544 samples
5. GLM-4.5-Air: 225,179 samples
6. MiniMax-M2.1: 204,773 samples
7. Baichuan-M3-235B: 124,520 samples

TOTAL BEFORE DEDUPLICATION: 2,919,699 samples

TOKEN COUNTS:
- Content tokens: 2.22 Billion
- Reasoning tokens: 1.56 Billion
- Total tokens: 3.78 Billion
- Samples with chain-of-thought: 100%

Quick Start:
from datasets import load_dataset
ds = load_dataset("OpenMed/Medical-Reasoning-SFT-Mega")


All datasets Apache 2.0 licensed. Free for research and commercial use.

Thank you for following OpenMed's release series. I can't wait to see what you build. ๐Ÿ”ฅ

OpenMed/Medical-Reasoning-SFT-Mega
OpenMed/Medical-Reasoning-SFT-GPT-OSS-120B-V2
OpenMed/Medical-Reasoning-SFT-Trinity-Mini
OpenMed/Medical-Reasoning-SFT-GLM_4.5_Air
OpenMed/Medical-Reasoning-SFT-MiniMax-M2.1
OpenMed/Medical-Reasoning-SFT-Qwen3-Next-80B
OpenMed/Medical-Reasoning-SFT-Nemotron-Nano-30B
OpenMed/Medical-Reasoning-SFT-Baichuan-M3-235B



https://huggingface.co/collections/OpenMed/medical-datasets
  • 6 replies
ยท
IlyasMoutawwakilย 
posted an update 2 months ago
view post
Post
3154
Transformers v5 just landed! ๐Ÿš€
It significantly unifies and reduces modeling code across architectures, while opening the door to a whole new class of performance optimizations.

My favorite new feature? ๐Ÿค”
The new dynamic weight loader + converter. Hereโ€™s why ๐Ÿ‘‡

Over the last few months, the core Transformers maintainers built an incredibly fast weight loader, capable of converting tensors on the fly while loading them in parallel threads. This means weโ€™re no longer constrained by how parameters are laid out inside the safetensors weight files.

In practice, this unlocks two big things:
- Much more modular modeling code. You can now clearly see how architectures build on top of each other (DeepSeek v2 โ†’ v3, Qwen v2 โ†’ v3 โ†’ MoE, etc.). This makes shared bottlenecks obvious and lets us optimize the right building blocks once, for all model families.
- Performance optimizations beyond what torch.compile can do alone. torch.compile operates on the computation graph, but it canโ€™t change parameter layouts. With the new loader, we can restructure weights at load time: fusing MoE expert projections, merging attention QKV projections, and enabling more compute-dense kernels that simply werenโ€™t possible before.

Personally, I'm honored to have contributed in this direction, including the work on optimizing MoE implementations and making modeling code more torch-exportable, so these optimizations can be ported cleanly across runtimes.

Overall, Transformers v5 is a strong signal of where the community and industry are converging: Modularity and Performance, without sacrificing Flexibility.

Transformers v5 makes its signature from_pretrained an entrypoint where you can mix and match:
- Parallelism
- Quantization
- Custom kernels
- Flash/Paged attention
- Continuous batching
- ...

Kudos to everyone involved! I highly recommend the:
Release notes: https://github.com/huggingface/transformers/releases/tag/v5.0.0
Blog post: https://huggingface.co/blog/transformers-v5
  • 3 replies
ยท
IlyasMoutawwakilย 
posted an update 2 months ago
view post
Post
2433
After 2 months of refinement, I'm happy to announce that a lot of Transformers' modeling code is now significantly more torch-compile & export-friendly ๐Ÿ”ฅ

Why it had to be done ๐Ÿ‘‡
PyTorch's Dynamo compiler is increasingly becoming the default interoperability layer for ML systems. Anything that relies on torch.export or torch.compile, from model optimization to cross-framework integrations, benefits directly when models can be captured as a single dynamo-traced graph !

Transformers models are now easier to:
โš™๏ธ Compile end-to-end with torch.compile backends
๐Ÿ“ฆ Export reliably via torch.export and torch.onnx.export
๐Ÿš€ Deploy to ONNX / ONNX Runtime, Intel Corporation's OpenVINO, NVIDIA AutoDeploy (TRT-LLM), AMD's Quark, Meta's Executorch and more hardware-specific runtimes.

This work aims at unblocking entire TorchDynamo-based toolchains that rely on exporting Transformers across runtimes and accelerators.

We are doubling down on Transformers commitment to be a first-class citizen of the PyTorch ecosystem, more exportable, more optimizable, and easier to deploy everywhere.

There are definitely some edge-cases that we still haven't addressed so don't hesitate to try compiling / exporting your favorite transformers and to open issues / PRs.

PR in the comments ! More updates coming coming soon !
  • 1 reply
ยท
MaziyarPanahiย 
posted an update 3 months ago
view post
Post
3739
๐ŸŽ‰ OpenMed 2025 Year in Review: 6 Months of Open Medical AI

I'm thrilled to share what the OpenMed community has accomplished since our July 2025 launch!

๐Ÿ“Š The Numbers

29,700,000 downloads Thank you! ๐Ÿ™

- 481 total models (475 medical NER models + 6 fine-tuned LLMs)
- 475 medical NER models in [OpenMed](
OpenMed
) organization
- 6 fine-tuned LLMs in [openmed-community](
openmed-community
)
- 551,800 PyPI downloads of the [openmed package](https://pypi.org/project/openmed/)
- 707 followers on HuggingFace (you!)
- 97 GitHub stars on the [toolkit repo](https://github.com/maziyarpanahi/openmed)

๐Ÿ† Top Models by Downloads

1. [OpenMed-NER-PharmaDetect-SuperClinical-434M]( OpenMed/OpenMed-NER-PharmaDetect-SuperClinical-434M) โ€” 147,305 downloads
2. [OpenMed-NER-ChemicalDetect-ElectraMed-33M]( OpenMed/OpenMed-NER-ChemicalDetect-ElectraMed-33M) โ€” 126,785 downloads
3. [OpenMed-NER-BloodCancerDetect-TinyMed-65M]( OpenMed/OpenMed-NER-BloodCancerDetect-TinyMed-65M) โ€” 126,465 downloads

๐Ÿ”ฌ Model Categories

Our 481 models cover comprehensive medical domains:

- Disease Detection (~50 variants)
- Pharmaceutical Detection (~50 variants)
- Oncology Detection (~50 variants)
- Genomics/DNA Detection (~80 variants)
- Chemical Detection (~50 variants)
- Species/Organism Detection (~60 variants)
- Protein Detection (~50 variants)
- Pathology Detection (~50 variants)
- Blood Cancer Detection (~30 variants)
- Anatomy Detection (~40 variants)
- Zero-Shot NER (GLiNER-based)


OpenMed

OpenMed NER: Open-Source, Domain-Adapted State-of-the-Art Transformers for Biomedical NER Across 12 Public Datasets (2508.01630)
https://huggingface.co/collections/OpenMed/medical-and-clinical-ner
https://huggingface.co/collections/OpenMed/zeroshot-medical-and-clinical-ner
OpenMed/Medical-Reasoning-SFT-GPT-OSS-120B
  • 1 reply
ยท