Thank you for those codes, @sergiopaniego! Small nitpick: we did release the BF16, no need for Unsloth here :)
https://huggingface.co/mistralai/Ministral-3-3B-Instruct-2512-BF16
Joffrey THOMAS (Jofthomas) · PRO
AI & ML interests: None yet
Recent Activity
updated a dataset about 5 hours ago: agents-course/unit4-students-scores
replied to sergiopaniego's post 3 days ago
liked a model 3 days ago: mistralai/Ministral-3-3B-Instruct-2512-BF16
replied to sergiopaniego's post 3 days ago
reacted to sergiopaniego's post with 🔥 3 days ago
NEW: @mistralai released a fantastic family of multimodal models, Ministral 3.
You can fine-tune them for free on Colab using TRL ⚡️, supporting both SFT and GRPO
Link to the notebooks:
- SFT: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_ministral3_vl.ipynb
- GRPO: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_ministral3_vl.ipynb
- TRL and more examples: https://huggingface.co/docs/trl/index
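As a taste of what the notebooks cover, here is a minimal SFT sketch with TRL. The dataset and hyperparameters are illustrative assumptions, and the actual notebooks use the vision-language setup; treat this as a starting point, not the notebook itself.

```python
# Minimal SFT sketch with TRL (not the notebook itself): the dataset and
# hyperparameters below are illustrative assumptions.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Example conversational dataset from the TRL docs; swap in your own.
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="mistralai/Ministral-3-3B-Instruct-2512-BF16",  # repo id from the post above
    train_dataset=dataset,
    args=SFTConfig(output_dir="ministral3-sft", per_device_train_batch_size=1),
)
trainer.train()
```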
posted an update 5 days ago
The new Mistral 3 models are here!
Today, we announce Mistral 3, the next generation of Mistral models. Mistral 3 includes three state-of-the-art small, dense models (14B, 8B, and 3B) and Mistral Large 3 – our most capable model to date – a sparse mixture-of-experts trained with 41B active and 675B total parameters.
All models are released under the Apache 2.0 license.
Ministrals:
https://huggingface.co/collections/mistralai/ministral-3
Mistral Large 3:
https://huggingface.co/collections/mistralai/mistral-large-3
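A hedged sketch of trying one of the new checkpoints with transformers is below. The repo id and the plain text-generation pipeline are assumptions: the collections linked above contain several checkpoints, and these models are multimodal, so your task and repo id may differ.

```python
# Hedged sketch: the exact repo id and pipeline task are assumptions
# based on the links above.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="mistralai/Ministral-3-3B-Instruct-2512-BF16",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
out = pipe("Summarize the Mistral 3 release in one sentence.", max_new_tokens=60)
print(out[0]["generated_text"])
```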
posted an update 7 months ago
Meet our new agentic model: 𝗗𝗲𝘃𝘀𝘁𝗿𝗮𝗹
Devstral is an open-source LLM built for software engineering tasks, developed in collaboration between Mistral AI and All Hands AI 🙌.
𝗞𝗲𝘆 𝗳𝗲𝗮𝘁𝘂𝗿𝗲𝘀:
• 🤖 𝗔𝗴𝗲𝗻𝘁𝘀: perfect for agentic coding
• 🍃 𝗹𝗶𝗴𝗵𝘁𝘄𝗲𝗶𝗴𝗵𝘁: Devstral is a 𝟮𝟰𝗕-parameter model based on Mistral Small.
• ©️ 𝗔𝗽𝗮𝗰𝗵𝗲 𝟮.𝟬, meaning fully open-source!
• 📄 A 𝟭𝟮𝟴𝗸 context window.
📚 Blog: https://mistral.ai/news/devstral
⚡ API: The model is also available on our API under the name 𝗱𝗲𝘃𝘀𝘁𝗿𝗮𝗹-𝘀𝗺𝗮𝗹𝗹-𝟮𝟱𝟬𝟱
🤗 Repo: mistralai/Devstral-Small-2505
Can't wait to see what you will build with it!
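For the API route, a hedged sketch using the official mistralai Python SDK is below; the client interface may differ across SDK versions, and only the model name comes from the post above.

```python
# Hedged sketch of calling Devstral through the Mistral API with the
# official `mistralai` Python SDK (v1-style interface assumed).
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])
resp = client.chat.complete(
    model="devstral-small-2505",  # API name from the post above
    messages=[{"role": "user", "content": "Write a Python function that reverses a string."}],
)
print(resp.choices[0].message.content)
```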
reacted to burtenshaw's post with 🤗🚀🔥 11 months ago
We’re launching a FREE and CERTIFIED course on Agents!
We're thrilled to announce the launch of the Hugging Face Agents course on Learn! This interactive, certified course will guide you through building and deploying your own AI agents.
Here's what you'll learn:
- Understanding Agents: We'll break down the fundamentals of AI agents, showing you how they use LLMs to perceive their environment (observations), reason about it (thoughts), and take actions. Think of a smart assistant that can book appointments, answer emails, or even write code based on your instructions.
- Building with Frameworks: You'll dive into popular agent frameworks like LangChain, LlamaIndex and smolagents. These tools provide the building blocks for creating complex agent behaviors.
- Real-World Applications: See how agents are used in practice, from automating SQL queries to generating code and summarizing complex documents.
- Certification: Earn a certification by completing the course modules, implementing a use case, and passing a benchmark assessment. This proves your skills in building and deploying AI agents.
Audience
This course is designed for anyone interested in the future of AI. Whether you're a developer, data scientist, or simply curious about AI, this course will equip you with the knowledge and skills to build your own intelligent agents.
Enroll today and start building the next generation of AI agent applications!
https://bit.ly/hf-learn-agents
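To give a flavor of what the course builds toward, here is a tiny agent sketch using smolagents, one of the frameworks listed above. The class names reflect a recent smolagents version and may differ in yours; this is a minimal illustration, not course material.

```python
# A tiny agent in the spirit of the course, using smolagents.
from smolagents import CodeAgent, InferenceClientModel

# A CodeAgent reasons in code: it writes and executes Python to answer.
agent = CodeAgent(tools=[], model=InferenceClientModel())
agent.run("If I save 150 euros per month, how much do I save in 3 years?")
```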
reacted to m-ric's post with 🔥 about 1 year ago
Transformers v4.45.0 released: includes a lightning-fast method to build tools! ⚡️
During user research with colleagues @MoritzLaurer and @Jofthomas, we discovered that the class definition currently used to define a Tool in transformers.agents is a bit tedious to use, because it goes into great detail.
➡️ So I’ve made an easier way to build tools: just make a function with type hints + a docstring, and add a @tool decorator in front.
✅ Voilà, you’re good to go!
Read all about it in the new doc here: https://huggingface.co/docs/transformers/main/en/agents#create-a-new-tool
And don’t hesitate to give feedback, I’m all ears! 🤗
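A hedged sketch of the pattern described: a plain function with type hints and a docstring, wrapped in the decorator. The weather function itself is a made-up example, not from the post or the docs.

```python
from transformers import tool  # decorator available in transformers v4.45+

@tool
def get_weather(city: str) -> str:
    """Returns a (hard-coded) weather report for a city.

    Args:
        city: Name of the city to look up.
    """
    return f"The weather in {city} is sunny."  # placeholder logic for illustration
```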
replied to their post over 1 year ago
If you liked this Space, you can vote for this project in the Gemini API contest now: https://ai.google.dev/competition/projects/everchanging-quest
reacted to m-ric's post with 🔥 over 1 year ago
🎮 𝗔 𝗻𝗲𝘂𝗿𝗮𝗹 𝗻𝗲𝘁𝘄𝗼𝗿𝗸 𝘀𝗶𝗺𝘂𝗹𝗮𝘁𝗲𝘀 𝗗𝗢𝗢𝗠: 𝗚𝗼𝗼𝗴𝗹𝗲 𝗿𝗲𝘀𝗲𝗮𝗿𝗰𝗵𝗲𝗿𝘀 𝗼𝗽𝗲𝗻 𝘁𝗵𝗲 𝘄𝗮𝘆 𝗳𝗼𝗿 𝗰𝗼𝗺𝗽𝗹𝗲𝘁𝗲𝗹𝘆-𝗔𝗜-𝗴𝗲𝗻𝗲𝗿𝗮𝘁𝗲𝗱 𝗴𝗮𝗺𝗲𝘀!
Imagine if games were completely live-generated by an AI model: the NPCs and their dialogues, the storyline, and even the game environment. The player’s in-game actions would have a real, lasting impact on the game story.
In a very exciting paper, Google researchers just gave us the first credible glimpse of this future.
➡️ They created GameNGen, the first neural model that can simulate a complex 3D game in real time. They use it to simulate the classic game DOOM running at over 20 frames per second on a single TPU, with image quality comparable to lossy JPEG compression. And it feels just like the real game!
Here's how they did it:
1. They trained an RL agent to play DOOM and recorded its gameplay sessions.
2. They then used these recordings to train a diffusion model to predict the next frame, based on past frames and player actions.
3. During inference, they use only 4 denoising steps (instead of the usual dozens) to generate each frame quickly.
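To make step 2 concrete, here is a heavily simplified PyTorch sketch of the idea: a denoiser conditioned on past frames and the player's action. The toy model, shapes, and noise handling are all placeholders for illustration, not the paper's architecture.

```python
# Toy sketch of next-frame prediction conditioned on past frames + action.
import torch
import torch.nn as nn

class NextFramePredictor(nn.Module):
    def __init__(self, n_past=4, n_actions=8):
        super().__init__()
        # Embed the discrete action as one extra 64x64 conditioning channel.
        self.action_emb = nn.Embedding(n_actions, 64 * 64)
        # Toy "denoiser": consumes noisy target + past frames + action channel.
        self.denoise = nn.Conv2d(3 * (n_past + 1) + 1, 3, kernel_size=3, padding=1)

    def forward(self, noisy_frame, past_frames, action):
        a = self.action_emb(action).view(-1, 1, 64, 64)
        x = torch.cat([noisy_frame, past_frames.flatten(1, 2), a], dim=1)
        return self.denoise(x)  # predicted clean next frame

model = NextFramePredictor()
past = torch.rand(2, 4, 3, 64, 64)   # batch of 4 past RGB frames
target = torch.rand(2, 3, 64, 64)    # true next frame
noisy = target + 0.5 * torch.randn_like(target)  # noise-augmentation flavor
pred = model(noisy, past, torch.tensor([1, 3]))
loss = nn.functional.mse_loss(pred, target)
loss.backward()
```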
𝗞𝗲𝘆 𝗶𝗻𝘀𝗶𝗴𝗵𝘁𝘀:
🎮🤔 Human players can barely tell the difference between short clips (3 seconds) of the real game or the simulation
🧠 The model maintains game state (health, ammo, etc.) over long periods despite having only 3 seconds of effective context length
🔄 They use "noise augmentation" during training to prevent quality degradation in long play sessions
🚀 The game runs on one TPU at 20 FPS with 4 denoising steps, or 50 FPS with model distillation (with some quality loss)
The researchers did not open source the code, but I feel like we’ve just seen a part of the future being written!
Their paper (exploding the upvote counter) 👉 Diffusion Models Are Real-Time Game Engines (2408.14837)
In a similar vein, play @Jofthomas 's 'Everchanging Quest' 🎮 Jofthomas/Everchanging-Quest
posted an update over 1 year ago
Everchanging Quest is out!
It is an LLM-controlled roguelike in which the LLM gets a markdown representation of the map and should generate a JSON with the objective to fulfill on the map, as well as the necessary objects and their placements.
Come test it on the Space:
Jofthomas/Everchanging-Quest
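For illustration, the JSON the LLM is asked to produce might look roughly like the sketch below; the field names are hypothetical guesses, not the Space's actual schema.

```python
# Hypothetical shape of the LLM's JSON output (objective + object placements);
# field names are guesses for illustration, not the Space's actual schema.
quest = {
    "objective": "Retrieve the amulet from the flooded crypt",
    "objects": [
        {"name": "amulet", "x": 12, "y": 4},
        {"name": "locked_chest", "x": 3, "y": 9},
    ],
}
```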
Thanks @anakin87, awesome notebook and just what I needed!
reacted to anakin87's post with 🔥👀👍 over 1 year ago
⚙️ Prompt Optimization with Haystack and DSPy
Experimental notebook: 🧪📓 https://github.com/deepset-ai/haystack-cookbook/blob/main/notebooks/prompt_optimization_with_dspy.ipynb
When building applications with LLMs, writing effective prompts is a long process of trial and error. 🔄
Often, if you switch models, you also have to change the prompt. 😩
What if you could automate this process?
💡 That's where DSPy comes in - a framework designed to algorithmically optimize prompts for Language Models.
By applying classical machine learning concepts (training and evaluation data, metrics, optimization), DSPy generates better prompts for a given model and task.
Recently, I explored combining DSPy with the robustness of Haystack Pipelines.
Here's how it works:
▶️ Start from a Haystack RAG pipeline with a basic prompt
🎯 Define a goal (in this case, get correct and concise answers)
📊 Create a DSPy program, define data and metrics
✨ Optimize and evaluate -> improved prompt
🚀 Build a refined Haystack RAG pipeline using the optimized prompt
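Here is a hedged sketch of the DSPy half of that workflow; the model id, metric, and tiny trainset are illustrative, not the notebook's actual setup.

```python
# Hedged sketch: define a program, a metric, and optimize the prompt.
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # any supported LM id; assumption

qa = dspy.Predict("question -> answer")

def concise_and_correct(example, pred, trace=None):
    # Toy metric matching the stated goal: correct and concise answers.
    return example.answer.lower() in pred.answer.lower() and len(pred.answer) < 120

trainset = [
    dspy.Example(question="What is the capital of France?", answer="Paris").with_inputs("question"),
]

optimizer = dspy.BootstrapFewShot(metric=concise_and_correct)
optimized_qa = optimizer.compile(qa, trainset=trainset)
```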
reacted to bwang0911's post with ❤️🚀 over 1 year ago
We are very proud to introduce jinaai/jina-clip-v1, aka "jina-embeddings-multimodal".
The OpenAI CLIP model openai/clip-vit-base-patch32 aligns text and image modalities nicely, so users can perform cross-modal text-image retrieval or image classification on top of it. However, due to its training data and recipe, it cannot:
1. model longer text inputs (77-token constraint).
2. align text representations well (the CLIP text tower is weak for text search).
In our latest publication, Jina CLIP: Your CLIP Model Is Also Your Text Retriever (2405.20204), we proposed a multi-task, multi-objective learning scheme. The resulting CLIP model shows:
1. Stronger cross-modal performance than the OpenAI model: 2% and 6% improvements on cross-modal retrieval recall@5.
2. The text tower of JinaCLIP is a strong text encoder, reaching the same performance as jinaai/jina-embeddings-v2-base-en: a 165% improvement on MTEB[BEIR] recall@5.
3. The image tower of JinaCLIP also shows strong performance in image-to-image search (CBIR): a 12% recall improvement on the CIFAR-100 test set.
If you are working on MuRAG (multimodal retrieval-augmented generation), try it out!
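A hedged usage sketch is below, based on the model card's custom-code path, which exposes encode_text / encode_image helpers; the image path is a placeholder.

```python
# Hedged usage sketch via the model's custom code (trust_remote_code).
from transformers import AutoModel

model = AutoModel.from_pretrained("jinaai/jina-clip-v1", trust_remote_code=True)

text_embeddings = model.encode_text(["A photo of a ginger cat"])
image_embeddings = model.encode_image(["cat.jpg"])  # local path or URL; placeholder
```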
reacted to radames's post with 🔥 over 1 year ago
AI-town now runs on Hugging Face Spaces with our API for LLMs and embeddings, including the open-source Convex backend, all in one container. Easy to duplicate and configure on your own.
Demo: radames/ai-town
Instructions: https://github.com/radames/ai-town-huggingface
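For the "duplicate" part, one hedged option is huggingface_hub's duplicate_space helper; configuration specifics live in the README linked above, and the private flag here is just an example choice.

```python
# Hedged sketch: duplicate the Space programmatically into your namespace.
from huggingface_hub import duplicate_space

repo = duplicate_space("radames/ai-town", private=True)
print(repo)  # URL of your new Space
```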