Qwen3-VL-4B-Thinking

Run Qwen3-VL-4B-Thinking optimized for Apple Silicon on MLX with NexaSDK.

Quickstart

  1. Install NexaSDK

  2. Run the model locally with one line of code:

    nexa infer NexaAI/qwen3vl-4B-Thinking-fp16-mlx
    

Model Description

Qwen3-VL-4B-Thinking is a 4-billion-parameter multimodal large language model from the Qwen team at Alibaba Cloud.
Part of the Qwen3-VL (Vision-Language) family, it is designed for advanced visual reasoning and chain-of-thought generation across image, text, and video inputs.

Compared to the Instruct variant, the Thinking model emphasizes deeper multi-step reasoning, analysis, and planning. It produces detailed, structured outputs that reflect intermediate reasoning steps, making it well-suited for research, multimodal understanding, and agentic workflows.

Features

  • Vision-Language Understanding: Processes images, text, and videos for joint reasoning tasks.
  • Structured Thinking Mode: Generates intermediate reasoning traces for better transparency and interpretability.
  • High Accuracy on Visual QA: Performs strongly on visual question answering, chart reasoning, and document analysis benchmarks.
  • Multilingual Support: Understands and responds in multiple languages.
  • Optimized for Efficiency: Delivers strong performance at 4B scale for on-device or edge deployment.

Use Cases

  • Multimodal reasoning and visual question answering
  • Scientific and analytical reasoning tasks involving charts, tables, and documents
  • Step-by-step visual explanation or tutoring
  • Research on interpretability and chain-of-thought modeling
  • Integration into agent systems that require structured reasoning

Inputs and Outputs

Input:

  • Text, images, or combined multimodal prompts (e.g., image + question)

Output:

  • Generated text, reasoning traces, or structured responses
  • May include explicit thought steps or structured JSON reasoning sequences
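Qwen3 thinking-mode models typically wrap their intermediate reasoning in `<think>…</think>` tags ahead of the final answer. A minimal sketch of separating the trace from the answer, assuming that tag convention (the example string is hypothetical, not actual model output):

```python
import re

def split_thinking(output: str) -> tuple[str, str]:
    """Split a thinking-model response into (reasoning trace, final answer).

    Assumes the reasoning is wrapped in <think>...</think>, as Qwen3
    thinking-mode models typically emit. If no closing tag is found,
    the whole output is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", output, re.DOTALL)
    if not match:
        return "", output.strip()
    trace = match.group(1).strip()
    answer = output[match.end():].strip()
    return trace, answer

# Hypothetical example output
raw = "<think>The chart shows a rising trend into Q3.</think>The value peaks in Q3."
trace, answer = split_thinking(raw)
```

Downstream code can then log or display the trace separately while passing only the answer along, which is useful in the agentic workflows described above.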

License

Check the official Qwen license for terms of use and redistribution.
