Qwen3.5

Name: Qwen3.5 local performance reports
Creator: llamaperf
License: https://creativecommons.org/licenses/by/4.0/

Alibaba · 1 report

Thin page (1 of 3 reports needed for indexing).

Qwen3.5 4B NuExtract3

8× H100 80GB

vision · summarization

Model based on Qwen3.5-4B, trained on 8× H100 for 3 days. Weights are available in Safetensors, GGUF, and MLX formats, with multiple quantizations (GPTQ, W8A8, FP8, Q4, Q6). Requires as little as 4 GB of VRAM. Tested with vLLM, SGLang, and llama.cpp.
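To put the 4 GB VRAM figure in context, here is a rough back-of-the-envelope sketch of how much memory the weights alone take at a few nominal bit widths. It is an estimate, not a measurement from the report. It assumes that "4B" means exactly 4 billion parameters and that each format stores its nominal number of bits per weight. Real quantized files carry extra scale and metadata overhead, and inference also needs memory for the KV cache and activations. The function name and the bit-width table are hypothetical and used only for illustration.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: "4B" is taken as exactly 4 billion parameters.
NOMINAL_PARAMS = 4e9

# Assumption: nominal bit widths only; real formats add per-block overhead.
NOMINAL_BITS = {
    "FP16/BF16": 16,
    "FP8 / W8A8": 8,
    "Q6": 6,
    "Q4": 4,
}

estimates = {
    name: weight_memory_gb(NOMINAL_PARAMS, bits)
    for name, bits in NOMINAL_BITS.items()
}

for name, gb in estimates.items():
    print(f"{name}: ~{gb:.1f} GB for weights")
```

Under these assumptions the unquantized 16-bit weights come to about 8 GB, while the 4-bit variant comes to about 2 GB, which would leave headroom on a 4 GB card for the cache and activations.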