AMD MI50 32GB

AMD · 32GB · 2 reports

See what fits on this GPU →

This page is thin (2 of 3 reports needed for indexing). Help fill it in.

Latest Most reported Fastest t/s

Qwen3.6 27B

2× AMD MI50 32GB · llama.cpp

quant:: Q8_0 (gguf)
kv:: Q8
mtp (multi-token prediction):: on

Benchmark comparing MTP KV cache quantization (Q8_0) vs no quantization on Qwen3.6-27B-Q8_0 with llama.cpp. Results show negligible wall time difference (~0.14s) with quantized draft KV cache. Also tested with tensor parallelism on 2xMI50 32GB. Aggregate accept rate ~0.735-0.741.

Kimi K2.6

32× AMD MI50 32GB

throughput:: 9.7 t/s gen · 264.0 t/s pp

32x AMD MI50 32GB GPUs. Prompt processing 264 t/s, generation 9.7 t/s. Model: Kimi K2.6.