llamaperf

RX 9070

AMD · 16GB · 1 report


Qwen3.6 27B

RX 9070 · llama.cpp · 131,072 ctx

Tone: positive
throughput:
46.9 t/s gen · 398.4 t/s pp
quant:
UD-Q5_K_XL (gguf)
flash attention:
on
mtp (multi-token prediction):
on
coding · agentic

The user runs two RX 9070 XTs under ROCm with MTP enabled (spec-type = draft-mtp, spec-draft-n-max = 2). Prompt-processing speed varies; generation runs at roughly 45-52 t/s, with a draft acceptance rate of about 0.8-0.99. The user praises the model's speed, intelligence, and steerability for agentic coding tasks. The quant is UD-Q5_K_XL (Unsloth GGUF).
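To put the reported acceptance rate in context, here is a rough sketch of how many tokens each verification pass might yield with spec-draft-n-max = 2. It assumes every drafted token is accepted independently with the same probability, which is a simplification, and it assumes the reported figure is a per-token acceptance rate; the report does not say how the rate was measured. The function name `expected_tokens_per_step` is hypothetical and not part of llama.cpp.

```python
def expected_tokens_per_step(accept_rate: float, draft_n_max: int) -> float:
    """Expected tokens emitted per verification pass.

    Assumption (not from the report): each drafted token is accepted
    independently with probability `accept_rate`, and drafting stops at
    the first rejection.
    """
    if not 0.0 <= accept_rate <= 1.0:
        raise ValueError("accept_rate must be between 0 and 1")
    if draft_n_max < 0:
        raise ValueError("draft_n_max must be non-negative")
    # k = 0 is the token the main model always produces on its own pass;
    # the k-th drafted token counts only if all k drafts so far were accepted.
    return sum(accept_rate ** k for k in range(draft_n_max + 1))


if __name__ == "__main__":
    # The two ends of the acceptance range quoted in the report.
    for rate in (0.8, 0.99):
        tokens = expected_tokens_per_step(rate, draft_n_max=2)
        print(f"acceptance {rate}: about {tokens:.2f} tokens per pass")
```

Under these assumptions, two drafted tokens give between about 2.44 and 2.97 tokens per pass across the reported range. This is not a throughput prediction, since drafting and verification have their own cost.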