llamaperf

RX 7900 XTX

AMD · 24GB · 3 reports

See what fits on this GPU →
Tone: mixed
throughput:
32.0 t/s pp
quant:
Q6 (gguf)
coding

User is considering adding a second 7900 XTX for 48GB VRAM to run larger models. Currently running Qwen 27B Q6 dense with 32K context at 32 t/s prompt processing. Main use case is coding via opencode.

Tone: positive
throughput:
40.0 t/s gen
quant:
Q4_K_M (gguf)
text-generation

~35-45 tok/s on RX 7900 XTX. Gemma 4 E4B Q4_K_M via Ollama. Best consumer AMD option. Source: gemma4-ai.com AMD GPU guide

throughput:
58.0 t/s gen · 83.0 t/s pp
quant:
FP16 (safetensors)
text-generation

Gemma 4 E4B on RX 7900 XTX via vLLM + ROCm. Default path: 57.96 gen tok/s, 82.96 prompt tok/s. Source: flexinfer.ai