RTX 4070 Ti Super

Name: RTX 4070 Ti Super local LLM performance reports
Creator: llamaperf
License: https://creativecommons.org/licenses/by/4.0/

NVIDIA · 16GB · 1 report

Qwen3.6 35B (3B active)

RTX 4070 Ti Super · ik_llama.cpp · 131,072 ctx

throughput:: 110.2 t/s gen
quant:: IQ4_XS-4.19bpw (gguf)
kv:: Q8
mtp (multi-token prediction):: on

coding · summarization · math

Benchmark of llama.cpp (89.76 t/s) against ik_llama.cpp (110.24 t/s) with MTP, using the Qwen3.6-35B-A3B IQ4_XS quant. ik_llama.cpp generates about 23% faster. CPU: Ryzen 7 9700X. OS: CachyOS. The RTX 4070 Ti Super ran as a secondary GPU, with the iGPU driving the display.
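The speedup figure can be checked from the two reported throughputs. The sketch below is only an arithmetic check, not part of the original report. The function name `speedup_percent` is a hypothetical helper, and the inputs are the t/s values quoted above.

```python
def speedup_percent(baseline_tps: float, candidate_tps: float) -> float:
    """Relative throughput gain of candidate over baseline, in percent."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return (candidate_tps / baseline_tps - 1.0) * 100.0


# Generation throughput values taken from the report (tokens per second).
llama_cpp_tps = 89.76
ik_llama_cpp_tps = 110.24

gain = speedup_percent(llama_cpp_tps, ik_llama_cpp_tps)
print(f"ik_llama.cpp is {gain:.1f}% faster than llama.cpp")
```

This gives roughly 22.8%, which rounds to the 23% stated in the report.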