VRAM Calculator

Pick your GPU to see which open-weight models fit in VRAM, at which quantization, and roughly how fast they run. Tokens/sec figures come from community-reported benchmarks where available.
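As a rough illustration of the arithmetic behind a "fits in VRAM" check, here is a minimal sketch. The page does not state the formula it uses, so everything below is an assumption: the bytes-per-parameter table, the function names, and the idea of adding KV cache and runtime overhead as separate terms are all hypothetical.

```python
# Hypothetical bytes-per-parameter values. Real quantized formats also store
# scales and other metadata, so these are approximations only.
BYTES_PER_PARAM = {"FP16": 2.0, "INT8": 1.0, "INT4": 0.5}


def weight_memory_gb(params_billion: float, precision: str = "FP16") -> float:
    """Memory for the weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[precision]


def estimate_vram_gb(
    params_billion: float,
    precision: str = "FP16",
    kv_cache_gb: float = 0.0,
    overhead_gb: float = 0.0,
) -> float:
    """Weights plus caller-supplied KV cache and runtime overhead (assumed terms)."""
    return weight_memory_gb(params_billion, precision) + kv_cache_gb + overhead_gb


if __name__ == "__main__":
    # A 27B-parameter model at FP16 needs 27 * 2 = 54 GB for weights alone.
    print(weight_memory_gb(27, "FP16"))
    # The same model at a nominal 4 bits per weight.
    print(weight_memory_gb(27, "INT4"))
```

The 61.0 GB listed on this page for Gemma 3 (27B, FP16) is about 7 GB above the 54 GB weights-only figure. That gap is presumably context and runtime overhead, but the page does not confirm how it is computed.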

Inputs: GPU, System RAM, Context length, Quality

3 fit fully · 1 with offload · 0 won't run

All 4 reports for this GPU →

Status    Model     Parameters   Developer         Precision   VRAM      Tokens/sec
Fits      Gemma 3   27B          Google            FP16        61.0 GB   n/a
Fits      Qwen3.5   27B          Alibaba           FP16        60.5 GB   n/a
Fits      Qwen3.6   3B / 35B     Alibaba           FP16        11.9 GB   n/a
Offload   Gemma 4   30.7B        Google DeepMind   FP16        68.7 GB   n/a
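The Fits / Offload / won't-run labels can be sketched as a simple threshold check. This is an assumed rule, not the calculator's documented logic: a model "fits" if its requirement is within VRAM, is "offload" if VRAM plus system RAM covers it, and otherwise won't run. The page does not show the selected GPU or RAM values, so the 64 GB of VRAM and 32 GB of system RAM below are hypothetical; the per-model requirements are the figures listed above.

```python
def classify(required_gb: float, vram_gb: float, system_ram_gb: float) -> str:
    """Assumed rule: compare the requirement against VRAM, then VRAM + system RAM."""
    if required_gb <= vram_gb:
        return "Fits"
    if required_gb <= vram_gb + system_ram_gb:
        return "Offload"
    return "Won't run"


# VRAM requirements as listed on this page (GB).
LISTED_REQUIREMENTS_GB = {
    "Gemma 3": 61.0,
    "Qwen3.5": 60.5,
    "Qwen3.6": 11.9,
    "Gemma 4": 68.7,
}

# Hypothetical hardware; the page does not state the selected values.
VRAM_GB = 64.0
SYSTEM_RAM_GB = 32.0

summary = {"Fits": 0, "Offload": 0, "Won't run": 0}
for model, required in LISTED_REQUIREMENTS_GB.items():
    status = classify(required, VRAM_GB, SYSTEM_RAM_GB)
    summary[status] += 1
    print(f"{model}: {status}")

print(summary)
```

With these assumed hardware values the tally matches the page's summary of 3 fitting fully, 1 with offload, and 0 that won't run. A real tool would likely also reserve some system RAM for the OS and account for the slowdown that offloading causes.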