llamaperf

M5 Max 64GB

APPLE · 64GB unified memory · 3 reports

See what fits on this GPU →
Tone: positive
throughput:
63.0 t/s gen
quant:
4-bit (mlx)
codingcreative-writing

MTPLX engine achieves 63 tok/s on Qwen3.6-27B 4-bit MLX on M5 Max 64GB, up from 28 tok/s baseline. Uses native MTP heads with temperature 0.6, top_p 0.95, top_k 20. Optimal depth D3. Custom patched MLX fork with Metal kernels.

Tone: mixed
throughput:
32.0 t/s gen
coding

Qwen 3.6 27B on MacBook Pro M5 Max 64GB: 32 tokens/sec, 18m04s, 33946 tokens. Compared to Gemma 4 31B (27 t/s, 3m51s, 6209 tokens). Qwen showed more creativity but Gemma won for game logic.