M5 Max 64GB

APPLE · 64GB unified memory · 3 reports

Gemma 4 26B assistant

M5 Max 64GB · llama.cpp

throughput:: 97.0 t/s gen

coding

The Multi-Token Prediction (MTP) implementation yields a speedup of about 40%: 138 t/s with MTP, against the 97 t/s listed above.

Qwen3.6 27B

M5 Max 64GB · MLX

throughput:: 63.0 t/s gen
quant:: 4-bit (mlx)

coding · creative-writing

The MTPLX engine reaches 63 tok/s on Qwen3.6-27B (4-bit MLX) on an M5 Max 64GB, up from a 28 tok/s baseline, a 2.25x speedup. It uses the model's native MTP heads with sampling at temperature 0.6, top_p 0.95, top_k 20. The best-performing depth was D3. It runs on a custom patched MLX fork with Metal kernels.
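The speedup implied by the two throughput figures in this report can be checked with a few lines of arithmetic. This is only a sketch: the `speedup` helper is a hypothetical name introduced here, and the only inputs are the 28 tok/s and 63 tok/s values quoted in the report.

```python
def speedup(baseline_tps: float, accelerated_tps: float) -> float:
    """Return the throughput ratio accelerated / baseline (hypothetical helper)."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return accelerated_tps / baseline_tps


# Figures quoted in the report: 28 tok/s baseline, 63 tok/s with MTP.
ratio = speedup(28.0, 63.0)
percent_gain = (ratio - 1.0) * 100.0
print(f"{ratio:.2f}x ({percent_gain:.0f}% faster)")  # 2.25x (125% faster)
```

Reporting the ratio alongside the percentage avoids the common ambiguity between "2.25x as fast" and "125% faster", which describe the same result.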

Qwen3.6 27B

M5 Max 64GB

throughput:: 32.0 t/s gen

coding

Qwen 3.6 27B on a MacBook Pro M5 Max 64GB: 32 tokens/sec, 18m04s, 33946 tokens. For comparison, Gemma 4 31B ran at 27 t/s, taking 3m51s for 6209 tokens. Qwen showed more creativity, but Gemma won on game logic.
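As a sanity check, the reported throughput can be recomputed from the token counts and run times. The sketch below assumes the quoted durations are wall-clock generation times, which the report does not state. `parse_duration` and `tokens_per_second` are hypothetical helpers written for this illustration.

```python
def parse_duration(text: str) -> int:
    """Convert a duration such as '18m04s' into whole seconds."""
    minutes, _, rest = text.partition("m")
    return int(minutes) * 60 + int(rest.rstrip("s"))


def tokens_per_second(tokens: int, duration: str) -> float:
    """Average throughput, assuming `duration` covers generation only."""
    return tokens / parse_duration(duration)


# (token count, run time) as quoted in the report.
runs = {
    "Qwen 3.6 27B": (33946, "18m04s"),
    "Gemma 4 31B": (6209, "3m51s"),
}

for model, (tokens, duration) in runs.items():
    print(f"{model}: {tokens_per_second(tokens, duration):.1f} t/s")
# Qwen 3.6 27B: 31.3 t/s
# Gemma 4 31B: 26.9 t/s
```

Both results round to the 32 t/s and 27 t/s figures given in the report, so the three numbers for each model are mutually consistent under that assumption.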