ยท 3 days agoยท Dev.to
AMD GPU outperforms Mac on Llama 8B, falters with Phi-3.
I wrote a post yesterday about why GPUs barely help small text embeddings at batch=1. Different workload, same machines. This time I ran a local LLM inference benchmark across the same three boxes. The result complicated my hardware mental model in a way I think is worth sharing. Three machines. A M
#amd-gpu#llm-inference#cloud-computing#machine-learning