DiffusionGemma 26B Unleashes Power on M2 Max: MLX Performance Put to the Test

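I had been looking for a way to let agents run wild with unlimited tokens on local hardware. My first attempt, running DiffusionGemma 26B on the M4 Mac with 24GB that I had on hand, failed miserably: it could not sustain even a 1,000-token context and went straight to OOM (out of memory). After moving to an M2 Max with 96GB, could the model finally show what it is capable of? I switched to MLX (mlx-vlm 0.6.3). Along the way I hit an MXFP4 quantization bug and had to patch it by hand, but in the end the full benchmark suite ran successfully in 4-bit format. This post documents the past few days of DiffusionGemma 26B on Apple Silicon…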

AiFeed24 Team·⏱ 1 min read·News

Tags:#cloud

