Best eGPU Setup for MacBook Local LLM Inference (2026)
Hardware & PerformanceIntermediate
Key Takeaways
- βApple Silicon MacBooks cannot use eGPUs for compute β Apple removed the feature in macOS Ventura
- βIntel MacBooks (2018β2020) still support eGPU via Thunderbolt 3, but the Mac line-up is discontinued
- βFor macOS: buy Mac Mini M4 Pro (48 GB) instead of trying to extend a MacBook with eGPU
- βFor portable + GPU: AMD mini PC (UM890 Pro) + RTX 3090 via OCuLink β runs Ollama at 60β80 tok/s
- βThunderbolt 4 eGPUs on x86 laptops (Windows/Linux) do work β 35β45% bandwidth penalty vs native PCIe
Why eGPU Doesn't Work for MacBooks in 2026
Apple removed Thunderbolt eGPU support in macOS Ventura (released October 2022). All Apple Silicon MacBooks (M1, M2, M3, M4, M5) run on this or later macOS versions. Even if you physically connect an eGPU enclosure, macOS will not use the external GPU for GPU compute tasks β only the internal GPU is active. External display output via eGPU still works, but LLM inference does not use it.
- βΈ**macOS 13 Ventura (2022)**: eGPU support dropped. All Apple Silicon Macs affected.
- βΈ**macOS 14 Sonoma, 15 Sequoia**: Still no eGPU compute support.
- βΈ**Intel MacBooks (2018β2020)**: eGPU worked via Thunderbolt 3 on older macOS. These Macs are discontinued and will not receive macOS updates past macOS Tahoe.
- βΈ**External display via eGPU**: Still works on older Macs as an output-only device.
What to Do Instead: Real Alternatives
Quick Answers
Is there any way to make an eGPU work with an M4 MacBook Pro for AI?βΎ
Not for GPU compute. Apple's macOS does not expose an API for external GPUs to run Metal compute tasks on M-series hardware. The only path is to connect the MacBook to an Ollama server running on a separate machine (a mini PC or desktop with a dedicated GPU) over the local network. Set OLLAMA_HOST=0.0.0.0 on the server and point your MacBook's apps to that IP address.
Will Apple bring back eGPU support for Apple Silicon?βΎ
Unlikely. Apple's M-series architecture integrates the GPU, CPU, and memory on a single chip β the design philosophy is unified memory, not expandability. Apple has not indicated any plans to restore eGPU compute support. The Mac Pro (2023) with expansion slots is the only Apple product that supports GPU expansion.
Can I use an NVIDIA GPU for inference and pipe the output to my MacBook?βΎ
Yes β this is the recommended approach. Run Ollama on a Windows or Linux machine with an NVIDIA GPU, expose it on your LAN (OLLAMA_HOST=0.0.0.0), and connect from your MacBook via Open WebUI, Cursor, Continue, or any OpenAI-compatible client. The MacBook handles the UI; the NVIDIA machine handles the computation.
Want the full breakdown?
Read the complete guide β