Key Takeaways
- Cloud GPU costs $0.35–2.50/hr depending on GPU tier and provider
- Local RTX 4090 workstation totals ~$3,200 upfront (GPU + system)
- Break-even: 1,800 hours cumulative use at $0.50 avg cloud rate = local wins
- Mac Mini M4 Pro 48GB: $2,000 upfront, breaks even at ~1,200 cloud hours
- Electricity adds $0.03–0.08/hr to local operating costs
- Cloud wins for spiky, occasional, or experimental workloads
- Local wins for sustained daily inference, privacy-sensitive use, or fine-tuning
| GPU | VRAM | Provider | Spot $/hr | On-Demand $/hr |
|---|---|---|---|---|
| RTX 4090 | 24 GB | RunPod | $0.28–0.44 | $0.74 |
| RTX 4090 | 24 GB | Vast.ai | $0.32–0.48 | $0.89 |
| A40 | 48 GB | RunPod | $0.44–0.64 | $1.14 |
| A100 80GB | 80 GB | Lambda Labs | $1.29 | $2.49 |
| H100 SXM | 80 GB | RunPod | $2.39 | $3.49 |
| Build | GPU | VRAM | Total Cost | Models Supported |
|---|---|---|---|---|
| Budget | RTX 3090 (used) | 24 GB | ~$1,200 | Up to 30B Q4 |
| Recommended | RTX 4090 | 24 GB | ~$3,200 | Up to 34B Q4, 7B full |
| Power | RTX 4090 + 3090 | 48 GB | ~$5,000 | Up to 70B Q4 |
| Mac Mini M4 Pro | M4 Pro (unified) | 48 GB | ~$2,000 | Up to 70B Q4 via MLX |
| Monthly Hours | Cloud Cost/mo (RTX 4090 @ $0.50/hr) | Time to Recover $3,200 RTX 4090 Build |
|---|---|---|
| 10 hr/mo | $5/mo | Never (53 years) |
| 30 hr/mo | $15/mo | 18 years |
| 50 hr/mo | $25/mo | 10.7 years |
| 120 hr/mo (4hr/day) | $60/mo | 4.4 years |
| 240 hr/mo (8hr/day) | $120/mo | 2.2 years |
| 480 hr/mo (16hr/day) | $240/mo | 13 months |
| 720 hr/mo (24hr/day) | $360/mo | 9 months |
What is the break-even point for a local LLM workstation vs cloud GPU?
An RTX 4090 workstation ($3,200 total) breaks even against $0.50/hr cloud GPU at approximately 6,400 cumulative hours. At 8 hours/day, that is 2.2 years. At 16 hours/day (shared team server), it is 13 months.
Does electricity cost significantly affect the comparison?
In the US (12¢/kWh), electricity adds ~$0.05/hr to local costs — minor. In Germany (38¢/kWh), it adds ~$0.16/hr, which meaningfully narrows the local advantage. The Mac Mini M4 Pro's 45W draw keeps electricity costs low even in high-rate countries.
Is RunPod or Vast.ai cheaper for occasional fine-tuning?
Vast.ai is typically 10–20% cheaper than RunPod at spot pricing, but RunPod has better uptime and a managed pods feature. For occasional use (< 20 hours/month), Vast.ai spot pricing is the lowest-cost option. For reliability-sensitive workloads, RunPod Community Cloud is the better choice.
What about depreciation on local hardware?
GPU hardware depreciates 20–40% over 3 years. An RTX 4090 bought at $1,700 may resell for $900–1,200 in 2028. Factoring this in, the true cost of local hardware after 3 years is (purchase price − resale value + electricity). For the RTX 4090 workstation: ($3,200 − $1,200 + $180 electricity at 8hr/day US) = ~$2,180 over 3 years vs. cloud at $0.50/hr × 8hr/day × 365 × 3 = $4,380.
How much does it cost to run a 70B model locally?
A 70B Q4_K_M model requires 48GB VRAM/unified memory. Hardware options: dual RTX 3090 ($2,000), Mac Mini M4 Pro 48GB ($2,000), or Mac Studio M4 Max 128GB ($3,000). Electricity at 8hr/day US rate adds $45–90/year. Running the same model on RunPod A40 spot at 8hr/day costs ~$1,300/year.