ๅ ณ้ฎ่ฆ็น
- CPU: AMD Ryzen 7 5700X ($250 used) โ 8 cores, good for LLM hosting + OS tasks
- GPU: RTX 4070 ($350 used) โ 12GB VRAM, runs 70B quantized (Q4) at 120 tok/sec
- RAM: 32GB DDR4 ($80โ120 used) โ Fast GPU inference + OS multitasking
- SSD: 1TB NVMe ($40 new) โ Boot drive; models stored on separate 2TB external SSD
- PSU: 750W Gold ($70 new) โ Enough for RTX 4070 + CPU + storage
- Cooling: Stock Ryzen cooler ($0, included); optional case fans ($30) for 24/7 use
- Total: ~$1,000 for new parts, $700โ800 used market
- Inference speed: 120โ150 tokens/second on Llama 3.1 70B (Q4 quantization)
What Are the Specs for a $1,000 Build?
The $1,000 build targets Llama 3.1 70B inference at consumer-grade speed. RTX 4070 has 12GB VRAM, enough for 70B quantized models. Ryzen 7 5700X is fast enough for OS overhead (typically 5โ10% CPU during inference). 32GB RAM prevents swapping to disk.
| Component | Budget Pick | Better Alternative |
|---|---|---|
| โ | โ | โ |
| โ | โ | โ |
| โ | โ | โ |
| โ | โ | โ |
| โ | โ | โ |
| โ | โ | โ |
| โ | โ | โ |
Which GPU Should You Choose?
RTX 4070 (recommended): 12GB VRAM, $350โ450 used. Runs Llama 3.1 70B (Q4) at 120 tok/sec. Best value for $1K budget.
RTX 4070 Super: 12GB VRAM, $450โ550 used. 10โ15% faster than 4070. Worth it if you find one at $450.
RTX 3090 (alternative): 24GB VRAM, $400โ600 used. Slightly slower than 4070 per token, but larger context window support.
Avoid: RTX 3080 10GB (too small VRAM), RTX 4060 12GB (too slow), anything <12GB for 70B models.
How Do You Assemble and Set Up the PC?
Assembly is straightforward for standard ATX motherboards. Install CPU in socket, RAM in slots, SSD in M.2 slot, then GPU in top PCIe slot. Install Windows/Linux, NVIDIA drivers, then Ollama or vLLM.
- 1Install CPU into socket (align triangle markers).
- 2Install RAM: Push down clips on both ends, insert RAM until it clicks.
- 3Install NVMe SSD into M.2 slot (typically below GPU slot).
- 4Install GPU: Remove slot covers, push PCIe retention clip, insert GPU, secure with bracket screw.
- 5Connect 24-pin motherboard power + 8-pin CPU power from PSU.
- 6Connect SATA data cable to SSD (if not NVMe).
- 7Install Windows 11 or Ubuntu Linux from USB.
- 8Download NVIDIA drivers (nvidia.com), install, reboot.
- 9Install Ollama (ollama.ai) or vLLM (`pip install vllm`), pull model, test.
What Performance Should You Expect?
On RTX 4070: Llama 3.1 70B (Q4) runs at 120โ150 tokens/second. This is sufficient for interactive chatbots, batch processing, and coding assistance. Full responses (300 tokens) complete in 2โ3 seconds.
Single-user latency: ~5ms per token (model) + 50ms overhead (VRAM transfer) = 55ms/token wall-clock time (~18 tok/sec user-perceived).
For 24/7 use: Power draw is ~140W (Ryzen) + 250W (GPU) = 390W average, or ~$3.50/week in US electricity.
How Much Does Each Component Cost?
As of April 2026, used prices are 30โ40% below new. Building used saves $300โ400.
| Component | New Price | Used Price (eBay) |
|---|---|---|
| โ | โ | โ |
| โ | โ | โ |
| โ | โ | โ |
| โ | โ | โ |
| โ | โ | โ |
| โ | โ | โ |
| โ | โ | โ |
| โ | โ | โ |
| โ | โ | โ |
FAQ
Can I use this build for gaming and AI?
Yes. Gaming uses GPU, AI uses GPU. Run one at a time to avoid bottlenecks. Ryzen 7 5700X is a solid gaming CPU (~120 fps 1440p in modern titles).
Is 32GB RAM enough?
For inference: Yes, sufficient. For training/fine-tuning: 64GB is better (allows larger batch sizes). 32GB OK for LoRA fine-tuning.
Can I upgrade the GPU later?
Yes. RTX 4070 is current-gen (2023). RTX 5070 will launch mid-2025. Motherboard will support future GPUs via PCIe 4.0.
How long until the PC becomes obsolete?
RTX 4070 will remain relevant 5+ years (similar to GTX 1080 today). Ryzen 5700X is solid for 7+ years.
Should I buy new or used parts?
Used GPU (eBay): Save 30โ40%, 6-month warranty typical. Motherboard/CPU: Buy new for reliability ($250โ300 extra total).
What Are Common Mistakes When Building?
- Buying a small PSU (500W). RTX 4070 needs minimum 650W; use 750W for headroom.
- Skimping on RAM. 16GB causes disk swap during large model loading; use 32GB minimum.
- Choosing DDR5 RAM. DDR4 is cheaper. Ryzen 5000 supports DDR4 fine.
- Not using PCIe 4.0. Older SSDs (SATA) bottleneck model loading. Get NVMe PCIe 4.0+.
Sources
- eBay GPU pricing (RTX 4070): market data April 2026
- AMD Ryzen 7 5700X specifications: AMD official specs
- NVIDIA RTX 4070 TDP and power specifications (250W)