Key Takeaways
- Open-source (Llama, Mistral, Gemma): run locally, full control, lower per-token cost, but require infrastructure & management
- Proprietary (GPT-4o, Claude, Gemini): top-tier capability, fully managed inference, pay-per-token pricing, but your data is sent to provider servers
- Cost trade-off: open-source saves on tokens but costs on GPU/servers. Proprietary saves on ops but costs per-query.
- PromptQuorum supports both: dispatch the same prompt to open-source (local Ollama) and proprietary (OpenAI, Anthropic, Google) in parallel
- Privacy: open-source self-hosted = maximum privacy, since data never leaves your infrastructure (simplifying GDPR/HIPAA compliance). Proprietary = trust in the provider's security and data-handling policies.
- Customization: open-source can be fine-tuned on proprietary data. Proprietary APIs only allow prompting (no model modification).
- Real-world performance: top open-source models (e.g. Llama 3.3 70B) compete with GPT-4o on many benchmarks; smaller models lag significantly.
- Timeline: open-source typically trails proprietary by ~6–12 months. New capabilities usually appear in OpenAI/Anthropic models first, then reach open-source.
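The parallel-dispatch idea above can be sketched in a few lines of Python. This is an illustrative stand-in, not PromptQuorum's actual code: the two backend functions are stubs, and a real version would POST to a local Ollama server and call the OpenAI/Anthropic/Google APIs in their place.

```python
# Hypothetical sketch of dispatching one prompt to several model
# backends in parallel and collecting each backend's answer.
# The provider calls are stubs for illustration only.
from concurrent.futures import ThreadPoolExecutor


def ask_ollama(prompt: str) -> str:
    # Stub: a real version would query a locally running Ollama server.
    return f"[local llama] echo: {prompt}"


def ask_openai(prompt: str) -> str:
    # Stub: a real version would call the OpenAI API.
    return f"[gpt-4o] echo: {prompt}"


def dispatch(prompt: str) -> dict:
    """Send the same prompt to every backend concurrently."""
    backends = {"ollama": ask_ollama, "openai": ask_openai}
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        futures = {name: pool.submit(fn, prompt)
                   for name, fn in backends.items()}
        # Blocks until all backends have answered.
        return {name: fut.result() for name, fut in futures.items()}


if __name__ == "__main__":
    answers = dispatch("Summarize the GDPR in one sentence.")
    for name, text in answers.items():
        print(f"{name}: {text}")
```

Running the backends in threads means total latency is roughly that of the slowest backend rather than the sum of all of them, which is what makes side-by-side comparison of local and hosted models practical.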