Key Points
- Store prompts in Git; version control is foundational—spreadsheets break at scale
- Add YAML metadata: owner, created date, last reviewed, test results, performance baseline
- Implement search: Either YAML frontmatter + regex grep, or API index (Braintrust, LangSmith)
- Enforce naming: e.g., `prompts/{domain}/{use-case}/v{N}.yaml` — predictable, auditable
- Require testing before merge: No prompt enters main without passing the test suite
Why Teams Need a Prompt Library
Without a library, teams duplicate prompts, lose institutional knowledge, and re-solve problems.
- Duplication: Three teams each write a customer-support prompt, unaware the others have done the same
- Knowledge loss: The person who wrote the best prompt leaves; the prompt lives in Slack and is not discoverable
- Rework: Each team re-optimizes the same prompt independently instead of improving a shared version
- Risk: Production runs an unvetted prompt because no one reviewed it before it went live
Prompt Library Structure
Organize by domain, use case, and version; keep metadata alongside the prompt content in the same file.
- Directory structure: `prompts/{domain}/{use-case}/v{N}.yaml` (e.g., `prompts/support/customer-escalation/v2.yaml`)
- One YAML per prompt: prompt content + metadata (owner, created, reviewed, tests, performance)
- Naming rules: domain (support, sales, coding, research), use-case (kebab-case), version semver
- Alternative: Database-backed (Braintrust, LangSmith) for search + API access without Git friction
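As a sketch, one prompt file under this layout might look like the following (all field values are illustrative, not taken from a real registry):

```yaml
# prompts/support/customer-escalation/v2.yaml (illustrative)
prompt: |
  You are a support agent. Summarize the customer's issue,
  classify severity (low/medium/high), and draft an escalation note.
metadata:
  owner: jane.doe@example.com
  created: 2026-04-05
  lastReviewed: 2026-06-01
  tags: [customer-service, escalation, email]
  testCases: 5
  performanceBaseline:
    model: GPT-4o
    accuracy: 0.92
    latency: 0.8s
  deprecated: false
```

Keeping the prompt body and its metadata in one file means a single PR diff shows both the change and its updated provenance.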
Essential Metadata
Every prompt needs owner, creation date, review status, and performance baseline.
- owner: Email or @slack handle of maintainer
- created: ISO date (2026-04-05)
- lastReviewed: Audit trail for compliance (required by SOC 2)
- tags: Array of keywords (e.g., customer-service, escalation, email)
- testCases: Integer count of passing test cases; a nonzero count signals the prompt has been validated
- performanceBaseline: { model: "GPT-4o", accuracy: 0.92, latency: "0.8s" }
- deprecated: Boolean; if true, link to replacement version
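These fields can be enforced mechanically in a pre-merge check. A minimal sketch, assuming the YAML metadata has already been parsed into a dict (field names follow the list above; `replacedBy` is a hypothetical field linking a deprecated prompt to its replacement):

```python
from datetime import date

REQUIRED_FIELDS = {"owner", "created", "lastReviewed", "tags",
                   "testCases", "performanceBaseline"}

def validate_metadata(meta: dict) -> list:
    """Return a list of problems; an empty list means the metadata passes."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - meta.keys())]
    if "created" in meta:
        try:
            date.fromisoformat(str(meta["created"]))
        except ValueError:
            problems.append("created is not an ISO date")
    if meta.get("deprecated") and not meta.get("replacedBy"):
        problems.append("deprecated prompt must link to a replacement")
    if isinstance(meta.get("testCases"), int) and meta["testCases"] == 0:
        problems.append("no test cases recorded")
    return problems

meta = {
    "owner": "jane.doe@example.com",
    "created": "2026-04-05",
    "lastReviewed": "2026-06-01",
    "tags": ["customer-service", "escalation"],
    "testCases": 5,
    "performanceBaseline": {"model": "GPT-4o", "accuracy": 0.92, "latency": "0.8s"},
}
print(validate_metadata(meta))  # []
```

Wiring this into CI turns "metadata complete" from a reviewer checklist item into a hard gate.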
Versioning and Updates
Use semantic versioning; major changes require re-testing and review.
- v1.0: Initial validated prompt
- v1.1: Bug fixes or clarifications, same test suite passes
- v2.0: Significant rewrite, new test suite, may change output format
- Update workflow: Branch → edit prompt → run tests → PR review → merge → tag release
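Callers should resolve the newest version at runtime rather than hard-coding a path, so a merge-and-tag immediately takes effect. A minimal sketch, assuming filenames like `v1.0.yaml` or `v2.yaml` inside a use-case directory:

```python
import re
from typing import Optional

def latest_version(filenames: list) -> Optional[str]:
    """Pick the highest-versioned prompt file, e.g. 'v2.0.yaml' over 'v1.1.yaml'."""
    versions = []
    for name in filenames:
        m = re.fullmatch(r"v(\d+)(?:\.(\d+))?\.yaml", name)
        if m:
            # Missing minor version (v2.yaml) is treated as .0
            versions.append(((int(m.group(1)), int(m.group(2) or 0)), name))
    return max(versions)[1] if versions else None

print(latest_version(["v1.0.yaml", "v1.1.yaml", "v2.0.yaml"]))  # v2.0.yaml
```

In practice this would run over an `os.listdir` of the use-case directory, skipping files whose metadata marks them deprecated.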
Search and Discovery
Prompts only matter if the team can find and use them.
- Git-based: `grep -r "tag: customer-service" prompts/` — simple, works offline
- Metadata index: Braintrust Dashboard or LangSmith Registry — rich search UI; budget for a paid plan (pricing varies by vendor and tier)
- API endpoint: Custom endpoint returns YAML metadata + link to Git; teams query via SDK
- Recommendation: Start with Git grep; upgrade to API as library grows >50 prompts
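The Git-grep approach can be wrapped in a small helper so application code never shells out. A sketch assuming inline `tags: [...]` lists in each YAML file — a stand-in for `grep -r`, not any vendor API:

```python
import os
import re

def find_prompts_by_tag(root: str, tag: str) -> list:
    """Scan every .yaml file under `root` and return paths whose tags include `tag`."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(".yaml"):
                continue
            path = os.path.join(dirpath, name)
            with open(path) as f:
                text = f.read()
            # Match an inline YAML list like `tags: [customer-service, escalation]`
            m = re.search(r"tags:\s*\[([^\]]*)\]", text)
            if m and tag in [t.strip() for t in m.group(1).split(",")]:
                hits.append(path)
    return sorted(hits)
```

Usage would be `find_prompts_by_tag("prompts", "customer-service")`; a real implementation should use a YAML parser rather than a regex once tag lists span multiple lines.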
Review and Approval Workflow
No prompt merges to main without review; enforce at Git level via branch protection.
- Branch protection rule: Require 1–2 approvals before merging prompt changes
- Reviewer checklist: (1) Tests pass, (2) Tests are adequate (not just 1 example), (3) Metadata complete, (4) Naming follows convention
- Comment template: "Approved for {domain}; estimated impact: {accuracy change}; monitor {metric}"
- Quarterly audit: Review all prompts marked "lastReviewed" >90 days ago; update or deprecate
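The quarterly audit can be automated: flag every prompt whose `lastReviewed` date is more than 90 days old. A sketch, assuming `(path, lastReviewed)` pairs have already been extracted from the metadata files (paths are illustrative):

```python
from datetime import date, timedelta

def stale_prompts(records, today=None, max_age_days=90):
    """Return prompt paths whose lastReviewed date is older than max_age_days."""
    today = today or date.today()
    cutoff = today - timedelta(days=max_age_days)
    return [path for path, last_reviewed in records
            if date.fromisoformat(last_reviewed) < cutoff]

records = [
    ("prompts/support/customer-escalation/v2.yaml", "2026-06-01"),
    ("prompts/sales/outreach-email/v1.yaml", "2026-01-15"),
]
print(stale_prompts(records, today=date(2026, 7, 1)))
# ['prompts/sales/outreach-email/v1.yaml']
```

Run as a scheduled job, this can open a review ticket per stale prompt instead of relying on someone remembering the quarterly sweep.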
Common Mistakes
- Spreadsheet as "library"—no version history; changes overwrite; no audit trail
- No metadata—don't know who maintains prompt, when it was written, or if it's tested
- Naming chaos—`ChatGPT_v2_FINAL_UPDATED.txt`; no one knows which is current
- No discovery mechanism—prompts buried in Git; team doesn't know they exist
- No deprecation—old prompts never removed; library grows with zombie prompts
Sources
- Braintrust prompt registry documentation
- LangSmith docs: Prompt management
- GitHub branch protection rules guide