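Why should you test prompts across multiple models?
Prompts need to be tested across multiple models because each model is trained on a different data distribution and therefore defaults to different verbosity, formatting, and instruction-following behavior. Three reasons to run multi-model tests before deploying to production:
- Different training data distributions: GPT-4o, Claude 4.6 Sonnet, and Gemini 2.5 Flash are each trained on different data and tuned with different RLHF setups, so the same instruction produces different default outputs.
- Production resilience: Model APIs suffer outages and rate limits. A backup model works reliably only if it has been tested with the same prompt and scored against the same quality criteria.
- Cost optimization: A model that is 30% cheaper may reach 95% of the quality on a particular task. You cannot know without testing.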
What differs between models given the same prompt?
Five output dimensions differ consistently between models given the same prompt: format compliance, verbosity, factual accuracy, instruction following, and tone.
- Format compliance: Does the output follow the specified format (JSON, Markdown table, numbered list)? GPT-4o tends toward strict compliance when the format is stated explicitly.
- Verbosity: Word count and level of detail vary widely between models. Claude 4.6 Sonnet is usually more detailed.
- Factual accuracy: Hallucination rates vary by domain and by model. Test candidate models with the same factual prompts.
- Instruction following: Nested instructions and negative constraints are interpreted differently by each model. Claude adheres strictly to negative constraints.
- Tone: Models default to different formal and informal registers.
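Two of these dimensions, format compliance and verbosity, can be checked mechanically. The following is a minimal Python sketch rather than a complete evaluator. The model names and output strings are made-up placeholders, and the remaining dimensions (factual accuracy, instruction following, tone) would still need a rubric or human review.

```python
import json


def is_valid_json(output: str) -> bool:
    """Return True if the entire output parses as JSON."""
    try:
        json.loads(output)
    except json.JSONDecodeError:
        return False
    return True


def word_count(output: str) -> int:
    """Rough verbosity measure: number of whitespace-separated tokens."""
    return len(output.split())


# Hypothetical outputs for the same prompt ("return the result as JSON").
outputs = {
    "model_a": '{"sentiment": "positive", "score": 0.9}',
    "model_b": "Here is the result:\n\n| sentiment | score |\n|---|---|\n| positive | 0.9 |",
}

report = {
    name: {"json_ok": is_valid_json(text), "words": word_count(text)}
    for name, text in outputs.items()
}

for name, row in report.items():
    print(name, row)
```

Running the same checks on every model's output keeps the comparison consistent, because each model is measured by the same code rather than by impression.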
How to build a multi-model test matrix
A multi-model test matrix is a structured grid: rows are test cases (10 to 20), columns are models, and each cell holds a score of 1, 2, or 3.
1. Create 10 to 20 test cases that cover the expected input range: 60% typical inputs, 20% edge cases, 20% adversarial inputs.
2. Choose a scoring rubric: 1 = fail, 2 = partial, 3 = pass. Apply the same rubric to every model and every test case.
3. Run each test case on each model independently. Use the identical prompt with no model-specific tuning.
4. Score each cell, then compute aggregate scores per model and per test case type.
5. Decision threshold: do not select any model that scores below 80% of the maximum score for production until the prompt has been revised.
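The aggregation and threshold steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the model names, test case IDs, and scores are invented, and only five rows are shown for brevity where a real matrix would have 10 to 20.

```python
from collections import defaultdict

MAX_SCORE = 3      # rubric: 1 = fail, 2 = partial, 3 = pass
THRESHOLD = 0.80   # minimum share of the maximum score

# Hypothetical scored matrix: one row per test case, one score per model.
matrix = [
    {"case": "t01", "type": "typical",     "scores": {"model_a": 3, "model_b": 3}},
    {"case": "t02", "type": "typical",     "scores": {"model_a": 3, "model_b": 2}},
    {"case": "t03", "type": "typical",     "scores": {"model_a": 3, "model_b": 3}},
    {"case": "t04", "type": "edge",        "scores": {"model_a": 2, "model_b": 2}},
    {"case": "t05", "type": "adversarial", "scores": {"model_a": 3, "model_b": 1}},
]


def aggregate(rows):
    """Return (overall, per_type) score ratios in the range 0..1 for each model."""
    by_type = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for model, score in row["scores"].items():
            by_type[model][row["type"]].append(score)
    overall, per_type = {}, {}
    for model, types in by_type.items():
        all_scores = [s for scores in types.values() for s in scores]
        overall[model] = sum(all_scores) / (MAX_SCORE * len(all_scores))
        per_type[model] = {
            t: sum(scores) / (MAX_SCORE * len(scores)) for t, scores in types.items()
        }
    return overall, per_type


overall, per_type = aggregate(matrix)
passes = {model: ratio >= THRESHOLD for model, ratio in overall.items()}

for model in overall:
    print(f"{model}: {overall[model]:.0%} pass={passes[model]} {per_type[model]}")
```

Keeping the per-type breakdown alongside the overall ratio matters: a model can clear the overall threshold while still failing most adversarial cases.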
Tools for multi-model prompt testing
Two tools cover most workflows: PromptQuorum (simultaneous dispatch and side-by-side comparison) and Promptfoo (config-file-based test automation).
- PromptQuorum: Enter one prompt, select the models to test, and receive the outputs side by side in a single view. Free. Supports GPT-4o, Claude 4.6 Sonnet, and Gemini 2.5 Flash.
- Promptfoo: An open-source, YAML-based tool. Define prompts, test cases, and scoring criteria in a YAML file, then run the full matrix with a single CLI command.
- Setup in under 10 minutes: Run npm install -g promptfoo, then create promptfooconfig.yaml with the providers (openai:gpt-4o, anthropic:claude-sonnet-4-6, google:gemini-2.5-flash) and run promptfoo eval.
How to read multi-model test results
Multi-model test results lead to one of three decisions: pick one model, split by task type, or use a consensus approach.
- Pick one model: One model scores clearly higher across the whole test matrix. Use it for all production traffic and configure the second-best model as the fallback.
- Split by task type: No model wins every test category. Route each task type to the model that scores highest in that category.
- Consensus approach: PromptQuorum's consensus scoring averages model outputs or uses a voting mechanism. This is useful when no single model is reliable enough on its own.
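The three outcomes can be expressed as simple decision logic. The sketch below is a generic illustration, not PromptQuorum's actual implementation: the category names, score ratios, and the plain majority vote are all assumptions, and "clearly higher" is simplified to "best in every category" with no margin requirement.

```python
from collections import Counter


def route_by_category(per_type: dict) -> dict:
    """Map each category to its highest-scoring model.

    per_type has the shape {model: {category: score_ratio}}.
    Ties go to the model listed first.
    """
    categories = sorted({c for scores in per_type.values() for c in scores})
    return {
        c: max(per_type, key=lambda m: per_type[m].get(c, 0.0)) for c in categories
    }


def decide(per_type: dict):
    """Return ("pick_one" or "split_by_task", routing table)."""
    routing = route_by_category(per_type)
    if len(set(routing.values())) == 1:
        return "pick_one", routing
    return "split_by_task", routing


def majority_vote(answers: dict):
    """Return the answer given by more than half of the models, else None."""
    label, count = Counter(answers.values()).most_common(1)[0]
    return label if count > len(answers) / 2 else None


# Hypothetical per-category score ratios from a test matrix.
per_type = {
    "model_a": {"extraction": 0.95, "summarization": 0.80},
    "model_b": {"extraction": 0.85, "summarization": 0.90},
}
decision, routing = decide(per_type)

# Hypothetical classification answers from three models for one input.
answers = {"model_a": "positive", "model_b": "positive", "model_c": "negative"}
consensus = majority_vote(answers)

print(decision, routing, consensus)
```

Majority voting only applies when outputs can be compared for equality, such as labels or short structured answers. Free-form text would need a different agreement measure.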
Frequently asked questions
What is multi-model prompt testing?
Multi-model prompt testing is the practice of running the same prompt on two or more AI models, such as GPT-4o, Claude 4.6 Sonnet, and Gemini 2.5 Flash, and comparing the outputs against quality criteria such as format compliance, verbosity, accuracy, and instruction following.
Why does the same prompt produce different outputs on different models?
Each model is trained on a different data distribution with a different RLHF setup, so each has different defaults for verbosity, tone, format compliance, and instruction following. A prompt that produces a clean JSON object on GPT-4o may produce a Markdown explanation on Claude.
How many test cases does a multi-model test matrix need?
A reliable signal requires at least 10 test cases. Aim for 15 to 20 test cases that cover typical inputs, edge cases, and adversarial inputs. Fewer than 10 test cases is too noisy.
Which tools support multi-model prompt testing?
PromptQuorum sends one prompt to all models simultaneously and shows a side-by-side comparison for free. Promptfoo is an open-source tool that supports GPT-4o, Claude, Gemini, and local models such as Llama 3.2. Braintrust offers dataset-driven evaluation.
How do you run multi-model tests that align with the METI AI Governance 2024 guidelines?
METI AI Governance 2024 calls for transparency and explainability in prompt testing for enterprise deployments. Include audit logs, explainability checks, and model-specific output uncertainty in your test matrix. Financial and medical institutions need to retain test results as compliance documentation.
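One way to keep such an audit trail is to write one structured record per matrix cell. The sketch below is only an assumption about what a useful record might contain. The field names are hypothetical and are not taken from the guideline, so check the actual requirements with your compliance team.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(model, case_id, prompt, output, score, rationale):
    """Build one audit-log entry for a single test-matrix cell."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "case_id": case_id,
        # Hash the prompt so the exact version tested can be verified later.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "output": output,
        "score": score,          # rubric: 1 = fail, 2 = partial, 3 = pass
        "rationale": rationale,  # short explanation of why the score was given
    }


# Hypothetical cell from a test run.
record = audit_record(
    model="model_a",
    case_id="t01",
    prompt="Classify the sentiment of the review and answer in JSON.",
    output='{"sentiment": "positive"}',
    score=3,
    rationale="Valid JSON and correct label.",
)

# One JSON object per line (JSON Lines) is easy to append and to audit.
line = json.dumps(record, ensure_ascii=False)
print(line)
```

Appending each line to a write-once log file, or to whatever store your retention policy specifies, keeps the scores traceable to the exact prompt and output that produced them.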
What are the best practices for multi-model deployments in the Asia-Pacific region?
Data sovereignty requirements are strict in the Asia-Pacific region (ASEAN, Japan, South Korea, India). Use multi-model testing to confirm that each model complies with local data protection requirements (METI, PDPA). Test the data location settings and latency requirements for each model. Cross-border outputs may be subject to regulation, so include local inference options.
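Related resources
- How to evaluate and compare LLM models
- Prompt testing and validation: automating test suites for LLMs
- Getting started with Promptfoo: prompt testing locally and in CI/CD
- When to use consensus scoring across multiple models
- Cost optimization: a guide to choosing between GPT-4o, Claude, and Gemini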