Key points
- Fastest path: install Ollama → run `ollama run llama3.2` → chat in your terminal. Under 5 minutes total on a fast connection.
- On an 8 GB RAM machine, start with `llama3.2:3b` (2 GB download) or `phi4-mini` (2.3 GB). Both run on recent laptops.
- Expect 15-40 tokens/sec on CPU, 60-120 tokens/sec on a midrange GPU or Apple Silicon.
- First responses can feel slow compared with cloud APIs. Local models trade speed for privacy and zero cost.
- Everything works offline after the initial model download. No internet connection is needed for later sessions.
Step 1: Install Ollama
Ollama is the fastest way to run LLMs locally. Installation is a single command or a two-minute download:
```
# macOS (Homebrew)
brew install ollama

# Linux
curl -fsSL https://ollama.com/install.sh | sh

# Windows: download the installer from ollama.com/download
```
Verify Ollama is running
After installing, confirm that Ollama is active:
```
curl http://localhost:11434
# Expected output: Ollama is running
```
Step 2: Choose your first model
Choose a model based on your available RAM. If in doubt, start with `llama3.2:3b`: it runs on any machine with 4 GB of RAM and produces useful output (a quick way to check your RAM follows the table):
| Your RAM | Recommended model | Download size | Why |
|---|---|---|---|
| 4 GB | llama3.2:1b | ~1.3 GB | Smallest usable Llama model |
| 8 GB | llama3.2:3b | ~2 GB | Best quality-to-size ratio for beginners |
| 8-16 GB | llama3.1:8b | ~4.7 GB | Strong general-purpose model |
| 16 GB+ | mistral:7b or qwen2.5:7b | ~4-5 GB | Competitive quality, fast inference |
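If you're not sure how much RAM your machine has, here is a quick terminal check; these are standard OS commands, nothing Ollama-specific:

```
# macOS: total RAM in bytes
sysctl -n hw.memsize

# Linux: human-readable memory summary
free -h
```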
Step 3: Download a model
Download models with `ollama pull`. Models are stored in `~/.ollama/models` and only need to be downloaded once:
```
ollama pull llama3.2

# Or pull a specific size variant
ollama pull llama3.2:3b
ollama pull llama3.1:8b
```
What the download looks like
Ollama shows download progress in the terminal. The `llama3.2:3b` model takes 2-5 minutes on a typical broadband connection. Models are stored compressed, so the 2 GB download expands to about 2.3 GB on disk.
```
pulling manifest
pulling 966de95ca8dc... 100% ▕██████████████████▏ 1.9 GB
pulling 9f436a92eb8b... 100% ▕██████████████████▏  42 B
verifying sha256 digest
writing manifest
success
```
Step 4: Launch the model and send your first prompt
Start an interactive chat session:
```
ollama run llama3.2
# Ollama loads the model and shows a prompt:
>>> Send a message (/? for help)
```
Your first conversation
Type a message and press Enter. The model streams its response token by token:
```
>>> What are local LLMs?
Local LLMs (large language models) are AI models that run entirely
on your own hardware -- your laptop, desktop, or server. Unlike cloud
services such as ChatGPT or Claude, local LLMs process everything
locally with no data sent to external servers...
```
What to expect: speed, quality, and limits
Speed varies by hardware. On a 2023 laptop without a GPU, expect 15-25 tokens/sec from a 3B model and 8-15 tokens/sec from an 8B model. Apple M3 Pro: 50-80 tokens/sec on an 8B model. NVIDIA RTX 4070 Ti: 90-130 tokens/sec on an 8B model.
Quality: `llama3.2:3b` is noticeably weaker than GPT-4o or Claude Opus on complex tasks, but it is useful for summarization, simple Q&A, and code explanation. For multi-step reasoning or long-form writing, consider stepping up to an 8B or 13B model.
Context window: `llama3.2:3b` advertises a 128K-token context window, but in practice quality degrades after roughly 16K tokens in a single conversation.
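If a conversation needs more working context than the session default, you can raise it from inside the chat. A minimal sketch, assuming Ollama's REPL `/set parameter` command and the `num_ctx` option (a larger context window uses more RAM):

```
ollama run llama3.2:3b
# Inside the session, raise the context window for this session only:
>>> /set parameter num_ctx 16384
```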
First-response latency: the first response after `ollama run` includes the model load time (5-30 seconds). Subsequent responses in the same session are fast.
How to use local LLMs outside the terminal
Ollama's terminal chat is handy for testing, but real use cases call for a better interface:
- Open WebUI: a full-featured web interface for Ollama. Launch it with Docker: `docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway ghcr.io/open-webui/open-webui:main`, then open http://localhost:3000.
- LM Studio: if you prefer a desktop GUI, the LM Studio installation guide walks through the complete setup.
- API integration: the Ollama API at `localhost:11434` is compatible with the OpenAI SDK. Any application that accepts a custom OpenAI base URL can talk to your local model (see the sketch after this list).
- VS Code / Cursor: extensions such as Continue.dev connect to Ollama and provide local AI coding assistance inside your editor.
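To illustrate the OpenAI-compatible API, here is a minimal sketch that calls Ollama's `/v1/chat/completions` endpoint with curl; it assumes you already pulled `llama3.2:3b` in Step 3:

```
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2:3b",
    "messages": [
      {"role": "user", "content": "Summarize what a local LLM is in one sentence."}
    ]
  }'
```

Pointing an existing OpenAI SDK client at base URL `http://localhost:11434/v1` (with any non-empty API key) achieves the same thing without changing application code.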
Your first local LLM: regional context
EU / GDPR: when you run a local LLM with Ollama, no prompt data, context, or output ever leaves your machine, which makes it a privacy-preserving alternative to cloud AI APIs for EU professionals who handle personal data.
Japan (METI): the METI AI governance guidelines call for documenting where AI inference takes place. Your first Ollama setup creates a fully auditable local environment: model files live in `~/.ollama/models` under version-specific filenames, and `ollama ps` shows what is serving inference. You can record the exact model version and hardware for METI compliance purposes.
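For those audit notes, a short sketch of standard Ollama subcommands that capture exactly which model builds are installed and running:

```
# Installed models with their digests and sizes
ollama list

# Details (architecture, parameters, quantization, license) for one model
ollama show llama3.2:3b

# The model currently loaded for inference
ollama ps
```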
China: for Chinese-language workflows, use qwen2.5:3b instead of llama3.2:3b as your first model (`ollama pull qwen2.5:3b`). Qwen2.5 produces far better Chinese-language results in the same hardware tier as Llama.
Common questions about your first local LLM run
The model responds very slowly -- is that normal?
On CPU-only hardware, 8-20 tokens/sec from a 7B model is normal. A token is roughly 0.75 words, so at 10 tokens/sec a 100-word response takes about 13 seconds. To speed up inference, use a smaller model (3B instead of 8B), enable GPU offload if you have a supported GPU, or use the Q4_K_M quantization level, which is the most common and fastest setting.
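Rather than guessing, you can measure your real throughput: `ollama run --verbose` prints timing statistics after each response, and it accepts a one-shot prompt as an argument:

```
# "eval rate" in the output is your generation speed in tokens/sec
ollama run llama3.2:3b --verbose "Explain RAM in two sentences."
```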
Can I run two models at the same time?
Yes, if you have enough RAM; Ollama can keep several models loaded concurrently. By default it unloads a model after 5 minutes of inactivity, which you can change with the OLLAMA_KEEP_ALIVE environment variable. Running two 7B models at once takes roughly 16 GB of RAM.
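A minimal sketch of changing that timeout; `OLLAMA_KEEP_ALIVE` is read by the Ollama server, so it must be set where the server process starts:

```
# Keep models loaded for 1 hour of inactivity instead of 5 minutes
OLLAMA_KEEP_ALIVE=1h ollama serve

# Or keep models loaded indefinitely
OLLAMA_KEEP_ALIVE=-1 ollama serve
```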
How do I stop Ollama from running in the background?
macOS: click the llama icon in the menu bar and choose Quit. Linux: run `systemctl stop ollama`. Windows: right-click the system tray icon and choose Quit.
What is the easiest way to run a local LLM for the first time?
Install Ollama (ollama.com), run `ollama pull llama3.2:3b`, then run `ollama run llama3.2:3b`. That's it: three commands, 2-5 minutes, and an AI model runs on your machine with no internet required.
How do I check that my local LLM is working correctly?
Run `ollama ps` in a terminal. If a model is running, it appears in the list with its name, size, and memory usage. Send a simple prompt like "What is 2+2?"; if it answers "4", everything is working.
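Both checks fit in two commands; `ollama run` accepts a one-shot prompt as an argument, so you don't need to enter the interactive session:

```
# Is a model loaded?
ollama ps

# One-shot sanity check; the answer should contain "4"
ollama run llama3.2:3b "What is 2+2?"
```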
Do I need a GPU to run a local LLM?
No; local LLMs run on CPUs. A GPU speeds inference up by 5-10x, but CPU-only is fine for learning and many real-world use cases. Recent laptops with Apple M1/M2, AMD Ryzen, or 12th-gen Intel CPUs run 3B-7B models at reasonable speeds (10-30 tokens/sec).
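If you do have a GPU and want to confirm Ollama is actually using it, the PROCESSOR column of `ollama ps` shows how the loaded model is split between CPU and GPU (the output below is illustrative, not literal):

```
ollama ps
# NAME          ID            SIZE      PROCESSOR    UNTIL
# llama3.2:3b   a80c4f17acd5  3.6 GB    100% GPU     4 minutes from now
```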
How much disk space do local LLMs use?
`llama3.2:1b` is 1.3 GB, `llama3.2:3b` is 2 GB, and `llama3.1:8b` is 4.7 GB. These are the compressed sizes Ollama stores on disk.
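To see what is actually on disk, `ollama list` reports per-model sizes, and a plain `du` totals the model store (the `~/.ollama/models` path is the default location mentioned in Step 3):

```
# Per-model sizes as stored by Ollama
ollama list

# Total disk usage of the model store
du -sh ~/.ollama/models
```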
Can I use a local LLM without an internet connection?
Yes, completely. Download a model once with Ollama (internet required for that step), and afterwards it runs locally without internet, indefinitely. Ideal for private networks, flights, or fully offline environments.
What is the difference between a local LLM and ChatGPT?
ChatGPT runs on OpenAI's servers; a local LLM runs on your machine. Local = your data never leaves the device, full privacy, no API costs. ChatGPT = higher quality on complex tasks, requires internet and a paid subscription.
What is the first model to try with Ollama?
`ollama pull llama3.2:3b` -- it is only 2 GB, runs on any recent laptop, produces capable answers, and is the starting point Ollama itself recommends.
Next steps after your first run
Now that you have a working local LLM, explore what is possible. To understand which models best fit your hardware, see Best local LLM models for beginners. For laptop-specific performance, see How to run local LLMs on a laptop.
Further reading
- **Ollama model library** -- the official list of downloadable models and their specifications
- **Ollama GitHub repository** -- open-source code, documentation, and issue tracking
- **Meta Llama 3.2 model card** -- official specifications, training data, and performance benchmarks
Common mistakes after your first run
- Confusing token count with speed -- a 7B model generating 100 tokens at 20 tokens/sec takes 5 seconds. That is not instant.
- Running inference while the system is busy with other tasks sharply reduces your effective tokens/sec.
- Not checking context window limits -- most beginner-friendly models support 2K-8K tokens, not the 100K+ of frontier models.
- Expecting an instant response on the first run -- the first response includes model load time (5-30 seconds). Later responses in the same session are 2-5x faster.
- Using the wrong model tag -- `llama3.1:8b-text` is a base text-completion model and will loop or repeat endlessly in chat. Use an `-instruct` tag such as `llama3.1:8b-instruct` for chat (see the sketch below this list).
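The fix is just a different tag at pull time; the two tags below are the ones named in the last item above:

```
# Chat-tuned variant: follows instructions, use this with ollama run
ollama pull llama3.1:8b-instruct

# Base text-completion variant: loops/repeats in chat, avoid it there
# ollama pull llama3.1:8b-text
```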