
Local LLM Prompt Engineering 2026: CoT and Few-Shot

11 min read · By Hans Kuepper · Founder of PromptQuorum, multi-model AI dispatch tool · PromptQuorum

Local LLMs (7B–13B models) respond to prompts differently from cloud APIs. They need explicit structure, clearer instructions, and less reliance on the model inferring the task from context alone.

Local 7B–13B models respond to prompts differently from GPT-5.2 or Claude. They need explicit structure, clearer instructions, and 3–5 few-shot examples where cloud models need only 1–2. As of April 2026, proven techniques include chain-of-thought prompting (10–20% accuracy gain), role definition, structured output formatting (JSON), and system prompt configuration in Ollama and LM Studio.

Key Takeaways

  • Local 7B models need more explicit guidance than GPT-5.5: longer prompts and clearer instructions.
  • Chain-of-thought ("Let's think step by step") improves reasoning accuracy by 10–20%.
  • Always specify the output format (JSON, Markdown, plain text). Unstructured output is unpredictable.
  • Few-shot examples (3–5 for local models) are more effective than zero-shot. More examples yield higher consistency.
  • Role definition ("You are a Python expert") improves domain-specific responses.

Quick Facts

  • Accuracy gain from CoT: 10–20% on reasoning tasks
  • Few-shot requirement: a local 7B model needs 3–5 examples vs. 1–2 for cloud APIs
  • Context consumption: each example uses 50–200 tokens
  • Temperature effect: lowering from 0.8 to 0.3 improves factual accuracy by 15–25%
  • Model size difference: 7B models need more explicit guidance than 70B models
  • Output format consistency: a JSON specification improves reliability by 30–40%

How Do Local Models Differ?

Aspect | GPT-5.2 (ChatGPT Plus) | Local 7B (Llama 3.3 8B) | Local 70B (Llama 3.3)
Context window | 128K tokens | 4K–128K tokens | 128K tokens
Instruction following | Excellent | Good with explicit prompts | Very good
Few-shot learning | 1–2 examples | 3–5 examples needed | 2–3 examples
Reasoning | Multi-step, implicit | Needs explicit step-by-step | Moderately implicit
System prompt | Handled by the API | Per-tool configuration needed | Per-tool configuration needed
Temperature default | 1.0 (API) | 0.8 (Ollama default) | 0.8 (Ollama default)

How Does Chain-of-Thought Prompting Improve Accuracy?

Chain-of-thought (CoT) prompting asks the LLM to show its reasoning step by step before answering. The technique is especially effective for local 7B–13B models, because they lack the implicit reasoning ability of larger cloud models. On a math problem such as "17 × 24", a local model without CoT often guesses wrong. With explicit step-by-step reasoning, it breaks the problem into parts and achieves 10–20% higher accuracy.

Without CoT: "What is 17 × 24?" → the model answers directly and is often wrong.

With CoT: "Solve step by step: 17 × 24" → the model shows: 17 × 20 = 340, 17 × 4 = 68, total = 408. More accurate.

The same technique extends to local AI agents, which use reasoning internally to select tools.

Chain-of-thought prompting instructs the model to break its reasoning into explicit steps before answering, improving accuracy on complex tasks by 10–20%.

python
# Prompt that uses chain-of-thought
prompt = """
You will answer a question by thinking step-by-step.
Let me think about this:

Question: Why do local LLMs require more explicit prompting than cloud APIs?

Thinking:
1. First, consider the differences in model size...
2. Then, think about training data and fine-tuning...
3. Finally, consider the architecture and inference optimization...

Answer:
"""

# The numbered "Thinking" steps guide the model to reason before it answers

Pro tip: CoT works best when you prime the output with partial reasoning. Example: "Let me analyze this step by step: first, I notice..."
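As a rough illustration of the priming idea, the sketch below builds a CoT prompt that ends with the start of the reasoning, then pulls the final answer out of a completion. The helper names `build_cot_prompt` and `extract_answer`, the `Answer:` marker convention, and the sample completion are all assumptions made for this example, not part of any tool's API.

```python
import re
from typing import Optional

# Partial reasoning used to prime the model's output (wording is an assumption).
COT_PRIMER = "Let me analyze this step by step: first,"

def build_cot_prompt(question: str) -> str:
    """Ask for step-by-step reasoning and end the prompt mid-reasoning."""
    return (
        "Answer the question by reasoning step by step.\n"
        "Finish with a line of the form 'Answer: <result>'.\n\n"
        f"Question: {question}\n\n"
        f"Reasoning: {COT_PRIMER}"
    )

def extract_answer(completion: str) -> Optional[str]:
    """Return the text after the last 'Answer:' line, or None if there is none."""
    matches = re.findall(r"^Answer:\s*(.+)$", completion, flags=re.MULTILINE)
    return matches[-1].strip() if matches else None

# Hand-written stand-in for a model completion, for illustration only.
sample_completion = "17 x 20 = 340\n17 x 4 = 68\n340 + 68 = 408\nAnswer: 408"

print(build_cot_prompt("What is 17 x 24?"))
print(extract_answer(sample_completion))
```

Keeping the final answer on a fixed marker line makes it easy to separate the reasoning from the result when the output feeds another program.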

Why Does Output Format Specification Matter for Local Models?

Specifying the exact output format (JSON, Markdown, plain text) is critical for local models, because without explicit instructions they produce unpredictable output. Cloud models such as GPT-5.5 can infer intent even from vague requests; local 7B–13B models cannot. For local RAG systems that need structured document extraction, a JSON format specification prevents parsing errors and improves extraction accuracy by 30–40%.

Example: "Extract entities from the text" may return narrative text instead of a list.

Improved: "Extract entities in JSON format with the keys person, location, organization."

python
# Bad: the expected output is ambiguous
vague_prompt = "Summarize this text"

# Good: the format is explicit
explicit_prompt = """
Summarize the text in EXACTLY 3 bullet points.
Format as a JSON object:
{
  "summary": [
    "- Point 1",
    "- Point 2",
    "- Point 3"
  ]
}
"""

Common problem: local models sometimes will not output raw JSON (for example, they wrap it in a markdown fence). Work around this by adding "Output ONLY JSON, no markdown fence" to the prompt.
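Even with that instruction, it can help to parse replies defensively. The following is a minimal sketch, assuming the entity schema from the example above (person, location, organization); `parse_entity_reply` is a hypothetical helper and the sample reply is hand-written rather than real model output.

```python
import json
import re

REQUIRED_KEYS = {"person", "location", "organization"}
# Matches a reply wrapped in a markdown fence, with an optional language tag.
FENCED_REPLY = re.compile(r"^`{3}[A-Za-z]*[ \t]*\n(.*?)\n?`{3}$", re.DOTALL)

def parse_entity_reply(raw: str) -> dict:
    """Parse a model reply as JSON, tolerating a surrounding markdown fence."""
    text = raw.strip()
    fenced = FENCED_REPLY.match(text)
    if fenced:
        text = fenced.group(1)
    data = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data

# Hand-written stand-in for a fenced model reply, for illustration only.
fence = "`" * 3
sample_reply = (
    f'{fence}json\n'
    '{"person": ["Ada Lovelace"], "location": ["London"], "organization": []}\n'
    f'{fence}'
)
print(parse_entity_reply(sample_reply))
```

Raising on missing keys rather than filling defaults is a design choice here: it surfaces format drift early, so the prompt can be tightened instead of silently accepting partial output.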

How Does Role Assignment Improve Local Model Responses?

Assigning a specific role ("You are a Python expert with 10 years of experience") substantially improves domain-specific responses compared with generic prompts. This technique, called persona prompting, works by anchoring the model's response generation to a specific domain of expertise. Local models respond 15–25% better to role definitions than cloud models do, because they lack the strong RLHF alignment that makes generic prompts work. Examples:

  • "You are a Python expert" → better code explanations
  • "You are a medical researcher" → more detailed biomedical responses
  • "You are a skeptical analyst" → more critical thinking

When deploying across multiple use cases, combine role definition with fine-tuning for stronger domain alignment.

In everyday terms, persona prompting tells the model which "hat" to wear when answering. The Python-expert hat produces different (and better) code than the general-assistant hat.

Best practice: specificity matters. "You are an expert" is weak. "You are a Python backend expert with 10 years of experience, focused on async/await patterns" is strong.

How Do You Set System Prompts in Ollama, LM Studio, and llama.cpp?

A system prompt defines the model's role and constraints before the user message, and each tool (Ollama, LM Studio, llama.cpp) configures it differently.

bash
# Ollama (Modelfile)
FROM llama3.1:8b
SYSTEM """You are a Python expert with 10 years experience. Answer only Python questions. Provide code examples. Use type hints."""
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.1

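# Ollama (OpenAI-compatible API, called through the OpenAI Python SDK)
from openai import OpenAI

# Ollama ignores the API key, but the SDK requires a non-empty value
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
response = client.chat.completions.create(
  model="llama3.1:8b",
  messages=[
    {"role": "system", "content": "You are a Python expert..."},
    {"role": "user", "content": "Write a FastAPI endpoint"}
  ],
  temperature=0.7
)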

# LM Studio (GUI)
# Settings -> System Prompt field (paste the prompt)
# or use the API at localhost:1234 -- same format as Ollama

# llama.cpp (CLI; recent builds name the binary llama-cli, older builds used ./main)
./llama-cli -m llama-3.1-8b.gguf \
  --system-prompt "You are a Python expert..." \
  --temp 0.7 --top-p 0.9 --repeat-penalty 1.1 \
  -p "Write a FastAPI endpoint"
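For scripts that should not depend on the OpenAI SDK, the same system prompt can be sent to Ollama's native chat endpoint. This is a sketch under stated assumptions: it uses only the standard library, assumes an Ollama server on its default local port, and `build_chat_payload` and `send_chat` are hypothetical helper names. Only the payload builder runs here; `send_chat` needs a running server.

```python
import json
import urllib.request

# Default address of a local Ollama server (assumes an unchanged install).
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"

def build_chat_payload(system_prompt, user_message,
                       model="llama3.1:8b", temperature=0.7):
    """Build a non-streaming chat request with a system and a user message."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "stream": False,
        "options": {
            "temperature": temperature,
            "top_p": 0.9,
            "repeat_penalty": 1.1,
        },
    }

def send_chat(payload):
    """POST the payload to the Ollama server and return the reply text."""
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["message"]["content"]

payload = build_chat_payload(
    "You are a Python expert with 10 years of experience.",
    "Write a FastAPI endpoint",
)
print(json.dumps(payload, indent=2))
```

Sampling parameters go under `options` in the native API, whereas the OpenAI-compatible endpoint takes `temperature` as a top-level argument.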

How Do Temperature and Sampling Parameters Affect Output Quality?

Tuning temperature, top_p, and repeat_penalty affects local 7B output quality as much as prompt wording does, and local models need different defaults from cloud APIs.

Key insight for local models: Ollama's default temperature is 0.8, against 1.0 (with nucleus sampling) for the OpenAI API. Even so, lowering temperature to 0.3–0.5 substantially improves factual accuracy for local 7B models. For coding tasks, set temperature to 0.1–0.2 and repeat_penalty to 1.0 (code needs repetitive patterns such as imports and function calls).

Parameter | What it controls | Default (Ollama) | Recommended
temperature | Randomness | 0.8 | Factual: 0.3–0.5, creative: 0.7–0.9
top_p | Vocabulary diversity | 0.9 | Consistency: 0.8, diversity: 0.95
repeat_penalty | Repetition prevention | 1.1 | Conversation: 1.1–1.2, code: 1.0

Key point: temperature is a scaling factor on the logits (each logit is divided by it before the softmax). As it approaches 0.0, the model always picks the highest-probability token. Above 1.0, randomness increases. Local models saturate at temperature 1.5 and above.
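To make the effect of temperature concrete, the sketch below applies the standard temperature-scaled softmax, in which each logit is divided by the temperature before normalizing. The three logit values are made up for illustration, and `softmax_with_temperature` is a hypothetical helper, not a function from Ollama or llama.cpp.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing each by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; use argmax for greedy decoding")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]

# Made-up scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]

for temperature in (0.3, 0.8, 1.5):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

Running it shows the top token's share of the probability mass shrinking as temperature rises, which is why lower values favor the model's most likely (and usually most factual) continuation.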

Why Do Local Models Need More Few-Shot Examples Than Cloud APIs?

Giving a local model 3–5 examples (few-shot learning) improves output consistency by 15–25% over zero-shot, whereas cloud models need only 1–2 examples.

Local models benefit from more examples because they have fewer parameters and less diverse training data. Few-shot learning is an in-context learning technique that shows the model the expected input/output pattern before it tackles the actual task.

python
# Few-shot prompt
prompt = """
Classify sentiment. Examples:

"I love this product!" -> positive
"Worst experience ever" -> negative
"It's okay, nothing special" -> neutral

Now classify: "This is amazing!"
Answer: """

# The model picks up the format and style from the examples

Implementation tip: vary your examples (one easy, one medium, one hard). Diversity generalizes better than three similar examples and prevents overfitting to a specific pattern.
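Because every example consumes context, it can be useful to assemble few-shot prompts programmatically and cap the examples at a token budget. The sketch below is one way to do that. The four-characters-per-token estimate is a crude assumption rather than a real tokenizer, and `rough_token_count`, `select_examples`, and `build_few_shot_prompt` are hypothetical helpers.

```python
def rough_token_count(text):
    """Crude estimate of about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)

def select_examples(examples, budget_tokens):
    """Keep examples in order until the estimated budget would be exceeded."""
    selected, used = [], 0
    for text, label in examples:
        cost = rough_token_count(f'"{text}" -> {label}')
        if used + cost > budget_tokens:
            break
        selected.append((text, label))
        used += cost
    return selected

def build_few_shot_prompt(instruction, examples, query):
    """Lay out the instruction, labeled examples, and the item to classify."""
    lines = [f"{instruction} Examples:", ""]
    lines += [f'"{text}" -> {label}' for text, label in examples]
    lines += ["", f'Now classify: "{query}"', "Answer:"]
    return "\n".join(lines)

# Illustrative examples, ordered so varied cases come first.
EXAMPLES = [
    ("I love this product!", "positive"),
    ("Worst experience ever", "negative"),
    ("It's okay, nothing special", "neutral"),
    ("Arrived late, but support fixed it quickly", "positive"),
    ("Not sure how I feel about this yet", "neutral"),
]

chosen = select_examples(EXAMPLES, budget_tokens=40)
prompt = build_few_shot_prompt("Classify sentiment.", chosen, "This is amazing!")
print(prompt)
```

For production use, swapping the estimate for the model's actual tokenizer would give a tighter budget.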

Common Prompt Engineering Mistakes

  • Long-winded prompts with no structure. Rambling instructions confuse local models. Be concise and explicit.
  • Not using chain-of-thought. CoT improves accuracy by 10–20%. Always include it for reasoning tasks.
  • Assuming one prompt fits every case. Iterate and test. Small wording changes cause large output changes.
  • Ignoring output format. Without an explicit format specification, output is unpredictable.
  • Using vague role definitions. "You are an expert" is vague. "You are a Python expert with 10 years of experience" is better.

Did you know? The most effective prompts go through 3–5 iterations. Local model prompting is not "set it and forget it." Small improvements compound into substantial accuracy gains.

Regional Considerations for Prompt Engineering

EU (GDPR): When deploying prompt engineering for local models on EU infrastructure, make sure all data used for prompt iteration complies with the GDPR data minimization principle. Do not export user queries to external APIs for testing. Iterate locally.

Japan (APPI): Japanese companies using local LLMs on customer data should implement explicit audit logging for all prompts and responses. Prompt quality directly affects data security: a poorly designed prompt can expose sensitive information in its output.

China (Data Security Law 2021): Local LLM deployments in mainland China must keep all inference, prompting, and model tuning on-premises. Qwen and other domestic models are preferred for data residency compliance.

Frequently Asked Questions About Local LLM Prompting

Why do local LLMs need more explicit prompts than GPT-5.5?

Local 7B–13B models have fewer parameters than GPT-5.5 (an estimated 1.8T parameters) and less diverse training data. They are poor at inferring ambiguous intent. Explicit instructions (format, role, step-by-step reasoning) compensate for this gap. Chain-of-thought prompting improves local model accuracy on reasoning tasks by 10–20%.

How many few-shot examples should a local LLM prompt include?

For local 7B models, 3–5 examples is optimal. GPT-5.5 typically needs only 1–2. More examples improve consistency but consume context window tokens (4K–32K tokens depending on the model). For Llama 3.2 8B with a 4K context window, limit the prompt to 3 examples plus the task. For models with 32K or more of context, 5 is safe.

Does chain-of-thought prompting work on all local models?

Chain-of-thought works on all instruction-tuned models (Llama 3.x, Qwen 3, Mistral Small). Base (non-instruction-tuned) models do not reliably follow a "think step by step" instruction. For local models, CoT phrasing such as "Solve step by step:" or a "Reasoning:" prefix at the start of the expected output works best.

What is the most reliable output format for local LLMs?

JSON is the most reliable structured output format for local LLMs. Specify the exact JSON schema in the prompt: "Respond only with a JSON object with the keys name, score, reasoning." Markdown headers (##) are reliable for sections. Avoid requesting XML or custom formats; local models handle them inconsistently.

How do I keep a local LLM from going off-topic?

Add explicit constraints to the system or instruction prompt: "Answer only about [topic]. If asked about anything else, say: I can only help with [topic]." In Ollama, use the system prompt field. In llama.cpp, prepend it as a system message. This boundary-setting makes a much bigger difference on local 7B models than on cloud models with strong RLHF alignment.

What is the difference between zero-shot and few-shot prompting for local models?

Zero-shot gives the model no examples: "Classify this email as spam or not spam." Few-shot provides 2–5 labeled examples before the task. For local 7B models, few-shot consistently beats zero-shot by 15–25% in accuracy on classification and extraction tasks. Zero-shot works for generative tasks where format matters less (summarization, translation).

How do I test and iterate on prompts for local models?

Test on 5–10 varied examples. Change one variable at a time (role, format, or CoT instruction). Measure accuracy or consistency before and after each change. Use a simple test set: 2–3 easy examples and 2–3 hard ones. Track which prompt version works best. Iterate over 3–5 cycles of prompt variants. Document working prompts in a prompt library for reuse.
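The loop described above can be sketched as a small harness that scores prompt variants against a labeled test set. Everything here is illustrative: `fake_generate` is a deterministic stand-in for a real model call, the templates and test set are made up, and exact-match scoring is only one possible metric.

```python
def accuracy(template, test_set, generate):
    """Fraction of test items whose reply exactly matches the expected label."""
    correct = 0
    for text, expected in test_set:
        reply = generate(template.format(text=text))
        if reply.strip().lower() == expected:
            correct += 1
    return correct / len(test_set)

def compare_variants(variants, test_set, generate):
    """Score each named template and return (name, score) pairs, best first."""
    scores = {name: accuracy(tpl, test_set, generate) for name, tpl in variants.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

def fake_generate(prompt):
    """Stand-in for a model: replies cleanly only when asked for one word."""
    lowered = prompt.lower()
    label = "positive" if ("love" in lowered or "great" in lowered) else "negative"
    if "one word" in lowered:
        return label
    return f"I think the sentiment is {label}."

TEST_SET = [
    ("I love it", "positive"),
    ("great value", "positive"),
    ("Broke after a day", "negative"),
    ("Terrible support", "negative"),
]

VARIANTS = {
    "baseline": "Classify the sentiment: {text}",
    "with_format": "Classify the sentiment. Reply with one word: {text}",
}

ranked = compare_variants(VARIANTS, TEST_SET, fake_generate)
print(ranked)
```

To use it against a real model, `generate` would be replaced by a function that sends the prompt to the local server and returns the reply text; changing only one template at a time keeps the comparison meaningful.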

Should I use prompt engineering or fine-tuning for a specific task?

Try prompt engineering first (fast, free, and iterable). Consider fine-tuning if accuracy plateaus after 20 or more prompt variants. Fine-tuning requires 500+ task-specific examples and 1–4 hours of training time but yields a 10–20% accuracy gain. For general-purpose tasks, prompt engineering is enough. For domain-specific tasks (medical, legal, coding), fine-tuning provides lasting improvement.

How do system prompts differ from user instructions in local LLMs?

A system prompt defines the model's role and constraints before the user message and is part of the request structure (set through Ollama, LM Studio, or the API). User instructions are part of the conversation. The system prompt sets baseline behavior and is more reliable than embedding instructions in the user message. For local models, a well-written system prompt improves consistency by 15–25%, because the model prioritizes system-level constraints over user-level text.

Can I use the same prompt across different local models?

Partly. The basic CoT structure and role definitions transfer across models (Llama, Qwen, Mistral), but for best results you should tune the prompt for each model. Llama models respond to "Let's think step by step," whereas Qwen models prefer "First, ...". Test prompts on the model you actually deploy. Larger models (70B) are more forgiving of prompt variation than smaller ones (7B).


A Note on Third-Party Facts

This article references third-party AI models, benchmarks, prices, and licenses. The AI landscape changes rapidly. Benchmark scores, license terms, model names, and API prices can shift between the time of writing and the time you read this. Before making deployment or compliance decisions based on this article, verify current figures on each provider's official source: Hugging Face model cards for licenses and benchmarks, provider websites for API pricing, and EUR-Lex for current GDPR and EU AI Act text. This article reflects publicly available information as of May 2026.
