Key Takeaways
- Local 7B models need more explicit guidance than GPT-4o: use longer prompts and clearer instructions.
- Chain-of-thought ("Let's think step by step") improves reasoning accuracy by 10–20%.
- Always specify output format (JSON, Markdown, plain text). Unstructured outputs are unpredictable.
- Few-shot examples (3–5) work better than zero-shot for local models. More examples = better consistency.
- Role definition ("You are a Python expert") improves domain-specific responses.
How Are Local Models Different?
| Aspect | GPT-4o | Local 7B Model |
|---|---|---|
| Context window | 128K tokens | Typically 4K–32K tokens |
| Instruction following | Excellent | Needs explicit, structured instructions |
| Few-shot learning | 1–2 examples sufficient | 3–5 examples recommended |
| Reasoning complexity | Multi-step, implicit | Needs explicit step-by-step (CoT) prompting |
| Personality consistency | Highly consistent | Drifts; restate the role as needed |
Chain-of-Thought: Make Models Reason
Chain-of-thought (CoT) prompting asks the LLM to show its reasoning step-by-step before answering.
Without CoT: "What is 17 × 24?" → Model answers directly, often wrong.
With CoT: "Solve this step-by-step: 17 × 24" → Model shows: 17 × 20 = 340, 17 × 4 = 68, total = 408. More accurate.
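The CoT scaffold can be wrapped in a small helper so the same structure is reused across questions; a minimal sketch (the `with_cot` function is a hypothetical illustration, not part of any library), which also checks the decomposition above:

```python
def with_cot(question: str) -> str:
    """Wrap a question in a chain-of-thought scaffold (hypothetical helper)."""
    return (
        "Solve this step-by-step.\n"
        f"Question: {question}\n"
        "Thinking:\n"
        "Answer:"
    )

# The decomposition the model is expected to produce for 17 × 24:
assert 17 * 20 + 17 * 4 == 17 * 24 == 408

prompt = with_cot("What is 17 × 24?")
```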
# Prompt with CoT
prompt = """
You will answer a question by thinking step-by-step.
Let me think about this:
Question: Why do local LLMs require more explicit prompting than cloud APIs?
Thinking:
1. First, consider the differences in model size...
2. Then, think about training data and fine-tuning...
3. Finally, consider the architecture and inference optimization...
Answer:
"""
# This guides the model to reason through the problem
Specifying Structured Output Formats
Local models produce unpredictable outputs unless you specify format explicitly.
Example: "Extract entities from the text" might return narrative text instead of a list.
Better: "Extract entities as JSON with keys: person, location, organization".
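Asking for JSON is only half the job: the reply still needs validation, since local models sometimes drift from the requested format. A minimal sketch of defensive parsing (the `reply` string is illustrative, not real model output):

```python
import json

EXPECTED_KEYS = ("person", "location", "organization")

def parse_entities(raw: str) -> dict:
    """Parse the model's reply, falling back to empty lists
    if the output is not valid JSON (local models drift)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {k: [] for k in EXPECTED_KEYS}
    # Keep only the keys we asked for; ignore anything extra
    return {k: data.get(k, []) for k in EXPECTED_KEYS}

# Illustrative model reply (not real model output)
reply = '{"person": ["Ada Lovelace"], "location": ["London"], "organization": []}'
entities = parse_entities(reply)
```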
# Bad: ambiguous output
prompt = "Summarize this text"
# Good: explicit format
prompt = """
Summarize the text in EXACTLY 3 bullet points.
Format as a JSON object:
{
  "summary": [
    "Point 1",
    "Point 2",
    "Point 3"
  ]
}
"""Role Definition and Persona Prompting
Telling the model to adopt a role improves domain-specific responses.
Examples:
- "You are a Python expert" → better code explanations
- "You are a medical researcher" → more detailed biomedical responses
- "You are a skeptical analyst" → more critical thinking
Few-Shot Learning for Consistency
Provide examples (few-shot) to guide the model's output style and format.
Local models benefit from 3–5 examples. Cloud models work with 1–2.
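When the example set changes per task, the few-shot block can be assembled programmatically rather than hand-written; a minimal sketch (the `few_shot_prompt` helper is illustrative):

```python
def few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Build a sentiment-classification prompt from labeled examples."""
    lines = ["Classify sentiment. Examples:"]
    for text, label in examples:
        lines.append(f'"{text}" → {label}')
    lines.append(f'Now classify: "{query}"')
    lines.append("Answer: ")
    return "\n".join(lines)

examples = [
    ("I love this product!", "positive"),
    ("Worst experience ever", "negative"),
    ("It's okay, nothing special", "neutral"),
]
prompt = few_shot_prompt(examples, "This is amazing!")
```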
# Few-shot prompt
prompt = """
Classify sentiment. Examples:
"I love this product!" → positive
"Worst experience ever" → negative
"It's okay, nothing special" → neutral
Now classify: "This is amazing!"
Answer: """
# Model learns format and style from examples
Common Prompt Engineering Mistakes
- Verbose prompts without structure. Rambling instructions confuse local models. Be concise and explicit.
- Not using chain-of-thought. CoT improves accuracy 10–20%. Always include for reasoning tasks.
- Assuming one prompt works for all. Iterate and test. Small wording changes cause large output changes.
- Ignoring output format. Without explicit format specification, outputs are unpredictable.
- Using vague role definitions. "You are an expert" is vague. "You are a Python expert with 10 years experience" is better.
Sources
- Chain-of-Thought Paper (Wei et al.) — arxiv.org/abs/2201.11903
- Prompt Engineering Guide — github.com/dair-ai/Prompt-Engineering-Guide