Key points
- LLMs cannot access the internet, reason over real-time information, do complex math reliably, or make truly independent decisions
- Hallucination: LLMs confidently invent facts, especially on niche or recent topics. Combine with RAG or fact-checking for high-stakes use.
- LLMs struggle with handwriting, abstract images, small text, and visual reasoning (vision-capable models such as GPT-4o and Claude 4.6 Sonnet fare better, but gaps remain)
- Math weakness: LLMs tokenize numbers as text, so they can follow step-by-step instructions but are unreliable on novel calculations. Delegate math to code execution instead.
- No novel reasoning: when a task requires reasoning patterns absent from the training data, LLMs often fail or produce incoherent output
- No true judgment: LLMs generate plausible responses based on patterns, not reasoning or responsibility. Require human review for high-stakes decisions.
- Cost/latency at scale: running LLMs on large datasets or in real-time systems becomes expensive and slow
- Workarounds: combine LLMs with APIs (real-time data), code execution (math), RAG (knowledge grounding), human review (decisions), and vision models (images)
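The "code execution for math" workaround can be sketched as follows. Rather than asking the model to compute a result token by token, have it emit a plain arithmetic expression and evaluate that expression deterministically in code. The `safe_eval` helper below is a hypothetical illustration, not a standard API; it walks Python's `ast` with a small operator whitelist so arbitrary model output cannot run as code.

```python
import ast
import operator

# Whitelist of arithmetic operators the evaluator will accept.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_eval(expr: str) -> float:
    """Deterministically evaluate an arithmetic expression (e.g. one
    emitted by an LLM) without exposing full Python eval()."""
    def walk(node: ast.AST):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"disallowed expression node: {type(node).__name__}")
    return walk(ast.parse(expr, mode="eval"))

# Exact answer, unlike token-by-token LLM arithmetic.
print(safe_eval("1234 * 5678 + 9"))  # → 7006661
```

The same pattern generalizes: the model plans, the runtime computes, and only verified numbers flow back into the response.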
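The "RAG for knowledge grounding" workaround looks like this in miniature. This is a toy sketch with a hypothetical two-document corpus and naive keyword-overlap scoring standing in for a real embedding-based retriever; the point is the shape of the pipeline, not the ranking method: retrieve relevant text first, then constrain the prompt to that text to reduce hallucination.

```python
def retrieve(query: str, corpus: list[str]) -> str:
    """Return the corpus document sharing the most words with the query.
    A stand-in for a real vector-similarity search."""
    q = set(query.lower().split())
    return max(corpus, key=lambda doc: len(q & set(doc.lower().split())))

# Hypothetical knowledge base the model was never trained on.
corpus = [
    "The v2 API requires an OAuth token in the Authorization header.",
    "Rate limits reset every 60 seconds for free-tier accounts.",
]

question = "How do rate limits work?"
context = retrieve(question, corpus)

# Ground the model: it may only answer from the retrieved snippet.
prompt = f"Answer using ONLY this context:\n{context}\n\nQ: {question}"
print(prompt)
```

In production, `retrieve` would query a vector store, but the grounding step, injecting retrieved text into the prompt, is the same.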