Key Takeaways
- Best overall: Mistral Small 3.1 24B (most concise, tone-appropriate). Best multilingual: Qwen2.5 7B (French/German/Spanish/Japanese). Best for tone adaptation: Llama 3.1 8B.
- 70B models are too verbose for short-form writing. For long documents over 2 pages, Llama 3.3 70B with 128K context handles multi-section proposals reliably.
- Mistral Small 3.1 and Llama 3.1 8B are ideal for email, proposals, and memos.
- Email drafting: Mistral Small 3.1. Business proposal: Llama 3.1 8B with tone examples.
- Brand voice transfer: Provide 2-3 example emails; model learns tone and word choice.
- Edit mode > generation: Use model to refine existing draft (better control than full generation).
- Speed: Mistral Small 3.1 generates 200-word email in 8-15 sec. Llama 3.1 8B in 5-10 sec.
- Cost: Zero (open source) vs. $30/mo (ChatGPT Plus) or $200/mo (enterprise).
Which Models Excel at Business Tone?
Business writing rewards clarity and concision. Smaller models are better.
- Mistral Small 3.1 24B: Most concise output. Produces clean, professional short-form content (emails, Slack messages, executive memos). Best for tone control.
- Llama 3.1 8B: Balanced. Good for medium-length content (proposals, memos). Adapts well to brand voice examples.
- Qwen2.5 7B: Excellent for non-English business writing. Native tokenization for French, German, Spanish, Japanese, and Chinese. Best multilingual choice.
- For short-form writing (emails, memos), 7B-24B models produce cleaner output than 70B. For long-form content (proposals, reports over 2 pages), Llama 3.3 70B with 128K context handles multi-section documents reliably.
Writing Tasks & Model Recommendations
| Task | Best Model | Prompt Strategy | Output Quality |
|---|---|---|---|
| Task | Best Model | Prompt Strategy | Output Quality |
| Email drafting | Mistral Small 3.1 24B | "Write in active voice, max 150 words, no jargon" | Excellent -- concise, professional |
| Business proposal (1-3 pages) | Llama 3.1 8B | Provide 2-3 proposal examples as style reference | Good -- adapts well to tone examples |
| Executive memo | Mistral Small 3.1 24B | "Format: Problem / Recommendation / Next Steps" | Excellent -- structured output |
| Slack/internal message | Qwen2.5 7B | "Casual but professional, 2-3 sentences max" | Good -- low latency for real-time |
| Non-English business email | Qwen2.5 7B | "[Language] business email, formal register" | Excellent -- native tokenization |
| Contract summary | Llama 3.3 70B | "Summarize key obligations and risk points" | Best -- long context handles full docs |
| Refining existing draft | Any 7B model | "Edit for clarity, remove buzzwords, active voice" | Excellent -- edit mode best use case |
Prompt Engineering for Brand Voice
Business writing requires consistency. Teach the model your voice.
- 1Gather examples: 3-5 emails or memos in your brand voice. The more specific, the better -- use real sent emails, not idealized ones.
- 2Create a prompt template: "You write like this: [EXAMPLES]. Now draft [TASK] in this voice."
- 3Specify constraints: "Keep to 150 words.", "Use active voice.", "No jargon or buzzwords."
- 4Iterate on outputs: If the first draft is too formal, refine: "Use simpler language, remove buzzwords, write like you are texting a colleague."
- 5Store templates: Save prompts per writing type (sales, support, internal). Reuse for consistency.
Common Business Writing Mistakes
- Using 70B models for short-form writing. They produce verbose, over-explained output. For emails and memos, Mistral Small 3.1 24B or Llama 3.1 8B is faster and more concise.
- No examples provided. Model guesses your voice. Always give 2-3 real sent emails or memos in your brand voice.
- Trusting first draft. Business writing requires 1-2 edit cycles. Use edit prompts, not generation-only workflows.
- Not setting context length for long documents: Ollama defaults to 2048 tokens. A 2-page business proposal is approximately 1,500-2,000 words -- near or over this limit. Set `PARAMETER num_ctx 8192` minimum in your Modelfile for business writing tasks. For contract review or multi-page reports, use 32K context.
- Using the same model for writing and editing: The best workflow is two-stage: generate a rough draft with any 7B model (fast), then use Mistral Small 3.1 24B in edit mode to refine tone, remove jargon, and tighten structure. Using a 70B model for both tasks is slower and produces less concise output than this two-model approach.
Setup: Local Writing Assistant
- 1Start Ollama with Mistral Small 3.1: `ollama run mistral-small3.1`.
- 2Install VS Code extension "Continue" or the browser extension for web apps.
- 3Create a custom system prompt with your brand voice examples.
- 4Assign a hotkey (e.g., Ctrl+K) to trigger completion.
- 5Draft email → highlight → Ctrl+K → "Refine this email for [tone]" → copy result.
Local LLMs for Business Writing: Regional Context
EU / GDPR
For EU business professionals drafting emails or documents that reference clients, employees, or business partners, running a local writing assistant means no personal data -- names, contact details, deal terms -- is transmitted to cloud AI services. GDPR Article 6 requires a lawful basis for processing personal data; using a cloud AI API for business correspondence that includes client names and company details creates a data processing relationship requiring a DPA under Article 28.
Local inference eliminates this entirely. Mistral Small 3.1 24B (Mistral AI, France, Apache 2.0) is the recommended EU model -- EU origin, clean licence, and strong instruction-following for formal German, French, and English business writing. For German Geschäftskorrespondenz specifically, Mistral Small 3.1 produces output that follows DIN 5008 formatting conventions better than US-trained models at equivalent size.
Japan (METI)
Japanese business writing has strict formality registers (keigo levels: teineigo, sonkeigo, kenjōgo). Standard LLMs default to teineigo (polite) but cannot reliably produce sonkeigo (respectful) or kenjōgo (humble) without explicit prompt instructions. For Japanese business correspondence: use Qwen2.5 7B with explicit keigo instructions: "メールは丁寧な敬語(尊敬語と謙譲語)を使用してください". Qwen2.5's Japanese tokenizer handles kanji/kana business vocabulary noticeably better than Llama at the same size tier.
Germany (specific)
German business writing follows formal conventions: Sie-form (formal you), full company names, structured paragraph format. Add these to your system prompt: "Schreiben Sie auf Deutsch mit Sie-Anrede, sachlichem Ton, ohne Anglizismen." Mistral Small 3.1 produces the strongest formal German business output of any locally-runnable model as of April 2026 -- its training data includes significant German business corpus content from its EU-based development.
FAQ
Why is Mistral Small 3.1 better than Llama 3.1 for email?
Mistral Small 3.1 is more concise. Llama 3.1 is more adaptable. For pure speed/brevity: Mistral Small 3.1. For tone matching: Llama 3.1.
Can I use a 13B model for business writing?
Yes, but unnecessary. 7B is faster and equally good. 13B is slightly better at long proposals (>2 pages).
Should I use generation or editing mode?
Editing mode (refine existing draft) is safer. Generation mode is faster but requires more prompting.
How do I avoid sounding like ChatGPT?
Use small models (7B-24B), provide brand examples, request active voice + short sentences, no filler words.
Can I use local LLMs for confidential emails?
Yes. 100% private. No data leaves your machine. This is the primary advantage over cloud APIs.
What if the output is too formal?
Refine prompt: "Remove buzzwords. Use everyday language. Write like you're texting a colleague."
What is the best local LLM for non-English business writing?
Qwen2.5 7B supports 29 languages natively including French, German, Spanish, Japanese, Chinese, Korean, and Arabic. It processes non-English business text more token-efficiently than Llama or Mistral. For formal European business languages (German, French, Spanish), Mistral Small 3.1 24B is competitive due to its EU training data. Run: `ollama run qwen2.5:7b` for Asian and Middle Eastern languages; `ollama run mistral-small3.1` for European formal writing.
How do I use a local LLM to match my company's brand voice?
Provide 3-5 examples of existing company communications in your system prompt: "You write in this style: [paste examples]. Maintain this tone and vocabulary in all responses." The model learns vocabulary patterns, sentence length preferences, and formality level from examples. Update the examples every 6 months as your brand voice evolves.
Can local LLMs write in German Sie-form?
Yes with explicit instruction. Add to your system prompt: "Schreiben Sie auf Deutsch. Verwenden Sie ausschließlich die Sie-Form. Sachlicher, professioneller Ton ohne Anglizismen." Mistral Small 3.1 and Qwen2.5 7B both follow this instruction reliably. Without the explicit Sie-form instruction, models default to informal du-form in German.
Which local model is best for editing existing text vs generating from scratch?
For editing: any 7B model (Qwen2.5 7B, Llama 3.1 8B) works well -- editing is less demanding than generation. For generation from scratch on complex documents (proposals, reports): Mistral Small 3.1 24B produces more structured output. The two-stage approach works best: generate rough draft with a 7B model (fast), refine with Mistral Small 3.1 in edit mode.
Sources
- Mistral AI. (2024). "Mistral Small 3.1 Release." https://mistral.ai/news/mistral-small-3-1/ -- Model specifications and instruction-following benchmarks for Mistral Small 3.1 24B.
- Alibaba Qwen Team. (2025). "Qwen2.5 Technical Report." https://arxiv.org/abs/2412.15115 -- Multilingual capability data including Japanese, German, French, and Chinese business writing support.
- Meta AI. (2024). "Llama 3.1 Model Card." https://llama.meta.com/ -- Tone adaptation and instruction-following evaluation for Llama 3.1 8B.