Upgrade to Kimi Membership to unlock faster models, higher concurrency, and more powerful research preview capabilities.
* Estimated by assuming full monthly quota on a single feature, based on typical token usage. For reference only.
| Feature |
Adagio
$0 / mo
|
Moderato
$19 / mo
|
Allegretto
$39 / mo
|
Allegro
$99 / mo
|
Vivace
$199 / mo
|
|---|---|---|---|---|---|
| Agent | |||||
| Agent Usage | 6 | 60 | 150 | 360 | 720 |
| Concurrent Tasks | 1 | 2 tasks | 2 tasks | 4 tasks | 4 tasks |
| Priority Queue | — | 4× speed | 4× speed | 4× speed | 4× speed |
| Agent Swarm | |||||
| Agent Swarm Access | — | — | 50 uses / mo | 120 uses / mo | 240 uses / mo |
| Concurrent Subagents | — | — | 4 subagents | 4 subagents | 8 subagents |
| Kimi Code | |||||
| Kimi Code Credits | — | 1× credits | 5× credits | 15× credits | 30× credits |
| Kimi Claw | |||||
| Kimi Claw Cloud | — | — | ✓ | ✓ | ✓ |
| Kimi Claw Android | — | — | ✓ | ✓ | ✓ |
| Kimi Claw PC (Mac ARM) | — | — | ✓ | ✓ | ✓ |
| Group Chat with Claw | — | — | 10 groups | 10 groups | 10 groups |
| Professional Data | |||||
| Pro Data Requests | 200 | 2,000 | 5,000 | 12,000 | 24,000 |
| Tools | |||||
| Word / Excel / Slides | ✓ | ✓ | ✓ | ✓ | ✓ |
| Deep Research | — | ✓ | ✓ | ✓ | ✓ |
| Websites Deploy | — | ✓ | ✓ | ✓ | ✓ |
| Website with Database | — | ✓ | ✓ | ✓ | ✓ |
| Slides Visual Mode | — | ✓ | ✓ | ✓ | ✓ |
| Research Preview | — | ✓ | ✓ | ✓ | ✓ |
| Tool | Typical Paid Plan | Best For | What Stands Out |
|---|---|---|---|
| Kimi AI (Moderato) | $19/mo | Design→code + agentic tasks | Monthly quotas for Deep Research + OK Computer + Kimi Code. API fees not included. |
| ChatGPT Plus | $20/mo | All-rounder (writing, coding, images, tools) | Strong general assistant + broad feature set in one place |
| Claude Pro | $20/mo or $17 annual | Writing + coding + long context work | Great for documents, structured writing, and project-style workflows |
| Google Gemini AI Plus | $7.99/mo | Cheaper upgrade in Google ecosystem | Often bundled with storage + Gemini features in Google apps |
| Google Gemini AI Pro | $19.99/mo | Higher limits + creator tools | More access to advanced Gemini + credits/tools depending on region |
| Perplexity Pro | $20/mo or $200/yr | Research with citations / browsing | Best "answer + sources" experience for web research |
| Microsoft 365 + Copilot | $19.99/mo | Word/Excel/PowerPoint productivity | Copilot inside Microsoft apps + Office suite bundle |
| Poe (multi-model) | from $4.99/mo | Trying many models cheaply | One subscription to access multiple model providers via points |
Kimi's Moderato tier is priced near common premium plans, but it's purpose-built around work quotas — Deep Research, OK Computer, Kimi Code. API fees are separate. If your workflow is design→code and agent tasks, Kimi feels more specialized than general chat plans.
ChatGPT Plus ($20/mo) is typically the easiest "one subscription that does a bit of everything" choice. If you do mixed tasks — writing, coding, images, file work — it's usually the most balanced.
Claude Pro ($20/mo) is often chosen when your workflow is heavy on documents, writing quality, and structured outputs.
Gemini/Google AI plans are great value if you already use Google storage and apps. There's a cheaper AI Plus and a higher AI Pro tier depending on your region.
Perplexity Pro ($20/mo) is the best deal when you care about citations, browsing, and fast research summaries.
Poe is the budget option if your goal is to try lots of models without paying each company $20/month separately.
Kimi K2.6 is Moonshot AI's most capable open-source model, built for long-horizon coding, frontend design generation, 300-agent swarms, and native multimodal workflows. Unlike a standalone product, K2.6 is accessed through your existing Kimi membership or directly via token-based API billing — making it available at every price point from free to enterprise.
Starts at $19/month (Moderato) and gives you K2.6 inside the Kimi chat interface with agent credits, Deep Research, Kimi Code access, and Slides and Websites tools included. Higher tiers — Allegretto ($39), Allegro ($99), and Vivace ($199) — unlock Agent Swarm with up to 300 parallel subagents, more Kimi Code credits, Kimi Claw cloud deployment, and significantly larger Professional Data quotas.
Token-based and billed separately from membership. Reference pricing sits around $0.55 per million input tokens and $2.65 per million output tokens, making K2.6 one of the most cost-competitive frontier models for developers building at scale. The API is fully OpenAI-compatible — swap in model: "kimi-k2.6" and you're running the latest model in any existing workflow.
Available on HuggingFace under a modified MIT license, free to download and self-host with frameworks like vLLM, SGLang, or KTransformers — ideal for privacy-focused teams and AI researchers who need full infrastructure control.
Whether you're a daily user who wants smarter agent workflows, a developer building a product on top of K2.6's coding and design capabilities, or an enterprise team looking to self-host a trillion-parameter model, there's a pricing path designed for your workload.
Kimi K2.6 is the most capable model in the Kimi lineup — built for long-horizon coding, coding-driven frontend design, 300-agent swarms, and native multimodal workflows. Access it through any Kimi membership plan above Adagio, or via the API at token-based rates.
model: "kimi-k2.6" in your API calls.| Access Method | Price | What's Included | Best For |
|---|---|---|---|
| App — Moderato | $19/mo | K2.6 in chat + agent tasks, 60 agent credits, Deep Research, Kimi Code 1× credits, Slides, Websites Deploy | Daily users, creators, researchers |
| App — Allegretto | $39/mo | All Moderato + Agent Swarm (50 uses, 4 subagents), Kimi Code 5× credits, Kimi Claw, 5,000 Pro Data req | Pro users, teams, agentic workflows |
| App — Allegro | $99/mo | All Allegretto + Agent Swarm (120 uses, 4 subagents), Kimi Code 15× credits, 12,000 Pro Data req | Power users, heavy coders, automation |
| App — Vivace | $199/mo | All Allegro + Agent Swarm (240 uses, 8 subagents), Kimi Code 30× credits, 24,000 Pro Data req | Agencies, enterprises, bulk automation |
| API — Input Tokens | ~$0.55/1M | Prompt, system instructions, conversation history, retrieved docs. Cheaper — more cacheable. | Developers, builders, automation pipelines |
| API — Output Tokens | ~$2.65/1M | Generated responses. Higher-priced — compute-intensive, harder to cache. Long outputs drive spend. | Developers, builders, automation pipelines |
| Open Weights | Free (MIT) | Download weights from HuggingFace. Self-host with vLLM / SGLang / KTransformers. Hardware costs apply. | AI researchers, privacy-first enterprises |
* API token prices are market reference snapshots (OpenRouter / ArtificialAnalysis). Always verify on your actual billing page. Prices change frequently.
"Kimi pricing" can mean different things depending on how you use it: the consumer app (monthly membership), developer API access (token-based billing), or open-weight model usage via third-party providers. Membership and API costs are never bundled — they're always separate.
Rough token rules of thumb: 1,000 tokens ≈ 700–800 English words. A normal chat answer is 200–800 output tokens. A long structured response can be 1,500–3,000+. Long conversation history makes input tokens dominate.
* Uses midpoint pricing ($0.55 input / $2.65 output) as a starting estimate. Replace with your actual provider's rates. API costs can spike with long outputs — output tokens typically drive the bill.
Input: 40M tokens ≈ $22. Output: 12.5M tokens ≈ $33. Total: ~$55/mo at reference rates.
Input: 5M tokens ≈ $2.75. Output: 4M tokens ≈ $10.60. Total: ~$14/mo at reference rates.
Input: 40M tokens ≈ $22. Output: 12M tokens ≈ $31.80. Total: ~$54/mo at reference rates.
Many power users do both: membership for personal productivity and deep research inside the Kimi app, and API for their product, internal tools, or high-volume automation. Kimi's own rules page explicitly states API usage fees are not included with membership — reinforcing this as two separate billing tracks.
Keeping a huge conversation history attached to every request causes input tokens to explode. Each message re-sends the entire prior conversation.
Fix: summarize older context externally, retrieve only what you need per requestLong outputs are expensive because output tokens are priced higher and can't be cached. A 3,000-word answer can cost 4-6× more than a 500-word response.
Fix: generate in parts, request structured outlines first, expand only sections you'll publishDeep research tasks can call tools repeatedly — each iteration adds tokens and potentially additional tool-use charges depending on the platform.
Fix: use membership's capped deep research quota for this use case; avoid unbounded agent loops in APIIf your prompt is unclear, you'll regenerate multiple times — every retry is billed. Common with one-liner prompts without format or length constraints.
Fix: tighter prompt templates with clear tone, length, sections, and examples of "good" outputStart free, upgrade when you need more power. No commitment required.