Item: Kimi K2.7-Code
Author: Lars Iverson

Moonshot AI pushed Kimi K2.7-Code to Hugging Face on June 12, 2026, a coding-focused post-train on the K2.6 mixture-of-experts base that ships under a Modified MIT license and reports a 21.8-point gain on Moonshot's own Kimi Code Bench v2. It's the fifth major Kimi checkpoint in under a year, and the first where the company appears to be conceding the frontier comparison and competing on a narrower axis: cost, openness, and tool use.

The architecture is unchanged from K2.6: a 1-trillion-parameter MoE with 32B active, 384 experts routing 8 per token, and a 256K context window. Full-precision weights run roughly 600GB on disk per Codersera's spec sheet, with community quantizations of the K2.6-class checkpoint landing near 240GB. K2.7-Code ships with native INT4 quantization and serving recipes for vLLM, SGLang, and KTransformers. There's no official GGUF or llama.cpp build at launch.
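As a rough sanity check on those figures, the sparsity ratios and the storage cost per parameter can be worked out directly. The sketch below uses only the numbers quoted above; it assumes "600GB" means decimal gigabytes, and the variable names are illustrative rather than taken from any Moonshot tooling.

```python
# Figures quoted in the article for the K2.6 / K2.7-Code architecture.
TOTAL_PARAMS = 1_000_000_000_000   # 1 trillion total parameters
ACTIVE_PARAMS = 32_000_000_000     # 32B active per token
NUM_EXPERTS = 384                  # experts in the MoE
EXPERTS_PER_TOKEN = 8              # experts routed per token
DISK_BYTES = 600 * 10**9           # "roughly 600GB" (assumption: decimal GB)

# Share of all parameters that participate in a single token's forward pass.
active_param_share = ACTIVE_PARAMS / TOTAL_PARAMS

# Share of the experts that the router selects for a single token.
routed_expert_share = EXPERTS_PER_TOKEN / NUM_EXPERTS

# Average storage per parameter implied by the quoted on-disk size.
bits_per_param = DISK_BYTES * 8 / TOTAL_PARAMS

print(f"active parameter share: {active_param_share:.1%}")
print(f"routed expert share:    {routed_expert_share:.1%}")
print(f"implied bits per param: {bits_per_param:.1f}")
```

On these inputs, about 3% of the parameters are active per token and the quoted disk size works out to under 5 bits per parameter on average. That is a back-of-envelope reading of the published figures, not a statement about how the checkpoint is actually stored.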

The pricing is the most legible signal of intent. The hosted API costs $0.95 per million cache-miss input tokens and $4.00 per million output tokens. The model card also reports approximately 30% lower thinking-token usage versus K2.6, with preserve_thinking hard-wired to True. There's no instant mode. Every call reasons, and Moonshot has decided that's not a choice the user gets to make.
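To see what those prices mean per call, here is a minimal cost sketch. It assumes thinking tokens are billed at the output rate, which the article does not state, and it ignores cache-hit pricing. The workload numbers and function names are hypothetical, chosen only to illustrate the arithmetic.

```python
# Prices quoted in the article, in USD per million tokens.
INPUT_USD_PER_M = 0.95    # cache-miss input tokens
OUTPUT_USD_PER_M = 4.00   # output tokens

def call_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one call at the quoted cache-miss input and output prices."""
    return (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000

def reduced_thinking_tokens(thinking_tokens: int, reduction_pct: int = 30) -> int:
    """Thinking tokens after a percentage reduction (30% per the model card)."""
    return thinking_tokens * (100 - reduction_pct) // 100

# Hypothetical workload: 20,000 prompt tokens, 1,000 answer tokens, and
# 3,000 thinking tokens before the reported reduction.
# ASSUMPTION: thinking tokens are billed as output tokens.
prompt_tokens, answer_tokens, thinking_before = 20_000, 1_000, 3_000
thinking_after = reduced_thinking_tokens(thinking_before)

cost_before = call_cost_usd(prompt_tokens, answer_tokens + thinking_before)
cost_after = call_cost_usd(prompt_tokens, answer_tokens + thinking_after)

print(f"thinking tokens: {thinking_before} -> {thinking_after}")
print(f"cost per call:   ${cost_before:.4f} -> ${cost_after:.4f}")
```

Because there is no instant mode, the thinking-token term is present on every call, so under this billing assumption the reported 30% reduction feeds straight into the output side of the bill.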

On Moonshot's first-party comparison table covering six benchmarks, K2.7-Code beats GPT-5.5 on zero of them and beats Claude Opus 4.8 on exactly one: MCPMark Verified, the Model Context Protocol tool-invocation benchmark, where it posts 81.1% against Opus's 76.4%. The remaining headline numbers, +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite, are all gains over K2.6 on Moonshot's own evaluations. Independent third-party numbers on SWE-Bench Verified, SWE-Bench Pro, Terminal-Bench, LiveCodeBench, GPQA Diamond, and MMLU-Pro don't exist on release day, per Codersera. K2.6 held open-source SOTA at 58.6 on SWE-Bench Pro in April; whether K2.7-Code extends that is, for now, a claim awaiting a benchmark.

The release cadence tells its own story. K2 base in July 2025, K2 Thinking in November, K2.5 in January 2026, K2.6 in April, K2.7-Code in June. Five checkpoints, eleven months, a single architectural family iterated in public. Moonshot isn't trying to win the headline benchmark against GPT-5.5. It's trying to be the open-weight default for teams writing code against the MCP tool layer, at a fraction of the inference bill, and shipping fast enough that the closed labs can't price-anchor against a stationary target.

Kimi K2.7-Code ships open weights, cuts thinking tokens 30%, and edges Opus 4.8 on MCPMark
