<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Coding-Agent on Svtter's Blog</title><link>https://svtter.cn/en/tags/coding-agent/</link><description>Recent content in Coding-Agent on Svtter's Blog</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Wed, 27 May 2026 10:30:00 +0800</lastBuildDate><atom:link href="https://svtter.cn/en/tags/coding-agent/index.xml" rel="self" type="application/rss+xml"/><item><title>How kimi-code Handles kimi-k2.6: A Comparison with OpenCode</title><link>https://svtter.cn/en/p/how-kimi-code-handles-kimi-k2.6-a-comparison-with-opencode/</link><pubDate>Wed, 27 May 2026 10:30:00 +0800</pubDate><guid>https://svtter.cn/en/p/how-kimi-code-handles-kimi-k2.6-a-comparison-with-opencode/</guid><description>&lt;img src="https://svtter.cn/p/kimi-code-%E5%AF%B9-kimi-k2.6-%E7%9A%84%E4%B8%93%E7%94%A8%E5%A4%84%E7%90%86%E4%B8%8E-opencode-%E7%9A%84%E5%AF%B9%E6%AF%94/featured-image.png" alt="Featured image of post How kimi-code Handles kimi-k2.6: A Comparison with OpenCode" /&gt;&lt;p&gt;Recently, kimi-code migrated from Python to TypeScript. Here&amp;rsquo;s a quick analysis.&lt;/p&gt;
&lt;p&gt;Based on my review of the &lt;strong&gt;kimi-code&lt;/strong&gt; source code (particularly &lt;code&gt;packages/kosong/src/providers/kimi.ts&lt;/code&gt;, &lt;code&gt;kimi-schema.ts&lt;/code&gt;, &lt;code&gt;kimi-files.ts&lt;/code&gt;, etc.) and relevant OpenCode compatibility issues, here are the kimi-k2.6-specific optimizations in kimi-code and how they differ from OpenCode.&lt;/p&gt;
&lt;h2 id="1-native-kimi-provider-not-a-generic-openai-compatible-layer"&gt;1. Native Kimi Provider (Not a Generic OpenAI-compatible Layer)
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;kimi-code&lt;/strong&gt; does not treat Kimi as &amp;ldquo;just another OpenAI-compatible endpoint.&amp;rdquo; Instead, it implements a dedicated &lt;code&gt;kimi&lt;/code&gt; provider type:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;kimi-code&lt;/th&gt;
&lt;th&gt;OpenCode&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Provider Type&lt;/td&gt;
&lt;td&gt;Dedicated &lt;code&gt;'kimi'&lt;/code&gt; type with independent adapter&lt;/td&gt;
&lt;td&gt;Accessed via generic OpenAI/Anthropic bridge&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Proprietary Fields&lt;/td&gt;
&lt;td&gt;Native handling of &lt;code&gt;reasoning_content&lt;/code&gt;, &lt;code&gt;thinking&lt;/code&gt;, &lt;code&gt;generationKwargs&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;reasoning_content&lt;/code&gt; often lost in the bridge layer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auth Headers&lt;/td&gt;
&lt;td&gt;Supports &lt;code&gt;kimiRequestHeaders&lt;/code&gt;, &lt;code&gt;X-Msh-Tool-Call-Id&lt;/code&gt;, and other Moonshot-specific headers&lt;/td&gt;
&lt;td&gt;Generic header forwarding&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id="2-full-lifecycle-handling-of-reasoning_content"&gt;2. Full Lifecycle Handling of &lt;code&gt;reasoning_content&lt;/code&gt;
&lt;/h2&gt;&lt;p&gt;kimi-k2.6 has thinking enabled by default and &lt;strong&gt;requires &lt;code&gt;reasoning_content&lt;/code&gt; to be preserved across multi-turn conversation history&lt;/strong&gt;. Otherwise, tool calls will result in a 400 error.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;How kimi-code handles it:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;convertMessage&lt;/code&gt;&lt;/strong&gt;: Extracts internal &lt;code&gt;think&lt;/code&gt; content parts and serializes them into the &lt;code&gt;reasoning_content&lt;/code&gt; field, ensuring thinking content is never lost in message history&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Streaming Parser&lt;/strong&gt;: Explicitly extracts &lt;code&gt;delta.reasoning_content&lt;/code&gt; / &lt;code&gt;message.reasoning_content&lt;/code&gt; in both &lt;code&gt;_convertStreamResponse&lt;/code&gt; and &lt;code&gt;_convertNonStreamResponse&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;TUI Rendering&lt;/strong&gt;: A dedicated &lt;code&gt;ThinkingComponent&lt;/code&gt; renders thinking content in real time, with expand/collapse support and a spinner animation&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;OpenCode&amp;rsquo;s Problem:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The OpenCode Go bridge &lt;strong&gt;drops &lt;code&gt;reasoning_content&lt;/code&gt;&lt;/strong&gt; on the second turn, causing the Moonshot API to return:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-gdscript3" data-lang="gdscript3"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;thinking&lt;/span&gt; &lt;span class="n"&gt;is&lt;/span&gt; &lt;span class="n"&gt;enabled&lt;/span&gt; &lt;span class="n"&gt;but&lt;/span&gt; &lt;span class="n"&gt;reasoning_content&lt;/span&gt; &lt;span class="n"&gt;is&lt;/span&gt; &lt;span class="n"&gt;missing&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;assistant&lt;/span&gt; &lt;span class="k"&gt;tool&lt;/span&gt; &lt;span class="n"&gt;call&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;h2 id="3-json-schema-normalization-kimi-schemats"&gt;3. JSON Schema Normalization (&lt;code&gt;kimi-schema.ts&lt;/code&gt;)
&lt;/h2&gt;&lt;p&gt;Moonshot&amp;rsquo;s tool parameter validator has strict and unique requirements for JSON Schema. This is one of the primary sources of incompatibility between OpenCode and kimi-k2.6.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What kimi-code&amp;rsquo;s &lt;code&gt;normalizeKimiToolSchema&lt;/code&gt; does:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Dereferences &lt;code&gt;$ref&lt;/code&gt;&lt;/strong&gt;: Inlines definitions from &lt;code&gt;$defs&lt;/code&gt; / &lt;code&gt;definitions&lt;/code&gt;, eliminating external references&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fills in missing &lt;code&gt;type&lt;/code&gt;&lt;/strong&gt;: The Kimi validator rejects nested property schemas that omit &lt;code&gt;type&lt;/code&gt; (e.g., MCP-generated enum-only schemas). kimi-code infers and backfills &lt;code&gt;type: string/object/array&lt;/code&gt;, etc.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Circular reference detection&lt;/strong&gt;: Preserves the original &lt;code&gt;$ref&lt;/code&gt; when a circular reference is detected, avoiding infinite recursion&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;OpenCode&amp;rsquo;s Problem:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Generated schemas use &lt;code&gt;#/definitions/&lt;/code&gt; instead of the &lt;code&gt;#/$defs/&lt;/code&gt; format required by Moonshot, and lack schema type inference and backfilling for Kimi, causing complex tool calls to fail with 400.&lt;/p&gt;
&lt;h2 id="4-native-thinking-mode-configuration-system"&gt;4. Native Thinking Mode Configuration System
&lt;/h2&gt;&lt;p&gt;kimi-code has built-in support for Kimi&amp;rsquo;s thinking mode from the configuration layer all the way to the UI:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Config Parsing&lt;/strong&gt;: &lt;code&gt;ThinkingConfigSchema&lt;/code&gt; supports &lt;code&gt;mode: auto/on/off&lt;/code&gt; and &lt;code&gt;effort: low/medium/high/xhigh/max&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Model Capability Tags&lt;/strong&gt;: &lt;code&gt;ModelAlias&lt;/code&gt; supports &lt;code&gt;capabilities: ['thinking', 'always_thinking']&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Model Selector UI&lt;/strong&gt;: Press &lt;code&gt;←→&lt;/code&gt; to toggle thinking on/off; &lt;code&gt;always-on&lt;/code&gt; models cannot be turned off&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Provider Method&lt;/strong&gt;: &lt;code&gt;withThinking(effort)&lt;/code&gt; correctly generates:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-json" data-lang="json"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;reasoning_effort&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;high&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;extra_body&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nt"&gt;&amp;#34;thinking&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nt"&gt;&amp;#34;type&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;enabled&amp;#34;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Token Budget&lt;/strong&gt;: Automatically normalizes legacy &lt;code&gt;max_tokens&lt;/code&gt; to Kimi&amp;rsquo;s preferred &lt;code&gt;max_completion_tokens&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;OpenCode&amp;rsquo;s Problem:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;When using the Anthropic bridge, it hardcodes &lt;code&gt;thinking&lt;/code&gt; content blocks, but the Kimi API only supports &lt;code&gt;text/image_url/video_url/video&lt;/code&gt;, resulting in:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Invalid value: thinking. Supported values are: &amp;#39;text&amp;#39;,&amp;#39;image_url&amp;#39;,&amp;#39;video_url&amp;#39; and &amp;#39;video&amp;#39;.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;h2 id="5-native-moonshot-service-integration"&gt;5. Native Moonshot Service Integration
&lt;/h2&gt;&lt;p&gt;kimi-code includes Moonshot-exclusive services instead of relying on generic local implementations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;MoonshotFetchURLProvider&lt;/code&gt;&lt;/strong&gt;: Prioritizes Moonshot&amp;rsquo;s &lt;code&gt;coding-fetch&lt;/code&gt; service (with built-in page text extraction), falling back to local fetch only on failure&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;MoonshotWebSearchProvider&lt;/code&gt;&lt;/strong&gt;: Calls the Moonshot search API directly, supporting &lt;code&gt;enable_page_crawling&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;KimiFiles&lt;/code&gt;&lt;/strong&gt;: Uploads videos to the Moonshot file service, returning &lt;code&gt;video_url&lt;/code&gt; in the &lt;code&gt;ms://&amp;lt;file-id&amp;gt;&lt;/code&gt; format&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="6-tool-call-layer-details"&gt;6. Tool Call Layer Details
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Built-in Functions&lt;/strong&gt;: Tool names starting with &lt;code&gt;$&lt;/code&gt; are recognized as Kimi builtin functions and serialized as &lt;code&gt;type: 'builtin_function'&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Usage Extraction&lt;/strong&gt;: Supports Moonshot&amp;rsquo;s proprietary &lt;code&gt;choices[0].usage&lt;/code&gt; placement, as well as &lt;code&gt;cached_tokens&lt;/code&gt; and other fields&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Finish Reason Mapping&lt;/strong&gt;: Maps OpenAI-style &lt;code&gt;stop&lt;/code&gt;/&lt;code&gt;tool_calls&lt;/code&gt;/&lt;code&gt;length&lt;/code&gt; values to an internal unified enum&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="7-cli-core-and-llm-sdk-architectural-isolation"&gt;7. CLI Core and LLM SDK Architectural Isolation
&lt;/h2&gt;&lt;p&gt;This is an easily overlooked but important architectural difference.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The core CLI of kimi-code (&lt;code&gt;apps/kimi-code&lt;/code&gt;) does not directly depend on any OpenAI or Anthropic TypeScript SDK.&lt;/strong&gt; Looking at its &lt;code&gt;package.json&lt;/code&gt;, the core dependencies are only generic libraries like TUI rendering (&lt;code&gt;pi-tui&lt;/code&gt;), CLI parsing (&lt;code&gt;commander&lt;/code&gt;), and syntax highlighting (&lt;code&gt;cli-highlight&lt;/code&gt;). All LLM provider interactions are isolated within the self-developed &lt;code&gt;kosong&lt;/code&gt; package.&lt;/p&gt;
&lt;p&gt;While &lt;code&gt;packages/kosong&lt;/code&gt; internally uses &lt;code&gt;openai&lt;/code&gt; and &lt;code&gt;@anthropic-ai/sdk&lt;/code&gt; as implementation details (since the Kimi API is OpenAI-compatible), it exposes a unified LLM abstraction interface to the outside. The CLI core only depends on &lt;code&gt;kosong&lt;/code&gt; and has no awareness of underlying vendor SDKs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OpenCode is different.&lt;/strong&gt; Its &lt;code&gt;packages/opencode&lt;/code&gt; core package directly depends on a large number of vendor SDKs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;@ai-sdk/openai&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;@ai-sdk/anthropic&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;@ai-sdk/google&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;@ai-sdk/azure&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;@openrouter/ai-sdk-provider&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&amp;hellip; (more than a dozen provider-specific packages in total)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This means OpenCode&amp;rsquo;s core code is deeply coupled with each vendor&amp;rsquo;s SDK, while kimi-code&amp;rsquo;s core CLI stays clean, with all model interactions fully isolated through a self-developed abstraction layer.&lt;/p&gt;
&lt;h2 id="8-what-commit-history-reveals-about-evolution-paths"&gt;8. What Commit History Reveals About Evolution Paths
&lt;/h2&gt;&lt;p&gt;The structural code differences above are just a static snapshot. What&amp;rsquo;s more interesting is comparing the commit histories of the two projects—their &lt;strong&gt;dynamic evolution directions are completely different&lt;/strong&gt;.&lt;/p&gt;
&lt;h3 id="kimi-code-native-design-continuously-reducing-configuration-burden"&gt;kimi-code: Native Design, Continuously Reducing Configuration Burden
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;842e699&lt;/code&gt; — &amp;ldquo;Kimi For Coding&amp;rdquo; (Initial Commit)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This was the starting point of the entire project. The initial code already included:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;packages/kosong/src/providers/kimi.ts&lt;/code&gt;: Dedicated Kimi provider&lt;/li&gt;
&lt;li&gt;&lt;code&gt;packages/kosong/src/providers/kimi-schema.ts&lt;/code&gt;: Dedicated JSON Schema normalizer&lt;/li&gt;
&lt;li&gt;&lt;code&gt;packages/kosong/src/providers/kimi-files.ts&lt;/code&gt;: Dedicated file upload service&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Conclusion: kimi-code treated the Kimi API as a first-class citizen from day one, not as a later patch.&lt;/strong&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;d95b013&lt;/code&gt; fix(catalog): preserve reasoning fields in custom model (&lt;a class="link" href="https://github.com/MoonshotAI/kimi-code/pull/70" target="_blank" rel="noopener"
&gt;#70&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This commit fixed a very subtle issue. models.dev uses the &lt;code&gt;interleaved&lt;/code&gt; field to mark reasoning support, but early code treated &lt;code&gt;interleaved=true&lt;/code&gt; as undefined, causing models selected via &lt;code&gt;/connect&lt;/code&gt; to silently lose their reasoning capability.&lt;/p&gt;
&lt;p&gt;Fixes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;interleaved=true&lt;/code&gt; is mapped to the default &lt;code&gt;reasoning_content&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;interleaved&lt;/code&gt; is added to the &lt;code&gt;update-catalog.mjs&lt;/code&gt; allowlist; otherwise the offline catalog in release builds would silently drop the field again&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;61f7d0e&lt;/code&gt; fix(kosong): make openai-compatible thinking work without reasoning_key (&lt;a class="link" href="https://github.com/MoonshotAI/kimi-code/pull/78" target="_blank" rel="noopener"
&gt;#78&lt;/a&gt;)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;This is the core commit for reasoning handling&lt;/strong&gt;, showcasing kimi-code&amp;rsquo;s deep thinking on compatibility. The diff reveals a three-layer design:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Inbound Auto-Scan&lt;/strong&gt; (response parsing)&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-ts" data-lang="ts"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kr"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;KNOWN_REASONING_KEYS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;reasoning_content&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;reasoning_details&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;reasoning&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="kr"&gt;as&lt;/span&gt; &lt;span class="kr"&gt;const&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// Auto-scan three fields; first string value wins
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Outbound Default Write-Back&lt;/strong&gt; (request serialization)&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-ts" data-lang="ts"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kr"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;DEFAULT_OUTBOUND_REASONING_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;KNOWN_REASONING_KEYS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt; &lt;span class="c1"&gt;// &amp;#39;reasoning_content&amp;#39;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// Defaults to writing back as reasoning_content, no user config needed
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Auto-Inject &lt;code&gt;reasoning_effort&lt;/code&gt;&lt;/strong&gt; (historical continuity)&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-ts" data-lang="ts"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// When history contains ThinkPart but caller hasn&amp;#39;t explicitly set reasoning_effort,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// auto-inject &amp;#39;medium&amp;#39; to prevent strict gateways like One API / DeepSeek from returning 400
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Edge cases are handled meticulously: blank &lt;code&gt;reasoning_key&lt;/code&gt; (&lt;code&gt;&amp;quot;&amp;quot;&lt;/code&gt;) is normalized to &lt;code&gt;undefined&lt;/code&gt;; values explicitly set by the caller via &lt;code&gt;withGenerationKwargs&lt;/code&gt; &lt;strong&gt;are not silently overwritten by auto-injection&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The verification goal explicitly states:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Manually verified end-to-end against the real DeepSeek API with a hand-written config.toml that does not set reasoning_key: thinking content renders, no 400, multi-turn conversations work.&lt;/p&gt;&lt;/blockquote&gt;
&lt;hr&gt;
&lt;h3 id="opencode-generic-layer-design-openai-centric"&gt;OpenCode: Generic Layer Design, OpenAI-centric
&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;eb84f46&lt;/code&gt; fix(llm): split OpenAI reasoning summary blocks (#29000)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This commit demonstrates OpenCode&amp;rsquo;s completely different approach to reasoning—designed around the &lt;strong&gt;OpenAI Responses API&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Maintains a state machine for &lt;code&gt;encrypted_content&lt;/code&gt; and &lt;code&gt;item_reference&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Folds multiple summary parts by &lt;code&gt;item_id&lt;/code&gt; + &lt;code&gt;summary_index&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;When &lt;code&gt;store:false&lt;/code&gt;, filters out reasoning items lacking &lt;code&gt;encrypted_content&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;This is completely different from Kimi&amp;rsquo;s &lt;code&gt;reasoning_content&lt;/code&gt; mechanism.&lt;/strong&gt; Kimi does not need &lt;code&gt;encrypted_content&lt;/code&gt; or &lt;code&gt;item_reference&lt;/code&gt;; it simply attaches a &lt;code&gt;reasoning_content&lt;/code&gt; field to the message.&lt;/p&gt;
&lt;hr&gt;
&lt;h3 id="a-hard-fact"&gt;A Hard Fact
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a class="link" href="https://github.com/anomalyco/opencode/issues/26331" target="_blank" rel="noopener"
&gt;OpenCode Issue #26331&lt;/a&gt;&lt;/strong&gt; &amp;ldquo;Bug: OpenCode Go bridge layer incompatible with kimi-k2.6 tool calls&amp;rdquo; — &lt;strong&gt;Status: still open&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a class="link" href="https://github.com/anomalyco/opencode/issues/27054" target="_blank" rel="noopener"
&gt;OpenCode Issue #27054&lt;/a&gt;&lt;/strong&gt; &amp;ldquo;KIMI K2.6 showing error in Opencode GO&amp;rdquo; — &lt;strong&gt;Status: closed, but the resolution was to disable MCP (a workaround)&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The last comment on #27054:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The workaround is to disable your MCP and then initiate the session&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;That&amp;rsquo;s not a fix. That&amp;rsquo;s avoiding the problem.&lt;/p&gt;
&lt;hr&gt;
&lt;h3 id="commit-history-comparison-summary"&gt;Commit History Comparison Summary
&lt;/h3&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;kimi-code&lt;/th&gt;
&lt;th&gt;OpenCode&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Initial Design&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Initial commit includes full Kimi provider + schema normalizer + file service&lt;/td&gt;
&lt;td&gt;Generic multi-model architecture, adapted later via bridge&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reasoning Mechanism&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Designed around &lt;code&gt;reasoning_content&lt;/code&gt; field, with auto-scan / write-back / effort injection&lt;/td&gt;
&lt;td&gt;Designed around OpenAI Responses&amp;rsquo; &lt;code&gt;encrypted_content&lt;/code&gt; + &lt;code&gt;item_reference&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Schema Handling&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Dedicated &lt;code&gt;normalizeKimiToolSchema&lt;/code&gt;, dereferences &lt;code&gt;$ref&lt;/code&gt; + backfills &lt;code&gt;type&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Generic schema validation, focused on friendly error messages&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Config Philosophy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Makes OpenAI-compatible gateways &amp;ldquo;zero-config&amp;rdquo; by auto-inferring all fields&lt;/td&gt;
&lt;td&gt;Relies on users manually adapting via bridge/config&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Issue Status&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Continuously shipping reasoning-related patches (&lt;a class="link" href="https://github.com/MoonshotAI/kimi-code/pull/70" target="_blank" rel="noopener"
&gt;#70&lt;/a&gt;, &lt;a class="link" href="https://github.com/MoonshotAI/kimi-code/pull/78" target="_blank" rel="noopener"
&gt;#78&lt;/a&gt;)&lt;/td&gt;
&lt;td&gt;kimi-k2.6 compatibility issue &lt;a class="link" href="https://github.com/anomalyco/opencode/issues/26331" target="_blank" rel="noopener"
&gt;#26331&lt;/a&gt; still open&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id="summary-core-differences"&gt;Summary: Core Differences
&lt;/h2&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;kimi-code&lt;/th&gt;
&lt;th&gt;OpenCode&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Architecture Positioning&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Native design for Kimi/Moonshot, dedicated provider&lt;/td&gt;
&lt;td&gt;Generic multi-model agent, adapted via bridge&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Thinking/Reasoning&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Native support, full lifecycle preservation of &lt;code&gt;reasoning_content&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Easily lost in bridge layer, causing 400 errors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;JSON Schema&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Dedicated &lt;code&gt;normalizeKimiToolSchema&lt;/code&gt; for dereferencing and type backfilling&lt;/td&gt;
&lt;td&gt;Generic schema generation, does not meet Kimi validator requirements&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;API Format&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Directly generates Moonshot-native format (including &lt;code&gt;thinking&lt;/code&gt; config, &lt;code&gt;$defs&lt;/code&gt; normalization, etc.)&lt;/td&gt;
&lt;td&gt;Transformed through OpenAI/Anthropic protocol conversion, causing format mismatches&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Service Integration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Built-in Moonshot fetch/search/file services&lt;/td&gt;
&lt;td&gt;Uses generic local tools&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Core Dependencies&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;CLI core does not directly depend on vendor SDKs; isolated via self-developed &lt;code&gt;kosong&lt;/code&gt; package&lt;/td&gt;
&lt;td&gt;Core package directly coupled with &lt;code&gt;@ai-sdk/openai&lt;/code&gt; and more than a dozen other vendor SDKs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Looking at commit history, kimi-code&amp;rsquo;s evolution is directed at &lt;strong&gt;continuously eliminating user configuration burden&lt;/strong&gt; (&lt;code&gt;reasoning_key&lt;/code&gt; went from required → optional override → auto-inferred; &lt;code&gt;interleaved&lt;/code&gt; went from filtered → correctly mapped), while OpenCode&amp;rsquo;s evolution is directed at &lt;strong&gt;deepening OpenAI ecosystem integration&lt;/strong&gt; (Responses API, encrypted reasoning, item reference), leaving Kimi adaptation stuck at the generic bridge layer.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;That&amp;rsquo;s the truth at the commit level: one is native evolution, the other is a bridge gap.&lt;/strong&gt;&lt;/p&gt;</description></item><item><title>Recently Discovered Useful MCP Tools</title><link>https://svtter.cn/en/p/recently-discovered-useful-mcp-tools/</link><pubDate>Sat, 17 Jan 2026 22:18:33 +0800</pubDate><guid>https://svtter.cn/en/p/recently-discovered-useful-mcp-tools/</guid><description>&lt;img src="https://svtter.cn/p/%E6%9C%80%E8%BF%91%E5%8F%91%E7%8E%B0%E5%A5%BD%E7%94%A8%E7%9A%84-mcp-%E5%B7%A5%E5%85%B7/bg.png" alt="Featured image of post Recently Discovered Useful MCP Tools" /&gt;&lt;p&gt;Recently, I&amp;rsquo;ve been extensively using the opencode/claude code combination for development and have explored three particularly useful tools.&lt;/p&gt;
&lt;p&gt;They address several issues:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Parallel development on a single server; controlling tmux: &lt;code&gt;tmux&lt;/code&gt; and &lt;code&gt;tmux-mcp&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Preventing claude code from stopping at meaningless points: &lt;code&gt;ralph-loop&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;End-to-end automated testing: &lt;code&gt;playwright mcp&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="tool-list"&gt;Tool List
&lt;/h2&gt;&lt;h3 id="tmux-mcp"&gt;tmux mcp
&lt;/h3&gt;&lt;p&gt;First, configure tmux in the Linux environment with opencode, then have opencode install &lt;a class="link" href="https://github.com/rinadelph/tmux-mcp.git" target="_blank" rel="noopener"
&gt;https://github.com/rinadelph/tmux-mcp.git&lt;/a&gt;. Once installed, you can use oc to control tmux content.&lt;/p&gt;
&lt;p&gt;This method can be used to reactivate stopped opencode sessions. For example, you can open multiple tmux sessions and have one opencode monitor, start, and stop tasks through the tmux tool.&lt;/p&gt;
&lt;h3 id="ralph-loop"&gt;ralph-loop
&lt;/h3&gt;&lt;p&gt;Ralph is an autonomous AI agent loop that repeatedly runs Amp until all PRD items are completed. Each iteration creates a brand new Amp instance with a clean context.&lt;/p&gt;
&lt;p&gt;Ralph likely originated from here: &lt;a class="link" href="https://github.com/snarktank/ralph" target="_blank" rel="noopener"
&gt;https://github.com/snarktank/ralph&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Due to its effectiveness (which actually occurred after further improvements in model performance), it was also introduced to claude code.&lt;/p&gt;
&lt;script src="https://svtter.cn/js/repo-card.js"&gt;&lt;/script&gt;
&lt;div class="repo-card" data-repo="frankbria/ralph-claude-code"&gt;&lt;/div&gt;
&lt;p&gt;Ralph-loop is a Claude Code plugin that allows Claude Code to automatically restart when tasks are completed, forming a loop execution mechanism. This is particularly useful for tasks that require continuous improvement or iteration.&lt;/p&gt;
&lt;h4 id="installation-method"&gt;Installation Method
&lt;/h4&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Install through Claude Code official plugin market&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;/plugin install ralph-wiggum@claude-plugins-official&lt;/code&gt; or &lt;code&gt;cc '/plugin install ralph-wiggum@claude-plugins-official'&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Configuration and Usage&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;After installation, you can start it in Claude Code via the &lt;code&gt;/ralph-loop&lt;/code&gt; command&lt;/li&gt;
&lt;li&gt;Set tasks and termination conditions, Claude Code will automatically restart each time it stops&lt;/li&gt;
&lt;li&gt;This is particularly useful for scenarios requiring multiple iterations of code improvement, debugging, or testing&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h4 id="use-cases"&gt;Use Cases
&lt;/h4&gt;&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Code Refactoring&lt;/strong&gt;: Have Claude Code continuously improve code quality&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Test-Driven Development&lt;/strong&gt;: Write tests, then have Claude Code continuously improve implementations&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Debugging Loops&lt;/strong&gt;: Automatically restart debugging sessions&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Continuous Integration&lt;/strong&gt;: Simulate CI/CD processes locally&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The drawback of this plugin is that it consumes a lot of tokens; without a max20 subscription, it&amp;rsquo;s better not to use it. However, for tasks requiring high-quality output, this tool can significantly improve work efficiency.&lt;/p&gt;
&lt;h3 id="playwright-mcp"&gt;playwright mcp
&lt;/h3&gt;&lt;p&gt;This plugin can launch browsers to complete end-to-end testing or write end-to-end test code. It can better form loops to have cc or oc improve code.&lt;/p&gt;
&lt;p&gt;Installation method: &lt;code&gt;claude 'help me install playwright mcp'&lt;/code&gt;&lt;/p&gt;
&lt;h2 id="rewriting-as-agents"&gt;Rewriting as Agents
&lt;/h2&gt;&lt;p&gt;I recommend directly rewriting these tools and MCPs as agents through opencode or claude code.&lt;/p&gt;
&lt;p&gt;Compared to skills commands, these tools are more suitable for invocation through agents. &lt;a class="link" href="https://svtter.cn/p/why-agent/#llm-%E7%9A%84%E9%97%AE%E9%A2%98" target="_blank" rel="noopener"
&gt;Agent context is very clean&lt;/a&gt;, making tool invocation almost inevitable.&lt;/p&gt;
&lt;h2 id="summary"&gt;Summary
&lt;/h2&gt;&lt;p&gt;As LLMs become increasingly powerful, numerous MCPs that rely on LLM capabilities naturally gain improvements. Tools that weren&amp;rsquo;t very useful before become more effective. This aligns with the saying: &amp;ldquo;Don&amp;rsquo;t build things that become meaningless after large model capability enhancements.&amp;rdquo;
Large model capabilities continue to improve, and prices keep decreasing.&lt;/p&gt;
&lt;p&gt;I believe the next step is to bridge interactions between different modalities and tools, as well as endowing tools with large model capabilities, which is one of the inevitable development directions for agent engineers.&lt;/p&gt;</description></item><item><title>Efficient and Cost-Effective: My AI Agent Workflow Choice</title><link>https://svtter.cn/en/p/efficient-and-cost-effective-my-ai-agent-workflow-choice/</link><pubDate>Mon, 05 Jan 2026 16:00:00 +0800</pubDate><guid>https://svtter.cn/en/p/efficient-and-cost-effective-my-ai-agent-workflow-choice/</guid><description>&lt;img src="https://svtter.cn/p/%E9%AB%98%E6%95%88%E7%9C%81%E9%92%B1%E6%88%91%E7%9A%84-ai-agent-%E5%B7%A5%E4%BD%9C%E6%B5%81%E9%80%89%E6%8B%A9/featured-image.jpg" alt="Featured image of post Efficient and Cost-Effective: My AI Agent Workflow Choice" /&gt;&lt;p&gt;Claude Code&amp;rsquo;s $100/month price tag is a bit steep for many. To address this, I&amp;rsquo;ve been experimenting with a more practical and affordable workflow.&lt;/p&gt;
&lt;p&gt;In terms of models, my recommendation is to use &lt;strong&gt;Gemini 3 Flash&lt;/strong&gt; on an as-needed (pay-as-you-go) basis as a replacement.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Why?&lt;/strong&gt; Gemini 3 Flash offers incredible value. It&amp;rsquo;s fast, efficient, and costs a fraction of what you&amp;rsquo;d pay for Opus or Sonnet. For the vast majority of tasks, Flash is more than enough.&lt;/p&gt;
&lt;h2 id="the-cost-saving-workflow"&gt;The Cost-Saving Workflow
&lt;/h2&gt;&lt;p&gt;Here is my current &amp;ldquo;budget&amp;rdquo; workflow:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Planning &amp;amp; Proposals&lt;/strong&gt;: Use Gemini 3 Flash.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Execution &amp;amp; Building&lt;/strong&gt;: Use the free &lt;strong&gt;GLM 4.7&lt;/strong&gt; (or MiniMax M2.1) via OpenCode. If you have a &lt;a class="link" href="https://svtter.cn/p/2025-10-09-%e6%88%91%e7%8e%b0%e5%9c%a8%e6%9b%b4%e5%a4%9a%e7%9a%84%e4%bd%bf%e7%94%a8-GLM-4.6-%e4%ba%86/" &gt;Zhipu Coding Plan&lt;/a&gt;, that works perfectly too.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Speaking of Gemini 3, we have to talk about &lt;strong&gt;GPT-5.2&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Many engineers still rely on ChatGPT.com directly instead of using a proper coding agent. Regardless of the efficiency debate, the reliability is concerning. From my experience, GPT-5.2&amp;rsquo;s default tone has been tuned to be overly &amp;ldquo;people-pleasing,&amp;rdquo; which might not be ideal for professional developers seeking direct technical feedback.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://svtter.cn/p/%E9%AB%98%E6%95%88%E7%9C%81%E9%92%B1%E6%88%91%E7%9A%84-ai-agent-%E5%B7%A5%E4%BD%9C%E6%B5%81%E9%80%89%E6%8B%A9/pics/image_1767597061665_0.png"
width="1023"
height="930"
srcset="https://svtter.cn/p/%E9%AB%98%E6%95%88%E7%9C%81%E9%92%B1%E6%88%91%E7%9A%84-ai-agent-%E5%B7%A5%E4%BD%9C%E6%B5%81%E9%80%89%E6%8B%A9/pics/image_1767597061665_0_hu_175ada8cb4120ce2.png 480w, https://svtter.cn/p/%E9%AB%98%E6%95%88%E7%9C%81%E9%92%B1%E6%88%91%E7%9A%84-ai-agent-%E5%B7%A5%E4%BD%9C%E6%B5%81%E9%80%89%E6%8B%A9/pics/image_1767597061665_0_hu_c7107e2757a481d7.png 1024w"
loading="lazy"
alt="GPT-5.2 Response Tone"
class="gallery-image"
data-flex-grow="110"
data-flex-basis="264px"
&gt;&lt;/p&gt;
&lt;p&gt;Furthermore, while GPT-5.2 scored impressively on &lt;strong&gt;SWE-bench Verified&lt;/strong&gt;, my real-world experience has been mixed. It&amp;rsquo;s worth looking at the history of SWE-bench:&lt;/p&gt;
&lt;p&gt;Originally proposed by a team from &lt;strong&gt;Princeton University&lt;/strong&gt; (ICLR 2024), it evaluates a model&amp;rsquo;s ability to solve real GitHub issues. However, in August 2024, OpenAI&amp;rsquo;s Preparedness team collaborated with the original authors to create &lt;strong&gt;SWE-bench Verified&lt;/strong&gt; (a subset of 500 manually verified issues). Since OpenAI was involved in the design of this benchmark, their models&amp;rsquo; performance on it should be taken with a grain of salt. While not necessarily a deliberate manipulation, the risk of inherent bias is significant.&lt;/p&gt;
&lt;p&gt;Ultimately, as I often say, &amp;ldquo;Codex&amp;rdquo; models don&amp;rsquo;t always deliver the most practical results in everyday coding.&lt;/p&gt;
&lt;h2 id="opencode-tips"&gt;OpenCode Tips
&lt;/h2&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Leveraging Agents&lt;/strong&gt;: OpenCode supports launching SubAgents. When debugging complex projects, you can have OpenCode launch agents in different directories to handle front-end and back-end tasks separately, which also helps avoid permission issues.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;OpenSpec: Cross-Agent Collaboration&lt;/strong&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-text" data-lang="text"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;1. OpenCode + Gemini 3 Flash → Generate proposal
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;2. Codex → Code Review
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;3. Claude Code → Secondary Review
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;4. OpenSpec Apply → Final Execution
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;OpenSpec generates reliable specs, but sometimes cheaper models produce lower-quality code. In such cases, you can generate multiple times using the spec and select the best result.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="final-thoughts"&gt;Final Thoughts
&lt;/h2&gt;&lt;p&gt;As AI Agent engineers, we need to adapt to these ongoing trends:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Models are becoming smarter.&lt;/li&gt;
&lt;li&gt;Execution is becoming faster.&lt;/li&gt;
&lt;li&gt;Prices are dropping.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;While these trends are promising, we still need to balance speed, cost, and quality for every task. We might soon see agent systems that automate this balancing act, but for now, it&amp;rsquo;s a crucial part of the engineer&amp;rsquo;s role.&lt;/p&gt;</description></item><item><title>Developing Fried Rice Theme with Claude Code</title><link>https://svtter.cn/en/p/developing-fried-rice-theme-with-claude-code/</link><pubDate>Tue, 23 Dec 2025 15:00:00 +0800</pubDate><guid>https://svtter.cn/en/p/developing-fried-rice-theme-with-claude-code/</guid><description>&lt;img src="https://svtter.cn/p/%E7%94%A8-claude-code-%E5%BC%80%E5%8F%91-fried-rice-%E4%B8%BB%E9%A2%98/pics/bg.png" alt="Featured image of post Developing Fried Rice Theme with Claude Code" /&gt;&lt;p&gt;Recently, I used Claude Code to add some SEO features to my own blog theme &lt;a class="link" href="https://github.com/Svtter/Fried-Rice" target="_blank" rel="noopener"
&gt;Fried Rice&lt;/a&gt;, and the overall experience was quite good.&lt;/p&gt;
&lt;h2 id="background"&gt;Background
&lt;/h2&gt;&lt;p&gt;Fried Rice is a theme forked from &lt;a class="link" href="https://github.com/CaiJimmy/hugo-theme-stack" target="_blank" rel="noopener"
&gt;hugo-theme-stack&lt;/a&gt;. Previously, I had already added some basic JSON-LD structured data, and this time I wanted to continue improving it.&lt;/p&gt;
&lt;h2 id="what-was-done-this-time"&gt;What Was Done This Time
&lt;/h2&gt;&lt;p&gt;Mainly enhancing SEO structured data:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;WebSite schema (supports search action)&lt;/li&gt;
&lt;li&gt;Organization schema (includes founder, contact point, address)&lt;/li&gt;
&lt;li&gt;FAQ schema (supports inline FAQ in articles)&lt;/li&gt;
&lt;li&gt;Enhanced Article/BlogPosting schema (added accessibility metadata)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="https://svtter.cn/p/%E7%94%A8-claude-code-%E5%BC%80%E5%8F%91-fried-rice-%E4%B8%BB%E9%A2%98/pics/screen.png"
width="2153"
height="1761"
srcset="https://svtter.cn/p/%E7%94%A8-claude-code-%E5%BC%80%E5%8F%91-fried-rice-%E4%B8%BB%E9%A2%98/pics/screen_hu_9e208084f399dd3b.png 480w, https://svtter.cn/p/%E7%94%A8-claude-code-%E5%BC%80%E5%8F%91-fried-rice-%E4%B8%BB%E9%A2%98/pics/screen_hu_ddc5a242949e5593.png 1024w"
loading="lazy"
alt="Fried Rice Theme Preview"
class="gallery-image"
data-flex-grow="122"
data-flex-basis="293px"
&gt;&lt;/p&gt;
&lt;h2 id="claude-codes-performance"&gt;Claude Code&amp;rsquo;s Performance
&lt;/h2&gt;&lt;p&gt;The entire development process took about 2 hours. Claude Code helped me:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Write code&lt;/strong&gt; - Hugo template syntax is cumbersome, letting AI write it saves a lot of effort&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Review code&lt;/strong&gt; - After I committed, I asked it to check, and it found several issues:
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;datePublished&lt;/code&gt; was defined 3 times&lt;/li&gt;
&lt;li&gt;&lt;code&gt;founder&lt;/code&gt; object was defined repeatedly&lt;/li&gt;
&lt;li&gt;JSON output had double escaping issues&lt;/li&gt;
&lt;li&gt;Variable scope errors&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fix issues&lt;/strong&gt; - After finding issues, I asked it to fix them directly, all fixed at once&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Create PR, tag, write CHANGELOG&lt;/strong&gt; - These trivial tasks can also be done&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;A pleasant surprise was that it could find logical issues in the code. For example, Hugo&amp;rsquo;s &lt;code&gt;jsonify&lt;/code&gt; output was HTML-escaped causing JSON format errors, and it found the correct solution (using &lt;code&gt;safeJS&lt;/code&gt;).&lt;/p&gt;
&lt;h2 id="shortcomings"&gt;Shortcomings
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;Sometimes needs multiple reminders to use the correct tools&lt;/li&gt;
&lt;li&gt;Not very familiar with Hugo template syntax in some places, needs several iterations&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="summary"&gt;Summary
&lt;/h2&gt;&lt;p&gt;For this kind of &amp;ldquo;add feature + fix bug&amp;rdquo; task, Claude Code is quite useful. Especially for tedious syntax like Hugo templates, having AI write it is much more efficient.&lt;/p&gt;
&lt;h2 id="related-projects"&gt;Related Projects
&lt;/h2&gt;
&lt;script src="https://svtter.cn/js/repo-card.js"&gt;&lt;/script&gt;&lt;div class="repo-card" data-repo="Svtter/Fried-Rice"&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Based on:&lt;/strong&gt;&lt;/p&gt;
&lt;div class="repo-card" data-repo="CaiJimmy/hugo-theme-stack"&gt;&lt;/div&gt;</description></item><item><title>CS146S is a Good Course</title><link>https://svtter.cn/en/p/cs146s-is-a-good-course/</link><pubDate>Mon, 15 Dec 2025 20:45:35 +0800</pubDate><guid>https://svtter.cn/en/p/cs146s-is-a-good-course/</guid><description>&lt;img src="https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/bg.png" alt="Featured image of post CS146S is a Good Course" /&gt;&lt;p&gt;&lt;a class="link" href="https://themodernsoftware.dev/" target="_blank" rel="noopener"
&gt;CS146S&lt;/a&gt; is a good course, one of the reasons is that it teaches modern software engineers how to better collaborate with AI. Secondly, it basically covers all my modern coding capabilities. &lt;em&gt;(It&amp;rsquo;s a joke!)&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;In the following content, I will embed the slides from the course as hyperlinks in my text. If you&amp;rsquo;re interested, you can click the hyperlinks directly to open the corresponding slides.&lt;/p&gt;
&lt;h2 id="basic-techniques"&gt;Basic Techniques
&lt;/h2&gt;&lt;p&gt;I think everyone, like me, has already mastered the basic capabilities. More clear and explicit prompts let LLMs execute instructions unambiguously. Additionally, there are prompt optimization techniques, and using &lt;a class="link" href="https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/prompt-improver" target="_blank" rel="noopener"
&gt;Claude to optimize prompts&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The course also talked about how to &lt;a class="link" href="https://docs.google.com/presentation/d/11CP26VhsjnZOmi9YFgLlonzdib9BLyAlgc4cEvC5Fps/edit?usp=sharing" target="_blank" rel="noopener"
&gt;build coding agents&lt;/a&gt;, emphasizing that you can use the &lt;a class="link" href="https://platform.claude.com/docs/en/agent-sdk/overview" target="_blank" rel="noopener"
&gt;Claude Code SDK&lt;/a&gt;. It&amp;rsquo;s now called Claude Agent SDK.&lt;/p&gt;
&lt;p&gt;To enhance LLM capabilities, you can also use &lt;a class="link" href="https://docs.google.com/presentation/d/1zSC2ra77XOUrJeyS85houg1DU7z9hq5Y4ebagTch-5o/edit?usp=drive_link" target="_blank" rel="noopener"
&gt;MCP services&lt;/a&gt;. I built &lt;a class="link" href="https://github.com/Svtter/git-mcp" target="_blank" rel="noopener"
&gt;git-mcp&lt;/a&gt;, and there&amp;rsquo;s also an unopen-sourced experimental startup MCP.&lt;/p&gt;
&lt;h3 id="mcp-a-bit-deeper-content-from-the-ppt"&gt;MCP a bit deeper (content from the PPT)
&lt;/h3&gt;&lt;p&gt;&lt;img src="https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/clipboard-1765805054231.png"
width="2442"
height="1268"
srcset="https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/clipboard-1765805054231_hu_5bb5ee36b6e10e41.png 480w, https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/clipboard-1765805054231_hu_70c54c437868eb9.png 1024w"
loading="lazy"
class="gallery-image"
data-flex-grow="192"
data-flex-basis="462px"
&gt;&lt;/p&gt;
&lt;p&gt;With MCP, it&amp;rsquo;s worth noting the Host/Server/Client concept. Many Hosts are not open-source. Deepchat&amp;rsquo;s Host can be referenced.&lt;/p&gt;
&lt;p&gt;Limitations:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-gdscript3" data-lang="gdscript3"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Agents&lt;/span&gt; &lt;span class="n"&gt;don&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;t handle many tools very well today&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;APIs&lt;/span&gt; &lt;span class="n"&gt;eat&lt;/span&gt; &lt;span class="n"&gt;up&lt;/span&gt; &lt;span class="n"&gt;your&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;window&lt;/span&gt; &lt;span class="n"&gt;quickly&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;Design&lt;/span&gt; &lt;span class="n"&gt;APIs&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;be&lt;/span&gt; &lt;span class="n"&gt;AI&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;native&lt;/span&gt; &lt;span class="n"&gt;rather&lt;/span&gt; &lt;span class="n"&gt;that&lt;/span&gt; &lt;span class="n"&gt;rigid&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;h2 id="ide-agent"&gt;IDE Agent
&lt;/h2&gt;&lt;p&gt;From the &lt;a class="link" href="https://docs.google.com/presentation/d/11pQNCde_mmRnImBat0Zymnp8TCS_cT_1up7zbcj6Sjg/edit?usp=sharing" target="_blank" rel="noopener"
&gt;IDE&lt;/a&gt; perspective, I&amp;rsquo;ve switched from frequently using Cursor to using Claude Code + VSCode for programming. I feel Claude Code as a CLI is more powerful. However, I haven&amp;rsquo;t used Cursor for a while, so I don&amp;rsquo;t know if there have been some improvements. &lt;a class="link" href="https://www.trae.ai/solo" target="_blank" rel="noopener"
&gt;Trae&amp;rsquo;s solo mode&lt;/a&gt; is just like that, basically insufficient intelligence is the biggest problem. Trae CN.&lt;/p&gt;
&lt;p&gt;Additionally worth mentioning is that Silas Alberti, Head of Research &lt;a class="link" href="https://cognition.ai/" target="_blank" rel="noopener"
&gt;Cognition&lt;/a&gt;&amp;rsquo;s &lt;a class="link" href="https://docs.google.com/presentation/d/1i0pRttHf72lgz8C-n7DSegcLBgncYZe_ppU7dB9zhUA/edit?usp=sharing" target="_blank" rel="noopener"
&gt;slides&lt;/a&gt; are very powerful.&lt;/p&gt;
&lt;p&gt;This summary diagram is awesome. &lt;code&gt;Is it really free to watch?&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/axis.png"
width="2709"
height="1491"
srcset="https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/axis_hu_4f423a1898a3c4be.png 480w, https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/axis_hu_f0b3717c50b18ca0.png 1024w"
loading="lazy"
alt="local-cloud/sync-async limitations"
class="gallery-image"
data-flex-grow="181"
data-flex-basis="436px"
&gt;&lt;/p&gt;
&lt;p&gt;This article also mentions the concept of parallel agents.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/cloud-sync.png"
width="2730"
height="1428"
srcset="https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/cloud-sync_hu_888ea075db809469.png 480w, https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/cloud-sync_hu_aad727e671ca1c8d.png 1024w"
loading="lazy"
class="gallery-image"
data-flex-grow="191"
data-flex-basis="458px"
&gt;&lt;/p&gt;
&lt;p&gt;So for me, the next direction to improve is cloud + async.&lt;/p&gt;
&lt;p&gt;This is Silas Alberti&amp;rsquo;s advice:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/asyncagent.png"
width="2730"
height="1596"
srcset="https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/asyncagent_hu_8737468dad94135f.png 480w, https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/asyncagent_hu_7975af118f4c1c9b.png 1024w"
loading="lazy"
class="gallery-image"
data-flex-grow="171"
data-flex-basis="410px"
&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class="link" href="https://devin.ai/" target="_blank" rel="noopener"
&gt;devin&lt;/a&gt; and &lt;a class="link" href="https://claude.ai/code" target="_blank" rel="noopener"
&gt;Claude Code Cloud&lt;/a&gt; are exactly the same. Actually, you can completely use Claude Code Cloud version for vibe coding.&lt;/p&gt;
&lt;h2 id="agent-manager"&gt;Agent Manager
&lt;/h2&gt;&lt;p&gt;Engineers need to become agent managers, not just software engineers.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/history.png"
width="2843"
height="1514"
srcset="https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/history_hu_dff75fbff84f0862.png 480w, https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/history_hu_9ea6b729a07f56bc.png 1024w"
loading="lazy"
class="gallery-image"
data-flex-grow="187"
data-flex-basis="450px"
&gt;&lt;/p&gt;
&lt;p&gt;Under the Claude Code designer mindset, the software design process should be:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Provide high level requirements 🟩&lt;/li&gt;
&lt;li&gt;Convert requirements into a design doc 🟩/🟦&lt;/li&gt;
&lt;li&gt;Implement solution from doc 🟦&lt;/li&gt;
&lt;li&gt;Add tests 🟦&lt;/li&gt;
&lt;li&gt;Ensure CI (continuous integration) passes 🟦&lt;/li&gt;
&lt;li&gt;Code review 🟦&lt;/li&gt;
&lt;li&gt;Update docs 🟦&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;My habit is more to write simple requirements, then generate design, then let Claude Code implement the rest itself.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I recently found it&amp;rsquo;s not that capable. I adopted a test-driven development approach to ensure every step is done correctly. Otherwise, CI and Add tests actually have no meaning.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Techniques for directing agents:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Agent behavior files (Claude.md/Cursorrules/agents.md)&lt;/li&gt;
&lt;li&gt;Hooks&lt;/li&gt;
&lt;li&gt;Commands&lt;/li&gt;
&lt;li&gt;Subagents&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I&amp;rsquo;ve already used subagents and commands a lot. But I haven&amp;rsquo;t found a killing scenario for hooks yet.&lt;/p&gt;
&lt;h3 id="best-practice-claude-code"&gt;Best practice Claude Code
&lt;/h3&gt;&lt;p&gt;&lt;img src="https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/best-prac-cc.png"
width="2613"
height="1149"
srcset="https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/best-prac-cc_hu_6f1f8b4de9bcdaad.png 480w, https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/best-prac-cc_hu_bd247f1e483fdf39.png 1024w"
loading="lazy"
class="gallery-image"
data-flex-grow="227"
data-flex-basis="545px"
&gt;&lt;/p&gt;
&lt;p&gt;What I want to say is to use subagents as much as possible to avoid the &amp;ldquo;lost in the middle&amp;rdquo; phenomenon.&lt;/p&gt;
&lt;h3 id="claude-code-cli"&gt;&lt;a class="link" href="https://docs.google.com/presentation/d/1bv7Zozn6z45CAh-IyX99dMPMyXCHC7zj95UfwErBYQ8/edit?slide=id.g35e8acebb8e_0_106#slide=id.g35e8acebb8e_0_106" target="_blank" rel="noopener"
&gt;Claude Code CLI&lt;/a&gt;
&lt;/h3&gt;&lt;blockquote&gt;
&lt;p&gt;Why did I buy Claude Code?&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/clipboard-1765805228031.png"
width="2850"
height="1659"
srcset="https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/clipboard-1765805228031_hu_970bdc1680b37577.png 480w, https://svtter.cn/p/cs146s-%E6%98%AF%E4%B8%80%E9%97%A8%E5%A5%BD%E8%AF%BE%E7%A8%8B/pics/clipboard-1765805228031_hu_e272c05700bf0f37.png 1024w"
loading="lazy"
class="gallery-image"
data-flex-grow="171"
data-flex-basis="412px"
&gt;&lt;/p&gt;
&lt;p&gt;We can do more things through the SDK:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;claude -p &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="se"&gt;&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;what did i do this week?&amp;#34;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="se"&gt;&lt;/span&gt; --allowedTools Bash&lt;span class="o"&gt;(&lt;/span&gt;git log:*&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; --output-format stream-json
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;h2 id="conclusion"&gt;Conclusion
&lt;/h2&gt;&lt;p&gt;This course is free, but the insights inside surpass most paid courses. If you can understand and quickly absorb it, don&amp;rsquo;t be stingy with your time, learn it.&lt;/p&gt;</description></item><item><title>Can GLM 4.6 Be Strengthened Through Spec-Kit</title><link>https://svtter.cn/en/p/can-glm-4.6-be-strengthened-through-spec-kit/</link><pubDate>Fri, 14 Nov 2025 15:41:46 +0800</pubDate><guid>https://svtter.cn/en/p/can-glm-4.6-be-strengthened-through-spec-kit/</guid><description>&lt;img src="https://svtter.cn/p/%E9%80%9A%E8%BF%87-spec-kit-%E5%8A%A0%E5%BC%BA%E5%BC%B1%E6%A8%A1%E5%9E%8B%E7%9A%84%E8%A1%A8%E7%8E%B0/pics/bg.png" alt="Featured image of post Can GLM 4.6 Be Strengthened Through Spec-Kit" /&gt;&lt;blockquote&gt;
&lt;p&gt;Another article on how to mitigate losses with glm4.6. Our old friend glm 4.6. The new friend doubao-seed-code has also arrived.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;&lt;a class="link" href="https://github.com/github/spec-kit" target="_blank" rel="noopener"
&gt;github spec-kit&lt;/a&gt; is a coding agent enhancement tool launched by GitHub, aimed at making engineering more standardized and easier.&lt;/p&gt;
&lt;p&gt;I initially looked down on this, thinking I have the claude code max plan, so why bother using it? Then:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://svtter.cn/p/%E9%80%9A%E8%BF%87-spec-kit-%E5%8A%A0%E5%BC%BA%E5%BC%B1%E6%A8%A1%E5%9E%8B%E7%9A%84%E8%A1%A8%E7%8E%B0/pics/limit.png"
width="2118"
height="126"
srcset="https://svtter.cn/p/%E9%80%9A%E8%BF%87-spec-kit-%E5%8A%A0%E5%BC%BA%E5%BC%B1%E6%A8%A1%E5%9E%8B%E7%9A%84%E8%A1%A8%E7%8E%B0/pics/limit_hu_5db449d2dc02deb2.png 480w, https://svtter.cn/p/%E9%80%9A%E8%BF%87-spec-kit-%E5%8A%A0%E5%BC%BA%E5%BC%B1%E6%A8%A1%E5%9E%8B%E7%9A%84%E8%A1%A8%E7%8E%B0/pics/limit_hu_9964d513d07b6656.png 1024w"
loading="lazy"
class="gallery-image"
data-flex-grow="1680"
data-flex-basis="4034px"
&gt;&lt;/p&gt;
&lt;p&gt;This is actually the result of using spec kit, leading to a huge token consumption. Otherwise, based on my usual usage, it should have been just right.&lt;/p&gt;
&lt;p&gt;This means that cheaper models might be more cost-effective to use. Because they are less capable, constraining their behavior with extensive specs might lead to better performance than before.&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s try out &lt;a class="link" href="https://github.com/github/spec-kit" target="_blank" rel="noopener"
&gt;spec-kit&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="installation"&gt;Installation
&lt;/h2&gt;&lt;p&gt;For installation, it&amp;rsquo;s recommended to take a dual approach.&lt;/p&gt;
&lt;p&gt;One is to use it directly without worrying too much about installation:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;uvx --from git+https://github.com/github/spec-kit.git specify init . --github-token&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$GITHUB_TOKEN&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Here, &lt;code&gt;GITHUB_TOKEN&lt;/code&gt; refers to the GitHub personal token.&lt;/p&gt;
&lt;p&gt;Another method is to install it first and then use it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;pipx install git+https://github.com/github/spec-kit.git
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Each has its pros and cons. The former requires no installation but needs to pull from git each time; the latter requires a one-time installation but involves dependency management.&lt;/p&gt;
&lt;h2 id="specification-driven-development"&gt;Specification Driven Development
&lt;/h2&gt;&lt;p&gt;SDD is a newly emerging concept. It uses extensive constraints to enable coding agents to write production-ready code.&lt;/p&gt;
&lt;p&gt;This article explains it well:&lt;/p&gt;
&lt;a href="https://mp.weixin.qq.com/s/zVvkSCFiknLZcolKjYLoIA" target="_blank" rel="noopener" style="text-decoration:none; display:block; max-width:600px; border: 1px solid #e0e0e0; border-radius:8px; overflow:hidden; color:inherit; font-family:-apple-system, BlinkMacSystemFont, sans-serif; margin:1em 0;"&gt;
&lt;div style="position:relative; padding-top:56.25%; background:#f0f0f0;"&gt;
&lt;!-- Cover image example: recommended to replace with actual cover image URL --&gt;
&lt;img src="pics/sdd.jpg" alt="Article cover" style="position:absolute; top:0; left:0; width:100%; height:100%; object-fit:cover;"&gt;
&lt;/div&gt;
&lt;div style="padding:12px;"&gt;
&lt;h3 style="margin:0 0 8px; font-size:18px; line-height:1.2; color:#000"&gt;
Follow-up on Spec-Driven Development Two Months Later: spec-kit and Ecosystem Development Research
&lt;/h3&gt;
&lt;p style="margin:0 0 10px; color:#555; font-size:14px; line-height:1.4;"&gt;
This article follows up on the rapid development of GitHub's spec-kit project two months after its release, including its community growth, feature iterations, and ecosystem status. It also explores the core concepts of Specification Driven Development (SDD), compares main tools, discusses challenges, and outlines industry trends for 2025, providing developers with practical advice and outlook.
&lt;/p&gt;
&lt;div style="font-size:12px; color:#999;"&gt;
WeChat Official Account · Publication Date
&lt;/div&gt;
&lt;/div&gt;
&lt;/a&gt;
&lt;p&gt;Based on the concept of SDD, I decided to use spec-kit to test the performance of weaker models. The test task was: to add a publish button to hugo-admin.&lt;/p&gt;
&lt;p&gt;Unfortunately, the performance is still not quite up to par. What&amp;rsquo;s rather amusing is the content it output in the middle:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;● Bash&lt;span class="o"&gt;(&lt;/span&gt;python -c &lt;span class="s2"&gt;&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; import sys…)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; ⎿ Error: Exit code 1
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; Traceback (most recent call last):
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; File &amp;#34;&lt;/span&gt;&amp;lt;string&amp;gt;&lt;span class="s2"&gt;&amp;#34;, line 4, in &amp;lt;module&amp;gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; File &amp;#34;&lt;/span&gt;/home/svtter/work/blog/hugo-admin/services/post_service.py&lt;span class="s2"&gt;&amp;#34;, line 15, in &amp;lt;module&amp;gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; import frontmatter
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; ModuleNotFoundError: No module named &amp;#39;frontmatter&amp;#39;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt; ● The tests show the functionality is implemented correctly but the frontmatter module isn&amp;#39;t installed in the current environment.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="s2"&gt;That&amp;#39;s fine for our implementation - the module is specified in requirements.txt and will be available when the application runs.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;Then it skipped the tests. I can only say that ChatGLM 4.6 is an AI model that doesn&amp;rsquo;t get bogged down in internal struggles. Here is the &lt;a class="link" href="https://github.com/Svtter/hugo-admin/commit/ad7891e0038f154f37a4b65325746dce1c00d1b4" target="_blank" rel="noopener"
&gt;commit&lt;/a&gt; it submitted.&lt;/p&gt;
&lt;p&gt;Afterwards, I switched to &lt;code&gt;doubao-seed-code&lt;/code&gt; to continue testing other features, but the performance of doubao-seed-code combined with Claude Code wasn&amp;rsquo;t great either. You can check out its &lt;a class="link" href="https://github.com/Svtter/hugo-admin/commit/9acd83940da558c1335f036e2dc475062166869d" target="_blank" rel="noopener"
&gt;commit&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In the end, I completed the entire functionality using Trae (which does not support spec-kit). The corresponding &lt;a class="link" href="https://github.com/Svtter/hugo-admin/commit/53f895017e9d2b94880f5385562993274076d7a7" target="_blank" rel="noopener"
&gt;commit&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="summary"&gt;Summary
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;If you can manually manage the current context and some obvious &amp;ldquo;information the model tends to forget,&amp;rdquo; then you can completely avoid using &lt;a class="link" href="https://github.com/github/spec-kit" target="_blank" rel="noopener"
&gt;spec-kit&lt;/a&gt; when working with Claude Code. This thing is a token hog—it essentially uses a sledgehammer to crack a nut.&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/github/spec-kit" target="_blank" rel="noopener"
&gt;spec-kit&lt;/a&gt; does not support Trae, and Trae doesn&amp;rsquo;t need that support to perform well.&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Claude Code Plugin Usage Experience</title><link>https://svtter.cn/en/p/claude-code-plugin-usage-experience/</link><pubDate>Tue, 14 Oct 2025 10:16:54 +0800</pubDate><guid>https://svtter.cn/en/p/claude-code-plugin-usage-experience/</guid><description>&lt;img src="https://svtter.cn/p/claude-code-plugin-%E4%BD%BF%E7%94%A8%E4%BD%93%E9%AA%8C/pics/bg.svg" alt="Featured image of post Claude Code Plugin Usage Experience" /&gt;&lt;p&gt;Overall, the experience was not good.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s likely because it&amp;rsquo;s newly launched and generally feels immature.&lt;/p&gt;
&lt;p&gt;Typical issues include:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Not using available &lt;a class="link" href="https://docs.claude.com/en/docs/claude-code/sub-agents#example-subagents" target="_blank" rel="noopener"
&gt;agents&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Not using available &lt;a class="link" href="https://docs.claude.com/en/docs/agents-and-tools/mcp-connector#mcp-server-configuration" target="_blank" rel="noopener"
&gt;MCP&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Tool calls are infrequent and require manual prompting. As a user, I generally don&amp;rsquo;t deliberately memorize which agents are available.&lt;/p&gt;
&lt;p&gt;More importantly, it impacts efficiency.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If using DeepSeek V3.2, its relatively short context length (128K) means it doesn&amp;rsquo;t perform well when there are many tools or MCP connections.&lt;/li&gt;
&lt;li&gt;Plugins often don&amp;rsquo;t improve the tool usage experience; they can actually degrade it. This is because MCP tools and plugins increase the input token count, forcing the model to process more context. Since the computational complexity of transformers is O(n²), any increase in length has a significant negative impact.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In summary, it&amp;rsquo;s not recommended for use at this time.&lt;/p&gt;</description></item><item><title>[Expired] I now use GLM 4.6 more often.</title><link>https://svtter.cn/en/p/expired-i-now-use-glm-4.6-more-often./</link><pubDate>Thu, 09 Oct 2025 15:36:00 +0800</pubDate><guid>https://svtter.cn/en/p/expired-i-now-use-glm-4.6-more-often./</guid><description>&lt;img src="https://svtter.cn/p/%E8%BF%87%E6%9C%9F-%E6%88%91%E7%8E%B0%E5%9C%A8%E6%9B%B4%E5%A4%9A%E7%9A%84%E4%BD%BF%E7%94%A8-glm-4.6-%E4%BA%86/glm-vs-deepseek.svg" alt="Featured image of post [Expired] I now use GLM 4.6 more often." /&gt;&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;● Update&lt;span class="o"&gt;(&lt;/span&gt;content/post/2025-10-24-我又买了-kimi-coding-plan/pics/bg.svg&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; ⎿ Error editing file
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; ⎿ Interrupted · What should Claude &lt;span class="k"&gt;do&lt;/span&gt; instead?
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;blockquote&gt;
&lt;p&gt;updated at: 2025-10-27
I only use glm4.6 for very simple tasks. In practical experience, minor issues frequently arise. For example, when using claude code, it is unable to update files. Here are some recent experiences using code agents.&lt;/p&gt;&lt;/blockquote&gt;
&lt;h2 id="model-comparison"&gt;Model Comparison
&lt;/h2&gt;&lt;p&gt;Based on my practical usage, GLM 4.6 is still slightly stronger than DeepSeek v3.2.&lt;/p&gt;
&lt;p&gt;For example, in a Next.js project, I configured &lt;code&gt;nextjs config -&amp;gt; baseUrl 192.168.2.14:8080&lt;/code&gt;. GLM 4.6 was able to recognize this pre-configured setting without explicit context, whereas DeepSeek v3.2 could not.&lt;/p&gt;
&lt;p&gt;However, GLM 4.6 is not superior in all aspects. When dealing with relatively ambiguous problems, DeepSeek v3.2 is more conservative and does not violate the constraints I set before task completion. In contrast, GLM 4.6 tends to ignore my constraints, makes bold modifications, and ends up breaking things.&lt;/p&gt;
&lt;h2 id="tools"&gt;Tools
&lt;/h2&gt;&lt;p&gt;Compared to using GLM 4.6 in Claude Code / Cline, the experience in Kilo Code is the best.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Kilo Code can read files in parallel, while CC can only read them one by one.&lt;/li&gt;
&lt;li&gt;Kilo Code enforces the generation of a plan, imposing more restrictions on the big model compared to CC.&lt;/li&gt;
&lt;li&gt;The visual interface is more user-friendly. I can directly ban Python commands (I need to execute &lt;code&gt;uv run&lt;/code&gt; instead of directly running Python commands).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;However, Kilo Code itself also has issues. It cannot use MCP servers of the &lt;code&gt;input; http&lt;/code&gt; type, which prevents the use of &lt;code&gt;web-search-prime&lt;/code&gt; on Kilo Code.&lt;/p&gt;
&lt;h2 id="related-reading"&gt;Related Reading
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://atbug.com/budget-efficiency-kilo-code-choice/" target="_blank" rel="noopener"
&gt;Limited Budget, Maximized Efficiency: Why Kilo Code Became My Preferred Coding Agent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://kilocode.ai/" target="_blank" rel="noopener"
&gt;Kilo Code&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item></channel></rss>