Opencode on Svtter's Blog

How kimi-code Handles kimi-k2.6: A Comparison with OpenCode

Wed, 27 May 2026 10:30:00 +0800

Recently, kimi-code migrated from Python to TypeScript. Here’s a quick analysis.

Based on my review of the kimi-code source code (particularly packages/kosong/src/providers/kimi.ts, kimi-schema.ts, kimi-files.ts, etc.) and relevant OpenCode compatibility issues, here are the kimi-k2.6-specific optimizations in kimi-code and how they differ from OpenCode.

1. Native Kimi Provider (Not a Generic OpenAI-compatible Layer)

kimi-code does not treat Kimi as “just another OpenAI-compatible endpoint.” Instead, it implements a dedicated kimi provider type:

Feature	kimi-code	OpenCode
Provider Type	Dedicated `'kimi'` type with independent adapter	Accessed via generic OpenAI/Anthropic bridge
Proprietary Fields	Native handling of `reasoning_content`, `thinking`, `generationKwargs`	`reasoning_content` often lost in the bridge layer
Auth Headers	Supports `kimiRequestHeaders`, `X-Msh-Tool-Call-Id`, and other Moonshot-specific headers	Generic header forwarding

2. Full Lifecycle Handling of `reasoning_content`

kimi-k2.6 has thinking enabled by default and requires reasoning_content to be preserved across multi-turn conversation history. Otherwise, tool calls will result in a 400 error.

How kimi-code handles it:

convertMessage: Extracts internal think content parts and serializes them into the reasoning_content field, ensuring thinking content is never lost in message history
Streaming Parser: Explicitly extracts delta.reasoning_content / message.reasoning_content in both _convertStreamResponse and _convertNonStreamResponse
TUI Rendering: A dedicated ThinkingComponent renders thinking content in real time, with expand/collapse support and a spinner animation

OpenCode’s Problem:

The OpenCode Go bridge drops reasoning_content on the second turn, causing the Moonshot API to return:

1

thinking is enabled but reasoning_content is missing in assistant tool call message

3. JSON Schema Normalization (`kimi-schema.ts`)

Moonshot’s tool parameter validator has strict and unique requirements for JSON Schema. This is one of the primary sources of incompatibility between OpenCode and kimi-k2.6.

What kimi-code’s normalizeKimiToolSchema does:

Dereferences $ref: Inlines definitions from $defs / definitions, eliminating external references
Fills in missing type: The Kimi validator rejects nested property schemas that omit type (e.g., MCP-generated enum-only schemas). kimi-code infers and backfills type: string/object/array, etc.
Circular reference detection: Preserves the original $ref when a circular reference is detected, avoiding infinite recursion

OpenCode’s Problem:

Generated schemas use #/definitions/ instead of the #/$defs/ format required by Moonshot, and lack schema type inference and backfilling for Kimi, causing complex tool calls to fail with 400.

4. Native Thinking Mode Configuration System

kimi-code has built-in support for Kimi’s thinking mode from the configuration layer all the way to the UI:

Config Parsing: ThinkingConfigSchema supports mode: auto/on/off and effort: low/medium/high/xhigh/max
Model Capability Tags: ModelAlias supports capabilities: ['thinking', 'always_thinking']
Model Selector UI: Press ←→ to toggle thinking on/off; always-on models cannot be turned off

Provider Method: withThinking(effort) correctly generates:

1
2
3
4


{
 "reasoning_effort": "high",
 "extra_body": { "thinking": { "type": "enabled" } }
}

Token Budget: Automatically normalizes legacy max_tokens to Kimi’s preferred max_completion_tokens

OpenCode’s Problem:

When using the Anthropic bridge, it hardcodes thinking content blocks, but the Kimi API only supports text/image_url/video_url/video, resulting in:

1

Invalid value: thinking. Supported values are: 'text','image_url','video_url' and 'video'.

5. Native Moonshot Service Integration

kimi-code includes Moonshot-exclusive services instead of relying on generic local implementations:

MoonshotFetchURLProvider: Prioritizes Moonshot’s coding-fetch service (with built-in page text extraction), falling back to local fetch only on failure
MoonshotWebSearchProvider: Calls the Moonshot search API directly, supporting enable_page_crawling
KimiFiles: Uploads videos to the Moonshot file service, returning video_url in the ms://<file-id> format

6. Tool Call Layer Details

Built-in Functions: Tool names starting with $ are recognized as Kimi builtin functions and serialized as type: 'builtin_function'
Usage Extraction: Supports Moonshot’s proprietary choices[0].usage placement, as well as cached_tokens and other fields
Finish Reason Mapping: Maps OpenAI-style stop/tool_calls/length values to an internal unified enum

7. CLI Core and LLM SDK Architectural Isolation

This is an easily overlooked but important architectural difference.

The core CLI of kimi-code (apps/kimi-code) does not directly depend on any OpenAI or Anthropic TypeScript SDK. Looking at its package.json, the core dependencies are only generic libraries like TUI rendering (pi-tui), CLI parsing (commander), and syntax highlighting (cli-highlight). All LLM provider interactions are isolated within the self-developed kosong package.

While packages/kosong internally uses openai and @anthropic-ai/sdk as implementation details (since the Kimi API is OpenAI-compatible), it exposes a unified LLM abstraction interface to the outside. The CLI core only depends on kosong and has no awareness of underlying vendor SDKs.

OpenCode is different. Its packages/opencode core package directly depends on a large number of vendor SDKs:

@ai-sdk/openai
@ai-sdk/anthropic
@ai-sdk/google
@ai-sdk/azure
@openrouter/ai-sdk-provider
… (more than a dozen provider-specific packages in total)

This means OpenCode’s core code is deeply coupled with each vendor’s SDK, while kimi-code’s core CLI stays clean, with all model interactions fully isolated through a self-developed abstraction layer.

8. What Commit History Reveals About Evolution Paths

The structural code differences above are just a static snapshot. What’s more interesting is comparing the commit histories of the two projects—their dynamic evolution directions are completely different.

kimi-code: Native Design, Continuously Reducing Configuration Burden

842e699 — “Kimi For Coding” (Initial Commit)

This was the starting point of the entire project. The initial code already included:

packages/kosong/src/providers/kimi.ts: Dedicated Kimi provider
packages/kosong/src/providers/kimi-schema.ts: Dedicated JSON Schema normalizer
packages/kosong/src/providers/kimi-files.ts: Dedicated file upload service

Conclusion: kimi-code treated the Kimi API as a first-class citizen from day one, not as a later patch.

d95b013 fix(catalog): preserve reasoning fields in custom model (#70)

This commit fixed a very subtle issue. models.dev uses the interleaved field to mark reasoning support, but early code treated interleaved=true as undefined, causing models selected via /connect to silently lose their reasoning capability.

Fixes:

interleaved=true is mapped to the default reasoning_content
interleaved is added to the update-catalog.mjs allowlist; otherwise the offline catalog in release builds would silently drop the field again

61f7d0e fix(kosong): make openai-compatible thinking work without reasoning_key (#78)

This is the core commit for reasoning handling, showcasing kimi-code’s deep thinking on compatibility. The diff reveals a three-layer design:

Inbound Auto-Scan (response parsing)

1
2


const KNOWN_REASONING_KEYS = ['reasoning_content', 'reasoning_details', 'reasoning'] as const;
// Auto-scan three fields; first string value wins

Outbound Default Write-Back (request serialization)

1
2


const DEFAULT_OUTBOUND_REASONING_KEY = KNOWN_REASONING_KEYS[0]; // 'reasoning_content'
// Defaults to writing back as reasoning_content, no user config needed

Auto-Inject reasoning_effort (historical continuity)

1
2


// When history contains ThinkPart but caller hasn't explicitly set reasoning_effort,
// auto-inject 'medium' to prevent strict gateways like One API / DeepSeek from returning 400

Edge cases are handled meticulously: blank reasoning_key ("") is normalized to undefined; values explicitly set by the caller via withGenerationKwargs are not silently overwritten by auto-injection.

The verification goal explicitly states:

Manually verified end-to-end against the real DeepSeek API with a hand-written config.toml that does not set reasoning_key: thinking content renders, no 400, multi-turn conversations work.

OpenCode: Generic Layer Design, OpenAI-centric

eb84f46 fix(llm): split OpenAI reasoning summary blocks (#29000)

This commit demonstrates OpenCode’s completely different approach to reasoning—designed around the OpenAI Responses API:

Maintains a state machine for encrypted_content and item_reference
Folds multiple summary parts by item_id + summary_index
When store:false, filters out reasoning items lacking encrypted_content

This is completely different from Kimi’s reasoning_content mechanism. Kimi does not need encrypted_content or item_reference; it simply attaches a reasoning_content field to the message.

A Hard Fact

OpenCode Issue #26331 “Bug: OpenCode Go bridge layer incompatible with kimi-k2.6 tool calls” — Status: still open
OpenCode Issue #27054 “KIMI K2.6 showing error in Opencode GO” — Status: closed, but the resolution was to disable MCP (a workaround)

The last comment on #27054:

The workaround is to disable your MCP and then initiate the session

That’s not a fix. That’s avoiding the problem.

Commit History Comparison Summary

Dimension	kimi-code	OpenCode
Initial Design	Initial commit includes full Kimi provider + schema normalizer + file service	Generic multi-model architecture, adapted later via bridge
Reasoning Mechanism	Designed around `reasoning_content` field, with auto-scan / write-back / effort injection	Designed around OpenAI Responses’ `encrypted_content` + `item_reference`
Schema Handling	Dedicated `normalizeKimiToolSchema`, dereferences `$ref` + backfills `type`	Generic schema validation, focused on friendly error messages
Config Philosophy	Makes OpenAI-compatible gateways “zero-config” by auto-inferring all fields	Relies on users manually adapting via bridge/config
Issue Status	Continuously shipping reasoning-related patches (#70, #78)	kimi-k2.6 compatibility issue #26331 still open

Summary: Core Differences

Dimension	kimi-code	OpenCode
Architecture Positioning	Native design for Kimi/Moonshot, dedicated provider	Generic multi-model agent, adapted via bridge
Thinking/Reasoning	Native support, full lifecycle preservation of `reasoning_content`	Easily lost in bridge layer, causing 400 errors
JSON Schema	Dedicated `normalizeKimiToolSchema` for dereferencing and type backfilling	Generic schema generation, does not meet Kimi validator requirements
API Format	Directly generates Moonshot-native format (including `thinking` config, `$defs` normalization, etc.)	Transformed through OpenAI/Anthropic protocol conversion, causing format mismatches
Service Integration	Built-in Moonshot fetch/search/file services	Uses generic local tools
Core Dependencies	CLI core does not directly depend on vendor SDKs; isolated via self-developed `kosong` package	Core package directly coupled with `@ai-sdk/openai` and more than a dozen other vendor SDKs

Looking at commit history, kimi-code’s evolution is directed at continuously eliminating user configuration burden (reasoning_key went from required → optional override → auto-inferred; interleaved went from filtered → correctly mapped), while OpenCode’s evolution is directed at deepening OpenAI ecosystem integration (Responses API, encrypted reasoning, item reference), leaving Kimi adaptation stuck at the generic bridge layer.

That’s the truth at the commit level: one is native evolution, the other is a bridge gap.

OpenCode's GitHub Actions Automation System: Engineering Practices Behind 27 Workflows

Fri, 22 May 2026 10:00:00 +0800

opencode is a 160k-star AI coding tool with 27 workflow files in its .github/workflows/ directory. This number is not uncommon for open source projects, but what’s truly interesting is not the quantity, but the scope these workflows cover: from conventional CI/CD to AI-driven community governance, they’ve done almost everything GitHub Actions can do.

This article analyzes the design of these workflows by category, discusses the pros and cons of this level of automation, and shares insights for our own projects.

Overview

The 27 workflows can be divided into four categories:

Category	Count	Purpose
CI/Testing	4	typecheck, unit tests, e2e, Nix builds
Release/Delivery	5	CLI release, container builds, VS Code extension, GitHub Action release
Automation/Bot	16	issue governance, PR compliance, AI code review, documentation updates
Docs/Other	2	statistics, Discord notifications

16 automation workflows account for 60% of the total. opencode doesn’t just use Actions to run tests and releases—it also entrusts community governance and code quality review to the automation system.

CI/Testing: Solid but Restrained

Four testing-related workflows:

typecheck.yml — Runs bun typecheck on PR and push to dev. Simple and direct, no unnecessary actions.

test.yml — Cross-platform test matrix (Linux + Windows), runs unit tests and Playwright e2e. Has concurrency control where new commits in the same PR cancel old runs. Test results generate JUnit reports uploaded as artifacts.

nix-eval.yml — Verifies Nix flake builds on four architectures (x86_64-linux, aarch64-linux, x86_64-darwin, aarch64-darwin). Mandatory package failures block the build, optional package failures are just warnings.

storybook.yml — Storybook builds for UI components, only triggered when storybook/ui-related files change. Path triggering avoids unnecessary runs.

Several noteworthy design choices:

concurrency group + cancel-in-progress: Multiple workflows use this pattern so the same PR doesn’t stack multiple runs. For a project receiving lots of community PRs, this saves significant CI resources.
Path triggering: containers.yml only runs when container files change, storybook.yml only runs when UI changes. Not everything runs on all commits.
Mixed Runner Strategy: Most workflows use Blacksmith’s third-party hosted runners (blacksmith-4vcpu-ubuntu-2404, blacksmith-4vcpu-windows-2025). Blacksmith is a GitHub Actions API-compatible accelerated runner service using custom infrastructure, significantly faster than GitHub’s free runners. Only lightweight bot tasks (close-issues, close-prs, compliance-close, pr-standards, deploy) stay on GitHub’s native ubuntu-latest. Compute-intensive compilation, testing, and releases all go through Blacksmith, simple script tasks use GitHub’s native runners, allocating resources by task load.

Release/Delivery: Full Platform Coverage

publish.yml is the most complex workflow, handling the complete release process in a single file:

Version number calculation
CLI build matrix (multi-platform, multi-architecture)
Windows code signing (Azure Signing)
macOS code signing (Apple Developer)
Electron app builds
npm publishing
GitHub Release creation
AUR (Arch Linux) publishing

One workflow covers distribution for CLI, desktop apps, npm packages, and Linux packages. This “release everywhere at once” pattern is user-friendly—regardless of platform, everyone gets the new version on the same day.

Other release workflows are split by artifact type:

publish-github-action.yml — Listens for github-v* tags, publishes GitHub Action to Marketplace
publish-vscode.yml — Listens for vscode-v* tags, publishes to both VS Code Marketplace and Open VSX
containers.yml — Multi-architecture container image builds, pushes to GHCR
release-github-action.yml — Creates pre-releases when github directory changes on dev branch

Tag triggering is a good practice: releases are explicit actions, not triggered by accidental code pushes. publish.yml automatically builds snapshots when pushing to ci/dev/beta/fix branches, but official releases require manual dispatch or tags.

Automation/Bot: AI-Driven Community Governance

This is opencode’s most distinctive feature. Among the 16 automation workflows, multiple directly call upon opencode’s own AI capabilities to handle community affairs.

Issue Management

triage.yml — When a new issue is created, opencode AI automatically triages it, adding labels and categories.

duplicate-issues.yml — When a new issue is created/edited, opencode AI analyzes whether it duplicates existing issues. Also checks whether it follows one of three issue templates and whether it contains AI-generated content. Non-compliant issues get a needs:compliance label.

compliance-close.yml — Every 30 minutes, checks issues/PRs with needs:compliance label and auto-closes if not fixed within 2 hours. Different prompt messages are given for issues vs PRs when closing.

close-issues.yml — Closes stale issues daily at 2 AM UTC.

These four layers form complete issue lifecycle management:

1

New issue → AI triage → duplicate/compliance check → compliance grace period → stale cleanup

PR Management

pr-standards.yml is one of the longest workflows, doing two things:

Title format check: Enforces conventional commits format (feat/fix/refactor/…)
Template compliance check: PR description must include required sections like issue references, change type, verification method

Non-compliant PRs get a needs:compliance label and auto-close after 2 hours. Team members and bots are exempt.

pr-management.yml — Checks for duplicates when PR is created, adds labels for community contributors.

close-prs.yml — Closes PRs older than 1 month with insufficient reactions daily at 10 PM UTC. Default threshold is 2 reactions, configurable.

AI Code Review

review.yml — Input /review in PR comments, opencode AI analyzes code and leaves review comments on specific lines. Only available to repo owner/members.

opencode.yml — Input /oc or /opencode in issue or PR comments to trigger opencode AI for more general interactions.

These two workflows demonstrate the “AI as collaborator” approach: not fully automatic code review, but on-demand triggering with humans making final decisions in the loop.

Documentation & Maintenance

docs-update.yml — Every 12 hours, checks recent commits and uses opencode AI to determine if documentation needs updates.

generate.yml — Runs code generation scripts when pushing to dev, auto-commits changes.

beta.yml — Syncs beta branch hourly.

stats.yml — Updates download statistics to STATS.md daily.

Design Patterns Worth Adopting

1. Layered Governance

opencode doesn’t stuff all automation into one workflow, but splits it by responsibility. An issue goes through four workflows in relay from creation to closure. Each workflow does one thing, combining to form a complete governance chain.

Benefits of this design:

Individual workflows can be modified or disabled independently without affecting other steps
Each workflow’s trigger conditions and permission scope are minimized
Easy to locate which step has problems when they occur

2. Compliance Grace Period

compliance-close.yml doesn’t close immediately upon detecting non-compliance, but gives a 2-hour grace period. This is reasonable for global contributors in different time zones—you might submit an issue while sleeping, and wake up with time to fix it.

3. AI at Decision Points, Not Execution Points

triage, duplicate detection, and code review all have AI make initial assessments, with humans making final decisions. But execution-level tasks like code builds and releases don’t use AI at all. This is a pragmatic division: AI excels at pattern recognition and initial classification, but not precise execution.

4. Explicit vs Automatic Triggers

Releases use tag triggers, maintenance uses schedule triggers, governance uses event triggers. Three trigger types correspond to three different automation trust levels: releases need human confirmation, maintenance can be scheduled automatic, governance needs immediate response.

Risks of Over-Automation

opencode’s automation system is comprehensive, but there are points to watch:

Community barrier: New contributors submitting issues must follow specific templates, PRs must conform to conventional commits, otherwise auto-closed after 2 hours. For a 160k-star project, this strictness is reasonable—it filters out many low-quality contributions. But for small projects, this level of automation would scare away potential contributors.

Maintenance cost: 27 workflows means 27 automation scripts to maintain. opencode has custom runners and dedicated scripts. If a workflow’s logic needs adjustment, maintainers need to switch between GitHub Actions YAML and custom scripts.

AI uncertainty: duplicate-issues and triage use AI for judgment, but AI can misjudge. A reasonable issue marked as duplicate and closed creates a negative experience for contributors. opencode uses grace periods and manual review to mitigate this, but the risk remains.

Insights for Our Projects

Not every project needs 27 workflows. But opencode’s layered governance and “AI at decision points” approach are worth referencing:

Start with issue templates: If the project starts receiving lots of duplicate or low-quality issues, add templates and duplicate checking first, rather than manually handling each one.
Use grace periods for compliance checks: Always give a grace period when auto-closing non-compliant contributions.
Use AI for classification, not execution: Let AI help triage issues and check PR formats, but don’t let AI auto-merge code or publish releases.
Use tag triggers for releases: This is the safest approach. Automatic snapshot releases are acceptable, official versions need human confirmation.
Add on demand: Add automation only when you have pain points. opencode’s 27 workflows weren’t built in a day, but gradually added as community scale grew.

Summary

opencode’s GitHub Actions system demonstrates automation practices for large-scale open source projects: CI/CD covers full platform releases, community governance uses multi-workflow relay processing, AI is applied to decision points like triage and review. The core of this system is not technical complexity, but three principles: “layered, grace periods, explicit triggers”. For our own projects, we don’t need to copy all 27 workflows, but these principles can be directly applied.

OpenCode Optimization Beyond Configuration — Plugin-Based Optimization

Tue, 19 May 2026 10:00:00 +0800

I previously wrote an article OpenCode Configuration Optimization Record, which addressed token consumption and context management issues. However, configuration optimization handles “how the model runs,” while “the quality of code when it’s half-written” is something configuration cannot manage. This article starts from my development process of the opencode-review plugin, discussing how opencode-review helps an agent review and improve its own code within a session, resulting in higher quality code entering the PR.

Problem: Who Guards Code Quality Within a Session?

When using OpenCode to write code, a typical workflow is: the agent completes coding within a session, then I review the diff and create a PR. But I discovered a recurring problem: code written by agents often enters PRs with “first draft” quality issues.

These issues include: missing error handling, security vulnerabilities, poorly performing queries, and missing tests. If the agent could perform a self-review within the session—before the code is committed to the PR—many problems wouldn’t exist at the PR stage.

This is different from code review at the CI stage. I’ve already implemented CI review through opencode-actions (I previously wrote an introductory article)—it happens after PR creation, triggered by GitHub Actions. Later, Cloudflare also shared similar ideas in their engineering blog: using OpenCode to build large-scale AI code review. opencode-review aims to solve an earlier stage: within the session, before the PR, enabling the agent to proactively review and fix issues after writing code. The two complement each other: opencode-review raises the quality baseline of code entering the PR, while opencode-actions serves as the final checkpoint.

Specifically, there are three sub-problems to address:

Incomplete review coverage: Code generated by agents may introduce security vulnerabilities and performance issues, but they won’t proactively check for these
Lack of systematic review framework: Without structured dimensions to evaluate code, it’s easy to focus only on functional correctness while ignoring security and performance
Lack of closed loop between issue discovery and fixes: Even when the agent discovers problems, a mechanism is needed to automatically fix them rather than waiting for someone to point them out

Design of opencode-review

Based on these three problems, I designed opencode-review: a structured code review plugin.

Multi-Dimensional Analysis

The first design decision is why divide into five dimensions rather than a general “good or bad” evaluation.

Code quality is not a single dimension. A piece of code may be functionally correct and performant, but contain SQL injection vulnerabilities; or it may be secure and harmless, but lack test coverage. Evaluating them together inevitably leads to vague results.

Academically, the Modern Code Review (MCR) Survey collected code review research from 2013-2025, proposing a classification system covering multiple task dimensions including defect detection, security review, performance analysis, and maintainability assessment. Ericsson’s research team also verified in Automated Code Review Using Large Language Models at Ericsson that dimension-specific review is more effective in industrial scenarios than general review.

opencode-review’s five dimensions—code-quality, security, performance, testing, documentation—correspond to the core review dimensions identified in these studies. Each dimension can be independently toggled because different projects focus on different priorities: an internal tool may not need documentation review, but a security-sensitive service cannot skip the security dimension.

Severity Grading

The second design decision is why divide into three severity levels (critical / suggestion / highlight).

This comes from lessons learned in the static analysis tool domain. Security tools and linters have long faced a problem: alert fatigue. When all issues are marked as equally important, developers start ignoring them. Veracode’s research points out that the direct consequence of alert fatigue is that truly serious issues get drowned out in noise.

The logic of three levels is:

critical: Must fix (security vulnerabilities, logic errors, resource leaks)
suggestion: Suggested improvements (code readability, performance optimization, better practices)
highlight: Worth noting (style consistency, potential improvement space)

This way developers can prioritize handling critical issues without missing a SQL injection among a bunch of “consider refactoring” suggestions.

Auto-Fix Chain

The third design decision is why critical issues should automatically trigger fixes rather than just being reported.

This is a controversial design. Traditional review tools typically “report but don’t fix,” leaving fixes to developers. But opencode-review’s scenario is different—the code it reviews is itself just written by an AI agent, so having another agent fix it is reasonable.

Academically, this belongs to the Automated Program Repair (APR) domain. A Survey of LLM-based Automated Program Repair (arXiv 2506.23749) reviewed 63 LLM-based APR systems from 2022-2025, divided into four paradigms. Among them, the “analysis-augmented” paradigm—using static analysis to locate problems first, then using LLMs to generate fixes—was proven most effective. opencode-review’s auto-fix chain is essentially this paradigm: reviewer discovers critical issue → locates problem position → spawns fixer sub-agent → generates minimal fix.

An ICSE 2025 paper also points out that a key challenge for LLMs in APR is objective alignment—the goal of fixing is not “generate code that looks reasonable,” but “precisely solve the reported problem.” This is why opencode-review’s fixer is designed as minimal fix—making only the minimal modifications to solve the problem, no rewriting, no refactoring, no “convenient” other changes.

Hidden Benefit of Auto-Review: Continuous Improvement of Code Quality Baseline

The three designs above solve “discovering problems” and “fixing problems.” But auto-review has an easily overlooked benefit: it continuously raises the baseline of code quality inadvertently.

This effect comes from two mechanisms:

First, the shaping of code writers by review feedback. FSE 2022 research found in two years of industrial practice that when developers know their code will be automatically reviewed, they consciously follow standards more during the coding phase—because the cost of being pointed out afterward becomes lower, and the benefit of writing well upfront becomes higher. This is a nudge effect. In the AI agent scenario, this effect is stronger: the agent writes code in a session, gets reviewed and pointed out issues, fixes them, gets reviewed again—this cycle can complete multiple rounds within the same session. Each round of feedback corrects the agent’s output tendency, equivalent to an implicit fine-tuning process.

Second, direct quality accumulation from automatic fixes. Critical issues being automatically fixed means the code quality of each commit is higher than without review. This isn’t a one-time improvement, but continuous. Like lint rules in a codebase—at first they only prohibit obvious errors, but as rules accumulate, the overall style and quality of the codebase is unconsciously raised. The auto-fix chain does something similar: security vulnerabilities are automatically patched, resource leaks are automatically fixed, missing tests are automatically added. Over time, the codebase’s quality baseline naturally becomes higher than without auto-review.

Simply put: review is not the goal, quality improvement is. Auto-review turns “post-hoc inspection” into “in-process improvement.”

Cooldown Mechanism

There’s one more design detail: cooldown_seconds.

auto-review triggers when the session is idle, but idle events can trigger frequently (for example, when the agent is waiting for user confirmation, it also idles). Without cooldown, the same code might be reviewed several times, wasting tokens. The default 120-second cooldown period is an empirical value—enough for one round of modifications to complete, without waiting too long.

opencode-froggy: Another Approach

opencode-froggy (85 Stars, just released 0.12.0 yesterday) provides another approach. It doesn’t do structured multi-dimensional review, but instead provides 6 specialized agents (architect, code-reviewer, code-simplifier, doc-writer, partner, rubber-duck) and a flexible hooks system.

Froggy’s code-reviewer is a general read-only review agent that doesn’t distinguish dimensions or severity. But its hooks system is strong—you can configure session.idle events to automatically run lint, auto-format, or even intercept when writing sensitive files:

1
2
3
4
5
6
7
8


---
hooks:
 - event: session.idle
 conditions: [hasCodeChange, isMainSession]
 actions:
 - bash: "npm run lint --fix"
 - command: simplify-changes
---

This is a “developer orchestrates the workflow” approach, complementing opencode-review’s “out-of-the-box structured review.”

Comparison

	opencode-review	opencode-froggy
Review method	Structured multi-dimensional analysis	General code-reviewer agent
Severity grading	critical / suggestion / highlight	None
Auto-fix	critical issue → fixer sub-agent	code-simplifier, manual trigger
Trigger method	session idle + cooldown	hooks configuration
Custom rules	custom_rules supports project norms	None
Other features	None	6 agents + hooks + gitingest + blockchain

The two don’t conflict and can be installed together. My suggestion is: opencode-review for daily auto-review, froggy’s hooks for workflow orchestration.

Plugin Installation

The two plugins have different installation methods.

opencode-froggy supports direct installation via npm, just add to opencode.json:

1
2
3


{
 "plugin": ["opencode-froggy"]
}

opencode-review currently doesn’t have npm installation available yet, requires cloning and local linking:

1
2
3
4
5
6
7
8
9


# Clone to any location
git clone https://github.com/sun-praise/opencode-review.git /path/to/opencode-review

# Project-level installation (recommended)
mkdir -p .opencode/plugins
ln -s /path/to/opencode-review/src/index.ts .opencode/plugins/opencode-review.ts

# Or global installation
ln -s /path/to/opencode-review/src/index.ts ~/.config/opencode/plugins/opencode-review.ts

opencode-review also needs to create .opencode/review.json to configure review behavior:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12


{
 "language": "zh",
 "dimensions": ["code-quality", "security", "performance", "testing", "documentation"],
 "trigger": {
 "auto_on_idle": true,
 "cooldown_seconds": 120
 },
 "custom_rules": [
 "All API endpoints must have error handling",
 "Database queries must use parameterized statements"
 ]
}

Other Notable Plugins

The ecosystem already has over 70 plugins, here are a few more recommendations:

opencode-worktree: Zero-friction git worktree management
opencode-notify: Send system notifications when tasks complete
dynamic-context-pruning: Automatically prune outdated tool outputs, optimizing token usage
envsitter-guard: Prevent agents from reading .env sensitive files

See the complete list at awesome-opencode.

References

Modern Code Review (MCR) Survey — 2013-2025 code review research survey
Automated Code Review Using LLMs at Ericsson — Industrial practice of LLM-assisted code review
A Survey of LLM-based Automated Program Repair — LLM auto-fix survey, covering 63 systems
Aligning the Objective of LLM-Based Program Repair (ICSE 2025) — Objective alignment issues in LLM fixing
Understanding Automated Code Review Process (FSE 2022) — Two years of industrial environment auto-review experience
AI-Assisted Assessment in Modern Code Review (AIware 2024) — Deployment and evaluation of AutoCommenter
Code Review Agent Benchmark (c-CRAB) — AI agent code review benchmark
opencode-actions - a coding review agent — GitHub Action built on OpenCode, code review at CI stage
Cloudflare: Orchestrating AI Code Review at Scale — Cloudflare using OpenCode to build large-scale AI review

How to Fix DeepSeek Model Reasoning Issues in OpenCode

Fri, 24 Apr 2026 12:23:58 +0800

When using deepseek-reasoner, we often encounter this problem:

1

The reasoning_content' in the thinking mode must be passed back to the API.

Update

Both issues have now been officially resolved by opencode. Users only need to install the latest version of opencode and use it through the deepseek provider, without additional configuration.

1
2
3
4
5
6


Issue 1
The reasoning_content' in the thinking mode must be passed back to the API.

Issue 2
Bad Request: {"error":{"message":"The content[].thinking in the thinking mode must be passed back to the
API.","type":"invalid_request_error","param":null,"code":"invalid_request_error"}}

Both issues have been officially resolved. Install version 1.14.29 or above.

The old solution follows:

How to solve it? It’s straightforward.

How to Configure

Add provider information to your configuration:

.config/opencode/opencode.json or .config/opencode/opencode.jsonc

Modify the provider section to:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40


{
 "provider": {
 "deepseek": {
 "npm": "@ai-sdk/anthropic",
 "name": "DeepSeek",
 "options": {
 "baseURL": "https://api.deepseek.com/anthropic",
 "apiKey": "<apikey>"
 },
 "models": {
 "deepseek-v4-pro": {
 "name": "DeepSeek-V4-Pro",
 "limit": {
 "context": 1048576,
 "output": 262144
 },
 "options": {
 "thinking": {
 "type": "enabled",
 "budgetTokens": 8192
 }
 }
 },
 "deepseek-v4-flash": {
 "name": "DeepSeek-V4-Flash",
 "limit": {
 "context": 1048576,
 "output": 262144
 },
 "options": {
 "thinking": {
 "type": "enabled",
 "budgetTokens": 8192
 }
 }
 }
 }
 }
 }
}

How to Use

Select the deepseek model.

The result.

Supplement

This method cannot solve this problem

Bad Request: {"error":{"message":"The content[].thinking in the thinking mode must be passed back to the API.","type":"invalid_request_error","param":null,"code":"invalid_request_error"}}

If you encounter this problem, you need to wait for opencode to fix it.

Related article: DeepSeek + Claude Code: Thinking Block Compatibility Issue Analysis — Analyzes the root cause of 400 errors triggered by multi-turn conversations in extended thinking mode when using DeepSeek with Claude Code, along with community solutions.

opencode-actions - A Code Review Agent

Thu, 23 Apr 2026 11:36:34 +0800

To make it easier to integrate opencode for code review, I built a GitHub Action repository. Working with opencode to implement this was straightforward.

Currently, it provides two main features: one is review, and the other is using the runner to execute opencode (directly running opencode’s prompts on the runner) to handle other functionalities. For example, modifying code, creating new issues, creating PRs based on issues, etc.

How stable is it?

This repository has been validated across multiple projects, and the release version is reliable.
However, note that the main branch version is a rapidly iterating version.

How to integrate?

Add the following to your .github/workflows/opencode-review.yml:

1
2
3
4
5
6
7
8


- name: Run OpenCode review
 uses: sun-praise/opencode-actions/review@v1
 with:
 github-token: ${{ secrets.GITHUB_TOKEN }}

 # only one is enough.
 zhipu-api-key: ${{ secrets.ZHIPU_API_KEY }}
 opencode-go-api-key: ${{ secrets.OPENCODE_GO_API_KEY }}

Currently, this action mainly supports Z.AI, ZHIPU, and OPENCODE GO subscriptions. Therefore, if using ZHIPU, simply add your ZHIPU_API_KEY to the project’s secrets. If using the opencode go subscription, you need to add OPENCODE_GO_API_KEY.

Everything else can use the default configuration. The default model is zhipuai-coding-plan/glm-5-turbo. For more configuration requirements, I recommend checking the original repository’s README.

I previously covered this quick review script in my code review article.

Differences

Actually, opencode has its own actions, so why did I build another one?

The differences from the official version are mainly reflected in several aspects:

Feature	Upstream Status	This Repository
Model default fallback	Only required input	Three-level fallback (input → MODEL_NAME → hardcoded default)
Provider convenience fields	None	`zhipu-api-key`, `opencode-go-api-key`, etc.
Review prompt template	None	Chinese-formatted review (mergeable/conditionally mergeable/not mergeable)
Retry logic	None	attempts / retry-profile / retry-on-regex / retry-delay
Execution timeout	None	timeout-seconds
Version check	None	OPENCODE_MIN_VERSION
Installation retry	None	install-attempts
XDG cache caching	Only caches bin	Caches both bin + XDG cache

Detailed explanations are available in sun-praise/opencode-actions#29.

Review effectiveness

You can see the results from the repo’s own PR at opencode-actions#30

The effect looks like this:

Future

Try integrating gemini cli. Google’s GEM 3.1 PRO model currently offers great value for money, with the highest intelligence per unit.
Integrate MCP plugin functionality. If MCP is available during opencode review, it may bring better review results.
Integration of commercial plugin features

Recently Discovered Useful MCP Tools

Sat, 17 Jan 2026 22:18:33 +0800

Recently, I’ve been extensively using the opencode/claude code combination for development and have explored three particularly useful tools.

They address several issues:

Parallel development on a single server; controlling tmux: tmux and tmux-mcp
Preventing claude code from stopping at meaningless points: ralph-loop
End-to-end automated testing: playwright mcp

Tool List

tmux mcp

First, configure tmux in the Linux environment with opencode, then have opencode install https://github.com/rinadelph/tmux-mcp.git. Once installed, you can use oc to control tmux content.

This method can be used to reactivate stopped opencode sessions. For example, you can open multiple tmux sessions and have one opencode monitor, start, and stop tasks through the tmux tool.

ralph-loop

Ralph is an autonomous AI agent loop that repeatedly runs Amp until all PRD items are completed. Each iteration creates a brand new Amp instance with a clean context.

Ralph likely originated from here: https://github.com/snarktank/ralph

Due to its effectiveness (which actually occurred after further improvements in model performance), it was also introduced to claude code.

Ralph-loop is a Claude Code plugin that allows Claude Code to automatically restart when tasks are completed, forming a loop execution mechanism. This is particularly useful for tasks that require continuous improvement or iteration.

Installation Method

Install through Claude Code official plugin market:
- /plugin install ralph-wiggum@claude-plugins-official or cc '/plugin install ralph-wiggum@claude-plugins-official'
Configuration and Usage:
- After installation, you can start it in Claude Code via the /ralph-loop command
- Set tasks and termination conditions, Claude Code will automatically restart each time it stops
- This is particularly useful for scenarios requiring multiple iterations of code improvement, debugging, or testing

Use Cases

Code Refactoring: Have Claude Code continuously improve code quality
Test-Driven Development: Write tests, then have Claude Code continuously improve implementations
Debugging Loops: Automatically restart debugging sessions
Continuous Integration: Simulate CI/CD processes locally

The drawback of this plugin is that it consumes a lot of tokens; without a max20 subscription, it’s better not to use it. However, for tasks requiring high-quality output, this tool can significantly improve work efficiency.

playwright mcp

This plugin can launch browsers to complete end-to-end testing or write end-to-end test code. It can better form loops to have cc or oc improve code.

Installation method: claude 'help me install playwright mcp'

Rewriting as Agents

I recommend directly rewriting these tools and MCPs as agents through opencode or claude code.

Compared to skills commands, these tools are more suitable for invocation through agents. Agent context is very clean, making tool invocation almost inevitable.

Summary

As LLMs become increasingly powerful, numerous MCPs that rely on LLM capabilities naturally gain improvements. Tools that weren’t very useful before become more effective. This aligns with the saying: “Don’t build things that become meaningless after large model capability enhancements.” Large model capabilities continue to improve, and prices keep decreasing.

I believe the next step is to bridge interactions between different modalities and tools, as well as endowing tools with large model capabilities, which is one of the inevitable development directions for agent engineers.

Efficient and Cost-Effective: My AI Agent Workflow Choice

Mon, 05 Jan 2026 16:00:00 +0800

Claude Code’s $100/month price tag is a bit steep for many. To address this, I’ve been experimenting with a more practical and affordable workflow.

In terms of models, my recommendation is to use Gemini 3 Flash on an as-needed (pay-as-you-go) basis as a replacement.

Why? Gemini 3 Flash offers incredible value. It’s fast, efficient, and costs a fraction of what you’d pay for Opus or Sonnet. For the vast majority of tasks, Flash is more than enough.

The Cost-Saving Workflow

Here is my current “budget” workflow:

Planning & Proposals: Use Gemini 3 Flash.
Execution & Building: Use the free GLM 4.7 (or MiniMax M2.1) via OpenCode. If you have a Zhipu Coding Plan, that works perfectly too.

Speaking of Gemini 3, we have to talk about GPT-5.2.

Many engineers still rely on ChatGPT.com directly instead of using a proper coding agent. Regardless of the efficiency debate, the reliability is concerning. From my experience, GPT-5.2’s default tone has been tuned to be overly “people-pleasing,” which might not be ideal for professional developers seeking direct technical feedback.

Furthermore, while GPT-5.2 scored impressively on SWE-bench Verified, my real-world experience has been mixed. It’s worth looking at the history of SWE-bench:

Originally proposed by a team from Princeton University (ICLR 2024), it evaluates a model’s ability to solve real GitHub issues. However, in August 2024, OpenAI’s Preparedness team collaborated with the original authors to create SWE-bench Verified (a subset of 500 manually verified issues). Since OpenAI was involved in the design of this benchmark, their models’ performance on it should be taken with a grain of salt. While not necessarily a deliberate manipulation, the risk of inherent bias is significant.

Ultimately, as I often say, “Codex” models don’t always deliver the most practical results in everyday coding.

OpenCode Tips

Leveraging Agents: OpenCode supports launching SubAgents. When debugging complex projects, you can have OpenCode launch agents in different directories to handle front-end and back-end tasks separately, which also helps avoid permission issues.
OpenSpec: Cross-Agent Collaboration:
1 2 3 4

1. OpenCode + Gemini 3 Flash → Generate proposal 2. Codex → Code Review 3. Claude Code → Secondary Review 4. OpenSpec Apply → Final Execution
OpenSpec generates reliable specs, but sometimes cheaper models produce lower-quality code. In such cases, you can generate multiple times using the spec and select the best result.

Final Thoughts

As AI Agent engineers, we need to adapt to these ongoing trends:

Models are becoming smarter.
Execution is becoming faster.
Prices are dropping.

While these trends are promising, we still need to balance speed, cost, and quality for every task. We might soon see agent systems that automate this balancing act, but for now, it’s a crucial part of the engineer’s role.

Opencode on Svtter's Blog

How kimi-code Handles kimi-k2.6: A Comparison with OpenCode

1. Native Kimi Provider (Not a Generic OpenAI-compatible Layer)

2. Full Lifecycle Handling of reasoning_content

3. JSON Schema Normalization (kimi-schema.ts)

4. Native Thinking Mode Configuration System

5. Native Moonshot Service Integration

6. Tool Call Layer Details

7. CLI Core and LLM SDK Architectural Isolation

8. What Commit History Reveals About Evolution Paths

kimi-code: Native Design, Continuously Reducing Configuration Burden

OpenCode: Generic Layer Design, OpenAI-centric

A Hard Fact

Commit History Comparison Summary

Summary: Core Differences

OpenCode's GitHub Actions Automation System: Engineering Practices Behind 27 Workflows

Overview

CI/Testing: Solid but Restrained

Release/Delivery: Full Platform Coverage

Automation/Bot: AI-Driven Community Governance

Issue Management

PR Management

AI Code Review

Documentation & Maintenance

Design Patterns Worth Adopting

1. Layered Governance

2. Compliance Grace Period

3. AI at Decision Points, Not Execution Points

4. Explicit vs Automatic Triggers

Risks of Over-Automation

Insights for Our Projects

Summary

OpenCode Optimization Beyond Configuration — Plugin-Based Optimization

Problem: Who Guards Code Quality Within a Session?

Design of opencode-review

Multi-Dimensional Analysis

Severity Grading

Auto-Fix Chain

Hidden Benefit of Auto-Review: Continuous Improvement of Code Quality Baseline

Cooldown Mechanism

opencode-froggy: Another Approach

Comparison

Plugin Installation

Other Notable Plugins

References

How to Fix DeepSeek Model Reasoning Issues in OpenCode

Update

How to Configure

How to Use

Supplement

opencode-actions - A Code Review Agent

How stable is it?

How to integrate?

Differences

Review effectiveness

Future

Recently Discovered Useful MCP Tools

Tool List

tmux mcp

ralph-loop

Installation Method

Use Cases

playwright mcp

Rewriting as Agents

Summary

Efficient and Cost-Effective: My AI Agent Workflow Choice

The Cost-Saving Workflow

OpenCode Tips

Final Thoughts

2. Full Lifecycle Handling of `reasoning_content`

3. JSON Schema Normalization (`kimi-schema.ts`)