<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Agent on Svtter's Blog</title><link>https://svtter.cn/en/tags/agent/</link><description>Recent content in Agent on Svtter's Blog</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Sat, 17 Jan 2026 22:18:33 +0800</lastBuildDate><atom:link href="https://svtter.cn/en/tags/agent/index.xml" rel="self" type="application/rss+xml"/><item><title>Recently Discovered Useful MCP Tools</title><link>https://svtter.cn/en/p/recently-discovered-useful-mcp-tools/</link><pubDate>Sat, 17 Jan 2026 22:18:33 +0800</pubDate><guid>https://svtter.cn/en/p/recently-discovered-useful-mcp-tools/</guid><description>&lt;img src="https://svtter.cn/p/%E6%9C%80%E8%BF%91%E5%8F%91%E7%8E%B0%E5%A5%BD%E7%94%A8%E7%9A%84-mcp-%E5%B7%A5%E5%85%B7/bg.png" alt="Featured image of post Recently Discovered Useful MCP Tools" /&gt;&lt;p&gt;Recently, I&amp;rsquo;ve been extensively using the opencode/claude code combination for development and have explored three particularly useful tools.&lt;/p&gt;
&lt;p&gt;They address several issues:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Parallel development on a single server; controlling tmux: &lt;code&gt;tmux&lt;/code&gt; and &lt;code&gt;tmux-mcp&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Preventing claude code from stopping at meaningless points: &lt;code&gt;ralph-loop&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;End-to-end automated testing: &lt;code&gt;playwright mcp&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="tool-list"&gt;Tool List
&lt;/h2&gt;&lt;h3 id="tmux-mcp"&gt;tmux mcp
&lt;/h3&gt;&lt;p&gt;First, configure tmux in the Linux environment with opencode, then have opencode install &lt;a class="link" href="https://github.com/rinadelph/tmux-mcp.git" target="_blank" rel="noopener"
&gt;https://github.com/rinadelph/tmux-mcp.git&lt;/a&gt;. Once installed, you can use opencode (abbreviated oc below) to control what runs inside tmux.&lt;/p&gt;
&lt;p&gt;This method can be used to reactivate stopped opencode sessions. For example, you can open multiple tmux sessions and have one opencode monitor, start, and stop tasks through the tmux tool.&lt;/p&gt;
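&lt;p&gt;As a rough illustration of what controlling tmux amounts to under the hood, the sketch below builds plain &lt;code&gt;tmux&lt;/code&gt; CLI invocations from Python. It is not how tmux-mcp is implemented; the helper names and the session name &lt;code&gt;worker-1&lt;/code&gt; are made up for this example.&lt;/p&gt;

```python
import shutil
import subprocess


def new_session_cmd(session: str) -> list[str]:
    """Command that starts a detached tmux session."""
    return ["tmux", "new-session", "-d", "-s", session]


def send_keys_cmd(session: str, command: str) -> list[str]:
    """Command that types `command` into the session and presses Enter."""
    return ["tmux", "send-keys", "-t", session, command, "Enter"]


def capture_pane_cmd(session: str) -> list[str]:
    """Command that prints the visible contents of the session's pane."""
    return ["tmux", "capture-pane", "-p", "-t", session]


def run(cmd: list[str]) -> str:
    """Run a tmux command and return its stdout (requires tmux on PATH)."""
    if shutil.which("tmux") is None:
        raise RuntimeError("tmux is not installed")
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

&lt;p&gt;For instance, &lt;code&gt;run(new_session_cmd("worker-1"))&lt;/code&gt; followed by &lt;code&gt;run(send_keys_cmd("worker-1", "opencode"))&lt;/code&gt; would start a session and launch opencode inside it, and &lt;code&gt;run(capture_pane_cmd("worker-1"))&lt;/code&gt; would read back what is on screen.&lt;/p&gt;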
&lt;h3 id="ralph-loop"&gt;ralph-loop
&lt;/h3&gt;&lt;p&gt;Ralph is an autonomous AI agent loop that repeatedly runs Amp until all PRD items are completed. Each iteration creates a brand new Amp instance with a clean context.&lt;/p&gt;
&lt;p&gt;Ralph likely originated from here: &lt;a class="link" href="https://github.com/snarktank/ralph" target="_blank" rel="noopener"
&gt;https://github.com/snarktank/ralph&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Because it works well (which only became true after models improved further), it was also brought to Claude Code.&lt;/p&gt;
&lt;script src="https://svtter.cn/js/repo-card.js"&gt;&lt;/script&gt;
&lt;div class="repo-card" data-repo="frankbria/ralph-claude-code"&gt;&lt;/div&gt;
&lt;p&gt;Ralph-loop is a Claude Code plugin that allows Claude Code to automatically restart when tasks are completed, forming a loop execution mechanism. This is particularly useful for tasks that require continuous improvement or iteration.&lt;/p&gt;
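&lt;p&gt;The loop idea can be sketched in a few lines. This is only a conceptual model, not the plugin&amp;rsquo;s code: &lt;code&gt;run_agent&lt;/code&gt; is a hypothetical callable standing in for one fresh agent run, and the &lt;code&gt;DONE&lt;/code&gt; marker and the iteration cap are assumptions made for the example.&lt;/p&gt;

```python
from typing import Callable


def ralph_loop(
    run_agent: Callable[[str], str],
    task: str,
    done_marker: str = "DONE",
    max_iterations: int = 10,
) -> int:
    """Re-run the agent with the same task until it reports completion.

    Returns the number of iterations used. `run_agent` is a hypothetical
    stand-in for starting one fresh agent instance with a clean context.
    """
    for iteration in range(1, max_iterations + 1):
        output = run_agent(task)  # each call starts from a clean context
        if done_marker in output:
            return iteration
    # Hard cap so a task that never finishes cannot burn tokens forever.
    raise RuntimeError(f"not finished after {max_iterations} iterations")


# Fake agent for demonstration: reports completion on its third run.
calls = {"n": 0}


def fake_agent(task: str) -> str:
    calls["n"] += 1
    return "DONE" if calls["n"] == 3 else "still working"


iterations_used = ralph_loop(fake_agent, "make all tests pass")
```

&lt;p&gt;The explicit iteration cap matters in practice: every extra pass costs tokens, so a loop like this should always have a termination condition besides the agent&amp;rsquo;s own report.&lt;/p&gt;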
&lt;h4 id="installation-method"&gt;Installation Method
&lt;/h4&gt;&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Install through Claude Code official plugin market&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;/plugin install ralph-wiggum@claude-plugins-official&lt;/code&gt; or &lt;code&gt;cc '/plugin install ralph-wiggum@claude-plugins-official'&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Configuration and Usage&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;After installation, you can start it in Claude Code via the &lt;code&gt;/ralph-loop&lt;/code&gt; command&lt;/li&gt;
&lt;li&gt;Set a task and a termination condition; Claude Code will then automatically restart each time it stops&lt;/li&gt;
&lt;li&gt;This is particularly useful for scenarios requiring multiple iterations of code improvement, debugging, or testing&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h4 id="use-cases"&gt;Use Cases
&lt;/h4&gt;&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Code Refactoring&lt;/strong&gt;: Have Claude Code continuously improve code quality&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Test-Driven Development&lt;/strong&gt;: Write tests, then have Claude Code continuously improve implementations&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Debugging Loops&lt;/strong&gt;: Automatically restart debugging sessions&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Continuous Integration&lt;/strong&gt;: Simulate CI/CD processes locally&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The drawback of this plugin is that it consumes a lot of tokens; without a Max 20x subscription, it&amp;rsquo;s better to avoid it. However, for tasks requiring high-quality output, this tool can significantly improve work efficiency.&lt;/p&gt;
&lt;h3 id="playwright-mcp"&gt;playwright mcp
&lt;/h3&gt;&lt;p&gt;This plugin can launch a browser to run end-to-end tests or to write end-to-end test code. It makes it easier to form a feedback loop in which cc (Claude Code) or oc (opencode) keeps improving the code.&lt;/p&gt;
&lt;p&gt;Installation method: &lt;code&gt;claude 'help me install playwright mcp'&lt;/code&gt;&lt;/p&gt;
&lt;h2 id="rewriting-as-agents"&gt;Rewriting as Agents
&lt;/h2&gt;&lt;p&gt;I recommend directly rewriting these tools and MCPs as agents through opencode or claude code.&lt;/p&gt;
&lt;p&gt;Compared to skills commands, these tools are more suitable for invocation through agents. &lt;a class="link" href="https://svtter.cn/p/why-agent/#llm-%E7%9A%84%E9%97%AE%E9%A2%98" target="_blank" rel="noopener"
&gt;Agent context is very clean&lt;/a&gt;, so the agent almost always ends up invoking the tool.&lt;/p&gt;
&lt;h2 id="summary"&gt;Summary
&lt;/h2&gt;&lt;p&gt;As LLMs become increasingly powerful, numerous MCPs that rely on LLM capabilities naturally gain improvements. Tools that weren&amp;rsquo;t very useful before become more effective. This aligns with the saying: &amp;ldquo;Don&amp;rsquo;t build things that become meaningless after large model capability enhancements.&amp;rdquo;
Large model capabilities continue to improve, and prices keep decreasing.&lt;/p&gt;
&lt;p&gt;I believe the next step is to bridge interactions between different modalities and tools, and to give the tools themselves large model capabilities. I see this as one of the inevitable directions of development for agent engineers.&lt;/p&gt;</description></item><item><title>Why Agent</title><link>https://svtter.cn/en/p/why-agent/</link><pubDate>Tue, 30 Sep 2025 11:54:06 +0800</pubDate><guid>https://svtter.cn/en/p/why-agent/</guid><description>&lt;img src="https://svtter.cn/p/why-agent/pics/why-agent-background.svg" alt="Featured image of post Why Agent" /&gt;&lt;p&gt;I&amp;rsquo;ve always had a question: Why do we need agent frameworks? Aren&amp;rsquo;t large models enough on their own? This article reflects my current understanding of the subject.&lt;/p&gt;
&lt;p&gt;After using several tools extensively and participating in multiple agent projects recently, I&amp;rsquo;ve reached some conclusions.&lt;/p&gt;
&lt;h2 id="the-limitations-of-llms"&gt;The Limitations of LLMs
&lt;/h2&gt;&lt;p&gt;The primary reason for using agents is the inherent limitations of LLMs.&lt;/p&gt;
&lt;p&gt;First and foremost is the &lt;strong&gt;context window&lt;/strong&gt;, as explicitly mentioned in &lt;a class="link" href="https://docs.langchain.com/oss/python/deepagents/subagents#why-use-subagents%3F" target="_blank" rel="noopener"
&gt;langchain/subagent&lt;/a&gt;. Although many modern models have significantly expanded context windows (GPT-4 Turbo 128K, Claude-3.5 Sonnet 200K, Gemini-1.5 Pro up to 2M), they are still insufficient for truly complex tasks. For example, processing a massive codebase or analyzing hundreds of documents quickly exhausts these limits. Furthermore, processing extremely long contexts is both expensive and slow.&lt;/p&gt;
&lt;p&gt;Beyond context, there are other capability gaps:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Vision Capabilities&lt;/strong&gt;: While modern VLMs (Vision Language Models) are powerful, traditional CV (Computer Vision) models often perform better in specific scenarios. Additionally, some models (like DeepSeek-V3) don&amp;rsquo;t have native vision capabilities.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Resource Access&lt;/strong&gt;: LLMs cannot directly interact with databases, file systems, or network services.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Specialized Tools&lt;/strong&gt;: Tools for code execution, complex mathematics, or data analysis require protocols like MCP to be accessible to an LLM.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="what-agents-can-do"&gt;What Agents Can Do
&lt;/h2&gt;&lt;p&gt;Beyond addressing the limitations above, here are some practical ways agents add value.&lt;/p&gt;
&lt;h3 id="domain-specific-text-processing"&gt;Domain-Specific Text Processing
&lt;/h3&gt;&lt;p&gt;Agents can process different text segments (contexts) independently.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Context Optimization&lt;/strong&gt;: Agents can compress or selectively provide context, effectively extending the usable context window.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Performance Gains&lt;/strong&gt;: An LLM within an agent can focus on a single, specific task, leading to better performance. When given too much text, LLMs often struggle to identify key information; smaller, targeted context makes this much easier.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Specialized Knowledge&lt;/strong&gt;: LLMs are trained on general data. To make an agent a domain expert, we can inject specific knowledge directly into its context.&lt;/li&gt;
&lt;/ol&gt;
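&lt;p&gt;A minimal sketch of the context-optimization idea: score each chunk by keyword overlap with the task and keep the best ones within a budget. The scoring rule, the character budget, and the sample documents are simplifying assumptions; real systems usually rely on embeddings or summarization instead.&lt;/p&gt;

```python
def select_context(chunks: list[str], task: str, budget_chars: int) -> list[str]:
    """Keep the chunks most relevant to `task` without exceeding the budget."""
    task_words = set(task.lower().split())

    def score(chunk: str) -> int:
        # Relevance = number of task words that also appear in the chunk.
        return len(task_words.intersection(chunk.lower().split()))

    selected: list[str] = []
    used = 0
    # sorted() is stable, so equally scored chunks keep their original order.
    for chunk in sorted(chunks, key=score, reverse=True):
        if score(chunk) == 0:
            break  # remaining chunks share no words with the task
        if used + len(chunk) > budget_chars:
            continue  # too big for the remaining budget, try the next one
        selected.append(chunk)
        used += len(chunk)
    return selected


docs = [
    "billing module handles invoices",
    "auth module handles login and tokens",
    "readme with project history",
]
picked = select_context(docs, "fix login bug in auth", budget_chars=60)
```

&lt;p&gt;Here only the auth-related chunk is passed on, so the model inside the agent sees a small, targeted context instead of every document.&lt;/p&gt;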
&lt;h3 id="visual-capability-integration"&gt;Visual Capability Integration
&lt;/h3&gt;&lt;p&gt;Through agents, we can integrate traditional vision models to handle tasks that LLMs struggle with. For example, an MCP (Model Context Protocol) server can give an agent access to vision capabilities.&lt;/p&gt;
&lt;p&gt;A notable example is &lt;a class="link" href="https://docs.bigmodel.cn/cn/coding-plan/mcp/vision-mcp-server" target="_blank" rel="noopener"
&gt;Zhipu&amp;rsquo;s Vision MCP&lt;/a&gt;. Using this MCP in conjunction with an agent significantly enhances visual processing power. This highlights the value of MCP servers that integrate specialized services.&lt;/p&gt;
&lt;h2 id="further-reading"&gt;Further Reading
&lt;/h2&gt;&lt;blockquote class="twitter-tweet"&gt;&lt;p lang="zh" dir="ltr"&gt;What people often call an Agent is, in many cases, really just a Workflow. Conflating the two concepts leads to many detours in product design and technology selection.&lt;br&gt;&lt;br&gt;Anthropic gives a very clear distinction. The core difference is:&lt;br&gt;when the system carries out a task, is the path preset by code (Code-Driven), or does the LLM dynamically decide the next step itself (LLM-Driven)? The former is a Workflow; only the latter is…&lt;/p&gt;&amp;mdash; 一泽Eze (@eze_is_1) &lt;a href="https://twitter.com/eze_is_1/status/1982740850070425826?ref_src=twsrc%5Etfw"&gt;October 27, 2025&lt;/a&gt;&lt;/blockquote&gt; &lt;script async src="https://platform.twitter.com/widgets.js" charset="utf-8"&gt;&lt;/script&gt;
&lt;p&gt;Agents and workflows allow LLMs to use tools. While the input and output remain text, the nature of what that text represents has changed. The creator of the text is no longer necessarily a human.&lt;/p&gt;
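&lt;p&gt;The code-driven versus LLM-driven distinction from the quoted tweet can be made concrete with a toy sketch. Both functions use the same tools; in the workflow the code fixes the order of steps, while in the agent a &lt;code&gt;decide&lt;/code&gt; function (a hypothetical stand-in for an LLM call) picks the next tool at each step. The tool names are invented for the example.&lt;/p&gt;

```python
from typing import Callable, Optional

# Toy tools; in a real system these might be exposed over MCP.
TOOLS: dict[str, Callable[[str], str]] = {
    "upper": lambda text: text.upper(),
    "exclaim": lambda text: text + "!",
}


def run_workflow(text: str) -> str:
    """Code-driven: the sequence of steps is hard-coded."""
    for name in ("upper", "exclaim"):
        text = TOOLS[name](text)
    return text


def run_agent(
    text: str,
    decide: Callable[[str], Optional[str]],
    max_steps: int = 5,
) -> str:
    """LLM-driven: `decide` chooses the next tool, or None to stop."""
    for _ in range(max_steps):
        name = decide(text)
        if name is None:
            break
        text = TOOLS[name](text)
    return text


def fake_llm(text: str) -> Optional[str]:
    """Stand-in for a model: add one exclamation mark, then stop."""
    return None if text.endswith("!") else "exclaim"


workflow_result = run_workflow("hello")
agent_result = run_agent("hello", fake_llm)
```

&lt;p&gt;In both cases the tools only see and return text, but in the agent version the text that selects the next step is produced by the model rather than written by a person.&lt;/p&gt;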
&lt;h2 id="agent-frameworks"&gt;Agent Frameworks
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://ai.pydantic.dev/" target="_blank" rel="noopener"
&gt;Pydantic AI&lt;/a&gt;: I find this particularly useful because it integrates Pydantic models into the agent framework, making it much easier to debug. I&amp;rsquo;ve tested its integration with &lt;a class="link" href="https://ai.pydantic.dev/" target="_blank" rel="noopener"
&gt;Qwen3&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://www.langchain.com/" target="_blank" rel="noopener"
&gt;LangChain&lt;/a&gt;: I haven&amp;rsquo;t used this in production, only for basic debugging. The API changes frequently, which can be challenging. One minor issue is prompt handling; &lt;a class="link" href="https://svtter.cn/p/string-template-in-prompt.md/" &gt;I used Jinja to solve this&lt;/a&gt;. Alternatively, the &amp;ldquo;LangChain way&amp;rdquo; involves using &lt;a class="link" href="https://python.langchain.com/docs/concepts/prompt_templates/#string-prompttemplates" target="_blank" rel="noopener"
&gt;PromptTemplates&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;</description></item></channel></rss>