Recently Discovered Useful MCP Tools

Sat, 17 Jan 2026 22:18:33 +0800

Recently, I’ve been using opencode and Claude Code together extensively for development, and I’ve found three particularly useful tools.

They address several issues:

Parallel development on a single server by controlling tmux: tmux and tmux-mcp
Keeping Claude Code from stopping at meaningless points: ralph-loop
End-to-end automated testing: playwright mcp

Tool List

tmux mcp

First, configure tmux in the Linux environment with opencode, then have opencode install https://github.com/rinadelph/tmux-mcp.git. Once it is installed, you can use opencode (abbreviated oc below) to control what runs inside tmux.

This method can be used to reactivate stopped opencode sessions. For example, you can open multiple tmux sessions and have one opencode instance monitor, start, and stop tasks in the others through the tmux tool.
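The monitor/start/stop pattern above can be sketched with the plain tmux CLI, which is presumably what a tool like tmux-mcp wraps. This is a minimal sketch, not tmux-mcp's actual interface: the helper names and the session name "worker" are hypothetical, and only the standard tmux subcommands new-session, send-keys, and capture-pane are assumed.

```python
import shlex
import subprocess


def tmux_cmd(*args: str) -> list[str]:
    """Build the argv for one tmux invocation."""
    return ["tmux", *args]


def start_session(name: str, command: str) -> list[str]:
    # -d starts the session detached, -s gives it a name we can target later.
    return tmux_cmd("new-session", "-d", "-s", name, command)


def send_line(name: str, text: str) -> list[str]:
    # send-keys types the text into the target session; "Enter" presses return.
    return tmux_cmd("send-keys", "-t", name, text, "Enter")


def capture_output(name: str) -> list[str]:
    # capture-pane -p prints the visible pane contents to stdout.
    return tmux_cmd("capture-pane", "-p", "-t", name)


def run(argv: list[str], dry_run: bool = True) -> str:
    """Return the shell-quoted command (dry run) or execute it and return stdout."""
    if dry_run:
        return shlex.join(argv)
    return subprocess.run(argv, capture_output=True, text=True, check=True).stdout


# Dry run: show what a supervising agent would execute (hypothetical session "worker").
print(run(start_session("worker", "opencode")))
print(run(send_line("worker", "continue")))
print(run(capture_output("worker")))
```

With dry_run=False the same helpers execute against a real tmux server, so a supervising session could read a worker's pane and nudge it when it has stalled.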

ralph-loop

Ralph is an autonomous AI agent loop that repeatedly runs Amp until all PRD items are completed. Each iteration creates a brand new Amp instance with a clean context.

Ralph likely originated from here: https://github.com/snarktank/ralph

Because it proved effective (which really only became true after model performance improved further), it was also brought to Claude Code.

Ralph-loop is a Claude Code plugin that automatically restarts Claude Code whenever it stops, forming an execution loop. This is particularly useful for tasks that require continuous improvement or iteration.

Installation Method

Install through the official Claude Code plugin marketplace:
- /plugin install ralph-wiggum@claude-plugins-official or cc '/plugin install ralph-wiggum@claude-plugins-official'

Configuration and Usage:
- After installation, start it in Claude Code with the /ralph-loop command
- Set the task and termination conditions; Claude Code will then restart automatically each time it stops
- This is particularly useful for scenarios requiring multiple iterations of code improvement, debugging, or testing
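The restart-until-done mechanism can be illustrated with a small loop. This is only a sketch of the idea, not the plugin's implementation: ralph_loop, run_agent, the "DONE" marker, and the iteration cap are all hypothetical names chosen for illustration, and fake_agent stands in for a real agent call.

```python
from typing import Callable


def ralph_loop(
    run_agent: Callable[[str, int], str],
    task: str,
    done_marker: str = "DONE",
    max_iterations: int = 10,
) -> tuple[int, bool]:
    """Re-run the agent on the same task until it reports completion.

    Returns (iterations_used, finished). The cap bounds how many runs,
    and therefore how many tokens, the loop can spend.
    """
    for iteration in range(1, max_iterations + 1):
        output = run_agent(task, iteration)
        if done_marker in output:
            return iteration, True
    return max_iterations, False


def fake_agent(task: str, iteration: int) -> str:
    # Stand-in for a real agent run: it only "finishes" on the third attempt.
    return "DONE" if iteration >= 3 else "still working"


print(ralph_loop(fake_agent, "make the test suite pass"))
```

The termination condition is the important design choice: a marker that is too easy to emit ends the loop early, while a missing cap lets a stuck task run indefinitely.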

Use Cases

Code Refactoring: Have Claude Code continuously improve code quality
Test-Driven Development: Write tests, then have Claude Code continuously improve implementations
Debugging Loops: Automatically restart debugging sessions
Continuous Integration: Simulate CI/CD processes locally

The drawback of this plugin is that it consumes a lot of tokens; without a max20 subscription, it’s better not to use it. However, for tasks requiring high-quality output, this tool can significantly improve work efficiency.

playwright mcp

This plugin can launch browsers to run end-to-end tests or to write end-to-end test code. It closes the feedback loop, so cc (Claude Code) or oc (opencode) can test its own changes and keep improving the code.

Installation method: claude 'help me install playwright mcp'

Rewriting as Agents

I recommend directly rewriting these tools and MCPs as agents through opencode or claude code.

Compared to skills commands, these tools are better suited to invocation through agents. An agent’s context is very clean, so the agent almost always invokes the tool.

Summary

As LLMs become more powerful, the many MCPs that rely on LLM capabilities improve along with them, and tools that weren’t very useful before become effective. This aligns with the saying: “Don’t build things that become meaningless after large model capability enhancements.” Large model capabilities continue to improve, and prices keep falling.

I believe the next step is to bridge interactions between different modalities and tools, and to endow tools with large model capabilities. This is one of the inevitable development directions for agent engineers.

Why Agent

Tue, 30 Sep 2025 11:54:06 +0800

I’ve always had a question: Why do we need agent frameworks? Aren’t large models enough on their own? This article reflects my current understanding of the subject.

After using several tools extensively and participating in multiple agent projects recently, I’ve reached some conclusions.

The Limitations of LLMs

The primary reason for using agents is the inherent limitations of LLMs.

First and foremost is the context window, as explicitly mentioned in langchain/subagent. Although many modern models have significantly expanded context windows (GPT-4 Turbo 128K, Claude 3.5 Sonnet 200K, Gemini 1.5 Pro up to 2M), they are still insufficient for truly complex tasks. For example, processing a massive codebase or analyzing hundreds of documents quickly exhausts these limits. Furthermore, processing extremely long contexts is both expensive and slow.

Beyond context, there are other capability gaps:

Vision Capabilities: While modern VLMs (Vision Language Models) are powerful, traditional CV (Computer Vision) models often perform better in specific scenarios. Additionally, some models (like DeepSeek-V3) don’t have native vision capabilities.
Resource Access: LLMs cannot directly interact with databases, file systems, or network services.
Specialized Tools: Tools for code execution, complex mathematics, or data analysis require protocols like MCP to be accessible to an LLM.
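To make the resource and tool gap concrete, here is a minimal sketch of how an agent runtime can sit between the model and real functions: the model only emits a structured request, and the runtime executes it. The JSON shape, the tool decorator, and the two example tools are assumptions for illustration; this is not the MCP wire format, which is built on JSON-RPC.

```python
import json
from typing import Callable

# Registry of functions the agent runtime is willing to execute for the model.
TOOLS: dict[str, Callable[..., object]] = {}


def tool(fn: Callable[..., object]) -> Callable[..., object]:
    """Register a function under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def add(a: float, b: float) -> float:
    return a + b


@tool
def word_count(text: str) -> int:
    return len(text.split())


def dispatch(tool_call_json: str) -> str:
    """Execute one model-issued tool call and return a JSON reply for the model."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    result = TOOLS[name](**call.get("arguments", {}))
    return json.dumps({"result": result})


# The string below plays the role of the model's output.
print(dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}'))
```

Returning errors as data rather than raising lets the model see the failure and try a different call.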

What Agents Can Do

Beyond addressing the limitations above, here are some practical ways agents add value.

Domain-Specific Text Processing

Agents can process different text segments (contexts) independently.

Context Optimization: Agents can compress or selectively provide context, effectively extending the usable context window.
Performance Gains: An LLM within an agent can focus on a single, specific task, leading to better performance. When given too much text, LLMs often struggle to identify key information; smaller, targeted context makes this much easier.
Specialized Knowledge: LLMs are trained on general data. To make an agent a domain expert, we can inject specific knowledge directly into its context.
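Selective context can be sketched as picking only the chunks relevant to the current question, within a fixed budget. This is a deliberately crude illustration: word overlap stands in for a real relevance measure (such as embeddings), word counts stand in for token counts, and the function names and sample chunks are hypothetical.

```python
def score(chunk: str, query: str) -> int:
    """Count how many distinct query words also appear in the chunk."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def select_context(chunks: list[str], query: str, budget_words: int) -> list[str]:
    """Keep the most relevant chunks that fit within the word budget."""
    # Highest score first; ties keep their original order because sorted() is stable.
    ranked = sorted(chunks, key=lambda c: score(c, query), reverse=True)
    selected: list[str] = []
    used = 0
    for chunk in ranked:
        size = len(chunk.split())
        # Skip chunks that are irrelevant or would overflow the budget.
        if score(chunk, query) == 0 or used + size > budget_words:
            continue
        selected.append(chunk)
        used += size
    return selected


chunks = [
    "the billing service retries failed payments",
    "the logo uses blue and white",
    "payments are stored in the ledger table",
]
query = "how are failed payments retried"
print(select_context(chunks, query, budget_words=20))
```

Only the selected chunks would be placed in the agent's prompt, so the model sees a small, targeted context instead of every document.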

Visual Capability Integration

Through agents, we can integrate traditional vision models to handle tasks that LLMs struggle with. For example, an MCP (Model Context Protocol) server can give an agent access to vision capabilities.

A notable example is Zhipu’s Vision MCP. Using this MCP in conjunction with an agent significantly enhances visual processing power. This highlights the value of MCP servers that integrate specialized services.

Agent Frameworks

Pydantic AI: I find this particularly useful because it integrates Pydantic models into the agent framework, making it much easier to debug. I’ve tested its integration with Qwen3.
LangChain: I haven’t used this in production, only for basic debugging. The API changes frequently, which can be challenging. One minor issue is prompt handling; I used Jinja to solve this. Alternatively, the “LangChain way” involves using PromptTemplates.

Agent on Svtter's Blog