Featured image of post Efficient and Cost-Effective: My AI Agent Workflow Choice

Efficient and Cost-Effective: My AI Agent Workflow Choice

Sharing my efficient and cost-effective workflow using OpenCode + Gemini 3 Flash

语速

Claude Code’s $100/month price tag is a bit steep for many. To address this, I’ve been experimenting with a more practical and affordable workflow.

In terms of models, my recommendation is to use Gemini 3 Flash on an as-needed (pay-as-you-go) basis as a replacement.

Why? Gemini 3 Flash offers incredible value. It’s fast, efficient, and costs a fraction of what you’d pay for Opus or Sonnet. For the vast majority of tasks, Flash is more than enough.

The Cost-Saving Workflow

Here is my current “budget” workflow:

  • Planning & Proposals: Use Gemini 3 Flash.
  • Execution & Building: Use the free GLM 4.7 (or MiniMax M2.1) via OpenCode. If you have a Zhipu Coding Plan, that works perfectly too.

Speaking of Gemini 3, we have to talk about GPT-5.2.

Many engineers still rely on ChatGPT.com directly instead of using a proper coding agent. Regardless of the efficiency debate, the reliability is concerning. From my experience, GPT-5.2’s default tone has been tuned to be overly “people-pleasing,” which might not be ideal for professional developers seeking direct technical feedback.

GPT-5.2 Response Tone

Furthermore, while GPT-5.2 scored impressively on SWE-bench Verified, my real-world experience has been mixed. It’s worth looking at the history of SWE-bench:

Originally proposed by a team from Princeton University (ICLR 2024), it evaluates a model’s ability to solve real GitHub issues. However, in August 2024, OpenAI’s Preparedness team collaborated with the original authors to create SWE-bench Verified (a subset of 500 manually verified issues). Since OpenAI was involved in the design of this benchmark, their models’ performance on it should be taken with a grain of salt. While not necessarily a deliberate manipulation, the risk of inherent bias is significant.

Ultimately, as I often say, “Codex” models don’t always deliver the most practical results in everyday coding.

OpenCode Tips

  1. Leveraging Agents: OpenCode supports launching SubAgents. When debugging complex projects, you can have OpenCode launch agents in different directories to handle front-end and back-end tasks separately, which also helps avoid permission issues.

  2. OpenSpec: Cross-Agent Collaboration:

    1
    2
    3
    4
    
    1. OpenCode + Gemini 3 Flash → Generate proposal
    2. Codex → Code Review
    3. Claude Code → Secondary Review
    4. OpenSpec Apply → Final Execution
    

    OpenSpec generates reliable specs, but sometimes cheaper models produce lower-quality code. In such cases, you can generate multiple times using the spec and select the best result.

Final Thoughts

As AI Agent engineers, we need to adapt to these ongoing trends:

  1. Models are becoming smarter.
  2. Execution is becoming faster.
  3. Prices are dropping.

While these trends are promising, we still need to balance speed, cost, and quality for every task. We might soon see agent systems that automate this balancing act, but for now, it’s a crucial part of the engineer’s role.