LLMs

My private LLM Leaderboard

2025-12-16

  • Code Agent: Sonnet 4.5
  • Talk: Opus 4.5

2025-10-24

  • Code Agent: Kilo Code - GLM 4.6
  • Code Agent: Claude Code - Sonnet 4.5 or DeepSeek Reasoner (v3.2)
  • Complex reasoning: Claude Sonnet 4.5
  • Daily conversation: GPT-5

2025-10-09

Current workflow:

2025-09-22

Current workflow:

2025-08-08

Workflow:

2025-07-27

Aliyun’s official Qwen is too expensive; I have already burned through 50 yuan. Based on leaderboard tests, it also performs worse than r1-0528.

2025-07-26

Previous

Licensed under CC BY-NC-SA 4.0