Third-party Client Performance

语速

1
2
3
Which is the most expensive model on Silicon Flow?
I mean siliconflow.cn
Help me take a look

Over the past year, I have attempted to use deepchat and large model APIs (such as k2 thinking turbo) to build a relatively private chat tool (or agent assistant) for handling some private data. However, the overall experience has not been great. The large models often provide incorrect answers.

For search capabilities, I used the bocha API, resetting 10 credits to provide search functionality for the large model.

Test Questions

I feel there are still some issues with context handling (within a single chat window). I briefly tested this question: Which is the most expensive model on Silicon Flow?.

The answer is:

Kimi k2 thinking turbo

First, deepchat:

Hmm, incorrect.

Then, kimi official:

Also incorrect.

Trying deepseek

First, let’s try the client.

Incorrect.

Then, deepseek official.

Very close, and the answer seems reasonable. Unfortunately, it’s still incorrect.

If we ask ChatGPT directly

Hiss, a bit off. Let’s try gpt-5.

Prompt:

Inference - Reasons for Poor Performance

Insufficient search capability. The Bocha API is to blame.
Different models may have different optimal hyperparameters for best performance. I called the large model API from Silicon Flow.

Conclusion

For this specific problem, ChatGPT still performs better. Compared to before, the official search + model combination also seems to perform better. Therefore, unless the data is particularly sensitive, it’s better to use the official service.
This article is for reference only, just for fun.