u/Impre-visible

Best LLM for multilingual function calling + strict JSON + low latency?

Hello everyone, I'm currently working on an app and I have an idea for a new feature.

On the home page, there would be an input field where users can enter a request. Once it is submitted, an AI would make one or more function calls to carry out what the user needs within the application. However, if the request isn't specific enough, the user would instead be presented with a list of clarifying questions (checkboxes, open-ended answers, etc.).
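To make the flow concrete, here is a rough sketch of how the app side could handle the model's reply. Everything in it is an assumption for illustration: the reply shape (`type`, `calls`, `questions`), the function names `create_task` and `set_reminder`, and the `REGISTRY` mapping are all hypothetical, not tied to any provider's API.

```python
import json
from typing import Any, Callable

# Hypothetical in-app functions the model is allowed to call.
def create_task(title: str) -> str:
    return f"task created: {title}"

def set_reminder(title: str, when: str) -> str:
    return f"reminder '{title}' set for {when}"

# Hypothetical allow-list mapping function names to implementations.
REGISTRY: dict[str, Callable[..., str]] = {
    "create_task": create_task,
    "set_reminder": set_reminder,
}

def handle_model_reply(raw: str) -> dict[str, Any]:
    """Run the requested function calls, or return questions for the UI."""
    reply = json.loads(raw)  # assumed shape: {"type": "calls" | "clarify", ...}
    if reply["type"] == "clarify":
        # Request was too vague: hand the questions to the front end.
        return {"status": "needs_input", "questions": reply["questions"]}
    results = []
    for call in reply["calls"]:
        func = REGISTRY.get(call["name"])
        if func is None:
            # Reject anything outside the allow-list.
            raise ValueError(f"unknown function: {call['name']}")
        results.append(func(**call["arguments"]))
    return {"status": "done", "results": results}
```

The idea is that the model only ever returns one of two shapes, so the front end either shows results or renders the questions as checkboxes / text fields.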

So I’m currently looking for the best model for this. My criteria are as follows:

  • Cost-effectiveness
  • Advanced function calling
  • Multilingual support
  • Low latency (fast TTFT)
  • Strict/structured JSON outputs
  • Large context window
  • Data privacy
  • Stability and high throughput limits
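For the latency criterion, this is roughly how I would compare candidates: a small helper that times the first streamed chunk. It is only a sketch. `measure_ttft` and `fake_stream` are names I made up, and `fake_stream` is a stand-in for whatever streaming iterator a real provider client would return.

```python
import time
from typing import Iterable, Iterator

def measure_ttft(stream: Iterable[str]) -> tuple[float, float, str]:
    """Return (seconds to first chunk, total seconds, full text) for a stream."""
    start = time.perf_counter()
    first = None
    parts = []
    for chunk in stream:
        if first is None:
            # Time to first token/chunk (TTFT).
            first = time.perf_counter() - start
        parts.append(chunk)
    total = time.perf_counter() - start
    # An empty stream has no first chunk; fall back to the total time.
    return (first if first is not None else total, total, "".join(parts))

def fake_stream() -> Iterator[str]:
    """Hypothetical stand-in for a provider's streaming response."""
    time.sleep(0.05)   # simulated wait before the first chunk
    yield '{"type": '
    time.sleep(0.01)   # simulated gap between chunks
    yield '"calls"}'

if __name__ == "__main__":
    ttft, total, text = measure_ttft(fake_stream())
    print(f"TTFT: {ttft:.3f}s, total: {total:.3f}s, text: {text}")
```

Running the same prompts in each target language through something like this would give comparable TTFT numbers per model.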

Has anyone had the chance to test models against some of these criteria? I'd appreciate any feedback.

reddit.com
u/Impre-visible — 3 days ago