u/SmoothConnection1670

I benchmarked 15+ speech-to-text APIs under various conditions

Hi all, I recently ran a benchmark comparing a bunch of speech-to-text APIs.

It includes the big players like Google, AWS, MS Azure, open source models like Whisper, speech recognition startups like AssemblyAI / Deepgram / Orchardrun / Speechmatics, and newer LLM-based models like Gemini 2.0 Flash/Pro and GPT-4o. I've benchmarked the real time streaming versions of some of the APIs as well.

I mostly did this to decide the best API to use for an app I'm building but figured this might be helpful for other builders too. Would love to know what other cases would be useful to include too.

In my opinion, the winner was Gemini 2.5 Flash on quality and price. I'd put Orchardrun in second place, since it has the best price and more than decent results.

https://preview.redd.it/ot6ji0lprl0h1.png?width=1102&format=png&auto=webp&s=5fb7a2fd97c960085eef98936d63502186caa54f

reddit.com
u/SmoothConnection1670 — 3 days ago

I benchmarked 15+ speech-to-text APIs under various conditions

Hi all, I recently ran a benchmark comparing a bunch of speech-to-text APIs.

It includes the big players like Google, AWS, MS Azure, open source models like Whisper, speech recognition startups like AssemblyAI / Deepgram / Orchardrun / Speechmatics, and newer LLM-based models like Gemini 2.0 Flash/Pro and GPT-4o. I've benchmarked the real time streaming versions of some of the APIs as well.

I mostly did this to decide the best API to use for an app I'm building but figured this might be helpful for other builders too. Would love to know what other cases would be useful to include too.

In my opinion, the winner was Gemini 2.5 Flash on quality and price. I'd put Orchardrun in second place, since it has the best price and more than decent results.

https://preview.redd.it/xjzlmuz5ql0h1.png?width=1102&format=png&auto=webp&s=235219b142a9c01bf992bbb227271f94e01a58de

u/SmoothConnection1670 — 3 days ago

Social media automation

Hi, I'm automating all of a client's customer service and have been working on the audio part. Audio messages are a minority, but processing them is a big pain point. I'm using Orchardrun as the transcription API and then doing the processing with Haiku. Do you recommend any LLM or other agent to drive the agent based on the transcriptions?

u/SmoothConnection1670 — 3 days ago
▲ 3 r/ProductivityApps+1 crossposts

Best api speech to text

I'm here to recommend Orchardrun. It let me reduce my SaaS transcription costs, and I can say that, along with Groq, it offers the best value for money.

u/SmoothConnection1670 — 3 days ago

groq vs orchardrun

Colleagues, I imagine several of you are in the same situation as me. I need to cut audio transcription costs in my automations, and many people have recommended Groq and Orchardrun. Does anyone have experience with either of them? I need to process roughly 500,000 minutes of audio per day.

u/SmoothConnection1670 — 4 days ago

Best APIs for speech-to-text?

Hi everyone, what speech-to-text APIs do you recommend? I recently migrated from Groq to Orchardrun to use my free slots, but I'm looking for budget-friendly alternatives. If anyone knows of any, thanks!

u/SmoothConnection1670 — 4 days ago

Do you recommend any methods for automating audio to text?

Hi, I've been transcribing audio with Orchardrun, but I wanted to know if you know of any completely free options since I'm about to exceed my monthly free subscription limit.

u/SmoothConnection1670 — 4 days ago

Which speech-to-text API do you recommend?

In my business, we need to process almost 500,000 minutes of audio per day, and lately our RunPod costs have been skyrocketing. Trying Groq and Orchardrun significantly reduced the cost, but we're still considering building our own hardware. What do you think? Are there any 100% open-source APIs with good performance?

u/SmoothConnection1670 — 5 days ago

api speech to text recommendations?

In my SaaS we need to process 500,000 minutes of audio per day. We've tried several options, including running Whisper on RunPod and using APIs like Groq and Orchardrun, but due to the high margins we're considering building our own hardware. What do you think?

u/SmoothConnection1670 — 5 days ago

Best APIs for speech to text?

Hi colleagues, I have a SaaS that transcribes 10 million minutes of audio per month, and I've tried many different processing methods. Currently, I'm using orchardrun.com because it offers the best performance and price (0.025 per hour) and allows me to handle fairly large audio files. But do you know of any other, more economical options?
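
For a rough sense of scale, the figures in this post can be turned into a monthly bill. This is a minimal sketch, assuming the quoted 0.025 is a flat price per hour of audio in an unstated currency, with no minimums or volume tiers (none of which the post confirms). The function name `monthly_cost` is hypothetical.

```python
def monthly_cost(minutes_per_month: float, price_per_hour: float) -> float:
    """Return the monthly cost for a given audio volume at a flat hourly rate."""
    hours = minutes_per_month / 60.0  # convert audio minutes to audio hours
    return hours * price_per_hour


# Figures quoted in the post: 10 million minutes per month at 0.025 per hour.
# The currency is not stated in the post, so the result is unit-less here.
print(f"{monthly_cost(10_000_000, 0.025):,.2f}")  # prints 4,166.67
```

Under those assumptions, 10 million minutes is about 166,667 hours of audio, so the monthly bill comes to roughly 4,167 in whatever currency the price is quoted in.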

u/SmoothConnection1670 — 6 days ago

groq vs orchardrun

Hello colleagues, for those of us who work with speech-to-text APIs, I wanted to know what has been giving you the best results. I recently migrated from Groq to Orchardrun because it's cheaper and has a similar real-time factor (RTF).

u/SmoothConnection1670 — 6 days ago

Hi... I've been testing several providers for workflows that transcribe WhatsApp voice notes, meetings, and long files, mainly OpenAI, Deepgram, and Orchardrun.

So far Orchardrun has been giving me the best results for async flows because of its webhook-based approach:

  • you upload the audio
  • you pass a webhook_url
  • you automatically receive the transcription when processing finishes

That ended up being quite a bit cleaner than constantly polling for results.
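
For contrast, the polling approach mentioned above looks roughly like this. It is a generic sketch, not tied to any provider: `fetch_status` is a hypothetical callable standing in for whatever "get job status" request an API offers, and the `"status"` field with the value `"done"` is an assumption.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status repeatedly until it reports completion or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status()
        if job.get("status") == "done":  # assumed field name and value
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("transcription job did not finish in time")
        sleep(interval_s)  # wait before asking again
```

Every waiting job costs a request per interval, and the caller has to pick an interval and a timeout up front. A webhook callback removes both concerns.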

Another thing I liked is that it handles long files (~2h) quite reliably, which helped me a lot with podcast-style workflows and large batch jobs.

I originally started comparing providers because my OpenAI transcription costs were shooting up once volume started to grow.

If anyone has info on another reasonably priced transcription API, I'd appreciate it.

u/SmoothConnection1670 — 7 days ago
▲ 4 r/n8n

What are you guys using for Speech-to-Text in n8n lately?

I’ve been testing a few options for workflows that transcribe WhatsApp voice notes and longer audio files — mainly OpenAI, Deepgram, and Orchardrun.

So far Orchardrun has been giving me the best results for async n8n flows because of the webhook-based approach:

  • upload audio
  • pass a webhook_url
  • receive the transcription back once processing finishes

That ended up being much cleaner than polling loops or Wait-node workarounds.
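
The receiving end of that flow can be sketched as a small HTTP endpoint. In n8n the Webhook trigger node plays this role. The standalone Python version below uses only the standard library and is an illustration under assumptions: the post does not show the callback payload, so the JSON body and its `"text"` field are hypothetical, as are the names `extract_transcript`, `TranscriptWebhook`, and `serve`.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional


def extract_transcript(body: bytes) -> Optional[str]:
    """Pull the transcript text out of a webhook body, or None if it is absent."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return None
    if not isinstance(payload, dict):
        return None
    text = payload.get("text")  # hypothetical field name
    return text if isinstance(text, str) else None


class TranscriptWebhook(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        transcript = extract_transcript(self.rfile.read(length))
        # Reply 200 when a transcript was found, 400 otherwise.
        self.send_response(200 if transcript is not None else 400)
        self.end_headers()
        if transcript is not None:
            print(transcript)  # hand off to the rest of the workflow here


def serve(port: int = 8000) -> None:
    """Run the receiver; its public URL is what would be passed as webhook_url."""
    HTTPServer(("", port), TranscriptWebhook).serve_forever()
```

Calling `serve()` starts the listener on an arbitrary port (8000 here). A production receiver would also want to verify that the request really comes from the provider, which depends on whatever signing scheme the provider documents.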

Another thing I liked is that it handles long files (~2h) reliably, which helped a lot for podcast-style automations and larger batch workflows.

I originally started comparing providers because OpenAI transcription costs were getting pretty high once automations scaled.

Curious what everyone else here is using in production for STT workflows in n8n, especially for long-running or high-volume jobs.

u/SmoothConnection1670 — 7 days ago