u/the-tf

OpenAI Realtime API - How do I stop my agent from giving fake praise and make it strictly follow guidelines?

I’m building a voice-based communication coach that talks to users in real time using the OpenAI Realtime API (POST https://api.openai.com/v1/realtime/sessions). The coach should act like a tough, high‑standards reviewer: very direct, candid, and focused on content quality first.
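For reference, here's roughly how I build the session-creation payload (the model name and extra fields are placeholders from my setup and may differ in yours; this is a sketch, not the definitive request shape):

```python
import json

# Sketch of the payload I POST to https://api.openai.com/v1/realtime/sessions
# (with an Authorization: Bearer <API key> header). The model name and the
# "voice" field are placeholders from my own setup.
def build_session_payload(instructions: str) -> str:
    payload = {
        "model": "gpt-4o-realtime-preview",  # placeholder model name
        "voice": "alloy",
        # The strict coaching system prompt goes in "instructions":
        "instructions": instructions,
    }
    return json.dumps(payload)

body = build_session_payload("Be strict and candid; don't sugarcoat.")
print(body)
```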

Even with a strict system prompt, the model keeps giving fake praise and calling vague answers “clear and easy to follow.”

Example (simplified):

  • Coach prompt to user: “Give a 60-second status update to a senior stakeholder. Cover: (1) what was accomplished, (2) the biggest risk ahead, (3) one thing you need from them.”
  • User answer: “We’re just working through the usual items.”
  • Model response: “Your main strength is that your explanation was clear and easy to follow… For delivery improvement, try adding a slight pause… Keep going—you’re doing great!”
  • What I actually want instead: “This is very vague. You didn’t say what was accomplished, what the biggest risk is, or what you need. This is not strong enough for a senior-level update. Try again, more specific but still high-level.”
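One direction I've been toying with for the branching is a hard content check before the model is allowed to coach delivery at all, e.g. a crude keyword rubric over the three required points. This is entirely my own sketch (the rubric and patterns are made up), not anything built into the API:

```python
import re

# Hypothetical rubric: did the answer cover the three required points
# (what was accomplished, the biggest risk, the ask)? Crude keyword
# matching, just to illustrate branching on content quality.
RUBRIC = {
    "accomplished": r"\b(finished|completed|shipped|delivered|accomplished)\b",
    "risk": r"\b(risk|blocker|concern|threat|slip)\b",
    "ask": r"\b(need|asking|request|require)\b",
}

def missing_points(answer: str) -> list[str]:
    """Return the rubric points the answer fails to mention."""
    return [point for point, pattern in RUBRIC.items()
            if not re.search(pattern, answer, re.IGNORECASE)]

vague = "We're just working through the usual items."
print(missing_points(vague))  # all three points missing for the vague answer
```

If `missing_points` is non-empty, the idea is to force a critique-only response (or inject a corrective system message) instead of letting the model coach delivery.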

My system prompt already includes things like:

  • Be strict and candid; don’t sugarcoat.
  • Only coach delivery when content is clear and specific.
  • Give strong feedback on vague answers like “We’re just working through the usual items.”
  • Don’t use phrases like “Great work”, “Your main strength is…”, “You’re doing great” unless the content is genuinely strong.
  • If the answer is vague or incomplete, give 0% praise and 100% content-focused critique.
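For what it's worth, I assemble those instructions programmatically so the rules, the banned-phrase list, and a few-shot example stay in sync (the structure and names here are my own, sketch only):

```python
# Banned praise phrases, kept in one place so the same list can drive
# both the prompt and any downstream filtering.
BANNED_PHRASES = ["Great work", "Your main strength is", "You're doing great"]

# One few-shot example showing the blunt response I actually want.
FEW_SHOT = (
    "User: We're just working through the usual items.\n"
    "Coach: This is very vague. You didn't say what was accomplished, "
    "what the biggest risk is, or what you need. Not strong enough for a "
    "senior-level update. Try again, more specific but still high-level."
)

def build_instructions() -> str:
    rules = [
        "Be strict and candid; don't sugarcoat.",
        "Only coach delivery when content is clear and specific.",
        "If the answer is vague or incomplete, give 0% praise and "
        "100% content-focused critique.",
        "Never use these phrases unless content is genuinely strong: "
        + "; ".join(BANNED_PHRASES),
    ]
    return "\n".join(["# Rules", *rules, "", "# Example", FEW_SHOT])

print(build_instructions())
```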

But the model still:

  • Invents “strengths” for bad answers.
  • Coaches delivery even when content is weak.
  • Uses praise phrases I tried to ban.

I’m looking for:

  • Concrete prompt patterns that actually reduce this “terminal niceness.”
  • Ways (in a Realtime API / streaming setup) to force a content quality check and branch behavior.
  • Examples of prompts or few-shot examples that produce a blunt, critical coach.
  • Whether I should use a different model, add tool-calling / intermediate scoring, or post-process the streamed output to strip praise / reframe it.
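To frame the last point: the post-processing idea I had in mind is a simple filter over the streamed text that drops any sentence containing a banned praise phrase. Entirely a sketch (the phrase list and sentence splitting are mine, and a real version would have to handle mid-stream buffering of partial deltas):

```python
import re

# Lowercased praise phrases to ban from the coach's output.
BANNED = ["great work", "your main strength", "you're doing great",
          "keep going"]

def strip_praise(text: str) -> str:
    # Split on sentence boundaries and drop any sentence that contains
    # a banned phrase; keep everything else.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    kept = [s for s in sentences
            if not any(p in s.lower() for p in BANNED)]
    return " ".join(kept)

raw = ("Your main strength is that your explanation was clear. "
       "The update omits the risk and the ask. "
       "Keep going, you're doing great!")
print(strip_praise(raw))  # -> "The update omits the risk and the ask."
```

The obvious downside is that dropping sentences can leave the remaining text disjointed, which is why I'm also asking whether reframing (a second model pass) beats stripping.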

If you’ve built strict/critical review or coaching agents (especially with the Realtime API), how did you stop them from reflexively saying “great job” and get them to honestly call out vague, low-effort answers?

u/the-tf — 1 month ago