u/the-tf

OpenAI Realtime API - How do I stop my agent from giving fake praise and make it strictly follow guidelines?

I’m building a voice-based communication coach that talks to users in real time using the OpenAI Realtime API (POST https://api.openai.com/v1/realtime/sessions). The coach should act like a tough, high‑standards reviewer: very direct, candid, and focused on content quality first.
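For reference, here's roughly how I build the session-creation payload (the model name and extra fields are placeholders from my setup and may differ in yours; this is a sketch, not the definitive request shape):

```python
import json

# Sketch of the payload I POST to https://api.openai.com/v1/realtime/sessions
# (with an Authorization: Bearer <API key> header). The model name and the
# "voice" field are placeholders from my own setup.
def build_session_payload(instructions: str) -> str:
    payload = {
        "model": "gpt-4o-realtime-preview",  # placeholder model name
        "voice": "alloy",
        # The strict coaching system prompt goes in "instructions":
        "instructions": instructions,
    }
    return json.dumps(payload)

body = build_session_payload("Be strict and candid; don't sugarcoat.")
print(body)
```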

Even with a strict system prompt, the model keeps giving fake praise and calling vague answers “clear and easy to follow.”

Example (simplified):

  • Coach prompt to user: “Give a 60-second status update to a senior stakeholder. Cover: (1) what was accomplished, (2) the biggest risk ahead, (3) one thing you need from them.”
  • User answer: “We’re just working through the usual items.”
  • Model response: “Your main strength is that your explanation was clear and easy to follow… For delivery improvement, try adding a slight pause… Keep going—you’re doing great!”
  • What I actually want instead: “This is very vague. You didn’t say what was accomplished, what the biggest risk is, or what you need. This is not strong enough for a senior-level update. Try again, more specific but still high-level.”
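One direction I've been toying with for the branching is a hard content check before the model is allowed to coach delivery at all, e.g. a crude keyword rubric over the three required points. This is entirely my own sketch (the rubric and patterns are made up), not anything built into the API:

```python
import re

# Hypothetical rubric: did the answer cover the three required points
# (what was accomplished, the biggest risk, the ask)? Crude keyword
# matching, just to illustrate branching on content quality.
RUBRIC = {
    "accomplished": r"\b(finished|completed|shipped|delivered|accomplished)\b",
    "risk": r"\b(risk|blocker|concern|threat|slip)\b",
    "ask": r"\b(need|asking|request|require)\b",
}

def missing_points(answer: str) -> list[str]:
    """Return the rubric points the answer fails to mention."""
    return [point for point, pattern in RUBRIC.items()
            if not re.search(pattern, answer, re.IGNORECASE)]

vague = "We're just working through the usual items."
print(missing_points(vague))  # all three points missing for the vague answer
```

If `missing_points` is non-empty, the idea is to force a critique-only response (or inject a corrective system message) instead of letting the model coach delivery.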

My system prompt already includes things like:

  • Be strict and candid; don’t sugarcoat.
  • Only coach delivery when content is clear and specific.
  • Give strong feedback on vague answers like “We’re just working through the usual items.”
  • Don’t use phrases like “Great work”, “Your main strength is…”, “You’re doing great” unless the content is genuinely strong.
  • If the answer is vague or incomplete, give 0% praise and 100% content-focused critique.
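For what it's worth, I assemble those instructions programmatically so the rules, the banned-phrase list, and a few-shot example stay in sync (the structure and names here are my own, sketch only):

```python
# Banned praise phrases, kept in one place so the same list can drive
# both the prompt and any downstream filtering.
BANNED_PHRASES = ["Great work", "Your main strength is", "You're doing great"]

# One few-shot example showing the blunt response I actually want.
FEW_SHOT = (
    "User: We're just working through the usual items.\n"
    "Coach: This is very vague. You didn't say what was accomplished, "
    "what the biggest risk is, or what you need. Not strong enough for a "
    "senior-level update. Try again, more specific but still high-level."
)

def build_instructions() -> str:
    rules = [
        "Be strict and candid; don't sugarcoat.",
        "Only coach delivery when content is clear and specific.",
        "If the answer is vague or incomplete, give 0% praise and "
        "100% content-focused critique.",
        "Never use these phrases unless content is genuinely strong: "
        + "; ".join(BANNED_PHRASES),
    ]
    return "\n".join(["# Rules", *rules, "", "# Example", FEW_SHOT])

print(build_instructions())
```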

But the model still:

  • Invents “strengths” for bad answers.
  • Coaches delivery even when content is weak.
  • Uses praise phrases I tried to ban.

I’m looking for:

  • Concrete prompt patterns that actually reduce this “terminal niceness.”
  • Ways (in a Realtime API / streaming setup) to force a content quality check and branch behavior.
  • Examples of prompts or few-shot examples that produce a blunt, critical coach.
  • Whether I should use a different model, add tool-calling / intermediate scoring, or post-process the streamed output to strip praise / reframe it.
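To frame the last point: the post-processing idea I had in mind is a simple filter over the streamed text that drops any sentence containing a banned praise phrase. Entirely a sketch (the phrase list and sentence splitting are mine, and a real version would have to handle mid-stream buffering of partial deltas):

```python
import re

# Lowercased praise phrases to ban from the coach's output.
BANNED = ["great work", "your main strength", "you're doing great",
          "keep going"]

def strip_praise(text: str) -> str:
    # Split on sentence boundaries and drop any sentence that contains
    # a banned phrase; keep everything else.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    kept = [s for s in sentences
            if not any(p in s.lower() for p in BANNED)]
    return " ".join(kept)

raw = ("Your main strength is that your explanation was clear. "
       "The update omits the risk and the ask. "
       "Keep going, you're doing great!")
print(strip_praise(raw))  # -> "The update omits the risk and the ask."
```

The obvious downside is that dropping sentences can leave the remaining text disjointed, which is why I'm also asking whether reframing (a second model pass) beats stripping.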

If you’ve built strict/critical review or coaching agents (especially with the Realtime API), how did you stop them from reflexively saying “great job” and get them to honestly call out vague, low-effort answers?

u/the-tf — 1 month ago