Whenever I use agents like Researcher with access to internal sources like Confluence and Jira, they consistently make mistakes and misrepresent the facts, and in many cases the citations they provide have nothing to do with what was said. For example, it once gave an implementation description that wasn't even accurate, and the Confluence page it cited was for an employee appreciation day event. This happens whether I'm using GPT, Claude, or Auto mode. Meanwhile, using the same models in ChatGPT/Codex or Claude Code results in zero hallucinations. What exactly is Microsoft doing to sabotage the abilities of these models, and for what reason?
u/Elctsuptb — 6 days ago