What’s the hardest part of AI copilot development: UX, data, or model accuracy?
I’ve been researching AI copilot development lately. Most discussions focus heavily on models and benchmarks, but I’m curious what people think is actually the hardest part in real-world projects.
Is it the model accuracy itself?
Is it getting clean/useful company data?
Or is the real challenge building a UX that people actually trust and want to use daily?
From what I’ve seen, even strong LLMs can feel unreliable if the workflow integration is poor. On the other hand, a great interface can’t really save bad outputs or outdated data. And in enterprise environments, data access and permissions seem to become a major issue pretty quickly.
I’ve also noticed that many AI copilots demo well initially, but struggle over time once users expect consistency, context awareness, and fewer hallucinations.
For developers or teams working on AI copilots:
- What ended up being the biggest bottleneck?
- What problem took longer than expected?
- Did your priorities change after launch?
Curious to hear real experiences rather than marketing claims.