I built a customer support AI for a German compliance company that auto-serves invoices and deflects 39.5% of queries. Here's the architecture
Just got the first week of production data back from a customer support AI system I built for a German compliance company. 39.5% deflection rate across 43 conversations. Want to break down the architecture because support chatbots get a bad reputation and most of it is deserved.
The system handles email and chat conversations. When a customer reaches out, the system does three things:
1. Intent classification. Determines what the customer actually wants. The current intent categories are: termination, onboarding, invoice requests, legal advice, general questions, technical issues, integration questions, GDPR questions, and account management. This classification drives what happens next.
2. Outcome routing. Based on the intent and the system's confidence in handling it, the conversation gets routed to one of four outcomes:
- Deflected (39.5%): AI resolves the query completely
- Invoice served (19%): system automatically pulls and delivers the requested invoice
- Ticket created (19%): complex query gets escalated to a human agent
- Collecting info (16%): system is still gathering details before routing
3. Response generation. For deflectable queries the system generates a response grounded in the company's actual documentation and policies. Not generic FAQ answers. Actual answers sourced from their knowledge base.
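To make the first two steps concrete, here is a minimal sketch of the classify-then-route flow. It is not the production code: the intent and outcome names come from the lists above, but the `route` ordering, the `0.8` threshold, the `has_required_details` flag, and the classifier signature are all assumptions for illustration. Response generation is left out.

```python
from enum import Enum
from typing import Callable, Tuple


class Intent(Enum):
    TERMINATION = "termination"
    ONBOARDING = "onboarding"
    INVOICE_REQUEST = "invoice_request"
    LEGAL_ADVICE = "legal_advice"
    GENERAL_QUESTION = "general_question"
    TECHNICAL_ISSUE = "technical_issue"
    INTEGRATION_QUESTION = "integration_question"
    GDPR_QUESTION = "gdpr_question"
    ACCOUNT_MANAGEMENT = "account_management"


class Outcome(Enum):
    DEFLECTED = "deflected"
    INVOICE_SERVED = "invoice_served"
    TICKET_CREATED = "ticket_created"
    COLLECTING_INFO = "collecting_info"


# Hypothetical classifier signature: message text -> (intent, confidence in [0, 1]).
Classifier = Callable[[str], Tuple[Intent, float]]


def route(intent: Intent, confidence: float, has_required_details: bool,
          threshold: float = 0.8) -> Outcome:
    """Pick one of the four outcomes. Ordering and threshold are assumptions."""
    if not has_required_details:
        return Outcome.COLLECTING_INFO   # keep asking before deciding anything
    if confidence < threshold:
        return Outcome.TICKET_CREATED    # not confident enough: hand to a human
    if intent is Intent.INVOICE_REQUEST:
        return Outcome.INVOICE_SERVED
    return Outcome.DEFLECTED


def handle(message: str, classify: Classifier, has_required_details: bool) -> Outcome:
    """Classify the message, then route on the result."""
    intent, confidence = classify(message)
    return route(intent, confidence, has_required_details)
```

Passing the classifier in as a function keeps the routing logic testable without a model behind it.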
What makes this work better than most support chatbots:
The intent classification isn't just keyword matching. The system understands that "I want to stop my subscription" and "how do I cancel" and "we're discontinuing the service" are all termination intents even though they share almost no words.
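A quick way to see why keyword matching falls short on these three phrases is to measure how many words they actually share. The snippet below only demonstrates the problem; it says nothing about how the real classifier works. The `tokens` and `jaccard` helpers are illustrative names.

```python
from itertools import combinations

phrases = [
    "I want to stop my subscription",
    "how do I cancel",
    "we're discontinuing the service",
]


def tokens(text: str) -> set:
    """Lowercase whitespace tokenization; deliberately naive."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Share of distinct words the two phrases have in common."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)


for a, b in combinations(phrases, 2):
    shared = sorted(tokens(a) & tokens(b))
    print(f"{jaccard(a, b):.2f}  shared={shared}  ({a!r} vs {b!r})")
```

The only word any pair shares is "I", and two of the three pairs share nothing at all, so a keyword rule would need a separate pattern for each phrasing.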
The escalation logic errs on the side of creating tickets. If the system isn't confident it can fully resolve the query, it escalates rather than giving a bad answer. This is why the deflection rate is 39.5% and not some inflated 80% number. Every deflected conversation is a genuinely resolved query.
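The trade-off between a conservative escalation rule and the headline deflection number can be shown with a toy example. The confidence scores below are made up purely for illustration (they are not production data), and a single numeric threshold is an assumption about how "confident" gets decided.

```python
def deflection_rate(confidences, threshold):
    """Fraction of conversations the AI would answer itself at this threshold."""
    deflected = [c for c in confidences if c >= threshold]
    return len(deflected) / len(confidences)


# Made-up confidence scores for ten conversations, purely illustrative.
scores = [0.95, 0.91, 0.88, 0.84, 0.79, 0.72, 0.66, 0.58, 0.51, 0.40]

for threshold in (0.5, 0.7, 0.85):
    rate = deflection_rate(scores, threshold)
    print(f"threshold={threshold:.2f} -> deflection rate {rate:.0%}")
```

On these toy scores the rate drops from 90% to 60% to 30% as the threshold rises. A higher bar means fewer deflections, but the ones that remain are the ones the system was most sure about.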
The invoice serving is fully automated. Customer asks for an invoice, system identifies the intent, pulls the relevant invoice, and delivers it. This single feature handles 19% of all conversations without any human involvement.
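The invoice path could look roughly like this. Everything here is a stand-in: `InvoiceStore` is an in-memory placeholder for whatever billing system holds the invoices, `send` is a hypothetical delivery callback, the period format is assumed, and falling back to a ticket when no invoice is found is my assumption about the failure case.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Invoice:
    customer_id: str
    period: str      # e.g. "2024-03"; the format is an assumption
    pdf_path: str


class InvoiceStore:
    """In-memory stand-in for the real billing backend."""

    def __init__(self, invoices: List[Invoice]):
        self._by_key: Dict[Tuple[str, str], Invoice] = {
            (inv.customer_id, inv.period): inv for inv in invoices
        }

    def find(self, customer_id: str, period: str) -> Optional[Invoice]:
        return self._by_key.get((customer_id, period))


def serve_invoice(store: InvoiceStore, customer_id: str, period: str,
                  send: Callable[[str, str], None]) -> str:
    """Look up the invoice and deliver it; escalate if it cannot be found."""
    invoice = store.find(customer_id, period)
    if invoice is None:
        return "ticket_created"          # assumed fallback: let a human look
    send(customer_id, invoice.pdf_path)  # delivery channel is up to the caller
    return "invoice_served"


# Usage with a fake delivery function that just records what was sent.
sent = []
store = InvoiceStore([Invoice("cust-1", "2024-03", "invoices/cust-1-2024-03.pdf")])
print(serve_invoice(store, "cust-1", "2024-03", lambda c, p: sent.append((c, p))))
print(serve_invoice(store, "cust-1", "2024-04", lambda c, p: sent.append((c, p))))
```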
Average response time is 28 seconds. For comparison, the same query handled by a human agent involves reading the email, looking up the customer, finding the relevant information, and composing a response. Even a fast agent takes 5-10 minutes.
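Putting the two numbers side by side, using only the 28-second average and the 5-10 minute human estimate from above (`speedup` is just an illustrative helper name):

```python
AI_SECONDS = 28
HUMAN_MINUTES = (5, 10)  # the "fast agent" range quoted above


def speedup(human_minutes: float, ai_seconds: float = AI_SECONDS) -> float:
    """How many times faster the AI response is than the human one."""
    return human_minutes * 60 / ai_seconds


for minutes in HUMAN_MINUTES:
    print(f"{minutes} min human vs {AI_SECONDS} s AI: ~{speedup(minutes):.0f}x faster")
```

That works out to roughly 11x to 21x faster per response.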
The interesting part is that this runs alongside an internal RAG system I built for the same client. Their team has AI handling customer-facing support AND AI handling internal legal research. The humans focus on the work that actually requires human judgment: complex legal analysis, sensitive customer conversations, strategic decisions.
Week one data is a small sample (43 conversations) but the deflection rate and intent distribution give a good baseline for tuning. The main optimization targets are improving deflection on onboarding questions and general queries where the system is currently creating tickets it could probably handle.
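To put a number on how small a 43-conversation sample is, here is a rough 95% Wilson score interval for the deflection rate. The count of 17 deflected conversations is inferred from 39.5% of 43 rather than taken from a log, and choosing the Wilson interval is my own choice of method.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# 17 of 43 is inferred from the reported 39.5% deflection rate.
low, high = wilson_interval(17, 43)
print(f"deflection rate plausibly between {low:.1%} and {high:.1%}")
```

That comes out to roughly 26% to 54%, which is why week one works as a baseline for tuning rather than a final number.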