AI Development
Build a GDPR-Compliant AI Chatbot: Architecture, Costs & Mistakes to Avoid
I keep hearing the same thing from CTOs: "We'd love to add AI chat but GDPR makes it impossible." They picture consent pop-ups stacked three deep, legal disclaimers longer than the conversation, and a user experience so bad nobody would actually use it.
That's not how it works. The businesses that end up with terrible, compliance-heavy chatbot experiences are the ones that built first and panicked about GDPR second. If you design for compliance from the start, the user never notices it's there.
I've built compliant chatbot systems that process thousands of conversations a day. The compliance layer adds maybe 15-20% to the build cost and zero friction to the experience. Here's the architecture.
Want this built properly from the start? A £500 scoping review covers your architecture, provider setup, DPIA scope, and the full compliance documentation you'll need. One week, written report.
Your server sits in the middle. Always.
Every GDPR-compliant chatbot follows one pattern:
User (browser) → Your Server → LLM API → Your Server → User
Never this:
User (browser) → LLM API directly
This matters more than anything else in this article. When your server sits between the user and the LLM, you control what data gets sent to the model, what gets stored, and what gets deleted. Without that middleware layer, you've handed data control to OpenAI or Anthropic or whoever runs the model. And you're still the data controller in GDPR terms — you're just a data controller who can't actually control anything.
The frontend handles the user-facing stuff: showing the AI disclosure ("You're chatting with an AI assistant"), linking to your privacy notice, collecting consent where needed, and providing the "talk to a human" escape hatch.
The backend — your server — is where compliance actually lives. It receives user messages, strips out personal data the LLM doesn't need, manages conversation history and context windows, enforces retention policies by auto-deleting old conversations, handles data subject access requests, and runs audit logging. This is the compliance engine. Everything else is window dressing.
The LLM layer is just an API call. Your server sends a sanitised prompt, gets text back. The provider should have a signed DPA with you, be configured for zero data retention, and not use your conversations for training.
The integration layer connects to your CRM, ticketing system, knowledge base, whatever. Each connection is a data flow you need to document in your DPIA.
One principle runs through all of it: data flows through your server, not around it. You're the data controller. Act like one.
Choosing Your LLM Provider (The GDPR Way)
Not all LLM providers are equal from a data protection standpoint. Here's how the main options compare in 2026:
OpenAI API (GPT-4o, GPT-4.5)
- DPA: Available, sign it before you write a line of code
- Data retention: Zero-retention available via API settings. Opt in explicitly — it's not the default
- Training: API data not used for training when zero-retention is enabled
- Data residency: EU hosting available through Azure OpenAI Service
- Sub-processors: Published list, regularly updated
- Verdict: Solid choice if you configure it properly. The Azure route gives you EU data residency, which simplifies your DPIA
Anthropic Claude API
- DPA: Available
- Data retention: Doesn't train on API data by default — this is a meaningful distinction
- Training: API conversations excluded from training unless you opt in
- Data residency: US-based processing (EU options expanding)
- Sub-processors: Published list
- Verdict: The default no-training policy is a strong starting point. Good for businesses that want fewer configuration steps
Self-Hosted (Llama 3, Mistral, Qwen)
- DPA: Not applicable — you host it yourself
- Data retention: Entirely under your control
- Training: Your data never leaves your infrastructure
- Data residency: Wherever you host it
- Verdict: Maximum data control. But you're paying £200-£500/month for GPU hosting, and the model quality is a step behind the leaders for complex conversations. Good for regulated industries where data can't leave your environment
What to Check Before You Sign
For any provider, verify these four things:
- DPA available and signed — not "available on request," actually signed
- Data residency options — where does data go during processing?
- Training data policy — are your conversations feeding the next model version?
- Sub-processor list — who else touches your data?
If the provider can't give you clear answers to all four, pick a different provider.
What GDPR actually requires from your chatbot
GDPR is principles-based — there's no checklist you tick and walk away. But there are specific requirements that apply to chatbots, and most of them come down to architectural decisions you make once, not ongoing headaches.
Lawful basis
You need a legal reason to process the data. For customer support chatbots, legitimate interest works — you have a genuine business need to answer queries, and customers reasonably expect it. Document this in a Legitimate Interest Assessment. For sales or marketing chatbots that proactively engage visitors, you'll probably need consent.
Don't default to consent for everything. Consent can be withdrawn mid-conversation, which breaks your chatbot's ability to function. Legitimate interest is more stable when it genuinely applies.
Transparency
Users need to know they're talking to an AI (the EU AI Act makes this a legal requirement from August 2025), what data you're collecting, and what happens to it. One line above the chat input handles all three: "You're chatting with an AI assistant. Conversations deleted after 90 days. [Privacy notice]"
That's it. Not a modal. Not a wall of text. One line.
Data minimisation
This is where your architecture earns its money. When a user gives their order number, look up the order in your system and send the order details to the LLM — not the user's full name, email, and address. The LLM doesn't need identifying information to tell someone their package shipped yesterday.
Limit your context window too. Don't send the entire conversation history with every API call. The last 5-10 messages is usually enough. And if the chatbot handles FAQs, it doesn't need the user's email address. Don't ask for information the query doesn't require.
Purpose limitation
A support conversation is for resolving the support issue. You can't later feed it into a marketing segmentation model or use it to fine-tune a custom model without separate consent. Tag conversations with their purpose in your database. Enforce access controls so marketing can't query support logs.
Retention
Set retention periods and enforce them automatically. Support conversations: 30-90 days. Sales enquiries: until resolved plus 30 days. General FAQ: 7-14 days. Complaints and disputes: 6-12 months.
Build a cron job that purges expired conversations every night. Don't rely on someone remembering to do it manually — they won't.
And configure your LLM provider for zero retention. Your conversations should live in YOUR database with YOUR retention rules, not sitting on OpenAI's servers for 30 days because you forgot to change the default.
Security
Standard web application security applies — TLS 1.2+ everywhere, encryption at rest for conversation logs, access controls, audit logging, API keys in environment variables not hardcoded in source. The chatbot-specific addition is prompt injection protection: making sure users can't trick your chatbot into revealing system prompts, other users' data, or doing things it shouldn't.
Data subject rights
People can request access to their conversations, deletion of their data, a portable export, or object to processing entirely. You need to respond within 30 days under UK GDPR.
Practically: store conversations linked to a user identifier. Build an export function and a deletion function that work on a per-user basis. Actually test both before you go live. If you're using zero-retention with your LLM provider, deletion is simpler — you only need to wipe your own database.
Consent and Transparency Done Right
Here's what good chatbot consent looks like. Not a wall of legal text. Not a dark pattern. Just clear information.
Above the chat input:
"Hi! I'm [Company]'s AI assistant. I can help with orders, returns, and general questions. [Privacy notice] | [Talk to a human]"
Before collecting personal details:
"I'll need your order number to look that up. We'll use it only to find your order and delete this conversation after 90 days."
Cookie/tracking consent: If your chatbot widget sets cookies or uses analytics, handle that through your existing cookie consent mechanism. Don't add a second consent layer.
EU AI Act disclosure: From August 2025, you must tell users they're interacting with an AI system. The line "I'm [Company]'s AI assistant" covers this. Simple.
The pattern: be honest, be brief, be accessible. People don't read long notices. They do read one-line disclosures.
What It Costs: Compliant vs. Non-Compliant
Let's talk money, because that's usually the real question.
Building GDPR-compliant from the start:
| Item | Cost |
|---|---|
| Chatbot development (mid-complexity) | £5,000-£8,000 |
| GDPR compliance layer (DPA, privacy notice, consent, retention, SAR handling) | £1,000-£2,000 |
| DPIA | £1,000-£2,000 |
| Total | £7,000-£12,000 |
Retrofitting compliance after a complaint:
| Item | Cost |
|---|---|
| Emergency GDPR audit | £2,000-£5,000 |
| Re-architecture (adding server middleware, retention, deletion) | £3,000-£8,000 |
| DPIA (rushed) | £1,500-£3,000 |
| Legal advice (because someone complained) | £2,000-£5,000 |
| Total | £8,500-£21,000 |
After an ICO enforcement action:
| Item | Cost |
|---|---|
| Everything above | £8,500-£21,000 |
| ICO fine (for SMEs, typically) | £5,000-£500,000 |
| Reputation damage | Incalculable |
The compliance layer adds roughly 15-20% to the build cost. Retrofitting costs 2-3x more. An enforcement action costs 10-50x more. The maths is straightforward.
Where I see businesses get it wrong
The same mistakes, over and over. All avoidable.
Using consumer ChatGPT for business data. The consumer product (chat.openai.com) and the API are different things with different data handling. Consumer conversations may feed training. The API, configured with zero-retention and a signed DPA, is GDPR-compatible. The consumer chat used to process customer complaints is not. I still see teams pasting customer emails into the free version. It's a data protection incident happening in slow motion.
No DPA with the LLM provider. Signing OpenAI's DPA takes ten minutes. It's in your account settings. Not signing it makes every single conversation a compliance violation. There is genuinely no excuse for this one.
Keeping conversations forever. "We might need them later" is not a retention policy. It's the absence of one. Set a period. Build the cron job. Delete on schedule.
No deletion capability. Someone sends a Subject Access Request or erasure request. You have 30 days to respond under UK GDPR. If your system wasn't built to find and delete a specific user's conversations, that's going to be a very stressful month.
Privacy notice that doesn't mention AI. You added a chatbot. Your privacy notice still says nothing about AI processing, who the sub-processor is, or where the data goes. Most businesses forget this step entirely.
No human escalation. GDPR gives people the right not to be subject to solely automated decision-making. If your chatbot handles complaints, billing disputes, or anything with real consequences — there needs to be a "talk to a human" button. Not buried in a menu. Visible.
Sending raw user data to the LLM. A customer mentions their health condition while asking about a refund. Your server sends the entire message, health details included, to the model. The LLM doesn't need that. Strip what's unnecessary before the API call. It's better for compliance and it actually produces better responses — less noise, more signal.
Where to start
Sign the DPA with your LLM provider before you write a single line of code. If you're considering OpenAI, we wrote a detailed guide to making ChatGPT API GDPR compliant.
Then design the architecture — server in the middle, always. Build the chatbot functionality. Add the compliance layer: consent, transparency, retention, deletion, SAR handling. Complete the DPIA. Update your privacy notice. And test the data subject rights flows end to end. Can you actually find, export, and delete a specific user's data? Don't assume it works. Prove it.
If you want this done right from the start rather than paying twice to fix it later — that's the service we offer. One team, one engagement, working chatbot plus all the compliance documentation.
Related Reading
- How Much Does an AI Chatbot Cost? Real Pricing Breakdown for 2026 — detailed cost guide covering every chatbot type
- Do I Need a DPIA for My AI System? — step-by-step DPIA guidance for AI projects
- How to Automate Customer Support With AI — the full guide to AI-powered customer service
Need a GDPR-compliant chatbot built properly? Start with a £500 scoping review. We assess the architecture, provider setup, DPIA scope, and documentation requirements first. If the use case is a fit, our AI Chatbot + Compliance Package starts at £3,500.
Frequently Asked Questions
Does my AI chatbot need to be GDPR compliant?
If your chatbot processes personal data of people in the UK or EU — and it almost certainly does (names, email addresses, conversation content, IP addresses, device data) — then yes. GDPR applies regardless of where your business is based. The ICO has specifically identified AI chatbots as an area of enforcement focus.
Can I use ChatGPT or Claude API for a GDPR-compliant chatbot?
Yes, but with safeguards. Both OpenAI and Anthropic offer Data Processing Agreements, EU data residency options, and zero-data-retention API configurations. You need to: sign their DPA, enable zero-retention mode so conversations aren't used for training, configure EU data residency where available, and document the data flows in your DPIA. Using the API with proper configuration is GDPR-compatible. Using the consumer chat interface for business purposes is not.
What personal data does a chatbot collect?
More than you think. Direct data: anything the user types (names, email addresses, order numbers, complaints, health information if they mention it). Indirect data: IP addresses, device information, session IDs, timestamps, conversation history, browser fingerprint. Derived data: sentiment analysis, topic classification, user preferences inferred from conversations. All of this is personal data under GDPR.
Do I need a DPIA for my AI chatbot?
Almost certainly yes. The ICO says DPIAs are required for processing using new technologies (AI qualifies) and systematic monitoring of individuals (chatbot conversations qualify). A DPIA documents what data you process, why, the risks to individuals, and what safeguards you have in place. It's both a legal requirement and a genuinely useful exercise that forces you to think about data protection before you launch.
How do I handle conversation data retention?
Keep conversations only as long as needed. For customer support chatbots, 30-90 days covers most follow-up needs. For sales chatbots, retain until the enquiry is resolved plus a reasonable period. Never retain indefinitely. Configure your LLM provider for zero-retention (conversations not stored or used for training). Store conversation logs in your own database with automatic deletion after your retention period. Tell users in your privacy notice how long you keep their conversations.
Start with a £500 scoping review
If you need GDPR documentation, AI Act work, or a compliant AI build, the first step is a written scoping review. You get a real report, not a generic discovery call.
Related Articles
AI Development
Outsource AI Development UK: Why Compliance Must Be Part of the Build
Hiring an AI developer who doesn't handle GDPR and AI Act compliance means paying twice. Why the build and the documentation should come from the same team.
AI Development
AI Credit Scoring in Nigeria: How to Build It and Keep the Regulators Happy
How to build an AI credit scoring system for the Nigerian market. What it costs, what data works, and how to satisfy both the NDPA and CBN requirements.
AI Development
AI Vendor Due Diligence: What to Check Before Hiring an AI Development Partner
Before you sign with an AI agency, ask these questions. Technical capability, data handling, compliance knowledge, and pricing transparency — a practical checklist for businesses hiring AI developers.