AI Credit Scoring in Nigeria: How to Build It and Keep the Regulators Happy
Nigeria has over 100 million adults. Most of them have never had a credit score.
The credit bureaus — CRC, FirstCentral, CreditRegistry — cover a fraction of the population. If you don't have a bank account with a Tier 1 bank and a history of formal lending, you're invisible to traditional scoring. That's the majority of the country.
This is why 37.5% of Nigerian fintechs already use some form of AI for credit decisions. The data exists — it's just not in the places traditional models look. Mobile money transactions, airtime purchases, utility payments, device usage patterns. All of it tells a story about financial behaviour. AI reads that story.
If you're building a fintech in Nigeria, you probably already know this. The question isn't whether to use AI for credit scoring. It's how to build it properly — technically sound, commercially viable, and compliant enough that when the NDPC comes knocking, you've got answers.
The Alternative Data That Actually Works
Not all alternative data is created equal. Some sources are highly predictive. Others are noise dressed up as signal. Here's what's actually working in the Nigerian market right now.
High-Value Data Sources
Mobile money transaction history — This is the gold standard for alternative credit data in Nigeria. Transaction patterns from MTN MoMo, OPay, and PalmPay reveal income regularity, spending discipline, and financial relationships. You can access this through direct partnerships or via aggregators. Consistency matters more than volume — someone who receives and sends ₦30,000 reliably every week is often a better risk than someone with irregular ₦500,000 deposits.
Bank statement analysis — APIs from Mono and Okra let you pull categorised bank statement data with customer consent. You get income patterns, recurring payments, gambling activity, loan repayments to other lenders, and balance trends. This is the most information-dense single source. Cost: ₦50-150 per API call depending on your volume agreement.
BVN verification data — The Bank Verification Number gives you identity confirmation and demographic data. It's a baseline, not a predictor. But it's required for KYC and links your customer to their other financial records.
Airtime and data purchase patterns — Surprisingly predictive. Regular top-up patterns correlate with income stability. Sudden drops in airtime spending can signal financial distress before it shows up in repayment behaviour. Telco data partnerships are harder to negotiate directly, but aggregators like CredoLab and Branch International have shown this data improves model accuracy by 15-20%.
Useful but Secondary
Utility payment records — Electricity (prepaid meter), water, and waste management payments. Good supplementary signal, but coverage is patchy. EKEDC, IKEDC, and other DisCos have digital records, but accessing them at scale requires individual partnerships.
Device and behavioural data — What phone the customer uses, how they interact with your app, what time they apply. This data is accessible (it's on their device) but comes with serious privacy implications. The NDPA requires explicit consent for this type of collection, and you need to justify its necessity.
Social commerce activity — Instagram and WhatsApp Business transaction records can show business revenue for self-employed borrowers. But scraping social media data without explicit consent is an NDPA violation. If the customer provides it voluntarily as part of their application, it's fair game.
What to Avoid
Don't build models on social media connections, contact lists, or SMS content. Some early-generation lending apps did this. Several got banned. The NDPC and CBN both consider this disproportionate data collection. It's not worth the regulatory risk, and the predictive value is marginal anyway.
How to Build It: Architecture That Works
Here's what a production-grade AI credit scoring system looks like. This isn't a research project — it's the architecture that actually ships.
1. Data Ingestion Layer
Your system needs to pull from multiple providers in real time. Build an abstraction layer that normalises data from different sources into a common format.
- Mono/Okra connectors for bank statements (REST APIs, webhook-based)
- Telco data pipeline via aggregator partnership
- BVN verification via NIBSS or licensed providers
- Internal data — your own customers' repayment history is your most valuable asset after the first 6 months
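A minimal sketch of that abstraction layer, assuming two made-up payload shapes. The field names below are hypothetical and are not the actual Mono, Okra, or mobile money schemas. The point is only that every connector maps its provider's format into one internal `Transaction` type, so the rest of the pipeline never sees provider-specific fields.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Transaction:
    """Common internal format shared by every data source."""
    source: str       # e.g. "bank" or "mobile_money"
    posted_on: date
    amount_kobo: int  # signed: positive = inflow, negative = outflow
    narration: str


def from_bank_record(record: dict) -> Transaction:
    # Hypothetical bank-statement payload: amount in naira plus a credit/debit type.
    sign = 1 if record["type"] == "credit" else -1
    return Transaction(
        source="bank",
        posted_on=date.fromisoformat(record["date"]),
        amount_kobo=sign * round(record["amount"] * 100),
        narration=record.get("narration", ""),
    )


def from_wallet_record(record: dict) -> Transaction:
    # Hypothetical mobile-money payload: signed amount already in kobo.
    return Transaction(
        source="mobile_money",
        posted_on=date.fromisoformat(record["timestamp"][:10]),
        amount_kobo=int(record["amount_kobo"]),
        narration=record.get("description", ""),
    )


bank_tx = from_bank_record(
    {"type": "debit", "date": "2024-03-01", "amount": 1500.50, "narration": "POS"}
)
wallet_tx = from_wallet_record(
    {"timestamp": "2024-03-02T09:15:00", "amount_kobo": 3000000}
)
```

Storing amounts as signed integers in kobo avoids floating-point drift when you later sum thousands of transactions.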
Design for failure. API calls to third-party providers will time out, return partial data, or fail entirely. Your system needs graceful degradation: score with whatever data is available, and flag low-confidence scores for manual review.
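One way to sketch graceful degradation, under stated assumptions: each provider is wrapped in a fetch function, a failure is recorded as missing data rather than raised, and a simple weighted coverage figure decides whether the score goes to manual review. The source weights and the 0.6 threshold are illustrative placeholders, not recommendations.

```python
from typing import Callable, Dict, Optional


def fetch_with_fallback(fetchers: Dict[str, Callable[[], dict]]) -> Dict[str, Optional[dict]]:
    """Call every provider; record None instead of raising when one fails."""
    results: Dict[str, Optional[dict]] = {}
    for name, fetch in fetchers.items():
        try:
            results[name] = fetch()
        except Exception:  # timeouts, HTTP errors, malformed payloads
            results[name] = None
    return results


def data_confidence(results: Dict[str, Optional[dict]], weights: Dict[str, float]) -> float:
    """Share of the total source weight that actually came back with data."""
    available = sum(w for name, w in weights.items() if results.get(name))
    return available / sum(weights.values())


def failing_provider() -> dict:
    raise TimeoutError("provider did not respond")


# Illustrative weights and threshold; tune them against your own portfolio.
WEIGHTS = {"bank": 0.5, "mobile_money": 0.3, "bvn": 0.2}
MANUAL_REVIEW_BELOW = 0.6

results = fetch_with_fallback({
    "bank": failing_provider,
    "mobile_money": lambda: {"transactions": 42},
    "bvn": lambda: {"verified": True},
})
confidence = data_confidence(results, WEIGHTS)
needs_manual_review = confidence < MANUAL_REVIEW_BELOW
```

Here the bank statement call fails, so only half of the expected data weight is available and the application is flagged for review instead of being rejected outright.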
2. Feature Engineering Pipeline
Raw data isn't useful to an ML model. You need to transform it into features — calculated variables that capture financial behaviour.
Good features for Nigerian credit scoring:
- Income regularity score — variance in monthly inflows over 3-6 months
- Expense-to-income ratio — monthly outflows divided by inflows
- Savings behaviour — minimum balance trends over time
- Loan stacking indicator — concurrent active loans with other lenders
- Transaction velocity — frequency and consistency of financial activity
- Airtime stability index — consistency of phone credit purchases
Aim for 50-100 engineered features from your raw data. More than that and you're likely overfitting. Fewer and you're leaving signal on the table.
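As a rough illustration of two of these features, the sketch below measures income regularity with the coefficient of variation (standard deviation divided by mean) and computes a simple expense-to-income ratio. The coefficient of variation is one possible choice of regularity measure, not the only one, and the sample inflows are made up.

```python
from statistics import mean, pstdev


def income_regularity(inflows):
    """Coefficient of variation of periodic inflows (lower = steadier income)."""
    avg = mean(inflows)
    return pstdev(inflows) / avg if avg > 0 else float("inf")


def expense_to_income(outflows, inflows):
    """Total outflows divided by total inflows over the same window."""
    total_in = sum(inflows)
    return sum(outflows) / total_in if total_in > 0 else float("inf")


# Eight weeks of inflows in naira (made-up numbers).
steady = [30_000] * 8
lumpy = [500_000, 0, 0, 0, 20_000, 0, 350_000, 0]

steady_cv = income_regularity(steady)
lumpy_cv = income_regularity(lumpy)
ratio = expense_to_income([24_000] * 8, steady)
```

The steady earner gets a coefficient of variation of zero while the lumpy one scores well above 1, even though the lumpy account moves far more money. That is the "consistency matters more than volume" point expressed as a number the model can use.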
3. The ML Model
For credit scoring, gradient boosting models (XGBoost, LightGBM) consistently outperform other approaches on this kind of data. They handle tabular data well, train fast, and work well with explanation tools such as SHAP.
Neural networks can add marginal accuracy but make explainability harder — and the NDPA's Section 37 requires you to explain decisions. A model you can't explain is a model you can't deploy compliantly.
Recommended approach:
- Primary model: LightGBM or XGBoost for credit decisioning
- Calibration layer: Platt scaling to convert model outputs to actual probability of default
- Segmented models: Different models for salaried vs. self-employed vs. gig workers perform better than one model for everyone
Train on your own portfolio data once you have 6+ months of repayment outcomes. Before that, use industry benchmarks and conservative thresholds, then retrain as your data grows.
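To make the calibration layer concrete, here is a small pure-Python sketch of Platt scaling: it fits a sigmoid `p = 1 / (1 + exp(-(a * score + b)))` to raw model outputs and observed outcomes by gradient descent on log loss. The scores and outcomes are toy data, and the learning rate and epoch count are arbitrary. In practice you would fit this on a held-out set, and scikit-learn's `CalibratedClassifierCV` with `method="sigmoid"` does the same job.

```python
import math


def fit_platt(scores, labels, lr=0.1, epochs=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on log loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrated_pd(score, a, b):
    """Calibrated probability of default for one raw model output."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


# Toy held-out set: raw model margins and observed outcomes (1 = defaulted).
raw_scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
defaulted = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

a, b = fit_platt(raw_scores, defaulted)
low_risk_pd = calibrated_pd(-2.0, a, b)
high_risk_pd = calibrated_pd(2.5, a, b)
```

Calibration matters because loan pricing and provisioning depend on the probability itself, not just on the ranking of applicants.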
4. Scoring API
Your credit scoring model needs to serve real-time decisions. Build a REST API that takes a customer identifier, fetches their data, runs the model, and returns a score — all within 5-10 seconds.
The API response should include:
- Credit score (normalised 0-1000 or similar)
- Risk band (approved, manual review, declined)
- Recommended loan terms (amount, rate, tenor)
- Top factors (the 3-5 features most responsible for this score)
- Confidence level (how much data was available to make this decision)
That "top factors" field isn't optional. It's your Section 37 explainability requirement built directly into the system output.
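A possible shape for that response, sketched with dataclasses. All field names, the example loan terms, and the band cut-offs (700 and 500) are hypothetical placeholders. The mapping from probability of default to a 0-1000 score is also just one simple convention.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Factor:
    feature: str
    points: int  # signed contribution to the score


@dataclass
class ScoreResponse:
    customer_id: str
    score: int                         # 0-1000, higher means lower risk
    risk_band: str                     # "approved", "manual_review" or "declined"
    recommended_terms: Optional[dict]  # amount, rate, tenor; None when declined
    top_factors: list                  # Factor objects, largest impact first
    confidence: float                  # share of expected data that was available


def to_score(probability_of_default: float) -> int:
    """Map a calibrated probability of default onto a 0-1000 scale."""
    return round((1.0 - probability_of_default) * 1000)


def to_band(score: int, approve_at: int = 700, decline_below: int = 500) -> str:
    # Cut-offs are placeholders, not recommendations.
    if score >= approve_at:
        return "approved"
    return "declined" if score < decline_below else "manual_review"


score = to_score(0.18)
response = ScoreResponse(
    customer_id="cust_123",
    score=score,
    risk_band=to_band(score),
    recommended_terms={"amount_ngn": 150_000, "monthly_rate": 0.05, "tenor_days": 90},
    top_factors=[
        Factor("income_regularity", 180),
        Factor("expense_to_income", 120),
        Factor("loan_stacking", -90),
    ],
    confidence=0.8,
)
payload = asdict(response)  # plain dict, ready to serialise as JSON
```

Keeping `top_factors` and `confidence` as required fields means a response cannot be built without them, which stops the explainability data from quietly going missing.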
5. Explainability Layer
This is where most fintechs cut corners and pay for it later.
Use SHAP (SHapley Additive exPlanations) values to decompose every individual score into feature contributions. For each customer, you can show: "Your score was most influenced by: (1) consistent income deposits over 4 months (+180 points), (2) low expense-to-income ratio (+120 points), (3) active loan with another lender (-90 points)."
This isn't just for the regulator. It's good UX. Customers who understand why they were declined are more likely to improve their profile and come back.
Store SHAP values for every score. When the NDPC asks how a decision was made, you pull them up. When a customer contests a decision, your support team can explain it in plain language.
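To show what SHAP values actually are, the sketch below computes exact Shapley values by brute force for a toy three-feature scoring function. This is an illustration of the underlying idea only. The scoring function, feature names, and baseline applicant are all made up, and for a real LightGBM or XGBoost model you would use the `shap` library's `TreeExplainer` rather than enumerating coalitions.

```python
from itertools import combinations
from math import factorial


def shapley_values(score_fn, applicant, baseline):
    """Exact Shapley values: features outside a coalition take baseline values."""
    names = list(applicant)
    n = len(names)

    def value(coalition):
        mixed = {k: (applicant[k] if k in coalition else baseline[k]) for k in names}
        return score_fn(mixed)

    contributions = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_feature = value(set(subset) | {name})
                without_feature = value(set(subset))
                total += weight * (with_feature - without_feature)
        contributions[name] = total
    return contributions


def toy_score(f):
    # Hypothetical stand-in for a trained model, with one interaction term.
    score = 500 + 45 * f["months_regular_income"] - 300 * f["expense_ratio"]
    if f["active_loans"] > 0 and f["expense_ratio"] > 0.8:
        score -= 90
    return score


applicant = {"months_regular_income": 4, "expense_ratio": 0.9, "active_loans": 1}
baseline = {"months_regular_income": 0, "expense_ratio": 0.5, "active_loans": 0}
phi = shapley_values(toy_score, applicant, baseline)
```

The contributions always sum to the difference between the applicant's score and the baseline score. That lets you present them as "points gained or lost" without the numbers failing to add up.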
6. Human Review Workflow
Section 37 of the NDPA requires the right to human intervention for automated decisions. You need a practical process for this.
Build a review queue for:
- Scores that fall in the borderline range (e.g., the middle 15% of your risk distribution)
- Any customer who requests human review of their decision
- Scores with low confidence (limited data available)
- Flagged anomalies (data inconsistencies, potential fraud indicators)
The human reviewer should see the full score breakdown, the raw data, and the SHAP explanation. Give them the power to override the model in either direction. Track override rates — if humans are overriding the model more than 10-15% of the time, your model needs retraining.
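The routing rules and the override check can be sketched in a few lines. The borderline score range and confidence threshold below are assumed placeholders. You would derive the real band from your own risk distribution.

```python
def needs_human_review(score, confidence, customer_requested=False, anomaly=False,
                       borderline=(540, 660), min_confidence=0.6):
    """Return the reasons (if any) a decision should go to the review queue."""
    reasons = []
    if borderline[0] <= score <= borderline[1]:
        reasons.append("borderline_score")
    if customer_requested:
        reasons.append("customer_request")
    if confidence < min_confidence:
        reasons.append("low_confidence")
    if anomaly:
        reasons.append("anomaly_flag")
    return reasons


def override_rate(reviews):
    """Share of reviewed cases where the human decision differed from the model's."""
    if not reviews:
        return 0.0
    overridden = sum(1 for r in reviews if r["model"] != r["human"])
    return overridden / len(reviews)


queued = needs_human_review(score=610, confidence=0.45)
not_queued = needs_human_review(score=820, confidence=0.9)

# Ten reviewed cases, two of which the reviewer overturned.
reviews = [{"model": "declined", "human": "declined"}] * 8 + [
    {"model": "declined", "human": "approved"},
    {"model": "approved", "human": "declined"},
]
rate = override_rate(reviews)
retrain_flag = rate > 0.15
```

Recording the reason codes alongside each queued case gives you an audit trail for why a human was involved in the decision.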
7. Monitoring Dashboard
Models degrade. Data sources change. Customer populations shift. You need to watch for:
- Model drift — is prediction accuracy declining over time?
- Data quality — are API providers returning consistent, complete data?
- Bias drift — are approval rates diverging between demographic groups?
- Default rate tracking — are actual defaults matching predicted defaults?
- Feature importance shifts — are the model's decision factors changing?
Check weekly. Act monthly. Retrain quarterly at minimum.
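Two of these checks are easy to sketch. The first uses the Population Stability Index (PSI), a common industry measure of how far the current score distribution has moved from the training distribution. PSI is a technique chosen here for illustration and is not something prescribed above. The second compares predicted and observed default rates. The bucket edges and sample scores are made up.

```python
import math


def bucket_shares(values, edges):
    """Share of values in each bucket; the last bucket is open-ended."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            upper_ok = v < edges[i + 1] or i == len(counts) - 1
            if edges[i] <= v and upper_ok:
                counts[i] += 1
                break
    # A small floor avoids log(0) for empty buckets.
    return [max(c / len(values), 1e-6) for c in counts]


def population_stability_index(expected, actual, edges):
    """PSI between a reference score sample and a recent one."""
    e = bucket_shares(expected, edges)
    a = bucket_shares(actual, edges)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def default_rate_gap(predicted_pds, outcomes):
    """Observed default rate minus mean predicted probability of default."""
    return sum(outcomes) / len(outcomes) - sum(predicted_pds) / len(predicted_pds)


EDGES = [0, 250, 500, 750, 1000]
training_scores = [100, 300, 300, 400, 600, 600, 600, 800, 800, 900]
recent_scores = [100, 100, 100, 100, 200, 300, 300, 600, 600, 800]

psi_same = population_stability_index(training_scores, training_scores, EDGES)
psi_shifted = population_stability_index(training_scores, recent_scores, EDGES)
gap = default_rate_gap([0.1, 0.2, 0.3, 0.2], [0, 1, 0, 1])
```

A commonly quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift worth investigating, but treat that as a starting point rather than a standard.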
What It Costs
Build Costs
| Component | Basic Model | Production System |
|---|---|---|
| Data pipeline (2-3 sources) | ₦1-2M (£1,000-£2,000) | ₦2-4M (£2,000-£4,000) |
| Feature engineering + ML model | ₦1-2M (£1,000-£2,000) | ₦2-3M (£2,000-£3,000) |
| Scoring API | ₦500K-1M (£500-£1,000) | ₦1-2M (£1,000-£2,000) |
| Explainability (SHAP) | — | ₦1-2M (£1,000-£2,000) |
| Human review workflow | — | ₦500K-1M (£500-£1,000) |
| Monitoring dashboard | — | ₦1-2M (£1,000-£2,000) |
| Bias testing + compliance docs | — | ₦1-2M (£1,000-£2,000) |
| Total | ₦3-6M (£3,000-£6,000) | ₦8-15M (£8,000-£15,000) |
The basic model gets you a working credit scorer with limited data sources and no compliance layer. It's fine for internal testing but not for production lending at scale.
The production system is what you ship to real customers. It includes everything the NDPC and CBN expect to see.
Running Costs (Monthly)
| Item | Cost |
|---|---|
| Cloud hosting (AWS/GCP Africa region) | ₦50-150K (£50-£150) |
| Data provider API calls (per 1,000 scores) | ₦50-100K (£50-£100) |
| Model monitoring tools | ₦20-50K (£20-£50) |
| Quarterly model retraining | ₦30-50K (£30-£50) amortised |
| Total monthly | ₦150-400K (£150-£400) |
Build vs. Buy
Licensing a third-party scoring engine (Creditinfo, LenddoEFL, or similar) costs $2,000-5,000/month. At roughly ₦1,500 to the dollar, that's ₦3-7.5 million per month just in licensing fees.
A custom build at ₦8-15 million total pays for itself within 12-18 months compared to licensing. Plus you own the model, your data stays on your infrastructure, and you're not dependent on a vendor who might change their pricing or exit the Nigerian market.
The ROI
Manual underwriting costs ₦2,000-5,000 per application (staff time, verification calls, document review). AI scoring costs ₦50-150 per application at scale.
If you process 5,000 loan applications per month:
- Manual: ₦10-25 million/month in underwriting costs
- AI-assisted: ₦250-750K/month in scoring costs + ₦1.5-5 million for the 15-20% that go to human review
- Monthly saving: roughly ₦8-19 million
The system pays for itself in the first month or two. After that, it's pure margin improvement.
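The figures above are rounded. The arithmetic behind them can be reproduced with the short sketch below, which assumes the low-end inputs go together (cheapest manual cost, cheapest scoring, 15% review share) and the high-end inputs go together. That pairing is an assumption made for illustration.

```python
def monthly_costs(applications, manual_cost, ai_cost, review_share):
    """Return (manual-only cost, AI-assisted cost) in naira per month."""
    manual_only = applications * manual_cost
    # Every application is scored; a share still goes to a human reviewer.
    ai_assisted = applications * ai_cost + applications * review_share * manual_cost
    return manual_only, ai_assisted


low_manual, low_ai = monthly_costs(5_000, 2_000, 50, 0.15)
high_manual, high_ai = monthly_costs(5_000, 5_000, 150, 0.20)

low_saving = low_manual - low_ai      # about ₦8.25 million
high_saving = high_manual - high_ai   # about ₦19.25 million
```

Swap in your own application volume and per-application costs to see where your break-even point falls.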
Bias Testing: The Risk You Can't Ignore
AI credit scoring in Nigeria carries real bias risks. The data itself reflects existing inequalities.
Geographic bias — Urban customers have more digital financial data than rural ones. If your model penalises thin files, you're systematically disadvantaging rural borrowers. Test approval rates by state and urban/rural classification.
Gender bias — Women in Nigeria have lower rates of formal bank account ownership. If bank statement data is a primary feature, the model may score women lower not because they're worse risks but because they have less formal financial history. Test approval rates and default rates by gender separately.
Age bias — Younger applicants have shorter financial histories. Your model might conflate "short history" with "high risk" when it actually just means "young." Test across age bands.
Device bias — Using device type (iPhone vs. budget Android) as a feature is effectively using wealth as a proxy. It's circular — you're scoring people partly on whether they already have money.
How to Test
Run your model on historical data and measure:
- Demographic parity — Are approval rates roughly proportional across groups?
- Equalised odds — Among people who actually repay, are approval rates equal across groups? Among people who default, are rejection rates equal?
- Calibration — When the model says "70% chance of repayment," is the actual repayment rate 70% for all demographic groups?
Perfect fairness across all metrics simultaneously is generally mathematically impossible when default rates differ between groups, because the metrics trade off against each other. But you should be able to show you tested, identified disparities, and took reasonable steps to reduce them.
Document every test. Re-run quarterly as your model retrains on new data. The NDPC expects to see this documentation.
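A minimal sketch of the first two metrics on historical decisions with known outcomes. The record format and the toy data are assumptions. Calibration by group would additionally need the predicted probabilities, so it is left out here.

```python
from collections import defaultdict


def rate(flags):
    """Fraction of True values, or NaN when there is nothing to measure."""
    return sum(flags) / len(flags) if flags else float("nan")


def fairness_report(records):
    """records: dicts with 'group', 'approved' (bool) and 'repaid' (bool)."""
    by_group = defaultdict(list)
    for r in records:
        by_group[r["group"]].append(r)
    report = {}
    for group, rows in by_group.items():
        report[group] = {
            # Demographic parity: overall approval rate.
            "approval_rate": rate([r["approved"] for r in rows]),
            # Equalised odds, part 1: approval rate among actual repayers.
            "approval_rate_repayers": rate(
                [r["approved"] for r in rows if r["repaid"]]),
            # Equalised odds, part 2: rejection rate among actual defaulters.
            "rejection_rate_defaulters": rate(
                [not r["approved"] for r in rows if not r["repaid"]]),
        }
    return report


def rec(group, approved, repaid):
    return {"group": group, "approved": approved, "repaid": repaid}


history = [
    rec("urban", True, True), rec("urban", True, True),
    rec("urban", True, False), rec("urban", False, False),
    rec("rural", True, True), rec("rural", False, True),
    rec("rural", False, False), rec("rural", False, False),
]
report = fairness_report(history)
parity_gap = report["urban"]["approval_rate"] - report["rural"]["approval_rate"]
```

In this toy data, rural applicants who would have repaid are approved half as often as urban ones. That is the kind of disparity the quarterly documentation should surface and explain.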
The Compliance Layer
NDPA Requirements
Data Protection Impact Assessment — Required before you launch. AI credit scoring is textbook high-risk processing under the NDPA. Your DPIA documents what data you collect, why, what risks exist, and what safeguards you've built. This is the first document an NDPC investigator asks for.
Section 37 compliance — You need to:
- Inform customers that automated decision-making is being used
- Explain what factors influence the scoring (your SHAP explainability layer handles this)
- Offer the right to human review (your review workflow handles this)
- Allow customers to contest decisions
- Document the logic of your model (not the source code — the logic)
Lawful basis — Contractual necessity is your strongest ground. The customer applied for a loan; you need to assess their creditworthiness to deliver the service. Consent works too but is harder to maintain — customers can withdraw consent at any time.
Data minimisation — Don't collect everything just because you can. If your model works well with bank statements, mobile money, and BVN data, don't also scrape device contacts and SMS logs "just in case." Collect what you need and justify each data source.
Cross-border transfers: If you're training your model on AWS or GCP infrastructure outside Nigeria, customer data is crossing borders. Document where it goes and ensure adequate protection. AWS has an Africa region in Cape Town and GCP has one in Johannesburg, which helps with latency and keeps data on the continent, but both are still outside Nigeria, so you still need to verify your specific setup.
CBN Requirements
If you're in the CBN's regulatory sandbox or operating under a fintech licence, additional rules apply:
- Consumer protection guidelines require fair treatment in automated lending
- Risk management framework expects you to document how your AI models are governed
- Reporting requirements may include model performance metrics in your returns
- Capital adequacy calculations should account for model risk
The CBN is increasingly sophisticated about AI in financial services. They're not going to ask you to stop using it — they're going to ask you to prove you're using it responsibly.
DPCO Engagement
If your fintech processes data above the NDPC's prescribed threshold (and if you're running a lending operation, you almost certainly do), you need to engage a Data Protection Compliance Organisation. There are about 146 licensed DPCOs in Nigeria. Almost none of them have experience auditing AI credit scoring systems.
This is a gap we fill. We build the system and provide the compliance documentation. When your DPCO comes to audit, everything is already documented.
Next Steps
Building an AI credit scoring system is a genuine competitive advantage in Nigerian fintech — but only if the system works technically and holds up to regulatory scrutiny. The fintechs that get this right will dominate lending in the next 3-5 years. The ones that cut corners on compliance will learn the hard way that the NDPC is serious about enforcement.
If you're already running a credit scoring model without the compliance layer, that's a fixable problem. If you're building from scratch, it's cheaper to do it right the first time than to bolt on compliance after an enforcement notice.
Ready to build? Talk to us. We build AI credit scoring systems with NDPA compliance baked in from day one.
Related reading:
- NDPA Compliance for Nigerian Fintechs — the full NDPA breakdown for fintech AI systems
- Do I Need a DPIA for My AI System? — step-by-step guide to the assessment process
- Our services and pricing — what we build, what it costs
Frequently Asked Questions
How does AI credit scoring work in Nigeria?
AI credit scoring uses machine learning to assess creditworthiness using alternative data — mobile money transactions, airtime purchase patterns, utility payments, social commerce activity, and device data — alongside traditional data like bank statements and BVN records. The AI identifies patterns that predict repayment behaviour more accurately than traditional scoring models, which is especially valuable in Nigeria where most adults lack formal credit histories.
How much does it cost to build an AI credit scoring system?
A basic AI credit scoring model costs ₦3-6 million (£3,000-£6,000) to build. A production-ready system with multiple data source integrations, real-time scoring API, bias testing, and compliance documentation costs ₦8-15 million (£8,000-£15,000). Running costs are ₦150-400K/month (£150-£400) for hosting, API calls, and model monitoring. Compare that to licensing a third-party scoring engine at $2,000-5,000/month.
What data can I use for AI credit scoring in Nigeria?
Common alternative data sources include mobile money transaction history (via partnerships with MTN MoMo, OPay, PalmPay), airtime and data purchase patterns, utility payment records, bank statement analysis (via Mono, Okra), BVN verification data, and device and behavioural data. Each data source requires a lawful basis under the NDPA, typically consent or contractual necessity. You cannot use data collected for one purpose (e.g., social media) for credit scoring without explicit consent.
Is AI credit scoring legal under the NDPA?
Yes, but with requirements. Section 37 of the NDPA gives individuals the right not to be subject to decisions based solely on automated processing that significantly affect them — and credit decisions definitely qualify. You must provide meaningful information about how the scoring works, offer the right to human review, allow customers to contest decisions, and document your model's logic. A DPIA is required because this is high-risk automated processing of personal data.
How do I test for bias in my credit scoring AI?
Test across protected characteristics: gender, ethnicity, religion, state of origin, age. Run the model on historical data and check whether approval rates, default predictions, and interest rate assignments differ significantly between groups. Use statistical fairness metrics like demographic parity, equalised odds, and calibration. Document everything — the NDPC and CBN both expect you to demonstrate that your model doesn't discriminate. Re-test quarterly as the model learns from new data.