DATALENT is building the world's largest private codebase training dataset — 50,000+ repositories and growing. If you own production code, we want to partner with you. Your codebase becomes part of something that genuinely moves AI forward.
Your production codebase contains patterns, decisions, and engineering context that no synthetic data generator can replicate.
We evaluate every codebase individually. These are the categories we actively source right now.
Five steps. Designed to respect your time.
These three categories receive the fastest evaluation and strongest interest right now.
The most common concern is intellectual property. Here is exactly how we handle it — no ambiguity.
No code shared at this stage. We review and respond within 5 business days.
By submitting, you confirm this codebase is yours to license. We'll send an NDA before any technical review. Compensation is discussed on the discovery call — no pricing on this page.
Submit your codebase in under 5 minutes. No code shared at this stage. Our team responds within 5 business days.