# Install SDK
$ pip install datalent

# Authenticate and load a dataset
import datalent
client = datalent.Client(api_key="your_api_key")

# Load as HuggingFace Dataset
ds = client.load_dataset("urban-scene-detection-v3")
train = ds["train"]  # HuggingFace Dataset ready

# Filter by metadata
night_data = ds.filter(lighting="night", city="tokyo")
Authentication
API Key + OAuth 2.0
Use an API key for server-side access, or OAuth 2.0 for team and SSO environments. Permissions are managed through role-based access control, and all requests are logged for audit compliance.
API Keys · OAuth 2.0 · RBAC · Audit Logs
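For server-side use, the key is usually kept out of source code rather than hard-coded as in the quick-start snippet. The following is a minimal sketch of that pattern; the environment variable name `DATALENT_API_KEY` and the helper `read_api_key` are assumptions made for illustration, and only `datalent.Client(api_key=...)` comes from the snippet above.

```python
import os


def read_api_key(env=os.environ, var="DATALENT_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    DATALENT_API_KEY is an assumed variable name, not one documented by the SDK.
    """
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"Set {var} before creating the client")
    return key


# Usage, following the quick-start snippet:
#   import datalent
#   client = datalent.Client(api_key=read_api_key())
```

Failing fast on a missing key keeps the error close to its cause instead of surfacing later as a rejected request.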
Data Access
Streaming + Snapshot
Stream samples on demand for memory-efficient training, or pull complete dataset snapshots for offline use. All datasets are versioned, and ongoing datasets receive delta updates.
Streaming · Snapshots · Versioned · Delta Updates
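The trade-off between streaming and snapshots can be illustrated without the SDK, whose streaming calls are not shown here. This sketch uses a plain Python generator over JSON Lines text as a stand-in for a remote source; the function names and the sample records are hypothetical.

```python
import io
import json


def stream_samples(lines):
    """Yield one decoded sample at a time, so only one record is held in memory."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def load_snapshot(lines):
    """Materialize every sample in a list, suitable for repeated offline use."""
    return list(stream_samples(lines))


# Stand-in for a remote source: three JSON Lines records (illustrative data).
source = '{"id": 1}\n{"id": 2}\n{"id": 3}\n'

first = next(stream_samples(io.StringIO(source)))  # consumes only the first line
snapshot = load_snapshot(io.StringIO(source))      # consumes the whole source
```

Streaming suits a single pass over data that does not fit in memory; a snapshot costs more up front but allows random access and repeated epochs without refetching.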
Formats
Any Framework
Supported formats: HuggingFace Datasets, PyTorch DataLoader, TensorFlow tf.data, COCO JSON, YOLO format, and CSV/Parquet. Formats are converted automatically on export.
HuggingFace · PyTorch · COCO · YOLO
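As an example of what a format conversion involves, COCO stores a bounding box as `[x_min, y_min, width, height]` in pixels, while YOLO expects `[x_center, y_center, width, height]` normalized by the image size. The sketch below shows that standard conversion; it is not the service's exporter, and `coco_to_yolo` is a name chosen for illustration.

```python
def coco_to_yolo(bbox, img_w, img_h):
    """Convert a COCO pixel box to a YOLO normalized center box.

    bbox is [x_min, y_min, width, height] in pixels.
    Returns (x_center, y_center, width, height), each divided by the image size.
    """
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2) / img_w
    y_center = (y_min + h / 2) / img_h
    return (x_center, y_center, w / img_w, h / img_h)


# A 200x100 box whose top-left corner is at (100, 50) in a 400x200 image.
box = coco_to_yolo([100, 50, 200, 100], img_w=400, img_h=200)
```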
SLA
99.9% Uptime
The Enterprise tier includes a 99.9% uptime SLA, dedicated rate limits, priority support, and a dedicated technical account manager. A free tier is available for evaluation.
99.9% SLA · Priority Support · Free Tier
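To put the uptime figure in concrete terms, the downtime a given percentage permits can be computed directly. This sketch assumes a 30-day measurement window, which is an assumption and not a stated term of the SLA; how uptime is actually measured and what is excluded is not specified here.

```python
def allowed_downtime_minutes(uptime_percent, days=30):
    """Minutes of downtime permitted over `days` at the given uptime percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


# At 99.9% over an assumed 30-day window this is roughly 43 minutes.
monthly_budget = allowed_downtime_minutes(99.9)
```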
Query
Rich Metadata Filtering
Filter any dataset by its metadata fields. Build custom training splits on the fly without downloading entire datasets.
Metadata Filter · Custom Splits · On-the-fly
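The keyword-style filter in the quick-start snippet (`lighting="night", city="tokyo"`) can be mirrored locally to show the idea. This is a sketch of exact-match filtering over in-memory records, not a description of how the service evaluates filters; the record layout with a `metadata` dictionary and the sample values are assumptions.

```python
def filter_by_metadata(samples, **criteria):
    """Keep the samples whose metadata matches every keyword exactly."""
    return [
        s for s in samples
        if all(s.get("metadata", {}).get(k) == v for k, v in criteria.items())
    ]


# Illustrative records; the field names follow the quick-start snippet.
samples = [
    {"id": "a", "metadata": {"lighting": "night", "city": "tokyo"}},
    {"id": "b", "metadata": {"lighting": "day", "city": "tokyo"}},
    {"id": "c", "metadata": {"lighting": "night", "city": "osaka"}},
]

night_tokyo = filter_by_metadata(samples, lighting="night", city="tokyo")
```

Combining several criteria this way yields a custom split, here only the night-time Tokyo samples, without touching the rest of the records.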
Webhooks
Dataset Update Alerts
Subscribe to webhooks for real-time notifications when new data is added, and use them to trigger automated retraining pipelines. Integrates with GitHub Actions and Airflow.
Webhooks · CI/CD · Airflow
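A webhook receiver that triggers retraining would normally authenticate the request before acting on it. The sketch below shows one common approach, an HMAC-SHA256 signature over the raw body. Whether these webhooks are signed at all, the shared secret, and the `dataset.updated` event name are assumptions made for illustration.

```python
import hashlib
import hmac
import json


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Check a hex HMAC-SHA256 signature of the raw body in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)


def should_retrain(body: bytes) -> bool:
    """Return True for the (hypothetical) event type that signals new data."""
    event = json.loads(body)
    return event.get("event") == "dataset.updated"


def handle_webhook(secret: bytes, body: bytes, signature_hex: str) -> str:
    """Decide what to do with one delivery: reject, ignore, or retrain."""
    if not verify_signature(secret, body, signature_hex):
        return "reject"
    return "retrain" if should_retrain(body) else "ignore"
```

In a CI/CD setup, the `"retrain"` outcome is where a GitHub Actions workflow or an Airflow DAG run would be started.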