Programmatic Access to Synthetic CDISC-Compliant Datasets
The CDISC Dataset Generator API provides programmatic access to synthetic clinical trial data in various CDISC formats. You can generate data for SDTM, ADaM, and SEND domains and download them in multiple formats.
Base URL:
Current Version: v1.1
Currently, the API is available without authentication for development and testing purposes.
GET /api
Returns general information about the API, available endpoints, and example usage.
{
  "name": "CDISC Dataset Generator API",
  "description": "REST API for generating synthetic CDISC-compliant datasets",
  "version": "1.0.0",
  "endpoints": [
    {
      "path": "/api/domains",
      "method": "GET",
      "description": "Get all available domains for all dataset types"
    },
    ...
  ]
}
GET /api/domains
Returns all available domains for all dataset types (SDTM, ADaM, and SEND).
{
  "SDTM": {
    "DM": {
      "name": "Demographics",
      "description": "Demographics domain containing subject-level data",
      "class": "Special Purpose"
    },
    ...
  },
  "ADaM": {
    ...
  },
  "SEND": {
    ...
  }
}
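The nested response above is keyed first by dataset type and then by domain code. As a minimal sketch of how a client might turn it into flat rows, the snippet below works on a hand-written sample that mirrors the documented shape. The helper name `flatten_domains` and the `SAMPLE_DOMAINS` constant are illustrative assumptions, not part of the API.

```python
from typing import Dict, List, Tuple

# Sample payload shaped like the documented response. Only the DM entry is
# taken from the example above; ADaM and SEND are left empty on purpose.
SAMPLE_DOMAINS = {
    "SDTM": {
        "DM": {
            "name": "Demographics",
            "description": "Demographics domain containing subject-level data",
            "class": "Special Purpose",
        }
    },
    "ADaM": {},
    "SEND": {},
}


def flatten_domains(payload: Dict[str, Dict[str, dict]]) -> List[Tuple[str, str, str]]:
    """Return (dataset_type, domain_code, name) rows sorted by type and code."""
    rows = []
    for dataset_type, domains in payload.items():
        for code, info in domains.items():
            rows.append((dataset_type, code, info.get("name", "")))
    return sorted(rows)


if __name__ == "__main__":
    for row in flatten_domains(SAMPLE_DOMAINS):
        print(*row, sep="\t")
```

In real use, the argument would be the parsed JSON body of `GET /api/domains` rather than the sample constant.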
GET /api/domains/{dataset_type}
Returns all available domains for a specific dataset type (SDTM, ADaM, or SEND).
dataset_type (required): SDTM, ADaM, or SEND (case-insensitive)
GET /api/domains/SDTM
{
  "DM": {
    "name": "Demographics",
    "description": "Demographics domain containing subject-level data",
    "class": "Special Purpose"
  },
  "VS": {
    "name": "Vital Signs",
    "description": "Vital signs measurements",
    "class": "Findings"
  },
  ...
}
POST /api/generate
Generates a synthetic CDISC dataset based on the provided parameters.
{
  "dataset_type": "SDTM",          // Required: SDTM, ADaM, or SEND
  "domain": "DM",                  // Required: Domain code
  "subjects": 100,                 // Optional: Number of subjects (default: 100)
  "arms": 3,                       // Optional: Number of treatment arms (default: 3)
  "format": "csv",                 // Optional: Output format (csv, sas, xpt) (default: csv)
                                   // Note: "json" format is temporarily unavailable; it will return in a future update with CDISC Dataset-JSON 1.1
  "therapeutic_area": "Neurology", // Optional: Customize for a specific therapeutic area
  "download_file": false           // Optional: If true, returns a download URL instead of data (default: false)
}
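The parameter list above can be checked on the client before a request is sent. The following is a sketch under stated assumptions: the allowed values and defaults are copied from the commented request body, while the function name `build_generate_payload`, the exact-case check on `dataset_type`, and the "positive integer" rule for `subjects` and `arms` are illustrative choices that the API itself may not enforce in the same way.

```python
from typing import Any, Dict, Optional

# Allowed values and defaults are taken from the commented request body above.
DATASET_TYPES = {"SDTM", "ADaM", "SEND"}
FORMATS = {"csv", "sas", "xpt"}


def build_generate_payload(
    dataset_type: str,
    domain: str,
    subjects: int = 100,
    arms: int = 3,
    fmt: str = "csv",
    therapeutic_area: Optional[str] = None,
    download_file: bool = False,
) -> Dict[str, Any]:
    """Validate arguments client-side and return a JSON-serializable body."""
    # Assumption: exact-case match; the server may be more lenient.
    if dataset_type not in DATASET_TYPES:
        raise ValueError(f"dataset_type must be one of {sorted(DATASET_TYPES)}")
    if not domain:
        raise ValueError("domain is required")
    if fmt not in FORMATS:
        raise ValueError(f"format must be one of {sorted(FORMATS)}")
    # Assumption: counts must be positive; actual server limits are not documented.
    if subjects < 1 or arms < 1:
        raise ValueError("subjects and arms must be positive integers")

    payload: Dict[str, Any] = {
        "dataset_type": dataset_type,
        "domain": domain,
        "subjects": subjects,
        "arms": arms,
        "format": fmt,
        "download_file": download_file,
    }
    # Only send therapeutic_area when the caller supplied one.
    if therapeutic_area is not None:
        payload["therapeutic_area"] = therapeutic_area
    return payload
```

The returned dictionary can be passed as the JSON body of `POST /api/generate`. Rejecting `"json"` locally reflects the note that this format is temporarily unavailable.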
Response when download_file is false (data returned inline):
{
  "metadata": {
    "dataset_type": "SDTM",
    "domain": "DM",
    "name": "Demographics",
    "description": "Demographics domain containing subject-level data",
    "class": "Special Purpose",
    "subject_count": 100,
    "treatment_arms": 3,
    "record_count": 100,
    "variable_count": 15,
    "variables": [
      {"name": "STUDYID", "type": "object"},
      {"name": "USUBJID", "type": "object"},
      ...
    ]
  },
  "data": [
    {
      "STUDYID": "STUDY001",
      "USUBJID": "STUDY001-001",
      "AGE": 45,
      "SEX": "M",
      ...
    },
    ...
  ]
}
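When the data comes back inline, each entry in `data` is one record keyed by variable name. As a rough sketch, the helper below writes such a response to CSV text, using the order of `metadata.variables` for the columns where it is available. The function name `records_to_csv` is hypothetical, and the sample values in the usage note are the ones shown in the example response, not real output.

```python
import csv
import io
from typing import Any, Dict, List


def records_to_csv(response: Dict[str, Any]) -> str:
    """Serialize the inline 'data' records of a generate response as CSV text."""
    records: List[Dict[str, Any]] = response.get("data", [])
    variables = response.get("metadata", {}).get("variables", [])

    # Start with the column order declared in the metadata, then append any
    # keys that appear in the records but were not declared.
    columns = [v["name"] for v in variables]
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # missing keys are written as empty cells
    return buffer.getvalue()
```

For a response whose metadata declares `STUDYID` and `USUBJID` and whose single record also carries `AGE` and `SEX`, the header row comes out as `STUDYID,USUBJID,AGE,SEX`.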
Response when download_file is true (download URL returned):
{
  "message": "Dataset generated successfully",
  "dataset_type": "SDTM",
  "domain": "DM",
  "subjects": 100,
  "arms": 3,
  "format": "csv",
  "file_id": "sdtm_dm_20250405123456.csv",
  "download_url": "/api/download/sdtm_dm_20250405123456.csv",
  "record_count": 100
}
GET /api/download/{file_id}
Downloads a previously generated file by its ID. Files are automatically deleted after 1 hour. Note: This endpoint is not needed if you use the direct_download parameter with the generate endpoint.
file_id (required): File ID returned from the generate endpoint
Returns the requested file with the appropriate MIME type for download.
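Because `download_url` is returned as a path and files are deleted after 1 hour, a client may want to resolve the full URL and track whether a file is likely still available. The sketch below assumes the client records its own timestamp when the file is generated; the names `download_url_for` and `is_probably_expired` are illustrative, and `https://your-app-url` is the same placeholder host used elsewhere in this document.

```python
from datetime import datetime, timedelta
from urllib.parse import urljoin

# Retention window stated in the endpoint description above.
FILE_TTL = timedelta(hours=1)


def download_url_for(base_url: str, generate_result: dict) -> str:
    """Combine the API base URL with the path returned by /api/generate."""
    return urljoin(base_url, generate_result["download_url"])


def is_probably_expired(generated_at: datetime, now: datetime) -> bool:
    """Client-side estimate: True once the retention window has elapsed.

    Assumption: 'generated_at' is a timestamp recorded by the client when the
    generate call returned; the server's own clock may differ slightly.
    """
    return now - generated_at >= FILE_TTL


if __name__ == "__main__":
    result = {"download_url": "/api/download/sdtm_dm_20250405123456.csv"}
    print(download_url_for("https://your-app-url", result))
```

If the estimate says the file has expired, the safer course is to call the generate endpoint again rather than retry the download.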
GET /api/therapeutic-areas
Returns all available therapeutic areas that can be used to customize dataset generation.
{
  "therapeutic_areas": {
    "Neurology": {
      "common_conditions": ["Alzheimer's Disease", "Parkinson's Disease", "Multiple Sclerosis"],
      "common_medications": ["Levodopa", "Carbidopa", "Memantine"],
      "common_lab_tests": ["Cerebrospinal Fluid Analysis", "Acetylcholine Receptor Antibody", "Oligoclonal Bands"],
      "condition_count": 10,
      "medication_count": 10,
      "lab_test_count": 5
    },
    "Cardiology": {
      "common_conditions": ["Coronary Artery Disease", "Heart Failure", "Arrhythmias"],
      "common_medications": ["Atorvastatin", "Metoprolol", "Lisinopril"],
      "common_lab_tests": ["Lipid Panel", "Cardiac Enzymes", "B-type Natriuretic Peptide"],
      "condition_count": 8,
      "medication_count": 10,
      "lab_test_count": 6
    },
    ...
  },
  "count": 24
}
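The therapeutic area names in this response are the values accepted by the `therapeutic_area` parameter of the generate endpoint. As a small sketch, the helpers below summarize the counts per area and map a user-supplied name to the spelling used in the response. Both function names are hypothetical, the case-insensitive matching is a client-side convenience rather than documented server behavior, and the sample contains only the two areas shown above.

```python
from typing import Dict, Tuple

# Trimmed sample with the two areas shown in the example response.
SAMPLE_AREAS = {
    "therapeutic_areas": {
        "Neurology": {"condition_count": 10, "medication_count": 10, "lab_test_count": 5},
        "Cardiology": {"condition_count": 8, "medication_count": 10, "lab_test_count": 6},
    },
    "count": 24,
}


def area_counts(payload: dict) -> Dict[str, Tuple[int, int, int]]:
    """Map each area name to (conditions, medications, lab tests) counts."""
    return {
        name: (info["condition_count"], info["medication_count"], info["lab_test_count"])
        for name, info in payload["therapeutic_areas"].items()
    }


def resolve_area(payload: dict, requested: str) -> str:
    """Return the area name as spelled in the response, ignoring case and padding."""
    wanted = requested.strip().casefold()
    for name in payload["therapeutic_areas"]:
        if name.casefold() == wanted:
            return name
    raise KeyError(f"Unknown therapeutic area: {requested!r}")
```

Resolving the name before calling the generate endpoint avoids sending a value the API does not list.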
// Generate a DM domain dataset
fetch('/api/generate', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    dataset_type: 'SDTM',
    domain: 'DM',
    subjects: 50,
    arms: 2,
    format: 'csv', // Note: 'json' format temporarily unavailable
    therapeutic_area: 'Oncology'
  })
})
  .then(response => response.json())
  .then(data => console.log(data))
  .catch(error => console.error('Error:', error));
import requests

BASE_URL = 'https://your-app-url'

# Generate a DM domain dataset and ask for a download URL
response = requests.post(
    BASE_URL + '/api/generate',
    # json= serializes the body and sets the Content-Type header
    json={
        'dataset_type': 'SDTM',
        'domain': 'DM',
        'subjects': 50,
        'arms': 2,
        'format': 'csv',
        'therapeutic_area': 'Neurology',
        'download_file': True,
    },
)
response.raise_for_status()

# Get the download URL
result = response.json()
download_url = result['download_url']

# Download the file (generated files are deleted after 1 hour)
file_response = requests.get(BASE_URL + download_url)
file_response.raise_for_status()
with open('dm_data.csv', 'wb') as f:
    f.write(file_response.content)
library(httr)

# Alternative: generate and download a CSV file directly in a single step,
# without a separate download request
response <- POST(
  url = "https://your-app-url/api/generate",
  body = list(
    dataset_type = "SDTM",
    domain = "DM",
    subjects = 50,
    arms = 2,
    format = "csv",
    therapeutic_area = "Cardiology",
    direct_download = TRUE
  ),
  encode = "json",
  write_disk("dm_data_direct.csv")
)
The API uses standard HTTP status codes to indicate the outcome of a request.
Error responses include a JSON object with an "error" key containing a detailed error message.
{
  "error": "Missing required parameter: domain"
}
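Since error responses carry their message under the `"error"` key, a client can turn them into exceptions in one place. The following is a minimal sketch: `ApiError` and `unwrap` are hypothetical names, and treating every non-2xx status as an error is an assumption, because the specific status codes the API returns are not listed here.

```python
from typing import Any


class ApiError(Exception):
    """Raised when the API answers with a non-success status code."""

    def __init__(self, status_code: int, message: str) -> None:
        super().__init__(f"HTTP {status_code}: {message}")
        self.status_code = status_code
        self.message = message


def unwrap(status_code: int, body: Any) -> Any:
    """Return the parsed body on success, otherwise raise ApiError.

    Assumption: any status outside 200-299 is treated as an error.
    """
    if 200 <= status_code < 300:
        return body
    message = "Unknown error"
    if isinstance(body, dict):
        message = body.get("error", message)
    raise ApiError(status_code, message)
```

With the `requests` library, this could be called as `unwrap(response.status_code, response.json())`. The status code 400 for a missing parameter is a plausible guess, not something this document confirms.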