Programmatic Access to Synthetic CDISC-Compliant Datasets
The CDISC Dataset Generator API provides programmatic access to synthetic clinical trial data in CDISC formats. You can generate datasets for SDTM, ADaM, and SEND domains and download them as CSV, JSON, or XPT files.
Base URL:
Current Version: v1.1
Currently, the API is available without authentication for development and testing purposes.
POST /api/get-available-domains
Returns available domains for a specific therapeutic area (used for EDC raw dataset generation).
{
"therapeutic_area": "Oncology" // Required: Therapeutic area name
}
{
"success": true,
"domains": ["dm", "ae", "lb", "vs", "cm", "ex", "rawlb1", "rawlb2", "rawtu1", ...]
}
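As a minimal sketch of how a client might call this endpoint, the Python below builds the POST request and parses the response body shown above. The base URL is left blank in this document, so `http://localhost:5000` is borrowed from the curl examples as a placeholder, and the helper names `domains_request` and `parse_domains` are hypothetical.

```python
import json
from urllib.request import Request

# Placeholder: the documented base URL is blank; this value comes from the curl examples.
BASE_URL = "http://localhost:5000"


def domains_request(therapeutic_area: str, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) the POST request for /api/get-available-domains."""
    body = json.dumps({"therapeutic_area": therapeutic_area}).encode("utf-8")
    return Request(
        f"{base_url}/api/get-available-domains",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_domains(response_text: str) -> list[str]:
    """Return the domain list, or raise if the body does not report success."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError(f"domain lookup failed: {payload}")
    return payload["domains"]
```

To send the request, pass the result of `domains_request("Oncology")` to `urllib.request.urlopen` and hand the decoded body to `parse_domains`.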
POST /api/generate-sdtm
Generates a synthetic SDTM dataset based on the provided parameters.
{
"domain": "DM", // Required: Domain code (DM, AE, LB, VS, etc.)
"numSubjects": 100, // Optional: Number of subjects (default: 50)
"therapeuticArea": "Oncology", // Optional: Therapeutic area (default: "Oncology")
"format": "csv" // Optional: Output format (csv, json, xpt) (default: csv)
}
curl -X POST "http://localhost:5000/api/generate-sdtm" \
-H "Content-Type: application/json" \
-d '{"domain": "DM", "numSubjects": 50, "therapeuticArea": "Cardiology", "format": "csv"}'
{
"success": true,
"filename": "sdtm_dm_20250821_101032.csv",
"download_url": "/download/sdtm_dm_20250821_101032.csv",
"domain": "DM",
"num_subjects": 50,
"therapeutic_area": "Cardiology",
"format": "csv"
}
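The request parameters and their defaults can be captured in a small payload builder. This is a sketch only: the function name `sdtm_payload` is hypothetical, and the lower bound on `numSubjects` is an assumption, since the documentation does not state valid ranges.

```python
ALLOWED_FORMATS = {"csv", "json", "xpt"}  # formats listed in the parameter table


def sdtm_payload(domain, num_subjects=50, therapeutic_area="Oncology", fmt="csv"):
    """Build the JSON body for POST /api/generate-sdtm using the documented defaults."""
    if not domain:
        raise ValueError("domain is required")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    if num_subjects < 1:  # assumption: the documentation gives no explicit bounds
        raise ValueError("numSubjects must be at least 1")
    return {
        "domain": domain,
        "numSubjects": num_subjects,
        "therapeuticArea": therapeutic_area,
        "format": fmt,
    }
```

Validating on the client side catches a missing domain or an unsupported format before any request is sent.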
POST /api/generate-adam
Generates a synthetic ADaM dataset based on the provided parameters. Note: Only ADSL contains demographic variables (AGE, SEX, RACE) per CDISC standards.
{
"domain": "ADSL", // Required: Domain code (ADSL, ADVS, ADLB, ADAE, etc.)
"numSubjects": 100, // Optional: Number of subjects (default: 50)
"therapeuticArea": "Oncology", // Optional: Therapeutic area (default: "Oncology")
"format": "json" // Optional: Output format (csv, json, xpt) (default: csv)
}
curl -X POST "http://localhost:5000/api/generate-adam" \
-H "Content-Type: application/json" \
-d '{"domain": "ADSL", "numSubjects": 100, "therapeuticArea": "Oncology", "format": "csv"}'
{
"success": true,
"filename": "adam_adsl_20250821_101025.csv",
"download_url": "/download/adam_adsl_20250821_101025.csv",
"domain": "ADSL",
"num_subjects": 100,
"therapeutic_area": "Oncology",
"format": "csv"
}
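The note about demographic variables can be checked on a downloaded CSV file. The sketch below reads only the header row and reports which of AGE, SEX, and RACE are present. The two sample strings are made-up illustrations, not actual API output, and the helper name `demographic_columns` is hypothetical.

```python
import csv
import io

DEMOGRAPHIC_VARS = {"AGE", "SEX", "RACE"}


def demographic_columns(csv_text: str) -> set[str]:
    """Return which of AGE, SEX, RACE appear in the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return DEMOGRAPHIC_VARS & {name.strip().upper() for name in header}


# Illustrative content only; real generated files will contain more columns.
adsl_sample = "USUBJID,AGE,SEX,RACE,TRT01P\nSUBJ-001,54,F,WHITE,Placebo\n"
advs_sample = "USUBJID,PARAMCD,AVAL\nSUBJ-001,SYSBP,120\n"

print(demographic_columns(adsl_sample))  # all three demographic variables
print(demographic_columns(advs_sample))  # empty set
```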
POST /api/generate-send
Generates a synthetic SEND dataset for nonclinical studies based on the provided parameters.
{
"domain": "DM", // Required: Domain code (DM, LB, MI, MA, etc.)
"numSubjects": 50, // Optional: Number of animals (default: 50)
"therapeuticArea": "Toxicology", // Optional: Study type (default: "Oncology")
"format": "xpt" // Optional: Output format (csv, json, xpt) (default: csv)
}
curl -X POST "http://localhost:5000/api/generate-send" \
-H "Content-Type: application/json" \
-d '{"domain": "LB", "numSubjects": 40, "therapeuticArea": "Toxicology", "format": "json"}'
{
"success": true,
"filename": "send_lb_20250821_101033.json",
"download_url": "/download/send_lb_20250821_101033.json",
"domain": "LB",
"num_subjects": 40,
"therapeutic_area": "Toxicology",
"format": "json"
}
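Since the three generate endpoints share the same request and response shape, a client could select the path from the standard name and read `download_url` from the response. The following is a sketch under that assumption; `generate_path` and `download_url_from` are hypothetical helper names.

```python
GENERATE_PATHS = {
    "SDTM": "/api/generate-sdtm",
    "ADAM": "/api/generate-adam",
    "SEND": "/api/generate-send",
}


def generate_path(standard: str) -> str:
    """Map a standard name (case-insensitive) to its generate endpoint path."""
    key = standard.strip().upper()
    if key not in GENERATE_PATHS:
        raise ValueError(f"unknown standard: {standard!r}")
    return GENERATE_PATHS[key]


def download_url_from(response_body: dict) -> str:
    """Extract download_url from a generate response, failing if success is not true."""
    if not response_body.get("success"):
        raise RuntimeError(f"generation failed: {response_body}")
    return response_body["download_url"]
```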
GET /api/download/{file_id}
Downloads a previously generated file by its ID. Files are automatically deleted after 1 hour. Note: This endpoint is not needed if you use the direct_download parameter with the generate endpoint.
file_id (required): File ID returned from the generate endpoint.
Returns the requested file with the appropriate MIME type for download.
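Because files are deleted after 1 hour, a client may want to estimate whether a file is still available before requesting it. The sketch below assumes the `YYYYMMDD_HHMMSS` stamp in the example filenames is the generation time and that the caller's clock uses the same time zone as the server. Neither is confirmed by this documentation, and the helper names are hypothetical.

```python
import re
from datetime import datetime, timedelta

FILE_LIFETIME = timedelta(hours=1)  # "Files are automatically deleted after 1 hour"
_STAMP = re.compile(r"_(\d{8}_\d{6})\.[A-Za-z0-9]+$")


def generated_at(filename: str) -> datetime:
    """Parse the timestamp embedded in a name like adam_adsl_20250821_101025.csv."""
    match = _STAMP.search(filename)
    if match is None:
        raise ValueError(f"no timestamp found in {filename!r}")
    return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")


def probably_expired(filename: str, now: datetime) -> bool:
    """Estimate whether the file has passed its one-hour lifetime."""
    return now - generated_at(filename) >= FILE_LIFETIME
```

This is only an estimate; a failed download request remains the authoritative signal that a file is gone.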
GET /api/therapeutic-areas
Returns all available therapeutic areas that can be used to customize dataset generation.
{
"therapeutic_areas": {
"Neurology": {
"common_conditions": ["Alzheimer's Disease", "Parkinson's Disease", "Multiple Sclerosis"],
"common_medications": ["Levodopa", "Carbidopa", "Memantine"],
"common_lab_tests": ["Cerebrospinal Fluid Analysis", "Acetylcholine Receptor Antibody", "Oligoclonal Bands"],
"condition_count": 10,
"medication_count": 10,
"lab_test_count": 5
},
"Cardiology": {
"common_conditions": ["Coronary Artery Disease", "Heart Failure", "Arrhythmias"],
"common_medications": ["Atorvastatin", "Metoprolol", "Lisinopril"],
"common_lab_tests": ["Lipid Panel", "Cardiac Enzymes", "B-type Natriuretic Peptide"],
"condition_count": 8,
"medication_count": 10,
"lab_test_count": 6
},
...
},
"count": 24
}
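One way to work with this response is to reduce it to a per-area table of counts. The sketch below does that for a trimmed copy of the example above (two of the areas, counts only); `summarize_areas` is a hypothetical helper name.

```python
def summarize_areas(body: dict) -> dict[str, dict[str, int]]:
    """Reduce a /api/therapeutic-areas response to per-area counts, sorted by name."""
    summary = {}
    for name, details in sorted(body["therapeutic_areas"].items()):
        summary[name] = {
            "conditions": details["condition_count"],
            "medications": details["medication_count"],
            "lab_tests": details["lab_test_count"],
        }
    return summary


# Trimmed from the example response: only two areas are included here.
sample = {
    "therapeutic_areas": {
        "Neurology": {"condition_count": 10, "medication_count": 10, "lab_test_count": 5},
        "Cardiology": {"condition_count": 8, "medication_count": 10, "lab_test_count": 6},
    },
    "count": 24,
}

for area, counts in summarize_areas(sample).items():
    print(area, counts)
```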
// Generate a DM domain dataset
fetch('/api/generate', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    dataset_type: 'SDTM',
    domain: 'DM',
    subjects: 50,
    arms: 2,
    format: 'csv', // Note: 'json' format temporarily unavailable
    therapeutic_area: 'Oncology'
  })
})
  .then(response => response.json())
  .then(data => console.log(data))
  .catch(error => console.error('Error:', error));
import requests

BASE_URL = 'https://your-app-url'

# Generate a DM domain dataset
response = requests.post(
    BASE_URL + '/api/generate',
    # json= serializes the body and sets the Content-Type header
    json={
        'dataset_type': 'SDTM',
        'domain': 'DM',
        'subjects': 50,
        'arms': 2,
        'format': 'csv',
        'therapeutic_area': 'Neurology',
        'download_file': True
    }
)
response.raise_for_status()  # Stop on a 4xx/5xx response
# Get the download URL
result = response.json()
download_url = result['download_url']
# Download the file
file_response = requests.get(BASE_URL + download_url)
file_response.raise_for_status()
with open('dm_data.csv', 'wb') as f:
    f.write(file_response.content)
library(httr)

# Alternative: Generate and download a CSV file directly in a single step
# This will download the file without needing a separate download step
response <- POST(
  url = "https://your-app-url/api/generate",
  body = list(
    dataset_type = "SDTM",
    domain = "DM",
    subjects = 50,
    arms = 2,
    format = "csv",
    therapeutic_area = "Cardiology",
    direct_download = TRUE
  ),
  encode = "json",
  write_disk("dm_data_direct.csv")
)
The API uses standard HTTP status codes to indicate the outcome of a request.
Error responses include a JSON object with an "error" key containing a detailed error message.
{
"error": "Missing required parameter: domain"
}
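A client can turn this error shape into an exception. The sketch below treats any status of 400 or above, or any body carrying an "error" key, as a failure. The specific status codes the API returns are not listed here, so the threshold is an assumption, and `ApiError` and `check_response` are hypothetical names.

```python
import json


class ApiError(Exception):
    """Raised when the API reports a failure."""

    def __init__(self, status: int, message: str):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message


def check_response(status: int, body_text: str):
    """Return the decoded JSON body, or raise ApiError on an error response."""
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        payload = {}
    error = payload.get("error") if isinstance(payload, dict) else None
    # Assumption: 4xx/5xx statuses signal failure, as is conventional for HTTP.
    if status >= 400 or error is not None:
        raise ApiError(status, error or "unknown error")
    return payload
```

With the `requests` library, this could be called as `check_response(response.status_code, response.text)`.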