CDISC Dataset Generator API

Programmatic Access to Synthetic CDISC-Compliant Datasets

API Overview

The CDISC Dataset Generator API provides programmatic access to synthetic clinical trial data in CDISC formats. You can generate datasets for SDTM, ADaM, and SEND domains and download them as CSV, JSON, or XPT files.

Base URL:

Current Version: v1.1

Authentication

Currently, the API is available without authentication for development and testing purposes.

Endpoints

Get Available Domains for Therapeutic Area

POST /api/get-available-domains

Returns available domains for a specific therapeutic area (used for EDC raw dataset generation).

Request Body:

{
  "therapeutic_area": "Oncology"  // Required: Therapeutic area name
}

Example Response:

{
  "success": true,
  "domains": ["dm", "ae", "lb", "vs", "cm", "ex", "rawlb1", "rawlb2", "rawtu1", ...]
}
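
As a rough illustration of how a client might call this endpoint and use the result, the sketch below builds the POST request and splits the returned domain codes into standard and raw groups. The base URL is taken from the curl examples elsewhere in this document, the helper names are hypothetical, and the rule that raw EDC domains start with "raw" is an assumption inferred from the example response.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: host used in the curl examples


def build_domains_request(therapeutic_area, base_url=BASE_URL):
    """Build (but do not send) the POST request for /api/get-available-domains."""
    body = json.dumps({"therapeutic_area": therapeutic_area}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/get-available-domains",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def split_domains(response):
    """Separate domain codes into (standard, raw) lists.

    Assumption: raw EDC domains are the codes that start with "raw".
    """
    if not response.get("success"):
        raise ValueError("request did not succeed")
    domains = response.get("domains", [])
    standard = [code for code in domains if not code.startswith("raw")]
    raw = [code for code in domains if code.startswith("raw")]
    return standard, raw
```

With a server running, the request can be sent with urllib.request.urlopen and the decoded JSON body passed to split_domains.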

Generate SDTM Dataset

POST /api/generate-sdtm

Generates a synthetic SDTM dataset based on the provided parameters.

Request Body:

{
  "domain": "DM",                    // Required: Domain code (DM, AE, LB, VS, etc.)
  "numSubjects": 100,               // Optional: Number of subjects (default: 50)
  "therapeuticArea": "Oncology",    // Optional: Therapeutic area (default: "Oncology")
  "format": "csv"                   // Optional: Output format (csv, json, xpt) (default: csv)
}

Example Request:

curl -X POST "http://localhost:5000/api/generate-sdtm" \
  -H "Content-Type: application/json" \
  -d '{"domain": "DM", "numSubjects": 50, "therapeuticArea": "Cardiology", "format": "csv"}'

Example Response:

{
  "success": true,
  "filename": "sdtm_dm_20250821_101032.csv",
  "download_url": "/download/sdtm_dm_20250821_101032.csv",
  "domain": "DM",
  "num_subjects": 50,
  "therapeutic_area": "Cardiology",
  "format": "csv"
}
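
The request body described above can be assembled and checked on the client before it is sent. The following is a minimal sketch, not part of the API itself: the function name is hypothetical, the defaults mirror the ones documented for this endpoint, and rejecting a subject count below 1 is an assumption about what the server would accept.

```python
ALLOWED_FORMATS = ("csv", "json", "xpt")  # formats listed in the request body


def build_sdtm_payload(domain, num_subjects=50, therapeutic_area="Oncology", fmt="csv"):
    """Return the JSON body for POST /api/generate-sdtm as a dict."""
    if not domain:
        raise ValueError("domain is required")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}, got {fmt!r}")
    if num_subjects < 1:
        # Assumption: the server expects at least one subject.
        raise ValueError("numSubjects must be a positive integer")
    return {
        "domain": domain,
        "numSubjects": num_subjects,
        "therapeuticArea": therapeutic_area,
        "format": fmt,
    }
```

Calling build_sdtm_payload("DM", therapeutic_area="Cardiology") yields the same body as the curl example, with numSubjects and format left at their defaults.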

Generate ADaM Dataset

POST /api/generate-adam

Generates a synthetic ADaM dataset based on the provided parameters. Note: Only ADSL contains demographic variables (AGE, SEX, RACE) per CDISC standards.

Request Body:

{
  "domain": "ADSL",                 // Required: Domain code (ADSL, ADVS, ADLB, ADAE, etc.)
  "numSubjects": 100,               // Optional: Number of subjects (default: 50)
  "therapeuticArea": "Oncology",    // Optional: Therapeutic area (default: "Oncology")
  "format": "json"                  // Optional: Output format (csv, json, xpt) (default: csv)
}

Example Request:

curl -X POST "http://localhost:5000/api/generate-adam" \
  -H "Content-Type: application/json" \
  -d '{"domain": "ADSL", "numSubjects": 100, "therapeuticArea": "Oncology", "format": "csv"}'

Example Response:

{
  "success": true,
  "filename": "adam_adsl_20250821_101025.csv",
  "download_url": "/download/adam_adsl_20250821_101025.csv",
  "domain": "ADSL",
  "num_subjects": 100,
  "therapeutic_area": "Oncology",
  "format": "csv"
}
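
The note about demographic variables can be checked against a downloaded file. The sketch below reads the header row of a CSV export and reports which of AGE, SEX, and RACE are present. It assumes the CSV output starts with a header row of variable names, and the function name is hypothetical.

```python
import csv
import io

DEMOGRAPHIC_VARIABLES = ("AGE", "SEX", "RACE")


def demographic_columns(csv_text):
    """Return the demographic variables found in the CSV header row.

    Assumption: the first row of the export holds the variable names.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return [name for name in DEMOGRAPHIC_VARIABLES if name in header]
```

For an ADSL export the function would be expected to return all three names, and an empty list for datasets such as ADVS or ADLB.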

Generate SEND Dataset

POST /api/generate-send

Generates a synthetic SEND dataset for nonclinical studies based on the provided parameters.

Request Body:

{
  "domain": "DM",                   // Required: Domain code (DM, LB, MI, MA, etc.)
  "numSubjects": 50,                // Optional: Number of animals (default: 50)
  "therapeuticArea": "Toxicology",  // Optional: Study type (default: "Oncology")
  "format": "xpt"                   // Optional: Output format (csv, json, xpt) (default: csv)
}

Example Request:

curl -X POST "http://localhost:5000/api/generate-send" \
  -H "Content-Type: application/json" \
  -d '{"domain": "LB", "numSubjects": 40, "therapeuticArea": "Toxicology", "format": "json"}'

Example Response:

{
  "success": true,
  "filename": "send_lb_20250821_101033.json",
  "download_url": "/download/send_lb_20250821_101033.json",
  "domain": "LB",
  "num_subjects": 40,
  "therapeutic_area": "Toxicology",
  "format": "json"
}

Download Generated File

GET /api/download/{file_id}

Downloads a previously generated file by its ID. Files are automatically deleted after 1 hour. Note: This endpoint is not needed if you use the direct_download parameter with the generate endpoint.

Path Parameters:

  • file_id (required): File ID returned from the generate endpoint

Response:

Returns the requested file with the appropriate MIME type for download.
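
One way to fetch a generated file is to resolve the relative download_url from a generate response against the server address and stream the result to disk. This is a sketch under assumptions: the base URL comes from the curl examples, the helper names are hypothetical, and it follows the download_url value as returned rather than constructing a path itself.

```python
import shutil
import urllib.request
from urllib.parse import urljoin

BASE_URL = "http://localhost:5000"  # assumption: host used in the curl examples


def resolve_download_url(download_url, base_url=BASE_URL):
    """Turn the relative download_url from a generate response into an absolute URL."""
    return urljoin(base_url, download_url)


def save_generated_file(download_url, destination, base_url=BASE_URL):
    """Stream the generated file to disk (requires a running server)."""
    url = resolve_download_url(download_url, base_url)
    with urllib.request.urlopen(url) as reply:
        with open(destination, "wb") as out:
            shutil.copyfileobj(reply, out)
    return destination
```

Because files are deleted after 1 hour, a late call to save_generated_file can fail with urllib.error.HTTPError, so callers may want to catch that and regenerate the dataset.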

Get Therapeutic Areas

GET /api/therapeutic-areas

Returns all available therapeutic areas that can be used to customize dataset generation.

Example Response:

{
  "therapeutic_areas": {
    "Neurology": {
      "common_conditions": ["Alzheimer's Disease", "Parkinson's Disease", "Multiple Sclerosis"],
      "common_medications": ["Levodopa", "Carbidopa", "Memantine"],
      "common_lab_tests": ["Cerebrospinal Fluid Analysis", "Acetylcholine Receptor Antibody", "Oligoclonal Bands"],
      "condition_count": 10,
      "medication_count": 10,
      "lab_test_count": 5
    },
    "Cardiology": {
      "common_conditions": ["Coronary Artery Disease", "Heart Failure", "Arrhythmias"],
      "common_medications": ["Atorvastatin", "Metoprolol", "Lisinopril"],
      "common_lab_tests": ["Lipid Panel", "Cardiac Enzymes", "B-type Natriuretic Peptide"],
      "condition_count": 8,
      "medication_count": 10,
      "lab_test_count": 6
    },
    ...
  },
  "count": 24
}
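
The response above nests per-area details under "therapeutic_areas". A small helper such as the following sketch can flatten it into rows for display or for picking a therapeuticArea value. The function name is hypothetical and the field names are taken from the example response.

```python
def summarize_areas(payload):
    """Return (name, conditions, medications, lab_tests) tuples sorted by area name."""
    areas = payload["therapeutic_areas"]
    rows = [
        (
            name,
            info["condition_count"],
            info["medication_count"],
            info["lab_test_count"],
        )
        for name, info in areas.items()
    ]
    return sorted(rows, key=lambda row: row[0])
```

Sorting by name keeps the output stable regardless of the order in which the server lists the areas.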

Code Examples

JavaScript / Fetch API

// Generate a DM domain dataset
fetch('/api/generate', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    dataset_type: 'SDTM',
    domain: 'DM',
    subjects: 50,
    arms: 2,
    format: 'csv',  // Note: 'json' format temporarily unavailable
    therapeutic_area: 'Oncology'
  })
})
  .then(response => {
    // fetch() resolves even on 4xx/5xx responses, so check the status explicitly
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    return response.json();
  })
  .then(data => console.log(data))
  .catch(error => console.error('Error:', error));

Python / Requests

import requests

# Generate a DM domain dataset
response = requests.post(
    'https://your-app-url/api/generate',
    # json= serializes the body and sets the Content-Type header
    json={
        'dataset_type': 'SDTM',
        'domain': 'DM',
        'subjects': 50,
        'arms': 2,
        'format': 'csv',
        'therapeutic_area': 'Neurology',
        'download_file': True
    }
)
response.raise_for_status()  # Stop here on a 4xx/5xx response

# Get the download URL
result = response.json()
download_url = result['download_url']

# Download the file
file_response = requests.get('https://your-app-url' + download_url)
file_response.raise_for_status()
with open('dm_data.csv', 'wb') as f:
    f.write(file_response.content)

R / httr

library(httr)

# Alternative: Generate and download a CSV file directly in a single step
# This will download the file without needing a separate download step
response <- POST(
  url = "https://your-app-url/api/generate",
  body = list(
    dataset_type = "SDTM",
    domain = "DM",
    subjects = 50,
    arms = 2,
    format = "csv",
    therapeutic_area = "Cardiology",
    direct_download = TRUE
  ),
  encode = "json",
  write_disk("dm_data_direct.csv")
)

Error Responses

The API uses standard HTTP status codes to indicate the outcome of a request:

  • 200 OK: The request was successful.
  • 400 Bad Request: The request was invalid or missing required parameters.
  • 404 Not Found: The requested resource was not found.
  • 500 Internal Server Error: An error occurred on the server.

Error responses include a JSON object with an "error" key containing a detailed error message.

Example Error Response:

{
  "error": "Missing required parameter: domain"
}
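
A client can turn these error responses into exceptions so that failures are not silently ignored. The sketch below is one possible approach: ApiError and parse_response are hypothetical names, and the only thing assumed about the server is the "error" key described above.

```python
import json


class ApiError(Exception):
    """Raised when the API answers with a non-200 status code."""

    def __init__(self, status, message):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message


def parse_response(status, body_text):
    """Return the decoded JSON body on 200, otherwise raise ApiError.

    Intended for the JSON endpoints; a body that is not valid JSON
    is treated as an empty object.
    """
    try:
        data = json.loads(body_text) if body_text else {}
    except json.JSONDecodeError:
        data = {}
    if status == 200:
        return data
    message = data.get("error", "unknown error") if isinstance(data, dict) else "unknown error"
    raise ApiError(status, message)
```

Callers can then catch ApiError and inspect its status and message attributes, for example to report a missing required parameter from a 400 response.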