Protegrity Synthetic Data Architecture

This section describes the communication between Protegrity Synthetic Data, the Dask Scheduler, and the Dask Workers.

An overview of the communication is shown in the following figure.

Synthetic Data Components

The Synthetic Data system includes the following core components:

Key Pods and Services

Synthetic Data App Pod
- Orchestrates Synthetic Data generation.
MLFlow Pod
- Captures model training and evaluation.
- Hosted in containers for scalability.
MinIO Pod
- Stores models, model artifacts, and generated reports.
- Used by both MLFlow and Synthetic Data App pods.
SQL Database Server Pod
- Provides storage for MLFlow experiments metadata.

Synthetic Data can be generated using:

These interfaces allow developers and data scientists to interact with the system programmatically or visually.

Users access Protegrity Synthetic Data over HTTP on the default port 8095 and access the other services on the following ports:

Port	Communication Path
5000	MLFlow pod
5432	SQL Database Server
8095	Protegrity Synthetic Data Service
9000	MinIO
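
To confirm that the ports listed above are reachable from a client machine, a simple TCP connectivity check can help. The following is a minimal sketch, not part of the Protegrity product: the names `SERVICE_PORTS`, `is_port_open`, and `check_services` are hypothetical, and the host `localhost` is an assumption that should be replaced with the address where the services are actually exposed.

```python
import socket

# Ports taken from the table above. The dictionary name is illustrative.
SERVICE_PORTS = {
    "MLFlow pod": 5000,
    "SQL Database Server": 5432,
    "Protegrity Synthetic Data Service": 8095,
    "MinIO": 9000,
}


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_services(host: str) -> dict:
    """Check every port in SERVICE_PORTS on the given host."""
    return {name: is_port_open(host, port) for name, port in SERVICE_PORTS.items()}


if __name__ == "__main__":
    # "localhost" is an assumed host; substitute the real service address.
    for name, reachable in check_services("localhost").items():
        status = "reachable" if reachable else "not reachable"
        print(f"{name}: {status}")
```

This check only verifies that a TCP connection can be opened. It does not confirm that the service behind the port is healthy or that it is the expected service.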

Like the Protegrity Anonymization API, the entire Synthetic Data API can be hosted using any cloud-provided Kubernetes service, including:

This flexibility allows organizations to scale Synthetic Data generation securely across environments.

Last modified : November 07, 2025