Protegrity Anonymization Architecture

Communication between Protegrity Anonymization, the Dask Scheduler, and Dask Workers is detailed in this section.

An overview of the communication is shown in the following figure.

Protegrity Anonymization leverages several pods on Kubernetes. The first pod contains the Dask Scheduler. This pod connects to the Dask Worker pod over TLS. If Protegrity Anonymization requires more processing to work with the dataset, then based on the configuration, additional Dask Worker pods can be added. Protegrity Anonymization Web Server performs the processing using an internal Database Server for holding the data securely. The Protegrity Anonymization request is received by the Nginx-Ingress component. Ingress forwards the request to the Anon-App. The Anon-App processes the request and submits the tasks to the Dask Cluster. The Dask Scheduler schedules task on the Dask Workers The Anon-app stores the metadata about the job in the Anon-DB container. Next, the Dask Workers read, write, and process the data that is stored in the Anon-Storage, the request stream, or the Cloud storage. The Anon-Storage uses S3 bucket for storing data. The communication between the Dask Scheduler and the Dask Workers is handled by the Dask Scheduler. The Dask workers run on random ports.

The user accesses Protegrity Anonymization using HTTPS over port 443. The user requests are directed to an Ingress Controller, and the controller in turn communicates with the required pods using the following ports:

8090: Ingress controller and the Protegrity Anonymization API Web Service
8786: Ingress controller and the Dask Scheduler
8100: Ingress controller and S3 bucket

Protegrity Anonymization leverages Kubernetes for data anonymization at scale and it provides instructions and support for deployment and usage on AWS EKS and Microsoft Azure AKS.

Feedback

Was this page helpful?

Last modified : March 24, 2026