Installing Protegrity Synthetic Data

Steps to install Protegrity Synthetic Data

Helm Deployment

This project deploys the Protegrity Synthetic Data stack on Amazon EKS as a Protegrity AI Team Edition feature. It uses Helm to deploy the Kubernetes workloads.

Deployment Steps

1. Prepare Configuration

  1. Create a namespace for the deployment.

    kubectl create namespace syntheticdata-ns
    
  2. Create a Kubernetes secret from static IAM access keys that have access to the S3 bucket.

    kubectl -n syntheticdata-ns create secret generic synthobjectstore-creds \
    --from-literal=access_key=YOUR_STATIC_ACCESS_KEY_ID \
    --from-literal=secret_key=YOUR_STATIC_SECRET_ACCESS_KEY
    

    Note: Use static access keys, not temporary session credentials, when creating this secret. These keys allow the Synthetic Data service to access the configured S3 bucket.

  3. Create an override_values.yaml file with your configuration details. For example:

     objectstorage:
       endpoint: "s3.us-east-1.amazonaws.com"  # Replace us-east-1 with your region
       bucketName: "<>"  # S3 bucket name for storage (must exist before installation)
     image:
       syndataapi_tag: /synthetic-data/1.0/containers/syntheticdata-service:1.0.1.27
       postgres_tag: /shared/containers/postgres/17:37
     karpenter:
       gpu:
         nodeclass:
           amiId: ami-0f7f4d7faa23356aa   # AMI ID for us-east-1. Update it for your region.
    

    Note:

    • Ensure the S3 bucket is not KMS encrypted. The bucket must use default SSE-S3 encryption or no encryption.
    • Ensure all necessary parameters are set.
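
The two requirements above (static keys and a bucket that is not KMS encrypted) can be checked before deploying. The following is a minimal sketch, not part of the product. The function names are hypothetical, and the commented usage assumes the AWS CLI is installed and configured. It relies on the AWS convention that long-term IAM access key IDs begin with AKIA while temporary session key IDs begin with ASIA.

```shell
# Succeeds when the access key ID looks like a long-term IAM user key.
# Long-term key IDs start with "AKIA"; temporary STS key IDs start with "ASIA".
is_static_access_key() {
  case "$1" in
    AKIA*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Succeeds when the reported default encryption algorithm is acceptable:
# "AES256" (SSE-S3) or an empty value (no encryption reported).
# Any other value, such as "aws:kms", is rejected.
is_supported_bucket_encryption() {
  case "$1" in
    AES256|"") return 0 ;;
    *)         return 1 ;;
  esac
}

# Example usage (BUCKET and ACCESS_KEY_ID are placeholders you set yourself):
#
#   is_static_access_key "$ACCESS_KEY_ID" \
#     || echo "Access key looks temporary; use a static key." >&2
#
#   algo=$(aws s3api get-bucket-encryption --bucket "$BUCKET" \
#     --query 'ServerSideEncryptionConfiguration.Rules[0].ApplyServerSideEncryptionByDefault.SSEAlgorithm' \
#     --output text)
#   is_supported_bucket_encryption "$algo" \
#     || echo "Bucket uses $algo; switch to SSE-S3." >&2
```

The prefix check is only a heuristic on the key ID. It does not confirm that the key is valid or that it can access the bucket.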

2. Deploy

Run the following command to deploy the stack:

    helm install pty-synthetic-data \
    oci://<Container_Registry_Path>/synthetic-data/1.0/helm/syntheticdata-service \
    --version=1.0.1 \
    -n syntheticdata-ns \
    --values override_values.yaml
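
If you script the deployment, the install command can be wrapped as in the sketch below. The helper names are hypothetical, and the registry path is passed as an argument. The `--wait` and `--timeout` flags are standard Helm options that block until the release resources are ready. The 15 minute timeout is an assumption, not a documented value, chosen to leave time for GPU nodes to be provisioned.

```shell
# Builds the OCI chart reference from a registry path (hypothetical helper).
# A single trailing slash on the registry path is removed.
chart_ref() {
  printf 'oci://%s/synthetic-data/1.0/helm/syntheticdata-service\n' "${1%/}"
}

# Installs the chart and waits until the release resources are ready
# or the timeout expires. The timeout value is an assumption.
deploy_synthetic_data() {
  helm install pty-synthetic-data "$(chart_ref "$1")" \
    --version=1.0.1 \
    -n syntheticdata-ns \
    --values override_values.yaml \
    --wait --timeout 15m
}

# Example usage (replace the argument with your container registry path):
#   deploy_synthetic_data "<Container_Registry_Path>"
```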

3. Monitor

  1. Monitor the deployment progress using the following command:

    kubectl get pods -n syntheticdata-ns
    

    Verify that all pods are in the Running state. The following is sample output:

    NAME                                            READY   STATUS    RESTARTS   AGE
    pty-synthetic-data-nvidia-device-plugin-5648s   1/1     Running   0          3d17h
    syn-db-depl-0                                   1/1     Running   0          3d17h
    syn-scheduler-depl-6696687695-fcsvj             1/1     Running   0          3d17h
    syn-worker-depl-6bf8dcf965-5w2j2                1/1     Running   0          3d17h
    syn-worker-depl-6bf8dcf965-zr829                1/1     Running   0          3d17h
    syndata-app-depl-6c8cb85f89-rpf5j               1/1     Running   0          3d17h
    
  2. Verify all the Synthetic Data services are deployed.

    kubectl get svc -n syntheticdata-ns
    

    The following is sample output:

    NAME              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
    syn-dask-svc      ClusterIP   172.20.177.37   <none>        8786/TCP   3d17h
    syn-db-svc        ClusterIP   172.20.208.6    <none>        5432/TCP   3d17h
    syndata-app-svc   ClusterIP   172.20.231.58   <none>        8095/TCP   3d17h
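
The two checks above can also be automated. The following sketch parses the output of the same kubectl commands. The function names are hypothetical, and the expected service names are taken from the sample output above, so adjust them if your deployment differs. The pod check expects the header row to be present and treats any pod that is not Running and fully ready as a failure.

```shell
# Reads `kubectl get pods` output (with header) from stdin.
# Succeeds only if at least one pod is listed and every pod has
# STATUS "Running" with all containers ready (READY n/n).
all_pods_running() {
  awk '
    BEGIN { seen = 0; bad = 0 }
    NR == 1 { next }              # skip the header row
    NF == 0 { next }              # skip blank lines
    {
      seen++
      split($2, ready, "/")       # READY column, for example 1/1
      if ($3 != "Running" || ready[1] != ready[2]) bad++
    }
    END { exit (seen > 0 && bad == 0) ? 0 : 1 }
  '
}

# Reads `kubectl get svc` output from stdin.
# Succeeds only if every expected service name appears in the NAME column.
has_expected_services() {
  svc_output=$(cat)
  for name in syn-dask-svc syn-db-svc syndata-app-svc; do
    printf '%s\n' "$svc_output" \
      | awk -v n="$name" '$1 == n { found = 1 } END { exit found ? 0 : 1 }' \
      || return 1
  done
}

# Example usage:
#   kubectl get pods -n syntheticdata-ns | all_pods_running && echo "pods ready"
#   kubectl get svc -n syntheticdata-ns | has_expected_services && echo "services present"
```

As an alternative to parsing output, `kubectl wait --for=condition=Ready pods --all -n syntheticdata-ns --timeout=600s` blocks until the pods report ready. The timeout shown is an arbitrary example value.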
    

For more information about building the REST API request, refer to Building the Request Using the REST API.


Last modified: April 13, 2026