Data Discovery is currently in Private Preview and is not yet generally available (GA). Do not use it in production environments, because features and functionality may change before the GA release.

Deploying the Application

Deploying the application and components.

This section explains the step-by-step deployment of Data Discovery on Amazon EKS. Each component builds on the previous one, ensuring a reliable and production-ready environment.

The deployment is separated into two main phases:

  • Phase 1: Infrastructure (Terraform) - Provisions the EKS cluster and underlying AWS resources
  • Phase 2: Applications (Helm) - Deploys Kubernetes components and the Data Discovery application

After completing Phase 1 (Terraform), or if you are using an existing EKS cluster, configure the kubectl context to connect to the cluster:

   aws eks update-kubeconfig --region <region> --name <cluster-name>
   # Replace `<region>` with your AWS region and `<cluster-name>` with your EKS cluster name.
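To confirm that the context is set correctly, the command above can be wrapped in a small helper that also checks the connection. This is a minimal sketch, not part of the product: the function name connect_cluster is hypothetical, and it assumes the aws and kubectl CLIs are installed and authenticated.

```shell
# Sketch (hypothetical helper): update kubeconfig, then confirm the cluster responds.
connect_cluster() {
  local region="${1:-}" cluster="${2:-}"
  if [ -z "$region" ] || [ -z "$cluster" ]; then
    echo "usage: connect_cluster <region> <cluster-name>" >&2
    return 1
  fi
  # Write or update the kubeconfig entry for the cluster.
  aws eks update-kubeconfig --region "$region" --name "$cluster" || return 1
  # Show the context kubectl now uses, then list the nodes it can see.
  kubectl config current-context || return 1
  kubectl get nodes
}
```

Call it as connect_cluster <region> <cluster-name>. An empty node list is not necessarily an error at this stage if nodes are provisioned later in the deployment.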

EKS Control Plane Provisioning (Terraform)

Deploy the required infrastructure: the Terraform setup for the EKS cluster, IAM roles, and VPC.
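The usual Terraform workflow for this phase is init, plan, then apply. The sketch below shows that sequence only; the function name provision_infra and the directory argument are hypothetical, and the actual module layout and variables are defined by the Data Discovery Terraform configuration, which is not shown here.

```shell
# Sketch (hypothetical helper): standard Terraform init/plan/apply sequence.
# $1 is the directory holding the Terraform configuration (defaults to ".").
provision_infra() {
  local dir="${1:-.}"
  # Download providers and modules.
  terraform -chdir="$dir" init || return 1
  # Save the plan so that apply executes exactly what was reviewed.
  terraform -chdir="$dir" plan -out=tfplan || return 1
  # Applying a saved plan does not prompt for confirmation.
  terraform -chdir="$dir" apply tfplan
}
```

Saving the plan to a file keeps the review step and the apply step consistent; review the plan output before applying it.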

Metrics Server

Deploy a Metrics Server for autoscaling capabilities.
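One way to check that the Metrics Server is serving data is to query the status of the metrics API it registers. The following is a sketch under the assumption that the standard Kubernetes Metrics Server is installed, which registers the v1beta1.metrics.k8s.io APIService; the function name metrics_api_available is hypothetical.

```shell
# Sketch (hypothetical helper): succeed only if the metrics API reports Available=True.
metrics_api_available() {
  local available
  available="$(kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}')" || return 1
  [ "$available" = "True" ]
}
```

Once the function succeeds, kubectl top nodes should return CPU and memory usage, which is the data that autoscaling relies on.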

Karpenter NodePool

Deploy a Karpenter NodePool for EKS to enable automatic node provisioning and scaling for Data Discovery workloads.
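After this step, it can help to confirm that at least one NodePool exists in the cluster. The sketch below assumes a Karpenter release that provides the NodePool resource in the karpenter.sh API group; the function name list_nodepools is hypothetical.

```shell
# Sketch (hypothetical helper): print the Karpenter NodePools, or fail if none exist.
list_nodepools() {
  local pools
  pools="$(kubectl get nodepools.karpenter.sh -o name)" || return 1
  if [ -z "$pools" ]; then
    echo "no Karpenter NodePool found" >&2
    return 1
  fi
  echo "$pools"
}
```

When workloads are scheduled later, kubectl get nodeclaims.karpenter.sh lists the capacity that Karpenter has requested for them.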

Ingress Controller

Deploy an internal-only NGINX ingress controller with a private AWS NLB for secure, TLS-only access to Data Discovery services within your VPC.
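To find the address of the private NLB once the controller is running, the load balancer hostname can be read from the controller's Service. This is a sketch only: the default namespace ingress-nginx and Service name ingress-nginx-controller are assumptions based on the upstream ingress-nginx chart defaults and may differ in your deployment, and the function name ingress_lb_hostname is hypothetical.

```shell
# Sketch (hypothetical helper): print the NLB hostname assigned to the ingress Service.
# $1 = namespace (assumed default: ingress-nginx)
# $2 = Service name (assumed default: ingress-nginx-controller)
ingress_lb_hostname() {
  local namespace="${1:-ingress-nginx}" service="${2:-ingress-nginx-controller}"
  local host
  host="$(kubectl get service "$service" -n "$namespace" \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')" || return 1
  if [ -z "$host" ]; then
    echo "load balancer not provisioned yet" >&2
    return 1
  fi
  echo "$host"
}
```

Because the NLB is private, the returned hostname should resolve to private IP addresses inside the VPC.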

Data Discovery Classification

Deploy the Data Discovery Classification service with Pattern and Context providers for data classification and transformation.

Last modified: August 22, 2025