Snowflake

Protector for Snowflake on GCP.

1: Overview
2: Architecture
3: Installation

3.1: Prerequisites
3.2: Pre-Configuration
3.3: Protect Service Installation
3.4: Snowflake Configuration
3.5: Policy Agent Installation

3.6: Audit Log Forwarder Installation

4: Understanding Snowflake Objects

4.1: External Functions

4.2: Snowflake Masking Policies

5: Performance

5.1: Performance Considerations
5.2: Sample Benchmarks
5.3: Concurrency
5.4: Log Forwarder Performance

6: Audit Logging

7: No Access Behavior
8: Upgrading To The Latest Version
9: Known Limitations
10: Appendices

10.1: Integrating Cloud Protect with PPC (Protegrity Provisioned Cluster)
10.2: Sample Snowflake External Function
10.3: Configuring Regular Expression to Extract Policy Username
10.4: Associating ESA Data Store With Cloud Protect Agent
10.5: Undeliverable Audit Log Recovery

This section describes the high-level architecture of the Protegrity Snowflake Protector on Google Cloud Platform (GCP) and its installation procedures. It focuses on Protegrity-specific aspects and should be read in conjunction with the corresponding Snowflake and Google Cloud documentation.

This guide may also be used with the Protegrity Enterprise Security Administrator Guide, which explains the mechanism for managing the data security policy.

1 - Overview

Solution overview and features.

Solution Overview

Snowflake Protector on Google Cloud is a cloud native, serverless product for fine-grained data protection with Snowflake™, a managed Cloud data warehouse. This enables invocation of the Protegrity data protection cryptographic methods from the Snowflake SQL execution context. The benefits of serverless include rapid auto-scaling, performance, low administrative overhead, and reduced infrastructure costs compared to a server-based solution.

This product provides data protection services invoked by External User Defined Functions (UDFs) within Snowflake. The UDFs act as a client transmitting micro-batches of data to the serverless Protegrity Cloud function. User queries may generate hundreds or thousands of parallel requests to perform security operations. Protegrity’s serverless function is designed to scale and yield reliable query performance under such load.

Snowflake Protector on Google Cloud uses a data security policy maintained by an Enterprise Security Administrator (ESA), similar to other Protegrity products. Using regular SQL database queries or tools such as Tableau™, authorized users can perform de-identification (protect) and re-identification (unprotect) operations on data within the managed Cloud data warehouse. A user’s individual capabilities are subject to privileges and policies defined by the Enterprise Security Administrator.

The following data ingestion patterns are available with your managed Cloud data warehouse:

Data protection at source applications: In this case, sensitive data is already de-identified (protected) across the enterprise wherever it resides, including the managed data warehouse. Protected data can be ingested directly into your managed Cloud data warehouse. Depending on usage patterns, this ensures that your managed data warehouse is not brought into scope for PCI, PII, GDPR, HIPAA, and other compliance policies.
Data protection using the Extract-Transform-Load (ETL) pattern: In this case, sensitive data may be transformed with a Protegrity protector either on-premise or in the Cloud before it is ingested into Snowflake.
Data protection using the Extract-Load-Transform (ELT) pattern: In this case, sensitive data is protected after it lands into the target system typically through a temporary landing table. It uses the native data warehouse’s compute engine with Protegrity to protect incoming data at very high throughput rates. After the data is protected, the intermediate loading tables are dropped as part of the ingestion workflow.

Analytics on Protected Data

Protegrity’s format- and length-preserving tokenization scheme makes it possible to perform analytics directly on protected data. Tokens are join-preserving, so protected data can be joined across datasets. Statistical analytics and machine learning training can often be performed without re-identifying protected data. However, a user or service account with authorized security policy privileges may re-identify subsets of data using the Snowflake Protector on GCP service.

Features

Snowflake Protector on GCP incorporates Protegrity’s patent-pending vaultless tokenization capabilities into cloud-native serverless technology. Combined with an ESA security policy, the protector provides the following features:

Role-based access control (RBAC) to protect and unprotect (re-identify) data depending on the user privileges.
Policy enforcement features of other Protegrity application protectors.

For more information about the available protection options, such as data types, tokenization or encryption types, and length-preserving and non-preserving tokens, refer to Protection Methods Reference.

2 - Architecture

Deployment architecture and connectivity

Deployment Architecture

The Protegrity product should be deployed in the customer’s Cloud account within the same Google Cloud region as the Snowflake cluster. The product incorporates Protegrity’s vaultless tokenization engine within Google Cloud Functions. The encrypted data security policy from an ESA is deployed periodically as a static resource together with Cloud Function binaries. The policy is decrypted in memory at runtime within the Cloud Function. This architecture allows Protegrity to be highly available and scale very quickly without direct dependency on any other Protegrity services.

The product exposes a remote data protection service invoked from external User Defined Functions (UDFs), a native feature of specific Cloud databases. The UDFs can be invoked through direct SQL queries or database views. The external UDF makes parallel API calls to the serverless Protegrity Cloud function via the GCP API Gateway to perform protect and unprotect operations. Each REST request includes a micro-batch of data to process, a secure context header generated by the database that carries the username, and a Protegrity context header that specifies the data element type and the operation to perform. The product applies the ESA security policy, including user authorization, and returns a corresponding response. Security operations performed on sensitive data by the protector can be audited. The product can be configured to send audit logs to the ESA via an optional component called the Log Forwarder.

The security policy is synchronized through another serverless component called the Protegrity Policy Agent. The agent operates on a configurable schedule, fetches the policy from the ESA, performs envelope encryption using Google Key Management Service, and deploys a new version of the Cloud Function with the updated policy. The solution can be configured to provision the static policy automatically, or the final step can be performed on demand by an administrator. There is no downtime for users during this process. Instances provisioned with the function’s previous policy version may continue running (and processing traffic) for several minutes after a deployment has finished.

The following diagram shows the high-level architecture described above.

The following diagram shows a reference architecture for synchronizing the security policy from the ESA to the product.

The Protegrity Policy Agent requires network access from the Cloud to your ESA. Most organizations install the ESA on-premise. Therefore, it is recommended that the Policy Agent is installed into a private subnet with a Cloud VPC using a NAT Gateway to enable this communication through a corporate firewall.

The ESA is a soft appliance that must be pre-installed on a separate server. It is used to create and manage security policies.

For more information about installing the ESA, and creating and managing policies, refer to the Policy Management Section.

Audit Log Forwarding Architecture

By default, audit logs are sent to Cloud Logging. The Protegrity product can also be configured to send audit logs to the ESA. This configuration requires deploying the Log Forwarder component, which is available as part of the Protegrity product deployment bundle. The diagram below shows the additional resources deployed with the Log Forwarder component.

The Log Forwarder component includes a Pub/Sub topic and the audit log forwarder function. Pub/Sub is used to send audit records asynchronously to the forwarder function, where similar audit logs are aggregated before being sent to the ESA. Aggregation rules are described in the Protegrity Log Forwarding guide. When the protector function is configured to send audit logs to the Log Forwarder, audit logs are also aggregated on the protector function before being sent to the Pub/Sub topic. The protector function exposes a configuration setting that controls how long it spends aggregating audit logs; this setting is described in the protector function installation section.

The security of audit logs is ensured by using an HTTPS connection on each link of the communication between the protector function and the ESA. The integrity and authenticity of audit logs are additionally checked by the Log Forwarder, which verifies the signature of each log. The signature is verified when a log arrives from the Pub/Sub topic, before aggregation is applied. If the signature cannot be verified, the log is forwarded as is to the ESA, where additional signature verification can be configured. The Log Forwarder function uses basic authentication and optional certificate verification to authenticate calls to the ESA. Basic authentication credentials are stored securely in Secret Manager.

To learn more about the structure of individual audit log entries and the purpose of audit logs, refer to the Audit Logging section in this document. Installation instructions can be found in the Audit Log Forwarder Installation section.

The audit log forwarding requires network access from the cloud to the ESA. Most organizations install the ESA on-premise. Therefore, it is recommended that the Log Forwarder Function is installed into a private subnet with a Cloud VPC using a NAT Gateway to enable this communication through a corporate firewall.

Snowflake Connectivity

Snowflake communicates to the Snowflake Protector through the Google Cloud API Gateway. The API Gateway is configured to use an OAuth 2.0 implicit grant flow. Snowflake generates a JWT token used for remote authorization with the API Gateway. Snowflake requests are directed to the customer’s Google API Gateway which is configured to verify the JWT token. The following figure illustrates the integration architecture between Snowflake and the Protegrity software.
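When a request is rejected at the gateway, it can help to look at the claims carried by the token. The following Python sketch decodes the payload of a JWT without verifying its signature. It relies only on the standard JWT layout (three base64url segments separated by dots); the sample token, its issuer, and its audience are made-up values for illustration, and actual verification is performed by the API Gateway, not by this code.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the claims (payload) of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("a JWT has three dot-separated segments")
    payload = parts[1]
    # base64url segments omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(data: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Hypothetical token for illustration only; the signature segment is a dummy.
sample_claims = {
    "iss": "snowflake-sa@example.iam.gserviceaccount.com",
    "aud": "example-managed-service",
}
sample_token = ".".join(
    [_b64url({"alg": "RS256", "typ": "JWT"}), _b64url(sample_claims), "c2ln"]
)

claims = decode_jwt_claims(sample_token)
print(claims["iss"], claims["aud"])
```

In a real deployment, the issuer and audience would be expected to correspond to the Snowflake service account and the API Gateway managed service configured during installation.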

Snowflake’s API Integration Object

The Snowflake API Integration Object provides a connection between your Snowflake Google Cloud Account and your Google Cloud account where the Protegrity product is hosted. Creating this connection requires setting up the API Gateway and the appropriate access policies. These steps are provided in the installation.

3 - Installation

Product Installation Guide.

3.1 - Prerequisites

Requirements before installing the protector.

Google Cloud Services

The following table describes the Google Cloud services that may be a part of your Protegrity installation.

Service	Description
Cloud Run Functions	Provides serverless compute for Protegrity protection operations and the ESA integration to fetch policy updates.
API Gateway	Provides the end-point and access control (Required for Snowflake only).
Key Management Service	Provides cryptographic keys for envelope encryption/decryption of the policy.
Secret Manager Service	Stores secrets required during deployment, e.g., ESA credentials.
Cloud Storage Service	Storage location for the encrypted ESA policy package.
Identity and Access Management	Enforces access policies for deployed resources.
Cloud Logging Service	Application and audit logs, performance monitoring, and alerts.
Cloud VPC	Required for securing network access to On-Prem or cloud-based ESA.
Pub/Sub	Provides a messaging service when forwarding audit logs to ESA is enabled.

ESA Version Requirements

The Protector and Log Forwarder functions require a security policy from a compatible ESA version.

The table below shows compatibility between different Protector and ESA versions.

Note

For the latest up-to-date information refer to: Protegrity Compatibility Matrix

Protector Version	ESA 8.x	ESA 9.0	ESA 9.1 & 9.2	ESA 10.0
2.x	No	Yes	*	No
3.0.x & 3.1.x	No	No	Yes	No
3.2.x	No	No	Yes	*
4.0.x	No	No	No	Yes

Legend
Yes	Protector was designed to work with this ESA version.
No	Protector will not work with this ESA version.
*	Backward-compatible policy download is supported:
- Data elements and features that are common between this and previous ESA versions will be downloaded.
- Data elements and features that are new to this ESA version and do not exist in the previous ESA version will not be downloaded.
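As a quick way to check a planned combination, the compatibility table above can be transcribed into a small lookup. The Python sketch below is only a transcription of that table at the time of writing; the function name and the "partial" label for the * entries are illustrative choices, and the Protegrity Compatibility Matrix remains the authoritative source.

```python
# Compatibility matrix transcribed from the table above.
# "yes"     = protector was designed to work with this ESA version
# "no"      = protector will not work with this ESA version
# "partial" = the "*" entry (backward-compatible policy download)
ESA_COLUMNS = ["8.x", "9.0", "9.1 & 9.2", "10.0"]
COMPATIBILITY = {
    "2.x": ["no", "yes", "partial", "no"],
    "3.0.x & 3.1.x": ["no", "no", "yes", "no"],
    "3.2.x": ["no", "no", "yes", "partial"],
    "4.0.x": ["no", "no", "no", "yes"],
}


def compatibility(protector_row: str, esa_column: str) -> str:
    """Look up one cell of the matrix by its row and column labels."""
    return COMPATIBILITY[protector_row][ESA_COLUMNS.index(esa_column)]


print(compatibility("3.2.x", "10.0"))
print(compatibility("4.0.x", "10.0"))
```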

Prerequisites

The following requirements must be completed for the Snowflake implementation.

Requirements	Description
Protegrity distribution and installation scripts	These artifacts are provided by Protegrity
Protegrity ESA 10.0+	The Cloud VPC must be able to obtain network access to the ESA
Google Cloud Account	Recommend creating a new project for Protegrity Serverless
Snowflake cluster (Enterprise Edition recommended)
Terraform CLI v0.14 or higher	Terraform is used to deploy resources to Google Cloud Account

Required Skills and Abilities

Requirements	Description
Google Cloud Account Administrator	Run Terraform (or perform steps manually), create/configure a VPC and IAM permissions.
Protegrity Administrator	The ESA credentials required to extract the policy for the Policy Agent
Snowflake Administrator	Account Admin access required to setup access
Network Administrator	Open firewall to access ESA and evaluate Google Cloud network setup

3.2 - Pre-Configuration

Configuration steps prior to product installation.

Google Cloud Project

Identify or create a new Google Cloud Project where the Protegrity solution will be installed. It is recommended to create a new project. This provides greater security controls and avoids conflicts with other applications that might impact regional account limits. An individual with the Owner role will be required for some of the subsequent installations.

Google Project ID: ___________________

Google Project Number: ___________________

Google Cloud Region: ___________________

Region

Query the Google Cloud region where the Snowflake cluster is running. This is the region in which Protegrity Serverless must be installed.

To determine the Google Cloud region:

Log in to Snowflake.
In the SQL console, run the following query.
```
select current_region();
```
Record the Google Cloud region. For example, GCP_US_CENTRAL1.
Snowflake Google Cloud Region: ___________________
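The value returned by Snowflake is not written the same way as a Google Cloud region name. The Python sketch below converts one into the other, assuming the Snowflake value is the Google Cloud region name in upper case with underscores and a GCP_ prefix, optionally preceded by a region group and a dot. That naming rule is an assumption made for this example; confirm the result against the region lists published by Snowflake and Google Cloud.

```python
def snowflake_region_to_gcp(snowflake_region: str) -> str:
    """Convert e.g. 'GCP_US_CENTRAL1' to 'us-central1' (assumed naming rule)."""
    # Some accounts return "<region_group>.<region>"; keep only the region part.
    region = snowflake_region.strip().split(".")[-1]
    prefix = "GCP_"
    if not region.upper().startswith(prefix):
        raise ValueError(f"not a Google Cloud hosted Snowflake region: {region}")
    return region[len(prefix):].lower().replace("_", "-")


print(snowflake_region_to_gcp("GCP_US_CENTRAL1"))
```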

Key Management Service

The Google Cloud Key Management Service (KMS) gives the Protegrity Serverless solution the ability to encrypt and decrypt the Protegrity Security Policy.

To create KMS Key Ring and Asymmetric Encryption Master Key:

Log in to your Google account and select the project where the Protegrity service will be installed.
Navigate to Security > Key Management.
Select Create key ring.
Specify the key ring name. For example, protegrity-policy-keyring.
Select the Key ring location that corresponds to the region where the Protegrity solution will be installed.
Note
A key’s location impacts the performance of the protect service.
Select Create.
Select CREATE KEY to create the encryption key.
Specify the key name. For example, protegrity-policy-key.
Under Purpose selection, select Asymmetric Decrypt.
Select the Key Algorithm. For example, 3072-bit RSA with OAEP Padding and SHA256 digest.
Select Create.
Once the key is created, a screen opens on the key. If the screen does not appear, click the key name.
Click the ellipsis under Actions next to the key version.
Select Copy Resource Name and record the value below, e.g., projects/{project-id}/locations/{region}/keyRings/{key-ring}/cryptoKeys/{key-name}/cryptoKeyVersions/1
Policy Encryption Key Version Resource Name: ___________________
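Because the resource name is copied by hand and reused later, a quick format check can catch a truncated or mistyped value. The Python sketch below splits a key version resource name into its parts using the layout shown in the example above. The sample project, key ring, and key names are hypothetical, and the check covers only the shape of the string, not whether the key exists.

```python
import re

# Layout taken from the example resource name in the steps above.
KEY_VERSION_PATTERN = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/keyRings/(?P<key_ring>[^/]+)"
    r"/cryptoKeys/(?P<key>[^/]+)"
    r"/cryptoKeyVersions/(?P<version>\d+)$"
)


def parse_key_version(resource_name: str) -> dict:
    """Split a KMS key version resource name into its named components."""
    match = KEY_VERSION_PATTERN.match(resource_name.strip())
    if match is None:
        raise ValueError("not a key version resource name: " + resource_name)
    return match.groupdict()


# Hypothetical values for illustration only.
sample_name = (
    "projects/my-project-id/locations/us-central1"
    "/keyRings/protegrity-policy-keyring"
    "/cryptoKeys/protegrity-policy-key/cryptoKeyVersions/1"
)
parts = parse_key_version(sample_name)
print(parts["location"], parts["version"])
```

Comparing the parsed location with the region recorded earlier is a simple way to confirm that the key ring was created in the intended region.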

Google Cloud Storage

Cloud Storage buckets are required for the Gen 2 Cloud Function sources, the Terraform backend, and the deployment of the Protegrity installation artifacts. It is recommended that you create 3 separate buckets to separate files used for different purposes. If you cannot create 3 separate buckets, you may reuse a bucket for multiple purposes.

Create the buckets:

Run the gcloud command below to enable the Google Storage Transfer API.
```
gcloud services enable storagetransfer.googleapis.com
```
Create the Gen 2 Cloud Function sources bucket. This bucket is not required if you will be deploying to Gen 1 Cloud Functions. The bucket name must match the example below. Replace the <gcp-project-number> and <region> placeholders.
```
gcf-v2-sources-<gcp-project-number>-<region>
```
Use the following gcloud command to obtain the project number.
```
gcloud projects describe <gcp-project-id> --format='value(projectNumber)'
```
Create the deployment bucket or reuse an existing bucket. This bucket is used during the installation process to store the Protegrity installation artifacts.
Deployment Bucket Name:___________________
Create the Terraform backend bucket or reuse an existing bucket. This bucket is used by Terraform to store information about your Cloud Protect installation, and will be used if you upgrade to a later version of Cloud Protect in the future.
Terraform Backend Bucket Name:___________________
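The Gen 2 sources bucket name is built from the project number, not the project ID, which is an easy detail to get wrong. The short Python sketch below assembles the name in the format given above and rejects a non-numeric project number. The function name and the sample values are hypothetical.

```python
def gen2_sources_bucket_name(project_number: str, region: str) -> str:
    """Build the Gen 2 sources bucket name in the format given above."""
    if not project_number.isdigit():
        raise ValueError("expected the numeric project number, not the project ID")
    return f"gcf-v2-sources-{project_number}-{region}"


# Hypothetical project number and region, for illustration only.
print(gen2_sources_bucket_name("123456789012", "us-central1"))
```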

Note

You may delete the deployment bucket after you’ve completed the installation. A deployment bucket is required for upgrades, but it can be recreated at that time. The Terraform backend files must be retained for upgrading your Cloud Protect deployment in the future.

Cloud Functions Service Accounts

Cloud Functions use the service accounts created in this deployment. You can create the service accounts manually or use the Protegrity Terraform installation script to create them. Each service account requires specific permissions, which must be granted through IAM roles. Perform the following steps to create the service accounts and configure the required IAM access. If you use the Terraform scripts, skip these steps.

Agent Function IAM Role

To create Agent Function IAM Role:

Log in to Google Account and select project where Protegrity service will be installed.
Navigate to IAM & Admin > Roles and select CREATE ROLE.
Specify role name and description.
Select ADD PERMISSIONS.
Select the following permissions:
- cloudkms.cryptoKeyVersions.useToEncrypt
- cloudkms.cryptoKeyVersions.viewPublicKey
- secretmanager.versions.access
- storage.objects.get
- storage.objects.create
- storage.objects.delete
- storage.objects.list
- storage.objects.update
- storage.buckets.get
- cloudfunctions.functions.get
- cloudfunctions.functions.update
- cloudfunctions.functions.sourceCodeGet
- cloudfunctions.functions.sourceCodeSet
- iam.serviceAccounts.actAs
Click Add and then Create.

Alternatively, you can run the following command from the Cloud Shell Terminal.

      # Keep the permission list on a single line: indented continuation lines
      # would be split into separate arguments by the shell.
      gcloud iam roles create role-id \
      --project=project-id \
      --title=role-title \
      --description=role-description \
      --stage=GA \
      --permissions=cloudkms.cryptoKeyVersions.useToEncrypt,cloudkms.cryptoKeyVersions.viewPublicKey,secretmanager.versions.access,storage.objects.get,storage.objects.create,storage.objects.delete,storage.objects.list,storage.objects.update,storage.buckets.get,cloudfunctions.functions.get,cloudfunctions.functions.update,cloudfunctions.functions.sourceCodeGet,cloudfunctions.functions.sourceCodeSet,iam.serviceAccounts.actAs

role-id
is the name of the role, such as ptyProtectRole.
project-id
is the name of the project, such as my-project-id.
role-title
is a display title for the role.
role-description
is a short description of the role, such as “My custom role description”.

Sample output:


      Created role [role-id]. 
      description: role-description 
      etag: *****************
      includedPermissions: 
      - cloudfunctions.functions.get 
      - cloudfunctions.functions.sourceCodeGet 
      - cloudfunctions.functions.sourceCodeSet 
      - cloudfunctions.functions.update 
      - cloudkms.cryptoKeyVersions.useToEncrypt 
      - cloudkms.cryptoKeyVersions.viewPublicKey 
      - iam.serviceAccounts.actAs 
      - secretmanager.versions.access 
      - storage.buckets.get 
      - storage.objects.create 
      - storage.objects.delete 
      - storage.objects.get 
      - storage.objects.list 
      - storage.objects.update 
      name: projects/{project-id}/roles/{role-id} 
      stage: GA 
      title: role-title
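After the role is created, the includedPermissions list in the output can be compared with the required list. The Python sketch below does that comparison with a set difference. The permission names are copied from the steps above; the helper name and the incomplete sample list are hypothetical.

```python
# Permissions required for the Agent Function role, as listed above.
REQUIRED_AGENT_PERMISSIONS = {
    "cloudkms.cryptoKeyVersions.useToEncrypt",
    "cloudkms.cryptoKeyVersions.viewPublicKey",
    "secretmanager.versions.access",
    "storage.objects.get",
    "storage.objects.create",
    "storage.objects.delete",
    "storage.objects.list",
    "storage.objects.update",
    "storage.buckets.get",
    "cloudfunctions.functions.get",
    "cloudfunctions.functions.update",
    "cloudfunctions.functions.sourceCodeGet",
    "cloudfunctions.functions.sourceCodeSet",
    "iam.serviceAccounts.actAs",
}


def missing_permissions(included) -> list:
    """Return the required permissions absent from a role's includedPermissions."""
    return sorted(REQUIRED_AGENT_PERMISSIONS - set(included))


# Hypothetical role output that lacks one permission.
incomplete = REQUIRED_AGENT_PERMISSIONS - {"storage.objects.update"}
print(missing_permissions(incomplete))

# The same set can be joined into a single-line value for --permissions.
permissions_arg = ",".join(sorted(REQUIRED_AGENT_PERMISSIONS))
```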

Agent Service Account

To create Agent Service Account:

Log in to Google Account and select project where Protegrity service will be installed.
Navigate to IAM & Admin > Service Accounts.
Select CREATE SERVICE ACCOUNT.
Specify service account name and description.
Select Create and Continue.
In the next step, click Select Role.
Select Custom and select the role created above.
Click Done.
Once the service account is created, the screen should open on the service account. If the screen does not appear, refresh the page with the service account list and select the service account created.
Record the full email. For example, service-account-name@project-id.iam.gserviceaccount.com
Agent Function Service Account Email: ___________________

Protect Function IAM role

To create Protect Function IAM role:

Log in to Google Account and select project where Protegrity service will be installed.
Navigate to IAM & Admin > Roles and select CREATE ROLE.
Specify role name and description.
Select ADD PERMISSIONS.
Select the cloudkms.cryptoKeyVersions.useToDecrypt permission.
Click Add and then Create.

Protect Service Account

To create Protect Service Account:

Log in to Google Account and select the project where Protegrity service will be installed.
Navigate to IAM & Admin > Service Accounts.
Select CREATE SERVICE ACCOUNT.
Specify service account name and description.
Select Create and Continue.
In the next step, click Select Role. Then select Custom and select the role created above.
Click Done.
Once the service account is created, the screen should open on the service account. If the screen does not appear, refresh the page with the service account list and select the service account created.
Record the full email. For example, service-account-name@project-id.iam.gserviceaccount.com.
Protect Function Service Account Email: ___________________

3.3 - Protect Service Installation

Product Installation Guide.

Preparation

Ensure that all the steps in Pre-Configuration are performed.
Log in to the Google Cloud account where Protegrity will be installed.
Select the project.
Ensure that you have access to a command shell on your computer, or to Cloud Shell, with Terraform CLI v0.14 or higher installed.
Ensure that the Terraform scripts provided by Protegrity are available on your local computer.

Install Protect Function via Terraform Scripts

Resources created with Terraform scripts include Protect Cloud Functions Service and other required resources depending on Terraform parameters. If you don’t specify the deployment bucket Terraform parameter, a new storage bucket will also be created. You can optionally choose to create a new service account with custom IAM role.

To install using Terraform:

From the command shell, move to the directory where you downloaded the Protegrity installation bundle.
Unzip the main bundle. Then unzip protegrity-cloud-protect-gcp-{version}.zip. Verify that the following files are available:
- pty-protect-gcp/
- main.tf
- outputs.tf
- protegrity-cloud-protect-gcp-{version}.zip
- README.md

Open the main.tf file and update Terraform backend information at the top of the file:

terraform {
  backend "gcs" {
    bucket  = ""
    prefix  = "protegrity/terraform/pty-protect-gcp/state"
  }
}

In the same main.tf file, specify the following Terraform variables. The project values were recorded in Google Cloud Project.
Warning
Google Cloud Function 2nd Generation currently does not support CMEK.

Parameter	Description
project_id	The project id recorded in the pre-configuration step
region	The Region recorded in the pre-configuration step.
deployment_id	Specify short name to identify deployment. This id will be added to all resources deployed with Terraform.
deployment_bucket	Use Deployment Bucket Name recorded in pre-configuration or leave empty to create new bucket.
create_service_account	Leave this as false if you created service account in pre-configuration. Otherwise set to true.
protect_function_service_account_email	Use Protect Function Service account recorded in pre-configuration or leave empty.
api_gateway_gcp_service_account_issuer	Allows setting issuer of JSON Web Token credential used to authenticate calls to API Gateway. Set this to API_GCP_SERVICE_ACCOUNT obtained in section Describe the API Integration Object
username_regex	If username_regex is set, the effective policy user will be extracted from the user in the request. Note: See Configuring Regular Expression to Extract Policy Username to learn how to extract the username from the request.
max_instance_count	GCP Cloud Functions advanced configuration
available_memory_mb	GCP Cloud Functions advanced configuration
timeout_seconds	GCP Cloud Functions advanced configuration
gen2_available_cpu	2nd Gen Cloud Function advanced configuration
gen2_container_concurrency	2nd Gen Cloud Function advanced configuration
upgrade_step	Set this variable when upgrading to the latest version.
labels	You can set this map to include labels for deployed resources. Pay attention to GCP label requirements. For more information, refer to the following link https://cloud.google.com/compute/docs/labeling-resources. For example, only use lowercase and maximum length of 63 characters.
min_log_level	Minimum log level for log forwarder function. One of off|severe|warning|info|config|all. Defaults to ‘severe’.
pty_log_output	Audit log output. Accepted values: “” (empty string), “pub_sub”. Note: When set to “pub_sub”, audit logs will be aggregated and sent to the Pub/Sub topic. See the Log Forwarder installation section for more details.
pty_pub_sub_topic	Pub/Sub topic where audit logs will be sent.

Run the following command.
```
terraform init
```
Terraform will download necessary providers.
Run the following command to verify configuration and print out deployment plan.
```
terraform plan
```
Run the following command to deploy resources to your account.
```
terraform apply
```
Once deployment is complete Terraform will print output variables.
Record the following values:
- protect_function_name: ________________________________
- protect_function_url: __________________________
- api_gateway_managed_service: _____________________________
- api_gateway_protect_service_url: ____________________
- protect_function_resource_name: _______________________

Test Protect Function Installation

Before continuing with next steps, you can verify whether Cloud Functions are installed correctly. This step is optional and can be skipped.

Below is an example curl command to test your function.
Before you can execute it, you need to obtain a temporary authentication token. Run the gcloud auth login command and then the gcloud auth print-identity-token command. The logged-in gcloud user must have the roles/run.invoker role. Record the output of the print-identity-token command.
gcloud_auth_token: _________________
Replace {protect_function_url} with the value recorded in the previous step.
Replace {gcloud_auth_token} with the value recorded in the step above.

Run the following CURL command to test Function deployment.

curl -X POST "{protect_function_url}" \
        -H 'Authorization:Bearer {gcloud_auth_token}' \
-H 'sf-custom-X-Protegrity-HCoP-Rules: {"jsonpaths":[{"op_type":"unprotect","data_element":"alpha"}]}' \
-H 'sf-context-current-user: test' \
-H 'sf-external-function-current-query-id: test-id' \
-H 'Content-Type: application/json' \
-d '{ 
  "data": [ 
    ["0", "UtfVk UHgcD!"] 
  ] 
} 
'

Note

When you copy and paste the curl command, make sure each header is on its own line.

Verify the following output:
```
{"data":[[0,"hello world!"]]}
```
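The same test request can be assembled programmatically, which avoids shell quoting issues with the JSON header. The Python sketch below builds the headers and body used by the curl command above and, separately, defines a helper that would send them with the standard library. The function names are hypothetical; the header names, the "alpha" data element, and the sample value come from the curl example, and the send helper is not called here because it needs a deployed function and a valid token.

```python
import json
import urllib.request


def build_protect_request(token, data_element, op_type, values, user="test"):
    """Return (headers, body) matching the curl example above."""
    rules = {"jsonpaths": [{"op_type": op_type, "data_element": data_element}]}
    # Row numbers are sent as strings to mirror the curl example above.
    body = {"data": [[str(i), value] for i, value in enumerate(values)]}
    headers = {
        "Authorization": f"Bearer {token}",
        "sf-custom-X-Protegrity-HCoP-Rules": json.dumps(rules, separators=(",", ":")),
        "sf-context-current-user": user,
        "sf-external-function-current-query-id": "test-id",
        "Content-Type": "application/json",
    }
    return headers, json.dumps(body).encode("utf-8")


def send(url, headers, body):
    """POST the request and return the decoded JSON response (not called here)."""
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


headers, body = build_protect_request(
    "{gcloud_auth_token}", "alpha", "unprotect", ["UtfVk UHgcD!"]
)
print(headers["sf-custom-X-Protegrity-HCoP-Rules"])
print(body.decode("utf-8"))
```

To run it against a deployment, replace the token placeholder and call send with the protect_function_url recorded from the Terraform outputs.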

3.4 - Snowflake Configuration

Configure Snowflake to access the API Gateway.

The following sections will configure Snowflake to access the API Gateway. The Terraform installation deployed a sample policy that can be used to smoke test the installation.

Ensure that the current user can assume the Account Administrator role. This role is required to create the Snowflake API Integration object.

Create the Snowflake API Integration Object

From the Snowflake console worksheet, select the role ACCOUNTADMIN.
Paste the following text and replace the two parameters <api_gateway_managed_service> and <api_gateway_protect_service_url> with values recorded in the last installation step of Install Protect Function via Terraform Scripts, then run the following Data Definition Language (DDL) in the console to create API integration object:
```
create or replace api integration protegrity_api 
api_provider = google_api_gateway
google_audience = '<api_gateway_managed_service>' 
enabled = true
api_allowed_prefixes = ('<api_gateway_protect_service_url>/pty/snowflake');
```
Note
The object name protegrity_api can be replaced with a name of your choice; however, the name you choose must be used consistently throughout the installation steps below.

Describe the API Integration Object

We require values generated by the Snowflake integration object to configure the API Gateway Authorization.

To describe API integration objects:

Run the following query in the console.

DESCRIBE API INTEGRATION protegrity_api;

Record the API_GCP_SERVICE_ACCOUNT value from the resulting query:
- API GCP Service Account: ___________________

Update API Gateway Authorization Configuration

This step allows the Snowflake service account to invoke Protect API Gateway endpoint.

Update Protect API Gateway Endpoint:

Return to Terraform script used to install Protegrity Protect service.
Open main.tf and update api_client_service_account_email with the API GCP Service Account recorded in previous step.
Run terraform apply.
Wait until the process completes.

Test Connectivity

Perform the following steps to verify whether Snowflake is working correctly with the Protegrity product.

Access the Snowflake SQL console.

Copy and paste the following snippet into a worksheet.

CREATE OR REPLACE SECURE EXTERNAL FUNCTION PTY_UNPROTECT_SAMPLE_POLICY(VAL VARCHAR)
    RETURNS VARCHAR(16777216)
    IMMUTABLE
    API_INTEGRATION = PROTEGRITY_API
    HEADERS =(  
    'X-Protegrity-HCoP-Rules'=
    '{"jsonpaths":[{"op_type":"UNPROTECT","data_element":"alpha"}]}'
    ) 
    CONTEXT_HEADERS = (CURRENT_USER,CURRENT_TIMESTAMP,CURRENT_ACCOUNT)
    COMMENT='Unprotects text using an alpha token type.'
    AS '<api_gateway_protect_service_url>/pty/snowflake';

Replace the placeholder <api_gateway_protect_service_url> with your API Gateway URL captured in the Terraform outputs (api_gateway_protect_service_url).

Run the following query in the console:

select pty_unprotect_sample_policy('UtfVk UHgcD!');

Verify that the string hello world! is returned.
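
When debugging the test query, it can help to know the envelope that Snowflake external functions exchange with a remote service: a JSON body of the form {"data": [[row_number, value], ...]}, answered with the same shape and matching row numbers. The Python sketch below only illustrates that envelope. The helper names are hypothetical and the transform is a stand-in; nothing here calls the Protect service.

```python
import json
from typing import Callable, List


def build_batch(values: List[str]) -> str:
    """Serialize values as an external function request body."""
    return json.dumps({"data": [[i, v] for i, v in enumerate(values)]})


def handle_batch(body: str, transform: Callable[[str], str]) -> str:
    """Apply transform to every row, keeping the original row numbers."""
    rows = json.loads(body)["data"]
    return json.dumps({"data": [[row_no, transform(value)] for row_no, value in rows]})


def read_results(body: str) -> List[str]:
    """Return response values ordered by row number."""
    rows = sorted(json.loads(body)["data"], key=lambda row: row[0])
    return [value for _, value in rows]


request_body = build_batch(["abc", "xyz"])
# str.upper is a placeholder transform, not a Protegrity operation.
response_body = handle_batch(request_body, str.upper)
print(read_results(response_body))
```

A response that drops or renumbers rows is rejected by Snowflake, so row numbers are worth checking first when a query fails.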

Troubleshooting

Use Cloud Logging to troubleshoot errors.

From your Google Console, navigate to Logging > Logs Explorer

Use the Log Fields panel to filter results by resource type, name, severity, and other criteria. For instance, to see the latest Cloud Protect Function logs, make the following selections:

RESOURCE TYPE = Cloud Function 
    FUNCTION NAME = pty-protect-{deployment-id}

You can also use the Log Filter Query and run the following query:

resource.type="cloud_function" 
    resource.labels.function_name="pty-protect-"

You can change the time range in the top right corner. If Protegrity policy is configured to generate audit logs, you can use the following query to only view the audit logs:

resource.type="cloud_function" 
  resource.labels.function_name="pty-protect-" 
  jsonPayload.message=~"\"type\":\"audit\""
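
The filters above can also be generated rather than typed by hand. This is a minimal Python sketch that assumes the function is named pty-protect-{deployment-id}, as shown in the Log Fields selection earlier. The helper name and the "demo" deployment id are hypothetical.

```python
def protect_function_filter(deployment_id: str, audit_only: bool = False) -> str:
    """Build a Logs Explorer filter for the Protect function logs."""
    lines = [
        'resource.type="cloud_function"',
        f'resource.labels.function_name="pty-protect-{deployment_id}"',
    ]
    if audit_only:
        # Raw string keeps the backslash-escaped quotes exactly as the filter needs them.
        lines.append(r'jsonPayload.message=~"\"type\":\"audit\""')
    return "\n".join(lines)


print(protect_function_filter("demo", audit_only=True))
```

Paste the printed text into the Logs Explorer query box.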

3.5 - Policy Agent Installation

Install the policy agent.

Policy Agent Function installation is done via Terraform scripts provided by Protegrity. Before running the template, some resources must be created manually.

ESA Server

The Policy Agent function requires a running ESA server that is accessible from the Agent Cloud Function on TCP port 8443. Make sure inbound connections on TCP port 8443 are allowed for the network where ESA is hosted.

Note down ESA IP address:

ESA IP Address (EsaIpAddress): ___________________

Certificates on ESA

By default, ESA is configured with self-signed certificates, which can only be validated using the self-signed CA certificate supplied in the Cloud Function environment variables.

If ESA is configured with publicly signed certificates, skip this section. The Cloud Function will use the public CA to validate the ESA certificates.

To obtain self-signed CA certificate from ESA:

Log in to ESA Web UI.
Select Settings > Network > Manage Certificates.
Hover over Server Certificate and click the download icon to download the CA certificate.
After the certificate is downloaded, open the PEM file in a text editor and replace every new line with the escaped sequence \n.
To escape new lines from the command line, use one of the following commands depending on your operating system:
Linux Bash:
```
awk 'NF {printf "%s\\n",$0;}' ProtegrityCA.pem > output.txt
```
Windows PowerShell:
```
(Get-Content '.\ProtegrityCA.pem') -join '\n' | Set-Content 'output.txt'
```
Record the certificate content with new lines escaped.
ESA CA Server Certificate (EsaCaCert): ___________________
This value will be used to set the pty_esa_ca_server_cert Terraform variable in the installation section.

For more information about ESA certificate management, refer to the Certificate Management Guide in the ESA documentation.
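
If neither awk nor PowerShell is at hand, the same newline escaping can be done in Python. This minimal sketch mirrors the awk command above: blank lines are skipped and every remaining line is followed by a literal backslash-n. The function name and the sample certificate body are placeholders.

```python
def escape_pem_newlines(pem_text: str) -> str:
    """Return PEM content on one line, each non-empty line followed by a literal \\n."""
    return "".join(line + "\\n" for line in pem_text.splitlines() if line.strip())


# Placeholder content; a real certificate has many base64 lines.
sample = "-----BEGIN CERTIFICATE-----\nQUJD\n\n-----END CERTIFICATE-----\n"
print(escape_pem_newlines(sample))
```

The printed value contains no real line breaks and can be recorded as EsaCaCert.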

Identify or Create a new VPC

Google Cloud VPC is used to route traffic from the Policy Agent Cloud Function to ESA. If your ESA is in a Google Cloud VPC, it is recommended to create a Serverless VPC Access connector and record its name:

google_vpc_access_connector_name: ___________________

Note

For more information on serverless VPC connector, refer to the following link. https://cloud.google.com/vpc/docs/configure-serverless-vpc-access

If ESA is not in a Google Cloud VPC, you can either create a VPC yourself or let the Terraform script create one. The Terraform script will create the following elements:

NAT gateway: connects to ESA outside the Google Cloud network.
External IP address: can be added to the firewall allowlist in the network environment where ESA is hosted.
Serverless VPC access: allows connectivity from the Cloud Function to the VPC.

Note

These services will incur additional Google Cloud charges.

Creating ESA Credentials

The Policy Agent Function requires ESA credentials, provided through one of two options: Secret Manager or a custom Cloud Function.

Note

The ESA user requires a role with the DPS Admin and Export Certificates permissions. Security Administrator is one of the predefined roles that contains these permissions. However, for separation of duties it is recommended to create a custom role.

Secret Manager

Secret Manager is the recommended option for storing ESA credentials.

Create ESA credentials secrets:

Log in to Google Account and select project where Protegrity service will be installed.
Go to Security > Secret Manager.
Select CREATE SECRET.

Specify the Secret Value:

{
  "username": "{esa_username}", 
  "password": "{esa_password}"
}

Select Create Secret.
Once the secret is created, the secret screen should open. If it does not, click the secret name to see the screen with secret versions.
Click Actions next to the secret version you just created.
Select Copy Resource ID and record the full secret version path, for example, projects/{project-id}/secrets/{secret name}/versions/2.
Secret resource id: ___________________
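
Before pasting values into Secret Manager and Terraform, it can be useful to check their shape. The following minimal Python sketch builds the secret value shown above and checks that a recorded resource ID looks like a full secret version path. The helper names are hypothetical, and the pattern only checks the path segments, not whether the secret exists.

```python
import json
import re

# projects/{project-id}/secrets/{secret name}/versions/{version}
SECRET_VERSION_RE = re.compile(r"^projects/[^/]+/secrets/[^/]+/versions/[^/]+$")


def build_secret_value(username: str, password: str) -> str:
    """Return the JSON document expected as the secret value."""
    return json.dumps({"username": username, "password": password}, indent=2)


def is_secret_version_path(resource_id: str) -> bool:
    """True when resource_id has the full secret version path layout."""
    return SECRET_VERSION_RE.match(resource_id) is not None


print(build_secret_value("{esa_username}", "{esa_password}"))
print(is_secret_version_path("projects/1234/secrets/ESA_ADMIN_CREDENTIALS/versions/2"))
```

A path that stops at the secret name, without /versions/..., is reported as invalid.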

Custom Cloud Function

Alternatively, you may provide a custom Cloud Function that returns the ESA credentials to the Policy Agent. One use case is reading the ESA credentials from a third-party password vault.

Create the Cloud Function:

Create a new 2nd gen Cloud Function using any runtime.
1. The Policy Agent does not provide an input payload.
2. The Cloud Function must return a response according to the following schema:
```
response:
  type: object
  properties:
    username: string
    password: string
```
  For example,
```
example output: {"username": "admin", "password": "Password1234"} 
```
3. Sample GCP Function in Python:
```
def handler(request): 
    return {"username": "admin", "password": "password1234"} 
```
  Warning
  Protegrity does not recommend hardcoding the ESA password in clear text.
Grant the Cloud Run Invoker role to the Policy Agent function service account.
Grant the cloudfunctions.functions.get permission to the Policy Agent function service account role.
Record the Function name:
ESA CREDENTIALS FUNCTION NAME: _______________
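
To avoid hardcoding, the handler can delegate to whatever secret store you use. The Python sketch below shows that shape under stated assumptions: lookup stands for a call into your own vault client, and the key names esa-username and esa-password are hypothetical. The dictionary at the end is a stand-in for demonstration only.

```python
from typing import Callable, Dict


def make_credentials_handler(lookup: Callable[[str], str]):
    """Build a handler that fetches ESA credentials through lookup."""

    def handler(request) -> Dict[str, str]:
        # The Policy Agent sends no input payload, so request is ignored.
        return {
            "username": lookup("esa-username"),
            "password": lookup("esa-password"),
        }

    return handler


# Stand-in for a real vault client; replace with your own lookup.
fake_vault = {"esa-username": "policy-agent-user", "esa-password": "example-only"}
handler = make_credentials_handler(fake_vault.__getitem__)
print(sorted(handler(None)))
```

The returned object matches the response schema above: exactly the username and password properties, both strings.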

Install Policy Agent Function through Terraform Scripts

Agent Terraform scripts provided by Protegrity create a Cloud Function in your Google account. If you don’t specify the deployment bucket Terraform parameter, a new storage bucket will also be created. You can also create the following optional resources by specifying the corresponding parameters:

Service account with IAM role
VPC with NAT external IP
VPC access connector

To install Policy Agent Function through Terraform:

From command shell, move to the directory where you downloaded Protegrity installation bundle.
Unzip the bundle, then unzip the protegrity-agent-gcp-{version}.zip. Verify that the following files are available:
- pty-agent-gcp/
- main.tf
- outputs.tf
- README.md

Open the main.tf file and update Terraform backend information at the top of the file:


terraform {
  backend "gcs" {
    bucket  = ""
    prefix  = "protegrity/terraform/pty-protect-gcp/state"
  }
}

Set the bucket property to the Terraform Backend Bucket Name recorded in Google Cloud Storage.
Set the prefix property to a value unique to your deployment.

In the same main.tf file, specify the following Terraform variables.

Parameter	Description
project_id	The Project ID recorded in the pre-configuration step
region	The Region recorded in the pre-configuration step, for example, us-central1.
deployment_id	Specify short name to identify deployment. This id will be added to all resources deployed with Terraform.
deployment_bucket	Use Deployment Bucket Name recorded in pre-configuration or leave empty to create new bucket.
deployment_bucket_location	Geographical location of deployment bucket, e.g., US, EU, ASIA.
deployment_file_directory_path	Path to directory where deployment zip file is located. By default the deployment file should be in the same directory as this main.tf file.
policy_download_cron_expression	Cron expression determining how often policy agent function will run to synchronize security policy from ESA.
create_service_account	Leave this as false if you created service account in pre-configuration. Otherwise set to true.
agent_function_service_account_email	Use Agent Function Service account recorded in pre-configuration or leave empty.
create_vpc	Set this to true, if you would like to create VPC with NAT, external IP and vpc access connector, otherwise leave empty. This will be ignored if google_vpc_access_connector_name is specified.
google_vpc_access_connector_name	Specify the existing VPC access connector name you identified in earlier step, otherwise leave empty. This setting will disable create_vpc = true.
google_vpc_access_connector_full_resource_name	Alternative configuration for the VPC access connector. If this parameter is set, google_vpc_access_connector_name will be ignored. Use this parameter if the VPC connector is in a different region/project than the one specified for the deployment.
labels	You can set this map to include labels for deployed resources. Pay attention to GCP label requirements. More information: https://cloud.google.com/compute/docs/labeling-resources. For example, use only lowercase characters and a maximum length of 63 characters.

All the values were recorded in Pre-Configuration and this section’s previous steps.
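
The three VPC-related parameters interact, so the following Python sketch spells out the precedence described in the table. It is an illustration only, not the Terraform module's logic. The table does not state how the full resource name interacts with create_vpc; treating it as overriding create_vpc is an assumption here.

```python
def resolve_vpc_connector(
    create_vpc: bool = False,
    connector_name: str = "",
    connector_full_resource_name: str = "",
) -> str:
    """Describe which VPC connector setting takes effect."""
    if connector_full_resource_name:
        # Full resource name overrides the plain connector name.
        return f"existing connector (full resource name): {connector_full_resource_name}"
    if connector_name:
        # A named connector causes create_vpc to be ignored.
        return f"existing connector (name): {connector_name}"
    if create_vpc:
        return "new VPC with NAT, external IP and VPC access connector"
    return "no VPC connector configured"


print(resolve_vpc_connector(create_vpc=True, connector_name="my-connector"))
```

In this example the existing connector wins, and no new VPC would be created.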

Provide the policy update Terraform variables. In the same main.tf file, you can specify configuration related to policy updates. Any of these variables can be updated at any time by running Terraform again or directly in the GCP Console. Most of the values were recorded in previous installation steps.

Parameter	Description	Notes
pty_esa_ip	ESA IP address or hostname	ESA Server
pty_esa_ca_server_cert	ESA self-signed CA certificate used by the Policy Agent Function to ensure ESA is the trusted server.	Recorded in step Certificates on ESA. If ESA is configured with publicly signed certificates, the pty_esa_ca_server_cert configuration will be ignored.
gcp_esa_credentials_secret_resource_id	ESA username and password (encrypted value by Google Cloud Secrets Manager). For example, projects/{project-id}/secrets/{secret name}/versions/{version}	Creating ESA Credentials
pty_esa_credentials_function	ESA credentials GCP function resource name. For example, projects/{project-name}/locations/{region}/functions/{esa-credentials-function-name}.	Recorded in step Option 2: Custom Cloud Function ESA CREDENTIALS FUNCTION NAME. Presence of gcp_esa_credentials_secret_resource_id will cause this value to be ignored. The Policy Agent Function must have network access and IAM permissions to call the ESA Credentials function you have created in Option 2: Custom Cloud Function.
gcp_kms_key_resource_name	The full resource name of the key. For example, projects/{project-id}/locations/{region}/keyRings/{key-ring}/cryptoKeys/{key-name}/cryptoKeyVersions/1	Key Management Service
gcp_protect_function_resource_name	Comma-separated list of Protect function resource names. For example, projects/{project-id}/locations/{region}/functions/{function-name1},projects/{project-id}/locations/{region}/functions/{function-name2}	Use protect_function_resource_name recorded in the Protect Service Installation section.
gcp_policy_retention_storage_bucket	Deployment Bucket Name where the encrypted policy will be written.	You can use deployment bucket recorded in Google Cloud Storage section, or you can specify other existing bucket.
gcp_policy_version_object_key	Filename of the encrypted policy stored in the Deployment Bucket Name	Default: policy.zip
retain_policy_versions	Number of policy versions to retain as backup. (e.g. 2 will retain the latest 2 policies and remove older ones). -1 retains all.	Default: 10
disable_deploy	This flag can be either 1 or 0. If set to 1, the agent does not update the Protect function with the newest policy. Otherwise, the policy is saved in the Cloud Storage bucket and deployed to the Protect function. Warning: Agent deployment requires a deployed Protect or Log Forwarder Cloud Run function even when disable_deploy is set.	Default: 0
log_level	Application and audit log verbosity level	Default: INFO. Allowed values, from most to least verbose: DEBUG, INFO, WARNING, ERROR
policy_pull_timeout	Time in seconds to wait for the ESA to send the full policy	Default: 20
pty_core_casesensitive	Specifies whether policy usernames should be case sensitive	Default: no. Allowed values: yes, no
pty_core_emptystring	Overrides the default behavior for empty string response values. When set to null, empty string response values are returned as null values. For instance, with null: (un)protect('') -> null; with empty: (un)protect('') -> ''	Default: empty. Allowed values: null, empty
esa_connection_timeout	Time in seconds to wait for the ESA response	Default: 5s
pty_addipaddressheader	When enabled, agent will send its source IP address in the request header. This configuration works in conjunction with ESA hubcontroller configuration ASSIGN_DATASTORE_USING_NODE_IP (default=false). See Associating ESA Data Store With Cloud Protect Agent for more information.	Default: yes. Allowed values: yes, no
pty_datastore_key	ESA policy datastore public key fingerprint (64 characters long), for example, 123bff642f621123d845f006c6bfff27737b21299e8a2ef6380aa642e76e89e5.	Note: This configuration is not applicable for ESA versions lower than 10.2. The export key is the public part of an asymmetric key pair created in Create KMS Key. A user with Security Officer permissions adds the public key to the data store in ESA via Policy Management > Data Stores > Export Keys. The fingerprint can then be copied using the Copy Fingerprint icon next to the key. Refer to Exporting Keys to Datastore for details. Note: For PPC deployments, see PPC Appendix: Policy Agent Certificate and Key Guidance for details on obtaining and using the datastore key fingerprint.
pty_sync_datastore	Optional name of the policy datastore to sync with ESA. Refer to ESA documentation for more information on policy datastore sync.	Default: ""
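
Two of the values in this table have a fixed shape that is easy to get wrong when copying. The following Python sketch is a pre-flight check under stated assumptions: the fingerprint is taken to be 64 hexadecimal characters (the table states the length; hex is inferred from the example), and function names are taken to follow the projects/.../locations/.../functions/... layout shown above. The helper names are hypothetical.

```python
import re
from typing import List

FINGERPRINT_RE = re.compile(r"^[0-9a-fA-F]{64}$")
FUNCTION_RE = re.compile(r"^projects/[^/]+/locations/[^/]+/functions/[^/]+$")


def is_datastore_key_fingerprint(value: str) -> bool:
    """True for a 64-character hexadecimal string."""
    return FINGERPRINT_RE.match(value) is not None


def invalid_function_names(csv_value: str) -> List[str]:
    """Return the entries of a comma-separated list that are not full function resource names."""
    names = [name.strip() for name in csv_value.split(",")]
    return [name for name in names if not FUNCTION_RE.match(name)]


print(is_datastore_key_fingerprint("ab" * 32))
print(invalid_function_names(
    "projects/test/locations/us-central1/functions/pty-protect-test,pty-protect-test"
))
```

An empty gcp_protect_function_resource_name is reported as one invalid (empty) entry, which matches the requirement that the setting may not be left empty.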

From local command line or Cloud Shell, change directory to location of the main.tf, for example:
```
protegrity-agent-gcp-{version}/pty-agent-gcp/
```
Run terraform init.
Terraform will download necessary providers.
Run terraform plan to verify configuration and print out deployment plan.
Run terraform apply to deploy resources to your account. Once deployment is complete, Terraform will print output variables.

Below is the sample output from successful deployment.


        Apply complete! Resources: 1 added, 0 changed, 0 destroyed. 
        Outputs: 
        agent_function_service_account_email = "pty-agent-test@test.iam.gserviceaccount.com" 
        deployment_bucket_name = "test-bucket" 
        nat_ip = 0 
        policy_agent_function_deployment_object = "pty-agent-test-1.0.1.zip" 
        policy_agent_function_name = "pty-agent-test"
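
If later steps are scripted, the outputs can be read from terraform output -json instead of being copied from the console. That command prints one JSON object per output, each with a value field. The minimal Python sketch below parses such a document. The sample string is hand-written to resemble the output above, and the helper name is hypothetical.

```python
import json
from typing import Any, Dict


def output_values(terraform_output_json: str) -> Dict[str, Any]:
    """Map each Terraform output name to its value."""
    outputs = json.loads(terraform_output_json)
    return {name: entry["value"] for name, entry in outputs.items()}


# Hand-written sample in the shape produced by `terraform output -json`.
sample = json.dumps({
    "policy_agent_function_name": {"sensitive": False, "type": "string", "value": "pty-agent-test"},
    "deployment_bucket_name": {"sensitive": False, "type": "string", "value": "test-bucket"},
})

values = output_values(sample)
print(values["policy_agent_function_name"])
```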

Test Agent Function Installation

After configuration is complete, you can test the function.

To test and run the Policy Agent Function:

From the Google Cloud console, go to Cloud Run Functions or Cloud Run.
Click the function you just deployed: pty-agent-{deployment-id}.
Click the Test button at the top right of the screen.
Scroll down to CLI test command.
Copy and run the curl command to trigger the agent. Alternatively, use the option Test in Cloud Shell.
Wait for the function to complete.
Note
The Policy Agent function deploys a new version of the Cloud Protect Function with updated policy. This process may take several minutes. During this time, the previous policy version remains available until the function update is complete.
Navigate to the LOGS tab to view agent execution logs.

Alternatively, you may review the logs by navigating to Logging from your Google Console. In the Logs Explorer, select the All resources dropdown, then Cloud Run Revision > pty-agent-{deployment-id}, and apply.

Note

Example logs (most recent first):


Function execution took 23892 ms, finished with status: 'ok'
iap.policy_deployer:INFO:Deleting object [policy_v07-26-2021_21-00-00.zip]
iap.policy_deployer:INFO:Deleting object [policy_v07-26-2021_19-03-23.zip]
iap.policy_deployer:INFO:Removing old function versions in [test-artifacts]. Will retain [1] versions.
iap.policy_deployer:INFO:Updating function [projects/cloud-engineering-315519/locations/us-central1/functions/pty-protect-test] with new deployment artifact [test-artifacts/policy_v07-26-2021_21-00-01.zip] ...
iap.imp_creator:INFO:Uploading encrypted policy data to: [test-artifacts/policy_v07-26-2021_19-03-23.zip]
iap.imp_creator:INFO:Preparing deployment package ...
iap_agent_gcp.cloud_functions_util:INFO:Downloading function deployment package ...
iap.imp_creator:INFO:Encrypting policy package ...
iap.policy_agent:INFO:Preparing new policy deployment ...
iap.policy_agent:WARNING:Current policy deployment has no checksum_mapping metadata:
iap.imp_creator:INFO:Checking current policy version ...
iap.policy_agent:INFO:Current deployment package version: [policy_v07-26-2021_18-51-43.zip].
iap.policy_agent:INFO:Getting current policy metadata ...
iap.imp_creator:INFO:Policy downloaded successfully ...
iap.imp_creator:INFO:PepServer started ...
iap.imp_creator:INFO:Starting PepServer ...
iap.imp_creator:INFO:PepServer configured successfully
iap.imp_creator:INFO:Downloading certificates from ESA ...
iap.imp_creator:INFO:Configuring PepServer ...
iap.policy_agent:INFO:Starting policy agent ...
iap.policy_agent:INFO:Using Secret Manager [GCP_ESA_CREDENTIALS_SECRET_RESOURCE_ID] to retreive ESA credentials.
iap.policy_agent:INFO:PTY_CORE_CASESENSITIVE [no]
iap.policy_agent:INFO:PTY_CORE_EMPTYSTRING [empty]
iap.policy_agent:INFO:RETAIN_POLICY_VERSIONS [1]
iap.policy_agent:INFO:GCP_PROTECT_FUNCTION_RESOURCE_NAME [projects/test/locations/us-central1/functions/pty-protect-test]
iap.policy_agent:INFO:GCP_POLICY_VERSION_OBJECT_KEY [policy.zip]
iap.policy_agent:INFO:GCP_POLICY_RETENTION_STORAGE_BUCKET [test-artifacts]
iap.policy_agent:INFO:GCP_KMS_KEY_RESOURCE_NAME [projects/test/locations/us-central1/keyRings/test-key-ring/cryptoKeys/test-protect-asymmetric/cryptoKeyVersions/1]
iap.policy_agent:INFO:GCP_ESA_CREDENTIALS_SECRET_RESOURCE_ID [projects/1234/secrets/ESA_ADMIN_CREDENTIALS/versions/2]
iap.policy_agent:INFO:PTY_ESA_IP [54.236.107.39]
iap.policy_agent:INFO:POLICY_PULL_TIMEOUT [20]
iap.policy_agent:INFO:DISABLE_DEPLOY [0]
Function execution started
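
The deletions in the log above follow from RETAIN_POLICY_VERSIONS [1]: only the newest policy object is kept. The Python sketch below reproduces that selection as an illustration, not as the agent's actual code. It assumes object names follow the policy_v{month}-{day}-{year}_{hour}-{minute}-{second}.zip pattern seen in the log, and it treats any negative value like -1 (retain all).

```python
from datetime import datetime
from typing import List, Tuple

# Assumed naming pattern, inferred from the example log lines.
NAME_FORMAT = "policy_v%m-%d-%Y_%H-%M-%S.zip"


def split_retained(names: List[str], retain: int) -> Tuple[List[str], List[str]]:
    """Return (kept, deleted) policy objects, newest first."""
    ordered = sorted(names, key=lambda n: datetime.strptime(n, NAME_FORMAT), reverse=True)
    if retain < 0:
        return ordered, []
    return ordered[:retain], ordered[retain:]


objects = [
    "policy_v07-26-2021_19-03-23.zip",
    "policy_v07-26-2021_21-00-01.zip",
    "policy_v07-26-2021_21-00-00.zip",
]
kept, deleted = split_retained(objects, retain=1)
print(kept, deleted)
```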

Troubleshooting

Configure additional logging:

Set the log_level Terraform variable on the Agent function to DEBUG.

In the GCP Logs Explorer, you can run the query below, replacing placeholders with your deployment id and project name.

resource.type="cloud_run_revision"
resource.labels.service_name=~"pty-agent-<deployment-id>"
severity=ERROR OR textPayload=~"\[error\]"
-logName="projects/<gcp-project-id>/logs/run.googleapis.com%2Frequests"

Expand each log entry for more details. Check jsonPayload > exception for a more detailed error.

Error message	Details
`iap_agent_gcp.cloud_functions_util.CloudFunctionsApiException: Resource 'projects/<account>/locations/<region>/functions/protegrity-protect-<deployment-id>' was not found`	This error may indicate the following configuration issues: (1) The function name in the gcp_protect_function_resource_name setting was provided incorrectly and therefore cannot be found. (2) disable_deploy has been set, and a dummy function has been entered to work around the gcp_protect_function_resource_name requirement. The Agent deployment requires a deployed Protect or Log Forwarder Cloud Run function to operate.
`[ERROR] policy_agent:Invalid GCP_PROTECT_FUNCTION_RESOURCE_NAME parameter value. Must be a comma separated list of Lambda Function names or ARNs.`	This error may indicate the following configuration issues: (1) The gcp_protect_function_resource_name setting is empty. The Agent deployment requires a deployed Protect or Log Forwarder Cloud Run function to operate, so this setting may not be left empty. (2) The list of function names provided to gcp_protect_function_resource_name contains an invalid function name or is not in valid CSV format.
`[ERROR] iap_agent_gcp.cloud_functions_util:<HttpError 403 when requesting https://cloudfunctions.googleapis.com/v2/projects/<account>/locations/<region>/functions/pty-protect-<deployment-id>:generateDownloadUrl?alt=json returned "Permission 'cloudfunctions.functions.sourceCodeGet' denied on 'projects/<account>/locations/<region>/functions/<deployment-id>'". Details: "Permission 'cloudfunctions.functions.sourceCodeGet' denied on 'projects/<account>/locations/<region>/functions/pty-protect-<deployment-id>'"> [ERROR] policy_agent:Permission 'cloudfunctions.functions.sourceCodeGet' denied on 'projects/<account>/locations/<region>/functions/pty-protect-<deployment-id>' ... iap_agent_gcp.cloud_functions_util.CloudFunctionsApiException: Permission 'cloudfunctions.functions.sourceCodeGet' denied on 'projects/<account>/locations/<region>/functions/pty-protect-<deployment-id>'`	Indicates that the Agent Cloud Run function's identity does not have the sourceCodeGet permission for the Protect/Log Forwarder function(s) provided in the gcp_protect_function_resource_name configuration.
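
When scanning many log entries, the error table above can be turned into a small lookup. The Python sketch below matches a few distinctive substrings from those messages and returns a short hint paraphrased from the Details column. The function name and the hint wording are illustrative only.

```python
# (substring to look for, hint paraphrased from the error table)
HINTS = [
    ("Invalid GCP_PROTECT_FUNCTION_RESOURCE_NAME",
     "gcp_protect_function_resource_name is empty or not a valid comma-separated list."),
    ("cloudfunctions.functions.sourceCodeGet",
     "The Agent function identity lacks the sourceCodeGet permission on the Protect/Log Forwarder function."),
    ("was not found",
     "gcp_protect_function_resource_name points to a function that does not exist."),
]


def diagnose(message: str) -> str:
    """Return the first matching hint for an agent error message."""
    for needle, hint in HINTS:
        if needle in message:
            return hint
    return "No matching hint; expand the log entry and check jsonPayload > exception."


print(diagnose("CloudFunctionsApiException: Resource 'projects/p/locations/r/functions/f' was not found"))
```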

3.6 - Audit Log Forwarder Installation

Install the audit log forwarder.

Audit Log Forwarder installation is done via Terraform scripts provided by Protegrity in the installation bundle.

ESA Audit Store Configuration

ESA server is required as the recipient of audit logs. Verify the information below to ensure ESA is accessible and configured properly.

The ESA server is running and accessible on TCP port 9200.
The Audit Store service is configured and running on ESA. For information related to ESA Audit Store configuration, refer to the Audit Store Guide.

Certificates on ESA

By default, ESA is configured with self-signed certificates, which can only be validated using self-signed CA certificate supplied in Log Forwarder configuration.

Note

Certificate validation can be bypassed for testing purposes. See the section Install Log Forwarder Function via Terraform Scripts.

In case ESA is configured with publicly signed certificates, this section can be skipped since the Log Forwarder will use public CA to validate ESA certificates.

To obtain self-signed CA certificate from ESA:

Download the ESA CA certificate from the /etc/ksa/certificates/plug directory of the ESA.
After the certificate is downloaded, open the PEM file in a text editor and replace every new line with the escaped sequence \n.
To escape new lines from the command line, use one of the following commands depending on your operating system:
Linux Bash:
```
awk 'NF {printf "%s\\n",$0;}' CA.pem > output.txt
```
Windows PowerShell:
```
(Get-Content '.\CA.pem') -join '\n' | Set-Content 'output.txt'
```
Record the certificate content with new lines escaped.
ESA CA Server Certificate (EsaCaCert): ___________________
This value will be used to set the pty_esa_ca_server_cert Terraform variable in the installation section, Install Log Forwarder Function via Terraform Scripts.

For more information about ESA certificate management refer to Certificate Management Guide in ESA documentation.

VPC configuration

Like the Policy Agent Function, the Log Forwarder function requires a Google Cloud VPC to route traffic from the function to ESA. Review the VPC configuration steps for the agent in section Identify or Create a new VPC. The same VPC connector as the Policy Agent can be used. Note down the VPC connector name:

google_vpc_access_connector_name: ___________________

ESA Authentication

Audit Log Forwarder must authenticate with ESA using certificate-based authentication with client certificate and certificate key. Download the following certificates from the /etc/ksa/certificates/plug directory of the ESA:

File Name	Description
client.key	Client certificate key
client.pem	Client certificate (PEM)

Both certificate and certificate key must be converted to single-line values using code similar to the following examples.

Client certificate (client.pem):

Windows PowerShell:

$folder = 'C:\Temp'
cd $folder
(Get-Content "$folder\client.pem") -join '\n' | Set-Content "$folder\one-liner-client.pem"
cat "$folder\one-liner-client.pem"

Linux Bash:

folder="/tmp"
cd "$folder"
awk 'NF {printf "%s\\n",$0}' "client.pem" > "one-liner-client.pem"
cat "one-liner-client.pem"

Client certificate key (client.key):

Windows PowerShell:

$folder = 'C:\Temp'
cd $folder
(Get-Content "$folder\client.key") -join '\n' | Set-Content "$folder\one-liner-client.key"
cat "$folder\one-liner-client.key"

Linux Bash:

folder="/tmp"
cd "$folder"
awk 'NF {printf "%s\\n",$0}' "client.key" > "one-liner-client.key"
cat "one-liner-client.key"

Note

Use single-line certificate and single-line certificate key values below when configuring Log Forwarder.
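
Before using the single-line values, it can be worth confirming that the conversion worked. The Python sketch below checks that a value contains no real line breaks and that, once the literal \n sequences are expanded, it starts and ends with PEM BEGIN and END markers. The function names and the sample content are placeholders, and the check does not validate the certificate itself.

```python
def restore_pem(one_line: str) -> str:
    """Expand literal backslash-n sequences back into real line breaks."""
    return one_line.replace("\\n", "\n")


def looks_like_single_line_pem(one_line: str) -> bool:
    """True when the value is one line that expands to a BEGIN/END delimited PEM block."""
    if "\n" in one_line.strip():
        return False  # still contains real line breaks
    lines = [line for line in restore_pem(one_line).splitlines() if line.strip()]
    return (
        len(lines) >= 3
        and lines[0].startswith("-----BEGIN ")
        and lines[-1].startswith("-----END ")
    )


# Placeholder content; a real certificate has many base64 lines.
converted = "-----BEGIN CERTIFICATE-----\\nQUJD\\n-----END CERTIFICATE-----\\n"
print(looks_like_single_line_pem(converted))
```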

While installing using Terraform template:

Provide the single-line client certificate for pty_esa_client_cert.
Provide the ID of the GCP secret containing the single-line certificate key for pty_esa_client_cert_key_secret_id. The secret is created in a later step.

Configure ESA Secrets In GCP Secret Manager

Audit Log Forwarder Function uses GCP Secret Manager to store ESA Audit Store credentials used during authentication.

For information on how to configure basic and certificate authentication for Audit Store on ESA refer to Audit Store Guide.

Log in to Google Account and select project where Protegrity service will be installed.
Go to Security > Secret Manager.
Select CREATE SECRET.

Specify the Secret Value:

{
  "username": "admin", 
  "password": "{esa_password}"
}

Select Create Secret.
Once the secret is created, the secret screen should open. If it does not, click the secret name to see the screen with secret versions.
Click Actions next to the secret version you just created.
Select Copy Resource ID and record the full secret version path, for example, projects/{project-id}/secrets/{secret name}/versions/2.
ESA Log Forwarder Credentials Secret Name: _________________
Create another secret with the single-line contents of the ESA client certificate key file.
See Certificate Authentication for details on the client certificate key.
Record the full secret version path, for example, projects/{project-id}/secrets/{secret name}/versions/1.
ESA Log Forwarder Client Certificate Key Secret Name: _________________

Create Log Forwarder Service Account

To create Log Forwarder Service Account:

Log in to Google Account and select the project where Protegrity service will be installed.
Navigate to IAM & Admin > Service Accounts.
Select CREATE SERVICE ACCOUNT.
Specify service account name and description.
Select Create and Continue.
In the next step, click Select Role. Then select the following roles:
- Cloud KMS CryptoKey Decrypter
- Pub/Sub Publisher
- Secret Manager Secret Accessor
Click Done.
Once the service account is created, the screen should open on the service account. If the screen does not appear, refresh the page with the service account list and select the service account created.
Record the full email. For example, service-account-name@project-id.iam.gserviceaccount.com.
Log Forwarder Function Service Account Email: ___________________

Create Service Account For Forwarder Pub/Sub

Pub/Sub service requires Cloud Run Invoker permissions in order to be able to send messages to the Forwarder function.

Log in to Google Account and select the project where Protegrity forwarder will be installed.
Navigate to IAM & Admin > Service Accounts.
Select CREATE SERVICE ACCOUNT.
Specify service account name and description.
Select Create and Continue.
In the next step, click Select Role. Then select Cloud Run Invoker.
Click Done.
Once the service account is created, the screen should open on the service account. If the screen does not appear, refresh the page with the service account list and select the service account created.
Record the full email. For example, service-account-name@project-id.iam.gserviceaccount.com.
Pub/Sub Log Forwarder Service Account Email: ___________________

Preparation

Ensure that all the steps in Google Cloud Project are performed.
Log in to the Google Cloud account where Protegrity will be installed.
Select the project.
Ensure that you have access to a command shell on your computer, or to Cloud Shell, with Terraform CLI v0.14 or higher installed.
Ensure that the Terraform scripts provided by Protegrity are available on your local computer.

Install Log Forwarder Function via Terraform Scripts

Resources created with Terraform scripts include Audit Log Forwarder Cloud Functions Service and Pub/Sub topic. If you don’t specify the deployment bucket Terraform parameter, a new storage bucket will also be created. You can optionally choose to create a new service account with custom IAM role.

To install using Terraform:

From the command shell move to directory where you downloaded Protegrity installation bundle.
Unzip the bundle, then unzip the protegrity-gcp-bigquery-{version}.zip. Navigate to pty-log-forwarder-gcp/. Verify that the following files are available:
- pty-log-forwarder-gcp/
- main.tf
- outputs.tf
- protegrity-cloud-api-gcp-{version}.zip
- README.md

Open the main.tf file and update Terraform backend information at the top of the file:

terraform {
  backend "gcs" {
    bucket  = ""
    # The bucket/prefix combination must be unique for different deployments 
    # to avoid conflicting Terraform states and accidental resources destruction.
    # prefix = "protegrity-gcp-bigquery/forwarder/<deployment_id>/tf-state"
  }
}

Set the bucket property to the Terraform Backend Bucket Name recorded in Google Cloud Storage.
Set the prefix property to a value unique to your deployment.

In the same main.tf file, specify the following Terraform variables. All the values were recorded in Google Cloud Project.

Warning

Google Cloud Function 2nd Generation currently does not support CMEK.

Parameter	Description
project_id	The project id recorded in the pre-configuration step
region	The Region recorded in the pre-configuration step.
deployment_id	Specify short name to identify deployment. This id will be added to all resources deployed with Terraform.
deployment_bucket	Use Deployment Bucket Name recorded in pre-configuration or leave empty to create new bucket.
create_service_account	Leave this as false if you created service account in pre-configuration. Otherwise set to true.
forwarder_function_service_account_email	Use Forwarder Function Service account recorded in pre-configuration or leave empty.
pub_sub_log_forwarder_service_account_email	Service account of the audit log Pub/Sub trigger. The service account must be assigned Cloud Run Invoker (roles/run.invoker) role.
create_vpc	If the create_vpc flag is set, a new VPC is created together with a VPC connector, NAT, and external IP. Use this flag if you have VPC admin permissions in your Google Account. If you set it to false, you can specify the existing VPC in the google_vpc_access_connector_name parameter.
google_vpc_access_connector_name	Use an existing VPC connector to associate with the Log Forwarder Function. You can specify either the VPC connector name or, if the VPC connector is in a different region or project than the one specified for the deployment, the full resource name. Alternatively, you can set google_vpc_access_connector_full_resource_name. Both parameters are optional. The full resource name takes precedence over the connector name.
log_destination_esa_ip	IP address of the ESA to which Protector logs are sent.
pty_esa_ca_server_cert	ESA self-signed CA certificate used by log forwarder function to ensure ESA is the trusted server. See documentation for more details.
esa_credentials_secret_resource_id	GCP Secret Manager secret id where ESA Fluent Bit logger credentials are stored.
pty_esa_client_cert	Single-line ESA client certificate content. See Certificate Authentication for details on client certificate
pty_esa_client_cert_key_secret_id	GCP Secret Manager secret id where single-line ESA client certificate key content is stored. See Configure ESA Secrets In GCP Secret Manager for details on client certificate key secret
min_log_level	Minimum log level for log forwarder function. Must be one of the following: [off,severe,warning,info,config,all].
esa_tls_disable_cert_verify	Disable certificate verification when connecting to ESA. This is only for dev purposes, should not be used in production environment.
esa_connect_timeout	ESA connection timeout in seconds.
esa_virtual_host	ESA Virtual Host.
audit_log_flush_interval	Time interval in seconds used to accumulate audit logs before sending to ESA. Default value: 10. Minimum value: 1. Maximum value: 900.
dlq_topic_message_retention_duration	Indicates the minimum duration to retain a message in the dead letter queue topic in case the log destination server is not available. Value must be a decimal number followed by the letter s (seconds). Cannot be more than 31 days or less than 10 minutes. Default value is 1 day.
audit_log_dead_letter_topic	This parameter is expected to be used in a separate deployment to replay dead letter queue messages.
max_instance_count	GCP Cloud Functions advanced configuration
available_memory_mb	GCP Cloud Functions advanced configuration
timeout_seconds	GCP Cloud Functions advanced configuration
gen2_available_cpu	2nd Gen Cloud Function advanced configuration
gen2_container_concurrency	2nd Gen Cloud Function advanced configuration
upgrade_step	Set this variable when upgrading to the latest version.
labels	You can set this map to include labels for deployed resources. Pay attention to GCP label requirements. For more information, refer to the following link https://cloud.google.com/compute/docs/labeling-resources. For example, only use lowercase and maximum length of 63 characters.
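
The numeric limits in the table above can be checked before running terraform plan. The following is a minimal Python sketch, not part of the Protegrity bundle. The helper names are hypothetical, and the bounds are copied from the descriptions of dlq_topic_message_retention_duration and audit_log_flush_interval.

```python
import re

# Bounds taken from the parameter table above.
MIN_RETENTION_S = 10 * 60              # 10 minutes
MAX_RETENTION_S = 31 * 24 * 60 * 60    # 31 days
MIN_FLUSH_S, MAX_FLUSH_S = 1, 900


def is_valid_retention(value: str) -> bool:
    """Return True for '<decimal number>s' within 10 minutes to 31 days."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)s", value)
    if match is None:
        return False
    seconds = float(match.group(1))
    return MIN_RETENTION_S <= seconds <= MAX_RETENTION_S


def is_valid_flush_interval(seconds: int) -> bool:
    """Return True when the flush interval is between 1 and 900 seconds."""
    return MIN_FLUSH_S <= seconds <= MAX_FLUSH_S
```

For example, the documented default of 1 day corresponds to the value 86400s, which this check accepts, while a value such as 1d is rejected because it does not end in a number of seconds.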

From local command line or Cloud Shell, change directory to location of the main.tf, for example:
```
pty-log-forwarder-gcp-{version}/pty-log-forwarder-gcp/
```
Run the following command.
```
terraform init
```
Terraform will download necessary providers.
Run the following command to verify configuration and print out deployment plan.
```
terraform plan
```
Run the following command to deploy resources to your account.
```
terraform apply
```
Once the deployment is complete, Terraform prints the output variables.
Record the following values:
- forwarder_function_name: ____________________________
- forwarder_function_url: ____________________________
- forwarder_function_resource_name: __________________

Turn on Instance-based billing.

Both Protect and Log Forwarder functions must run for a short period of time after all requests are handled. In order for the GCP Cloud Run service to allow that, the Instance-based billing feature must be enabled for both function deployments.

To enable Instance-based billing:

Log in to Google Account and select the project where Protegrity Cloud Run Function was installed.
Navigate to Cloud Run.
Click on the Cloud Function name.
In Cloud Run revision view, select Edit & deploy new revision.
Scroll down to Billing.
Select Instance-based.
Click DEPLOY.
Repeat the steps for Log Forwarder function.

Test Log Forwarder Function Installation

Before continuing with next steps, you can verify whether Log Forwarder Function is installed correctly. This step is optional and can be skipped.

Below is an example curl command to test your function.
Before you run it, check that you can obtain a temporary authentication token. Run the gcloud auth login command and then the gcloud auth print-identity-token command. The logged-in gcloud user must have the Cloud Run Invoker permission. Continue to the next step if the command succeeds and prints the token.
Replace {forwarder_function_url} with the value recorded in the previous step.

Run the following CURL command to test Function deployment.

curl {forwarder_function_url} \
-H "Authorization: Bearer $(gcloud auth print-identity-token)" \
-H "Content-Type: application/json" \
-H "ce-id: 123451234512345" \
-H "ce-specversion: 1.0" \
-H "ce-time: 2020-01-02T12:34:56.789Z" \
-H "ce-type: google.cloud.pubsub.topic.v1.messagePublished" \
-H "ce-source: //pubsub.googleapis.com/projects/MY-PROJECT/topics/MY-TOPIC" \
-d '{
    "message": { 
        "data": "'"$(echo '{"additional_info":{"description":"Data unprotect operation was successful.","query_id":"sf-query-id:k6-test-df51a612-4739-4cfb-9fe4-6ec548b70d23"},"client":{},"cnt":4000,"correlationid":"sf-query-id:k6-test-df51a612-4739-4cfb-9fe4-6ec548b70d23","level":"SUCCESS","logtype":"Protection","origin":{"hostname":"localhost","time_utc":1725558586},"process":{"id":1},"protection":{"audit_code":8,"dataelement":"alpha","datastore":"SAMPLE_POLICY","mask_setting":"","operation":"Unprotect","policy_user":"master_user"},"protector":{"core_version":"1.2.2+42.g01eb3.HEAD","family":"cp","pcc_version":"3.4.0.20","vendor":"gcp.snowflake","version":"3.1.0.158"},"signature":{"checksum":"7CE5FFCE9DBE570AAA72D1BB20CD083532EF8FAD3E96E38629EB92E837272D8E","key_id":"676c5178-756d-4363-9"}}' | base64 -w 0)"'",
        "attributes": {},  
        "messageId": "",  
        "publishTime": "2014-10-02T15:01:23Z",
        "orderingKey": ""
   }
}'
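
The request body in the command above wraps a base64-encoded audit record in a Pub/Sub message envelope. The following Python sketch shows one way to assemble the same body. The function name is hypothetical, the envelope fields are copied from the curl example, and the audit record passed in is whatever sample record you want to send.

```python
import base64
import json


def build_test_event_body(audit_record: dict) -> str:
    """Wrap an audit record in the Pub/Sub envelope used by the curl example."""
    # Pub/Sub delivers the payload base64-encoded in message.data.
    encoded = base64.b64encode(
        json.dumps(audit_record).encode("utf-8")
    ).decode("ascii")
    body = {
        "message": {
            "data": encoded,
            "attributes": {},
            "messageId": "",
            "publishTime": "2014-10-02T15:01:23Z",
            "orderingKey": "",
        }
    }
    return json.dumps(body)
```

The returned string can be passed as the request body in place of the inline shell substitution, which is convenient on systems where base64 -w 0 is not available.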

In GCP Logs Explorer console verify that the following output appears in the logs:

Request finished HTTP/1.1 POST http://pty-forwarder-31-smoke-jf-pfadh7riaq-uc.a.run.app/ - 200 0 - 75.6570ms

Warning
Test steps will only succeed if the Policy Agent has not updated the Log Forwarder policy. Once updated, logs must be signed with your policy, and the sample data blob above will no longer pass the check, resulting in the error below: [/jenkins/workspace/iaplambda_release_3.1/src/iap/logging/log-aggregator.cpp:66] Failed to aggregate log entry at index 0.

Grant Pub/Sub Publisher Permission to the Protect Function Service Account

Protect function requires permissions to publish audit log messages to Pub/Sub.

Log in to Google Account and select the project where Protegrity service will be installed.
Navigate to IAM & Admin.
Search for protector function service account email recorded in protect service installation step.
Select Edit principal pencil icon.
Select ADD ANOTHER ROLE.
Select Pub/Sub Publisher.
Click Save.

Protect Function Pub/Sub Log Output

Protect function must be configured to output audit logs to Pub/Sub topic.

To configure Protect function audit log output:

Go to Protect function Terraform deployment.
Navigate to pty-protect-gcp/main.tf.
Set Terraform variable pty_log_output="pub_sub".
Set Terraform variable pty_pub_sub_topic to log forwarder Pub/Sub topic.
Note
You can obtain the topic resource name from Log Forwarder Terraform output: audit_log_topic.
Run terraform apply.

Troubleshooting

Configure additional logging:

Set min_log_level Terraform variable on both Protect function and Log Forwarder function to config.

In the GCP Logs Explorer, you can run the query below, replacing placeholders with your deployment id and project name.

resource.type="cloud_run_revision"
resource.labels.service_name=~"pty-(protect|forwarder)-<deployment-id>"
severity=ERROR OR textPayload=~"\[error\]"
-logName="projects/<gcp-project-id>/logs/run.googleapis.com%2Frequests"
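
If you run this query for several deployments, the placeholders can be filled in programmatically. The following Python sketch only builds the filter text shown above. The function name is hypothetical and nothing here calls a Google Cloud API.

```python
def build_error_filter(deployment_id: str, project_id: str) -> str:
    """Return the Logs Explorer filter above with the placeholders filled in."""
    lines = [
        'resource.type="cloud_run_revision"',
        f'resource.labels.service_name=~"pty-(protect|forwarder)-{deployment_id}"',
        r'severity=ERROR OR textPayload=~"\[error\]"',
        # The leading minus excludes the Cloud Run request log.
        f'-logName="projects/{project_id}/logs/run.googleapis.com%2Frequests"',
    ]
    return "\n".join(lines)
```

The resulting text can be pasted into the Logs Explorer query editor.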

Expand each log entry for more details. Check for jsonPayload > exception to see more detailed error.

Error message	Details
`Pub/Sub configuration error.`	Indicates problems with Pub/Sub service configuration/availability. Expand error log entry and check exception details. For instance: `exception: "Grpc.Core.RpcException: Status(StatusCode="InvalidArgument", Detail="Invalid resource name given (name=projects/<todo>/topics/pty-forwarder-<todo>).` Verify that pty_pub_sub_topic Terraform variable is set to correct pub/sub resource name. Verify that Pub/Sub topic exists.
`Failed to send x/y audit logs to GCP Pub/Sub.`	This error may be shown as a consequence of Pub/Sub configuration/availability errors. Check for pub/sub configuration errors. If pub/sub configuration looks correct, this may indicate that cloud function can’t process audit logs fast enough. From Protector Function Terraform configuration, try increasing CPU and concurrency.
`Audit log sink error: Unable to deliver all logs.` `opensearch.0: Dropped records: 1/1` `[error] [output:opensearch:opensearch.0] HTTP status=401 URI=/_bulk`	Indicates problems with ESA Audit Store availability/configuration. Those errors will usually be displayed together. The third error will have details on what is the status or response from ESA. In this example, the HTTP status 401 indicates authentication issue.

3.7 -

Prerequisites

The following requirements must be completed for the Snowflake implementation.

Requirements	Description
Protegrity distribution and installation scripts	These artifacts are provided by Protegrity
Protegrity ESA 10.0+	The Cloud VPC must be able to obtain network access to the ESA
Google Cloud Account	Recommend creating a new project for Protegrity Serverless
Snowflake cluster (Enterprise Edition recommended)
Terraform CLI v0.14 or higher	Terraform is used to deploy resources to Google Cloud Account

3.8 -

Required Skills and Abilities

Requirements	Description
Google Cloud Account Administrator	Run Terraform (or perform steps manually), create/configure a VPC and IAM permissions.
Protegrity Administrator	The ESA credentials required to extract the policy for the Policy Agent
Snowflake Administrator	Account Admin access required to set up access
Network Administrator	Open firewall to access ESA and evaluate Google Cloud network setup

4 - Understanding Snowflake Objects

Key concepts in understanding the Protegrity Serverless with Snowflake.

4.1 - External Functions

Call out to a process external to Snowflake through a REST API.

External Functions

Snowflake provides an External Function capability used to call out to a process external to Snowflake through a REST request over TLS encryption. In the Protegrity Cloud Protect for Snowflake solution, this external service is the Protegrity Endpoint for data re-identification operations.

Security Operation Parameters

The following table describes optional and required security operation parameters.

Parameter	Type	Example	Description
op_type	String	"op_type":"UNPROTECT" "op_type":"PROTECT"	Required operation name, can be either UNPROTECT or PROTECT
data_element	String	"data_element":"TOK_ALPHA"	Required data element name defined in Protegrity Security Policy
external_iv	String	"external_iv":"abc-123"	Optional external initialization vector, which allows for different tokenized results for the same input data and data element of the same security policy. Refer to the External Initialization Vector (IV) in the Protection Methods Reference for more details.
External Function Sample Definition with External IV:
`CREATE SECURE EXTERNAL FUNCTION PTY_PROTECT_ALPHA ( val varchar ) RETURNS varchar NULL IMMUTABLE COMMENT = 'Protects using an ALPHA data element using External IV' API_INTEGRATION = REPLACE_WITH_YOUR_API_INTEGRATION_ID HEADERS = ( 'X-Protegrity-HCoP-Rules'= '{"jsonpaths":[{"op_type":"PROTECT","data_element":"TOK_ALPHA","external_iv":"abc-123"}]}' ) CONTEXT_HEADERS = ( current_user, current_timestamp, current_account ) AS '<AWS API GATEWAY URL>/SF_CUSTOMER';`
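
The X-Protegrity-HCoP-Rules header value in the sample definition is a JSON document built from the parameters in the table above. The following Python sketch generates that value so it can be pasted into a function definition. The function name is hypothetical, and the JSON layout is copied from the sample definition rather than from a published schema.

```python
import json
from typing import Optional


def build_rules_header(op_type: str, data_element: str,
                       external_iv: Optional[str] = None) -> str:
    """Build the X-Protegrity-HCoP-Rules header value for one operation."""
    if op_type not in ("PROTECT", "UNPROTECT"):
        raise ValueError("op_type must be PROTECT or UNPROTECT")
    rule = {"op_type": op_type, "data_element": data_element}
    # external_iv is optional, so it is only emitted when supplied.
    if external_iv is not None:
        rule["external_iv"] = external_iv
    return json.dumps({"jsonpaths": [rule]}, separators=(",", ":"))
```

Calling build_rules_header("PROTECT", "TOK_ALPHA", "abc-123") yields the same header value that appears in the sample definition above.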

4.2 - Snowflake Masking Policies

Optimize REST requests to the Protegrity endpoint.

Masking Policies in the Sample Application are used to optimize REST requests to the Protegrity endpoint and to prevent integration of External Functions into queries.

5 - Performance

Performance benchmarks and considerations.

5.1 - Performance Considerations

The following factors may affect performance benchmarks

Performance Considerations

The following factors may affect performance benchmarks:

Cold startup: Cloud Function spends additional time on the initial invocation to decrypt and load the policy into memory. This time can vary depending on the policy size. Once the Function is initialized, subsequent “warm executions” should process quickly.
Size of policy: The size of the policy impacts cold start performance. Larger policies take more time to initialize.
Cloud Function memory: GCP provides more virtual cores based on the memory configuration. The initial configuration of 2048 MB provides a good tradeoff between performance and cost with the benchmarked policy. Memory can be increased to optimize for your individual cases.
Number of security operations (protect or unprotect).
Cloud Function max instances and concurrency quota: The instance limit affects how functions are scaled. By default, no limit is set, so any traffic pattern can be handled. The instance limit can be set to guard against abnormally high request levels. Cloud Functions are also subject to maximum quotas for concurrent invocations and request rate.
Size of data element: Operations on larger text consume more time.

5.2 - Sample Benchmarks

Sample benchmarks for Snowflake performance with Cloud Protect.

The following benchmarks were performed against different cluster sizes. These are average times of approximately six runs each. The query unprotected six columns per row (first_name, last_name, email, street, city, birthday):

Rows x Cols	# Ops	Small	Medium	Large	2XL
100K x 6 cols	600K	3.4	3.5	3.3	3.3
1M x 6 cols	6M	9.0	8.9	9.1	8.7
10M x 6 cols	60M	29.6	21.9	18.4	33.3
100M x 6 cols	600M	-	-	77.0	76.7

5.3 - Concurrency

Snowflake concurrency considerations.

Snowflake provides guidance on the maximum concurrent requests to the Protegrity API. However, reaching this maximum depends on additional factors, such as cluster use and available resources. In addition, depending on the query plan, individual batches may be processed serially across different UDFs.

The formula for theoretical maximum Snowflake concurrency is N * C * M * E * P:

N - # of servers in the cluster (e.g. 2xl = 32, xl = 16)
C - # of CPUs. This is typically 8, but depends on the hardware.
M - parallelism multiplier (fixed to 8)
E - # of external functions invoked
P - # of queries running in parallel

The following table shows this calculation for a single query.

Cluster size	Predicted concurrent per query *	1 UDF	2 UDF	5 UDF	10 UDF
Medium	4 servers x 8 CPU x 8 = 256	256	512	1,280	2,560
X-Large	16 servers x 8 CPU x 8 = 1,024	1,024	2,048	5,120	10,240
2X-Large	32 servers x 8 CPU x 8 = 2,048	2,048	4,096	10,240	20,480

Note

* theoretical maximum concurrent requests based on engineering guidance from Snowflake.
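
The formula and table above can be reproduced with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and the defaults of 8 CPUs and a multiplier of 8 are the values stated in the list above.

```python
def max_concurrency(servers: int, udfs: int = 1, queries: int = 1,
                    cpus: int = 8, multiplier: int = 8) -> int:
    """Theoretical maximum concurrent requests: N * C * M * E * P."""
    return servers * cpus * multiplier * udfs * queries


# Server counts used in the table above.
CLUSTER_SERVERS = {"Medium": 4, "X-Large": 16, "2X-Large": 32}

for name, servers in CLUSTER_SERVERS.items():
    row = [max_concurrency(servers, udfs=e) for e in (1, 2, 5, 10)]
    print(name, row)
```

Setting queries to a value above 1 scales each figure linearly, which gives a quick upper bound when several queries run in parallel.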

5.4 - Log Forwarder Performance

Guidance on Log Forwarder Performance settings and considerations.

Log Forwarder Performance

The log forwarder architecture is optimized to minimize the number of connections and reduce the overall network bandwidth required to send audit logs to ESA. This is achieved with batching and aggregation taking place on two levels. The first level is in protector function instances, where audit logs from consecutive requests to an instance are batched and aggregated. The second level of batching and aggregation takes place in the log forwarder function before audit logs are forwarded to ESA. This section shows how to configure the deployment to accommodate different patterns of anticipated audit log stream. It also shows how to monitor deployment resources to detect problems before audit records are lost.

Protector Function Terraform configuration:
- audit_log_flush_interval: Determines the minimum amount of time for which audit logs are aggregated before they are sent to the Pub/Sub topic. Default value is 30 seconds. Changing the flush interval may affect the level of aggregation, and it will affect the backlog of audit logs buffered in the function. The protector function features multithreaded processing, which means that multiple requests can be handled at the same time. This in turn can contribute to a large backlog of audit logs waiting to be sent to Pub/Sub. The protector function is hosted in the Cloud Run containerized environment, where each instance of the function is shut down after a specific amount of time when there are no more requests to be handled. If the flush interval is too long, the function might be shut down before the entire audit log backlog is sent to Pub/Sub. This can be avoided by lowering the interval value.
Log Forwarder Function Terraform configuration
- audit_log_flush_interval: Determines the minimum amount of time for which audit logs are aggregated before they are sent to the ESA audit log store. Default value is 10 seconds. Changing the flush interval may affect the level of aggregation, and it will affect the backlog of audit logs buffered in the function. The forwarder function features multithreaded processing, which means that multiple requests can be handled at the same time. This in turn can contribute to a large backlog of audit logs waiting to be sent to ESA. The forwarder function is hosted in the Cloud Run containerized environment, where each instance of the function is shut down after a specific amount of time when there are no more requests to be handled. If the flush interval is too long, the function might be shut down before the entire audit log backlog is sent to ESA. This can be avoided by lowering the interval value. On the other hand, if the interval is too short, the forwarder function might end up sending too many requests to ESA, which in some extreme cases may result in messages being sent to the dead letter queue.
- gen2_available_cpu: Increasing the Function CPU count allows setting higher concurrency, which in turn allows processing more messages from the Pub/Sub in parallel. The high CPU count will effectively lower the number of forwarder function instances which will lower number of connections to ESA.
- gen2_container_concurrency: See bullet point above.
- audit_log_dead_letter_topic: Dead-letter Pub/Sub topic name. This topic will be used by Log Forwarder in case ESA is temporarily unavailable. Messages from DLQ topic can be re-processed by another instance of Log Forwarder either manually or on schedule once ESA connectivity is restored.
Monitoring Log Forwarder Resources
- Protector Function Logs: If protector function is unable to send logs to Pub/Sub, it will log the following message:
```
Failed to send x/y audit logs to GCP Pub/Sub.
```
  See the description of ‘audit_log_flush_interval’ in the protector function configuration section above to learn about potential mitigation.
- Pub/Sub DLQ Topic Metrics: Any positive value in the count aggregator on the 'topic/message_sizes' metric indicates that not all audit logs are being delivered to ESA. Review whether the connection to ESA is set up correctly in Install Log Forwarder Function via Terraform Scripts.
- Log Forwarder Function Logs: If log forwarder function is unable to send logs to ESA, it will log the following message:
```
[/jenkins/workspace/iaplambda_release_3.1/src/iap/logging/fluent-bit-external-sink.cpp:225] opensearch.0: Dropped records: x/y.
```
  See the description of ‘audit_log_flush_interval’ in the log forwarder configuration section above to learn about potential mitigation.
  Note
  When the error message above occurs, the dropped audit records will be sent to a dead-letter Pub/Sub topic for later manual or automated re-processing.

6 - Audit Logging

Audit log description/formatting

Audit Logging

Audit records and application logs stream to Google Cloud Logging. Cloud Protect uses a JSON format for audit records that is described in the following sections.

You can analyze and alert on audit records using Protegrity ESA or Google Cloud Logging. For more information about forwarding your audit records to ESA, contact Protegrity. For more information about Google Cloud Logging, refer to the Google Cloud Logging overview.

For more information about audit records, refer to the Protegrity Analytics Guide.

Audit record fields

The audit record format has been altered in version 3.1 of the protector to provide more information.

Field	Description
additional_info.deployment_id	The deployment_id contains the name of the Protect Function. It is automatically set based on the cloud-specific environment variables assigned to the Protect Function. This allows identifying the Cloud Protect deployment responsible for generating audit log.
additional_info.cluster	(Optional) Redshift cluster ARN
additional_info.description	A human-readable message describing the operation
additional_info.query_id	(Optional) Identifies the query that triggered the operation
additional_info.request_id	(Optional) AWS Lambda request identifier
cnt	Number of operations, may be aggregated
correlationid	(Deprecated) Use additional_info instead
level	Log severity, one of: SUCCESS, WARNING, ERROR, EXCEPTION
logtype	Always “Protection”
origin.ip	The private IP address of the compute resource that operates the Protect Function and is responsible for generating the log entry. Note The IP address is private, meaning it is used for internal network communication and is not accessible directly from the public internet. When Log Forwarding is enabled the IP address may be aggregated into minimal CIDR blocks.
origin.hostname	Hostname of the system that generated the log entry
origin.time_utc	UTC timestamp when the log entry was generated
protection.audit_code	Audit code of the protect operation; see the log return codes table in the Protegrity Troubleshooting Guide
protection.dataelement	Data element used for the policy operation
protection.datastore	Name of the data store corresponding to the deployed policy
protection.mask_setting	(Optional) Mask setting from policy management
protection.operation	Operation type, one of: Protect, Unprotect, Reprotect
protection.policy_user	User that performed the operation
protector.core_version	Internal core component version
protector.family	Always “cp” for Cloud Protect
protector.lambda_version	Protector Lambda application version.
protector.pcc_version	Internal pcc component version
protector.vendor	Identifies the cloud vendor and the database vendor
protector.version	Protector version number
signature.checksum	Hash value of the signature key ID used to sign the log message when the log is generated
signature.key_id	Key used to sign the log message when the log is generated

The following are sample audit messages:

Protect Success:

{
      "additional_info": {
        "description": "Data protect operation was successful.",
        "query_id": "sf-query-id:01978dbc-0582-d7e4-0000-002a3603a20d",
        "request_id": "8476a536-e9f4-11e8-9739-2dfe598c3fcd"
      },
      "cnt": 4000,
      "correlationid": "sf-query-id:01978dbc-0582-d7e4-0000-002a3603a20d",
      "logtype": "Protection",
      "level": "SUCCESS",
      "origin": {
        "hostname": "localhost",
        "time_utc": 1635363966
      },
      "protection": {
        "dataelement": "deAddress",
        "operation": "Protect",
        "audit_code": 6,
        "datastore": "SAMPLE_POLICY",
        "policy_user": "test_user"
      },
      "client": {},
      "protector": {
        "family": "cp",
        "lambda_version": "3.2.10",
        "version": "3.2.0",
        "vendor": "aws.snowflake",
        "pcc_version": "3.4.0.14",
        "core_version": "1.2.1+55.g590fe.HEAD"
      },
      "signature": {
        "key_id": "95f5a194-b0a4-4351-a",
        "checksum": "B324AF7C56944D91C47847A77C0367C594C0B948E7E75654B889571BD4F60A71"
      }
    }

User permission denied:

{
      "additional_info": {
        "description": "The user does not have the appropriate permissions to perform the requested operation."
      },
      "cnt": 4000,
      "correlationid": "sf-query-id:01978dbc-0582-d7e4-0000-002a3603a20d",
      "logtype": "Protection",
      "level": "ERROR",
      "origin": {
        "hostname": "localhost",
        "time_utc": 1635363966
      },
      "protection": {
        "dataelement": "deAddress",
        "operation": "Protect",
        "audit_code": 3,
        "policy_user": "test_user"
      },
      "process": {
        "id": "1",
        "thread_id": "849348352"
      },
      "client": {},
      "protector": {
        "family": "IAP Lambda",
        "lambda_version": "3.2.10",
        "version": "3.2.0",
        "vendor": "Cloud Protect",
        "pcc_version": "3.3.0.5",
        "core_version": "1.1.0"
      },
      "signature": {
        "key_id": "95f5a194-b0a4-4351-a",
        "checksum": "A216797C56944D91C47847A77C0367C594C0B948E7E75654B889571BD4F60A71"
      }
    }

Data element not found:

{
      "additional_info": {
        "description": "The data element could not be found in the policy in shared memory."
      },
      "cnt": 4000,
      "correlationid": "sf-query-id:01978dbc-0582-d7e4-0000-002a3603a20d",
      "logtype": "Protection",
      "level": "ERROR",
      "origin": {
        "hostname": "localhost",
        "time_utc": 1635363966
      },
      "protection": {
        "dataelement": "deAddress",
        "operation": "Protect",
        "audit_code": 2,
        "policy_user": "test_user"
      },
      "process": {
        "id": "1",
        "thread_id": "849348352"
      },
      "client": {},
      "protector": {
        "family": "IAP Lambda",
        "lambda_version": "3.2.10",
        "version": "3.2.0",
        "vendor": "Cloud Protect",
        "pcc_version": "3.3.0.5",
        "core_version": "1.1.0"
      },
      "signature": {
        "key_id": "95f5a194-b0a4-4351-a",
        "checksum": "AF09217C56944D91C47847A77C0367C594C0B948E7E75654B889571BD4F60A71"
      }
    }

Example Audit Records

The following are sample audit messages:

Protect Success:

{
      "additional_info": {
        "description": "Data protect operation was successful.",
        "query_id": "sf-query-id:01978dbc-0582-d7e4-0000-002a3603a20d",
        "request_id": "8476a536-e9f4-11e8-9739-2dfe598c3fcd"
      },
      "cnt": 4000,
      "correlationid": "sf-query-id:01978dbc-0582-d7e4-0000-002a3603a20d",
      "logtype": "Protection",
      "level": "SUCCESS",
      "origin": {
        "hostname": "localhost",
        "time_utc": 1635363966
      },
      "protection": {
        "dataelement": "deAddress",
        "operation": "Protect",
        "audit_code": 6,
        "datastore": "SAMPLE_POLICY",
        "policy_user": "test_user"
      },
      "client": {},
      "protector": {
        "family": "cp",
        "version": "3.1.0",
        "vendor": "aws.snowflake",
        "pcc_version": "3.4.0.14",
        "core_version": "1.2.1+55.g590fe.HEAD"
      },
      "signature": {
        "key_id": "95f5a194-b0a4-4351-a",
        "checksum": "B324AF7C56944D91C47847A77C0367C594C0B948E7E75654B889571BD4F60A71"
      }
    }

Reprotect Success:

{
      "additional_info": {
        "description": "Data reprotect operation was successful.",
        "query_id": "sf-query-id:01978dbc-0582-d7e4-0000-002a3603a20d",
        "request_id": "8476a536-e9f4-11e8-9739-2dfe598c3fcd"
      },
      "cnt": 4000,
      "correlationid": "sf-query-id:01978dbc-0582-d7e4-0000-002a3603a20d",
      "logtype": "Protection",
      "level": "SUCCESS",
      "origin": {
        "hostname": "localhost",
        "time_utc": 1635363966
      },
      "protection": {
        "old_dataelement": "deAddress1",
        "dataelement": "deAddress2",
        "operation": "Reprotect",
        "audit_code": 50,
        "datastore": "SAMPLE_POLICY",
        "policy_user": "test_user"
      },
      "client": {},
      "protector": {
        "family": "cp",
        "version": "3.1.0",
        "vendor": "aws.snowflake",
        "pcc_version": "3.4.0.14",
        "core_version": "1.2.1+55.g590fe.HEAD"
      },
      "signature": {
        "key_id": "95f5a194-b0a4-4351-a",
        "checksum": "B324AF7C56944D91C47847A77C0367C594C0B948E7E75654B889571BD4F60A71"
      }
    }

User permission denied:

{
      "additional_info": {
        "description": "The user does not have the appropriate permissions to perform the requested operation.",
        "query_id": "sf-query-id:01978dbc-0582-d7e4-0000-002a3603a20d",
        "request_id": "8476a536-e9f4-11e8-9739-2dfe598c3fcd"
      },
      "cnt": 4000,
      "correlationid": "sf-query-id:01978dbc-0582-d7e4-0000-002a3603a20d",
      "logtype": "Protection",
      "level": "ERROR",
      "origin": {
        "hostname": "localhost",
        "time_utc": 1635363966
      },
      "protection": {
        "dataelement": "deAddress",
        "operation": "Protect",
        "audit_code": 3,
        "datastore": "SAMPLE_POLICY",
        "policy_user": "test_user"
      },
      "client": {},
      "protector": {
        "family": "cp",
        "version": "3.1.0",
        "vendor": "aws.snowflake",
        "pcc_version": "3.4.0.14",
        "core_version": "1.2.1+55.g590fe.HEAD"
      },
      "signature": {
        "key_id": "95f5a194-b0a4-4351-a",
        "checksum": "A216797C56944D91C47847A77C0367C594C0B948E7E75654B889571BD4F60A71"
      }
    }

Data element not found:

{
      "additional_info": {
        "description": "The data element could not be found in the policy in shared memory.",
        "query_id": "sf-query-id:01978dbc-0582-d7e4-0000-002a3603a20d",
        "request_id": "8476a536-e9f4-11e8-9739-2dfe598c3fcd"
      },
      "cnt": 4000,
      "correlationid": "sf-query-id:01978dbc-0582-d7e4-0000-002a3603a20d",
      "logtype": "Protection",
      "level": "ERROR",
      "origin": {
        "hostname": "localhost",
        "time_utc": 1635363966
      },
      "protection": {
        "dataelement": "deAddress",
        "operation": "Protect",
        "audit_code": 2,
        "datastore": "SAMPLE_POLICY",
        "policy_user": "test_user"
      },
      "client": {},
      "protector": {
        "family": "cp",
        "version": "3.1.0",
        "vendor": "aws.snowflake",
        "pcc_version": "3.4.0.14",
        "core_version": "1.2.1+55.g590fe.HEAD"
      },
      "signature": {
        "key_id": "95f5a194-b0a4-4351-a",
        "checksum": "AF09217C56944D91C47847A77C0367C594C0B948E7E75654B889571BD4F60A71"
      }
    }
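
Because each audit record is a JSON object with a cnt field that may already be aggregated, totals can be computed by summing cnt per outcome. The following Python sketch illustrates this for records supplied as JSON strings. The function name is hypothetical, and the grouping key (operation, level, and audit code) is an illustrative choice rather than the grouping used by the product.

```python
import json
from collections import Counter
from typing import Iterable


def summarize_audit_records(raw_records: Iterable[str]) -> Counter:
    """Sum the cnt field per (operation, level, audit_code)."""
    totals: Counter = Counter()
    for raw in raw_records:
        record = json.loads(raw)
        protection = record["protection"]
        key = (protection["operation"], record["level"],
               protection["audit_code"])
        # cnt is the number of operations represented by this record.
        totals[key] += record.get("cnt", 0)
    return totals
```

For instance, feeding in the Protect Success and User permission denied samples above produces one total for successful protect operations and a separate total for the denied ones.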

7 - No Access Behavior

Behavior of unauthorized unprotect requests.

The security policy maintains a No Access Operation, configured in an ESA, which determines the response for unauthorized unprotect requests.

The following table describes the value returned to the UDF in each case:

No Access Operation	Data Returned
Null	null
Protected	(protected value)
Exception	Query will return an exception

Note

An unauthorized protect will throw an exception.
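
The table above can be read as a small decision function. The sketch below is illustrative Python only and does not reflect the protector's internal implementation; the function name and the choice of `PermissionError` are assumptions made for the example.

```python
from typing import Optional


def unauthorized_unprotect_result(no_access_operation: str,
                                  protected_value: str) -> Optional[str]:
    """Model what an unauthorized unprotect returns for each No Access Operation."""
    operation = no_access_operation.strip().lower()
    if operation == "null":
        return None                      # the query sees NULL
    if operation == "protected":
        return protected_value           # the value stays protected
    if operation == "exception":
        # Hypothetical exception type; the real query fails with an exception.
        raise PermissionError("unauthorized unprotect request")
    raise ValueError(f"unknown No Access Operation: {no_access_operation!r}")


print(unauthorized_unprotect_result("Null", "rfDtw sLMJK"))
print(unauthorized_unprotect_result("Protected", "rfDtw sLMJK"))
```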

8 - Upgrading To The Latest Version

Instructions for upgrading the protector.

Important

Upgrading the Policy Agent component to version 4 from any previous major version requires a new installation.
Upgrading the Protector component to version 4 from any previous major version requires a new installation.
Upgrading the Log Forwarder component to version 4 from any previous major version requires a new installation.

9 - Known Limitations

Known product limitations.

Only protect and unprotect operations are supported. The re-protect operation is not supported.
The Semi-structured (JSON) data type is not supported in the product.

Cloud Function (Gen2) labels must not be updated from the Cloud Run Services console. When updating labels for a GCP Cloud Function (Gen2) through the Cloud Run Services console, GCP creates a new Cloud Run revision with the updated labels, but the underlying Cloud Function retains the old labels. Because the policy agent reads labels from the Cloud Function definition (not the Cloud Run revision), it will not detect the label change and will not trigger a policy update.
To avoid this issue, always update labels using one of the following methods:
- Cloud Run Functions console — Navigate to Cloud Run Functions, select the function, and update labels there. This ensures both the Cloud Function and its underlying Cloud Run revision are updated consistently.
- Terraform — Update the labels variable in your Terraform configuration and run terraform apply.
- gcloud CLI — Use gcloud functions deploy with the updated --update-labels flag.
If labels were already updated incorrectly through the Cloud Run Services console, redeploy the function using one of the methods above to synchronize the labels and trigger a policy update.

10 - Appendices

Additional references for the protector.

10.1 - Integrating Cloud Protect with PPC (Protegrity Provisioned Cluster)

Concepts for integrating with PPC (Protegrity Provisioned Cluster)

This guide describes how to configure the Protegrity Policy Agent and Log Forwarder to connect to a Protegrity Provisioned Cluster (PPC), highlighting the differences from connecting to ESA.

Key Differences: PPC vs ESA

Feature	ESA 10.2	PPC (this guide)
Datastore Key Fingerprint	Optional/Recommended	Required
CA Certificate on Agent	Optional/Recommended	Optional/Recommended
CA Certificate on Log Forwarder	Optional/Recommended	Not supported
Client Certificate Authentication from Log Forwarder	Optional/Recommended	Not supported
IP Address	ESA IP address	PPC address

Prerequisites

Access to PPC and required credentials.
Tools: curl, kubectl installed.

Policy Agent Setup with PPC

Important

When connecting to PPC, the Policy Agent requires use of a datastore key fingerprint. For connecting to ESA 10.2 with Cloud Protect Policy Agent, the fingerprint is optional but recommended. See Policy Agent Installation for general setup steps.

Follow these instructions as a guide for understanding specific inputs for Policy Agent integrating with PPC:

Obtain the Datastore Key Fingerprint
To retrieve the fingerprint for your Policy Agent:
1. Retrieve the public key from the Cloud Provider Key Management service for the policy encryption key created in pre-configuration:
  AWS:
  1. Navigate to the Key Management Service in AWS console and open Customer Managed Keys
  2. Select the desired key
  3. Select the Public Key tab
  4. Select Download
  Azure:
  1. Navigate to the Key Vault in Azure console and open Objects>Keys
  2. Select the desired key
  3. Select the key indicated as CURRENT VERSION
  4. Select Download public key
  GCP:
  1. Navigate to Key Management in GCP console
  2. Select the desired key and open the Versions tab
  3. Select Get public key from the Actions column menu
  4. Select Download
2. Escape the new line characters in the downloaded public key for use in the next step - for example:
```
awk 'NF {printf "%s\\n",$0}' "<public_key_file>" > "new-line-escaped-public-key.pem"
cat new-line-escaped-public-key.pem
```
3. Export key fingerprint using the PPC API as indicated in the curl example below:
```
curl -k -H "Authorization: Bearer ${TOKEN}" -X POST https://${HOST}/pty/v2/pim/datastores/1/export/keys  -H "Content-Type: application/json" --data '{
  "algorithm": "RSA-OAEP-256",
  "description": "example-key-from-key-management",
  "pem": "<value of new-line-escaped-public-key>"
}'
```
  Sample Output:
```
{"uid":"1","algorithm":"RSA-OAEP-256","fingerprint":"4c:46:d8:05:35:2e:eb:39:4d:39:8e:6f:28:c3:ab:d3:bc:9e:7a:cb:95:cb:b1:8e:b5:90:21:0f:d3:2c:0b:27","description":"example-key-from-kms"}
```
  Note
  Alternatively, export the key using the PPC CLI utility. See the export key example in Create Datastores Key.
4. Record the value for fingerprint and configure the Policy Agent:
  AWS: Set the environment variable PTY_DATASTORE_KEY in the Policy Agent Lambda function to the fingerprint value.
  Azure: Set the environment variable PTY_DATASTORE_KEY in the Policy Agent Function App to the fingerprint value.
  GCP: Set the variable pty_datastore_key in the Policy Agent main.tf to the fingerprint value and apply the changes.
Retrieve the PPC CA Certificate
To obtain the CA certificate from PPC:
```
kubectl -n api-gateway get secret ingress-certificate-secret -o jsonpath='{.data.ca\.crt}' | base64 -d > CA.pem
```
Use the returned certificate (saved as CA.pem by the command above) wherever Policy Agent Installation refers to ProtegrityCA.pem.
Configure the PPC Address
Use the PPC address in place of the ESA IP address wherever required in your configuration.
Note
Use the FQDN as described in the PPC REST API documentation.
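
Steps 2 and 3 of obtaining the fingerprint (escaping the public key and building the request body) can also be sketched in Python. This is a minimal illustration, not a Protegrity tool: the helper names are hypothetical, the key material is a placeholder, and the body fields are copied from the curl example. No request is sent.

```python
import json


def escape_pem_newlines(pem_text: str) -> str:
    """Mirror the awk one-liner: drop blank lines and end every
    remaining line with a literal backslash followed by 'n'."""
    return "".join(line + "\\n" for line in pem_text.splitlines() if line.split())


def build_export_key_body(escaped_pem: str, description: str) -> str:
    """Build the JSON request body used in the curl example."""
    return (
        '{"algorithm": "RSA-OAEP-256", '
        f'"description": {json.dumps(description)}, '
        f'"pem": "{escaped_pem}"}}'
    )


# Placeholder key material, for illustration only.
pem = "-----BEGIN PUBLIC KEY-----\nAAAA\n\nBBBB\n-----END PUBLIC KEY-----\n"
escaped = escape_pem_newlines(pem)
body = build_export_key_body(escaped, "example-key-from-key-management")

print(escaped)
print(json.loads(body)["pem"])  # the server side sees real newlines again
```

Because the escaped key is placed inside a JSON string, each literal `\n` is decoded back into a newline when the body is parsed, which is why the escaping step is needed before pasting the key into the request.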

Log Forwarder Setup with PPC

Note

When using PPC, certificate authentication and CA validation are not supported for the Log Forwarder. Configuration steps related to certificates in Log Forwarder Installation do not apply to PPC. If you attempt to use certificates provided by PPC, the Log Forwarder will not function correctly.

The Log Forwarder will proceed without certificates and will print a warning if PTY_ESA_CA_SERVER_CERT is not provided.
No additional certificate or CA configuration is needed for PPC.

10.2 - Sample Snowflake External Function

Sample Snowflake External Function definitions and calls for tokenization data elements.

Appendix A. Sample Snowflake External Function

Method: Tokenization
Type: ALPHA

Snowflake Data Types	Snowflake Max Size	Protegrity Max Size
VARCHAR	16M (16,777,216 bytes)	4K (4,096 bytes)
CHAR
STRING
TEXT

External Function Sample Definitions:
`CREATE SECURE EXTERNAL FUNCTION PTY_PROTECT_ALPHA ( val varchar ) RETURNS varchar NULL IMMUTABLE COMMENT = 'Protects using an ALPHA data element' API_INTEGRATION = REPLACE_WITH_YOUR_API_INTEGRATION_ID HEADERS =( 'X-Protegrity-HCoP-Rules'= '{"jsonpaths": [{"op_type":"PROTECT","data_element":"TOK_ALPHA"}]}' ) CONTEXT_HEADERS = ( current_user, current_timestamp, current_account ) AS '<api_gateway_protect_service_url>/pty/snowflake';`
`CREATE SECURE EXTERNAL FUNCTION PTY_UNPROTECT_ALPHA ( val varchar ) RETURNS varchar NULL IMMUTABLE COMMENT = 'Unprotects using an ALPHA data element' API_INTEGRATION = REPLACE_WITH_YOUR_API_INTEGRATION_ID HEADERS =( 'X-Protegrity-HCoP-Rules'= '{"jsonpaths":[{"op_type":"UNPROTECT","data_element":"TOK_ALPHA"}]}' ) CONTEXT_HEADERS = ( current_user, current_timestamp, current_account ) AS '<api_gateway_protect_service_url>/pty/snowflake';`

Sample EF Calls:
`SELECT PTY_PROTECT_ALPHA ('Hello World');`
`SELECT PTY_UNPROTECT_ALPHA('rfDtw sLMJK');`

Snowflake Masking Policy example:
`create or replace masking policy alpha_policy as (val string) returns string -> case when current_role() in ('ACCOUNTADMIN') then PTY_UNPROTECT_ALPHA(val) else val end;`
`alter table pii_data modify column field01 set masking policy alpha_policy; alter table pii_data modify column field01 unset masking policy;`

Method: Tokenization
Type: NUMERIC

Snowflake Data Types	Snowflake Max Size	Protegrity Max Size
NUMBER
DECIMAL
INTEGER
DOUBLE

External Function Sample Definitions:
`CREATE SECURE EXTERNAL FUNCTION PTY_PROTECT_NUMERIC ( val number ) RETURNS number NULL IMMUTABLE COMMENT = 'Protects using a NUMERIC data element' API_INTEGRATION = REPLACE_WITH_YOUR_API_INTEGRATION_ID HEADERS =( 'X-Protegrity-HCoP-Rules'= '{"jsonpaths":[{"op_type":"PROTECT","data_element":"TOK_NUMERIC"}]}' ) CONTEXT_HEADERS = ( current_user, current_timestamp, current_account ) AS '<api_gateway_protect_service_url>/pty/snowflake';`
`CREATE SECURE EXTERNAL FUNCTION PTY_UNPROTECT_NUMERIC ( val number) RETURNS number NULL IMMUTABLE COMMENT = 'Unprotects using a NUMERIC data element' API_INTEGRATION = REPLACE_WITH_YOUR_API_INTEGRATION_ID HEADERS =( 'X-Protegrity-HCoP-Rules'= '{"jsonpaths":[{"op_type":"UNPROTECT","data_element":"TOK_NUMERIC"}]}' ) CONTEXT_HEADERS = ( current_user, current_timestamp, current_account ) AS '<api_gateway_protect_service_url>/pty/snowflake';`

Sample EF Calls:
`SELECT PTY_PROTECT_NUMERIC ('123456789');`
`SELECT PTY_UNPROTECT_NUMERIC ('752513497');`

Snowflake Masking Policy example:
`create or replace masking policy num_policy as (val number) returns number -> case when current_role() in ('ACCOUNTADMIN') then PTY_UNPROTECT_NUMERIC(val) else val end;`
`alter table pii_data modify column field02 set masking policy num_policy; alter table pii_data modify column field02 unset masking policy;`

Method: Tokenization
Type: DATE YYYY-MM-DD

Snowflake Data Types	Snowflake Max Size	Protegrity Max Size
DATE (any supported format)	10 bytes	10 bytes

External Function Sample Definitions:
`CREATE SECURE EXTERNAL FUNCTION PTY_PROTECT_DATEYYYYMMDD ( val date ) RETURNS date NULL IMMUTABLE COMMENT = 'Protects using a Date data element' API_INTEGRATION = REPLACE_WITH_YOUR_API_INTEGRATION_ID HEADERS =( 'X-Protegrity-HCoP-Rules'= '{"jsonpaths":[{"op_type":"PROTECT","data_element":"TOK_DATEYYYYMMDD"}]}' ) CONTEXT_HEADERS = ( current_user, current_timestamp, current_account ) AS '<api_gateway_protect_service_url>/pty/snowflake';`
`CREATE SECURE EXTERNAL FUNCTION PTY_UNPROTECT_DATEYYYYMMDD ( val date ) RETURNS date NULL IMMUTABLE COMMENT = 'Unprotects using a Date data element' API_INTEGRATION = REPLACE_WITH_YOUR_API_INTEGRATION_ID HEADERS =( 'X-Protegrity-HCoP-Rules'= '{"jsonpaths":[{"op_type":"UNPROTECT","data_element":"TOK_DATEYYYYMMDD"}]}' ) CONTEXT_HEADERS = ( current_user, current_timestamp, current_account ) AS '<api_gateway_protect_service_url>/pty/snowflake';`
Sample EF Calls:
`SELECT PTY_PROTECT_DATEYYYYMMDD ('2020-12-31');`
`SELECT PTY_UNPROTECT_DATEYYYYMMDD('0653-06-01');`
`SELECT PTY_PROTECT_DATEYYYYMMDD ('31-DEC-2020');` *
`SELECT PTY_UNPROTECT_DATEYYYYMMDD('01-JUN-0653');` *
`SELECT PTY_PROTECT_DATEYYYYMMDD('12/31/2020');` *
`SELECT PTY_UNPROTECT_DATEYYYYMMDD('06/01/0653');` *
`SELECT PTY_PROTECT_DATEYYYYMMDD (current_date);`

Snowflake Masking Policy example:
`create or replace masking policy date_policy as (val date) returns date -> case when current_role() in ('ACCOUNTADMIN') then PTY_UNPROTECT_DATEYYYYMMDD (val) else val end;`
`alter table pii_data modify column field11 set masking policy date_policy; alter table pii_data modify column field11 unset masking policy;`

*: Values are cast automatically to YYYY-MM-DD, so no conversion is needed. The output is always in the YYYY-MM-DD format.

Cutover Dates of the Proleptic Gregorian Calendar: no issues (no conversions performed by Snowflake)

Method: Tokenization
Type: DATETIME

Snowflake Data Types	Snowflake Max Size	Protegrity Max Size
DATE	10 bytes	29 bytes
DATETIME	29 bytes
TIMESTAMPNTZ*
TIMESTAMP_NTZ*
TIMESTAMP WITHOUT TIME ZONE*

External Function Sample Definitions:
`CREATE SECURE EXTERNAL FUNCTION PTY_PROTECT_DATETIME ( val timestamp ) RETURNS timestamp NULL IMMUTABLE COMMENT = 'Protects using a TIMESTAMP data element' API_INTEGRATION = REPLACE_WITH_YOUR_API_INTEGRATION_ID HEADERS =( 'X-Protegrity-HCoP-Rules'= '{"jsonpaths":[{"op_type":"PROTECT","data_element":"TOK_DATETIME"}]}' ) CONTEXT_HEADERS = ( current_user, current_timestamp, current_account ) AS '<api_gateway_protect_service_url>/pty/snowflake';`
`CREATE SECURE EXTERNAL FUNCTION PTY_UNPROTECT_DATETIME ( val timestamp ) RETURNS timestamp NOT NULL IMMUTABLE COMMENT = 'Unprotects using a TIMESTAMP data element' API_INTEGRATION = REPLACE_WITH_YOUR_API_INTEGRATION_ID HEADERS =( 'X-Protegrity-HCoP-Rules'= '{"jsonpaths":[{"op_type":"UNPROTECT","data_element":"TOK_DATETIME"}]}' ) CONTEXT_HEADERS = ( current_user, current_timestamp, current_account ) AS '<api_gateway_protect_service_url>/pty/snowflake';`
Sample EF Calls:
`SELECT PTY_PROTECT_DATETIME('2010-10-25');`
`SELECT PTY_UNPROTECT_DATETIME('0845-04-04');`
`SELECT PTY_PROTECT_DATETIME('2010-10-25 10:45:33');`
`SELECT PTY_UNPROTECT_DATETIME('0845-04-04 10:45:33');`
`SELECT PTY_PROTECT_DATETIME('2010-10-25 10:45:33.123');`
`SELECT PTY_UNPROTECT_DATETIME('0845-04-04 10:45:33.123');`
`SELECT PTY_PROTECT_DATETIME(current_date);`
`SELECT PTY_PROTECT_DATETIME(cast(current_timestamp as TIMESTAMPNTZ));`

Snowflake Masking Policy example:
`create or replace masking policy datetime_policy as (val timestampntz) returns timestampntz -> case when current_role() in ('ACCOUNTADMIN') then PTY_UNPROTECT_DATETIME (val) else val end;`
`alter table pii_data modify column field12 set masking policy datetime_policy; alter table pii_data modify column field12 unset masking policy;`

*: Default TIMESTAMP in Snowflake includes Time Zone – not supported by Protegrity’s DATETIME data element

Method: Tokenization
Type: DECIMAL

Snowflake Data Types	Snowflake Max Size	Protegrity Max Size
NUMBER(N,M)	38 digits	36 digits
NUMERIC(N,M)*
DECIMAL(N,M)*

External Function Sample Definitions:
`CREATE SECURE EXTERNAL FUNCTION PTY_PROTECT_DECIMAL ( val NUMBER(38,6) ) RETURNS NUMBER(38,6) NULL IMMUTABLE COMMENT = 'Protects using a DECIMAL data element' API_INTEGRATION = REPLACE_WITH_YOUR_API_INTEGRATION_ID HEADERS =( 'X-Protegrity-HCoP-Rules'= '{"jsonpaths":[{"op_type":"PROTECT","data_element":"TOK_DECIMAL"}]}' ) CONTEXT_HEADERS = ( current_user, current_timestamp, current_account ) AS '<api_gateway_protect_service_url>/pty/snowflake';`
`CREATE SECURE EXTERNAL FUNCTION PTY_UNPROTECT_DECIMAL ( val NUMBER(38,6) ) RETURNS NUMBER(38,6) NULL IMMUTABLE COMMENT = 'Unprotects using a DECIMAL data element' API_INTEGRATION = REPLACE_WITH_YOUR_API_INTEGRATION_ID HEADERS =( 'X-Protegrity-HCoP-Rules'= '{"jsonpaths":[{"op_type":"UNPROTECT","data_element":"TOK_DECIMAL"}]}' ) CONTEXT_HEADERS = ( current_user, current_timestamp, current_account ) AS '<api_gateway_protect_service_url>/pty/snowflake';`
Sample EF Calls:
`SELECT PTY_PROTECT_DECIMAL (12345678.99);`
`SELECT PTY_UNPROTECT_DECIMAL (21872469.760000);`

Snowflake Masking Policy example:
`create or replace masking policy decimal_policy as (val NUMBER(38,6)) returns NUMBER(38,6)-> case when current_role() in ('ACCOUNTADMIN') then PTY_UNPROTECT_DECIMAL (val) else val end;`
`alter table pii_data modify column field13 set masking policy decimal_policy; alter table pii_data modify column field13 unset masking policy;`

*: Synonymous with NUMBER

Method: Tokenization
Type: INTEGER

Snowflake Data Types	Snowflake Max Size	Protegrity Max Size
NUMBER	38 digits	2 bytes 4 bytes 8 bytes
NUMERIC*
INT*
INTEGER*
BIGINT*
SMALLINT*
TINYINT*
BYTEINT*

External Function Sample Definitions:
`CREATE SECURE EXTERNAL FUNCTION PTY_PROTECT_INTEGER ( val NUMBER ) RETURNS NUMBER NULL IMMUTABLE COMMENT = 'Protects using an INTEGER data element' API_INTEGRATION = REPLACE_WITH_YOUR_API_INTEGRATION_ID HEADERS =( 'X-Protegrity-HCoP-Rules'= '{"jsonpaths":[{"op_type":"PROTECT","data_element":"TOK_INTEGER"}]}' ) CONTEXT_HEADERS = ( current_user, current_timestamp, current_account ) AS '<api_gateway_protect_service_url>/pty/snowflake';`
`CREATE SECURE EXTERNAL FUNCTION PTY_UNPROTECT_INTEGER ( val NUMBER ) RETURNS NUMBER NOT NULL IMMUTABLE COMMENT = 'Unprotects using an INTEGER data element' API_INTEGRATION = REPLACE_WITH_YOUR_API_INTEGRATION_ID HEADERS =( 'X-Protegrity-HCoP-Rules'= '{"jsonpaths":[{"op_type":"UNPROTECT","data_element":"TOK_INTEGER"}]}' ) CONTEXT_HEADERS = ( current_user, current_timestamp, current_account ) AS '<api_gateway_protect_service_url>/pty/snowflake';`
Sample EF Calls:
`SELECT PTY_PROTECT_INTEGER (123456789);`
`SELECT PTY_UNPROTECT_INTEGER (1104108887);`

Snowflake Masking Policy example:
`create or replace masking policy int_policy as (val NUMBER ) returns NUMBER -> case when current_role() in ('ACCOUNTADMIN') then PTY_UNPROTECT_INTEGER (val) else val end;`
`alter table pii_data modify column field14 set masking policy int_policy; alter table pii_data modify column field14 unset masking policy;`

*: Synonymous with NUMBER, except that precision and scale cannot be specified (i.e. always defaults to NUMBER(38, 0))

Recommended approach for protecting whole-number fields in Snowflake:

When values are	…then use the following Data Element:
Between -32768 and 32767	INTEGER (2 bytes)
Between -2147483648 and 2147483647	INTEGER (4 bytes)
Between -9223372036854775808 and 9223372036854775807	INTEGER (8 bytes)
< -9223372036854775808 or > 9223372036854775807	DECIMAL

When in doubt, use DECIMAL for any numeric range.
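
The range table above translates directly into a lookup. The following Python sketch is illustrative only; `recommended_data_element` is a hypothetical helper that takes the smallest and largest values expected in a column and returns the data element named in the table.

```python
def recommended_data_element(min_value: int, max_value: int) -> str:
    """Pick the smallest INTEGER data element that covers the range,
    falling back to DECIMAL when no 8-byte integer can hold it."""
    ranges = [
        (-(2 ** 15), 2 ** 15 - 1, "INTEGER (2 bytes)"),
        (-(2 ** 31), 2 ** 31 - 1, "INTEGER (4 bytes)"),
        (-(2 ** 63), 2 ** 63 - 1, "INTEGER (8 bytes)"),
    ]
    for low, high, data_element in ranges:
        if low <= min_value and max_value <= high:
            return data_element
    return "DECIMAL"


print(recommended_data_element(0, 32767))
print(recommended_data_element(0, 123456789))
print(recommended_data_element(0, 2 ** 63))
```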

10.3 - Configuring Regular Expression to Extract Policy Username

Example configurations for user extraction with regular expressions

Configuring Regular Expression to Extract Policy Username

The Cloud Protect Cloud Function exposes the USERNAME_REGEX configuration, which allows the policy username to be extracted from the user in the request.

USERNAME_REGEX Cloud Function Environment configuration
The USERNAME_REGEX environment variable can be set to a regular expression with one capturing group. This group is used to extract the username. The examples below show different regular expression values and the resulting policy user.

USERNAME_REGEX	User in the request	Effective Policy User
Not Set	user@domain.com	user@domain.com
Not Set	service-account-user@project-id.iam.gserviceaccount.com	service-account-user@project-id.iam.gserviceaccount.com
`^(.*)@.*$`	service-account-user@project-id.iam.gserviceaccount.com	service-account-user
`^(.*)@.*$`	user@domain.com	user
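
The extraction can be tried locally before setting the variable. The sketch below uses Python's `re` module with the pattern `^(.*)@.*$`, which captures everything before the `@`. It is an approximation: the function name is hypothetical, the Cloud Function's regular expression engine may differ from Python's, and returning the unmodified user when the pattern does not match is an assumption made for this example.

```python
import re
from typing import Optional


def effective_policy_user(request_user: str, username_regex: Optional[str]) -> str:
    """Return the first capturing group of username_regex, or the
    unmodified request user when the variable is not set."""
    if not username_regex:
        return request_user
    match = re.search(username_regex, request_user)
    if match and match.groups():
        return match.group(1)
    # Assumption for this sketch: no match leaves the user unchanged.
    return request_user


pattern = r"^(.*)@.*$"
print(effective_policy_user("user@domain.com", None))
print(effective_policy_user("user@domain.com", pattern))
print(effective_policy_user(
    "service-account-user@project-id.iam.gserviceaccount.com", pattern))
```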

10.4 - Associating ESA Data Store With Cloud Protect Agent

ESA controls policy access by mapping server IPs to data stores, registering a node when an agent requests a policy and its IP is identified.

ESA controls which policy is deployed to a protector using the concept of a data store. A data store may contain a list of IP addresses identifying the servers allowed to pull the policy associated with that data store. A data store may also be defined as the default data store, which allows any server to pull the policy, provided the server does not belong to any other data store. Node registration occurs when the policy server (in this case the policy agent) makes a policy request to ESA, at which point ESA identifies the agent’s IP address.

Note

For more information about ESA data stores, refer to the Policy Management Guide, which is part of the Protegrity ESA documentation.

The source IP address used to register the policy agent function as a node on ESA depends on the ESA hubcontroller configuration ASSIGN_DATASTORE_USING_NODE_IP and the PTY_ADDIPADDRESSHEADER configuration exposed by the agent function.

The function service uses multiple network interfaces: an internal network interface with an ephemeral IP range of 169.254.x.x, and an external network interface with the IP range described in the Function app outbound IP addresses section under function configuration. By default, when the agent function contacts ESA to register a node for policy download, ESA uses the agent function outbound IP address. This default behavior results from the default ESA hubcontroller configuration ASSIGN_DATASTORE_USING_NODE_IP=false and the default agent configuration PTY_ADDIPADDRESSHEADER=yes.

In some cases, when there is a proxy server between ESA and the agent function, the desirable configuration is ASSIGN_DATASTORE_USING_NODE_IP=true and PTY_ADDIPADDRESSHEADER=no, which causes ESA to use the proxy server IP address.

The table below shows how the hubcontroller and agent settings will affect node IP registration on ESA.

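Agent source IP	Agent Function Outbound IP	Proxy IP	ESA config - ASSIGN_DATASTORE_USING_NODE_IP	Agent function config - PTY_ADDIPADDRESSHEADER	Agent node registration IP
169.254.144.81	20.75.43.207	No Proxy	true	yes	169.254.144.81
169.254.144.81	20.75.43.207	No Proxy	true	no	20.75.43.207
169.254.144.81	20.75.43.207	No Proxy	false	yes	20.75.43.207
169.254.144.81	20.75.43.207	No Proxy	false	no	20.75.43.207
169.254.144.81	20.75.43.207	34.230.42.110	true	yes	169.254.144.81
169.254.144.81	20.75.43.207	34.230.42.110	true	no	34.230.42.110
169.254.144.81	20.75.43.207	34.230.42.110	false	yes	34.230.42.110
169.254.144.81	20.75.43.207	34.230.42.110	false	no	34.230.42.110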
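
The table can be summarized as one rule: the agent's own source IP is registered only when ESA is told to use the node IP and the agent sends its IP address header; otherwise ESA registers the address the connection arrives from. The Python sketch below models that reading. It is an interpretation of the table, not ESA code, and it assumes that the two false rows in each group resolve to the same address as the true/no row, which matches the default behavior described above. All names are hypothetical.

```python
from typing import Optional


def node_registration_ip(agent_source_ip: str,
                         outbound_ip: str,
                         proxy_ip: Optional[str],
                         assign_datastore_using_node_ip: bool,
                         add_ip_address_header: bool) -> str:
    """Return the IP address ESA would register for the agent node."""
    if assign_datastore_using_node_ip and add_ip_address_header:
        # ESA trusts the IP address header added by the agent function.
        return agent_source_ip
    # Otherwise ESA sees the connection's source: the proxy if present,
    # else the function's outbound IP address.
    return proxy_ip if proxy_ip else outbound_ip


# Default settings without a proxy: the outbound IP is registered.
print(node_registration_ip("169.254.144.81", "20.75.43.207", None, False, True))
# Proxy in the path with true/no: the proxy IP is registered.
print(node_registration_ip("169.254.144.81", "20.75.43.207", "34.230.42.110", True, False))
```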

10.5 - Undeliverable Audit Log Recovery

Audit log recovery procedures

Protegrity Cloud Protect Log Forwarder installation provides a solution to recover undelivered audit logs. Reasons for undeliverable logs may include:

Changes to network configuration in ESA or cloud provider (VPC, firewall, certificate rotation, service user credentials)
Log Forwarder IAM Service Account permissions
Log Forwarder Cloud Run Function configuration
Disruption in cloud provider service

Log Forwarder Dead Letter Pub/Sub Architecture

The Log Forwarder is triggered by pub/sub events generated by Protect Functions. If the Log Forwarder is unable to reach ESA to deliver the logs, they are pushed to a dead letter pub/sub topic. The dead letter pub/sub topic is created when the Log Forwarder is installed with the service installation script. See Install Log Forwarder Function via Terraform Scripts for dead letter topic configuration options and naming conventions.

Logs are not delivered to ESA. Undelivered audit logs are sent to a dead letter pub/sub topic.

Monitoring Undelivered Logs

Logs pushed to the dead letter pub/sub topic are purged, and are no longer recoverable, once the configured dlq_topic_message_retention_duration has elapsed. Monitoring the dead letter topic is recommended to ensure timely recovery of audit messages before they are permanently lost. Consult the GCP monitoring alerts documentation for setting up alerts based on pub/sub topic metrics.
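
To judge how much time remains for recovery, the purge deadline of a message can be estimated from its publish time. The Python sketch below is illustrative only. It assumes that dlq_topic_message_retention_duration is expressed as a number of seconds with an `s` suffix (for example `604800s`), as Pub/Sub topic retention durations commonly are in Terraform; check the installation script for the actual format. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def purge_deadline(publish_time: datetime, retention: str) -> datetime:
    """Return the time after which a dead letter message is no longer
    recoverable, given a retention string such as '604800s' (assumed format)."""
    if not retention.endswith("s"):
        raise ValueError(f"expected a duration in seconds like '604800s', got {retention!r}")
    seconds = float(retention[:-1])
    return publish_time + timedelta(seconds=seconds)


published = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(purge_deadline(published, "604800s"))  # seven days after publishing
```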

Recovering Logs in Dead Letter Topic (Recommended)

Protegrity recommends creation of an additional Log Forwarder installation in the case where logs are not delivered to ESA, as described in Log Forwarder Dead Letter Pub/Sub Architecture.

Audit log recovery using new log forwarder installation

Steps to recover audit logs using new Log Forwarder installation:

1. Create a second Log Forwarder installation (Log Forwarder 2 in the above diagram) for processing undelivered logs. Set audit_log_dead_letter_topic to null in the Terraform script during installation.
2. Configure and test the newly installed Log Forwarder to verify ESA connectivity. See Install Log Forwarder Function via Terraform Scripts for installation instructions.
3. Identify the dead letter pub/sub topic (DLQ 1 in the above diagram) resource name by running the following command for the Log Forwarder that failed to deliver logs (Log Forwarder as described in Log Forwarder Dead Letter Pub/Sub Architecture). Note the value for audit_log_dlq_topic.
```
terraform output
```
4. Set audit_log_dead_letter_topic in the Terraform installation script of the new Log Forwarder (Log Forwarder 2 in the above diagram) to the value of audit_log_dlq_topic identified in the previous step. Apply the changes with terraform apply.
5. Monitor the new Log Forwarder function logs for any failures.

Note

Any additional failed logs will be pushed to the dead letter pub/sub topic (DLQ 2 in the above diagram) of the new Log Forwarder.

Recovering Logs in Dead Letter Topic (Alternative)

When the recommended method for recovery described in Recovering Logs in Dead Letter Topic (Recommended) is not an option, you may use the existing Log Forwarder to reprocess undelivered logs.

Audit log recovery using existing log forwarder installation

Warning

This approach is only recommended for implementors with advanced knowledge of the involved GCP services and can result in permanent loss/duplication of audit logs and additional cost. If unsure, install an additional log forwarder to reprocess logs or reach out to Protegrity for guidance.

Steps to recover audit logs using existing Log Forwarder installation:

1. Fix any configuration errors causing the Log Forwarder to fail. Verify audit logs are being transmitted successfully to ESA.
2. Identify the dead letter pub/sub topic (DLQ 1 in the above diagram) resource name by running the following command for the Log Forwarder. Note the value for audit_log_dlq_topic.
```
terraform output
```
3. Set audit_log_dead_letter_topic in the Terraform installation script to the value of audit_log_dlq_topic identified in the previous step. Apply the changes with:
```
terraform apply
```
4. When audit logs have been transmitted to ESA, revert audit_log_dead_letter_topic to null. Apply the changes with:
```
terraform apply
```

11 -

Solution Overview

The following data ingestion patterns are available with your managed Cloud data warehouse:

Data protection at source applications: In this case, sensitive data is already de-identified (protected) across the enterprise wherever it resides, including the managed data warehouse. Protected data can be ingested directly into your managed Cloud data warehouse. Depending on usage patterns, this ensures that your managed data warehouse is not brought into scope for PCI, PII, GDPR, HIPAA, and other compliance policies.
Data protection using the Extract-Transform-Load (ETL) pattern: In this case, sensitive data may be transformed with a Protegrity protector either on-premise or in the Cloud before it is ingested into Snowflake.
Data protection using the Extract-Load-Transform (ELT) pattern: In this case, sensitive data is protected after it lands into the target system typically through a temporary landing table. It uses the native data warehouse’s compute engine with Protegrity to protect incoming data at very high throughput rates. After the data is protected, the intermediate loading tables are dropped as part of the ingestion workflow.
