Overview

Solution overview and features.

Solution Overview

S3 Protector automatically protects files that are uploaded to an Amazon S3 source bucket and writes the processed result to an output bucket. Amazon S3 event notifications trigger the protector, and Cloud Protect API applies the configured Protegrity protection rules to the selected columns in each file.
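As an illustration of the trigger described above, the following minimal sketch shows how code that receives a standard Amazon S3 event notification might extract the bucket and object key of each uploaded file. It does not represent the S3 Protector's internal handler. The function name `objects_from_s3_event`, the bucket name, and the object key are hypothetical; only the event layout (`Records`, `eventName`, `s3.bucket.name`, `s3.object.key`) follows the documented S3 notification format.

```python
from urllib.parse import unquote_plus


def objects_from_s3_event(event: dict) -> list:
    """Return (bucket, key) pairs for every object-created record in an S3 event."""
    results = []
    for record in event.get("Records", []):
        # Skip anything that is not an upload (for example, ObjectRemoved events).
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Object keys are URL-encoded in S3 notifications; spaces arrive as '+'.
        key = unquote_plus(s3_info["object"]["key"])
        results.append((bucket, key))
    return results


# Hypothetical sample event with one uploaded file and one deletion.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-source-bucket"},
                "object": {"key": "incoming/customer+data.csv"},
            },
        },
        {
            "eventName": "ObjectRemoved:Delete",
            "s3": {
                "bucket": {"name": "example-source-bucket"},
                "object": {"key": "incoming/old.csv"},
            },
        },
    ]
}

uploaded = objects_from_s3_event(sample_event)
print(uploaded)
```

Decoding the key before use matters because a file named `customer data.csv` would otherwise be looked up under the wrong name.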

For most deployments, users only need to decide where files arrive, where protected files should be written, and which mapping.json rules apply to each dataset. The service then processes the file automatically.

The solution requires Protegrity Cloud Protect API on AWS. Cloud Protect API provides the endpoint used by S3 Protector to perform Protegrity operations as part of cloud-based data pipelines.

Protected files can be used as a source for a data lake or for downstream database ingestion. For example:

Snowflake Snowpipe can automatically ingest protected files as they are written by S3 Protector.
Amazon Redshift can bulk load data from Amazon S3 using the COPY command.
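For the Redshift case, a bulk load of protected CSV output could look roughly like the statement built below. This is a sketch under stated assumptions, not a prescribed configuration: the table name, output bucket, and IAM role ARN are hypothetical placeholders, and the helper `build_redshift_copy` exists only for this example. The file format options would need to match the files that the deployment actually writes.

```python
def build_redshift_copy(table: str, s3_uri: str, iam_role_arn: str) -> str:
    """Build a Redshift COPY statement that loads CSV files from an S3 prefix.

    Values are interpolated directly, so pass trusted constants only.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV\n"
        "IGNOREHEADER 1;"
    )


# All three values below are hypothetical placeholders.
statement = build_redshift_copy(
    table="analytics.customers_protected",
    s3_uri="s3://example-output-bucket/protected/customers/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftLoadRole",
)
print(statement)
```

Because the loaded columns contain tokens rather than clear-text values, the target table's column types and lengths should be chosen to fit the protected output.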

Like other Protegrity products, S3 Protector uses the data security policy maintained on the Enterprise Security Administrator (ESA). The ESA policy user supplied during setup acts as the service account for the deployment. For more information about policy user configuration, refer to the Enterprise Security Administrator Guide.

Analytics on Protected Data

Protegrity’s format- and length-preserving tokenization scheme makes it possible to perform analytics directly on protected data. Tokens are join-preserving, so protected data can be joined across datasets. Statistical analytics and machine learning training can often be performed without re-identifying protected data. However, a user or service account with authorized security policy privileges may re-identify subsets of data using the Cloud Storage Protector - Amazon S3 service.
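The join-preserving property can be illustrated with a toy example. The `toy_token` function below is a deterministic stand-in built on HMAC; it is not Protegrity tokenization, it is not reversible, and it is unsuitable for real data. It only demonstrates the idea that when the same input always yields the same token, a join or aggregation over protected keys gives the same result as over clear-text keys. All names and sample values are hypothetical.

```python
import hashlib
import hmac

DEMO_KEY = b"demo-only-key"  # hypothetical key, for illustration only


def toy_token(value: str) -> str:
    """Deterministic stand-in for a token: digits out, same length as the input.

    NOT Protegrity's algorithm. Works for inputs of up to 32 characters.
    """
    digest = hmac.new(DEMO_KEY, value.encode(), hashlib.sha256).digest()
    return "".join(str(byte % 10) for byte in digest[: len(value)])


def protect(rows: list) -> list:
    """Replace the customer_id column of every row with its toy token."""
    return [{**row, "customer_id": toy_token(row["customer_id"])} for row in rows]


def total_by_segment(customers: list, orders: list) -> dict:
    """Join orders to customers on customer_id and sum amounts per segment."""
    segment_of = {c["customer_id"]: c["segment"] for c in customers}
    totals = {}
    for order in orders:
        segment = segment_of[order["customer_id"]]
        totals[segment] = totals.get(segment, 0) + order["amount"]
    return totals


customers = [
    {"customer_id": "100234", "segment": "retail"},
    {"customer_id": "100777", "segment": "wholesale"},
]
orders = [
    {"customer_id": "100234", "amount": 40},
    {"customer_id": "100777", "amount": 15},
    {"customer_id": "100234", "amount": 5},
]

clear_totals = total_by_segment(customers, orders)
protected_totals = total_by_segment(protect(customers), protect(orders))
print(clear_totals, protected_totals)
```

The aggregation never sees a clear-text identifier in the protected run, yet both runs group the orders identically.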

Features

Protegrity S3 Protector provides the following features:

Fine-grained field-level protection for structured data in the following supported formats:
File Format	Suffix
CSV	.csv
JSON	.json
Parquet	.parquet
Excel	.xlsx
Role-based access control (RBAC) to protect and unprotect (re-identify) data depending on user privileges.
The same policy enforcement features as other Protegrity application protectors.
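The supported formats listed above are identified by file suffix. The following minimal sketch shows one way a pipeline could check whether an uploaded object has a supported suffix before expecting protected output. The helper `detect_format` is hypothetical, and the case-insensitive comparison is an assumption; the actual matching rules of S3 Protector are not described here.

```python
from pathlib import PurePosixPath
from typing import Optional

# Suffix-to-format table taken from the list of supported formats.
SUPPORTED_FORMATS = {
    ".csv": "CSV",
    ".json": "JSON",
    ".parquet": "Parquet",
    ".xlsx": "Excel",
}


def detect_format(object_key: str) -> Optional[str]:
    """Return the format name for an S3 object key, or None if unsupported."""
    # Assumption: suffixes are compared case-insensitively.
    suffix = PurePosixPath(object_key).suffix.lower()
    return SUPPORTED_FORMATS.get(suffix)


print(detect_format("incoming/customers.parquet"))
print(detect_format("incoming/notes.txt"))
```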

For more information about the available protection options, such as data types, Tokenization or Encryption types, or length-preserving and non-preserving tokens, refer to Protection Methods Reference.

Last modified : January 02, 2026