Tokenization Support by Protegrity Products

Lists the tokenization types supported by each type of protector.

Protegrity offers various types of protectors that help protect data across different software and platforms. For example, you can use:

Application Protectors: To protect data in applications written in the C, C++, Python, Java, .Net, and Go programming languages.
Big Data Protectors: To protect data in Big Data environments at various component levels, such as Hive, Pig, and MapReduce.
Data Warehouse Protectors: To protect data in Teradata data warehouses.
Gateway Protectors: To protect data that passes through gateways, such as the Data Security Gateway (DSG).
Cloud Protectors: To protect data on cloud platforms.

Each protector supports specific tokenization types, which are listed in the following sections.

Application Protector

The Protegrity Application Protector (AP) is a high-performance, versatile solution that provides a packaged interface to integrate comprehensive, granular security and auditing into enterprise applications.

Application Protectors support all types of tokens.

Table: Supported Tokenization Types by Application Protector

Tokenization Type	AP Java^*1	AP Python	AP C
Credit Card, Numeric, Alpha, Upper-case Alpha, Alpha-Numeric, Upper Alpha-Numeric, Lower ASCII, Email	STRING, CHAR[], BYTE[]	STRING, BYTES	STRING, CHAR[], BYTE[]
Integer	SHORT: 2 bytes, INT: 4 bytes, LONG: 8 bytes	INT: 4 bytes and 8 bytes	SHORT: 2 bytes, INT: 4 bytes, LONG: 8 bytes
Datetime	DATE, STRING, CHAR[], BYTE[]	DATE, STRING, BYTES	DATE, STRING, CHAR[], BYTE[]
Decimal	STRING, CHAR[], BYTE[]	STRING, BYTES	STRING, CHAR[], BYTE[]
Unicode Gen2	STRING, CHAR[], BYTE[]	STRING, BYTES	STRING, CHAR[], BYTE[]
Binary	BYTE[]	BYTES	BYTE[]

^*1 - If the input and output types of the API are BYTE[], then the customer application should convert the input to a byte array before calling the API and convert the output from the byte array after the call returns.

Table: Deprecated Tokenization Types supported by Application Protector

Tokenization Type	AP Java^*1	AP Python	AP C
Printable	STRING, CHAR[], BYTE[]	STRING, BYTES	STRING, CHAR[], BYTE[]
Date	DATE, STRING, CHAR[], BYTE[]	DATE, STRING, BYTES	DATE, STRING, CHAR[], BYTE[]
Unicode	STRING, CHAR[], BYTE[]	STRING, BYTES	STRING, CHAR[], BYTE[]
Unicode Base64	STRING, CHAR[], BYTE[]	STRING, BYTES	STRING, CHAR[], BYTE[]

^*1 - If the input and output types of the API are BYTE[], then the customer application should convert the input to a byte array before calling the API and convert the output from the byte array after the call returns.
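
The byte-array note above can be sketched as follows. This is a minimal illustration, not the Application Protector API: `protect_text_via_bytes` and `placeholder_protect` are hypothetical names, the placeholder only reverses bytes so the example can run, and the sketch assumes the output bytes use the same encoding as the input.

```python
from typing import Callable


def protect_text_via_bytes(
    value: str,
    protect_bytes: Callable[[bytes], bytes],
    encoding: str = "utf-8",
) -> str:
    """Wrap a bytes-in/bytes-out call so that callers can work with text."""
    raw_input = value.encode(encoding)       # convert the input before the call
    raw_output = protect_bytes(raw_input)    # hypothetical bytes-based API call
    return raw_output.decode(encoding)       # convert the output after the call


def placeholder_protect(data: bytes) -> bytes:
    """Stand-in for a real protect call. It reverses the bytes; it is NOT tokenization."""
    return data[::-1]


# ASCII sample input, so the reversed bytes still decode cleanly.
result = protect_text_via_bytes("hello", placeholder_protect)
print(result)
```

In a real integration, the callable passed as `protect_bytes` would be whatever bytes-based protect operation the protector exposes; the conversion steps around it stay the same.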

For more information about Application protectors, refer to Application Protector.

Big Data Protector

Protegrity supports MapReduce, Hive, Pig, HBase, Spark, and Impala, which use the Hadoop Distributed File System (HDFS) or Ozone as the data storage layer. The data is protected from internal and external threats, and users and business processes can continue to utilize the secured data. Protegrity protects data inside the files using tokenization and strong encryption protection methods.

The following table shows the tokenization types supported for Big Data Protectors.

Table: Supported Tokenization Types for Big Data Protectors

Tokenization Type	MapReduce^*1	Hive	Pig	HBase^*1	Impala	Spark^*1	Spark SQL	Trino
Credit Card, Numeric^*3, Alpha^*3, Upper-case Alpha^*3, Alpha-Numeric^*3, Upper Alpha-Numeric^*3, Lower ASCII, Email^*3	BYTE[]	STRING	CHARARRAY	BYTE[]	STRING	VARCHAR, STRING	STRING	VARCHAR
Integer	INT: 4 bytes, LONG: 8 bytes	INT: 4 bytes, BIGINT: 8 bytes	INT: 4 bytes	BYTE[]	SMALLINT: 2 bytes, INT: 4 bytes, BIGINT: 8 bytes	SHORT: 2 bytes, INT: 4 bytes, LONG: 8 bytes	SHORT: 2 bytes, INT: 4 bytes, LONG: 8 bytes	SMALLINT: 2 bytes, INT: 4 bytes, BIGINT: 8 bytes
Datetime^*2	BYTE[]	STRING, DATE, DATETIME	CHARARRAY	BYTE[]	STRING	BYTE[], STRING	STRING, DATE, DATETIME	VARCHAR, DATE, TIMESTAMP
Decimal	BYTE[]	STRING	CHARARRAY	BYTE[]	STRING	BYTE[], STRING	STRING	VARCHAR
Unicode Gen2	BYTE[]	STRING	Not supported	BYTE[]	STRING	BYTE[], STRING	STRING	VARCHAR
Binary	BYTE[]	Not supported	Not supported	BYTE[]	Not supported	BYTE[]	Not supported	Not supported

^*1 - The customer application should convert the input into a byte array and generate the output from the byte array in the required data type.
^*2 - Datetime tokenization works only with the VARCHAR data type.
^*3 - The Char tokenization UDFs support only Numeric, Alpha, Alpha-Numeric, Upper-case Alpha, Upper Alpha-Numeric, and Email data elements, and only with length preservation selected. Using any other data elements, or non-length-preserving data elements, with Char tokenization UDFs is not supported.
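
The restriction on Char tokenization UDFs can be expressed as a simple pre-check. The sketch below is illustrative only: `CHAR_UDF_TOKEN_TYPES` and `char_udf_supports` are hypothetical names, not part of any Protegrity product, and the set simply restates the token types named in the note above.

```python
# Token types that the note above lists as supported by Char tokenization UDFs.
CHAR_UDF_TOKEN_TYPES = frozenset({
    "Numeric",
    "Alpha",
    "Alpha-Numeric",
    "Upper-case Alpha",
    "Upper Alpha-Numeric",
    "Email",
})


def char_udf_supports(token_type: str, length_preserving: bool) -> bool:
    """Return True only for a listed token type with length preservation selected."""
    return length_preserving and token_type in CHAR_UDF_TOKEN_TYPES


print(char_udf_supports("Numeric", True))       # listed type, length preserving
print(char_udf_supports("Numeric", False))      # length preservation not selected
print(char_udf_supports("Credit Card", True))   # type not listed in the note
```

A check like this could run when a job is configured, so that an unsupported data element is rejected before any Char tokenization UDF is invoked.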

The following table shows the deprecated tokenization types supported for Big Data Protectors.

Table: Deprecated Tokenization Types supported for Big Data Protectors

Tokenization Type	MapReduce^*1	Hive	Pig	HBase^*1	Impala	Spark^*1	Spark SQL	Trino
Printable	BYTE[]	Not supported	Not supported	BYTE[]	STRING	BYTE[]	Not supported	Not supported
Date	BYTE[]	STRING, DATE, DATETIME	CHARARRAY	BYTE[]	STRING	BYTE[], STRING	STRING, DATE, DATETIME	VARCHAR, DATE, TIMESTAMP
Unicode	BYTE[]	STRING	Not supported	BYTE[]	STRING	BYTE[], STRING	STRING	VARCHAR
Unicode Base64	BYTE[]	STRING	Not supported	BYTE[]	STRING	BYTE[], STRING	STRING	VARCHAR

^*1 - The customer application should convert the input into a byte array and generate the output from the byte array in the required data type.

For more information about Big Data protectors, refer to Big Data Protector.

Data Warehouse Protector

The Protegrity Data Warehouse Protector is an advanced security solution designed to protect sensitive data at the column level. This enables you to secure your data while still permitting access to authorized users. Additionally, the Data Warehouse Protector integrates seamlessly with existing database systems using User-Defined Functions (UDFs) for enhanced security. Protegrity protects data inside the data warehouses using various tokenization and encryption methods.

Table: Supported Tokenization Types for Data Warehouse Protector

Tokenization Type	Teradata
Credit Card, Numeric, Alpha, Upper-case Alpha, Alpha-Numeric, Upper Alpha-Numeric, Lower ASCII, Email, Datetime, Decimal	VARCHAR LATIN
Integer	SMALLINT: 2 bytes, INTEGER: 4 bytes, BIGINT: 8 bytes
Unicode Gen2	VARCHAR UNICODE
Binary	Not supported
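
The byte widths in the Integer row determine which values each column type can hold. The following sketch picks the narrowest type for a value. It assumes signed two's-complement ranges for each width, and `smallest_integer_type` is a hypothetical helper, not a Protegrity or Teradata function.

```python
# (type name, width in bytes) pairs taken from the Integer row of the table above.
TERADATA_INTEGER_TYPES = (("SMALLINT", 2), ("INTEGER", 4), ("BIGINT", 8))


def smallest_integer_type(value: int) -> str:
    """Return the narrowest listed type whose signed range contains value."""
    for name, width in TERADATA_INTEGER_TYPES:
        bits = width * 8
        lowest = -(2 ** (bits - 1))       # assumed signed lower bound
        highest = 2 ** (bits - 1) - 1     # assumed signed upper bound
        if lowest <= value <= highest:
            return name
    raise ValueError(f"{value} does not fit in 8 bytes")


print(smallest_integer_type(32767))
print(smallest_integer_type(32768))
print(smallest_integer_type(2 ** 31))
```

Checking the range up front helps avoid passing a value to an Integer data element whose width cannot represent it.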

Table: Deprecated Tokenization Types supported by Data Warehouse Protector

Tokenization Type	Teradata
Printable	VARCHAR LATIN
Date	DATE, CHAR
Unicode	VARCHAR UNICODE
Unicode Base64	Not supported

For more information about Data Warehouse protectors, refer to Data Warehouse Protector.

If you have fixed-length data fields and the input data is shorter than the length of the field, then trim the leading and trailing whitespace before passing the input to the respective Protect and Unprotect UDFs.
Trimming the whitespace ensures consistent data output for the protect and unprotect operations. This consistency holds true across all Protegrity products.
For more information, refer to Truncating Whitespaces.
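
As a rough illustration of the whitespace guidance above, the sketch below trims a padded fixed-length value before it would be handed to a protect or unprotect call. `normalize_fixed_length` is a hypothetical helper name, and the sample values are made up for the example.

```python
def normalize_fixed_length(value: str) -> str:
    """Remove leading and trailing whitespace; interior characters are kept."""
    return value.strip()


padded = "12345     "    # for example, a short value read from a 10-character field
unpadded = "12345"

# Both forms reduce to the same input, so the protect operation sees identical data.
print(normalize_fixed_length(padded) == normalize_fixed_length(unpadded))
```

Applying the same normalization on both the protect and unprotect paths keeps the input identical regardless of how the source field was padded.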

Database Protector

The Database Protector is a comprehensive data security solution designed to protect sensitive data directly within relational databases. It provides high-performance data protection while allowing applications and authorized users to continue accessing the data transparently.

The following table shows the tokenization types supported for Database Protectors.

Table: Supported Tokenization Types for Database Protectors

Tokenization Type	Oracle Data Types	MSSQL Data Types
Credit Card	VARCHAR2, CHAR	VARCHAR, CHAR
Numeric	VARCHAR2, CHAR	VARCHAR, CHAR
Alpha	VARCHAR2, CHAR	VARCHAR, CHAR
Upper-case Alpha	VARCHAR2, CHAR	VARCHAR, CHAR
Alpha-Numeric	VARCHAR2, CHAR	VARCHAR, CHAR
Upper Alpha-Numeric	VARCHAR2, CHAR	VARCHAR, CHAR
Lower ASCII	VARCHAR2, CHAR	VARCHAR^*5, CHAR
Email	VARCHAR2, CHAR	VARCHAR, CHAR
Integer	INTEGER	INTEGER
Datetime	DATE, VARCHAR2, CHAR	VARCHAR, CHAR
Decimal	NUMBER, VARCHAR2, CHAR	VARCHAR, CHAR
Unicode	Not supported	NVARCHAR
Unicode Base64	VARCHAR2, NVARCHAR2	NVARCHAR
Binary	Not supported	Not supported
Printable	VARCHAR2, CHAR	VARCHAR, CHAR

For more information about Database protectors, refer to Database Protectors.

Last modified : March 05, 2026