Tokenization Support by Protegrity Products

Lists all token types used by different types of protectors.

Protegrity offers various types of protectors that help protect data across different software and platforms. For example, you can use:

  • Application Protectors: To protect data in the C, C++, Python, Java, .NET, and Go programming languages.
  • Big Data Protectors: To protect data in Big Data at various component levels, such as Hive, Pig, and MapReduce.
  • Data Warehouse Protectors: To protect data in the Teradata Data Warehouses.
  • Gateway Protectors: To protect data using gateways such as the Data Security Gateway (DSG).
  • Cloud Protectors: To protect data in cloud environments.

Each protector supports specific tokenization types, which are listed in the following sections.

Application Protector

The Protegrity Application Protector (AP) is a high-performance, versatile solution that provides a packaged interface to integrate comprehensive, granular security and auditing into enterprise applications.

Application Protectors support all types of tokens.

Table: Supported Tokenization Types by Application Protector

Tokenization Type | AP Java*1 | AP Python | AP C
Credit Card, Numeric, Alpha, Upper-case Alpha, Alpha-Numeric, Upper Alpha-Numeric, Lower ASCII, Email | STRING, CHAR[], BYTE[] | STRING, BYTES | STRING, CHAR[], BYTE[]
Integer | SHORT: 2 bytes, INT: 4 bytes, LONG: 8 bytes | INT: 4 bytes and 8 bytes | SHORT: 2 bytes, INT: 4 bytes, LONG: 8 bytes
Datetime | DATE, STRING, CHAR[], BYTE[] | DATE, STRING, BYTES | DATE, STRING, CHAR[], BYTE[]
Decimal | STRING, CHAR[], BYTE[] | STRING, BYTES | STRING, CHAR[], BYTE[]
Unicode Gen2 | STRING, CHAR[], BYTE[] | STRING, BYTES | STRING, CHAR[], BYTE[]
Binary | BYTE[] | BYTES | BYTE[]

*1 - If the input and output types of the API are BYTE[], then the customer application should convert the input to a byte array before calling the API and convert the output from the byte array after the call.
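The byte array conversion described in the footnote can be sketched as follows. This is a minimal illustration in Python, not Protegrity code: the helper names, the UTF-8 encoding, and the digit-swapping `placeholder_protect` and `placeholder_unprotect` functions are all assumptions introduced here to stand in for a real byte-typed API call.

```python
# Placeholder translation table: maps each ASCII digit d to 9 - d.
# It stands in for a real protector call, which this sketch does not invoke.
_DIGIT_SWAP = bytes.maketrans(b"0123456789", b"9876543210")


def to_api_input(value: str, encoding: str = "utf-8") -> bytes:
    """Convert application text to the byte array a byte-typed API expects.

    UTF-8 is an assumption; use the encoding your application agrees on.
    """
    return value.encode(encoding)


def from_api_output(data: bytes, encoding: str = "utf-8") -> str:
    """Convert the byte array returned by the API back to application text."""
    return data.decode(encoding)


def placeholder_protect(data: bytes) -> bytes:
    """Hypothetical stand-in for a protect call; NOT a Protegrity API."""
    return data.translate(_DIGIT_SWAP)


def placeholder_unprotect(data: bytes) -> bytes:
    """Hypothetical stand-in for an unprotect call (the swap is its own inverse)."""
    return data.translate(_DIGIT_SWAP)


original = "4111-1111-1111-1111"
# Convert to bytes before the call, and back to text after it.
protected = from_api_output(placeholder_protect(to_api_input(original)))
restored = from_api_output(placeholder_unprotect(to_api_input(protected)))
```

The point of the sketch is the ordering: the conversion to bytes happens before the call, and the conversion back to the application's data type happens after it, using the same encoding in both directions.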

Table: Deprecated Tokenization Types supported by Application Protector

Tokenization Type | AP Java*1 | AP Python | AP C
Printable | STRING, CHAR[], BYTE[] | STRING, BYTES | STRING, CHAR[], BYTE[]
Date | DATE, STRING, CHAR[], BYTE[] | DATE, STRING, BYTES | DATE, STRING, CHAR[], BYTE[]
Unicode | STRING, CHAR[], BYTE[] | STRING, BYTES | STRING, CHAR[], BYTE[]
Unicode Base64 | STRING, CHAR[], BYTE[] | STRING, BYTES | STRING, CHAR[], BYTE[]

*1 - If the input and output types of the API are BYTE[], then the customer application should convert the input to a byte array before calling the API and convert the output from the byte array after the call.

For more information about Application protectors, refer to Application Protector.

Big Data Protector

Protegrity supports MapReduce, Hive, Pig, HBase, Spark, and Impala, which use the Hadoop Distributed File System (HDFS) or Ozone as the data storage layer. The data is protected from internal and external threats, and users and business processes can continue to use the secured data. Protegrity protects data inside the files using tokenization and strong encryption protection methods.

The following table shows the tokenization types supported for Big Data Protectors.

Table: Supported Tokenization Types for Big Data Protectors

Tokenization Type | MapReduce*1 | Hive | Pig | HBase*1 | Impala | Spark*1 | Spark SQL | Trino
Credit Card, Numeric*3, Alpha*3, Upper-case Alpha*3, Alpha-Numeric*3, Upper Alpha-Numeric*3, Lower ASCII, Email*3 | BYTE[] | STRING | CHARARRAY | BYTE[] | STRING | VARCHAR, STRING | STRING | VARCHAR
Integer | INT: 4 bytes, LONG: 8 bytes | INT: 4 bytes, BIGINT: 8 bytes | INT: 4 bytes | BYTE[] | SMALL INT: 2 bytes, INT: 4 bytes, BIGINT: 8 bytes | SHORT: 2 bytes, INT: 4 bytes, LONG: 8 bytes | SHORT: 2 bytes, INT: 4 bytes, LONG: 8 bytes | SMALL INT: 2 bytes, INT: 4 bytes, BIGINT: 8 bytes
Datetime*2 | BYTE[] | STRING, DATE, DATETIME | CHARARRAY | BYTE[] | STRING | BYTE[], STRING | STRING, DATE, DATETIME | VARCHAR, DATE, TIMESTAMP
Decimal | BYTE[] | STRING | CHARARRAY | BYTE[] | STRING | BYTE[], STRING | STRING | VARCHAR
Unicode Gen2 | BYTE[] | STRING | Not supported | BYTE[] | STRING | BYTE[], STRING | STRING | VARCHAR
Binary | BYTE[] | Not supported | Not supported | BYTE[] | Not supported | BYTE[] | Not supported | Not supported

*1 - The customer application should convert the input into a byte array and generate the output from the byte array in the required data type.
*2 - Datetime tokenization works only with the VARCHAR data type.
*3 - The Char tokenization UDFs support only the Numeric, Alpha, Alpha-Numeric, Upper-case Alpha, Upper Alpha-Numeric, and Email data elements, and only with length preservation selected. Using any other data elements, or non-length-preserving data elements, with the Char tokenization UDFs is not supported.
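The rule in footnote *3 can be expressed as a small pre-check that an application might run before choosing a Char tokenization UDF. The sketch below is illustrative only: the function name and the idea of passing the length-preservation setting as a boolean are assumptions made here, and the set of names is copied from the footnote.

```python
# Data element types that footnote *3 lists as supported by the Char
# tokenization UDFs. The names are copied from the footnote text.
CHAR_UDF_TOKEN_TYPES = frozenset({
    "Numeric",
    "Alpha",
    "Alpha-Numeric",
    "Upper-case Alpha",
    "Upper Alpha-Numeric",
    "Email",
})


def char_udf_supported(token_type: str, length_preserving: bool) -> bool:
    """Return True only when both conditions of footnote *3 hold.

    Hypothetical helper: the data element must be one of the listed types,
    and it must have length preservation selected.
    """
    return token_type in CHAR_UDF_TOKEN_TYPES and length_preserving
```

For example, `char_udf_supported("Numeric", True)` is accepted, while a Numeric data element without length preservation, or a Credit Card data element with it, is rejected.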

The following table shows the deprecated tokenization types supported for Big Data Protectors.

Table: Deprecated Tokenization Types supported for Big Data Protectors

Tokenization Type | MapReduce*1 | Hive | Pig | HBase*1 | Impala | Spark*1 | Spark SQL | Trino
Printable | BYTE[] | Not supported | Not supported | BYTE[] | STRING | BYTE[] | Not supported | Not supported
Date | BYTE[] | STRING, DATE, DATETIME | CHARARRAY | BYTE[] | STRING | BYTE[], STRING | STRING, DATE, DATETIME | VARCHAR, DATE, TIMESTAMP
Unicode | BYTE[] | STRING | Not supported | BYTE[] | STRING | BYTE[], STRING | STRING | VARCHAR
Unicode Base64 | BYTE[] | STRING | Not supported | BYTE[] | STRING | BYTE[], STRING | STRING | VARCHAR

*1 - The customer application should convert the input into a byte array and generate the output from the byte array in the required data type.

For more information about Big Data protectors, refer to Big Data Protector.

Data Warehouse Protector

The Protegrity Data Warehouse Protector is an advanced security solution designed to protect sensitive data at the column level. This enables you to secure your data while still permitting access to authorized users. Additionally, the Data Warehouse Protector integrates with existing database systems through User-Defined Functions (UDFs) for enhanced security. Protegrity protects data inside the data warehouses using various tokenization and encryption methods.

Table: Supported Tokenization Types for Data Warehouse Protector

Tokenization Type | Teradata
Credit Card, Numeric, Alpha, Upper-case Alpha, Alpha-Numeric, Upper Alpha-Numeric, Lower ASCII, Email, Datetime, Decimal | VARCHAR LATIN
Integer | SMALLINT: 2 bytes, INTEGER: 4 bytes, BIGINT: 8 bytes
Unicode Gen2 | VARCHAR UNICODE
Binary | Not supported
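The Integer row lists three widths: 2, 4, and 8 bytes. As a rough sketch, the helper below picks the narrowest of the three Teradata type names that can hold a given value. It assumes signed two's-complement ranges for each width, which the table itself does not state, and the function name is hypothetical.

```python
# (type name, size in bytes) in ascending order, as listed in the Integer row.
INTEGER_WIDTHS = (("SMALLINT", 2), ("INTEGER", 4), ("BIGINT", 8))


def smallest_integer_type(value: int) -> str:
    """Return the narrowest listed type whose signed range contains value.

    Assumption: each width is a signed two's-complement integer, so an
    n-bit type holds values from -2**(n-1) to 2**(n-1) - 1.
    """
    for name, size in INTEGER_WIDTHS:
        bits = size * 8
        low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
        if low <= value <= high:
            return name
    raise OverflowError(f"{value} does not fit in 8 bytes")
```

A check like this can help decide which column type, and therefore which Integer data element size, a value needs before it is protected.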

Table: Deprecated Tokenization Types supported by Data Warehouse Protector

Tokenization Type | Teradata
Printable | VARCHAR LATIN
Date | DATE, CHAR
Unicode | VARCHAR UNICODE
Unicode Base64 | Not supported

For more information about Data Warehouse protectors, refer to Data Warehouse Protector.

  • If you have fixed-length data fields and the input data is shorter than the length of the field, then truncate the leading and trailing whitespaces before passing the input to the respective Protect and Unprotect UDFs.
  • The truncation of whitespaces ensures consistent data output for the protect and unprotect operations. This consistency holds true across all Protegrity products.
  • For more information, refer to Truncating Whitespaces.
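The whitespace guidance above can be illustrated with a short sketch. It shows only the trimming step that happens before a value is handed to a Protect or Unprotect UDF; the function name and the sample 10-character field are assumptions made for illustration, and no UDF is called.

```python
def prepare_fixed_length_input(field_value: str) -> str:
    """Remove leading and trailing whitespace from a padded field value.

    Hypothetical helper: str.strip() with no arguments removes all leading
    and trailing whitespace characters, including the space padding that a
    fixed-length field adds.
    """
    return field_value.strip()


# A 6-character value stored in a hypothetical 10-character fixed-length field.
padded = "ABC123".ljust(10)
trimmed = prepare_fixed_length_input(padded)
```

Because the padded and unpadded forms reduce to the same string after trimming, the protect and unprotect operations receive the same input in both cases, which is what keeps their output consistent.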

Database Protector

The Database Protector is a comprehensive data security solution designed to protect sensitive data directly within relational databases. It provides high-performance data protection while allowing applications and authorized users to continue accessing the data transparently.

The following table shows the tokenization types supported for Database Protectors.

Table: Supported Tokenization Types for Database Protectors

Tokenization Type | Oracle Data Types | MSSQL Data Types
Credit Card | VARCHAR2, CHAR | VARCHAR, CHAR
Numeric | VARCHAR2, CHAR | VARCHAR, CHAR
Alpha | VARCHAR2, CHAR | VARCHAR, CHAR
Upper-case Alpha | VARCHAR2, CHAR | VARCHAR, CHAR
Alpha-Numeric | VARCHAR2, CHAR | VARCHAR, CHAR
Upper Alpha-Numeric | VARCHAR2, CHAR | VARCHAR, CHAR
Lower ASCII | VARCHAR2, CHAR | VARCHAR*5, CHAR
Email | VARCHAR2, CHAR | VARCHAR, CHAR
Integer | INTEGER | INTEGER
Datetime | DATE, VARCHAR2, CHAR | VARCHAR, CHAR
Decimal | NUMBER, VARCHAR2, CHAR | VARCHAR, CHAR
Unicode | Not Supported | NVARCHAR
Unicode Base64 | VARCHAR2, NVARCHAR2 | NVARCHAR
Binary | Not Supported | Not Supported
Printable | VARCHAR2, CHAR | VARCHAR, CHAR

For more information about Database protectors, refer to Database Protectors.
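A support table like the one for Database Protectors can be encoded as a lookup so that a schema can be checked programmatically. The sketch below copies only a few rows by hand as an example; the dictionary layout and the `is_supported` function are assumptions introduced here and are not part of any Protegrity interface.

```python
# Subset of the Database Protector table, copied by hand for illustration.
# An empty tuple records "Not Supported".
SUPPORTED_TYPES = {
    "Credit Card": {"Oracle": ("VARCHAR2", "CHAR"), "MSSQL": ("VARCHAR", "CHAR")},
    "Integer": {"Oracle": ("INTEGER",), "MSSQL": ("INTEGER",)},
    "Decimal": {"Oracle": ("NUMBER", "VARCHAR2", "CHAR"), "MSSQL": ("VARCHAR", "CHAR")},
    "Unicode": {"Oracle": (), "MSSQL": ("NVARCHAR",)},
    "Binary": {"Oracle": (), "MSSQL": ()},
}


def is_supported(token_type: str, database: str, column_type: str) -> bool:
    """Check whether a column type appears for a tokenization type and database.

    Unknown tokenization types or databases are treated as unsupported.
    """
    by_database = SUPPORTED_TYPES.get(token_type, {})
    return column_type.upper() in by_database.get(database, ())
```

For instance, `is_supported("Decimal", "Oracle", "number")` returns `True`, while `is_supported("Unicode", "Oracle", "NVARCHAR2")` returns `False` because the table marks Unicode as not supported on Oracle.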


Last modified : March 05, 2026