Protegrity Tokenization

Protegrity tokenization is a data tokenization method optimized to meet the performance, scalability, and manageability requirements of large and complex environments.

Tokenization is the process of replacing sensitive data with tokens that have no value to someone who gains unauthorized access to the data. Specific parts of the original data can be preserved, while the system tokenizes the rest according to the configured design. Tokens can be set up and deployed directly on the protectors, depending on your enterprise configuration and data security needs. Once tokenization is deployed, operational systems work with the tokens rather than the original values. If the operational systems experience a security breach, only the tokens are at risk of being compromised. Protegrity tokenization is transparent to end users. Data integrity is enforced through the data security policy.

Protegrity tokenization can be configured to preserve different parts of the original value in the token, such as the last 4 digits. It also recognizes and preserves delimiters, such as those commonly used in SSNs and dates.
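
The shape of such a token can be sketched in a few lines of Python. This is an illustration only: the function name `toy_tokenize_digits`, the `keep_last` parameter, and the HMAC-based digit substitution are assumptions made for this example, not Protegrity's algorithm or API. The sketch is also one-way, so it shows only what is preserved in the output and does not support detokenization.

```python
import hashlib
import hmac
import string


def toy_tokenize_digits(value: str, key: bytes, keep_last: int = 4) -> str:
    """Replace digits with keyed pseudo-random digits (illustration only).

    Delimiters (any non-digit character) stay in place, and the last
    `keep_last` digits are left in the clear.
    """
    # Positions of every ASCII digit in the input, in order.
    digit_positions = [i for i, ch in enumerate(value) if ch in string.digits]
    # Digits that will be replaced: all but the last `keep_last`.
    protected = digit_positions[:-keep_last] if keep_last else digit_positions
    # Keyed digest used as a source of replacement digits (hypothetical choice).
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    out = list(value)
    for n, pos in enumerate(protected):
        out[pos] = str(digest[n % len(digest)] % 10)
    return "".join(out)


token = toy_tokenize_digits("123-45-6789", b"demo-key")
print(token)  # same length, dashes in the same places, ends with 6789
```

The token has the same length and layout as the input, so a downstream system that expects the SSN format can store and display it without schema changes.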

Protegrity tokenization enables the user to tokenize various kinds of input data, such as payment card industry (PCI) data, personally identifiable information (PII), and protected health information (PHI).

With Protegrity tokenization, there is a 1:1 relationship between the real data value and its token value. As a result, token values can serve as alternative unique IDs for joining related information.
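
To make the 1:1 property concrete, the following sketch joins two data sets on a token instead of the real value. The `ToyTokenVault` class and its dictionary-backed lookup are hypothetical and exist only to make the mapping visible; they do not describe how Protegrity generates or stores tokens.

```python
import secrets


class ToyTokenVault:
    """Hypothetical in-memory mapping that guarantees one token per value."""

    def __init__(self) -> None:
        self._to_token: dict[str, str] = {}
        self._to_value: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # The same input always returns the same token.
        if value in self._to_token:
            return self._to_token[value]
        # Draw random digit strings until one is unused, so that
        # no two values ever share a token.
        while True:
            token = "".join(secrets.choice("0123456789") for _ in value)
            if token not in self._to_value:
                break
        self._to_token[value] = token
        self._to_value[token] = value
        return token

    def detokenize(self, token: str) -> str:
        return self._to_value[token]


vault = ToyTokenVault()
customers = [{"ssn": "123456789", "name": "Customer A"},
             {"ssn": "987654321", "name": "Customer B"}]
claims = [{"ssn": "987654321", "amount": 40},
          {"ssn": "123456789", "amount": 250}]

# Both data sets carry only the token; the join key is still unique.
names_by_token = {vault.tokenize(c["ssn"]): c["name"] for c in customers}
joined = [(names_by_token[vault.tokenize(cl["ssn"])], cl["amount"])
          for cl in claims]
print(joined)  # [('Customer B', 40), ('Customer A', 250)]
```

Because each real value maps to exactly one token and each token maps back to exactly one value, the join yields the same pairs it would on the original data.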

The following table describes the token types supported by Protegrity tokenization.

Table: Tokenization Types

Tokenization Type	Alphabet Characters	Comment
Numeric (0-9)	Digits 0 through 9
Integer	Digits 0 through 9	Data length: 2 bytes, 4 bytes, and 8 bytes
Credit Card	Digits 0 through 9	Special settings: Invalid LUHN digit, invalid card type, alphabetic indicator
Alpha (a-z, A-Z)	Lowercase letters a through z; uppercase letters A through Z
Upper-case Alpha (A-Z)	Uppercase letters A through Z	Lowercase characters are converted to uppercase in the tokenized output value.
Alpha-Numeric (0-9, a-z, A-Z)	Digits 0 through 9; lowercase letters a through z; uppercase letters A through Z
Upper-Case Alpha-Numeric (0-9, A-Z)	Digits 0 through 9; uppercase letters A through Z	Lowercase characters are converted to uppercase in the tokenized output value.
Lower ASCII	The lower part of the ASCII table: hex character codes 0x21 through 0x7E	Supports 94 printable characters (ASCII 33 (!) through 126 (~)); all other characters are treated as delimiters.
Datetime	YYYY-MM-DD HH:MM:SS	Special settings: Tokenize time, Distinguishable date, Date in clear
Decimal	Digits 0 through 9, sign, and . (decimal delimiter)	Numeric data with precision and scale. The token will not contain any zeros.
Unicode Gen2	Unicode code points between U+0020 and U+3FFFF	Token values are generated from a customized set of characters, called an alphabet.
Binary	Hex character codes from 0x00 to 0xFF
Email	Digits 0 through 9; lowercase letters a through z; uppercase letters A through Z; special characters, with restrictions. The @ sign and . (dot) are delimiters.	The domain part after the @ sign is not tokenized.
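
The "Invalid LUHN digit" setting listed for the Credit Card type refers to the Luhn checksum that real card numbers satisfy. The sketch below implements the standard Luhn check so that the idea of a token deliberately failing it is concrete. The function name `luhn_valid` is an assumption made for this example, and the table does not state how Protegrity produces or verifies such tokens.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    # Ignore spaces, dashes, and any other non-digit characters.
    digits = [int(ch) for ch in number if ch in "0123456789"]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # True  (well-known test number)
print(luhn_valid("4111 1111 1111 1112"))  # False (checksum no longer holds)
```

A token that fails this check cannot be mistaken for a valid card number by systems that validate the checksum, which is one way a token can be told apart from real data.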

The following table describes the deprecated token types supported by Protegrity tokenization.

Tokenization Type	Alphabet Characters	Comment
Printable	ASCII printable characters, which include letters, digits, punctuation marks, and miscellaneous symbols. Hex character codes from 0x20 to 0x7E, and from 0xA0 to 0xFF.	ISO 8859-15 Latin alphabet no. 9
Date (YYYY-MM-DD)	Date in big-endian form, starting with the year. The following separators are supported: . (dot), / (slash), - (dash).
Date (DD/MM/YYYY)	Date in little-endian form, starting with the day. The following separators are supported: . (dot), / (slash), - (dash).
Date (MM.DD.YYYY)	Date in middle-endian form, starting with the month. The following separators are supported: . (dot), / (slash), - (dash).
Unicode	UTF-8 text. Hex character codes from 0x00 to 0xFF	Result is Alpha-Numeric.
Unicode Base64	UTF-8 text. Hex character codes from 0x00 to 0xFF	Result is Alpha-Numeric, +, /, and =.

Last modified : January 20, 2026
