Tokenization Properties

The tokenization properties are specified when the data element is created.

Table: Common Tokenization Properties

Token Property	Description
User configured token properties
Name	Unique name identifying the token element. Maximum length is 56 characters.
Data Type	Type of data to tokenize. Name of the alphabet, which indicates the specific characters to tokenize.
Static Lookup Table (SLT) Tokenizers	Type of SLT tokenizer to use (SLT_1_3, SLT_1_6, SLT_2_3, SLT_2_6, SLT_6_DECIMAL, SLT_DATETIME, or SLT_X_1).
Preserve Case	Whether the case of letters, and the position of letters and numbers, must be preserved when tokenizing the value. This applies only when using the Alpha-Numeric (0-9, a-z, A-Z) token type with the SLT_2_3 tokenizer.
Preserve Position	Whether the position of letters and numbers must be preserved when tokenizing the value. This applies only when using the Alpha-Numeric (0-9, a-z, A-Z) token type with the SLT_2_3 tokenizer.
Preserve Length	Whether tokens have the same length as the input.
Allow Short Data Tokenization	Whether short input data is tokenized. The options are “Yes”, “No, generate error”, and “No, return input as it is”.
From Left	Number of characters, counted from the left, to keep in the clear in the tokenized output.
From Right	Number of characters, counted from the right, to keep in the clear in the tokenized output.
Minimum Input Length	Minimum length of the input data that can be tokenized.
Maximum Input Length	Maximum length of the input data that can be tokenized.
Alphabet	Name of the alphabet, which defines the specific set of characters used for tokenization.
Automatically calculated token properties
Internal Initialization Vector (IV)	Whether an internal initialization vector (IV) is used.
Other token properties
External Initialization Vector (IV)	Whether an external initialization vector (IV) is used.

The following table shows which properties can be set for each token type.

Table: Tokenization Properties for Token Types

Tokenization Data Type	Tokenizer	Preserve length	Preserve Case/ Preserve Position	Allow Short Tokens	From Left, From Right	Minimum/ Maximum length	External IV	Internal IV
Numeric	SLT_1_3, SLT_2_3, SLT_1_6, SLT_2_6	√	X	√	√	X	√	√
Integer	SLT_1_3	√	X	X	X	X	X	X
Credit Card	SLT_1_3, SLT_2_3, SLT_1_6, SLT_2_6	√ (always yes)	X	X	√	X	√	√
Alpha	SLT_1_3, SLT_2_3	√	X	√	√	X	√	√
Upper-case Alpha	SLT_1_3, SLT_2_3	√	X	√	√	X	√	√
Alpha-Numeric	SLT_1_3	√	X	√	√	X	√	√
Alpha-Numeric	SLT_2_3	√	√	√	√	X	√	√
Upper-Case Alpha-Numeric	SLT_1_3, SLT_2_3	√	X	√	√	X	√	√
Lower ASCII	SLT_1_3	√	X	√	√	X	√	√
Datetime	SLT_DATETIME	√ (always yes)	X	X	X (Left in clear = 0, Right in clear = 0)	X	X	X
Decimal	SLT_6_DECIMAL	X (always no)	X	X	X (Left in clear = 0, Right in clear = 0)	√	X	X
Unicode Gen2	SLT_1_3, SLT_X_1	√	X	√	√	X	√	√
Binary	SLT_1_3, SLT_2_3	X (always no)	X	X	√	X	√	√
Email	SLT_1_3, SLT_2_3	√	X	√	X (Left in clear = 0, Right in clear = 0)	X	√	X

X - The property is disabled and cannot be specified.
√ - The property is enabled or can be specified.
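
As an illustration of how the table can be read, the following sketch transcribes a few of its rows into a lookup structure and checks whether a property can be set for a given data type and tokenizer. The names COLUMNS, CAPABILITIES, and can_set are hypothetical and are not part of any product API; only the True/False values come from the table above.

```python
# A few rows of "Tokenization Properties for Token Types", transcribed by hand.
# True corresponds to the check mark (can be specified), False to X (disabled).
# All names here are illustrative assumptions, not a product API.
COLUMNS = (
    "preserve_length",
    "preserve_case_position",
    "allow_short_tokens",
    "from_left_right",
    "min_max_length",
    "external_iv",
    "internal_iv",
)

CAPABILITIES = {
    ("Numeric", "SLT_1_3"): (True, False, True, True, False, True, True),
    ("Integer", "SLT_1_3"): (True, False, False, False, False, False, False),
    ("Alpha-Numeric", "SLT_1_3"): (True, False, True, True, False, True, True),
    ("Alpha-Numeric", "SLT_2_3"): (True, True, True, True, False, True, True),
    ("Decimal", "SLT_6_DECIMAL"): (False, False, False, False, True, False, False),
}


def can_set(data_type, tokenizer, prop):
    """Return True if the table marks `prop` as settable for this combination."""
    row = CAPABILITIES[(data_type, tokenizer)]
    return row[COLUMNS.index(prop)]
```

For example, can_set("Alpha-Numeric", "SLT_2_3", "preserve_case_position") is true, while the same call with "SLT_1_3" is false, which matches the two Alpha-Numeric rows.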

The following table shows which properties can be set for the deprecated token types.

Table: Tokenization Properties for Deprecated Token Types

Tokenization Data Type	Tokenizer	Preserve length	Preserve Case/ Preserve Position	Allow Short Tokens	From Left, From Right	Minimum/ Maximum length	External IV	Internal IV
Printable	SLT_1_3	√	X	√	√	X	√	√
Date (YYYY-MM-DD)	SLT_1_3, SLT_2_3, SLT_1_6, SLT_2_6	√ (always yes)	X	X	X (Left in clear = 0, Right in clear = 0)	X	X	X
Date (DD/MM/YYYY)	SLT_1_3, SLT_2_3, SLT_1_6, SLT_2_6	√ (always yes)	X	X	X (Left in clear = 0, Right in clear = 0)	X	X	X
Date (MM.DD.YYYY)	SLT_1_3, SLT_2_3, SLT_1_6, SLT_2_6	√ (always yes)	X	X	X (Left in clear = 0, Right in clear = 0)	X	X	X
Unicode	SLT_1_3, SLT_2_3	X (always no)	X	√	X (Left in clear = 0, Right in clear = 0)	X	√	X
Unicode Base64	SLT_1_3, SLT_2_3	X (always no)	X	√	X (Left in clear = 0, Right in clear = 0)	X	√	X

X - The property is disabled and cannot be specified.
√ - The property is enabled or can be specified.

Data Type and Alphabet

The data type specifies the kind of data to be tokenized, including the characters to expect as input and the characters to generate as output.

Static Lookup Table (SLT) Tokenizers

An SLT tokenizer is a method that uses multiple static lookup tables (SLTs) to generate tokens.

From Left and From Right Settings

The From Left and From Right settings specify the number of characters to leave in the clear when tokenizing.
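
The effect of these two settings can be sketched as follows. This is a minimal illustration, not the product's implementation: the function names are hypothetical, the tokenize_middle callback is a stand-in for a real tokenizer, and rejecting inputs shorter than From Left plus From Right is an assumption made for the sketch.

```python
def split_clear(value, from_left, from_right):
    """Split value into (left kept in clear, middle to tokenize, right kept in clear)."""
    if from_left < 0 or from_right < 0:
        raise ValueError("from_left and from_right must be non-negative")
    if from_left + from_right > len(value):
        # Assumption for this sketch: such input is rejected.
        raise ValueError("input is shorter than from_left + from_right")
    end = len(value) - from_right
    return value[:from_left], value[from_left:end], value[end:]


def tokenize_with_clear(value, from_left, from_right, tokenize_middle):
    """Apply tokenize_middle only to the part that is not kept in the clear."""
    left, middle, right = split_clear(value, from_left, from_right)
    return left + tokenize_middle(middle) + right


# Placeholder "tokenizer" that masks characters so the clear parts are easy to see.
mask = lambda s: "#" * len(s)
result = tokenize_with_clear("4111111111111111", 6, 4, mask)
# result == "411111######1111"
```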

Internal Initialization Vector (IV)

An Internal IV is used during the tokenization process to make it more difficult to detect patterns in multiple tokenized values.

Minimum and Maximum Input Length

The minimum and maximum input lengths are the boundaries that are used in input validation.
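
A boundary check of this kind might look like the sketch below. The function name and the limits used in the example are hypothetical, and measuring length in characters rather than bytes is an assumption.

```python
def validate_length(value, min_len, max_len):
    """Return value unchanged if its length is within [min_len, max_len]."""
    n = len(value)  # assumption: length is counted in characters
    if n < min_len:
        raise ValueError(f"input length {n} is below the minimum {min_len}")
    if n > max_len:
        raise ValueError(f"input length {n} is above the maximum {max_len}")
    return value


# Hypothetical limits chosen only for illustration.
checked = validate_length("123.45", 1, 10)
```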

Length Preserving

The length-preserving tokenization property lets you generate tokens that have the same length as the input data.

Short Data Tokenization

Data is considered short when the number of tokenizable characters is below the tokenizer’s limit. Because short input generally produces weaker tokens, the behavior for short input data can be configured.
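
The three options for the Allow Short Data Tokenization property can be sketched as a small dispatch. The enum, the function name, and the min_tokenizable parameter are hypothetical; the sketch also assumes that every character of the input counts as tokenizable and uses a placeholder in place of a real tokenizer.

```python
from enum import Enum


class ShortDataPolicy(Enum):
    YES = "Yes"
    ERROR = "No, generate error"
    PASSTHROUGH = "No, return input as it is"


def handle_short(value, min_tokenizable, policy, tokenize):
    """Tokenize value, applying the configured policy when the input is short."""
    if len(value) >= min_tokenizable or policy is ShortDataPolicy.YES:
        return tokenize(value)
    if policy is ShortDataPolicy.PASSTHROUGH:
        return value
    raise ValueError("input is too short to tokenize")


# Placeholder tokenizer for illustration only: reverses the string.
reverse = lambda s: s[::-1]
```

With a hypothetical limit of 3, handle_short("ab", 3, ShortDataPolicy.PASSTHROUGH, reverse) returns "ab" unchanged, whereas ShortDataPolicy.YES returns the placeholder token "ba".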

Case-Preserving and Position-Preserving Tokenization

If you work with the Alpha-Numeric (0-9, a-z, A-Z) token type and SLT_2_3 tokenizer, you can specify additional tokenization options for case preservation and position preservation.
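
To make the two properties concrete, the sketch below produces an output in which digits stay digits and letters stay letters at every position (position preservation), and upper-case and lower-case letters keep their case (case preservation). It uses a fixed rotation purely as a stand-in: this is not how an SLT tokenizer works and it offers no security. All function names are hypothetical.

```python
import string


def _rotate(ch, alphabet, shift):
    """Replace ch with the character `shift` places later in the same alphabet."""
    return alphabet[(alphabet.index(ch) + shift) % len(alphabet)]


def toy_preserving_token(value, shift=7):
    """Toy substitution that keeps each character in its own class."""
    out = []
    for ch in value:
        if ch in string.digits:
            out.append(_rotate(ch, string.digits, shift))
        elif ch in string.ascii_lowercase:
            out.append(_rotate(ch, string.ascii_lowercase, shift))
        elif ch in string.ascii_uppercase:
            out.append(_rotate(ch, string.ascii_uppercase, shift))
        else:
            raise ValueError(f"character {ch!r} is outside 0-9, a-z, A-Z")
    return "".join(out)


def same_shape(a, b):
    """True if b has the same character class as a at every position."""
    def kind(c):
        if c in string.digits:
            return "digit"
        if c in string.ascii_lowercase:
            return "lower"
        if c in string.ascii_uppercase:
            return "upper"
        return "other"
    return len(a) == len(b) and all(kind(x) == kind(y) for x, y in zip(a, b))


token = toy_preserving_token("Ab12cD")
# token == "Hi89jK": same length, same letter/digit positions, same case pattern
```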

External Initialization Vector (EIV)

The External Initialization Vector (EIV) feature offers an additional level of security. It allows for different tokenized results across protectors for the same input data and token element. The tokenized results are based on the External IV setting on each protector.
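
The idea that the same input yields different tokens under different External IV settings can be illustrated with a toy example. The sketch below derives per-position digit shifts from an HMAC of the IV; this construction is an assumption made for illustration only, is not the product's algorithm, and should not be used to protect data. The function name, key, and IV values are hypothetical.

```python
import hashlib
import hmac


def toy_tokenize_digits(value, key, external_iv=b""):
    """Toy digit substitution whose output depends on both key and external_iv."""
    if not value or any(c not in "0123456789" for c in value):
        raise ValueError("value must be a non-empty string of digits 0-9")
    # The IV is mixed into the keyed digest, so changing it changes the shifts.
    digest = hmac.new(key, external_iv, hashlib.sha256).digest()
    out = []
    for i, ch in enumerate(value):
        shift = digest[i % len(digest)] % 10
        out.append(str((int(ch) + shift) % 10))
    return "".join(out)


key = b"demo-key"  # hypothetical key for the sketch
token_a = toy_tokenize_digits("1234567890123456", key, b"protector-A")
token_b = toy_tokenize_digits("1234567890123456", key, b"protector-B")
```

In this sketch, repeating the call with the same IV reproduces the same token, while the two protectors, configured with different IVs, obtain different tokens for the same input.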

Truncating Whitespaces

Truncating whitespace ensures that only the actual data is considered during tokenization.

Last modified: January 20, 2026