Tokenization Properties

The tokenization properties are specified when the data element is created.

Table: Common Tokenization Properties

Token Property | Description

User-configured token properties
Name | Unique name identifying the token element. Maximum length is 56 characters.
Data Type | Type of data to tokenize. Name of the alphabet, which indicates the specific characters to tokenize.
Static Lookup Table (SLT) Tokenizers | Type of SLT tokenizer to use (SLT_1_3, SLT_1_6, SLT_2_3, SLT_2_6, SLT_6_DECIMAL, SLT_DATETIME, or SLT_X_1).
Preserve Case | Whether the case of alphabetic characters, and the position of alphabetic characters and digits, must be preserved when tokenizing the value. Applicable only when using the Alpha-Numeric (0-9, a-z, A-Z) token type with the SLT_2_3 tokenizer.
Preserve Position | Whether the position of alphabetic characters and digits must be preserved when tokenizing the value. Applicable only when using the Alpha-Numeric (0-9, a-z, A-Z) token type with the SLT_2_3 tokenizer.
Preserve Length | Whether tokens are the same length as the input.
Allow Short Data Tokenization | Whether short input data can be tokenized. The available options are “Yes”, “No, generate error”, and “No, return input as it is”.
From Left | Number of characters, counted from the left, to keep in the clear in the tokenized output.
From Right | Number of characters, counted from the right, to keep in the clear in the tokenized output.
Minimum Input Length | Minimum length of the input data that can be tokenized.
Maximum Input Length | Maximum length of the input data that can be tokenized.
Alphabet | Name of the alphabet, which is configured to enable a specific set of characters for tokenization.

Automatically calculated token properties
Internal Initialization Vector (IV) | Whether an internal initialization vector (IV) is used.

Other token properties
External Initialization Vector (IV) | Whether an external initialization vector (IV) is used.
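
As an illustration of how the user-configured properties fit together, the following is a minimal sketch of a container that validates a few of the constraints stated in the table. The class name, field names, and default values are hypothetical and chosen for illustration only; they are not part of any product API.

```python
from dataclasses import dataclass
from typing import Optional

MAX_NAME_LENGTH = 56  # from the table: maximum length of the Name property

# The three options listed for Allow Short Data Tokenization.
SHORT_DATA_OPTIONS = ("Yes", "No, generate error", "No, return input as it is")


@dataclass(frozen=True)
class TokenElementProperties:
    """Hypothetical container for user-configured token properties."""
    name: str
    data_type: str
    tokenizer: str
    preserve_length: bool = True          # default is an assumption
    preserve_case: bool = False
    preserve_position: bool = False
    allow_short_data: str = "No, generate error"  # default is an assumption
    from_left: int = 0
    from_right: int = 0
    min_input_length: Optional[int] = None
    max_input_length: Optional[int] = None

    def __post_init__(self) -> None:
        if not self.name or len(self.name) > MAX_NAME_LENGTH:
            raise ValueError("Name must be 1 to 56 characters long")
        if self.allow_short_data not in SHORT_DATA_OPTIONS:
            raise ValueError("Unknown Allow Short Data Tokenization option")
        if self.from_left < 0 or self.from_right < 0:
            raise ValueError("From Left and From Right must be non-negative")
        # Preserve Case / Preserve Position apply only to Alpha-Numeric with SLT_2_3.
        wants_preserve = self.preserve_case or self.preserve_position
        supported = self.data_type == "Alpha-Numeric" and self.tokenizer == "SLT_2_3"
        if wants_preserve and not supported:
            raise ValueError(
                "Preserve Case/Position require Alpha-Numeric with SLT_2_3"
            )


# Example with a hypothetical element name.
props = TokenElementProperties(
    name="TE_NUMERIC_EXAMPLE", data_type="Numeric", tokenizer="SLT_1_3"
)
```

The sketch rejects invalid combinations at construction time, which mirrors the statement that the properties are specified when the data element is created.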

The following table shows which properties can be set for each token type.

Table: Tokenization Properties for Token Types

Tokenization Data Type | Tokenizer | Preserve Length | Preserve Case / Preserve Position | Allow Short Tokens | From Left, From Right | Minimum / Maximum Length | External IV | Internal IV
NumericSLT_1_3,
SLT_2_3,
SLT_1_6,
SLT_2_6
XX
IntegerSLT_1_3XXXXXX
Credit CardSLT_1_3,
SLT_2_3,
SLT_1_6,
SLT_2_6

(always yes)
XXX
AlphaSLT_1_3,
SLT_2_3
XX
Upper-case AlphaSLT_1_3,
SLT_2_3
XX
Alpha-NumericSLT_1_3XX
SLT_2_3X
Upper-Case Alpha-NumericSLT_1_3,
SLT_2_3
XX
Lower ASCIISLT_1_3XX
DatetimeSLT_DATETIME
(always yes)
XXX (Left in clear = 0, Right in clear = 0)XXX
DecimalSLT_6_DECIMALX
(always no)
XXX (Left in clear = 0, Right in clear = 0)XX
Unicode Gen2SLT_1_3,
SLT_X_1
XX
BinarySLT_1_3,
SLT_2_3
X
(always no)
XXX
EmailSLT_1_3,
SLT_2_3
XX (Left in clear = 0, Right in clear = 0)XX
  • X - means that the property is disabled and cannot be specified.
  • √ - means that the property is enabled and can be specified.

The following table shows which properties can be set for each deprecated token type.

Table: Tokenization Properties for Deprecated Token Types

Tokenization Data Type | Tokenizer | Preserve Length | Preserve Case / Preserve Position | Allow Short Tokens | From Left, From Right | Minimum / Maximum Length | External IV | Internal IV
PrintableSLT_1_3XX
Date (YYYY-MM-DD)SLT_1_3,
SLT_2_3,
SLT_1_6,
SLT_2_6

(always yes)
XXX (Left in clear = 0, Right in clear = 0)XXX
Date (DD/MM/YYYY)SLT_1_3,
SLT_2_3,
SLT_1_6,
SLT_2_6

(always yes)
XXX (Left in clear = 0, Right in clear = 0)XXX
Date (MM.DD.YYYY)SLT_1_3,
SLT_2_3,
SLT_1_6,
SLT_2_6

(always yes)
XXX (Left in clear = 0, Right in clear = 0)XXX
UnicodeSLT_1_3,
SLT_2_3
X
(always no)
XX (Left in clear = 0, Right in clear = 0)XX
Unicode Base64SLT_1_3,
SLT_2_3
X
(always no)
XX (Left in clear = 0, Right in clear = 0)XX
  • X - means that the property is disabled and cannot be specified.
  • √ - means that the property is enabled and can be specified.

Data Type and Alphabet

The data type specifies the kind of data to tokenize: which characters to expect as input and which characters to generate as output.

Static Lookup Table (SLT) Tokenizers

An SLT tokenizer is a tokenization method that uses multiple static lookup tables (SLTs) to generate tokens.

From Left and From Right Settings

The From Left and From Right settings can be configured to specify the number of characters to leave in clear while tokenizing.
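
The effect of these two settings can be sketched as a split of the input into a left part kept in the clear, a middle part that would be tokenized, and a right part kept in the clear. The function name is hypothetical, and raising an error when the two clear regions overlap is an assumption; the actual behavior for that case is not described here.

```python
from typing import Tuple


def split_clear_and_tokenizable(
    value: str, from_left: int, from_right: int
) -> Tuple[str, str, str]:
    """Return (left_clear, tokenizable_middle, right_clear) for a value."""
    if from_left < 0 or from_right < 0:
        raise ValueError("From Left and From Right must be non-negative")
    if from_left + from_right > len(value):
        # Assumption: overlapping clear regions are treated as an error.
        raise ValueError("Clear regions are longer than the input")
    end = len(value) - from_right
    return value[:from_left], value[from_left:end], value[end:]


# Keep the last four characters in the clear; only the first twelve
# characters would be passed to the tokenizer.
left, middle, right = split_clear_and_tokenizable(
    "4111111111111111", from_left=0, from_right=4
)
print(repr(left), repr(middle), repr(right))
# prints: '' '411111111111' '1111'
```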

Internal Initialization Vector (IV)

An Internal IV is used during the tokenization process to make it more difficult to detect patterns in multiple tokenized values.

Minimum and Maximum Input Length

The minimum and maximum input lengths are the boundaries that are used in input validation.
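
A minimal sketch of this kind of boundary check follows. It assumes that length is counted in characters and that both boundaries are inclusive; the function name is hypothetical, and the product may count length differently (for example, in bytes).

```python
def is_within_length_bounds(value: str, min_length: int, max_length: int) -> bool:
    """Return True if len(value) lies within [min_length, max_length]."""
    if min_length < 0 or min_length > max_length:
        raise ValueError("Bounds must satisfy 0 <= min_length <= max_length")
    # Assumption: both boundaries are inclusive.
    return min_length <= len(value) <= max_length


print(is_within_length_bounds("12345", min_length=3, max_length=10))  # prints: True
print(is_within_length_bounds("12", min_length=3, max_length=10))     # prints: False
```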

Length Preserving

The Preserve Length property controls whether generated tokens have the same length as the input data.

Short Data Tokenization

Data is considered short when the number of tokenizable characters is below the tokenizer’s limit. The behavior for short input data can be configured, as it generally produces weaker tokens.
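
The three configurable behaviors for short input (“Yes”, “No, generate error”, and “No, return input as it is”) can be sketched as a simple dispatch. All names below are hypothetical, the tokenizer limit is passed in as a parameter because its value depends on the tokenizer, and `placeholder_tokenize` is a stand-in that only marks where real tokenization would happen.

```python
from enum import Enum
from typing import Callable


class ShortDataPolicy(Enum):
    YES = "Yes"
    NO_ERROR = "No, generate error"
    NO_RETURN_INPUT = "No, return input as it is"


def handle_short_data(
    value: str,
    tokenizable_count: int,
    tokenizer_limit: int,
    policy: ShortDataPolicy,
    tokenize: Callable[[str], str],
) -> str:
    """Apply the configured short-data policy before tokenizing."""
    if tokenizable_count >= tokenizer_limit:
        return tokenize(value)          # not short: tokenize normally
    if policy is ShortDataPolicy.YES:
        return tokenize(value)          # short data allowed (weaker token)
    if policy is ShortDataPolicy.NO_RETURN_INPUT:
        return value                    # pass the input through unchanged
    raise ValueError("Input is too short to tokenize")


def placeholder_tokenize(value: str) -> str:
    """Stand-in only; NOT a real tokenizer."""
    return "#" * len(value)


print(handle_short_data("ab", 2, 3, ShortDataPolicy.NO_RETURN_INPUT,
                        placeholder_tokenize))  # prints: ab
```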

Case-Preserving and Position-Preserving Tokenization

If you work with the Alpha-Numeric (0-9, a-z, A-Z) token type and SLT_2_3 tokenizer, you can specify additional tokenization options for case preservation and position preservation.
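
To make the two options concrete, the sketch below checks whether a given token preserves position (digits stay digits and letters stay letters at each index) and whether it also preserves case. It does not generate tokens. The function names are hypothetical, and the interpretation that case preservation includes position preservation and equal length is an assumption based on the property descriptions.

```python
import string


def _char_class(ch: str, with_case: bool) -> str:
    """Classify a character as digit, letter (optionally by case), or other."""
    if ch in string.digits:
        return "digit"
    if ch in string.ascii_letters:
        if with_case:
            return "upper" if ch in string.ascii_uppercase else "lower"
        return "alpha"
    return "other"


def preserves_position(original: str, token: str) -> bool:
    """True if letters and digits occupy the same positions in both strings."""
    return len(original) == len(token) and all(
        _char_class(a, False) == _char_class(b, False)
        for a, b in zip(original, token)
    )


def preserves_case(original: str, token: str) -> bool:
    """True if positions match and each letter keeps its upper or lower case."""
    return len(original) == len(token) and all(
        _char_class(a, True) == _char_class(b, True)
        for a, b in zip(original, token)
    )


# Illustrative values only; these tokens were not produced by a tokenizer.
print(preserves_position("Ab12cD", "xY98Zq"))  # prints: True
print(preserves_case("Ab12cD", "xY98Zq"))      # prints: False
print(preserves_case("Ab12cD", "Xy98zQ"))      # prints: True
```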

External Initialization Vector (EIV)

The External Initialization Vector (EIV) feature offers an additional level of security. It allows for different tokenized results across protectors for the same input data and token element. The tokenized results are based on the External IV setting on each protector.

Truncating Whitespaces

Truncating Whitespaces ensures that only the actual data is considered during tokenization.


Last modified : January 20, 2026