Encoding List

This section lists the encoding methods available in DSG ruleset configuration.

The following rules use the listed encodings:

  • Binary
  • HTML Form Media Type
  • Text
  • ProtegrityDataProtection

Standard encoding list

Method           Description
ascii            English (646, us-ascii)
base64           Base64 multiline MIME conversion (the result always includes a trailing ‘\n’)
big5             Traditional Chinese (big5-tw, csbig5)
big5hkscs        Traditional Chinese (big5-hkscs, hkscs)
bz2              Compression using bz2
cp037            English (IBM037, IBM039)
cp424            Hebrew (EBCDIC-CP-HE, IBM424)
cp437            English (437, IBM437)
cp500            Western Europe (EBCDIC-CP-BE, EBCDIC-CP-CH, IBM500)
cp720            Arabic (cp720)
cp737            Greek (cp737)
cp775            Baltic languages (IBM775)
cp850            Western Europe (850, IBM850)
cp852            Central and Eastern Europe (852, IBM852)
cp855            Bulgarian, Byelorussian, Macedonian, Russian, Serbian (855, IBM855)
cp856            Hebrew (cp856)
cp857            Turkish (857, IBM857)
cp858            Western Europe (858, IBM858)
cp860            Portuguese (860, IBM860)
cp861            Icelandic (861, CP-IS, IBM861)
cp862            Hebrew (862, IBM862)
cp863            Canadian (863, IBM863)
cp864            Arabic (IBM864)
cp865            Danish, Norwegian (865, IBM865)
cp866            Russian (866, IBM866)
cp869            Greek (869, CP-GR, IBM869)
cp874            Thai (cp874)
cp875            Greek (cp875)
cp932            Japanese (932, ms932, mskanji, ms-kanji)
cp949            Korean (949, ms949, uhc)
cp950            Traditional Chinese (950, ms950)
cp1006           Urdu (cp1006)
cp1026           Turkish (ibm1026)
cp1140           Western Europe (ibm1140)
cp1250           Central and Eastern Europe (windows-1250)
cp1251           Bulgarian, Byelorussian, Macedonian, Russian, Serbian (windows-1251)
cp1252           Western Europe (windows-1252)
cp1253           Greek (windows-1253)
cp1254           Turkish (windows-1254)
cp1255           Hebrew (windows-1255)
cp1256           Arabic (windows-1256)
cp1257           Baltic languages (windows-1257)
cp1258           Vietnamese (windows-1258)
euc_jp           Japanese (eucjp, ujis, u-jis)
euc_jis_2004     Japanese (jisx0213, eucjis2004)
euc_jisx0213     Japanese (eucjisx0213)
euc_kr           Korean (euckr, korean, ksc5601, ks_c-5601, ks_c-5601-1987, ksx1001, ks_x-1001)
gb2312           Simplified Chinese (chinese, csiso58gb231280, euc-cn, euccn, eucgb2312-cn, gb2312-1980, gb2312-80, iso-ir-58)
gbk              Unified Chinese (936, cp936, ms936)
gb18030          Unified Chinese (gb18030-2000)
hex              Hexadecimal representation conversion (two digits per byte)
hz               Simplified Chinese (hzgb, hz-gb, hz-gb-2312)
iso2022_jp       Japanese (csiso2022jp, iso2022jp, iso-2022-jp)
iso2022_jp_1     Japanese (iso2022jp-1, iso-2022-jp-1)
iso2022_jp_2     Japanese, Korean, Simplified Chinese, Western Europe, Greek (iso2022jp-2, iso-2022-jp-2)
iso2022_jp_2004  Japanese (iso2022jp-2004, iso-2022-jp-2004)
iso2022_jp_3     Japanese (iso2022jp-3, iso-2022-jp-3)
iso2022_jp_ext   Japanese (iso2022jp-ext, iso-2022-jp-ext)
iso2022_kr       Korean (csiso2022kr, iso2022kr, iso-2022-kr)
latin_1          Western Europe (iso-8859-1, iso8859-1, 8859, cp819, latin, latin1, L1)
iso8859_2        Central and Eastern Europe (iso-8859-2, latin2, L2)
iso8859_3        Esperanto, Maltese (iso-8859-3, latin3, L3)
iso8859_4        Baltic languages (iso-8859-4, latin4, L4)
iso8859_5        Bulgarian, Byelorussian, Macedonian, Russian, Serbian (iso-8859-5, cyrillic)
iso8859_6        Arabic (iso-8859-6, arabic)
iso8859_7        Greek (iso-8859-7, greek, greek8)
iso8859_8        Hebrew (iso-8859-8, hebrew)
iso8859_9        Turkish (iso-8859-9, latin5, L5)
iso8859_10       Nordic languages (iso-8859-10, latin6, L6)
iso8859_11       Thai languages (iso-8859-11, thai)
iso8859_13       Baltic languages (iso-8859-13, latin7, L7)
iso8859_14       Celtic languages (iso-8859-14, latin8, L8)
iso8859_15       Western Europe (iso-8859-15, latin9, L9)
iso8859_16       South-Eastern Europe (iso-8859-16, latin10, L10)
johab            Korean (cp1361, ms1361)
koi8_r           Russian
koi8_u           Ukrainian
mac_cyrillic     Bulgarian, Byelorussian, Macedonian, Russian, Serbian (maccyrillic)
mac_greek        Greek (macgreek)
mac_iceland      Icelandic (maciceland)
mac_latin2       Central and Eastern Europe (maclatin2, maccentraleurope)
mac_roman        Western Europe (macroman)
mac_turkish      Turkish (macturkish)
ptcp154          Kazakh (csptcp154, pt154, cp154, cyrillic-asian)
shift_jis        Japanese (csshiftjis, shiftjis, sjis, s_jis)
shift_jis_2004   Japanese (shiftjis2004, sjis_2004, sjis2004)
shift_jisx0213   Japanese (shiftjisx0213, sjisx0213, s_jisx0213)
utf_32           Unicode Transformation Format (U32, utf32)
utf_32_be        Unicode Transformation Format (big endian)
utf_32_le        Unicode Transformation Format (little endian)
utf_16           Unicode Transformation Format (U16, utf16)
utf_16_be        Unicode Transformation Format (big endian, BMP only)
utf_16_le        Unicode Transformation Format (little endian, BMP only)
utf_7            Unicode Transformation Format (U7, unicode-1-1-utf-7)
utf_8            Unicode Transformation Format (U8, UTF, utf8)
utf_8_sig        Unicode Transformation Format (with BOM signature)
zlib             Gzip compression (zip)
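
The method names in the standard list coincide with the names of Python's standard codecs. As a minimal sketch, assuming the DSG methods behave like those codecs (an assumption; this page does not state how DSG implements them), the following Python 3 snippet illustrates what a few of the entries do. The helper canonical_name and all variable names are hypothetical and used only for illustration.

```python
import codecs


def canonical_name(alias: str) -> str:
    """Resolve an alias (for example 'latin1') to Python's canonical codec name."""
    return codecs.lookup(alias).name


# Aliases in the Description column resolve to the same codec.
same_codec = canonical_name("latin1") == canonical_name("iso-8859-1")

# Text encodings: the same character maps to different bytes per method.
text = "é"
utf8_bytes = text.encode("utf_8")        # two bytes in UTF-8
cp1252_bytes = text.encode("cp1252")     # one byte in windows-1252
bom_bytes = text.encode("utf_8_sig")     # UTF-8 prefixed with a BOM signature

# Binary transforms operate on bytes rather than text.
b64 = codecs.encode(b"hello", "base64")  # result ends with a trailing newline
hexed = codecs.encode(b"hello", "hex")   # two hex digits per input byte

# Compression methods are reversible: decode restores the original bytes.
compressed = codecs.encode(b"hello", "zlib")
round_trip = codecs.decode(compressed, "zlib")
```

Note that the text encodings convert between characters and bytes, while base64, hex, bz2, and zlib transform bytes into other bytes, so choosing the wrong kind for a payload typically produces an error or unreadable output rather than a silent conversion.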

External encoding list

  • Base64
  • HTML Encoding
  • JSON Escape
  • URI Encoding
  • URI Encoding Plus
  • XML Encoding
  • Quoted Printable
  • SQL Escape
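
This page does not define the external encodings. As a rough illustration only, the sketch below shows what similarly named transforms conventionally produce, using Python's standard library. It assumes that "URI Encoding Plus" refers to the form-style variant that writes spaces as "+", which is a guess based on the name; DSG's actual output may differ. SQL Escape is omitted because it has no standard-library equivalent.

```python
import html
import json
import quopri
from urllib.parse import quote, quote_plus
from xml.sax.saxutils import escape as xml_escape

sample = "a b&c=<d>"

# URI encoding: reserved characters become %XX; a space becomes %20.
uri = quote(sample, safe="")

# Assumed meaning of "URI Encoding Plus": same, but a space becomes '+'.
uri_plus = quote_plus(sample)

# HTML and XML encoding replace markup characters with entities.
html_enc = html.escape(sample)
xml_enc = xml_escape(sample)

# JSON escape: quotes and control characters are backslash-escaped.
json_esc = json.dumps('say "hi"\n')

# Quoted-printable: non-ASCII bytes are written as =XX.
qp = quopri.encodestring("é".encode("utf_8"))
```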

Proprietary

  • Base128
  • Unicode
  • CJK
  • High ASCII
Last modified February 7, 2025