PySpark - Scala Wrapper UDFs

This section lists the Spark Scala Wrapper UDFs that Big Data Protector provides for protecting, unprotecting, and reprotecting data when building secure Big Data applications.

For each Spark SQL UDF listed in Spark SQL UDFs, a Scala UDF wrapper class is provided so that the UDF can be registered in PySpark and invoked using the spark.sql() method.
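The examples on this page follow one naming pattern: a wrapper name that ends in ScalaWrapper is registered against a class of the same base name in the com.protegrity.spark.wrapper package. The following is a minimal sketch of that pattern, not part of the product. The helper names wrapper_class and register_wrappers are hypothetical, the mapping covers only a few of the UDFs, and the return types are passed as DDL type strings, which registerJavaFunction accepts in place of pyspark.sql.types objects.

```python
WRAPPER_PACKAGE = "com.protegrity.spark.wrapper"
WRAPPER_SUFFIX = "ScalaWrapper"

# Wrapper name -> return type as a DDL type string.
# None means no return type is passed, as in the version and user examples.
RETURN_TYPES = {
    "ptyGetVersionScalaWrapper": None,
    "ptyWhoAmIScalaWrapper": None,
    "ptyProtectStrScalaWrapper": "string",
    "ptyProtectIntScalaWrapper": "int",
    "ptyProtectLongScalaWrapper": "bigint",
}


def wrapper_class(wrapper_name):
    """Derive the wrapper class name by dropping the ScalaWrapper suffix."""
    if not wrapper_name.endswith(WRAPPER_SUFFIX):
        raise ValueError(f"not a Scala wrapper UDF name: {wrapper_name}")
    base_name = wrapper_name[: -len(WRAPPER_SUFFIX)]
    return f"{WRAPPER_PACKAGE}.{base_name}"


def register_wrappers(spark, return_types=RETURN_TYPES):
    """Register every wrapper in the mapping on an existing SparkSession."""
    for name, return_type in return_types.items():
        if return_type is None:
            spark.udf.registerJavaFunction(name, wrapper_class(name))
        else:
            spark.udf.registerJavaFunction(name, wrapper_class(name), return_type)
```

Calling register_wrappers(spark) once at the start of a session would make the listed wrappers available to spark.sql(), assuming the wrapper classes are on the Spark classpath.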

ptyGetVersionScalaWrapper()

The UDF returns the current version of the protector.

Signature:

ptyGetVersionScalaWrapper()

Parameters:

  • None

Result:

  • The UDF returns the current version of the protector.

Example:

spark.udf.registerJavaFunction("ptyGetVersionScalaWrapper", "com.protegrity.spark.wrapper.ptyGetVersion")
spark.sql("select ptyGetVersionScalaWrapper()").show(truncate = False)

ptyGetVersionExtendedScalaWrapper()

The UDF returns the extended version information of the protector.

Signature:

ptyGetVersionExtendedScalaWrapper()

Parameters:

  • None

Result:

  • The UDF returns a String in the following format:
    "BDP: <1>; JcoreLite: <2>; CORE: <3>;"
    
    where,
      1. Is the current version of the Protector.
      2. Is the JcoreLite library version.
      3. Is the Core library version.
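The returned string can be split into its three components on the client side. The following is a minimal sketch that assumes the format shown above. The function name parse_extended_version is hypothetical and the version numbers in the sample are made up.

```python
def parse_extended_version(text):
    """Split 'BDP: <1>; JcoreLite: <2>; CORE: <3>;' into a dict keyed by component."""
    parts = {}
    for field in text.split(";"):
        field = field.strip()
        if not field:
            continue  # skip the empty piece after the trailing semicolon
        name, _, value = field.partition(":")
        parts[name.strip()] = value.strip()
    return parts


# Made-up sample value, used only to show the shape of the result.
sample = "BDP: 1.0.0; JcoreLite: 2.0.0; CORE: 3.0.0;"
info = parse_extended_version(sample)
print(info["BDP"], info["JcoreLite"], info["CORE"])
```

In a PySpark session, the input string could be taken from spark.sql("select ptyGetVersionExtendedScalaWrapper()").first()[0] once the UDF is registered.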

Example:

spark.udf.registerJavaFunction("ptyGetVersionExtendedScalaWrapper","com.protegrity.spark.wrapper.ptyGetVersionExtended")
spark.sql("select ptyGetVersionExtendedScalaWrapper()").show(truncate = False)

ptyWhoAmIScalaWrapper()

The UDF returns the currently logged-in user.

Signature:

ptyWhoAmIScalaWrapper()

Parameters:

  • None

Result:

  • The UDF returns the currently logged-in user.

Example:

spark.udf.registerJavaFunction("ptyWhoAmIScalaWrapper", "com.protegrity.spark.wrapper.ptyWhoAmI")
spark.sql("select ptyWhoAmIScalaWrapper()").show(truncate = False)

ptyProtectStrScalaWrapper()

The UDF protects the string format data that is provided as an input.

Note: For Date and Datetime data elements, the protect API returns an invalid input data error if the input value falls within the non-existent date range from 05-OCT-1582 to 14-OCT-1582 of the Gregorian Calendar.
For more information about the tokenization and de-tokenization of the cutover dates of the Proleptic Gregorian Calendar, refer to Date and Datetime tokenization.
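Input values can be screened for this date range before they are passed to the UDF. The following is a minimal sketch, assuming that both end dates are part of the non-existent range and that the input strings are in the ISO yyyy-mm-dd form. The function name in_cutover_gap is hypothetical.

```python
from datetime import date

# Non-existent Gregorian Calendar dates named in the note (treated as inclusive).
CUTOVER_GAP_START = date(1582, 10, 5)
CUTOVER_GAP_END = date(1582, 10, 14)


def in_cutover_gap(value):
    """Return True if the date falls within 05-OCT-1582 to 14-OCT-1582."""
    return CUTOVER_GAP_START <= value <= CUTOVER_GAP_END


# Assumes ISO-formatted input; other formats need their own parsing step.
candidates = ["1582-10-04", "1582-10-10", "1582-10-15"]
rejected = [text for text in candidates if in_cutover_gap(date.fromisoformat(text))]
print(rejected)
```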

Signature:

ptyProtectStrScalaWrapper(String colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the string format to protect.
  • dataElement: Specifies the data element to protect the string format data.

Result:

  • The UDF returns the protected data in the string format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyProtectStrScalaWrapper", "com.protegrity.spark.wrapper.ptyProtectStr", StringType())
spark.sql("select ptyProtectStrScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)
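A registered wrapper can also be applied to a DataFrame column through a SQL expression instead of a full SELECT statement. The following is a minimal sketch, assuming that the wrapper is already registered as in the example above. The helper names protect_expr and with_protected_column and the _protected column suffix are hypothetical; expr is the standard function from pyspark.sql.functions.

```python
def protect_expr(column, data_element, wrapper="ptyProtectStrScalaWrapper"):
    """Build the SQL expression that calls a registered wrapper UDF on a column."""
    if "'" in data_element:
        # Keep the sketch simple: refuse names that would break the string literal.
        raise ValueError("data element name must not contain a single quote")
    return f"{wrapper}({column}, '{data_element}')"


def with_protected_column(df, column, data_element):
    """Add a '<column>_protected' column to a DataFrame using the wrapper UDF."""
    # Imported here so that protect_expr can be used without Spark installed.
    from pyspark.sql.functions import expr

    return df.withColumn(f"{column}_protected", expr(protect_expr(column, data_element)))


expression = protect_expr("column1", "Data_Element")
print(expression)
```

The same expression builder works for the other two-argument wrappers on this page by passing a different wrapper name.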

ptyProtectUnicodeScalaWrapper()

The UDF protects the string (Unicode) format data, which is provided as an input.

Warning: Use this UDF only if you want to tokenize the Unicode data in PySpark, migrate the tokenized data from PySpark to a Teradata database, and detokenize the data using the Protegrity Database Protector. Ensure that you use this UDF with a Unicode tokenization data element only.

Signature:

ptyProtectUnicodeScalaWrapper(String colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the string (Unicode) format to protect.
  • dataElement: Specifies the data element to protect the string (Unicode) format data.

Result:

  • The UDF returns the protected data in the string format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyProtectUnicodeScalaWrapper", "com.protegrity.spark.wrapper.ptyProtectUnicode", StringType())
spark.sql("select ptyProtectUnicodeScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyProtectIntScalaWrapper()

The UDF protects the integer format data, which is provided as an input.

Signature:

ptyProtectIntScalaWrapper(Int colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the integer format to protect.
  • dataElement: Specifies the data element to protect the integer format data.

Result:

  • The UDF returns the protected data in the integer format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyProtectIntScalaWrapper", "com.protegrity.spark.wrapper.ptyProtectInt", IntegerType())
spark.sql("select ptyProtectIntScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyProtectShortScalaWrapper()

The UDF protects the short format data, which is provided as an input.

Signature:

ptyProtectShortScalaWrapper(Short colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the short format to protect.
  • dataElement: Specifies the data element to protect the short format data.

Result:

  • The UDF returns the protected data in the short format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyProtectShortScalaWrapper", "com.protegrity.spark.wrapper.ptyProtectShort", ShortType())
spark.sql("select ptyProtectShortScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyProtectLongScalaWrapper()

The UDF protects the long format data, which is provided as an input.

Signature:

ptyProtectLongScalaWrapper(Long colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the long format to protect.
  • dataElement: Specifies the data element to protect the long format data.

Result:

  • The UDF returns the protected data in the long format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyProtectLongScalaWrapper", "com.protegrity.spark.wrapper.ptyProtectLong", LongType())
spark.sql("select ptyProtectLongScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyProtectDateScalaWrapper()

The UDF protects the date format data, which is provided as an input.

Signature:

ptyProtectDateScalaWrapper(Date colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the date format to protect.
  • dataElement: Specifies the data element to protect the date format data.

Result:

  • The UDF returns the protected data in the date format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyProtectDateScalaWrapper", "com.protegrity.spark.wrapper.ptyProtectDate", DateType())
spark.sql("select ptyProtectDateScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyProtectDateTimeScalaWrapper()

The UDF protects the timestamp format data, which is provided as an input.

Signature:

ptyProtectDateTimeScalaWrapper(Timestamp colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the timestamp format to protect.
  • dataElement: Specifies the data element to protect the timestamp format data.

Result:

  • The UDF returns the protected data in the timestamp format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyProtectDateTimeScalaWrapper", "com.protegrity.spark.wrapper.ptyProtectDateTime", TimestampType())
spark.sql("select ptyProtectDateTimeScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyProtectFloatScalaWrapper()

The UDF protects the float format data, which is provided as an input.

Caution: The Float, Double, and Decimal UDFs will be deprecated in a future version of the Big Data Protector and should not be used.
It is recommended not to pass Float, Double, or Decimal data directly to the Float, Double, or Decimal UDFs of Protegrity.
To protect Float data, convert it to the String data type and pass the converted String data to the ptyProtectStrScalaWrapper() UDF with the Float tokenizer. Ensure that the right precision and scale of the input data are maintained during conversion.
This also applies wherever a Float UDF is currently called with Float input.

Warning: Protegrity will not be responsible for any type of data conversion error that might occur during conversion.
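The recommended conversion can be expressed directly in the query. The following is a minimal sketch of one possible approach, not the documented procedure: it casts the Float column to a fixed precision and scale before casting it to a string. The function name protect_float_as_string_sql and the data element name Float_Data_Element are hypothetical, and the precision and scale values must be chosen to suit the actual data.

```python
def protect_float_as_string_sql(column, table, data_element, precision=10, scale=4):
    """Build a query that converts a Float column to a string and protects it.

    The inner cast fixes the precision and scale; the outer cast produces the
    string that is passed to ptyProtectStrScalaWrapper().
    """
    fixed = f"cast({column} as decimal({precision},{scale}))"
    return (
        f"select ptyProtectStrScalaWrapper(cast({fixed} as string), '{data_element}') "
        f"from {table}"
    )


query = protect_float_as_string_sql("column1", "table1", "Float_Data_Element")
print(query)
```

The resulting query string can be passed to spark.sql() after ptyProtectStrScalaWrapper is registered with a StringType() return type.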

Signature:

ptyProtectFloatScalaWrapper(Float colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the float format to protect.
  • dataElement: Specifies the data element to protect the float format data.

Warning: Ensure that you use the No Encryption data element only. Using any other data element might cause corruption of data.

Result:

  • The UDF returns the protected data in the float format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyProtectFloatScalaWrapper", "com.protegrity.spark.wrapper.ptyProtectFloat", FloatType())
spark.sql("select ptyProtectFloatScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyProtectDoubleScalaWrapper()

The UDF protects the double format data, which is provided as an input.

Caution: The Float, Double, and Decimal UDFs will be deprecated in a future version of the Big Data Protector and should not be used.
It is recommended not to pass Float, Double, or Decimal data directly to the Float, Double, or Decimal UDFs of Protegrity.
To protect Double data, convert it to the String data type and pass the converted String data to the ptyProtectStrScalaWrapper() UDF with the Double tokenizer. Ensure that the right precision and scale of the input data are maintained during conversion.
This also applies wherever a Double UDF is currently called with Double input.

Warning: Protegrity will not be responsible for any type of data conversion error that might occur during conversion.

Signature:

ptyProtectDoubleScalaWrapper(Double colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the double format to protect.
  • dataElement: Specifies the data element to protect the double format data.

Warning: Ensure that you use the No Encryption data element only. Using any other data element might cause corruption of data.

Result:

  • The UDF returns the protected data in the double format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyProtectDoubleScalaWrapper", "com.protegrity.spark.wrapper.ptyProtectDouble", DoubleType())
spark.sql("select ptyProtectDoubleScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyProtectDecimalScalaWrapper()

The UDF protects the decimal format data, which is provided as an input.

Caution: The Float, Double, and Decimal UDFs will be deprecated in a future version of the Big Data Protector and should not be used.
It is recommended not to pass Float, Double, or Decimal data directly to the Float, Double, or Decimal UDFs of Protegrity.
To protect Decimal data, convert it to the String data type and pass the converted String data to the ptyProtectStrScalaWrapper() UDF with the Decimal tokenizer. Ensure that the right precision and scale of the input data are maintained during conversion.
This also applies wherever a Decimal UDF is currently called with Decimal input.

Warning: Protegrity will not be responsible for any type of data conversion error that might occur during conversion.

Signature:

ptyProtectDecimalScalaWrapper(Decimal colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the Decimal format to protect.
  • dataElement: Specifies the data element to protect the Decimal format data.

Warning: Ensure that you use the No Encryption data element only. Using any other data element might cause corruption of data.

Caution: Before the ptyProtectDecimalScalaWrapper() UDF is called, Spark SQL rounds off the decimal value in the table to 18 digits in scale, irrespective of the length of the data.
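The effect of rounding to a scale of 18 can be previewed outside Spark with the Python decimal module. The following is a minimal sketch that only illustrates the arithmetic. It assumes half-up rounding, which may differ from what Spark SQL applies, and the function name round_to_scale_18 is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

SCALE_18 = Decimal("1e-18")  # quantum with 18 digits after the decimal point


def round_to_scale_18(value):
    """Round a Decimal to 18 fractional digits (half-up rounding is an assumption)."""
    return value.quantize(SCALE_18, rounding=ROUND_HALF_UP)


# A value with 20 fractional digits loses its last two digits.
long_value = round_to_scale_18(Decimal("1.12345678901234567891"))
# A short value is padded to 18 fractional digits without changing its value.
short_value = round_to_scale_18(Decimal("2.5"))
print(long_value, short_value)
```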

Result:

  • The UDF returns the protected data in the Decimal format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyProtectDecimalScalaWrapper", "com.protegrity.spark.wrapper.ptyProtectDecimal", DecimalType(precision=10, scale=4))
spark.sql("select ptyProtectDecimalScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyUnprotectStrScalaWrapper()

The UDF unprotects the string format data, which is provided as an input.

Note: For Date and Datetime data elements, the protect API returns an invalid input data error if the input value falls within the non-existent date range from 05-OCT-1582 to 14-OCT-1582 of the Gregorian Calendar.
For more information about the tokenization and de-tokenization of the cutover dates of the Proleptic Gregorian Calendar, refer to Date and Datetime tokenization.

Signature:

ptyUnprotectStrScalaWrapper(String colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the string format to unprotect.
  • dataElement: Specifies the data element to unprotect the string format data.

Warning: Ensure that you use the No Encryption data element only. Using any other data element might cause corruption of data.

Result:

  • The UDF returns the unprotected data in the string format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyUnprotectStrScalaWrapper", "com.protegrity.spark.wrapper.ptyUnprotectStr", StringType())
spark.sql("select ptyUnprotectStrScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyUnprotectUnicodeScalaWrapper()

The UDF unprotects the string (Unicode) format data, which is provided as an input.

Warning: This UDF should be used only if you want to tokenize the Unicode data in Teradata using the Protegrity Database Protector, and migrate the tokenized data from a Teradata database to PySpark and detokenize the data using the Protegrity Big Data Protector for PySpark. Ensure that you use this UDF with a Unicode tokenization data element only.

Signature:

ptyUnprotectUnicodeScalaWrapper(String colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the string (Unicode) format to unprotect.
  • dataElement: Specifies the data element to unprotect the string (Unicode) format data.

Warning: Ensure that you use the No Encryption data element only. Using any other data element might cause corruption of data.

Result:

  • The UDF returns the unprotected data in the string (Unicode) format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyUnprotectUnicodeScalaWrapper", "com.protegrity.spark.wrapper.ptyUnprotectUnicode", StringType())
spark.sql("select ptyUnprotectUnicodeScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyUnprotectIntScalaWrapper()

The UDF unprotects the integer format data, which is provided as an input.

Signature:

ptyUnprotectIntScalaWrapper(Int colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the integer format to unprotect.
  • dataElement: Specifies the data element to unprotect the integer format data.

Caution: If an unauthorized user, who has no privileges to unprotect data in the security policy and for whom the output value is set to NULL, attempts to unprotect protected Numeric data (Short, Int, Float, Long, Double, or Decimal values) using the respective Spark SQL UDFs, then the output is 0.

Result:

  • The UDF returns the unprotected data in the integer format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyUnprotectIntScalaWrapper", "com.protegrity.spark.wrapper.ptyUnprotectInt", IntegerType())
spark.sql("select ptyUnprotectIntScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyUnprotectShortScalaWrapper()

The UDF unprotects the short format data, which is provided as an input.

Signature:

ptyUnprotectShortScalaWrapper(Short colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the short format to unprotect.
  • dataElement: Specifies the data element to unprotect the short format data.

Caution: If an unauthorized user, who has no privileges to unprotect data in the security policy and for whom the output value is set to NULL, attempts to unprotect protected Numeric data (Short, Int, Float, Long, Double, or Decimal values) using the respective Spark SQL UDFs, then the output is 0.

Result:

  • The UDF returns the unprotected data in the short format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyUnprotectShortScalaWrapper", "com.protegrity.spark.wrapper.ptyUnprotectShort", ShortType())
spark.sql("select ptyUnprotectShortScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyUnprotectLongScalaWrapper()

The UDF unprotects the long format data, which is provided as an input.

Signature:

ptyUnprotectLongScalaWrapper(Long colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the long format to unprotect.
  • dataElement: Specifies the data element to unprotect the long format data.

Caution: If an unauthorized user, who has no privileges to unprotect data in the security policy and for whom the output value is set to NULL, attempts to unprotect protected Numeric data (Short, Int, Float, Long, Double, or Decimal values) using the respective Spark SQL UDFs, then the output is 0.

Result:

  • The UDF returns the unprotected data in the long format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyUnprotectLongScalaWrapper", "com.protegrity.spark.wrapper.ptyUnprotectLong", LongType())
spark.sql("select ptyUnprotectLongScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyUnprotectDateScalaWrapper()

The UDF unprotects the date format data, which is provided as an input.

Signature:

ptyUnprotectDateScalaWrapper(Date colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the date format to unprotect.
  • dataElement: Specifies the data element to unprotect the date format data.

Result:

  • The UDF returns the unprotected data in the date format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyUnprotectDateScalaWrapper", "com.protegrity.spark.wrapper.ptyUnprotectDate", DateType())
spark.sql("select ptyUnprotectDateScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyUnprotectDateTimeScalaWrapper()

The UDF unprotects the timestamp format data, which is provided as an input.

Signature:

ptyUnprotectDateTimeScalaWrapper(Timestamp colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the timestamp format to unprotect.
  • dataElement: Specifies the data element to unprotect the timestamp format data.

Result:

  • The UDF returns the unprotected data in the timestamp format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyUnprotectDateTimeScalaWrapper", "com.protegrity.spark.wrapper.ptyUnprotectDateTime", TimestampType())
spark.sql("select ptyUnprotectDateTimeScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyUnprotectFloatScalaWrapper()

The UDF unprotects the float format data, which is provided as an input.

Caution: The Float, Double, and Decimal UDFs will be deprecated in a future version of the Big Data Protector and should not be used.
It is recommended not to pass Float, Double, or Decimal data directly to the Float, Double, or Decimal UDFs of Protegrity.
To protect Float data, convert it to the String data type and pass the converted String data to the ptyProtectStrScalaWrapper() UDF with the Float tokenizer. Ensure that the right precision and scale of the input data are maintained during conversion.
This also applies wherever a Float UDF is currently called with Float input.

Warning: Protegrity will not be responsible for any type of data conversion error that might occur during conversion.

Signature:

ptyUnprotectFloatScalaWrapper(Float colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the float format to unprotect.
  • dataElement: Specifies the data element to unprotect the float format data.

Warning: Ensure that you use the No Encryption data element only. Using any other data element might cause corruption of data.

Caution: If an unauthorized user, who has no privileges to unprotect data in the security policy and for whom the output value is set to NULL, attempts to unprotect protected Numeric data (Short, Int, Float, Long, Double, or Decimal values) using the respective Spark SQL UDFs, then the output is 0.

Result:

  • The UDF returns the unprotected data in the float format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyUnprotectFloatScalaWrapper", "com.protegrity.spark.wrapper.ptyUnprotectFloat", FloatType())
spark.sql("select ptyUnprotectFloatScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyUnprotectDoubleScalaWrapper()

The UDF unprotects the double format data, which is provided as an input.

Caution: The Float, Double, and Decimal UDFs will be deprecated in a future version of the Big Data Protector and should not be used.
It is recommended not to pass Float, Double, or Decimal data directly to the Float, Double, or Decimal UDFs of Protegrity.
To protect Double data, convert it to the String data type and pass the converted String data to the ptyProtectStrScalaWrapper() UDF with the Double tokenizer. Ensure that the right precision and scale of the input data are maintained during conversion.
This also applies wherever a Double UDF is currently called with Double input.

Warning: Protegrity will not be responsible for any type of data conversion error that might occur during conversion.

Signature:

ptyUnprotectDoubleScalaWrapper(Double colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the double format to unprotect.
  • dataElement: Specifies the data element to unprotect the double format data.

Warning: Ensure that you use the No Encryption data element only. Using any other data element might cause corruption of data.

Result:

  • The UDF returns the unprotected data in the double format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyUnprotectDoubleScalaWrapper", "com.protegrity.spark.wrapper.ptyUnprotectDouble", DoubleType())
spark.sql("select ptyUnprotectDoubleScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyUnprotectDecimalScalaWrapper()

The UDF unprotects the decimal format data, which is provided as an input.

Caution: The Float, Double, and Decimal UDFs will be deprecated in a future version of the Big Data Protector and should not be used.
It is recommended not to pass Float, Double, or Decimal data directly to the Float, Double, or Decimal UDFs of Protegrity.
To protect Decimal data, convert it to the String data type and pass the converted String data to the ptyProtectStrScalaWrapper() UDF with the Decimal tokenizer. Ensure that the right precision and scale of the input data are maintained during conversion.
This also applies wherever a Decimal UDF is currently called with Decimal input.

Warning: Protegrity will not be responsible for any type of data conversion error that might occur during conversion.

Signature:

ptyUnprotectDecimalScalaWrapper(Decimal colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the Decimal format to unprotect.
  • dataElement: Specifies the data element to unprotect the Decimal format data.

Warning: Ensure that you use the No Encryption data element only. Using any other data element might cause corruption of data.

Caution: Before the ptyUnprotectDecimalScalaWrapper() UDF is called, Spark SQL rounds off the decimal value in the table to 18 digits in scale, irrespective of the length of the data.

Caution: If an unauthorized user, who has no privileges to unprotect data in the security policy and for whom the output value is set to NULL, attempts to unprotect protected Numeric data (Short, Int, Float, Long, Double, or Decimal values) using the respective Spark SQL UDFs, then the output is 0.

Result:

  • The UDF returns the unprotected data in the Decimal format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyUnprotectDecimalScalaWrapper", "com.protegrity.spark.wrapper.ptyUnprotectDecimal", DecimalType(precision=10, scale=4))
spark.sql("select ptyUnprotectDecimalScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyReprotectStrScalaWrapper()

The UDF reprotects, with a different data element, the string format data that was earlier protected using the ptyProtectStrScalaWrapper() UDF.

Signature:

ptyReprotectStrScalaWrapper(String colName, String oldDataElement, String newDataElement)

Parameters:

  • colName: Specifies the column that contains the data in the string format to be reprotected.
  • oldDataElement: Specifies the data element that was used to protect the data earlier.
  • newDataElement: Specifies the new data element that will be used to reprotect the data.

Result:

  • The UDF returns the protected string format data.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyReprotectStrScalaWrapper", "com.protegrity.spark.wrapper.ptyReprotectStr", StringType())
spark.sql("select ptyReprotectStrScalaWrapper(column1, 'Old_Data_Element', 'New_Data_Element') from table1;").show(truncate = False)
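Following the signature above, a reprotect call passes three arguments: the column, the old data element, and the new data element. The following is a minimal sketch of a query builder for the reprotect wrappers. The function name reprotect_sql and the data element names Old_Data_Element and New_Data_Element are hypothetical placeholders.

```python
def reprotect_sql(wrapper, column, old_data_element, new_data_element, table):
    """Build a SELECT statement that calls a three-argument reprotect wrapper UDF."""
    return (
        f"select {wrapper}({column}, '{old_data_element}', '{new_data_element}') "
        f"from {table}"
    )


query = reprotect_sql(
    "ptyReprotectStrScalaWrapper",
    "column1",
    "Old_Data_Element",
    "New_Data_Element",
    "table1",
)
print(query)
```

The same builder applies to the other reprotect wrappers on this page, because they share the three-argument form and differ only in the column type.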

ptyReprotectUnicodeScalaWrapper()

The UDF reprotects, with a different data element, the string (Unicode) format data that was earlier protected using the ptyProtectUnicodeScalaWrapper() UDF.

Warning: Use this UDF only if you want to tokenize the Unicode data in PySpark, migrate the tokenized data from PySpark to a Teradata database, and detokenize the data using the Protegrity Database Protector. Ensure that you use this UDF with a Unicode tokenization data element only.

Signature:

ptyReprotectUnicodeScalaWrapper(String colName, String oldDataElement, String newDataElement)

Parameters:

  • colName: Specifies the column that contains the data in the string format to be reprotected.
  • oldDataElement: Specifies the data element that was used to protect the data earlier.
  • newDataElement: Specifies the new data element that will be used to reprotect the data.

Result:

  • The UDF returns the protected string format data.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyReprotectUnicodeScalaWrapper", "com.protegrity.spark.wrapper.ptyReprotectUnicode", StringType())
spark.sql("select ptyReprotectUnicodeScalaWrapper(column1, 'Old_Data_Element', 'New_Data_Element') from table1;").show(truncate = False)

ptyReprotectIntScalaWrapper()

The UDF reprotects, with a different data element, the integer format data that was protected earlier.

Signature:

ptyReprotectIntScalaWrapper(Int colName, String oldDataElement, String newDataElement)

Parameters:

  • colName: Specifies the column that contains the data in the integer format to be reprotected.
  • oldDataElement: Specifies the data element that was used to protect the data earlier.
  • newDataElement: Specifies the new data element that will be used to reprotect the data.

Result:

  • The UDF returns the protected integer format data.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyReprotectIntScalaWrapper", "com.protegrity.spark.wrapper.ptyReprotectInt", IntegerType())
spark.sql("select ptyReprotectIntScalaWrapper(column1, 'Old_Data_Element', 'New_Data_Element') from table1;").show(truncate = False)

ptyReprotectShortScalaWrapper()

The UDF reprotects, with a different data element, the short format data that was protected earlier.

Signature:

ptyReprotectShortScalaWrapper(Short colName, String oldDataElement, String newDataElement)

Parameters:

  • colName: Specifies the column that contains the data in the short format to be reprotected.
  • oldDataElement: Specifies the data element that was used to protect the data earlier.
  • newDataElement: Specifies the new data element that will be used to reprotect the data.

Result:

  • The UDF returns the protected short format data.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyReprotectShortScalaWrapper", "com.protegrity.spark.wrapper.ptyReprotectShort", ShortType())
spark.sql("select ptyReprotectShortScalaWrapper(column1, 'Old_Data_Element', 'New_Data_Element') from table1;").show(truncate = False)

ptyReprotectLongScalaWrapper()

The UDF reprotects, with a different data element, the long format data that was protected earlier.

Signature:

ptyReprotectLongScalaWrapper(Long colName, String oldDataElement, String newDataElement)

Parameters:

  • colName: Specifies the column that contains the data in the long format to be reprotected.
  • oldDataElement: Specifies the data element that was used to protect the data earlier.
  • newDataElement: Specifies the new data element that will be used to reprotect the data.

Result:

  • The UDF returns the protected long format data.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyReprotectLongScalaWrapper", "com.protegrity.spark.wrapper.ptyReprotectLong", LongType())
spark.sql("select ptyReprotectLongScalaWrapper(column1, 'Old_Data_Element', 'New_Data_Element') from table1;").show(truncate = False)

ptyReprotectDateScalaWrapper()

The UDF reprotects, with a different data element, the date format data that was protected earlier.

Signature:

ptyReprotectDateScalaWrapper(Date colName, String oldDataElement, String newDataElement)

Parameters:

  • colName: Specifies the column that contains the data in the date format to be reprotected.
  • oldDataElement: Specifies the data element that was used to protect the data earlier.
  • newDataElement: Specifies the new data element that will be used to reprotect the data.

Result:

  • The UDF returns the protected date format data.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyReprotectDateScalaWrapper", "com.protegrity.spark.wrapper.ptyReprotectDate", DateType())
spark.sql("select ptyReprotectDateScalaWrapper(column1, 'Old_Data_Element', 'New_Data_Element') from table1;").show(truncate = False)

ptyReprotectDateTimeScalaWrapper()

The UDF reprotects, with a different data element, the timestamp format data that was protected earlier.

Signature:

ptyReprotectDateTimeScalaWrapper(Timestamp colName, String oldDataElement, String newDataElement)

Parameters:

  • colName: Specifies the column that contains the data in the timestamp format to be reprotected.
  • oldDataElement: Specifies the data element that was used to protect the data earlier.
  • newDataElement: Specifies the new data element that will be used to reprotect the data.

Result:

  • The UDF returns the protected timestamp format data.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyReprotectDateTimeScalaWrapper", "com.protegrity.spark.wrapper.ptyReprotectDateTime", TimestampType())
spark.sql("select ptyReprotectDateTimeScalaWrapper(column1, 'Old_Data_Element', 'New_Data_Element') from table1;").show(truncate = False)

ptyReprotectFloatScalaWrapper()

The UDF reprotects the float format data, which is provided as an input.

Caution: The Float, Double, and Decimal UDFs will be deprecated in a future version of the Big Data Protector and should not be used.
It is recommended that you do not pass the Float, Double, or Decimal data types directly to the Float, Double, or Decimal UDFs of Protegrity.
To protect Float data, convert it to the String data type and pass the converted string to the ptyProtectStrScalaWrapper() UDF with the Float tokenizer. Ensure that the precision and scale of the input data are maintained during conversion.
If an existing Float UDF call takes Float input, apply the same conversion and call the ptyProtectStrScalaWrapper() UDF with the Float tokenizer instead.

Warning: Protegrity will not be responsible for any type of data conversion error that might occur during conversion.
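
The conversion described in the caution above might be sketched as follows. This is a minimal illustration, not a documented Protegrity procedure: the helper name build_protect_float_sql, the data element name 'Float_Token_DE', and the precision and scale of (10, 4) are hypothetical. It also assumes that ptyProtectStrScalaWrapper() is already registered and takes the column and the data element name as its arguments. Casting through a fixed decimal(precision, scale) before casting to string is one way to keep a consistent number of fractional digits.

```python
def build_protect_float_sql(table, column, data_element, precision, scale):
    """Return a Spark SQL statement that protects a float column as a string.

    The float is first cast to decimal(precision, scale) so that the string
    passed to the UDF always carries the same number of fractional digits.
    """
    # Hypothetical conversion: float -> fixed-scale decimal -> string.
    as_string = f"cast(cast({column} as decimal({precision},{scale})) as string)"
    return (
        f"select ptyProtectStrScalaWrapper({as_string}, '{data_element}') "
        f"from {table}"
    )


# 'Float_Token_DE' is a placeholder for a data element that uses the Float tokenizer.
query = build_protect_float_sql("table1", "column1", "Float_Token_DE", 10, 4)
print(query)
```

The resulting statement can be passed to spark.sql(query).show(truncate = False) in a session where the wrapper is registered. Choose the precision and scale to match the source data, because a cast that is too narrow can round or lose values.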

Signature:

ptyReprotectFloatScalaWrapper(Float colName, String oldDataElement, String newDataElement)

Parameters:

  • colName: Specifies the column that contains the data in the float format to be reprotected.
  • oldDataElement: Specifies the data element that was used to protect the data earlier.
  • newDataElement: Specifies the new data element that will be used to reprotect the data.

Warning: Ensure that you use the No Encryption data element only. Using any other data element might cause corruption of data.

Result:

  • The UDF returns the protected data in the float format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyReprotectFloatScalaWrapper", "com.protegrity.spark.wrapper.ptyReprotectFloat", FloatType())
spark.sql("select ptyReprotectFloatScalaWrapper(column1, 'Old_Data_Element', 'New_Data_Element') from table1;").show(truncate = False)

ptyReprotectDoubleScalaWrapper()

The UDF reprotects the double format data, which is provided as an input.

Caution: The Float, Double, and Decimal UDFs will be deprecated in a future version of the Big Data Protector and should not be used.
It is recommended that you do not pass the Float, Double, or Decimal data types directly to the Float, Double, or Decimal UDFs of Protegrity.
To protect Double data, convert it to the String data type and pass the converted string to the ptyProtectStrScalaWrapper() UDF with the Double tokenizer. Ensure that the precision and scale of the input data are maintained during conversion.
If an existing Double UDF call takes Double input, apply the same conversion and call the ptyProtectStrScalaWrapper() UDF with the Double tokenizer instead.

Warning: Protegrity will not be responsible for any type of data conversion error that might occur during conversion.

Signature:

ptyReprotectDoubleScalaWrapper(Double colName, String oldDataElement, String newDataElement)

Parameters:

  • colName: Specifies the column that contains the data in the double format to be reprotected.
  • oldDataElement: Specifies the data element that was used to protect the data earlier.
  • newDataElement: Specifies the new data element that will be used to reprotect the data.

Warning: Ensure that you use the No Encryption data element only. Using any other data element might cause corruption of data.

Result:

  • The UDF returns the protected data in the double format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyReprotectDoubleScalaWrapper", "com.protegrity.spark.wrapper.ptyReprotectDouble", DoubleType())
spark.sql("select ptyReprotectDoubleScalaWrapper(column1, 'Old_Data_Element', 'New_Data_Element') from table1;").show(truncate = False)

ptyReprotectDecimalScalaWrapper()

The UDF reprotects the decimal format data, which is provided as an input.

Caution: The Float, Double, and Decimal UDFs will be deprecated in a future version of the Big Data Protector and should not be used.
It is recommended that you do not pass the Float, Double, or Decimal data types directly to the Float, Double, or Decimal UDFs of Protegrity.
To protect Decimal data, convert it to the String data type and pass the converted string to the ptyProtectStrScalaWrapper() UDF with the Decimal tokenizer. Ensure that the precision and scale of the input data are maintained during conversion.
If an existing Decimal UDF call takes Decimal input, apply the same conversion and call the ptyProtectStrScalaWrapper() UDF with the Decimal tokenizer instead.

Warning: Protegrity will not be responsible for any type of data conversion error that might occur during conversion.
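
When the values are prepared in Python before they are loaded into a DataFrame, the scale can be fixed while converting to a string. The sketch below uses only the standard decimal module. The helper name decimal_to_fixed_string and the scale of 4 are hypothetical and are not part of the Protegrity API.

```python
from decimal import Decimal


def decimal_to_fixed_string(value, scale):
    """Render a Decimal as plain digits with exactly `scale` fractional digits."""
    # Decimal(1).scaleb(-scale) gives the quantum, for example 0.0001 for scale 4.
    quantum = Decimal(1).scaleb(-scale)
    # quantize() pads or rounds to the quantum (the default context rounds
    # half to even); the "f" format avoids scientific notation.
    return format(value.quantize(quantum), "f")


print(decimal_to_fixed_string(Decimal("12.5"), 4))     # prints 12.5000
print(decimal_to_fixed_string(Decimal("1.23456"), 4))  # prints 1.2346
```

Strings produced this way keep a uniform scale, so they can be passed to the ptyProtectStrScalaWrapper() UDF with the Decimal tokenizer as the caution recommends. Rounding still discards digits beyond the chosen scale, so pick a scale that covers the source data.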

Signature:

ptyReprotectDecimalScalaWrapper(Decimal colName, String oldDataElement, String newDataElement)

Parameters:

  • colName: Specifies the column that contains the data in the Decimal format to be reprotected.
  • oldDataElement: Specifies the data element that was used to protect the data earlier.
  • newDataElement: Specifies the new data element that will be used to reprotect the data.

Warning: Ensure that you use the No Encryption data element only. Using any other data element might cause corruption of data.

Caution: Before the ptyReprotectDecimal() UDF is called, Spark SQL rounds off the decimal value in the table to 18 digits in scale, irrespective of the length of the data.

Result:

  • The UDF returns the protected data in the Decimal format.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyReprotectDecimalScalaWrapper", "com.protegrity.spark.wrapper.ptyReprotectDecimal", DecimalType(precision=10, scale=4))
spark.sql("select ptyReprotectDecimalScalaWrapper(column1, 'Old_Data_Element', 'New_Data_Element') from table1;").show(truncate = False)

ptyStringEncScalaWrapper()

The UDF encrypts the string value provided as input and returns binary data.

Signature:

ptyStringEncScalaWrapper(String colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the String format to be encrypted.
  • dataElement: Specifies the name of the data element, in the String format, that will be used to encrypt the data.

Result:

  • The UDF returns the encrypted binary format data.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyStringEncScalaWrapper", "com.protegrity.spark.wrapper.ptyStringEnc", BinaryType())
spark.sql("select ptyStringEncScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyStringDecScalaWrapper()

The UDF decrypts the binary value provided as input and returns string data.

Signature:

ptyStringDecScalaWrapper(Binary colName, String dataElement)

Parameters:

  • colName: Specifies the column that contains the data in the Binary format to be decrypted.
  • dataElement: Specifies the name of the data element, in the String format, that will be used to decrypt the data.

Result:

  • The UDF returns the decrypted string format data.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyStringDecScalaWrapper", "com.protegrity.spark.wrapper.ptyStringDec", StringType())
spark.sql("select ptyStringDecScalaWrapper(column1, 'Data_Element') from table1;").show(truncate = False)

ptyStringReEncScalaWrapper()

The UDF re-encrypts the binary value provided as input and returns new binary data.

Signature:

ptyStringReEncScalaWrapper(Binary colName, String oldDataElement, String newDataElement)

Parameters:

  • colName: Specifies the column that contains data in the Binary format to be re-encrypted.
  • oldDataElement: Specifies the data element name in the String format that was previously used to encrypt the data.
  • newDataElement: Specifies the name of the new data element in the String format to re-encrypt the data.

Result:

  • The UDF returns the re-encrypted binary format data.

Example:

from pyspark.sql.types import *
spark.udf.registerJavaFunction("ptyStringReEncScalaWrapper", "com.protegrity.spark.wrapper.ptyStringReEnc", BinaryType())
spark.sql("select ptyStringReEncScalaWrapper(column1, 'Old_Data_Element', 'New_Data_Element') from table1;").show(truncate = False)
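
The registration calls shown in the examples of this section can be collected in one place. The sketch below is an illustration only: the helper name register_wrappers and the WRAPPERS table are hypothetical. The class names are copied from the examples in this section, and the return types are given as DDL strings, which registerJavaFunction accepts in place of DataType objects. The deprecated Float, Double, and Decimal wrappers are left out.

```python
# Wrapper name -> (Java class from the examples in this section, DDL return type).
WRAPPERS = {
    "ptyReprotectLongScalaWrapper": ("com.protegrity.spark.wrapper.ptyReprotectLong", "bigint"),
    "ptyReprotectDateScalaWrapper": ("com.protegrity.spark.wrapper.ptyReprotectDate", "date"),
    "ptyReprotectDateTimeScalaWrapper": ("com.protegrity.spark.wrapper.ptyReprotectDateTime", "timestamp"),
    "ptyStringEncScalaWrapper": ("com.protegrity.spark.wrapper.ptyStringEnc", "binary"),
    "ptyStringDecScalaWrapper": ("com.protegrity.spark.wrapper.ptyStringDec", "string"),
    "ptyStringReEncScalaWrapper": ("com.protegrity.spark.wrapper.ptyStringReEnc", "binary"),
}


def register_wrappers(spark, wrappers=WRAPPERS):
    """Register each wrapper on the given SparkSession and return the names."""
    for name, (java_class, return_type) in wrappers.items():
        # Same call as in the individual examples, with a DDL type string.
        spark.udf.registerJavaFunction(name, java_class, return_type)
    return sorted(wrappers)


# Usage, assuming an active SparkSession named spark with the Protegrity
# wrapper classes on the classpath:
# register_wrappers(spark)
```

Keeping the mapping in one table makes it easier to see which return type each wrapper is registered with, and a subset can be registered by passing a smaller dictionary.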

Last modified : January 20, 2026