Spark Java APIs

All the Spark Java APIs that are available for protection and unprotection in Big Data Protector to build secure Big Data applications are listed here.

Spark is an execution engine that processes batch jobs in memory and handles a wide range of computational workloads. In addition to processing batches of stored data, Spark can manipulate data in real time.

Spark leverages the physical memory of the Hadoop system. It uses Resilient Distributed Datasets (RDDs) to store data in memory, which lowers latency if the data fits in memory. The data is saved to the hard drive only if required. Because RDDs are the basic units of abstraction and computation in Spark, you can use the Spark protection and unprotection APIs to perform transformation operations on an RDD.

If you want to use the Spark Protector API in a Spark Java job, then you must implement the function interface as per the Spark Java programming specifications. Subsequently, you can use it in the required transformation of an RDD to tokenize the data.

Overview of the Spark Protector

The Protegrity Spark protector extends the functionality of the Spark engine and provides APIs that protect or unprotect the data as it is stored or retrieved.

Spark Protector Usage

The Protegrity Spark protector provides APIs for protecting and reprotecting the data using encryption or tokenization, and unprotecting data by using decryption or detokenization. Note: Ensure that you configure the Spark protector after installing the Big Data Protector.

Spark Scala

The Protegrity Spark protector (Java) can be used with Scala to protect the data by using encryption or tokenization. You can also use it with Scala to unprotect the data using decryption or detokenization.

Sample Code Usage for Spark (Scala)

The Spark protector sample program described in this section is an example of how to use the Protegrity Spark protector APIs with Scala.

The sample program utilizes the following three Scala classes for protecting and unprotecting data:

ProtectData.scala - This main class creates the Spark context object and calls the DataLoader class for reading cleartext data.
UnProtectData.scala - This main class creates the Spark context object and calls the DataLoader class for reading protected data.
DataLoader.scala - This loader class fetches the input from the input path, calls the ProtectFunction to protect the data, and stores the protected data as output in the output path. In addition, it fetches the input from the protected path, calls the UnProtectFunction to unprotect the data, and stores the cleartext content as output.

The following functions protect or unprotect each line of the input.

ProtectFunction - This class calls the Spark protector for every new line specified in the input to protect data.
UnProtectFunction - This class calls the Spark protector for every new line specified in the input to unprotect data.
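
The per-line processing described above can be sketched in plain Java. This is a minimal illustration only, not the Protegrity API: the class name LineTransformSketch and the fieldOp callback are hypothetical stand-ins for the real protect or unprotect call, so that the sketch runs without Spark or the protector on the classpath. It assumes that the first field of each line is passed through unchanged and that one data element name exists for every remaining field.

```java
import java.util.function.BinaryOperator;
import java.util.regex.Pattern;

// Hypothetical helper: mirrors the split / transform / re-join steps of the sample functions.
final class LineTransformSketch {

    // fieldOp receives (dataElementName, fieldValue) and returns the transformed value.
    // It stands in for the protector call and is an assumption of this sketch.
    static String transformLine(String line, String delim, String[] dataElements,
                                BinaryOperator<String> fieldOp) {
        // Quote the delimiter so that it is matched literally; -1 keeps trailing empty fields.
        String[] splits = line.split(Pattern.quote(delim), -1);
        // The first field is copied as-is and is not transformed.
        StringBuilder result = new StringBuilder(splits[0]);
        for (int i = 1; i < splits.length; i++) {
            // Field i uses data element i - 1, as in the sample functions.
            result.append(delim).append(fieldOp.apply(dataElements[i - 1], splits[i]));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String[] elements = {"TOK_NAME", "TOK_PHONE"};
        // The lambda only tags each value; a real job would call the protector here.
        String out = transformLine("928724,Hultgren Caylor,9823750987", ",", elements,
                (element, value) -> element + "(" + value + ")");
        System.out.println(out);
    }
}
```

Unlike the Scala sample, which passes the delimiter directly to split (where it is interpreted as a regular expression), this sketch quotes the delimiter, so characters such as | are treated literally.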

Main Job Class for Protect Operation – ProtectData.scala

ProtectData.scala

package com.protegrity.samples.spark.scala
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
object ProtectData {
def main(args: Array[String]) {
// create a SparkContext object, which tells Spark how to access a cluster.
val sparkContext = new SparkContext(new SparkConf())
// create the new object for class DataLoader
val protector = new DataLoader(sparkContext)
// Call the writeProtectedData method, which reads cleartext data from the input path (args(0))
// and writes the data to the output path (args(1)) after the protect operation
protector.writeProtectedData(args(0), args(1), ",")
}
}

Main Job Class for Unprotect Operation – UnProtectData.scala

UnProtectData.scala

package com.protegrity.samples.spark.scala
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
object UnProtectData {
def main(args: Array[String]) {
val sparkContext = new SparkContext(new SparkConf())
val protector = new DataLoader(sparkContext)
protector.unprotectData(args(0), args(1), ",")
}
}

Utility to call Protect or Unprotect Function – DataLoader.scala

DataLoader.scala

package com.protegrity.samples.spark.scala
import org.apache.log4j.Logger
import org.apache.spark.SparkContext
object DataLoader {
private val logger = Logger.getLogger(classOf[DataLoader])
}
/**
* A Data loader utility for reading & writing protected and un-protected data
*/
class DataLoader(private var sparkContext: SparkContext) {
private var data_element_names: Array[String] = Array("TOK_NAME", "TOK_PHONE",
"TOK_CREDIT_CARD", "TOK_AMOUNT")
private var appid: String = sparkContext.getConf.getAppId
/**
* Writes protected data to the output path delimited by the input delimiter
*
* @param inputPath - path of the input employee info file
* @param outputPath - path where the output should be saved
* @param delim - denotes the delimiter between the fields in the file
*/
def writeProtectedData(inputPath: String, outputPath: String, delim: String) {
// read lines from the input path & create RDD
val rdd = sparkContext.textFile(inputPath)
//import ProtectFunction
import com.protegrity.samples.spark.scala.ProtectFunction._
//call ProtectFunction on rdd
rdd.ProtectFunction(delim, appid, data_element_names, outputPath)
}
/**
* Reads protected data from the input path delimited by the input delimiter
*
* @param protectedInputPath - path of the protected employee data
* @param unprotectedOutputPath - output path where unprotected data should be stored.
* @param delim
*/
def unprotectData(protectedInputPath: String, unprotectedOutputPath: String, delim: String)
{
// read lines from the protectedInputPath & create RDD
val protectedRdd = sparkContext.textFile(protectedInputPath)
//import UnProtectFunction
import com.protegrity.samples.spark.scala.UnProtectFunction._
//call UnprotectFunction on rdd
protectedRdd.UnprotectFunction(delim, appid, data_element_names, unprotectedOutputPath)
}
}

ProtectFunction.scala

package com.protegrity.samples.spark.scala
import java.util.ArrayList
import org.apache.spark.rdd.RDD
import com.protegrity.spark.Protector
import com.protegrity.spark.PtySparkProtector
object ProtectFunction {
/* Define this class as implicit so that new functionality can be added to an RDD on the fly.
Implicits are lexically scoped, that is, the functions of this class can be used only where
the class is imported. */
implicit class Protect(rdd: RDD[String]) {
def ProtectFunction(delim: String, appid: String, dataElement: Array[String],
protectoutputpath: String) =
{
val protectedRDD = rdd.map { line =>
// split the line into fields separated by the delimiter
val splits = line.split(delim)
// store first split in protectedString as we are not going to protect first split.
var protectedString = splits(0)
// Initialize input size
val input = Array.ofDim[String](splits.length)
// Initialize output size
val output = Array.ofDim[String](splits.length)
// Initialize errorList
val errorList = new ArrayList[Integer]()
// create the new object for class ptySparkProtector
var protector: Protector = new PtySparkProtector(appid)
// Iterate through the splits and call protect operation
for (i <- 1 until splits.length) {
input(i) = splits(i)
// To protect data, call the protect method with the parameters dataElement, errorList,
// input array, and output array. The output is stored in output[].
protector.protect(dataElement(i - 1), errorList, input, output)
// Append the output to protectedString
protectedString += delim + output(i)
}
protectedString
}
// Save protectedRDD into output path
protectedRDD.saveAsTextFile(protectoutputpath)
}
}
}

UnprotectFunction.scala

package com.protegrity.samples.spark.scala

import java.util.ArrayList
import org.apache.spark.rdd.RDD
import com.protegrity.spark.Protector
import com.protegrity.spark.PtySparkProtector


object UnProtectFunction {
  /* Define this class as implicit so that new functionality can be added to an RDD on the fly.
  Implicits are lexically scoped, that is, the functions of this class can be used only where the class is imported. */
  implicit class Unprotect(protectedRDD: RDD[String]) {
    def UnprotectFunction(delim: String, appid: String, dataElement: Array[String], unprotectoutputpath: String) =
      {
        val unprotectedRDD = protectedRDD.map { line =>
          // split the line into fields separated by the delimiter
          val splits = line.split(delim)
          // store first split in unprotectedString
          var unprotectedString = splits(0)
          // Initialize input size
          val input = Array.ofDim[String](splits.length)
          // Initialize output size
          val output = Array.ofDim[String](splits.length)
          // Initialize errorList
          val errorList = new ArrayList[Integer]()
          // create the object for class ptySparkProtector
          var protector: Protector = new PtySparkProtector(appid)
          // Iterate through the splits and call unprotect operation
          for (i <- 1 until splits.length) {
            input(i) = splits(i)
            // To unprotect data, call the unprotect method with the parameters dataElement, errorList, input array, and output array. The output is stored in output[].
            protector.unprotect(dataElement(i - 1), errorList, input, output)
            // Append the output to unprotectedString
            unprotectedString += delim + output(i)
          }
          unprotectedString
        }

        // Save unprotectedRDD into output path
        unprotectedRDD.saveAsTextFile(unprotectoutputpath)
      }
  }
}

Spark APIs and supported protection methods

The following table lists the Spark APIs, the input and output data types, and the supported Protection Methods:

Operation	Input	Output	Protection Method Supported
Protect	Byte	Byte	Tokenization, Encryption, No Encryption, CUSP
Protect	Short	Short	Tokenization, No Encryption
Protect	Short	Byte	Encryption, CUSP
Protect	Int	Int	Tokenization, No Encryption
Protect	Int	Byte	Encryption, CUSP
Protect	Long	Long	Tokenization, No Encryption
Protect	Long	Byte	Encryption, CUSP
Protect	Float	Float	Tokenization, No Encryption
Protect	Float	Byte	Encryption, CUSP
Protect	Double	Double	Tokenization, No Encryption
Protect	Double	Byte	Encryption, CUSP
Protect	String	String	Tokenization, No Encryption
Protect	String	Byte	Encryption, CUSP
Unprotect	Byte	Byte	Tokenization, Encryption, No Encryption, CUSP
Unprotect	Short	Short	Tokenization, No Encryption
Unprotect	Byte	Short	Encryption, CUSP
Unprotect	Int	Int	Tokenization, No Encryption
Unprotect	Byte	Int	Encryption, CUSP
Unprotect	Long	Long	Tokenization, No Encryption
Unprotect	Byte	Long	Encryption, CUSP
Unprotect	Float	Float	Tokenization, No Encryption
Unprotect	Byte	Float	Encryption, CUSP
Unprotect	Double	Double	Tokenization, No Encryption
Unprotect	Byte	Double	Encryption, CUSP
Unprotect	String	String	Tokenization, No Encryption
Unprotect	Byte	String	Encryption, CUSP
Reprotect	Byte	Byte	Tokenization, Encryption, CUSP
Reprotect	Short	Short	Tokenization
Reprotect	Int	Int	Tokenization
Reprotect	Long	Long	Tokenization
Reprotect	Float	Float	Tokenization
Reprotect	Double	Double	Tokenization
Reprotect	String	String	Tokenization

Note: If a protected value is generated using Byte as both Input and Output, then only Encryption/CUSP is supported.

Loading the Cleartext Data from a File to HDFS

You must first create a sample CSV file that contains the cleartext data in comma-separated value format. For example, create the basic_sample_data.csv file with the contents listed below.

ID	Name	Phone	Credit Card	Amount
928724	Hultgren Caylor	9823750987	376235139103947	6959123
928725	Bourne Jose	9823350487	6226600538383292	42964354
928726	Sorce Hatti	9824757883	6226540862865375	7257656
928727	Lorie Garvey	9913730982	5464987835837424	85447788
928728	Belva Beeson	9948752198	5539455602750205	59040774
928729	Hultgren Caylor	9823750987	376235139103947	3245234
928730	Bourne Jose	9823350487	6226600538383292	2300567
928731	Lorie Garvey	9913730982	5464987835837424	85447788
928732	Bourne Jose	9823350487	6226600538383292	3096233
928733	Hultgren Caylor	9823750987	376235139103947	5167763
928734	Lorie Garvey	9913730982	5464987835837424	85447788

To load the cleartext data from the basic_sample_data.csv file to HDFS, run the following command:

hadoop fs -put <Local_Filesystem_Path>/basic_sample_data.csv <Path_of_Cleartext_data_file>

where,

basic_sample_data.csv: Specifies the name of the file containing cleartext data.
<Local_Filesystem_Path>: Specifies the directory path on the local machine where the basic_sample_data.csv file is saved.
<Path_of_Cleartext_data_file>: Specifies the HDFS directory path for the file with the cleartext data.
Note: Ensure that the user who is running the command has read and write access to this location.

Protecting the Existing Data

To protect cleartext data, you must specify the file that contains the cleartext data and the location in which to store the protected data. The following command reads the cleartext data from the basic_sample_data.csv file and stores it in protected form in the basic_sample_protected directory using the Spark APIs.

./spark-submit --master yarn --class com.protegrity.spark.ProtectData <PROTEGRITY_DIR>/samples/spark/lib/spark_protector_demo.jar
<Path_of_Cleartext_data_file>/basic_sample_data.csv
<Path_of_Protected_data_file>/basic_sample_protected

Note: Ensure that the user performing the task has the permissions to protect the data, as required, in the data security policy.

where,

com.protegrity.spark.ProtectData: Specifies the Spark protector class for protecting the data.
spark_protector_demo.jar: Specifies the sample .jar file utilizing the Spark protector API to protect the data in the .csv file. You must create this sample .jar file by compiling the Scala class files.
<Path_of_Cleartext_data_file>: Specifies the HDFS directory path for the file with cleartext data.
<Path_of_Protected_data_file>: Specifies the HDFS directory path for the file with protected data.
basic_sample_data.csv: Specifies the name of the file from which the cleartext data is read.

Unprotecting the Protected Data

To unprotect the protected data, you must specify the location that contains the protected data and the location in which to store the unprotected data. To retrieve the protected data from the basic_sample_protected directory and save it in the basic_sample_unprotected_data directory in unprotected form, use the following command.

./spark-submit --master yarn --class com.protegrity.spark.UnProtectData <PROTEGRITY_DIR>/samples/spark/lib/spark_protector_demo.jar
<Path_of_Protected_data_file>/basic_sample_protected <Path_of_Unprotected_data_file>/basic_sample_unprotected_data

Note: Ensure that the user performing the task has the permissions to unprotect the data, as required, in the data security policy.

where,

com.protegrity.spark.UnProtectData: Specifies the Spark protector class for unprotecting the data.
spark_protector_demo.jar: Specifies the sample .jar file utilizing the Spark protector API to unprotect the data in the .csv file. You must create the sample .jar file by compiling the Scala class files.
<Path_of_Protected_data_file>/basic_sample_protected: Specifies the HDFS directory path that contains the protected data.
<Path_of_Unprotected_data_file>/basic_sample_unprotected_data: Specifies the HDFS directory path in which to store the unprotected data.

Retrieving the Unprotected Data from a File

To retrieve data from a file containing unprotected data, you must have access to the file. To view the unprotected data contained in the file, use the following command.

hadoop fs -cat <Path_of_Unprotected_data_file>/basic_sample_unprotected_data/part*

where,

<Path_of_Unprotected_data_file>/basic_sample_unprotected_data: Specifies the HDFS directory path for the file that contains the unprotected data.

getVersion()

The function returns the current version of the protector.

Signature:

public String getVersion()

Parameters:

None

Result:

The function returns the current version of the protector.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector(applicationId);
String version = protector.getVersion();

Exception:

The function throws the PtySparkProtectorException if it is unable to return the current version of the Spark protector.

getVersionExtended()

The function returns the extended version information of the protector.

Signature:

public String getVersionExtended()

Parameters:

None

Result:

The function returns a String in the following format:
```
"BDP: <1>; JcoreLite: <2>; CORE: <3>;"
```
where,
- <1>: Is the current version of the protector
- <2>: Is the JcoreLite library version
- <3>: Is the Core library version
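
If an application needs the individual component versions, the returned string can be split into name and value pairs. The following is a minimal sketch, assuming that the angle brackets in the format above are placeholders rather than literal characters and that the value is returned without surrounding quotes. The class name VersionInfoParser and the version numbers are hypothetical and are not part of the protector API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: splits "BDP: x; JcoreLite: y; CORE: z;" into component -> version pairs.
final class VersionInfoParser {

    static Map<String, String> parse(String extended) {
        // LinkedHashMap keeps the components in the order in which they appear.
        Map<String, String> versions = new LinkedHashMap<>();
        for (String part : extended.split(";")) {
            String trimmed = part.trim();
            int colon = trimmed.indexOf(':');
            if (trimmed.isEmpty() || colon < 0) {
                continue; // skip empty fragments and fragments that are not "name: value"
            }
            versions.put(trimmed.substring(0, colon).trim(),
                         trimmed.substring(colon + 1).trim());
        }
        return versions;
    }

    public static void main(String[] args) {
        // Placeholder version numbers, not real release values.
        Map<String, String> versions = parse("BDP: 1.0.0; JcoreLite: 2.0.0; CORE: 3.0.0;");
        System.out.println(versions.get("BDP"));
    }
}
```

In a real job, the argument to parse would be the value returned by getVersionExtended().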

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector(applicationId);
String version = protector.getVersionExtended();

Exception:

The function throws the PtySparkProtectorException if it is unable to return the extended version information of the Spark protector.

checkAccess()

The function checks the access permissions of the user for the specified data element(s).

Signature:

public boolean checkAccess(String dataElement, Permission permission, String... newDataElement)

Parameters:

dataElement: Specifies the name of the data element (the old data element when checking for reprotect access).
permission: Specifies the type of access of the user for the data element(s).
newDataElement: Specifies the name of the new data element when checking for reprotect access.

Result:

The function returns the following values:
- true : If the user has access to the data element(s).
- false : If the user does not have access to the data element(s).

Example:

import com.protegrity.bdp.protector.BDPProtector.Permission;
String dataElement = "dataelement";

Protector protector = new PtySparkProtector("protectAppId");
 
boolean accessProtectType = protector.checkAccess(dataElement, Permission.PROTECT);
boolean accessReprotectType = protector.checkAccess(dataElement, Permission.REPROTECT, dataElement);
boolean accessUnprotectType = protector.checkAccess(dataElement, Permission.UNPROTECT);

Exception:

The function throws the PtySparkProtectorException if it is unable to verify the access of the user for the data element(s).

hmac()

Warning: The function is marked for deprecation and will be removed in a future release.

Warning: It is recommended to use the HMAC data element with the protect() Byte API for hashing byte array data, instead of using the hmac() API.

The function performs hashing of the data using the HMAC operation on a single data item with a data element, which is associated with HMAC. It returns the hmac value of the data with the data element.

Signature:

public byte[] hmac(String dataElement, byte[] input)

Parameters:

dataElement : Specifies the name of the data element for HMAC.
input: Specifies the byte array of data for HMAC.

Result:

The function returns the Byte array of HMAC data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector(applicationId);
byte[] output = protector.hmac("HMAC-SHA1", "test1".getBytes());

Exception:

The function throws the PtySparkProtectorException if it is unable to protect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring	HMAC
hmac()	No	No	No	Yes	No	Yes	Yes

protect() - Byte array data

The function protects the data provided as an array of byte arrays. The type of protection applied is defined by the data element.

Note: For Date and Datetime type of data elements, the protect API returns an invalid input data error if the input value falls between the non-existent date range from 05-OCT-1582 to 14-OCT-1582 of the Gregorian Calendar.
For more information about the tokenization and de-tokenization of the cutover dates of the Proleptic Gregorian Calendar, refer to Date and Datetime tokenization.
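
An application can screen date values for this range before calling the protect API. The following is a minimal sketch that uses only the JDK. The class name GregorianCutoverCheck is hypothetical, and the sketch assumes that both 05-OCT-1582 and 14-OCT-1582 are part of the rejected range. It does not reproduce how the protector itself parses or validates dates.

```java
import java.time.LocalDate;

// Hypothetical pre-check for dates in the 05-OCT-1582 to 14-OCT-1582 range.
final class GregorianCutoverCheck {

    private static final LocalDate GAP_START = LocalDate.of(1582, 10, 5);
    private static final LocalDate GAP_END = LocalDate.of(1582, 10, 14);

    // Returns true if the date lies in the range, with both end dates included (an assumption).
    static boolean isInCutoverGap(LocalDate date) {
        return !date.isBefore(GAP_START) && !date.isAfter(GAP_END);
    }

    public static void main(String[] args) {
        // LocalDate uses the proleptic ISO calendar, so these dates can be represented.
        System.out.println(isInCutoverGap(LocalDate.parse("1582-10-10")));
        System.out.println(isInCutoverGap(LocalDate.parse("1582-10-15")));
    }
}
```

Values for which isInCutoverGap returns true can be routed to separate handling instead of being passed to the protect API.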

Signature:

public void protect(String dataElement, List<Integer> errorIndex, byte[][] input, byte[][] output, String... charset)

Parameters:

dataElement: Specifies the name of the data element used for protection.
errorIndex: Specifies the list of the Error Index.
input: Specifies an array of the byte array type that contains the data to protect.
output: Specifies an array of the byte array type that contains the protected data.
charset: Specifies the charset of the input data. The applicable charsets are UTF-8 (default), UTF-16LE, and UTF-16BE.

Note: The Protegrity Spark protector only supports bytes converted from the string data type. If any other data type is directly converted to bytes and passed as input to the API that supports byte as input and provides byte as output, then data corruption might occur.

Warning: If you are using the Protect API, which accepts byte as input and provides byte as output, then ensure that when unprotecting the data, the Unprotect API, with byte as input and byte as output is utilized. In addition, ensure that the byte data being provided as input to the Protect API has been converted from a string data type only.
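
Because the byte API expects bytes that were converted from strings, it can help to keep the conversion in one place and to use the same charset in both directions. The following is a minimal sketch that uses only the JDK. The class name ByteInputPrep and its methods are hypothetical helpers, not part of the protector API, and the sketch assumes that the charset used here matches the charset argument passed to the protect and unprotect calls.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: converts between String[] and the byte[][] shape used by the byte API.
final class ByteInputPrep {

    // Encodes each string with the given charset.
    static byte[][] toBytes(String[] values, Charset charset) {
        byte[][] out = new byte[values.length][];
        for (int i = 0; i < values.length; i++) {
            out[i] = values[i].getBytes(charset);
        }
        return out;
    }

    // Decodes each byte array with the given charset, for example after an unprotect call.
    static String[] toStrings(byte[][] values, Charset charset) {
        String[] out = new String[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = new String(values[i], charset);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] cleartext = {"test1", "test2"};
        byte[][] input = toBytes(cleartext, StandardCharsets.UTF_8);
        // A real job would pass input to the protector here; this sketch only round-trips it.
        String[] restored = toStrings(input, StandardCharsets.UTF_8);
        System.out.println(restored[0] + " " + restored[1]);
    }
}
```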

Result:

The output variable in the method signature contains the protected data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector(applicationId);
String dataElement = "Binary";
byte[][] input = new byte[][]{"test1".getBytes(), "test2".getBytes()};
byte[][] output = new byte[input.length][];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.protect(dataElement, errorIndexList, input, output, "UTF-8");

Exception:

The function throws the PtySparkProtectorException if it is unable to protect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring	HMAC
protect() - Byte array data	Numeric (0-9) Credit Card Alpha Upper Case Alpha Alpha Numeric Upper Alpha Numeric Lower ASCII Printable Datetime (YYYY-MM-DD HH:MM:SS) Date (YYYY-MM-DD, DD/MM/YYYY, MM.DD.YYYY) Decimal Email Binary Unicode (Legacy) Unicode (Base64) Unicode (Gen2)	AES-128 AES-256 3DES CUSP	FPE (All)	Yes	Yes	Yes	Yes

protect() - Short array data

The function protects the short format data provided as a short array. The type of protection applied is defined by dataElement.

Signature:

public void protect(String dataElement, List<Integer> errorIndex, short[] input, short[] output)

Parameters:

dataElement: Specifies the name of the data element used for protection.
errorIndex: Specifies the list of the Error Index.
input: Specifies the short array type that contains the data to protect.
output: Specifies the short array type that contains the protected data.

Result:

The output variable in the method signature contains the protected data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "short";
short[] input = new short[] {1234, 4545};
short[] output = new short[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.protect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it is unable to protect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
protect() - Short array data	Integer (2 Bytes)	No	No	Yes	No	Yes

protect() - Short array data for encryption

The function encrypts the short format data provided as a short array. The type of encryption applied is defined by dataElement.

Signature:

public void protect(String dataElement, List<Integer> errorIndex, short[] input, byte[][] output)

Parameters:

dataElement: Specifies the name of the data element used for encryption.
errorIndex: Specifies the list of the Error Index.
input: Specifies a short array type that contains the data to be encrypted.
output: Specifies an encrypted array of byte array that contains the encrypted data.

Result:

The output variable in the method signature contains the encrypted data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement= "AES-256";
short[] input = new short[] {1234, 4545};
byte[][] output = new byte[input.length][];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.protect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it is unable to encrypt the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
protect() - Short array data for encryption	No	AES-128 AES-256 3DES CUSP	No	Yes	No	Yes

protect() - Int array

The function protects the data provided as an int array. The type of protection applied is defined by the dataElement.

Signature:

public void protect(String dataElement, List<Integer> errorIndex, int[] input, int[] output)

Parameters:

dataElement: Specifies the name of the data element to protect the data.
errorIndex: Is the list of the Error Index.
input: Is an int array of data to be protected.
output: Is an int array containing the protected data.

Result:

The output variable in the method signature contains the protected int data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "int";
int[] input = new int[]{1234, 4545};
int[] output = new int[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.protect(dataElement, errorIndexList, input, output);

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
protect() - Int array	Integer (4 Bytes)	No	No	Yes	No	Yes

protect() - Int array data for encryption

The function encrypts the data provided as an int array. The type of encryption applied is defined by the dataElement.

Signature:

public void protect(String dataElement, List<Integer> errorIndex, int[] input, byte[][] output)

Parameters:

dataElement: Specifies the name of the data element to encrypt the data.
errorIndex: Is the list of the Error Index.
input: Is an int array of data to be encrypted.
output: Is an array of byte array containing the encrypted data.

Result:

The output variable in the method signature contains the encrypted data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "AES-256";
int[] input = new int[]{1234, 4545};
byte[][] output = new byte[input.length][];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.protect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it is unable to encrypt the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
protect() - Int array data for encryption	No	AES-128 AES-256 3DES CUSP	No	Yes	No	Yes

protect() - Long array data

The function protects the data provided as a long array. The type of protection applied is defined by the dataElement.

Signature:

public void protect(String dataElement, List<Integer> errorIndex, long[] input, long[] output)

Parameters:

dataElement: Specifies the name of the data element to protect the data.
errorIndex: Is the list of the error index.
input: Is the long array of data to be protected.
output: Is the long array containing the protected data.

Result:

The output variable in the method signature contains the protected data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "long";
long[] input = new long[] {1234, 4545};
long[] output = new long[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.protect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it is unable to protect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
protect() - Long array data	Integer (8 Bytes)	No	No	Yes	No	Yes

protect() - Long array data for encryption

The function encrypts the data provided as a long array. The type of encryption applied is defined by the dataElement.

Signature:

public void protect(String dataElement, List<Integer> errorIndex, long[] input, byte[][] output)

Parameters:

dataElement: Specifies the name of the data element to encrypt the data.
errorIndex: Is the list of the error index.
input: Is the long array of data to be encrypted.
output: Is an array of a byte array containing the encrypted data.

Result:

The output variable in the method signature contains the encrypted data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "AES-256";
long[] input = new long[] {1234, 4545};
byte[][] output = new byte[input.length][];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.protect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it is unable to encrypt the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
protect() - Long array data for encryption	No	AES-128 AES-256 3DES CUSP	No	Yes	No	Yes

protect() - Float array data

The function protects the data provided as a float array. The type of protection applied is defined by the dataElement.

Signature:

public void protect(String dataElement, List<Integer> errorIndex, float[] input, float[] output)

Parameters:

dataElement: Specifies the name of the data element to protect the data.
errorIndex: Is the list of the Error Index.
input: Specifies the float array of data to be protected.
output: Specifies the float array containing the protected data.

Result:

The output variable in the method signature contains the protected float data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "float";
float[] input = new float[] {123.4f, 454.5f};
float[] output = new float[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.protect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it fails to protect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
protect() - Float array data	No	No	No	Yes	No	Yes

protect() - Float array data for encryption

The function encrypts the data provided as a float array. The type of protection applied is defined by the dataElement.

Signature:

public void protect(String dataElement, List<Integer> errorIndex, float[] input, byte[][] output)

Parameters:

dataElement: Specifies the name of the data element to encrypt the data.
errorIndex: Is the list of the Error Index.
input: Specifies the float array of data to be encrypted.
output: Specifies the array of byte array containing the encrypted data.

Result:

The output variable in the method signature contains the encrypted data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "AES-256";
float[] input = new float[] {123.4f, 454.5f};
byte[][] output = new byte[input.length][];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.protect(dataElement, errorIndexList, input, output);
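
In the example above, `new byte[input.length][]` allocates only the outer array; each row is a null reference until something assigns it. The following minimal sketch illustrates that Java behavior with a hypothetical stand-in method, `fillRows`, which is not part of the Protegrity API and performs no encryption. It only assumes that a callee writes one byte array per input element into the caller-allocated output.

```java
import java.nio.charset.StandardCharsets;

public class JaggedOutputSketch {

    // Hypothetical stand-in for a callee that fills a caller-allocated byte[][].
    // It is NOT the Protegrity API and does NOT encrypt anything.
    static void fillRows(float[] input, byte[][] output) {
        for (int i = 0; i < input.length; i++) {
            // Each row is created here; rows may have different lengths.
            output[i] = Float.toString(input[i]).getBytes(StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        float[] input = new float[] {123.4f, 454.5f};

        // Only the outer array exists at this point; every row is null.
        byte[][] output = new byte[input.length][];
        System.out.println("row 0 is null before the call: " + (output[0] == null));

        fillRows(input, output);
        System.out.println("row 0 is null after the call: " + (output[0] == null));
    }
}
```

Sizing the outer array to `input.length` keeps one output row per input element, which matches the allocation used in the examples of this section.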

Exception:

The function throws the PtySparkProtectorException if it fails to encrypt the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
protect() - Float array data for encryption	No	AES-128 AES-256 3DES CUSP	No	Yes	No	Yes

protect() - Double array data

The function protects the data provided as a double array. The type of protection applied is defined by the dataElement.

Signature:

public void protect(String dataElement, List<Integer> errorIndex, double[] input, double[] output)

Parameters:

dataElement: Specifies the name of the data element to protect the data.
errorIndex: Is the list of the error index.
input: Is the double array of data to be protected.
output: Is the double array containing the protected data.

Warning: Ensure that you use the data element with the No Encryption method only. Using any other data element might cause corruption of data.

Result:

The output variable in the method signature contains the protected double data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "double";
double[] input = new double[] {123.4, 454.5};
double[] output = new double[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.protect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it fails to protect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
protect() - Double array data	No	No	No	Yes	No	Yes

protect() - Double array data for encryption

The function encrypts the data provided as a double array. The type of protection applied is defined by the dataElement.

Signature:

public void protect(String dataElement, List<Integer> errorIndex, double[] input, byte[][] output)

Parameters:

dataElement: Specifies the name of the data element to encrypt the data.
errorIndex: Is the list of the Error Index.
input: Specifies the double array of data to be encrypted.
output: Specifies an array of byte array containing the encrypted data.

Result:

The output variable in the method signature contains the encrypted data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "AES-256";
double[] input = new double[] {123.4, 454.5};
byte[][] output = new byte[input.length][];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.protect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it fails to encrypt the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
protect() - Double array data for encryption	No	AES-128 AES-256 3DES CUSP	No	Yes	No	Yes

protect() - String array data

The function protects the data provided as a string array. The type of protection applied is defined by the dataElement.

Signature:

public void protect(String dataElement, List<Integer> errorIndex, String[] input, String[] output)

Parameters:

dataElement: Specifies the name of the data element to protect the data.
errorIndex: Is the list of the error index.
input: Is the String array of data to be protected.
output: Is the String array containing the protected data.

Result:

The output variable in the method signature contains the protected String data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "AlphaNum";
String[] input = new String[] {"test1", "test2"};
String[] output = new String[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.protect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it fails to protect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring	HMAC
protect() - String array data	Numeric (0-9) Credit Card Alpha Upper Case Alpha Alpha Numeric Upper Alpha Numeric Lower ASCII Printable Datetime (YYYY-MM-DD HH:MM:SS) Date (YYYY-MM-DD, DD/MM/YYYY, MM.DD.YYYY) Decimal Email Binary Unicode (Legacy) Unicode (Base64) Unicode (Gen2)	No	FPE (All)	Yes	Yes	Yes	Yes

protect() - String array data for encryption

The function encrypts the data provided as a String array. The type of protection applied is defined by the dataElement.

Signature:

public void protect(String dataElement, List<Integer> errorIndex, String[] input, byte[][] output)

Parameters:

dataElement: Specifies the name of the data element to encrypt the data.
errorIndex: Is the list of the Error Index.
input: Specifies the String array of data to be encrypted.
output: Specifies the array of byte array containing the encrypted data.

Result:

The output variable in the method signature contains the encrypted data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "AES-256";
String[] input = new String[] {"test1", "test2"};
byte[][] output = new byte[input.length][];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.protect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it fails to encrypt the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
protect() - String array data for encryption	No	AES-128 AES-256 3DES CUSP	No	Yes	No	Yes

unprotect() - Byte array data

The function unprotects the data provided as an array of a byte array. The type of unprotection applied is defined by the dataElement.

Signature:

public void unprotect(String dataElement, List<Integer> errorIndex, byte[][] inputDataItems, byte[][] output, String... charset)

Parameters:

dataElement: Specifies the name of the data element to unprotect the data.
errorIndex: Specifies the list of the Error Index.
inputDataItems: Specifies an array of the byte array type that contains the data to unprotect.
output: Specifies an array of the byte array type that contains the unprotected data.
charset: Specifies the charset of the input data. The applicable charsets are UTF-8 (default), UTF-16LE, and UTF-16BE.

Warning: The Protegrity Spark protector only supports bytes converted from the string data type. If any other data type is directly converted to bytes and passed as input to the API that supports byte as input and provides byte as output, then data corruption might occur.
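
Because the byte-based APIs expect bytes that were converted from strings in one of the listed charsets, it can help to make the charset explicit when converting. The following sketch uses only standard Java (`java.nio.charset.StandardCharsets`); the class and helper names `CharsetBytesSketch`, `toBytes`, and `toStrings` are hypothetical and are not part of the protector API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetBytesSketch {

    // Converts each String to bytes using an explicit charset
    // instead of the platform default.
    static byte[][] toBytes(String[] values, Charset charset) {
        byte[][] rows = new byte[values.length][];
        for (int i = 0; i < values.length; i++) {
            rows[i] = values[i].getBytes(charset);
        }
        return rows;
    }

    // Converts the bytes back to Strings using the same charset.
    static String[] toStrings(byte[][] rows, Charset charset) {
        String[] values = new String[rows.length];
        for (int i = 0; i < rows.length; i++) {
            values[i] = new String(rows[i], charset);
        }
        return values;
    }

    public static void main(String[] args) {
        String[] clear = {"test1", "test2"};
        Charset[] charsets = {
            StandardCharsets.UTF_8, StandardCharsets.UTF_16LE, StandardCharsets.UTF_16BE
        };
        for (Charset cs : charsets) {
            byte[][] rows = toBytes(clear, cs);
            System.out.println(cs.name() + ": " + rows[0].length + " bytes for " + clear[0]);
        }
    }
}
```

The charset name passed to the API should then be the same one used for the conversion, for example "UTF-8" with `StandardCharsets.UTF_8`.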

Result:

The output variable in the method signature contains the unprotected data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "Binary";
byte[][] input = new byte[][] {"test1".getBytes(), "test2".getBytes()};
byte[][] output = new byte[input.length][];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.unprotect(dataElement, errorIndexList, input, output, "UTF-8");

Exception:

The function throws the PtySparkProtectorException if it is unable to unprotect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
unprotect() - Byte array data	Numeric (0-9) Credit Card Alpha Upper Case Alpha Alpha Numeric Upper Alpha Numeric Lower ASCII Printable Datetime (YYYY-MM-DD HH:MM:SS) Date (YYYY-MM-DD, DD/MM/YYYY, MM.DD.YYYY) Decimal Email Binary Unicode (Legacy) Unicode (Base64) Unicode (Gen2)	AES-128 AES-256 3DES CUSP	FPE (All)	Yes	Yes	Yes

unprotect() - Short array data

The function unprotects the data provided as a short array. The type of unprotection applied is defined by the dataElement.

Signature:

public void unprotect(String dataElement, List<Integer> errorIndex, short[] input, short[] output)

Parameters:

dataElement: Specifies the name of the data element used to unprotect the data.
errorIndex: Is the list of the Error Index.
input: Specifies the short array type that contains the data to unprotect.
output: Specifies the short array type that contains the unprotected data.

Result:

The output variable in the method signature contains the unprotected data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "short";
short[] input = new short[]{1234, 4545};
short[] output = new short[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.unprotect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it is unable to unprotect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
unprotect() - Short array data	Integer (2 Bytes)	No	No	Yes	No	Yes

unprotect() - Short array data for decryption

The function decrypts an array of byte array to get a short array. The type of decryption applied is defined by the dataElement.

Signature:

public void unprotect(String dataElement, List<Integer> errorIndex, byte[][] input, short[] output)

Parameters:

dataElement: Specifies the name of the data element used to decrypt the data.
errorIndex: Is the list of the Error Index.
input: Specifies an array of the byte array type that contains the data to be decrypted.
output: Specifies the short array that contains the decrypted data.

Result:

The output variable in the method signature contains the decrypted data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "AES-256";
// Here, input is the encrypted short array produced by the following API:
// public void protect(String dataElement, List<Integer> errorIndex, short[] input,
// byte[][] output) throws PtySparkProtectorException;
byte[][] input = { <encrypted short array> };
short[] output = new short[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.unprotect(dataElement, errorIndexList, input, output);
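
The placeholder in the example above stands for the output of the short-array protect() overload that produces an array of byte array. The following sketch shows only that data flow: the output of one call becomes the input of the other. The `ShortCodec` interface is a local stand-in that copies the parameter shapes of the two documented overloads, and `PlainStub` is a hypothetical implementation that packs each short into two bytes without any encryption. Neither is part of the Protegrity API.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShortRoundTripSketch {

    // Local stand-in with the same parameter shapes as the documented overloads.
    interface ShortCodec {
        void protect(String dataElement, List<Integer> errorIndex, short[] input, byte[][] output);
        void unprotect(String dataElement, List<Integer> errorIndex, byte[][] input, short[] output);
    }

    // Hypothetical stub: stores each short as 2 plain bytes. No encryption is performed.
    static class PlainStub implements ShortCodec {
        public void protect(String dataElement, List<Integer> errorIndex,
                            short[] input, byte[][] output) {
            for (int i = 0; i < input.length; i++) {
                output[i] = ByteBuffer.allocate(Short.BYTES).putShort(input[i]).array();
            }
        }

        public void unprotect(String dataElement, List<Integer> errorIndex,
                              byte[][] input, short[] output) {
            for (int i = 0; i < input.length; i++) {
                output[i] = ByteBuffer.wrap(input[i]).getShort();
            }
        }
    }

    // Feeds the byte[][] produced by protect() into unprotect().
    static short[] roundTrip(ShortCodec codec, String dataElement, short[] clear) {
        List<Integer> errorIndexList = new ArrayList<Integer>();
        byte[][] encrypted = new byte[clear.length][];
        codec.protect(dataElement, errorIndexList, clear, encrypted);

        short[] decrypted = new short[encrypted.length];
        codec.unprotect(dataElement, errorIndexList, encrypted, decrypted);
        return decrypted;
    }

    public static void main(String[] args) {
        short[] result = roundTrip(new PlainStub(), "AES-256", new short[] {1234, 4545});
        System.out.println(Arrays.toString(result));
    }
}
```

With the real protector, the same data element would be passed to both calls so that the decryption matches the encryption.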

Exception:

The function throws the PtySparkProtectorException if it is unable to decrypt the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
unprotect() - Short array data for decryption	No	AES-128 AES-256 3DES CUSP	No	Yes	No	Yes

unprotect() - Int array data

The function unprotects the data provided as int array. The type of unprotection applied is defined by the dataElement.

Signature:

public void unprotect(String dataElement, List<Integer> errorIndex, int[] input, int[] output)

Parameters:

dataElement: Specifies the name of the data element to unprotect the data.
errorIndex: Is the list of the Error Index.
input: Is an int array of data to be unprotected.
output: Is an int array containing the unprotected data.

Result:

The output variable in the method signature contains the unprotected int data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "int";
int[] input = new int[]{1234, 4545};
int[] output = new int[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.unprotect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it is unable to unprotect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
unprotect() - Int array data	Integer (4 Bytes)	No	No	Yes	No	Yes

unprotect() - Int array data for decryption

The function decrypts an array of byte array to get an int array. The type of decryption applied is defined by the dataElement.

Signature:

public void unprotect(String dataElement, List<Integer> errorIndex, byte[][] input, int[] output)

Parameters:

dataElement: Specifies the name of the data element to decrypt the data.
errorIndex: Is the list of the Error Index.
input: Is an array of a byte array containing the encrypted data.
output: Is an int array containing the decrypted data.

Result:

The output variable in the method signature contains the decrypted data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "AES-256";
// Here, input is the encrypted int array produced by the following API:
// public void protect(String dataElement, List<Integer> errorIndex, int[] input,
// byte[][] output) throws PtySparkProtectorException;
byte[][] input = {<encrypted int array>};
int[] output = new int[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.unprotect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it is unable to decrypt the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
unprotect() - Int array data for decryption	No	AES-128 AES-256 3DES CUSP	No	Yes	No	Yes

unprotect() - Long array data

The function unprotects the data provided as long array. The type of unprotection applied is defined by the dataElement.

Signature:

public void unprotect(String dataElement, List<Integer> errorIndex, long[] input, long[] output)

Parameters:

dataElement: Specifies the name of the data element to unprotect the data.
errorIndex: Is the list of the error index.
input: Is the long array of data to be unprotected.
output: Is the long array containing the unprotected data.

Result:

The output variable in the method signature contains the unprotected data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "long";
long[] input = new long[] {1234, 4545};
long[] output = new long[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.unprotect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it is unable to unprotect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
unprotect() - Long array data	Integer (8 Bytes)	No	No	Yes	No	Yes

unprotect() - Long array data for decryption

The function decrypts an array of byte array to get a long array. The type of decryption applied is defined by the dataElement.

Signature:

public void unprotect(String dataElement, List<Integer> errorIndex, byte[][] input, long[] output)

Parameters:

dataElement: Specifies the name of the data element to decrypt the data.
errorIndex: Is the list of the error index.
input: Is an array of byte array of data to be decrypted.
output: Is a long array containing the decrypted data.

Result:

The output variable in the method signature contains the decrypted data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "AES-256";
// Here, input is the encrypted long array produced by the following API:
// public void protect(String dataElement, List<Integer> errorIndex, long[] input,
// byte[][] output) throws PtySparkProtectorException;
byte[][] input = { <encrypted long array> };
long[] output = new long[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.unprotect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it is unable to decrypt the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
unprotect() - Long array data for decryption	No	AES-128 AES-256 3DES CUSP	No	Yes	No	Yes

unprotect() - Float array data

The function unprotects the data provided as a float array. The type of unprotection applied is defined by the dataElement.

Signature:

public void unprotect(String dataElement, List<Integer> errorIndex, float[] input, float[] output)

Parameters:

dataElement: Specifies the name of the data element to unprotect the data.
errorIndex: Is the list of the Error Index.
input: Specifies the float array of data to be unprotected.
output: Specifies the float array containing the unprotected data.

Result:

The output variable in the method signature contains the unprotected float data.

Warning: Ensure that you use the data element with the No Encryption method only. Using any other data element might cause data corruption.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "float";
float[] input = new float[] {123.4f, 454.5f};
float[] output = new float[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.unprotect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it fails to unprotect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
unprotect() - Float array data	No	No	No	Yes	No	Yes

unprotect() - Float array data for decryption

The function decrypts an array of byte array to get a float array. The type of decryption applied is defined by the dataElement.

Signature:

public void unprotect(String dataElement, List<Integer> errorIndex, byte[][] input, float[] output)

Parameters:

dataElement: Specifies the name of the data element to decrypt the data.
errorIndex: Is the list of the Error Index.
input: Is an array of a byte array containing the encrypted data.
output: Specifies the float array containing the decrypted data.

Warning: Ensure that you use a data element with either the No Encryption method or an encryption method only. Using any other data element might cause data corruption.

Result:

The output variable in the method signature contains the decrypted data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "AES-256";
// Here, input is the encrypted float array produced by the following API:
// public void protect(String dataElement, List<Integer> errorIndex, float[] input,
// byte[][] output) throws PtySparkProtectorException;
byte[][] input = { <encrypted float array> };
float[] output = new float[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.unprotect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it fails to decrypt the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
unprotect() - Float array data for decryption	No	AES-128 AES-256 3DES CUSP	No	Yes	No	Yes

unprotect() - Double array data

The function unprotects the data provided as a double array. The type of unprotection applied is defined by the dataElement.

Signature:

public void unprotect(String dataElement, List<Integer> errorIndex, double[] input, double[] output)

Parameters:

dataElement: Specifies the name of the data element to unprotect the data.
errorIndex: Is the list of the error index.
input: Is the double array of data to be unprotected.
output: Is the double array containing the unprotected data.

Warning: Ensure that you use the data element with the No Encryption method only. Using any other data element might cause corruption of data.

Result:

The output variable in the method signature contains the unprotected double data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "double";
double[] input = new double[] {123.4, 454.5};
double[] output = new double[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.unprotect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it fails to unprotect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
unprotect() - Double array data	No	No	No	Yes	No	Yes

unprotect() - Double array data for decryption

The function decrypts an array of byte array to get a double array. The type of decryption applied is defined by the dataElement.

Signature:

public void unprotect(String dataElement, List<Integer> errorIndex, byte[][] input, double[] output)

Parameters:

dataElement: Specifies the name of the data element to decrypt the data.
errorIndex: Is the list of the Error Index.
input: Specifies an array of a byte array containing the encrypted data.
output: Specifies the double array containing the decrypted data.

Warning: Ensure that you use a data element with either the No Encryption method or an encryption method only. Using any other data element might cause data corruption.

Result:

The output variable in the method signature contains the decrypted data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "AES-256";
// Here, input is the encrypted double array produced by the following API:
// public void protect(String dataElement, List<Integer> errorIndex, double[] input,
// byte[][] output) throws PtySparkProtectorException;
byte[][] input = { <encrypted double array> };
double[] output = new double[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.unprotect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it fails to decrypt the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
unprotect() - Double array data for decryption	No	AES-128 AES-256 3DES CUSP	No	Yes	No	Yes

unprotect() - String array data

The function unprotects the data provided as a String array. The type of unprotection applied is defined by the dataElement.

Signature:

public void unprotect(String dataElement, List<Integer> errorIndex, String[] input, String[] output)

Parameters:

dataElement: Specifies the name of the data element to unprotect the data.
errorIndex: Is the list of the error index.
input: Is the String array of data to be unprotected.
output: Is the String array containing the unprotected data.

Result:

The output variable in the method signature contains the unprotected data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "AlphaNum";
String[] input = new String[] {"test1", "test2"};
String[] output = new String[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.unprotect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it fails to unprotect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
unprotect() - String array data	Numeric (0-9) Credit Card Alpha Upper Case Alpha Alpha Numeric Upper Alpha Numeric Lower ASCII Printable Datetime (YYYY-MM-DD HH:MM:SS) Date (YYYY-MM-DD, DD/MM/YYYY, MM.DD.YYYY) Decimal Email Binary Unicode (Legacy) Unicode (Base64) Unicode (Gen2)	No	FPE (All)	Yes	Yes	Yes

unprotect() - String array data for decryption

The function decrypts an array of byte array to get a String array. The type of decryption applied is defined by the dataElement.

Signature:

public void unprotect(String dataElement, List<Integer> errorIndex, byte[][] input, String[] output)

Parameters:

dataElement: Specifies the name of the data element to decrypt the data.
errorIndex: Is the list of the Error Index.
input: Specifies the array of byte array containing the encrypted data.
output: Specifies the String array containing the decrypted data.

Result:

The output variable in the method signature contains the decrypted data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String dataElement = "AES-256";
// Here, input is the encrypted String array produced by the following API:
// public void protect(String dataElement, List<Integer> errorIndex, String[] input,
// byte[][] output) throws PtySparkProtectorException;
byte[][] input = { <encrypted string array> };
String[] output = new String[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.unprotect(dataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it fails to decrypt the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
unprotect() - String array data for decryption	No	AES-128 AES-256 3DES CUSP	No	Yes	No	Yes

reprotect() - Byte array data

The function reprotects the array of byte array data, protected earlier, with a different data element.

Signature:

public void reprotect(String oldDataElement, String newDataElement, List<Integer> errorIndex, byte[][] input, byte[][] output, String... charset)

Parameters:

oldDataElement: Specifies the name of the data element with which data was protected earlier.
newDataElement: Specifies the name of the new data element to reprotect the data.
errorIndex: Specifies the list of the Error Index.
input: Is an array of a byte array that contains the data to be reprotected.
output: Is an array of a byte array containing the reprotected data.
charset: Specifies the charset of the input data. The applicable charsets are UTF-8 (default), UTF-16LE, and UTF-16BE.
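
Because charset is declared as a Java varargs parameter (`String... charset`), it can be omitted at the call site, in which case the method receives an empty array. The following sketch shows one way such a parameter could be resolved to a `java.nio.charset.Charset` with UTF-8 as the fallback. The helper `resolveCharset` is hypothetical; how the protector resolves the default internally is not described in this document.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetVarargsSketch {

    // Hypothetical helper mirroring the "String... charset" parameter shape.
    // Falls back to UTF-8 when the caller passes no charset name.
    static Charset resolveCharset(String... charset) {
        if (charset == null || charset.length == 0) {
            return StandardCharsets.UTF_8;
        }
        // Charset.forName throws an unchecked exception for unknown names.
        return Charset.forName(charset[0]);
    }

    public static void main(String[] args) {
        System.out.println(resolveCharset());           // no argument: empty varargs array
        System.out.println(resolveCharset("UTF-16LE")); // explicit charset name
        System.out.println(resolveCharset("UTF-16BE"));
    }
}
```

Passing the charset explicitly, as the example in this section does with "UTF-8", avoids relying on the default.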

Result:

The output variable in the method signature contains the reprotected data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String oldDataElement = "Binary";
String newDataElement = "Binary_1";
byte[][] input = new byte[][] {"test1".getBytes(), "test2".getBytes()};
byte[][] output = new byte[input.length][];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.reprotect(oldDataElement, newDataElement, errorIndexList, input, output, "UTF-8");

Exception:

The function throws the PtySparkProtectorException if it fails to reprotect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
reprotect() - Byte array data	Numeric (0-9) Credit Card Alpha Upper Case Alpha Alpha Numeric Upper Alpha Numeric Lower ASCII Printable Datetime (YYYY-MM-DD HH:MM:SS) Date (YYYY-MM-DD, DD/MM/YYYY, MM.DD.YYYY) Decimal Email Binary Unicode (Legacy) Unicode (Base64) Unicode (Gen2)	AES-128 AES-256 3DES CUSP	FPE (All)	Yes	Yes	Yes

reprotect() - Short array data

The function reprotects the short array data that was protected earlier with a different data element.

Signature:

public void reprotect(String oldDataElement, String newDataElement, List<Integer> errorIndex, short[] input, short[] output)

Parameters:

oldDataElement: Specifies the name of the data element with which data was protected earlier.
newDataElement: Specifies the name of the new data element to reprotect the data.
errorIndex: Specifies the list of the Error Index.
input: Specifies the short array of data to be reprotected.
output: Specifies the short array containing the reprotected data.

Result:

The output variable in the method signature contains the reprotected data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String oldDataElement = "short";
String newDataElement = "short_1";
short[] input = new short[] {135, 136};
short[] output = new short[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.reprotect(oldDataElement, newDataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it is unable to reprotect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
reprotect() - Short array data	Integer (2 Bytes)	No	No	Yes	No	Yes

reprotect() - Int array data

The function reprotects the int array data that was protected earlier with a different data element.

Signature:

public void reprotect(String oldDataElement, String newDataElement, List<Integer> errorIndex, int[] input, int[] output)

Parameters:

oldDataElement: Specifies the name of the data element with which data was protected earlier.
newDataElement: Specifies the name of the new data element to reprotect the data.
errorIndex: Specifies the list of the Error Index.
input: Specifies the int array of data to be reprotected.
output: Specifies the int array containing the reprotected data.

Result:

The output variable in the method signature contains the reprotected data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String oldDataElement = "int";
String newDataElement = "int_1";
int[] input = new int[] {234,351};
int[] output = new int[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.reprotect(oldDataElement, newDataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it is unable to reprotect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
reprotect() - Int array data	Integer (4 Bytes)	No	No	Yes	No	Yes

reprotect() - Long array data

The function reprotects the long array data that was protected earlier with a different data element.

Signature:

public void reprotect(String oldDataElement, String newDataElement, List<Integer> errorIndex, long[] input, long[] output)

Parameters:

oldDataElement: Specifies the name of the data element with which data was protected earlier.
newDataElement: Specifies the name of the new data element to reprotect the data.
errorIndex: Specifies the list of the Error Index.
input: Specifies the long array of data to be reprotected.
output: Specifies the long array containing the reprotected data.

Result:

The output variable in the method signature contains the reprotected data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector (applicationId);
String oldDataElement = "long";
String newDataElement = "long_1";
long[] input = new long[] {1234, 135};
long[] output = new long[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.reprotect(oldDataElement, newDataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it is unable to reprotect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
reprotect() - Long array data	Integer (8 Bytes)	No	No	Yes	No	Yes
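
The Tokenization column for the short, int, and long overloads lists Integer (2 Bytes), Integer (4 Bytes), and Integer (8 Bytes), which match the sizes of the corresponding Java primitives. The following sketch uses only standard Java constants to check which width a value needs before an overload is chosen. The helper name `narrowestWidth` is hypothetical and is not part of the protector API.

```java
public class IntegerWidthSketch {

    // Returns the narrowest primitive width in bytes (2, 4, or 8)
    // that can hold the given value without overflow.
    static int narrowestWidth(long value) {
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            return Short.BYTES;    // fits a short
        }
        if (value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE) {
            return Integer.BYTES;  // fits an int
        }
        return Long.BYTES;         // needs a long
    }

    public static void main(String[] args) {
        long[] samples = {135L, 40000L, 5000000000L};
        for (long sample : samples) {
            System.out.println(sample + " needs " + narrowestWidth(sample) + " bytes");
        }
    }
}
```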

reprotect() - Float array data

The function reprotects the float array data that was protected earlier with a different data element.

Signature:

public void reprotect(String oldDataElement, String newDataElement, List<Integer> errorIndex, float[] input, float[] output)

Parameters:

oldDataElement: Specifies the name of the data element with which data was protected earlier.
newDataElement: Specifies the name of the new data element to reprotect the data.
errorIndex: Specifies the list of the Error Index.
input: Specifies the float array of data to be reprotected.
output: Specifies the float array containing the reprotected data.

Warning: Ensure that you use the data element with the No Encryption method only. Using any other data element might cause data corruption.

Result:

The output variable in the method signature contains the reprotected data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector(applicationId);
String oldDataElement = "NoEnc";
String newDataElement = "NoEnc_1";
float[] input = new float[] {23.56f, 26.43f};
float[] output = new float[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.reprotect(oldDataElement, newDataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it is unable to reprotect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
reprotect() - Float array data	No	No	No	Yes	No	Yes
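
The warning about the No Encryption method can be enforced in application code before the float call is made. The following is a minimal sketch under stated assumptions. NO_ENCRYPTION_ELEMENTS is a hypothetical, manually maintained allow-list of data element names that your policy defines with the No Encryption method. The names NoEnc and NoEnc_1 are taken from the example in this section. The check relies on that list because no API for querying the method of a data element is described here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class NoEncryptionGuard {

    // Hypothetical allow-list. Keep it in sync with the data elements
    // that the security policy defines with the No Encryption method.
    static final Set<String> NO_ENCRYPTION_ELEMENTS =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("NoEnc", "NoEnc_1")));

    /** Throws IllegalArgumentException if any name is missing from the allow-list. */
    static void requireNoEncryption(String... dataElements) {
        for (String dataElement : dataElements) {
            if (!NO_ENCRYPTION_ELEMENTS.contains(dataElement)) {
                throw new IllegalArgumentException(
                        "Data element is not on the No Encryption allow-list: " + dataElement);
            }
        }
    }

    public static void main(String[] args) {
        // Both names are on the allow-list, so the check passes silently.
        requireNoEncryption("NoEnc", "NoEnc_1");

        try {
            // "AlphaNum" is not on the allow-list, so the check rejects it.
            requireNoEncryption("NoEnc", "AlphaNum");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Calling requireNoEncryption(oldDataElement, newDataElement) immediately before reprotect() stops a float or double call from reaching the protector with an unsuitable data element.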

reprotect() - Double array data

The function reprotects the double array data that was protected earlier with a different data element.

Signature:

public void reprotect(String oldDataElement, String newDataElement, List<Integer> errorIndex, double[] input, double[] output)

Parameters:

oldDataElement: Specifies the name of the data element with which data was protected earlier.
newDataElement: Specifies the name of the new data element with which to reprotect the data.
errorIndex: Specifies the list of error indexes.
input: Specifies the double array of data to be reprotected.
output: Specifies the double array containing the reprotected data.

Warning: Ensure that you use the data element with the No Encryption method only. Using any other data element might cause data corruption.

Result:

The output variable in the method signature contains the reprotected data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector(applicationId);
String oldDataElement = "NoEnc";
String newDataElement = "NoEnc_1";
double[] input = new double[] {235.5, 1235.66};
double[] output = new double[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.reprotect(oldDataElement, newDataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it is unable to reprotect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
reprotect() - Double array data	No	No	No	Yes	No	Yes
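
The signature takes primitive double[] arrays, whereas values gathered from a Java collection are boxed Double objects. The following is a minimal sketch of the conversion in both directions, using only the Java standard library. The class and method names are illustrative, and the reprotect() call itself is left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class DoubleArrayConversion {

    /** Unboxes a list into a primitive array. A null element causes a NullPointerException. */
    static double[] toPrimitive(List<Double> values) {
        return values.stream().mapToDouble(Double::doubleValue).toArray();
    }

    /** Boxes a primitive array back into a list, for example to return it from a transformation. */
    static List<Double> toBoxed(double[] values) {
        return Arrays.stream(values).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Double> collected = Arrays.asList(235.5, 1235.66);

        double[] input = toPrimitive(collected);
        // The output array is sized to match the input before it is passed to reprotect().
        double[] output = new double[input.length];

        System.out.println("Input length: " + input.length + ", output length: " + output.length);
        System.out.println("Boxed again: " + toBoxed(input));
    }
}
```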

reprotect() - String array data

The function reprotects the String array data that was protected earlier with a different data element.

Signature:

public void reprotect(String oldDataElement, String newDataElement, List<Integer> errorIndex, String[] input, String[] output)

Parameters:

oldDataElement: Specifies the name of the data element with which data was protected earlier.
newDataElement: Specifies the name of the new data element with which to reprotect the data.
errorIndex: Specifies the list of error indexes.
input: Specifies the String array of data to be reprotected.
output: Specifies the String array containing the reprotected data.

Result:

The output variable in the method signature contains the reprotected data.

Example:

String applicationId = sparkContext.getConf().getAppId();
Protector protector = new PtySparkProtector(applicationId);
String oldDataElement = "AlphaNum";
String newDataElement = "AlphaNum_1";
String[] input = new String[] {"test1", "test2"};
String[] output = new String[input.length];
List<Integer> errorIndexList = new ArrayList<Integer>();
protector.reprotect(oldDataElement, newDataElement, errorIndexList, input, output);

Exception:

The function throws the PtySparkProtectorException if it is unable to reprotect the data.

Supported Protection Methods:

Function Name	Tokenization	Encryption	FPE	No Encryption	Masking	Monitoring
reprotect() - String array data	Numeric (0-9); Credit Card; Alpha; Upper Case Alpha; Alpha Numeric; Upper Alpha Numeric; Lower ASCII; Printable; Datetime (YYYY-MM-DD HH:MM:SS); Date (YYYY-MM-DD, DD/MM/YYYY, MM.DD.YYYY); Decimal; Email; Binary; Unicode (Legacy); Unicode (Base64); Unicode (Gen2)	No	FPE (All)	Yes	Yes	Yes
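
When String values arrive as an iterator, for example inside a partition-level transformation of an RDD, they have to be gathered into a String[] before the bulk call. The following is a minimal sketch that uses only the Java standard library. The nextBatch helper and the batch size are hypothetical, and the reprotect() call itself is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class StringBatchSketch {

    /** Drains up to batchSize values from the iterator into a new String array. */
    static String[] nextBatch(Iterator<String> values, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> batch = new ArrayList<>(batchSize);
        while (values.hasNext() && batch.size() < batchSize) {
            batch.add(values.next());
        }
        return batch.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Stands in for the iterator that a partition-level transformation receives.
        Iterator<String> partition = Arrays.asList("test1", "test2", "test3").iterator();
        int batchSize = 2; // hypothetical value, tune it for your workload

        while (partition.hasNext()) {
            String[] input = nextBatch(partition, batchSize);
            // The output array is sized to match the batch before both are passed to reprotect().
            String[] output = new String[input.length];
            System.out.println(Arrays.toString(input) + " prepared, output length " + output.length);
        }
    }
}
```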

Last modified : December 18, 2025