Understanding Protegrity Anonymization Python SDK Requests

The following APIs are available with Protegrity Anonymization. You can import Protegrity Anonymization into your Python SDK environment, pass the required parameters and data to the Protegrity Anonymization Python SDK requests, and retrieve and work with the anonymized output.

Before running the anonymization jobs mentioned in the Protegrity Anonymization SDK section below, complete the following prerequisites:

  • Ensure that the Anonymization machine is set up and is configured as "https://anon.protegrity.com/".
    For more information about setting up and configuring an Anonymization machine for AWS and Azure, refer to AWS and Azure.
  • Ensure that the disk is not full and enough free space is available to save the destination file.
  • Verify the destination file is not in use. Set the required permissions for creating and modifying the destination file.
  • Verify that the anonymization job exists.
  • Verify the import of the Pythonic SDK. For example, import anonsdk as asdk.
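
The disk-space and permission prerequisites above can be checked before a job is submitted. The following is a minimal sketch that uses only the Python standard library and assumes the destination is a path on a locally mounted file system. The helper name check_destination and the required_bytes estimate are hypothetical and are not part of the Protegrity Anonymization SDK. Checking whether the file is in use is platform-specific and is not covered here.

```python
import os
import shutil


def check_destination(path, required_bytes):
    """Return a list of problems found for the destination file path."""
    problems = []
    folder = os.path.dirname(os.path.abspath(path))

    if not os.path.isdir(folder):
        problems.append(f"destination folder does not exist: {folder}")
        return problems

    # Is there enough free disk space to hold the anonymized output?
    free = shutil.disk_usage(folder).free
    if free < required_bytes:
        problems.append(f"only {free} bytes free, {required_bytes} required")

    # Is there permission to create or modify the destination file?
    target = path if os.path.exists(path) else folder
    if not os.access(target, os.W_OK):
        problems.append(f"no write permission for: {target}")

    return problems
```

An empty list means that none of the checked problems were found. It does not guarantee that the job will succeed.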

You can use different sample requests to build and run the Protegrity Anonymization APIs. For more information about the sample requests for Python SDK, refer to Sample Requests for Protegrity Anonymization.

Understanding the AnonElement object

The AnonElement is an essential part of the Protegrity Anonymization SDK. It holds all information that is required for processing the anonymization request. The AnonElement is a part of the anonsdk package.

Protegrity Anonymization SDK processes a Pandas dataframe to anonymize data using the Protegrity Anonymization REST API. It is the AnonElement that accepts the parameters and passes the information to the REST API. The AnonElement accepts the connection to the REST API, the Pandas dataframe with the data that must be processed, and, optionally, the source location for processing the request.

Protegrity Anonymization Functions

The Protegrity Anonymization Functions APIs are used to run the anonymization job.

Anonymize

The Anonymize API is used to start an anonymize operation.

For more information about the anonymize API, refer to Submit a new anonymization job.

Note: When you run the job, an empty destination file is created. This file is created during processing for verifying the necessary destination permissions. Avoid using this file till the anonymization job is complete.

Ensure that the anonymized data file and the logs generated are moved to a different system before deleting your environment.

If the source file is larger than the maximum limit that is allowed on the Cloud environment, then run the anonymization request with "additional_properties": { "single_file": "no" }.

Apply Anonymize

The Apply Anonymize API is used as a template to anonymize additional entries. With this API, you can reuse an existing configuration to process additional data. This is especially useful in machine learning for training the system to anonymize new data points.

Note: In this API, privacy model parameters are ignored while performing the anonymization for the new entry.

For more information about the apply anonymize API, refer to Apply anonymization config to a given dataset.

Apply Anonymize API Parameters

Use this API to start an anonymize operation.

Apply Anonymize Job Information

Function: anonymize(anon_object, target_datastore, force, mode)

Parameters:

anon_object: The object with the configuration for performing the anonymization request.

target_datastore: The location to store the anonymized result.

force: The boolean value to force the operation.
Acceptable values: True and False.
Set this flag to True to resubmit the same anonymized job without any modification.

mode: The value to enable auto anonymization.
Acceptable value: auto.
Do not include this parameter to skip auto anonymization.

Return Type: A job object with which task monitoring and task statistics can be obtained.

Sample Request:

Without auto anonymization: job = asdk.anonymize(anon_object, target_datastore, force=True)

With auto anonymization: job = asdk.anonymize(anon_object, target_datastore, force=True, mode="auto")

Note: When you run the job, an empty destination file is created. This file is created during processing for verifying the necessary destination permissions. Avoid using this file till the anonymization job is complete.

For more information about using the Auto Anonymization, refer to Using the Auto Anonymizer.

Ensure that the anonymized data file and the logs generated are moved to a different system before deleting your environment.

If the source file is larger than the maximum limit that is allowed on the Cloud environment, then run the anonymization request with "additional_properties": { "single_file": "no" }.

If you want to bypass the Anon-Storage, then you can disable the pods by setting the pty_storage flag to False.
For example, use the following code to run the anonymization request without using the storage pods:

job=asdk.anonymize(anon_object, pty_storage=False)
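
Because the mode parameter must be left out entirely to skip auto anonymization, it can be convenient to build the optional arguments in one place. The following is a minimal sketch; the helper anonymize_options is hypothetical and not part of the SDK. It only uses the keyword names shown in the sample requests above (force, mode, and pty_storage). Whether pty_storage can be combined with a target_datastore is an assumption that should be verified against your installation.

```python
def anonymize_options(force=False, auto=False, use_storage=True):
    """Build keyword arguments for asdk.anonymize() from plain booleans."""
    options = {"force": force}
    if auto:
        # "auto" is the only accepted value for mode; omitting mode
        # skips auto anonymization.
        options["mode"] = "auto"
    if not use_storage:
        # pty_storage=False bypasses the Anon-Storage pods.
        options["pty_storage"] = False
    return options


# Usage, assuming asdk, anon_object, and target_datastore already exist:
# job = asdk.anonymize(anon_object, target_datastore,
#                      **anonymize_options(force=True, auto=True))
print(anonymize_options(force=True, auto=True))
# prints {'force': True, 'mode': 'auto'}
```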

Measure

The Measure API is used to measure or obtain anonymization result statistics for different configurations before running the actual anonymization job.

For more information about the anonymize measure job API, refer to Submit a new anonymization Measure job.

Using Infer to Anonymize API Parameters

Use the Infer API to start auto-detecting the data domain, classification type, hierarchies, and anonymization configuration in Protegrity Anonymization. Any user-defined configuration, such as QI attribute assignments, hierarchy, and K value, is retained and considered while performing the auto anonymization.

Using Infer to Anonymize Information

Function: infer(targetVariable)

Parameters:

targetVariable: The field specified here is used as a focus point for performing the anonymization.

Return Type: Returns an anon element with all the detected classifications and hierarchies generated.

Sample Request: e.infer(targetVariable='income')

Note: You can use e.measure() to modify the request and view different outcomes of the result set.

For more information about the Infer API, refer to Using Infer to Anonymize.

Task Monitoring APIs

The Task Monitoring APIs are used to monitor the anonymization job. Use these APIs to obtain the job status, retrieve a job, and abort a job.

Get Job IDs

The Get Job IDs API is used to get the job IDs of the last 20 anonymization operations that are running, in queue, or completed. You can then use the required job ID with the other APIs to work with the anonymization job.

For more information about the job ID API, refer to Obtain job ids.

Get Job Status

The Get Job Status API is used to get the status of an anonymize operation that is running, in queue, or complete. It shows the percentage of job completion. Use the information provided here to monitor if a job is running or stalled.

For more information about the job status API, refer to Obtain job status.

Get Job Status API Parameters

Use this API to get the status of an anonymize operation that is running. It shows the percentage of job completion. Use the information provided here to monitor if a job is running or stalled.

Monitor Job Information

Function: status()

Parameters: None

Return Type: A string with the status information in the JSON format.

completed: This is information about the job, such as data, statistics, summary, and time spent.

id: This is the job ID.

info: This is information about the job being processed, such as the source and attributes for the job.

running: This is the completion status of the jobs being processed. It shows the percentage of the job completed.

status: This is the status of the job, such as running or completed.

Note: This API displays the complete status information of the job. To obtain only the ID of a job, use job.id().

Sample Request: job.status()
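
Because status() returns a JSON string, a small polling loop can parse it and wait until the job is no longer active. The following is a minimal sketch; the helper wait_for_job is hypothetical and not part of the SDK. The state value "running" comes from the description above, while "queued" is only a guess for jobs that are in queue, so adjust active_states to the values that your installation actually reports.

```python
import json
import time


def wait_for_job(job, poll_seconds=5.0, max_polls=120,
                 active_states=("running", "queued")):
    """Poll job.status() until the job leaves an active state.

    Returns the last parsed status dictionary.
    """
    for _ in range(max_polls):
        info = json.loads(job.status())  # status() returns a JSON string
        state = str(info.get("status", "")).lower()
        if state not in active_states:
            return info
        # The "running" field holds the completion progress of the job.
        print(f"state={state} progress={info.get('running')}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"job still active after {max_polls} polls")
```

The loop stops with a TimeoutError rather than waiting forever, which makes a stalled job visible to the caller.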

Get Metadata

The Get Metadata API is used to retrieve the metadata for the existing job. This API is useful when you need to view the configuration available for a job. It displays the fields, configuration, and the data that is used to run the anonymization job.

For more information about the metadata API, refer to Obtain job metadata.

Retrieve Anonymized Data API Parameters

Use this API to retrieve the results of an anonymized job.

Retrieve Job Information

Function: result()

Parameters: None

Return Type: Returns the AnonResult element, which provides the DataFrame for the anonymized data.

Note: The result.df will be None if you have overridden the resultstore as part of the anonymize method.

Sample Request: job.result()

Note: This is a blocking API and will stall processing till the job is complete.
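
Since result.df can be None, it helps to check for that case explicitly instead of failing later with an unrelated error. The following is a minimal sketch; the helper anonymized_frame is hypothetical and relies only on job.result() and the df attribute described above.

```python
def anonymized_frame(job):
    """Wait for the job to finish and return its DataFrame, or fail clearly."""
    result = job.result()  # blocks until the job is complete
    if result.df is None:
        # This happens when the result store was overridden in anonymize().
        raise RuntimeError(
            "No DataFrame was returned; read the output from the result "
            "store that was passed to anonymize()."
        )
    return result.df
```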

Abort

The Abort API is used to abort a running anonymization job. You can abort jobs if you need to modify the parameters or if the job is stalled or taking too much time or resources to process.

For more information about the abort API, refer to Abort a running anonymization job.

Note: After aborting the task, it might take time before all the running processes are stopped.

Abort API Parameters

Use this API to abort a running anonymize operation. You can abort jobs if you need to modify the parameters or if the job is stalled or taking too much time or resources to process.

Abort Job Information

Function: abort()

Parameters: None

Return Type: A string with the status of the abort request.

Sample Request: job.abort()
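
One way to decide when to abort is to watch the progress reported by status() and call abort() when it stops changing. The following is a minimal sketch; the helper abort_if_stalled and its thresholds are hypothetical. It compares the raw "running" field between polls, so it makes no assumption about the exact format of that value.

```python
import json
import time


def abort_if_stalled(job, stall_polls=3, poll_seconds=10.0, max_polls=360):
    """Abort the job when its progress stops changing; return True if aborted."""
    previous = object()  # sentinel that never equals a real progress value
    unchanged = 0
    for _ in range(max_polls):
        info = json.loads(job.status())
        if str(info.get("status", "")).lower() != "running":
            return False  # the job is not running, so there is nothing to abort
        progress = info.get("running")
        unchanged = unchanged + 1 if progress == previous else 0
        previous = progress
        if unchanged >= stall_polls:
            print("abort response:", job.abort())
            return True
        time.sleep(poll_seconds)
    return False
```

As noted above, the running processes might take some time to stop after the abort request is accepted.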

Delete

The Delete API is used to delete an existing job that is no longer required.

For more information about the delete API, refer to Delete a job.

Statistics APIs

Statistics APIs are used to obtain information about the anonymization data. Use these APIs to obtain the risk and utility information about the anonymization. The user needs to access these APIs to measure the utility benefits and risk of publishing the anonymized data. If these configurations are not satisfactory, then the user can re-submit the anonymization job after modifying some parameters based on these results.

Get Exploratory Statistics

The Get Exploratory Statistics API is used to obtain data distribution statistics about a completed anonymization job. The statistics cover both the source and the target distribution.

For more information about the exploratory statistic API, refer to Obtain the exploratory statistics.

Get Risk Metric

The Get Risk Metric API is used to ascertain the risk of the anonymized data. It shows the risk of the data against attacks such as journalist, marketer, and prosecutor.

For more information about the risk metric API, refer to Obtain the risk statistics.
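
To give a feel for what such a risk figure expresses, the following sketch computes re-identification risk under the prosecutor model using the common textbook definition: records that share the same quasi-identifier values form an equivalence class, and the risk for a record is 1 divided by the size of its class. This is an illustration only and is not claimed to be the formula that Protegrity Anonymization uses. The function name and the sample rows are hypothetical.

```python
from collections import Counter


def prosecutor_risk(records, quasi_identifiers):
    """Return (highest, average) re-identification risk for a list of dicts."""
    # Records with identical quasi-identifier values form one equivalence class.
    classes = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    highest = 1 / min(classes.values())    # worst case: the smallest class
    average = len(classes) / len(records)  # mean of 1/class_size over records
    return highest, average


rows = [
    {"age": "30-39", "zip": "100**"},
    {"age": "30-39", "zip": "100**"},
    {"age": "40-49", "zip": "100**"},
    {"age": "40-49", "zip": "100**"},
    {"age": "40-49", "zip": "100**"},
]
print(prosecutor_risk(rows, ["age", "zip"]))  # prints (0.5, 0.4)
```

Here the smallest class has two records, so the highest risk is 0.5. Generalizing the data further, so that classes grow larger, lowers both values.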

Get Utility Statistics

The Get Utility Statistics API is used to check the usability of the anonymized data.

For more information about the utility statistics API, refer to Obtain the anonymization data utility statistics.


Last modified : March 24, 2026