This codec lets you define custom extraction logic and pass arguments to the next rule. The language that is currently supported for extraction is Python.
From DSG 3.0.0.0, the Python version is upgraded to Python 3. UDFs written in Python v2.7 are not compatible with Python v3.10. To migrate UDFs from Python 2 to Python 3, refer to the section Migrating the UDFs to Python 3.
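As a rough illustration of the kind of change such a migration involves, the sketch below shows three common Python 2 to Python 3 differences: `cStringIO` is replaced by `io.BytesIO`, bytes must be decoded explicitly, and `print` is a function. The helper name `gunzip_text` and the sample payload are hypothetical and are not part of the DSG; the section Migrating the UDFs to Python 3 remains the authoritative guide.

```python
import gzip
import io


def gunzip_text(payload):
    """Decompress gzip-compressed bytes and return the decoded text."""
    # Python 2 code often used cStringIO.StringIO here; Python 3 uses io.BytesIO.
    with gzip.GzipFile(fileobj=io.BytesIO(payload)) as handle:
        raw = handle.read()  # bytes in Python 3, not str
    # bytes and str are distinct types in Python 3, so decode explicitly.
    return raw.decode("utf-8")


# Hypothetical sample payload, for illustration only.
compressed = gzip.compress("card=4111".encode("utf-8"))
text = gunzip_text(compressed)
print(text)  # print is a function in Python 3
```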
The following figure illustrates the User Defined Extraction payload fields.
The properties for the User Defined Extraction payload are explained in the following table.
Properties | Description |
---|---|
Programming Language | Programming language used for data extraction. The language that is currently supported for extraction is Python. |
Source Code | Source code for the selected programming language. CAUTION: Ensure that the class name UserDefinedExtraction is not changed while creating the UDF. Note: For more information about the supported libraries apart from the default Python modules, refer to the section Supported Libraries. |
Initialization Arguments | List of arguments passed to the constructor of the user defined extraction code. |
Rule Advanced Settings | Specify one or more blocked modules to unblock. The modules are unblocked only for this extract rule. Set the parameter to the names of the modules in the following format: {"override_blocked_modules": ["<name of module>", "<name of module>"]} Note: Currently, methods cannot be overridden using Advanced Settings. For more information about the allowed methods and modules, refer to the section Allowed Modules and Methods in UDF. For example, the gateway.json file allows the following modules: "globalUDFSettings" : { "allowed_modules":["bs4", "common.logger", "re", "gzip", "fromstring", "cStringIO","struct", "traceback"] } The os module is not listed in the allowed_modules parameter of the gateway.json file, so it is blocked. To allow the os module in the Source Code of UDF rules, set {"override_blocked_modules": ["os"]} in the Advanced Settings of the extract rule. Note: Overriding blocked modules can introduce security risks to the DSG system. |
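
The following sketch shows one way to work out, before saving a rule, which modules a UDF imports that are not in the allowed list, and to produce the matching Advanced Settings value. It is a local helper only, not how the DSG enforces the block list. The function names are hypothetical, and it assumes that module names are matched exactly as written in the import statement.

```python
import ast
import json

# Allowed modules copied from the gateway.json example in the table above.
ALLOWED_MODULES = {"bs4", "common.logger", "re", "gzip", "fromstring",
                   "cStringIO", "struct", "traceback"}


def imported_modules(source):
    """Return the set of module names imported by the given Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module)
    return names


def advanced_settings_for(source, allowed=ALLOWED_MODULES):
    """Build the Rule Advanced Settings JSON for modules not in `allowed`."""
    # Assumption: exact-name matching against the allowed list.
    blocked = sorted(imported_modules(source) - set(allowed))
    return json.dumps({"override_blocked_modules": blocked})


# The source is only parsed, never executed, so the modules need not exist here.
udf_source = "import os\nimport re\nimport common.logger\n"
settings = advanced_settings_for(udf_source)
print(settings)  # {"override_blocked_modules": ["os"]}
```

Because the helper only parses the source text, it does not detect modules loaded dynamically, so treat its output as a starting point for review.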
Note: The DSG supports the usage of the PyJWT Python library in custom UDF creations. PyJWT is a Python library that is used to implement Open Authorization (OAuth) using JSON Web Tokens (JWT). JWT is an open standard that defines how to transmit information between a sender and a receiver as a JSON object. To authenticate JWT for OAuth, you must write a custom UDF. The PyJWT library version supported by the DSG is 1.7.1.
For more information about writing a custom UDF on the DSG, refer to the section User Defined Functions (UDF).
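
To make the JWT format concrete, the sketch below builds and verifies an HS256-signed token using only the Python standard library. It is not the PyJWT implementation. In a UDF, PyJWT's `jwt.encode` and `jwt.decode` would normally do this work. The function names, secret, and claims are hypothetical, and the sketch checks only the signature, not the `alg` header or expiry claims.

```python
import base64
import hashlib
import hmac
import json


def b64url(data):
    """Base64url-encode bytes without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text):
    """Decode a base64url string, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims, secret):
    """Build a compact JWT (header.payload.signature) signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    digest = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(digest)


def verify_hs256(token, secret):
    """Return the claims if the signature is valid, otherwise raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    signing_input = (header_b64 + "." + payload_b64).encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    # compare_digest avoids leaking timing information during the comparison.
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("invalid signature")
    return json.loads(b64url_decode(payload_b64))


# Hypothetical secret and claims, for illustration only.
token = sign_hs256({"sub": "user-1"}, b"example-secret")
claims = verify_hs256(token, b"example-secret")
print(claims)
```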
Note: The DSG supports the usage of the Kafka Python library in custom UDF creations. The Kafka library is used for storing, processing, and forwarding data for applications in a distributed environment. For example, the DSG uses the Kafka library to forward Transaction Metrics logs to external applications. The Kafka library version supported by the DSG is 2.0.2.
Note: The DSG supports the usage of the Openpyxl Python library in custom UDF creations. Openpyxl is a Python library that is used to parse Excel xlsx, xlsm, xltx, and xltm files. This library enables column-based transformation for Microsoft Office Excel. The Openpyxl library version supported by the DSG is 2.6.4.
Last modified January 30, 2025

Note: The DSG uses the built-in tarfile Python module for custom UDF creation. This module is used in the DSG to parse .tar and .tgz packages. Using the tarfile module, you can extract and decompress .tar and .tgz packages.
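
A minimal sketch of reading a .tgz package with the standard tarfile module follows. It works entirely in memory so that it can run anywhere. The helper names and sample file names are hypothetical, and how the payload reaches a UDF in the DSG is not shown.

```python
import io
import tarfile


def build_tgz(files):
    """Create an in-memory .tgz archive from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def read_tgz(payload):
    """Return {name: bytes} for every regular file in a .tar or .tgz payload."""
    contents = {}
    # Mode "r:*" lets tarfile detect the compression (none, gzip, ...) itself.
    with tarfile.open(fileobj=io.BytesIO(payload), mode="r:*") as archive:
        for member in archive.getmembers():
            if member.isfile():
                contents[member.name] = archive.extractfile(member).read()
    return contents


# Hypothetical sample files, for illustration only.
sample = build_tgz({"records/a.txt": b"alpha", "records/b.txt": b"beta"})
extracted = read_tgz(sample)
print(sorted(extracted))  # ['records/a.txt', 'records/b.txt']
```

Reading members through `extractfile` keeps the contents in memory instead of writing them to disk, which avoids path handling for untrusted archive member names.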