Configuration over Programming (CoP) is a key paradigm in the Protegrity Gateway Technology. It enables a CoP administrator to create a set of rules that instruct the gateway on how to process the data that traverses it.
CoP is also key to the user experience. The structure of the rules is as important as the rules themselves. Together, the set of rules, their structure, and an easy-to-use interface make up the powerful concept called CoP.
The DSG is fundamentally architected on the CoP principle. CoP suggests that configuration should be the preferred way of extending or customizing a system as opposed to programming. Users configure rules in the UI to define step-by-step processing of incoming messages. This allows DSG users to manage any kind of input message so long as they have corresponding rules configured in the DSG. The rules are generally categorized as extraction and transformation.
The DSG product evolution started with Static CoP, where the request processing rules are configured ahead of time. The DSG now also offers Dynamic CoP, which allows rule definitions structured as JSON to be injected dynamically into the request messages and executed on the fly.
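As a rough illustration of the Dynamic CoP idea, the sketch below builds (but does not send) an HTTP request whose header carries a JSON rule definition while the body carries the data. The header name `X-Example-CoP-Rules`, the URL, and every field in the JSON are hypothetical placeholders chosen for this example; they are not the DSG's actual Dynamic CoP schema.

```python
import json
import urllib.request

# Hypothetical rule tree. The field names are illustrative placeholders,
# not the DSG's actual Dynamic CoP schema.
rules = {
    "type": "extract",
    "method": "regex",
    "pattern": r"\w+",
    "children": [{"type": "transform", "action": "protect"}],
}


def build_dynamic_request(url, payload, rules, header_name="X-Example-CoP-Rules"):
    """Build (but do not send) a POST whose header carries the JSON rules."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),  # the data to protect travels in the body
        method="POST",
    )
    request.add_header("Content-Type", "text/plain")
    request.add_header(header_name, json.dumps(rules))  # rules travel in a header
    return request


request = build_dynamic_request(
    "https://gateway.example.com/protect", "John Smith", rules
)
```

The point of the sketch is only the division of labor: the rules ride along with each request instead of being configured on the gateway ahead of time.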
DSG users configure the CoP Rulesets to construct a REST API that is suitable to their environment. DSG’s RESTful interface is high-level. Its API users are not exposed to low-level underlying crypto API message sequences such as open and close session. Further, low-level parameters such as data element name or session handle are not exposed either. User identity can be obtained as follows:
The following figure shows high-level functionality of the DSG RESTful interface.
For simplicity, the DSG example above shows a plain text string that is tokenized word-by-word. The tokens are returned in the 200 OK response. DSG comes with a whole battery of “codecs”. Codecs are message parsers that allow DSG to parse and process complex payload bodies. DSG’s codecs include the following payloads:
Further, DSG allows custom extraction and transformation rules to be written in Python and plugged into DSG CoP Rulesets.
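To give a sense of the kind of logic a custom Python transformation might contain, here is a minimal sketch of a masking function. The function name and signature are assumptions made for this example; the actual interface that the DSG expects from a plugged-in rule is not shown in this section and may differ.

```python
def mask_all_but_last_four(value: str, mask_char: str = "*") -> str:
    """Hypothetical custom transformation: mask all but the last four characters.

    This is a standalone illustration, not the DSG plug-in signature.
    """
    if len(value) <= 4:
        return value  # nothing to mask on very short inputs
    return mask_char * (len(value) - 4) + value[-4:]
```

For instance, calling `mask_all_but_last_four("4111111111111111")` yields twelve mask characters followed by `1111`.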
The following sections describe the Ruleset, the Ruleset structure, and the Ruleset engine, followed by an example.
The DSG includes built-in standard protocol codecs. These allow configuration-driven payload parsing and processing for most data security use cases.
The Ruleset describes a set of instructions that the gateway uses to transform data as it traverses the gateway in any direction. The various kinds of Rule objects currently available in the gateway are illustrated in the following figure.
A typical Ruleset is constructed from the Extract and Transform rules.
The core rules available today are as follows:
Extract: Extraction rules are responsible for extracting smaller pieces of data from larger bodies of data. By engaging existing codecs, they can also interpret data according to predefined encoding schemes. Extraction rules function as data filters and do not actually manipulate data. Therefore, they are branch nodes in the Ruleset tree and have child rules below them.
Transform: Transformation rules are responsible for manipulating data passed into them. Typical data security use cases will employ Transformation rules for the following:
Customers are not limited to the out-of-the-box security actions. They can build their own actions with Transformation User Defined Functions (UDFs), which extend the out-of-the-box transformations.
Log: The Log rule object adds log entries to the DSG log. Users can define the logging level that is reflected in the log and can also decide in this rule where the log is saved.
Exit: The Exit option acts as a terminating action and the rules are not processed further.
Set User Identity: The Set User Identity rule object takes effect if username details are part of the payload. The Protegrity Data Protection transformation uses the value set in this rule so that subsequent transformation actions are performed as the set user.
Profile Reference: An external profile can be referenced using the Profile Reference action. This rule transfers the control to a separate batch of rules grouped in a profile.
Error: Use this action to add a custom response message for any invalid content.
Dynamic Injection: Use Dynamic CoP to send rules for extraction and transformation as part of a request header, along with the data for protection in the request message body.
Set Context Variable: Use this action type to set a variable to any value, which can then be used as an input to other rules. The value set by this rule is kept throughout the rule lifecycle.
Rulesets are organized in a hierarchical structure where Extract rules are branch nodes and other rules such as Transform rules are leaf nodes. In other words, extract specific data from the payload and then perform a Transform action on the data extracted.
Rules are compartmentalized into Profile containers. Profile containers can be enabled or disabled and they can also be referenced by a Profile Reference rule.
A typical Ruleset is traversed recursively and processed in sequence. With this mechanism, the sibling rules that belong to a given parent, and all the child rules that belong to each sibling, are executed sequentially from top to bottom, with no provision for conditional branching.
However, this disallows decision-based, mutually exclusive execution of individual child rules on various parts of extracted data within the same extraction context, such as a row in a CSV file, groups within a regular expression, or multiple XPaths within an XML document. When various parts of the extracted data need to be processed differently, the same data has to be extracted or parsed multiple times.
The RuleSet ToT feature is an enhancement to the RuleSet algorithm that addresses this drawback. With the RuleSet ToT feature, an extraction parent rule can have multiple child rules that are executed mutually exclusively, based on a condition applied in the parent rule. The feature allows various parts of extracted data to be processed downstream using different profile references. Since the profile references are sub-trees in and of themselves, this feature adds a Tree-of-Trees structural notation to the CoP rulesets.
The following compares the layout and execution paths of traditional rulesets with the ToT rulesets:
In the above example, a CSV payload needs to be processed as per the following requirements:
The traditional RuleSet strategy involved extracting or parsing the same CSV payload three times, once for each column that needs protection with a different data element, as shown on the left side. In contrast, a ToT-enabled RuleSet extracts the CSV payload only once, and the values extracted from different columns are sent down different child rules that apply different protection data elements. Consequently, the overall CSV payload processing time reduces substantially.
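The difference between the two strategies can be sketched in a few lines of Python. This is an illustration of the idea only, not the gateway's implementation: `protect` is a stand-in that merely tags a value, and the column-to-data-element mapping (`DE_NAME`, `DE_SSN`, `DE_EMAIL`) is hypothetical.

```python
import csv
import io


def protect(value, data_element):
    """Stand-in for a protect action; it only tags the value for illustration."""
    return f"{data_element}({value})"


# Hypothetical mapping of column index to protection data element.
COLUMN_ELEMENTS = {0: "DE_NAME", 1: "DE_SSN", 2: "DE_EMAIL"}


def parse_csv(text):
    return [row for row in csv.reader(io.StringIO(text))]


def to_csv(rows):
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def traditional(text):
    """One extraction pass per protected column."""
    passes = 0
    for column, element in COLUMN_ELEMENTS.items():
        rows = parse_csv(text)  # the same payload is parsed again
        passes += 1
        for row in rows:
            row[column] = protect(row[column], element)
        text = to_csv(rows)  # re-packed before the next pass
    return text, passes


def tree_of_trees(text):
    """One extraction pass; each column is routed to its own child rule."""
    rows = parse_csv(text)
    for row in rows:
        for column, element in COLUMN_ELEMENTS.items():
            row[column] = protect(row[column], element)
    return to_csv(rows), 1


payload = "Ann,123,ann@example.com\nBob,456,bob@example.org\n"
old_text, old_passes = traditional(payload)
new_text, new_passes = tree_of_trees(payload)
```

Both functions produce the same protected CSV, but the first parses and re-serializes the payload once per column, while the second does so only once.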
In this release, the Ruleset ToT feature supports the following payloads:
Rulesets are executed by the Ruleset engine that is built into the gateway. The Ruleset engine is responsible for cascaded execution of the Ruleset. The behaviors of Rule objects range from data processing (extract and transform) to controlling the execution flow of the rule tree (exit), along with supplementary activities such as logging.
The Ruleset engine will recursively traverse the Ruleset node by node. For example, Extract nodes will extract data that will be transformed with a Transform rule node. Following this, the recursion stack is rolled up and the reverse process happens. Here, data is encoded and packaged back to its original format and sent to the intended recipient.
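The descend-then-unwind behavior described above can be sketched with two small node classes. This is a simplified model under stated assumptions, not the gateway's engine: the class names `Extract` and `Transform` mirror the rule types, but their fields (`split`, `join`, `action`) are invented for this example, and real Extract rules select parts of a payload rather than splitting all of it.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Transform:
    """Leaf node: manipulates the data passed into it."""
    action: Callable[[str], str]

    def run(self, data: str) -> str:
        return self.action(data)


@dataclass
class Extract:
    """Branch node: splits data into parts, runs children, then re-packs."""
    split: Callable[[str], List[str]]
    join: Callable[[List[str]], str]
    children: list = field(default_factory=list)

    def run(self, data: str) -> str:
        parts = self.split(data)       # descend: extract smaller pieces
        for child in self.children:    # sibling rules run top to bottom
            parts = [child.run(part) for part in parts]
        return self.join(parts)        # unwind: package the parts back


# Lines -> words -> transform, then words -> lines on the way back.
ruleset = Extract(
    split=str.splitlines,
    join="\n".join,
    children=[
        Extract(split=str.split, join=" ".join, children=[Transform(str.upper)])
    ],
)
```

Calling `ruleset.run("john smith\njane doe")` transforms each word and returns the text reassembled in its original line structure.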
The following example illustrates the Ruleset, the Ruleset structure, and the Ruleset execution. The example starts with an HTTP POST that carries an XML payload of a person’s information. The Ruleset is a hierarchy of three Extract nodes with the Transform rule as the end leaf node.
Extract Rule: The Extract Rule extracts the XML document from the message body.
Extract Rule: A second Extract Rule will take the XML document and parse the data that is to be transformed – the person’s name. This is done by using XPath.
Extract Rule: A third Extract Rule will split out the name into individual words – in this example, the first and the last name. This is done by using REGEX.
Transform Rule: The Transform Rule will take each word and apply an action. In this example, both the first name and the last name are protected.
The next set of steps performs the operations in reverse and prepares the contents to go back to the sender. The same Extraction rules perform the reverse processing as the recursion unwinds.
Extract Rule: On the return trip, an Extract Rule is used to combine the protected first and last name into a single string – Name.
Extract Rule: This rule will place the Name back into the XML document.
Extract Rule: The final Extract rule will place the XML document back into the message body to be sent back to the sender with the name protected.
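The walkthrough above can be approximated with the Python standard library. This is a sketch under assumptions, not the gateway's code: the XML element names (`person`, `name`, `age`) are hypothetical because the payload is not shown in this section, and `protect` simply reverses each word as a stand-in for a real protection action.

```python
import re
import xml.etree.ElementTree as ET


def protect(word: str) -> str:
    """Stand-in for a protect action (reverses the word); not a real tokenizer."""
    return word[::-1]


def transform_words(name: str) -> str:
    # Third Extract: a regular expression picks out each word. The substitution
    # also re-assembles the string as each match is replaced (the unwind step).
    return re.sub(r"\w+", lambda match: protect(match.group(0)), name)


def process_xml(document: str, xpath: str = "./name") -> str:
    # Second Extract: XPath selects the element that is to be transformed.
    root = ET.fromstring(document)
    for element in root.findall(xpath):
        element.text = transform_words(element.text or "")
    # Unwind: serialize the modified tree back into an XML document.
    return ET.tostring(root, encoding="unicode")


def process_request(body: bytes) -> bytes:
    # First Extract: take the XML document out of the HTTP message body.
    document = body.decode("utf-8")
    # Unwind: place the XML document back into the message body.
    return process_xml(document).encode("utf-8")


body = b"<person><name>John Smith</name><age>40</age></person>"
print(process_request(body).decode("utf-8"))
# prints <person><name>nhoJ htimS</name><age>40</age></person>
```

Each function plays both roles of one Extract rule: it narrows the data on the way down and re-packs it on the way back, which is why no separate set of rules is needed for the return trip.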