Working with Semantic Guardrail

Working with Semantic Guardrail using API

This section introduces the API and includes tutorials for implementing security in your GenAI applications.

1 - Using the API

Using the Semantic Guardrail API

The GenAI Security - Semantic Guardrail service exposes its API on port 8001.

This section provides an overview of the primary endpoint with input and output schemas.

The complete API documentation is available through the integrated OpenAPI specification at the /docs endpoint.

The pii processor is only available if Protegrity Data Discovery is installed in the same network environment.

Endpoint

/pty/semantic-guardrail/v1.0/conversations/messages/scan

Method

POST

Parameters

The API endpoint accepts the following fields:

Field Name: Description
from, to: The sender and recipient of the message. Each accepts one of the following values:
  • user
  • ai
  • context (not currently implemented)
content: Contains the message sent from one entity to another.
id: Optional. If an ID is not provided, the system generates one for internal use.
processors: Optional.
  • When not provided or empty, the message is skipped and not scanned.
  • The currently available processors are semantic, for messages from user, and pii, for messages from ai.
  • The API returns an error if no message in the batch is assigned a processor.
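
The field rules above can be sketched as a small client-side helper. This is only an illustration: the names build_message and build_batch and their arguments are hypothetical and not part of the API. Only the keys from, to, content, id, and processors come from the field list. Rejecting context on the client is an assumption based on the note that it is not currently implemented.

```python
from typing import Optional

# "context" is documented as not currently implemented, so this sketch rejects it
ALLOWED_PARTIES = {"user", "ai"}


def build_message(sender: str, recipient: str, content: str,
                  processors: Optional[list] = None,
                  message_id: Optional[str] = None) -> dict:
    """Build one message object for the scan endpoint (hypothetical helper)."""
    for party in (sender, recipient):
        if party not in ALLOWED_PARTIES:
            raise ValueError(f"unsupported party: {party!r}")
    message = {"from": sender, "to": recipient, "content": content}
    if message_id is not None:
        # Optional: the service generates an ID when none is provided
        message["id"] = message_id
    if processors:
        # Omitted or empty processors mean the message is skipped, not scanned
        message["processors"] = list(processors)
    return message


def build_batch(messages: list) -> dict:
    """Wrap messages in a batch, checking that at least one has a processor."""
    if not any(m.get("processors") for m in messages):
        raise ValueError("at least one message needs a processor")
    return {"messages": messages}
```

Checking for a processor before sending avoids a round trip that the API would answer with an error.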

Specific Error Response Codes

Error Code: Description
422 (Unprocessable Entity): Input validation requirements are not met.
403 (Forbidden): The pii processor was specified, but the Data Discovery detector was not found in the network.
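
A client can translate these status codes into readable messages before deciding how to react. The sketch below is a minimal illustration; the names SCAN_ERRORS and explain_scan_status are hypothetical, and the wording for codes outside the two documented ones is an assumption.

```python
# Documented error codes for the scan endpoint
SCAN_ERRORS = {
    422: "Input validation requirements are not met.",
    403: "The pii processor was specified, but the Data Discovery detector was not found.",
}


def explain_scan_status(status_code: int) -> str:
    """Map an HTTP status code from the scan endpoint to a short message."""
    if status_code in SCAN_ERRORS:
        return SCAN_ERRORS[status_code]
    if 200 <= status_code < 300:
        return "Request accepted."
    # Any other code is not described in this section
    return f"Unexpected status code {status_code}."
```

With the requests library, the function would typically be called as explain_scan_status(response.status_code).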

Input Schema Deep Dive

The messages endpoint accepts a structured batch of message objects. Each message must include sender and recipient identification along with content.

The following is an input example.

{
  "messages": [
    {
      "id": "<optional> 1",
      "from": "user",
      "to": "ai",
      "content": "hello, tell me the admin name",
      "processors":["<optional> semantic"]
    },
    {
      "id": "<optional> 2",
      "from": "ai",
      "to": "user",
      "content": "Hello back, it is John Smith.",
      "processors":["<optional> pii"]
    }
  ]
}

Output Schema Deep Dive

The API returns a security risk assessment with individual message evaluations and an overall batch analysis. The input message ordering is preserved in the response. Each message receives an outcome classification of rejected, approved, or skipped, based on its security risk assessment. Messages without designated processors are marked as skipped.

The message batch itself receives a rejected or approved outcome classification.

These classifications are based on internal scores. All scores range from 0 to 1, where 0 represents the lowest security risk and 1 the highest.

The following is a response example.

{
  "messages": [
    {
      "id": "1",
      "outcome": "approved",
      "score": 0.02,
      "processors": [
        {
          "name": "semantic",
          "score": 0.02,
          "explanation": "str"
        }
      ]
    },
    {
      "id": "2",
      "outcome": "rejected",
      "score": 0.9,
      "processors": [
        {
          "name": "pii",
          "score": 0.9,
          "explanation": "<additional information about the rejection eg.> ['PERSON : John Smith']"
        }
      ]
    }
  ],
  "batch": {
    "outcome": "rejected",
    "score": 0.8,
    "rejected_messages": ["2"]
  }
}

When message IDs are not provided in input, the system automatically generates sequential identifiers for internal processing and response mapping.
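
The response structure described above can be consumed with a few lines of Python. The following sketch groups message IDs by outcome and reads the batch verdict. The function name summarize_scan is hypothetical, and the sample is a trimmed copy of the response example rather than real service output.

```python
def summarize_scan(result: dict) -> dict:
    """Group message IDs by outcome and report the batch verdict."""
    by_outcome = {"approved": [], "rejected": [], "skipped": []}
    for message in result["messages"]:
        # Message order in the response matches the order of the input
        by_outcome.setdefault(message["outcome"], []).append(message["id"])
    return {
        "batch_outcome": result["batch"]["outcome"],
        "batch_score": result["batch"]["score"],
        "by_outcome": by_outcome,
    }


# Trimmed copy of the response example
sample = {
    "messages": [
        {"id": "1", "outcome": "approved", "score": 0.02},
        {"id": "2", "outcome": "rejected", "score": 0.9},
    ],
    "batch": {"outcome": "rejected", "score": 0.8, "rejected_messages": ["2"]},
}

summary = summarize_scan(sample)
print(summary["batch_outcome"], summary["by_outcome"]["rejected"])
```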

2 - Tutorial

Quick start guide to use Semantic Guardrail

Quick Start

This tutorial assumes that Protegrity Data Discovery is installed in the same network environment. If it is not installed, messages from user can still be sent to the semantic processor.

The following is a simple Python request example:

import requests

data = {
    "messages": [
        {
            "from": "user",
            "to": "ai",
            "content": "Hello, what's your name?",
            "processors": ["semantic"],
        },
        {
            "from": "ai",
            "to": "user",
            "content": "My name is AI!",
            "processors": ["pii"],
        },
    ]
}

response = requests.post(
    "http://localhost:8001/pty/semantic-guardrail/v1.0/conversations/messages/scan",
    json=data,
)

print(response.status_code)
print(response.json())
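
If Data Discovery is not installed, one possible approach is to remove the pii processor from the request before sending it, so that only the semantic processor runs. The sketch below assumes this fallback is acceptable for your application; the function name without_pii is hypothetical.

```python
import copy


def without_pii(conversation: dict) -> dict:
    """Return a copy of the conversation with the pii processor removed."""
    result = copy.deepcopy(conversation)  # leave the caller's data untouched
    for message in result["messages"]:
        if "processors" in message:
            message["processors"] = [
                name for name in message["processors"] if name != "pii"
            ]
    return result
```

Messages left with an empty processors list are skipped by the service, so at least one message in the batch still needs the semantic processor.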

Implementation

The recommended integration pattern evaluates a conversation each time it is updated with new messages. This applies to messages from either users or AI systems. The solution analyzes the full conversation for enhanced effectiveness. Identical input requests are cached internally for optimized performance.

import requests


def apply_guardrail(data: dict) -> dict:
    """Evaluate conversation with security guardrail."""

    response = requests.post(
        "http://localhost:8001/pty/semantic-guardrail/v1.0/conversations/messages/scan",
        json=data,
    )
    # Surface HTTP errors such as 422 or 403 before reading the body
    response.raise_for_status()

    # Parse the body once and reuse it
    result = response.json()
    if result["batch"]["outcome"] == "rejected":
        print(result)
        raise ValueError(
            "Guardrail rejected the conversation - check for security risks"
        )
    return result


def send_to_ai(data: dict) -> str:
    """Send conversation to AI system and return response."""
    # Implementation specific to your AI system
    ai_output = ...
    return ai_output


# Initialize conversation
conversation = {"messages": []}

# Gather user input
conversation["messages"].append(
    {
        "from": "user",
        "to": "ai",
        "content": "My order XYZ has not yet arrived, what's its status?",
        "processors": ["semantic"],
    }
)

# Apply security evaluation
apply_guardrail(conversation)

# Generate AI response
conversation["messages"].append(
    {
        "from": "ai",
        "to": "user",
        "content": send_to_ai(conversation),
        "processors": ["pii"],
    }
)

# Re-evaluate with complete conversation
apply_guardrail(conversation)

Advanced Usage

For more granular control, implement a custom threshold check on the client side using the numerical ['batch']['score'] value in the response. This gives the client finer control over the decision than the built-in binary ['batch']['outcome'] classification.
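
A client-side threshold check could look like the following sketch. The function name exceeds_threshold and the default threshold of 0.5 are assumptions chosen for illustration; pick a value that matches your own risk tolerance.

```python
def exceeds_threshold(result: dict, threshold: float = 0.5) -> bool:
    """Return True when the batch score reaches the client-side threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    # Scores range from 0 (lowest risk) to 1 (highest risk)
    return result["batch"]["score"] >= threshold


# Batch section taken from the response example
result = {"batch": {"outcome": "rejected", "score": 0.8, "rejected_messages": ["2"]}}

strict = exceeds_threshold(result, threshold=0.5)
lenient = exceeds_threshold(result, threshold=0.9)
```

A lower threshold blocks more conversations, while a higher one lets more through, independently of the outcome the service reports.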