Data Discovery is currently in Private Preview and is not yet generally available (GA). Do not use it in production environments, because features and functionality may change before the final GA release.

Classify Tabular API

Classifies structured tabular data.

Method

POST

URL

http://<Host_address>/pty/data-discovery/v2/classify/tabular

Query Parameters

score_threshold

  • Type: float
  • Description: Optional. Exclude results with a score lower than this threshold.
  • Values: Minimum 0, Maximum 1.0
  • Default: 0.7

has_headers

  • Type: boolean
  • Description: Optional. Indicates whether the first row represents the column header.
  • Values: true/false
  • Default: true

column_delimiter

  • Type: char
  • Description: Optional. Delimiter to separate the columns.
  • Default: ,

quote_char

  • Type: char
  • Description: Optional. Character used to quote fields that contain special characters, such as the column_delimiter or new-line characters.
  • Default: "
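
As a rough illustration of how these query parameters combine, the sketch below builds a request URL for a semicolon-delimited file that has no header row. Only the endpoint path and the parameter names come from this page; the helper name build_classify_url and the host discovery.example.com are hypothetical.

```python
from urllib.parse import urlencode

# Endpoint path as documented on this page.
CLASSIFY_TABULAR_PATH = "/pty/data-discovery/v2/classify/tabular"


def build_classify_url(host, score_threshold=None, has_headers=None,
                       column_delimiter=None, quote_char=None):
    """Return the endpoint URL carrying only the parameters that were set."""
    if score_threshold is not None and not 0 <= score_threshold <= 1.0:
        raise ValueError("score_threshold must be between 0 and 1.0")
    params = {}
    if score_threshold is not None:
        params["score_threshold"] = score_threshold
    if has_headers is not None:
        # Send booleans as lowercase true/false, matching the documented values.
        params["has_headers"] = "true" if has_headers else "false"
    if column_delimiter is not None:
        params["column_delimiter"] = column_delimiter
    if quote_char is not None:
        params["quote_char"] = quote_char
    query = urlencode(params)  # percent-encodes characters such as ; and '
    base = f"http://{host}{CLASSIFY_TABULAR_PATH}"
    return f"{base}?{query}" if query else base


# Hypothetical host; semicolon-delimited input, single-quoted fields, no header row.
print(build_classify_url("discovery.example.com", score_threshold=0.5,
                         has_headers=False, column_delimiter=";", quote_char="'"))
```

Parameters that are left unset are omitted from the URL, so the service applies its own defaults.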

Body

  • Content type must be text/csv, and the body must be UTF-8 encoded.

  • Body size is limited to 10K bytes.
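
Because the body is capped at 10K bytes, a larger file has to be sent as several requests. The sketch below shows one possible way to split it. It assumes the limit means 10,000 bytes of UTF-8 (this page does not say whether 10K is 10,000 or 10,240, so the lower figure is used) and that every line of the input is one complete record, with no quoted fields containing line breaks. The function name split_csv_for_upload is hypothetical.

```python
MAX_BODY_BYTES = 10_000  # assumption: conservative reading of "10K bytes"


def split_csv_for_upload(csv_text, max_bytes=MAX_BODY_BYTES):
    """Split CSV text into chunks that each fit in max_bytes once UTF-8 encoded.

    The header row is repeated at the top of every chunk. Input with no
    data rows yields an empty list.
    """
    lines = csv_text.splitlines()
    if not lines:
        return []
    header, rows = lines[0] + "\n", lines[1:]
    header_size = len(header.encode("utf-8"))
    chunks, current, size = [], [], header_size
    for row in rows:
        row_line = row + "\n"
        row_size = len(row_line.encode("utf-8"))
        if header_size + row_size > max_bytes:
            raise ValueError("a single row does not fit within the body limit")
        if size + row_size > max_bytes and current:
            # Close the current chunk and start a new one with a fresh header.
            chunks.append(header + "".join(current))
            current, size = [], header_size
        current.append(row_line)
        size += row_size
    if current:
        chunks.append(header + "".join(current))
    return chunks


# Synthetic data, large enough to need more than one request.
sample = "id,name\n" + "".join(f"{i},user{i}\n" for i in range(2000))
chunks = split_csv_for_upload(sample)
print(len(chunks), max(len(c.encode("utf-8")) for c in chunks))
```

Each chunk is classified independently, so scores and rows_processed values would apply per chunk rather than to the whole file.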

Sample Request

cURL

curl -X POST "http://<Host_address>/pty/data-discovery/v2/classify/tabular?score_threshold=0.85" \
     --header 'Content-Type: text/csv' \
     --data-raw 'Social Security Number,Credit Card Number,IBAN,Phone Number
589-25-1068,349384370543801,FR43 9255 4858 47BG 3EBG U4OK O18,(483) 9440301
636-36-3077,4041594844904,AL50 8947 4215 KAEY GAPM NLYC FNZG,(113) 5143119
748-82-2375,3558175715821800,AT34 4082 9269 0841 5702,(763) 5136237
516-62-9861,560221027976015000,FR22 0068 7181 11FB UG8H ECEM 306,(726) 6031636
121-49-9409,374283320982549,DK37 5687 8459 8060 79,(624) 9205200
838-73-3299,5558216060144900,CR54 8952 8144 6403 4765 0,(356) 9479541
439-11-5310,5048376143641900,RS76 6213 4824 0184 8983 74,(544) 5623326
564-06-8466,3543299511845640,EE51 6882 3443 7863 4703,(702) 6093849
518-54-5443,3543019452249540,IT65 D000 3874 2801 Z15I LNLL OOX,(584) 8618371'

Python

import requests

url = "http://<Host_address>/pty/data-discovery/v2/classify/tabular"
params = {"score_threshold": 0.85}
headers = {"Content-Type": "text/csv"}

# CSV payload: a header row followed by nine data rows.
data = """Social Security Number,Credit Card Number,IBAN,Phone Number
589-25-1068,349384370543801,FR43 9255 4858 47BG 3EBG U4OK O18,(483) 9440301
636-36-3077,4041594844904,AL50 8947 4215 KAEY GAPM NLYC FNZG,(113) 5143119
748-82-2375,3558175715821800,AT34 4082 9269 0841 5702,(763) 5136237
516-62-9861,560221027976015000,FR22 0068 7181 11FB UG8H ECEM 306,(726) 6031636
121-49-9409,374283320982549,DK37 5687 8459 8060 79,(624) 9205200
838-73-3299,5558216060144900,CR54 8952 8144 6403 4765 0,(356) 9479541
439-11-5310,5048376143641900,RS76 6213 4824 0184 8983 74,(544) 5623326
564-06-8466,3543299511845640,EE51 6882 3443 7863 4703,(702) 6093849
518-54-5443,3543019452249540,IT65 D000 3874 2801 Z15I LNLL OOX,(584) 8618371
"""

# Encode explicitly so that the body is sent as UTF-8, as the API requires.
response = requests.post(url, params=params, headers=headers,
                         data=data.encode("utf-8"), verify=False)

print("Status code:", response.status_code)
try:
    print("Response JSON:", response.json())
except ValueError:
    # The body was not valid JSON, so print it as plain text.
    print("Response Text:", response.text)


URL: POST http://<Host_address>/pty/data-discovery/v2/classify/tabular

Query Parameters:
  • score_threshold (optional): Float between 0.0 and 1.0. Default: 0.7.
  • has_headers (optional): Indicates whether the first row represents the column header.
  • column_delimiter (optional): Delimiter to separate the columns.
  • quote_char (optional): Character used to quote fields that contain special characters, such as the column_delimiter or new-line characters.

Headers:
  • Content-Type: text/csv

Body:
Social Security Number,Credit Card Number,IBAN,Phone Number
589-25-1068,349384370543801,FR43 9255 4858 47BG 3EBG U4OK O18,(483) 9440301
636-36-3077,4041594844904,AL50 8947 4215 KAEY GAPM NLYC FNZG,(113) 5143119
748-82-2375,3558175715821800,AT34 4082 9269 0841 5702,(763) 5136237
516-62-9861,560221027976015000,FR22 0068 7181 11FB UG8H ECEM 306,(726) 6031636
121-49-9409,374283320982549,DK37 5687 8459 8060 79,(624) 9205200
838-73-3299,5558216060144900,CR54 8952 8144 6403 4765 0,(356) 9479541
439-11-5310,5048376143641900,RS76 6213 4824 0184 8983 74,(544) 5623326
564-06-8466,3543299511845640,EE51 6882 3443 7863 4703,(702) 6093849
518-54-5443,3543019452249540,IT65 D000 3874 2801 Z15I LNLL OOX,(584) 8618371

Sample Response

{
    "providers": [
        {
            "name": "Pattern Classification Provider",
            "version": "...",
            "status": 200,
            "elapsed_time": 0.31273603439331055,
            "config_provider": {
                "name": "Pattern",
                "address": "http://pattern_provider_service:8051",
                "supported_content_types": []
            }
        },
        {
            "name": "Context Classification Provider",
            "version": "...",
            "status": 200,
            "elapsed_time": 1.1383004188537598,
            "config_provider": {
                "name": "Context",
                "address": "http://context_provider_service:8052",
                "supported_content_types": []
            }
        }
    ],
    "classifications": {
        "SOCIAL_SECURITY_ID": [
            {
                "score": 0.9994888835483127,
                "rows_processed": 9,
                "location": {
                    "column_name": "Social Security Number",
                    "column_index": 0
                },
                "classifiers": [
                    {
                        "provider_index": 1,
                        "name": "context",
                        "rows_with_classification": 9,
                        "total_classifications": 9,
                        "score": 0.9994888835483127,
                        "details": {}
                    }
                ]
            }
        ],
        "CREDIT_CARD": [
            {
                "score": 0.9986333317226834,
                "rows_processed": 9,
                "location": {
                    "column_name": "Credit Card Number",
                    "column_index": 1
                },
                "classifiers": [
                    {
                        "provider_index": 1,
                        "name": "context",
                        "rows_with_classification": 9,
                        "total_classifications": 9,
                        "score": 0.9986333317226834,
                        "details": {}
                    }
                ]
            }
        ],
        "BANK_ACCOUNT": [
            {
                "score": 0.7901234567901234,
                "rows_processed": 9,
                "location": {
                    "column_name": "IBAN",
                    "column_index": 2
                },
                "classifiers": [
                    {
                        "provider_index": 0,
                        "name": "IbanRecognizer",
                        "rows_with_classification": 8,
                        "total_classifications": 8,
                        "score": 0.8888888888888888,
                        "details": {}
                    }
                ]
            }
        ],
        "PHONE_NUMBER": [
            {
                "score": 0.9961333341068692,
                "rows_processed": 9,
                "location": {
                    "column_name": "Phone Number",
                    "column_index": 3
                },
                "classifiers": [
                    {
                        "provider_index": 1,
                        "name": "context",
                        "rows_with_classification": 9,
                        "total_classifications": 9,
                        "score": 0.9961333341068692,
                        "details": {}
                    }
                ]
            }
        ]
    }
}
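
To show how a response like the one above might be consumed, here is a minimal sketch that flattens the classifications dictionary into one record per detected column. It relies only on fields visible in the sample response. The function name summarize_classifications is hypothetical, and the sample dictionary is a trimmed copy of the response above.

```python
def summarize_classifications(response_json):
    """Return (column_index, column_name, entity, score) tuples ordered by column."""
    summary = []
    for entity, occurrences in response_json.get("classifications", {}).items():
        for occurrence in occurrences:
            location = occurrence["location"]
            summary.append((location["column_index"], location["column_name"],
                            entity, occurrence["score"]))
    # Sort on the column index only, so ties never compare the other fields.
    return sorted(summary, key=lambda item: item[0])


# Trimmed copy of the sample response (two of the four entities).
sample = {
    "classifications": {
        "PHONE_NUMBER": [{
            "score": 0.9961333341068692,
            "rows_processed": 9,
            "location": {"column_name": "Phone Number", "column_index": 3},
            "classifiers": [],
        }],
        "SOCIAL_SECURITY_ID": [{
            "score": 0.9994888835483127,
            "rows_processed": 9,
            "location": {"column_name": "Social Security Number", "column_index": 0},
            "classifiers": [],
        }],
    }
}

for index, name, entity, score in summarize_classifications(sample):
    print(f"column {index} ({name}): {entity}, score {score:.4f}")
```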

Response Fields Description

Providers Section

| Name | Example Response | Description |
| --- | --- | --- |
| providers | Array | Array of provider objects that participated in the request, including their respective success or failure codes. |
| providers[n].name | Pattern Classification Provider | Product name of the provider. |
| providers[n].version | 2.0.0 | Version of the provider. |
| providers[n].status | 200 | HTTP response code returned by the provider. |
| providers[n].elapsed_time | 0.028 | Time, in seconds, taken by the provider to process the request. |
| providers[n].config_provider | Object | Object containing configuration details for each provider. |
| providers[n].config_provider.name | Pattern | Internal name of the provider. |
| providers[n].config_provider.address | http://pattern_provider_service:8051 | Network address or endpoint of the provider. |
| providers[n].config_provider.supported_content_types | [] | Array of supported content types. An empty array indicates support for all content types. |

Classifications Section

| Name | Example Response | Description |
| --- | --- | --- |
| classifications | Dictionary | A dictionary mapping entity types (e.g., "SOCIAL_SECURITY_ID", "CREDIT_CARD") to arrays of occurrence objects. Each key is an entity type, and its value is a list of detected occurrences, each containing location, classifier, and row details. |
| classifications['entity'][n].score | 0.9995 | The confidence score for the detected entity, aggregated and calculated from all contributing classifiers and their reported scores. |
| classifications['entity'][n].rows_processed | 9 | The number of rows passed to and processed by the classification request. |
| classifications['entity'][n].location | Object | An object specifying the location of the entity within the tabular data. |
| classifications['entity'][n].location.column_name | Social Security Number | The name of the column in which the entity was detected. |
| classifications['entity'][n].location.column_index | 0 | The index of the column in which the entity was detected. |
| classifications['entity'][n].classifiers | Array | An array of classifier objects that contributed to the entity detection. |
| classifications['entity'][n].classifiers[m].provider_index | 1 | The index of the provider in the top-level providers array. |
| classifications['entity'][n].classifiers[m].name | context | The name of the classifier. A provider may have multiple classifiers. |
| classifications['entity'][n].classifiers[m].score | 0.9995 | The score assigned by the classifier for the entity detection. |
| classifications['entity'][n].classifiers[m].rows_with_classification | 9 | The number of rows in which the entity was classified by this classifier. |
| classifications['entity'][n].classifiers[m].total_classifications | 9 | The total number of classifications made by this classifier in this location. A single column can contain multiple entities, for example a date and time or a complex address. |
| classifications['entity'][n].classifiers[m].details | Object | Optional. Additional key-value details provided by the classifier. |
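
The provider_index field ties each classifier back to an entry in the top-level providers array. A minimal sketch of that lookup follows, using a trimmed copy of the sample response. The function name classifiers_by_provider is hypothetical.

```python
def classifiers_by_provider(response_json):
    """Group (entity, classifier name, classifier score) tuples by provider name."""
    providers = response_json.get("providers", [])
    grouped = {}
    for entity, occurrences in response_json.get("classifications", {}).items():
        for occurrence in occurrences:
            for classifier in occurrence.get("classifiers", []):
                # provider_index points into the top-level providers array.
                provider = providers[classifier["provider_index"]]
                grouped.setdefault(provider["name"], []).append(
                    (entity, classifier["name"], classifier["score"]))
    return grouped


# Trimmed copy of the sample response: two providers, two entities.
sample = {
    "providers": [
        {"name": "Pattern Classification Provider", "status": 200},
        {"name": "Context Classification Provider", "status": 200},
    ],
    "classifications": {
        "BANK_ACCOUNT": [{
            "score": 0.7901234567901234,
            "classifiers": [{"provider_index": 0, "name": "IbanRecognizer",
                             "score": 0.8888888888888888}],
        }],
        "PHONE_NUMBER": [{
            "score": 0.9961333341068692,
            "classifiers": [{"provider_index": 1, "name": "context",
                             "score": 0.9961333341068692}],
        }],
    },
}

for provider_name, hits in classifiers_by_provider(sample).items():
    print(provider_name, hits)
```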

Response Codes

| Response Code | Description |
| --- | --- |
| 200 | Successful Response. |
| 206 | Partial Content. Only some providers classified data successfully. |
| 400 | Bad Request. Invalid input parameters or content. |
| 413 | Payload too large. |
| 415 | Unsupported media type. |
| 422 | Untrusted input. For more information, refer to Input Validation. |
| 502 | Bad Gateway. All upstream providers failed; no successful data aggregation possible. |
| 598 | Unexpected internal server error. Check server logs. |
| 599 | Internal server error. Check server logs. |
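
One way a client might act on these codes is sketched below. The grouping into complete, partial, client-error, and server-error outcomes is an interpretation of the list above, not documented client behavior, and the names ClassifyOutcome, interpret_status, and has_usable_results are hypothetical.

```python
from enum import Enum


class ClassifyOutcome(Enum):
    COMPLETE = "complete"          # 200: successful response
    PARTIAL = "partial"            # 206: only some providers succeeded
    CLIENT_ERROR = "client_error"  # 400, 413, 415, 422: the request needs fixing
    SERVER_ERROR = "server_error"  # 502, 598, 599: upstream or internal failure


# Status codes listed on this page, mapped to a coarse outcome.
_OUTCOMES = {
    200: ClassifyOutcome.COMPLETE,
    206: ClassifyOutcome.PARTIAL,
    400: ClassifyOutcome.CLIENT_ERROR,
    413: ClassifyOutcome.CLIENT_ERROR,
    415: ClassifyOutcome.CLIENT_ERROR,
    422: ClassifyOutcome.CLIENT_ERROR,
    502: ClassifyOutcome.SERVER_ERROR,
    598: ClassifyOutcome.SERVER_ERROR,
    599: ClassifyOutcome.SERVER_ERROR,
}


def interpret_status(status_code):
    """Map a documented status code to an outcome; reject anything else."""
    try:
        return _OUTCOMES[status_code]
    except KeyError:
        raise ValueError(f"undocumented status code: {status_code}") from None


def has_usable_results(status_code):
    """True when the response body is expected to contain classifications."""
    return interpret_status(status_code) in (ClassifyOutcome.COMPLETE,
                                             ClassifyOutcome.PARTIAL)


print(interpret_status(206).value, has_usable_results(206))
```

Treating 206 as usable means the caller should still inspect each providers[n].status value to see which provider failed.
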
Last modified : March 11, 2026