Method
POST
URL
http://{Host Address}/pty/data-discovery/v2/classify/tabular
Query Parameters
score_threshold
- Type: float
- Description: Optional. Exclude results with a score lower than this threshold.
- Values: Minimum 0, Maximum 1.0
- Default: 0.7
has_headers
- Type: boolean
- Description: Optional. Indicates whether the first row represents the column header.
- Values: true/false
- Default: true
column_delimiter
- Type: char
- Description: Optional. Delimiter to separate the columns.
- Default: ,
quote_char
- Type: char
- Description: Optional. Character to quote fields containing special characters, such as the column_delimiter or new-line characters.
- Default: "
Body
Content type should be text/csv and in UTF-8 format. Body size is limited to 10K Bytes.
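
The constraints above (parameter ranges, UTF-8 encoding, and the body size limit) can be checked on the client before a request is sent. The following is a minimal sketch, not part of the product: the helper name `prepare_request` and the constant `MAX_BODY_BYTES` are hypothetical, and reading "10K Bytes" as 10,000 bytes is an assumption (the server may count 10,240).

```python
MAX_BODY_BYTES = 10_000  # Assumption: "10K Bytes" is read as 10,000 bytes.


def prepare_request(csv_text, score_threshold=0.7, has_headers=True,
                    column_delimiter=",", quote_char='"'):
    """Validate the documented inputs and return (params, headers, body)."""
    if not 0.0 <= score_threshold <= 1.0:
        raise ValueError("score_threshold must be between 0 and 1.0")
    for name, value in (("column_delimiter", column_delimiter),
                        ("quote_char", quote_char)):
        if len(value) != 1:
            raise ValueError(f"{name} must be a single character")

    # The body must be UTF-8; measure the encoded size, not the character count.
    body = csv_text.encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError(f"body is {len(body)} bytes; limit is {MAX_BODY_BYTES}")

    params = {
        "score_threshold": score_threshold,
        # Sent as lowercase strings to match the documented true/false values.
        "has_headers": "true" if has_headers else "false",
        "column_delimiter": column_delimiter,
        "quote_char": quote_char,
    }
    headers = {"Content-Type": "text/csv"}
    return params, headers, body
```

The returned values can be passed directly to `requests.post(url, params=params, headers=headers, data=body)`.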
Sample Request
curl -X POST "http://<Host_address>/pty/data-discovery/v2/classify/tabular?score_threshold=0.85" \
--header 'Content-Type: text/csv' \
--data-raw 'Social Security Number,Credit Card Number,IBAN,Phone Number
589-25-1068,349384370543801,FR43 9255 4858 47BG 3EBG U4OK O18,(483) 9440301
636-36-3077,4041594844904,AL50 8947 4215 KAEY GAPM NLYC FNZG,(113) 5143119
748-82-2375,3558175715821800,AT34 4082 9269 0841 5702,(763) 5136237
516-62-9861,560221027976015000,FR22 0068 7181 11FB UG8H ECEM 306,(726) 6031636
121-49-9409,374283320982549,DK37 5687 8459 8060 79,(624) 9205200
838-73-3299,5558216060144900,CR54 8952 8144 6403 4765 0,(356) 9479541
439-11-5310,5048376143641900,RS76 6213 4824 0184 8983 74,(544) 5623326
564-06-8466,3543299511845640,EE51 6882 3443 7863 4703,(702) 6093849
518-54-5443,3543019452249540,IT65 D000 3874 2801 Z15I LNLL OOX,(584) 8618371'

import requests
url = "http://<Host_address>/pty/data-discovery/v2/classify/tabular"
params = {"score_threshold": 0.85}
headers = {"Content-Type": "text/csv"}
data = """Social Security Number,Credit Card Number,IBAN,Phone Number
589-25-1068,349384370543801,FR43 9255 4858 47BG 3EBG U4OK O18,(483) 9440301
636-36-3077,4041594844904,AL50 8947 4215 KAEY GAPM NLYC FNZG,(113) 5143119
748-82-2375,3558175715821800,AT34 4082 9269 0841 5702,(763) 5136237
516-62-9861,560221027976015000,FR22 0068 7181 11FB UG8H ECEM 306,(726) 6031636
121-49-9409,374283320982549,DK37 5687 8459 8060 79,(624) 9205200
838-73-3299,5558216060144900,CR54 8952 8144 6403 4765 0,(356) 9479541
439-11-5310,5048376143641900,RS76 6213 4824 0184 8983 74,(544) 5623326
564-06-8466,3543299511845640,EE51 6882 3443 7863 4703,(702) 6093849
518-54-5443,3543019452249540,IT65 D000 3874 2801 Z15I LNLL OOX,(584) 8618371
"""
response = requests.post(url, params=params, headers=headers, data=data, verify=False)
print("Status code:", response.status_code)
try:
    print("Response JSON:", response.json())
except ValueError:
    print("Response Text:", response.text)

URL: POST `http://<Host_address>/pty/data-discovery/v2/classify/tabular`
Query Parameters:
- score_threshold (optional): Float between 0.0 and 1.0. Default: 0.7.
- has_headers (optional): Indicates whether the first row represents the column header.
- column_delimiter (optional): Delimiter to separate the columns.
- quote_char (optional): Character to quote fields containing special characters, such as the column_delimiter or new-line characters.
Headers:
- Content-Type: text/csv
Body:
Social Security Number,Credit Card Number,IBAN,Phone Number
589-25-1068,349384370543801,FR43 9255 4858 47BG 3EBG U4OK O18,(483) 9440301
636-36-3077,4041594844904,AL50 8947 4215 KAEY GAPM NLYC FNZG,(113) 5143119
748-82-2375,3558175715821800,AT34 4082 9269 0841 5702,(763) 5136237
516-62-9861,560221027976015000,FR22 0068 7181 11FB UG8H ECEM 306,(726) 6031636
121-49-9409,374283320982549,DK37 5687 8459 8060 79,(624) 9205200
838-73-3299,5558216060144900,CR54 8952 8144 6403 4765 0,(356) 9479541
439-11-5310,5048376143641900,RS76 6213 4824 0184 8983 74,(544) 5623326
564-06-8466,3543299511845640,EE51 6882 3443 7863 4703,(702) 6093849
518-54-5443,3543019452249540,IT65 D000 3874 2801 Z15I LNLL OOX,(584) 8618371
Sample Response
{
"providers": [
{
"name": "Pattern Classification Provider",
"version": "...",
"status": 200,
"elapsed_time": 0.31273603439331055,
"config_provider": {
"name": "Pattern",
"address": "http://pattern_provider_service:8051",
"supported_content_types": []
}
},
{
"name": "Context Classification Provider",
"version": "...",
"status": 200,
"elapsed_time": 1.1383004188537598,
"config_provider": {
"name": "Context",
"address": "http://context_provider_service:8052",
"supported_content_types": []
}
}
],
"classifications": {
"SOCIAL_SECURITY_ID": [
{
"score": 0.9994888835483127,
"rows_processed": 9,
"location": {
"column_name": "Social Security Number",
"column_index": 0
},
"classifiers": [
{
"provider_index": 1,
"name": "context",
"rows_with_classification": 9,
"total_classifications": 9,
"score": 0.9994888835483127,
"details": {}
}
]
}
],
"CREDIT_CARD": [
{
"score": 0.9986333317226834,
"rows_processed": 9,
"location": {
"column_name": "Credit Card Number",
"column_index": 1
},
"classifiers": [
{
"provider_index": 1,
"name": "context",
"rows_with_classification": 9,
"total_classifications": 9,
"score": 0.9986333317226834,
"details": {}
}
]
}
],
"BANK_ACCOUNT": [
{
"score": 0.7901234567901234,
"rows_processed": 9,
"location": {
"column_name": "IBAN",
"column_index": 2
},
"classifiers": [
{
"provider_index": 0,
"name": "IbanRecognizer",
"rows_with_classification": 8,
"total_classifications": 8,
"score": 0.8888888888888888,
"details": {}
}
]
}
],
"PHONE_NUMBER": [
{
"score": 0.9961333341068692,
"rows_processed": 9,
"location": {
"column_name": "Phone Number",
"column_index": 3
},
"classifiers": [
{
"provider_index": 1,
"name": "context",
"rows_with_classification": 9,
"total_classifications": 9,
"score": 0.9961333341068692,
"details": {}
}
]
}
]
}
}

Response Fields Description
Providers Section
| Name | Example Response | Description |
|---|---|---|
| providers | Array | Array of provider objects that participated in the request, including their respective success or failure codes. |
| providers[n].name | Pattern Classification Provider | Product name of the provider. |
| providers[n].version | 2.0.0 | Version of the provider. |
| providers[n].status | 200 | HTTP response code returned by the provider. |
| providers[n].elapsed_time | 0.028 | Time, in seconds, taken by the provider to process the request. |
| providers[n].config_provider | Object | Object containing configuration details for each provider. |
| providers[n].config_provider.name | Pattern | Internal name of the provider. |
| providers[n].config_provider.address | http://pattern_provider_service:8051 | Network address or endpoint of the provider. |
| providers[n].config_provider.supported_content_types | [] | Array of supported content types. An empty array indicates support for all content types. |
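
Because each provider reports its own status, a client may want to check the providers array before trusting the classifications. The sketch below is an illustration only: the helper name `failed_providers` is hypothetical, and treating any status outside the 2xx range as a failure is an assumption rather than documented behavior.

```python
def failed_providers(response_json):
    """Return (name, status) pairs for providers that did not report a 2xx status."""
    failures = []
    for provider in response_json.get("providers", []):
        status = provider.get("status")
        # Assumption: a missing or non-2xx status means the provider failed.
        if not isinstance(status, int) or not 200 <= status < 300:
            failures.append((provider.get("name"), status))
    return failures
```

An empty list means every listed provider reported success. A non-empty list would be expected alongside a partial (206) response.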
Classifications Section
| Name | Example Response | Description |
|---|---|---|
| classifications | Dictionary | A dictionary mapping entity types (e.g., “SOCIAL_SECURITY_ID”, “CREDIT_CARD”) to arrays of occurrence objects. Each key is an entity type, and its value is a list of detected occurrences, each containing location, classifier, and row details. |
| classifications['entity'][n].score | 0.9995 | The confidence score for the detected entity, aggregated and calculated from all contributing classifiers and their reported scores. |
| classifications['entity'][n].rows_processed | 9 | The number of rows passed to and processed by the classification request. |
| classifications['entity'][n].location | Object | An object specifying the location of the entity within the tabular data. |
| classifications['entity'][n].location.column_name | Social Security Number | The name of the column in which the entity was detected. |
| classifications['entity'][n].location.column_index | 0 | The index of the column in which the entity was detected. |
| classifications['entity'][n].classifiers | Array | An array of classifier objects that contributed to the entity detection. |
| classifications['entity'][n].classifiers[m].provider_index | 1 | The index of the provider in the top-level providers array. |
| classifications['entity'][n].classifiers[m].name | context | The name of the classifier. A provider may have multiple classifiers. |
| classifications['entity'][n].classifiers[m].score | 0.9995 | The score assigned by the classifier for the entity detection. |
| classifications['entity'][n].classifiers[m].rows_with_classification | 9 | The number of rows in which the entity was classified by this classifier. |
| classifications['entity'][n].classifiers[m].total_classifications | 9 | The total number of classifications made by this classifier in this location. It is possible to find multiple entities within a single column, e.g., date and time or a complex address. |
| classifications['entity'][n].classifiers[m].details | Object | Optional. Additional key-value details provided by the classifier. |
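
To show how the fields above fit together, the following sketch flattens the classifications dictionary into one record per detected column and resolves each classifier's provider_index against the top-level providers array. The function name `summarize_classifications` and the shape of the returned records are hypothetical, chosen only for illustration.

```python
def summarize_classifications(response_json):
    """Return one record per (entity, column) occurrence, ordered by column index."""
    providers = response_json.get("providers", [])
    records = []
    for entity, occurrences in response_json.get("classifications", {}).items():
        for occurrence in occurrences:
            # Resolve provider_index into the provider's product name.
            classifiers = [
                f'{providers[c["provider_index"]]["name"]}/{c["name"]}'
                for c in occurrence["classifiers"]
            ]
            records.append({
                "column_index": occurrence["location"]["column_index"],
                "column_name": occurrence["location"]["column_name"],
                "entity": entity,
                "score": occurrence["score"],
                "classifiers": classifiers,
            })
    return sorted(records, key=lambda record: record["column_index"])
```

Applied to the sample response, this would yield four records, one for each of the columns at indexes 0 through 3.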
Response Codes
| Response Code | Description |
|---|---|
| 200 | Successful Response. |
| 206 | Partial Content. Only some providers classified data successfully. |
| 400 | Bad Request. Invalid input parameters or content. |
| 413 | Payload too large. |
| 415 | Unsupported media type. |
| 422 | Untrusted input. For more information, refer to Input Validation. |
| 502 | Bad Gateway. All upstream providers failed; no successful data aggregation possible. |
| 598 | Unexpected internal server error. Check server logs. |
| 599 | Internal server error. Check server logs. |
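
A client can map these codes to a simple decision about whether the response body carries usable classifications. The sketch below is one possible approach: the names `RESPONSE_DESCRIPTIONS`, `USABLE_CODES`, and `interpret_status` are hypothetical, and treating 206 as usable (with partial results) is an assumption based on the description in the table.

```python
# Descriptions paraphrased from the response code table.
RESPONSE_DESCRIPTIONS = {
    200: "Successful response",
    206: "Partial content: only some providers classified data successfully",
    400: "Bad request: invalid input parameters or content",
    413: "Payload too large",
    415: "Unsupported media type",
    422: "Untrusted input",
    502: "Bad gateway: all upstream providers failed",
    598: "Unexpected internal server error",
    599: "Internal server error",
}

# Assumption: 200 and 206 both return classifications worth reading.
USABLE_CODES = {200, 206}


def interpret_status(status_code):
    """Return (usable, description) for a response status code."""
    description = RESPONSE_DESCRIPTIONS.get(status_code, "Undocumented response code")
    return status_code in USABLE_CODES, description
```

For example, `interpret_status(response.status_code)` could be called before attempting `response.json()`.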