<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Data Discovery on</title><link>https://docs.protegrity.com/data-discovery/2.0.0/docs/</link><description>Recent content in Data Discovery on</description><generator>Hugo</generator><language>en</language><atom:link href="https://docs.protegrity.com/data-discovery/2.0.0/docs/index.xml" rel="self" type="application/rss+xml"/><item><title>Introduction</title><link>https://docs.protegrity.com/data-discovery/2.0.0/docs/introduction/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://docs.protegrity.com/data-discovery/2.0.0/docs/introduction/</guid><description>&lt;p>In an era where data privacy is paramount, safeguarding sensitive information in unstructured data has become critical—especially for organizations leveraging AI and machine learning technologies. Data Discovery is a powerful, developer-friendly product designed specifically to address this challenge.&lt;/p>
&lt;p>Data Discovery specializes in detecting Personally Identifiable Information (PII), Protected Health Information (PHI), and Payment Card Information (PCI) in free-text (unstructured) and table-based (structured, CSV) inputs. Unlike traditional data tools, it excels in dynamic, unstructured environments such as chatbot conversations, call transcripts, and Generative AI (Gen AI) outputs.&lt;/p></description></item><item><title>What's New</title><link>https://docs.protegrity.com/data-discovery/2.0.0/docs/whatsnew/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://docs.protegrity.com/data-discovery/2.0.0/docs/whatsnew/</guid><description>&lt;h2 id="data-discovery-20">Data Discovery 2.0&lt;/h2>
&lt;h3 id="major-changes">Major changes&lt;/h3>
&lt;p>&lt;strong>Standardized API Endpoints&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Updated Classify and Transform APIs:&lt;/p>
&lt;ul>
&lt;li>&lt;code>http://{Host Address}/pty/data-discovery/v2/classify/text&lt;/code> - &lt;a href="https://docs.protegrity.com/data-discovery/2.0.0/docs/restapi/classify/text/">Classify Text API&lt;/a>&lt;/li>
&lt;li>&lt;code>http://{Host Address}/pty/data-discovery/v2/classify/tabular&lt;/code> - &lt;a href="https://docs.protegrity.com/data-discovery/2.0.0/docs/restapi/classify/tabular/">Classify Tabular Data API&lt;/a>&lt;/li>
&lt;li>&lt;code>http://{Host Address}/pty/data-discovery/v2/transform/label&lt;/code> - &lt;a href="https://docs.protegrity.com/data-discovery/2.0.0/docs/restapi/transform/text/">Transform Text API&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;p>Added new endpoints:&lt;/p>
&lt;ul>
&lt;li>&lt;code>http://{Host Address}/pty/data-discovery/doc&lt;/code> – Provides the API documentation for Data Discovery. For more information, see &lt;a href="https://docs.protegrity.com/data-discovery/2.0.0/docs/restapi/common-apis/doc/">API Specification&lt;/a>.&lt;/li>
&lt;li>&lt;code>http://{Host Address}/pty/data-discovery/log&lt;/code> – Gets or sets the log level for Data Discovery. For more information, see &lt;a href="https://docs.protegrity.com/data-discovery/2.0.0/docs/restapi/common-apis/log/">Log level API&lt;/a>.&lt;/li>
&lt;li>&lt;code>http://{Host Address}/pty/data-discovery/version&lt;/code> – Retrieves the current version of Data Discovery. For more information, see &lt;a href="https://docs.protegrity.com/data-discovery/2.0.0/docs/restapi/common-apis/version/">Version API&lt;/a>.&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
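&lt;p>As a rough illustration of how the endpoint paths listed above fit together, the following Python sketch builds full URLs from a host address. Only the paths come from this release note. The helper name &lt;code>endpoint_url&lt;/code>, the dictionary keys, and the host &lt;code>discovery.example.com&lt;/code> are hypothetical placeholders, and the sketch does not send any request or assume anything about request or response formats.&lt;/p>

```python
# Illustrative sketch only: the paths are taken from the list above;
# every name defined here is a hypothetical placeholder.
BASE_PATH = "/pty/data-discovery"

ENDPOINT_PATHS = {
    "classify_text": "/v2/classify/text",
    "classify_tabular": "/v2/classify/tabular",
    "transform_label": "/v2/transform/label",
    "doc": "/doc",
    "log": "/log",
    "version": "/version",
}


def endpoint_url(host_address, name):
    """Return the full URL for a named endpoint on the given host."""
    if name not in ENDPOINT_PATHS:
        raise KeyError("unknown endpoint: " + name)
    return "http://" + host_address + BASE_PATH + ENDPOINT_PATHS[name]


if __name__ == "__main__":
    # Placeholder host; substitute the address of your own deployment.
    for key in ENDPOINT_PATHS:
        print(endpoint_url("discovery.example.com", key))
```

&lt;p>Keeping the paths in one table makes it easy to see that the classification and transformation endpoints sit under &lt;code>v2&lt;/code>, while the documentation, log, and version endpoints do not.&lt;/p>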
&lt;h3 id="enhancements">Enhancements&lt;/h3>
&lt;ul>
&lt;li>Updated Context Provider AI model for improved contextual accuracy.&lt;/li>
&lt;li>Updated Pattern Provider model for better pattern recognition.&lt;/li>
&lt;li>Updated the default score threshold for the Classify API from 0.0 to 0.7, aligning it with the Transform API, which already defaults to 0.7. Classifications with a score below the threshold are filtered out. The legacy v1.1 classification endpoint retains a threshold of 0.0 for backward compatibility.&lt;/li>
&lt;li>Added usage metrics logging to the Classification Service for improved analytics and visibility. For more details, see &lt;a href="https://docs.protegrity.com/data-discovery/2.0.0/docs/docs/restapi/usage_metrics/">Usage Metrics&lt;/a>.&lt;/li>
&lt;li>Added per-language accuracy metrics to improve visibility into multilingual performance. For more details, see &lt;a href="https://docs.protegrity.com/data-discovery/2.0.0/docs/docs/perf_accuracy/">Language Metrics&lt;/a>.&lt;/li>
&lt;li>Added PII detection in multiple Markdown dialects.&lt;/li>
&lt;li>Bug Fixes.&lt;/li>
&lt;/ul></description></item><item><title>General Architecture</title><link>https://docs.protegrity.com/data-discovery/2.0.0/docs/arch/</link><pubDate>Tue, 20 Feb 2024 00:00:00 +0000</pubDate><guid>https://docs.protegrity.com/data-discovery/2.0.0/docs/arch/</guid><description>&lt;p>The main components of the Protegrity Data Discovery product are as follows:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Classification Service&lt;/strong>: The Classification Service is the primary access point for all classification-related interactions. It orchestrates back-end components known as Providers, which execute the actual classification tasks.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Pattern and Context classification providers&lt;/strong>: The Providers are specialized modules that identify and classify Personally Identifiable Information (PII). They analyze input data to detect, classify, and locate sensitive information.&lt;/p></description></item><item><title>Performance and Accuracy</title><link>https://docs.protegrity.com/data-discovery/2.0.0/docs/perf_accuracy/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://docs.protegrity.com/data-discovery/2.0.0/docs/perf_accuracy/</guid><description>&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>Performance and accuracy are critical metrics for data discovery tools. These ensure that large datasets can be processed swiftly and sensitive information is correctly identified. High performance minimizes latency and maximizes productivity, while accuracy reduces the risk of data breaches and ensures compliance with regulatory standards like GDPR and CCPA.&lt;/p>
&lt;p>Together, these qualities are essential for maintaining data integrity and security in environments where unstructured data flows through various systems.&lt;/p></description></item><item><title>Usage Metrics</title><link>https://docs.protegrity.com/data-discovery/2.0.0/docs/usage_metrics/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://docs.protegrity.com/data-discovery/2.0.0/docs/usage_metrics/</guid><description>&lt;p>This section outlines the usage metrics generated by Data Discovery for classification requests. These metrics provide visibility into service usage and support scenarios such as internal chargeback across departments. The logs are designed to support monitoring, auditing, and capacity planning.&lt;/p>
&lt;h2 id="overview">Overview&lt;/h2>
&lt;p>When you submit a classification request to Data Discovery, the service generates a usage log entry after the request is processed. A log entry is created for every request, regardless of whether the request succeeds or fails.&lt;/p></description></item></channel></rss>