<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Introduction on</title><link>https://docs.protegrity.com/synthetic-data/1.0.1/docs/introduction/</link><description>Recent content in Introduction on</description><generator>Hugo</generator><language>en</language><atom:link href="https://docs.protegrity.com/synthetic-data/1.0.1/docs/introduction/index.xml" rel="self" type="application/rss+xml"/><item><title>Privacy-Preserving Characteristics</title><link>https://docs.protegrity.com/synthetic-data/1.0.1/docs/introduction/hide_intro_privacy_preserve/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://docs.protegrity.com/synthetic-data/1.0.1/docs/introduction/hide_intro_privacy_preserve/</guid><description>&lt;h2 id="no-direct-link-to-real-individuals">No Direct Link to Real Individuals&lt;/h2>
&lt;p>Protegrity Synthetic Data is generated from learned patterns in real datasets but does not contain any actual personal records. This ensures:&lt;/p>
&lt;ul>
&lt;li>No 1:1 mapping between synthetic and real data.&lt;/li>
&lt;li>No re-identification risk, even when used in sensitive domains, such as healthcare or finance.&lt;/li>
&lt;/ul>
&lt;h2 id="compliance-with-privacy-regulations">Compliance with Privacy Regulations&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>General Data Protection Regulation (GDPR)&lt;/strong>: Synthetic Data is considered anonymous under GDPR because it lacks identifiable links to real individuals.&lt;/li>
&lt;li>&lt;strong>Health Insurance Portability and Accountability Act (HIPAA)&lt;/strong>: Synthetic Data qualifies under the Safe Harbor and Expert Determination methods. This makes it suitable for healthcare data use, without being classified as Protected Health Information (PHI).&lt;/li>
&lt;/ul>
&lt;h2 id="built-in-privacy-safeguards">Built-In Privacy Safeguards&lt;/h2>
&lt;p>Protegrity’s Synthetic Data solution includes multiple privacy-enhancing features:&lt;/p></description></item><item><title>Comparison with Other Privacy-Enhancing Technologies</title><link>https://docs.protegrity.com/synthetic-data/1.0.1/docs/introduction/hide_intro_compare/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://docs.protegrity.com/synthetic-data/1.0.1/docs/introduction/hide_intro_compare/</guid><description>&lt;p>The following section compares Protegrity Synthetic Data with other data protection methods.&lt;/p>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Pseudonymization&lt;/strong> replaces real data with tokens for certain attributes, such as Personally Identifiable Information (PII). However, this method still uses real data, and the analytical value is fully preserved unless other attributes are also tokenized.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Anonymization&lt;/strong> reduces the risk of re-identification by transforming quasi-identifiers. However, careful balancing of utility and privacy is needed to minimize the impact on downstream usage.&lt;/p></description></item><item><title>Protegrity Synthetic Data Overview</title><link>https://docs.protegrity.com/synthetic-data/1.0.1/docs/introduction/hide_intro_synth_data_overview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://docs.protegrity.com/synthetic-data/1.0.1/docs/introduction/hide_intro_synth_data_overview/</guid><description>&lt;p>Protegrity Synthetic Data is a privacy-enhancing technology that uses real datasets to create artificial data. It does not represent real individuals and has no connection to real people. However, it still provides strong analytical utility and preserves relationships between variables.&lt;/p>
&lt;h2 id="key-characteristics-of-protegrity-synthetic-data">Key Characteristics of Protegrity Synthetic Data&lt;/h2>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Feature&lt;/th>
 &lt;th>Synthetic Data&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>Represents real people&lt;/td>
 &lt;td>False.&lt;br>It has no direct link to real individuals.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Closeness to real individuals&lt;/td>
 &lt;td>Low.&lt;br>It preserves the relationships between variables found in the real data without reproducing individual records.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Analytics and advanced analytics&lt;/td>
 &lt;td>High utility.&lt;br>It is suitable for ML, forecasting, and testing.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Maintain data types&lt;/td>
 &lt;td>Guaranteed.&lt;br>It preserves schema compatibility.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Internal and external sharing&lt;/td>
 &lt;td>Possible.&lt;br>It is compliant with privacy regulations like GDPR and HIPAA.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Simulating rare scenarios&lt;/td>
 &lt;td>Possible.&lt;br>It can simulate rare events, such as fraud patterns or edge cases not present in production.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Risk of re-identification&lt;/td>
 &lt;td>Low.&lt;br>It carries a lower risk of re-identification than Anonymization or Pseudonymization.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Data progression&lt;/td>
 &lt;td>Possible.&lt;br>It can be used to model data trends that change over time.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Cost&lt;/td>
 &lt;td>Moderate.&lt;br>Costs vary with the complexity of the data and the synthesis methods used.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Scalability&lt;/td>
 &lt;td>High.&lt;br>It can be generated in large volumes as needed.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Maintenance&lt;/td>
 &lt;td>Moderate.&lt;br>It requires periodic updates to reflect changes in real data.&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;p>Protegrity Synthetic Data is a powerful tool for privacy compliance. It:&lt;/p></description></item><item><title>How Protegrity Synthetic Data is Generated</title><link>https://docs.protegrity.com/synthetic-data/1.0.1/docs/introduction/hide_intro_synth_data_generation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://docs.protegrity.com/synthetic-data/1.0.1/docs/introduction/hide_intro_synth_data_generation/</guid><description>&lt;p>Protegrity Synthetic Data is a privacy-enhancing technology that creates artificial datasets. It works by learning from the structure and statistical properties of real data. It is designed to preserve analytical utility while protecting individual privacy. The process involves three key stages:&lt;/p>
&lt;h2 id="stage-1-extract-characteristics-from-original-data">Stage 1: Extract Characteristics from Original Data&lt;/h2>
&lt;p>The system analyzes the original dataset to understand its structure and relationships:&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Characteristics&lt;/th>
 &lt;th>Examples&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>Column types&lt;/td>
 &lt;td>string, integer, categorical&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Value distributions&lt;/td>
 &lt;td>age ranges, frequency of pet types&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Relationships between variables&lt;/td>
 &lt;td>age and pet ownership patterns&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
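&lt;p>The three characteristics in the table above can be illustrated with a minimal sketch in plain Python. This is not Protegrity's implementation: the toy dataset, the column names, the helper functions, and the rule that a string column with repeated values counts as categorical are all assumptions made for illustration.&lt;/p>

```python
from collections import Counter, defaultdict

# Hypothetical toy dataset; the column names are illustrative only.
rows = [
    {"name": "Ana", "age": 34, "pet": "dog"},
    {"name": "Ben", "age": 27, "pet": "cat"},
    {"name": "Cy", "age": 61, "pet": "dog"},
    {"name": "Dee", "age": 45, "pet": "cat"},
    {"name": "Eli", "age": 29, "pet": "cat"},
    {"name": "Fay", "age": 52, "pet": "dog"},
]


def column_type(values):
    """Classify a column as integer, categorical, or string."""
    if all(isinstance(v, int) for v in values):
        return "integer"
    # Assumption: a string column with repeated values is categorical.
    if len(set(values)) != len(values):
        return "categorical"
    return "string"


def age_band(age):
    """Bucket an age into a decade label such as '30-39'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def extract_characteristics(rows):
    """Return column types, value distributions, and one cross-column relationship."""
    columns = {key: [row[key] for row in rows] for key in rows[0]}
    types = {key: column_type(values) for key, values in columns.items()}
    # Value distributions: frequency of each pet type and each age band.
    pet_counts = Counter(columns["pet"])
    age_counts = Counter(age_band(a) for a in columns["age"])
    # Relationship between variables: pet counts within each age band.
    pets_by_age = defaultdict(Counter)
    for row in rows:
        pets_by_age[age_band(row["age"])][row["pet"]] += 1
    return types, pet_counts, age_counts, dict(pets_by_age)


types, pet_counts, age_counts, pets_by_age = extract_characteristics(rows)
print(types)
print(pet_counts)
print(pets_by_age)
```

&lt;p>In this sketch, summaries such as the per-band pet counts describe the data without carrying any single original row forward, which is the kind of information a generator could sample from.&lt;/p>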
&lt;h2 id="stage-2-generate-fictional-records">Stage 2: Generate Fictional Records&lt;/h2>
&lt;p>Based on the extracted characteristics, synthetic records are created using advanced modeling techniques:&lt;/p></description></item></channel></rss>