<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>CDP AWS DataHub on</title><link>https://docs.protegrity.com/protectors/10.0/docs/bdp/cdp_aws_datahub/</link><description>Recent content in CDP AWS DataHub on</description><generator>Hugo</generator><language>en</language><atom:link href="https://docs.protegrity.com/protectors/10.0/docs/bdp/cdp_aws_datahub/index.xml" rel="self" type="application/rss+xml"/><item><title>Understanding the architecture</title><link>https://docs.protegrity.com/protectors/10.0/docs/bdp/cdp_aws_datahub/bdp_cdp_aws_dh_understnd_arch/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://docs.protegrity.com/protectors/10.0/docs/bdp/cdp_aws_datahub/bdp_cdp_aws_dh_understnd_arch/</guid><description>&lt;p>The architecture for the CDP-AWS-DataHub distribution of the Big Data Protector is depicted in the image below.
&lt;img src="https://docs.protegrity.com/protectors/10.0/docs/images/bdp/bdp_cdp_pvc_base_architecture.png" alt="" title="CDP-AWS-DataHub Architecture">&lt;/p>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Component&lt;/th>
 &lt;th>Description&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>RPAgent&lt;/td>
 &lt;td>Is a daemon running on each node that downloads the package from ESA over a TLS channel using the installed certificates.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Log Forwarder&lt;/td>
 &lt;td>Is a daemon running on each node that routes the audit logs and application logs to ESA/Audit Store.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>config.ini&lt;/td>
 &lt;td>Is a file on each node containing the set of configuration parameters to modify the protector behavior.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>BDP Layer&lt;/td>
 &lt;td>Contains the Big Data Protector UDFs and APIs executing in CDP service processes.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>JcoreLite&lt;/td>
 &lt;td>Is the JNI library that provides a Java API layer to the Core libraries.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Core&lt;/td>
 &lt;td>Is the set of various libraries that provide the Protegrity Core functionality.&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table></description></item><item><title>System Requirements</title><link>https://docs.protegrity.com/protectors/10.0/docs/bdp/cdp_aws_datahub/bdp_cdp_aws_dh_sys_req/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://docs.protegrity.com/protectors/10.0/docs/bdp/cdp_aws_datahub/bdp_cdp_aws_dh_sys_req/</guid><description>&lt;p>Ensure that the following prerequisites are met before installing the Big Data Protector from Cloudera Manager:&lt;/p>
&lt;ul>
&lt;li>The Hadoop cluster is installed, configured, and running CDP-AWS-DataHub (Cloudera Runtime 7.3.1).&lt;/li>
&lt;li>The ESA appliance, version 10.0.x or 10.1.x, is installed, configured, and running.&lt;/li>
&lt;li>The ports listed in the following table are configured on ESA and on the cluster nodes that will run the Big Data Protector:&lt;/li>
&lt;/ul>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Destination Port&lt;/th>
 &lt;th>Protocol&lt;/th>
 &lt;th>Source&lt;/th>
 &lt;th>Destination&lt;/th>
 &lt;th>Description&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>8443&lt;/td>
 &lt;td>TLS&lt;/td>
 &lt;td>RPAgent on the Big Data Protector cluster node&lt;/td>
 &lt;td>ESA&lt;/td>
 &lt;td>The RPAgent communicates with ESA through port&lt;br>8443 to download a policy.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>9200&lt;/td>
 &lt;td>TLS&lt;/td>
 &lt;td>Log Forwarder on the Big Data Protector cluster node&lt;/td>
 &lt;td>Protegrity Audit&lt;br>Store appliance&lt;/td>
 &lt;td>The Log Forwarder sends all the logs to&lt;br>the Protegrity Audit Store appliance through port 9200.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>15780&lt;/td>
 &lt;td>TCP&lt;/td>
 &lt;td>Protector on the Big Data Protector&lt;br>cluster node&lt;/td>
 &lt;td>Log Forwarder&lt;br>on the Big Data&lt;br>Protector cluster&lt;br>node&lt;/td>
 &lt;td>The Big Data Protector writes both audit logs and&lt;br>application logs to localhost through port 15780.&lt;br>The Log Forwarder reads the logs from that&lt;br>socket.&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
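&lt;p>The port requirements in the table above can be spot-checked from a cluster node before installation. The following is a minimal sketch, not part of the product: the host names &lt;code>esa.example.com&lt;/code> and &lt;code>audit-store.example.com&lt;/code> and the helper names are hypothetical placeholders. It only verifies that a TCP connection can be opened. It does not perform a TLS handshake or validate certificates, and port 15780 is expected to be reachable only after the Log Forwarder is running on the node.&lt;/p>

```python
import socket

# Ports taken from the table above. The host names are hypothetical
# placeholders; replace them with the addresses used in your deployment.
REQUIRED_PORTS = [
    ("esa.example.com", 8443, "RPAgent to ESA"),
    ("audit-store.example.com", 9200, "Log Forwarder to Audit Store"),
    ("localhost", 15780, "Protector to local Log Forwarder"),
]


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def check_required_ports(targets=REQUIRED_PORTS, timeout=3.0):
    """Return a list of (description, host, port, reachable) tuples."""
    return [
        (description, host, port, port_is_reachable(host, port, timeout))
        for host, port, description in targets
    ]
```

&lt;p>Calling &lt;code>check_required_ports()&lt;/code> on a node returns one tuple per row of the table, so any entry whose last element is &lt;code>False&lt;/code> points to a firewall rule or service that needs attention.&lt;/p>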
&lt;ul>
&lt;li>The user installing the Big Data Protector has the requisite permissions to perform the following tasks:
&lt;ul>
&lt;li>Copy the Big Data Protector parcels and CSDs to the Cloudera Manager repository directories&lt;/li>
&lt;li>Restart the Cloudera SCM Server&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>If you are installing the Big Data Protector on a cluster, then ensure that it is installed on all the nodes in the cluster.&lt;/li>
&lt;li>The group &lt;code>ptyitusr&lt;/code> and the user &lt;code>ptyitusr&lt;/code>, which are responsible for managing the Big Data Protector-related services, are managed by Cloudera Manager. The user and group are unavailable on the cluster nodes.&lt;/li>
&lt;/ul>
&lt;p>This build supports both Spark 2 and Spark 3 on the cluster using a single pepspark jar. &lt;br> For more information about installing Spark 3 on a CDP AWS DataHub cluster, refer to &lt;a href="https://docs.cloudera.com/cdp-private-cloud-base/7.1.8/cds-3/topics/spark-install-spark-3-parcel.html">https://docs.cloudera.com/cdp-private-cloud-base/7.1.8/cds-3/topics/spark-install-spark-3-parcel.html&lt;/a>.&lt;/p></description></item></channel></rss>