System Requirements
Before installing the Big Data Protector from Cloudera Manager, ensure that the following prerequisites are met:
- The Hadoop cluster is installed, configured, and running CDP-PVC-Base (Cloudera Runtime 7.1 or later, with any compatible version of Cloudera Manager).
- The ESA appliance, version 10.0.x or 10.1.x, is installed, configured, and running.
- The ports listed in the following table are configured on ESA and on the cluster nodes that will run the Big Data Protector:
| Destination Port | Protocol | Source | Destination | Description |
|---|---|---|---|---|
| 8443 | TLS | RP Agent on the Big Data Protector cluster node | ESA | The RP Agent communicates with ESA through port 8443 to download a policy. |
| 9200 | TLS | Log Forwarder on the Big Data Protector cluster node | Protegrity Audit Store appliance | The Log Forwarder sends all the logs to the Protegrity Audit Store appliance through port 9200. |
| 15780 | TCP | Protector on the Big Data Protector cluster node | Log Forwarder on the Big Data Protector cluster node | The Big Data Protector writes both audit logs and application logs to localhost through port 15780. The Log Forwarder reads the logs from that socket. |
- The user installing the Big Data Protector has the requisite permissions to perform the following tasks:
  - Copy the Big Data Protector parcels and CSDs to the Cloudera Manager repository directories
  - Restart the Cloudera SCM Server
- If you are installing the Big Data Protector on a cluster, then ensure that it is installed on all the nodes in the cluster.
- The group ptyitusr and the user ptyitusr, which are responsible for managing the Big Data Protector-related services, are managed by Cloudera Manager. The user and group are unavailable on the cluster nodes.
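
The port requirements in the table above can be spot-checked from a cluster node before installation. The following is a minimal sketch, not a Protegrity tool: the host names `esa.example.com` and `auditstore.example.com` are hypothetical placeholders, and the helper names are illustrative. It only tests whether a plain TCP connection can be opened and does not validate the TLS handshake. Port 15780 is expected to respond only after the Log Forwarder is running on the node.

```python
import socket

# Ports come from the table above; the host names are hypothetical placeholders.
REQUIRED_PORTS = [
    ("esa.example.com", 8443, "RP Agent -> ESA (policy download)"),
    ("auditstore.example.com", 9200, "Log Forwarder -> Audit Store"),
    ("localhost", 15780, "Protector -> Log Forwarder"),
]


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(targets=REQUIRED_PORTS, timeout=3.0):
    """Return a list of (host, port, description, reachable) tuples."""
    return [(h, p, d, is_port_open(h, p, timeout)) for h, p, d in targets]


if __name__ == "__main__":
    for host, port, desc, ok in check_ports():
        status = "OK  " if ok else "FAIL"
        print(f"{status} {host}:{port}  {desc}")
```

Run the script on each node that will host the Big Data Protector, after replacing the placeholder host names with the actual ESA and Audit Store addresses.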
Note: This build supports both Spark 2 and Spark 3 on the cluster using a single pepspark jar.
For more information about installing Spark 3 on a CDP-PVC-Base cluster, refer to https://docs.cloudera.com/cdp-private-cloud-base/7.1.8/cds-3/topics/spark-install-spark-3-parcel.html
The following table lists the minimum hardware configuration for the Big Data Protector on CDP-PVC-Base.
| Hardware Components | Configuration |
|---|---|
| CPU | Depends on the application. |
| Disk Space | 130 MB on every node for the Log Forwarder and the RP Agent |
| RAM | In v10.0.0, the RP Agent loads the policy package into shared memory. Every service process on a node that initializes the protector loads its own copy of the policy package into the process heap memory. Therefore, the memory requirement on each node depends on the policy size and the number of protector instances (number of processes). In addition, the JVM heap size of each service, such as the YARN container heap size, must be configured appropriately to prevent out-of-memory errors. |
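
The RAM guidance above can be turned into a rough per-node estimate. The sketch below assumes one copy of the policy package in shared memory (the RP Agent) plus one copy in the heap of each process that initializes the protector. The function name and the example figures are illustrative assumptions, and the in-memory size of a policy may differ from its packaged size, so treat the result as a lower bound rather than a sizing guarantee.

```python
def estimate_policy_memory_mb(policy_size_mb, protector_processes):
    """Rough per-node memory needed for policy copies, in MB.

    Assumes one shared-memory copy held by the RP Agent plus one heap copy
    per process that initializes the protector. Ignores JVM and OS overhead.
    """
    if policy_size_mb < 0 or protector_processes < 0:
        raise ValueError("inputs must be non-negative")
    shared_copy = policy_size_mb
    heap_copies = policy_size_mb * protector_processes
    return shared_copy + heap_copies


# Hypothetical example: a 50 MB policy and 8 protector processes on one node.
print(estimate_policy_memory_mb(50, 8))  # 450
```

Each process's share of this total must also fit within that service's configured JVM heap, for example the YARN container heap size.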