System Requirements

Ensure that the following prerequisites are met before installing the Big Data Protector from Cloudera Manager:

  • The Hadoop cluster is installed, configured, and running CDP-PVC-Base (Cloudera Runtime 7.1 or later, with any compatible version of Cloudera Manager).
  • The ESA appliance, version 10.0.x or 10.1.x, is installed, configured, and running.
  • The ports listed in the following table are configured on ESA and on the cluster nodes that will run the Big Data Protector:
    Destination Port | Protocol | Source | Destination | Description
    -----------------|----------|--------|-------------|------------
    8443 | TLS | RP Agent on the Big Data Protector cluster node | ESA | The RP Agent communicates with ESA through port 8443 to download a policy.
    9200 | TLS | Log Forwarder on the Big Data Protector cluster node | Protegrity Audit Store appliance | The Log Forwarder sends all the logs to the Protegrity Audit Appliance through port 9200.
    15780 | TCP | Protector on the Big Data Protector cluster node | Log Forwarder on the Big Data Protector cluster node | The Big Data Protector writes Audit Logs to localhost through port 15780. The Application Logs are also written to localhost through port 15780. The Log Forwarder reads the logs from that socket.
  • The user installing the Big Data Protector has the requisite permissions to perform the following tasks:
    • Copy the Big Data Protector parcels and CSDs to the Cloudera Manager repository directories
    • Restart the Cloudera SCM Server
  • If you are installing the Big Data Protector on a cluster, then ensure that it is installed on all the nodes in the cluster.
  • The group ptyitusr and the user ptyitusr, which are responsible for managing the Big Data Protector-related services, are managed by Cloudera Manager. The user and group are unavailable on the cluster nodes.
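
The port requirements in the table above can be spot-checked from a cluster node before installation. The following is a minimal sketch, not part of the product: the function names and the host names in the usage note are hypothetical. It only tests whether a plain TCP connection can be opened; it does not perform a TLS handshake or confirm that the expected service is listening. The check for port 15780 is likely to be meaningful only after the Log Forwarder is running on the node.

```python
import socket

# Ports taken from the table above, with a short note on each hop.
REQUIRED_PORTS = {
    8443: "RP Agent -> ESA (policy download, TLS)",
    9200: "Log Forwarder -> Audit Store (log forwarding, TLS)",
    15780: "Protector -> Log Forwarder on localhost (TCP)",
}


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(targets: dict) -> dict:
    """Given a {port: host} mapping, return {port: reachable} for each entry."""
    return {port: can_connect(host, port) for port, host in targets.items()}
```

For example, `check_ports({8443: "esa.example.com", 9200: "audit-store.example.com", 15780: "localhost"})` returns a dictionary that maps each port to True or False. The host names here are placeholders for the actual ESA and Audit Store hosts.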

Note: This build supports both Spark 2 and Spark 3 on the cluster using a single pepspark jar.
For more information about installing Spark 3 on a CDP-PVC-Base cluster, refer to https://docs.cloudera.com/cdp-private-cloud-base/7.1.8/cds-3/topics/spark-install-spark-3-parcel.html

The following table lists the minimum hardware configuration for the Big Data Protector on CDP-PVC-Base.

Hardware Component | Configuration
-------------------|--------------
CPU | Depends on the application.
Disk Space | 130 MB on every node for the Log Forwarder and RP Agent.
RAM | In v10.0.0, the RP Agent loads the policy package into the shared memory. Every individual service process on a node that initializes the protector loads a copy of the policy package into the process heap memory. Therefore, the memory requirement on each node depends on the policy size and the number of protector instances (number of processes). In addition, the JVM heap size of each service, such as the YARN container heap size, must be configured appropriately to prevent out-of-memory errors.
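
The RAM guidance above can be turned into a rough sizing estimate. The sketch below assumes one copy of the policy package in shared memory plus one heap copy per process that initializes the protector, as the table describes. The function name is hypothetical, and the result is a lower bound for the policy data only: it ignores JVM overhead, the memory used by the services themselves, and any other per-process cost.

```python
def estimate_policy_memory_mb(policy_size_mb: float, protector_processes: int) -> float:
    """Rough lower bound for policy-related memory on one node, in MB.

    Assumes one shared-memory copy (loaded by the RP Agent) plus one
    heap copy for every process that initializes the protector.
    """
    if policy_size_mb < 0 or protector_processes < 0:
        raise ValueError("policy size and process count must be non-negative")
    return policy_size_mb * (1 + protector_processes)


# Illustrative numbers only: a 50 MB policy with 20 protector processes.
print(estimate_policy_memory_mb(50, 20))  # 50 * (1 + 20) = 1050
```

Because each process holds its own heap copy, the per-process share of this estimate (the policy size) also needs to fit inside the heap configured for that service, for example the YARN container heap.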

Last modified: February 20, 2026