Creating a Cluster
The procedures in this section apply only to the Bootstrap approach for installing the Big Data Protector.
Perform the following steps to create an EMR cluster on AWS and install Big Data Protector on all the nodes in the EMR cluster.
To install Big Data Protector on a New EMR Cluster:
On the AWS services screen, click EMR under the Analytics section.
The Amazon EMR screen appears.
Click Create cluster.
The Create Cluster - Quick Options screen appears.
Type the name of the cluster in the Cluster name box.
Depending on the requirements, enter the sum of the master and core nodes in the Number of instances box.
Click Create cluster.
The Software and Steps tab on the Create Cluster - Advanced Options screen appears.
Depending on the requirements, select the components under the Software Configuration section.
Click Next.
The Hardware tab on the Create Cluster - Advanced Options screen appears.
On the Hardware tab, if required, you can increase or decrease the number of instances of the Master, Core, and Task nodes.
Click Next.
The General Cluster Settings tab on the Create Cluster - Advanced Options screen appears.
Type the name of the cluster in the Cluster name box.
Under the Bootstrap Actions area, in the Add bootstrap action drop-down list, click Custom action.
The Add Bootstrap Action dialog box appears.
Enter the name of the bootstrap action in the Name box.
To select the location of the bootstrap script, click the icon beside the Script location box.
The Select S3 File dialog box appears.
Enter the path of the S3 bucket in the URL box.
The contents of the S3 bucket appear.
Select the bdp_bootstrap_installer.sh file from the S3 bucket.
Click Select.
The Big Data Protector bootstrap script file is selected and the Add Bootstrap Action dialog box appears.
To specify the directory in which the Big Data Protector needs to be installed on the nodes in the cluster, provide the directory path in the Optional arguments box.
If an installation directory for the Big Data Protector is not specified, then /opt/protegrity/ is considered as the default directory.
Click Add.
The General Cluster Settings tab on the Create Cluster - Advanced Options screen appears and the Bootstrap actions are updated.
Click Next.
The Security tab on the Create Cluster - Advanced Options screen appears.
Select the required EC2 key pair for the EMR cluster from the EC2 key pair drop-down list.
Click Create Cluster.
The EMR cluster is created, Big Data Protector is installed on all the nodes in the cluster, and the required Big Data Protector parameters are configured.
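After the cluster is created, its state and the bootstrap actions registered on it can be checked from the AWS CLI. The following is a minimal sketch, assuming the AWS CLI is installed and configured for the account and region of the cluster. The function names cluster_state and cluster_bootstrap_actions and the cluster ID shown are hypothetical placeholders, not part of the Big Data Protector.

```shell
#!/usr/bin/env bash
# Hypothetical helpers. The cluster ID (for example j-XXXXXXXXXXXXX) is shown
# on the cluster details page in the EMR console.

# Print the current state of the cluster, such as STARTING, BOOTSTRAPPING,
# RUNNING, or WAITING.
cluster_state() {
  aws emr describe-cluster --cluster-id "$1" \
    --query 'Cluster.Status.State' --output text
}

# Print the name and script path of each bootstrap action on the cluster.
cluster_bootstrap_actions() {
  aws emr list-bootstrap-actions --cluster-id "$1" \
    --query 'BootstrapActions[].[Name,ScriptPath]' --output text
}

# Usage (not run here):
#   cluster_state j-XXXXXXXXXXXXX
#   cluster_bootstrap_actions j-XXXXXXXXXXXXX
```

If the Big Data Protector bootstrap action was added, its name and S3 path should appear in the output of the second helper.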
You can also create a new EMR cluster and install Big Data Protector on the nodes in the cluster from the CLI using the following command:
aws emr create-cluster \
  --auto-scaling-role EMR_AutoScaling_DefaultRole \
  --termination-protected \
  --applications Name=Hadoop Name=Hive Name=Pig Name=Hue Name=Spark Name=Tez Name=HBase \
  --bootstrap-actions '[{"Path":"<S3_Path_For_BootstrapInstaller>","Name":"<Script_Name>"}]' \
  --ec2-attributes '{"KeyName":"<KEY_NAME>","InstanceProfile":"EMR_EC2_DefaultRole","EmrManagedSlaveSecurityGroup":"sg-c8ef00de","EmrManagedMasterSecurityGroup":"sg-2deb043b"}' \
  --service-role EMR_DefaultRole \
  --enable-debugging \
  --release-label emr-<EMR_Version> \
  --log-uri 's3n://aws-logs-406396743807-us-east-1/elasticmapreduce/' \
  --name '<Cluster_Name>' \
  --instance-groups '[{"InstanceCount":2,"InstanceGroupType":"CORE","InstanceType":"m3.xlarge","Name":"Core - 2"},{"InstanceCount":1,"InstanceGroupType":"MASTER","InstanceType":"m3.xlarge","Name":"Master - 1"}]' \
  --scale-down-behavior TERMINATE_AT_INSTANCE_HOUR \
  --region us-east-1
where:
S3_Path_For_BootstrapInstaller: Specifies the S3 bucket path containing the Big Data Protector bootstrap installer script.
Script_Name: Specifies the name of the Big Data Protector installation script.
KEY_NAME: Specifies the Private Key file on the Master node in the EMR cluster, which is used to communicate with the other nodes in the cluster.
Cluster_Name: Specifies the name of the new EMR cluster.
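In the console procedure, the installation directory is passed through the Optional arguments box. In the CLI, arguments for a bootstrap action are supplied through the Args key of the --bootstrap-actions value. The following is a minimal sketch, assuming the bootstrap script takes the installation directory as its first argument, as it does in the console. The function name build_bootstrap_actions, the bucket name, and the sample directory are hypothetical, and the values are not JSON-escaped, so they must not contain double quotes or backslashes.

```shell
#!/usr/bin/env bash
# Build the JSON value for the --bootstrap-actions option.
#   $1  S3 path of the bootstrap installer script
#   $2  name of the bootstrap action
#   $3  optional installation directory (assumed to be the script's first argument)
build_bootstrap_actions() {
  if [ -n "${3:-}" ]; then
    # Include the Args key only when a directory is given.
    printf '[{"Path":"%s","Name":"%s","Args":["%s"]}]\n' "$1" "$2" "$3"
  else
    # Without Args, the script falls back to its default directory.
    printf '[{"Path":"%s","Name":"%s"}]\n' "$1" "$2"
  fi
}

# Example with hypothetical values:
build_bootstrap_actions "s3://my-bucket/bdp_bootstrap_installer.sh" "Install BDP" "/opt/bdp/"
```

The result can be passed to the command as --bootstrap-actions "$(build_bootstrap_actions ...)" in place of the literal JSON string.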