Installation

Instructions for installing Cloud Storage Protector Service.

1 - Prerequisites

Requirements before installing the protector.

    AWS Services

    The following table describes the AWS services that may be a part of your Protegrity installation.

    Service | Description
    Lambda | Provides serverless compute for S3 Protector.
    S3 | Input and output data to be processed with S3 Protector.
    CloudWatch | Application and audit logs, performance monitoring, and alerts.

    Prerequisites

    Requirement | Detail
    S3 Protector distribution and installation scripts | These artifacts are provided by Protegrity.
    Protegrity Cloud Protect API | This product is required.
    AWS Account | Using the same AWS account as the Protegrity Cloud API deployment is recommended.

    Required Skills and Abilities

    Role / Skillset | Description
    AWS Account Administrator | Runs CloudFormation (or performs the steps manually) and creates or configures S3, VPC, and IAM permissions.
    Protegrity Administrator | Provides the ESA credentials required to read the policy configuration.

    2 - Pre-Configuration

    Configuration steps before installing the protector.

      Provide AWS sub-account

      Identify or create an AWS account where the Protegrity solution will be installed. The installation instructions assume the same AWS account and region are used for Cloud Protect API deployment.

      AWS Account ID: ___________________

      AWS Region: ___________________

      Create S3 bucket for Installing Artifacts

      This S3 bucket will be used for the artifacts required by the CloudFormation installation steps. This S3 bucket must be created in the region that is defined in Provide AWS sub-account.

      To create the S3 bucket for the installation artifacts:

      1. Sign in to the AWS Management Console and open the Amazon S3 console.

      2. Change the region to the one determined in Provide AWS sub-account.

      3. Click Create Bucket.

      4. Enter a unique bucket name:

        For example, protegrity-install.us-west-2.example.com.

      5. Upload the installation artifacts to this bucket. Protegrity will provide the following artifacts.

        • protegrity-s3-protector-<version>.zip

        Artifact S3 Bucket Name: ___________________

      Cloud Protect API function

      Protegrity Cloud Protect API on AWS is required for the S3 Protector installation. See the Cloud Protect API on AWS documentation to create a new installation if one is not already available in your account and region.

      With Cloud Protect API on AWS installed, follow these steps to obtain the ARN of the Cloud Protect API Lambda function.

      1. Access the AWS Management Console.

      2. Navigate to the Cloud Protect API function in the AWS Lambda service.

      3. Open the Cloud Protect API function.

      4. From the Lambda view, choose Aliases, then click the Production alias.

      5. At the top right, copy the Lambda function ARN and record it. The Cloud API Production alias ARN will be used later in this installation guide when creating the IAM policy and deploying S3 Protector with the CloudFormation template.

      Cloud Protect API function ARN: ____________________
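
      Before recording the value, it can help to confirm that the copied ARN is alias-qualified and belongs to the expected account and region. The following Python sketch splits a Lambda function ARN into its parts. The helper name parse_lambda_arn and the sample account ID, function name, and region are illustrative assumptions, not values from a Protegrity deployment.

```python
from typing import NamedTuple, Optional


class LambdaArn(NamedTuple):
    partition: str
    region: str
    account_id: str
    function_name: str
    qualifier: Optional[str]  # alias or version; None for an unqualified ARN


def parse_lambda_arn(arn: str) -> LambdaArn:
    """Split a Lambda function ARN into its colon-separated parts."""
    parts = arn.split(":")
    # Expected shape:
    # arn:<partition>:lambda:<region>:<account>:function:<name>[:<qualifier>]
    well_formed = (
        len(parts) in (7, 8)
        and parts[0] == "arn"
        and parts[2] == "lambda"
        and parts[5] == "function"
    )
    if not well_formed:
        raise ValueError(f"not a Lambda function ARN: {arn!r}")
    qualifier = parts[7] if len(parts) == 8 else None
    return LambdaArn(parts[1], parts[3], parts[4], parts[6], qualifier)


# Hypothetical example values.
arn = parse_lambda_arn(
    "arn:aws:lambda:us-west-2:123456789012:function:cloud-protect-api:Production"
)
print(arn.region, arn.account_id, arn.qualifier)
```

      If the qualifier is missing, the ARN was probably copied from the function view rather than from the Production alias view.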

      S3 Buckets For Input And Output Data

      Two S3 buckets are required. One bucket is used for incoming files. The second bucket is used for files processed by the S3 Protector. The buckets must be different. The S3 buckets should be created in the region that is defined in Provide AWS sub-account.

      Identify existing bucket names or follow the steps below to create new buckets.

      1. Sign in to the AWS Management Console and open the Amazon S3 console.

      2. Change the region to the one determined in Provide AWS sub-account.

      3. Select Create Bucket.

      4. Enter a globally unique bucket name. For example: in.us-west-2.example.com or out.us-west-2.example.com.

      5. Scroll down and configure the S3 bucket security features. It is strongly recommended to keep Block all public access on. It is also recommended to enable server-side encryption.

      6. Record bucket names. They will be required later in this installation guide.

      Input S3 Bucket Name: ____________________

      Output S3 Bucket Name: ____________________

      3 - S3 Protector Service Installation

      Install the S3 protector service.

        Preparation

        Ensure that all the steps in Pre-Configuration are performed.

        1. Log in to the AWS sub-account console where Protegrity will be installed.

        2. Ensure that the required CloudFormation templates provided by Protegrity are available on your local computer.

        Create S3 Protector Lambda IAM Execution Policy

        The steps below create an IAM policy for use by the Protegrity Lambda function. The policy grants permissions to:

        • Write logs to CloudWatch
        • Read from input S3 bucket
        • Write to output S3 bucket
        • Invoke Cloud Protect API function

        Steps

        1. From the AWS IAM console, select Policies > Create Policy.

        2. Select the JSON tab and paste in the following sample policy:

        {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Sid": "CloudWatchWriteLogs",
              "Effect": "Allow",
              "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
              ],
              "Resource": "*"
            },
            {
              "Sid": "ReadS3In",
              "Effect": "Allow",
              "Action": [
                "s3:GetObject",
                "s3:GetObjectVersion",
                "s3:GetObjectAcl",
                "s3:ListBucket",
                "s3:DeleteObject"
              ],
              "Resource": [
                "arn:aws:s3:::PLACEHOLDER_S3_IN_BUCKET_NAME",
                "arn:aws:s3:::PLACEHOLDER_S3_IN_BUCKET_NAME/*"
              ]
            },
            {
              "Sid": "WriteS3Out",
              "Effect": "Allow",
              "Action": [
                "s3:PutObject",
                "s3:ListBucket",
                "s3:PutObjectAcl",
                "s3:DeleteObject"
              ],
              "Resource": [
                "arn:aws:s3:::PLACEHOLDER_S3_OUT_BUCKET_NAME",
                "arn:aws:s3:::PLACEHOLDER_S3_OUT_BUCKET_NAME/*"
              ]
            },
            {
              "Sid": "InvokeCloudProtectApi",
              "Effect": "Allow",
              "Action": [
                "lambda:InvokeFunction"
              ],
              "Resource": [
                "PLACEHOLDER_CLOUD_PROTECT_API_ARN"
              ]
            }
          ]
        }
        
        3. Replace the PLACEHOLDER values with the values recorded in earlier steps:

          • Cloud Protect API prerequisites
          • S3 Data Buckets prerequisites
        4. Select Review policy, type in a policy name (e.g., ProtegrityS3ProtectorLambdaPolicy) and Confirm.

        5. Record the policy name.

          S3 Protector Function Policy Name: __________________
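
        The placeholder replacement step above can also be scripted, which reduces the risk of leaving a PLACEHOLDER token in the policy. The sketch below is a minimal illustration in Python, assuming the policy is held as a JSON-like structure. The template is an abbreviated excerpt of the sample policy (fewer actions, no CloudWatch statement), and the bucket names and ARN are hypothetical example values.

```python
import json

# Abbreviated excerpt of the sample policy, shortened for illustration only.
# In practice, load the full sample policy instead.
POLICY_TEMPLATE = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadS3In",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::PLACEHOLDER_S3_IN_BUCKET_NAME",
                "arn:aws:s3:::PLACEHOLDER_S3_IN_BUCKET_NAME/*",
            ],
        },
        {
            "Sid": "WriteS3Out",
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::PLACEHOLDER_S3_OUT_BUCKET_NAME",
                "arn:aws:s3:::PLACEHOLDER_S3_OUT_BUCKET_NAME/*",
            ],
        },
        {
            "Sid": "InvokeCloudProtectApi",
            "Effect": "Allow",
            "Action": ["lambda:InvokeFunction"],
            "Resource": ["PLACEHOLDER_CLOUD_PROTECT_API_ARN"],
        },
    ],
}


def fill_placeholders(node, values):
    """Recursively replace placeholder tokens in every string of the structure."""
    if isinstance(node, dict):
        return {key: fill_placeholders(val, values) for key, val in node.items()}
    if isinstance(node, list):
        return [fill_placeholders(item, values) for item in node]
    if isinstance(node, str):
        for token, value in values.items():
            node = node.replace(token, value)
    return node


# Hypothetical values; use the ones recorded in the pre-configuration steps.
values = {
    "PLACEHOLDER_S3_IN_BUCKET_NAME": "in.us-west-2.example.com",
    "PLACEHOLDER_S3_OUT_BUCKET_NAME": "out.us-west-2.example.com",
    "PLACEHOLDER_CLOUD_PROTECT_API_ARN": (
        "arn:aws:lambda:us-west-2:123456789012:function:cloud-protect-api:Production"
    ),
}
policy = fill_placeholders(POLICY_TEMPLATE, values)
policy_json = json.dumps(policy, indent=2)

# Fail loudly if any placeholder token survived the substitution.
if "PLACEHOLDER" in policy_json:
    raise ValueError("policy still contains unreplaced placeholders")
print(policy_json)
```

        The printed JSON can then be pasted into the JSON tab of the policy editor.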


        Create S3 Protector Lambda IAM Role

        The following steps create the role to utilize the policy defined in the previous section.

        Steps

        1. From the AWS IAM console, select Roles > Create Role.

        2. Select AWS Service > Lambda > Permissions.

        3. In the list, search for and select the policy created in the previous step.

        4. Proceed to Tags.

        5. Proceed to final step of the wizard.

        6. Type the role name (e.g., ProtegrityS3ProtectorLambdaRole) and click Confirm.

        7. Record the role ARN.

          Protegrity S3 Protector Lambda Role ARN: ___________________


        Install through CloudFormation

        The following steps describe deployment of the S3 Protector Lambda Function using CloudFormation.

        1. Access CloudFormation and select the target AWS Region in the console.

        2. Click Create Stack and choose With new resources.

        3. Specify the template.

        4. Select Upload a template file.

        5. Upload the Protegrity-provided CloudFormation template called pty_s3_protector_cf.json and click Next.

        6. Specify the stack details. Enter a stack name.

        7. Enter the required parameters. All the values were generated in the pre-configuration steps.

        CloudFormation Parameters

        ArtifactS3Bucket
          Description: The name of the S3 bucket containing the deployment package for S3 Protector. Use the Artifact S3 Bucket Name recorded in prerequisites.
          Allowed pattern: [a-zA-Z0-9.\-_]+

        CloudApiProtectorLambdaArn
          Description: The ARN of the Cloud Protect API Lambda which will be invoked by the S3 Protector Lambda. Use the Cloud Protect API function ARN recorded in prerequisites.
          Allowed pattern: arn:(aws[a-zA-Z-]*)?:lambda:[a-z]{2}(-gov)?-[a-z]+-\d{1}:\d{12}:function:[a-zA-Z0-9-_\.]+(:(\$LATEST|[a-zA-Z0-9-_]+))?

        DeleteInputFiles
          Description: Delete the input files after they have been successfully processed.
          Allowed values: [true, false]
          Default value: true

        IncludeHeader
          Description: Add a header to the output data.
          Allowed values: [true, false]
          Default value: true

        LambdaExecutionRoleArn
          Description: The S3 Protector Lambda IAM execution role ARN allowing access to CloudWatch logs and the S3 buckets. Use the Protegrity S3 Protector Lambda Role ARN recorded previously.
          Allowed pattern: arn:(aws[a-zA-Z-]*)?:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+

        MaxBatchSize
          Description: The maximum number of rows to process in a single Cloud API invocation.
          Allowed pattern: [0-9]+
          Default value: 25000

        MinLogLevel
          Description: Minimum log level for the S3 Protector function.
          Allowed values: [off, severe, warning, info, config, all]
          Default value: severe

        OutputFileCompressionType
          Description: Compression type to apply to processed files in the output S3 bucket.
          Allowed values: [gzip, none]
          Default value: gzip

        OutputFileNamePostfix
          Description: Postfix to append to processed file names in the output S3 bucket.
          Allowed values: [none, timestamp]
          Default value: timestamp

        OutputFileFormat
          Description: Format of the processed file saved in the output S3 bucket.
          Allowed values: [csv, json, parquet, preserve_input_format, use_mapping_spec, xlsx]
          Default value: preserve_input_format

        OutputS3BucketName
          Description: The name of the output S3 bucket where protected files will be saved. Use the Output S3 Bucket Name recorded in prerequisites.
          Allowed pattern: [a-zA-Z0-9.\-_]+

        PolicyUser
          Description: The name of the authorized user in the Protegrity security policy. This is the user which will be applied to every protect operation.

        LambdaFunctionProductionVersion
          Description: The S3 Protector Lambda version handling service requests.
          Allowed pattern: ([0-9]+|\$LATEST)
          Default value: $LATEST

        8. Click Next with the defaults to complete the CloudFormation stack creation.

        9. After CloudFormation is completed, select the Outputs tab in the stack.

        10. Record the S3 Protector Lambda Name and Arn.

          S3 Protector Lambda Name: __________________________

          S3 Protector Lambda Arn: ________________________________
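
        The allowed patterns and allowed values listed under CloudFormation Parameters can be checked locally before the stack is created, which catches typos earlier than a failed deployment would. The following Python sketch copies the patterns from the parameter list and assumes each pattern must match the whole value. The validate helper and all sample values are illustrative assumptions.

```python
import re

# Allowed patterns copied from the CloudFormation Parameters list.
PATTERNS = {
    "ArtifactS3Bucket": r"[a-zA-Z0-9.\-_]+",
    "CloudApiProtectorLambdaArn": (
        r"arn:(aws[a-zA-Z-]*)?:lambda:[a-z]{2}(-gov)?-[a-z]+-\d{1}:\d{12}"
        r":function:[a-zA-Z0-9-_\.]+(:(\$LATEST|[a-zA-Z0-9-_]+))?"
    ),
    "LambdaExecutionRoleArn": (
        r"arn:(aws[a-zA-Z-]*)?:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+"
    ),
    "MaxBatchSize": r"[0-9]+",
    "OutputS3BucketName": r"[a-zA-Z0-9.\-_]+",
    "LambdaFunctionProductionVersion": r"([0-9]+|\$LATEST)",
}

# Allowed values copied from the same list.
ALLOWED = {
    "DeleteInputFiles": {"true", "false"},
    "IncludeHeader": {"true", "false"},
    "MinLogLevel": {"off", "severe", "warning", "info", "config", "all"},
    "OutputFileCompressionType": {"gzip", "none"},
    "OutputFileNamePostfix": {"none", "timestamp"},
    "OutputFileFormat": {
        "csv", "json", "parquet", "preserve_input_format",
        "use_mapping_spec", "xlsx",
    },
}


def validate(params):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for name, value in params.items():
        if name in PATTERNS and not re.fullmatch(PATTERNS[name], value):
            problems.append(f"{name}: {value!r} does not match the allowed pattern")
        elif name in ALLOWED and value not in ALLOWED[name]:
            problems.append(f"{name}: {value!r} is not one of {sorted(ALLOWED[name])}")
    return problems


# Hypothetical values for illustration.
params = {
    "ArtifactS3Bucket": "protegrity-install.us-west-2.example.com",
    "CloudApiProtectorLambdaArn":
        "arn:aws:lambda:us-west-2:123456789012:function:cloud-protect-api:Production",
    "LambdaExecutionRoleArn":
        "arn:aws:iam::123456789012:role/ProtegrityS3ProtectorLambdaRole",
    "MaxBatchSize": "25000",
    "MinLogLevel": "severe",
    "OutputS3BucketName": "out.us-west-2.example.com",
    "LambdaFunctionProductionVersion": "$LATEST",
}
problems = validate(params)
print(problems or "all parameters look valid")
```

        Parameters without a listed pattern or value set, such as PolicyUser, are not checked by this sketch.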


        Test S3 Protector Function Configuration

        Perform the following steps to verify that the S3 Protector Function can read files from the input S3 bucket, call the Cloud API protector, and write data to the output S3 bucket.

        Before you begin:

        1. Update the S3 Protector CloudFormation stack with temporary settings used for testing:
          • In the AWS CloudFormation console, go to Stacks.
          • Select the CloudFormation stack deployed in the installation step.
          • In the stack details pane, choose Update.
          • Select Use existing template and then choose Next.
          • Change the following parameters:

            Parameter | Value | Note
            DeleteInputFiles | false | For testing purposes, the input file is not deleted after it is processed.
            MinLogLevel | config | The config level prints verbose log messages.
            OutputFileCompressionType | none | For testing purposes, compression is disabled for quicker visual verification of the output file.

          • Select Next and then Submit. Wait until the changes are deployed.
        2. Upload a sample data file to the S3 input bucket.

        data.csv:

        first name,last name,email
        tusqB,FrjKe,ebMgF.VoiDd@bqclblD.wOt
        JXVVW,acg,BikPa.ufb@UmPxcTD.bLh
        mDNJ,IZWCYkbnrAs,NWXD.GdrzMJwmwJG@fMZsuSE.Qlp
        jIqColWOss,XKfz,NVabzoUSgx.XRHM@BQleCST.Mnb
        muUxYvz,FLZxCHlca,eiNjzCm.UMRNYANwn@isvxpAV.PJk
        
        3. Upload mapping.json to the input S3 bucket next to the input data file. Replace the placeholders with data element names configured in your security policy. If your Cloud Protect API function uses the sample policy, you can replace “protect” with “unprotect” for the operation and use “alpha” as the data element.
        {
         "columns":{
            "first name":{
              "operation":"protect",
              "data_element":"<data_element_1_name>"
            },
            "last name":{
              "operation":"protect",
              "data_element":"<data_element_2_name>"
            },
            "email":{
              "operation":"protect",
              "data_element":"<data_element_3_name>"
            }
         }
        }
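
        When a dataset has many columns, the mapping.json file can be generated rather than typed by hand. The following Python sketch builds the structure shown above from a column-to-data-element dictionary and reports mapped columns that are absent from the CSV header. The helper names are illustrative, and how S3 Protector itself treats such mismatches is not covered here.

```python
import csv
import io
import json


def build_mapping(column_to_element, operation="protect"):
    """Build a mapping.json structure with one entry per column."""
    return {
        "columns": {
            column: {"operation": operation, "data_element": element}
            for column, element in column_to_element.items()
        }
    }


def columns_missing_from_header(csv_text, mapping):
    """Return mapped column names that do not appear in the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [name for name in mapping["columns"] if name not in header]


# First two lines of the sample data.csv shown earlier.
sample_csv = "first name,last name,email\ntusqB,FrjKe,ebMgF.VoiDd@bqclblD.wOt\n"

# The sample-policy variant described above: "unprotect" with the "alpha" data element.
mapping = build_mapping(
    {"first name": "alpha", "last name": "alpha", "email": "alpha"},
    operation="unprotect",
)
missing = columns_missing_from_header(sample_csv, mapping)

print(json.dumps(mapping, indent=1))
print("mapped columns missing from header:", missing)
```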
        

        Execute S3 Protector Function in AWS console:

        With the input data file and mapping file uploaded, follow the steps below to trigger the S3 Protector Function.

        1. Sign in to the AWS Management Console and go to the Lambda console.

        2. Select the Lambda function recorded as S3 Protector Lambda Name in the Install through CloudFormation section.

        3. On the S3 Protector Function page, choose the Test tab.

        4. Copy the following JSON test event into the Event JSON pane, replacing the bucket name placeholder with your input bucket name.

          {
            "Records": [
              {
                "s3": {
                  "bucket": {
                    "name": "<PLACEHOLDER_S3_IN_BUCKET_NAME>"
                  },
                  "object": {
                    "key": "data.csv"
                  }
                }
              }
            ]
          }
          
        5. Select Test to execute the test event.
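
        The test event only needs the bucket name and the object key, so it can be produced by a small helper when several files have to be tested. This is a minimal Python sketch. The function name build_test_event and the bucket name are illustrative assumptions, while the event shape follows the sample above.

```python
import json


def build_test_event(bucket, key):
    """Build the minimal S3-style test event shown in the steps above."""
    return {
        "Records": [
            {
                "s3": {
                    "bucket": {"name": bucket},
                    "object": {"key": key},
                }
            }
        ]
    }


# Hypothetical bucket name; use the Input S3 Bucket Name recorded earlier.
event = build_test_event("in.us-west-2.example.com", "data.csv")
print(json.dumps(event, indent=2))
```

        The printed JSON can be pasted into the Event JSON pane as is.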

        Verify execution results:

        1. Execution is successful if the output of the test contains the following:
        {
          "statusCode": 200,
          "body": {
            "target": "s3://<PLACEHOLDER_S3_OUT_BUCKET_NAME>/data.<timestamp>.csv"
          }
        }
        

        If the expected output is not present, please consult the Troubleshooting section for common errors and solutions.

        2. Download the output file named in the “target” field of the response body. Verify that it was processed according to your mapping.json. If the sample policy was used with “unprotect” and the “alpha” data element, the output file should contain the values below:
        first name,last name,email
        Lorem,Ipsum,lorem.ipsum@example.com
        Dolor,Sit,dolor.sit@example.com
        Amet,Consectetur,amet.consectetur@example.com
        Adipiscing,Elit,adipiscing.elit@example.com
        Vivamus,Elementum,vivamus.elementum@example.com
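
        The success check described above (status code 200 and a “target” field) can also be applied programmatically to the response. The following Python sketch extracts the output bucket and key from the target URI. The helper name and the timestamp in the sample response are assumptions, since the exact timestamp format is not specified here.

```python
import json


def output_location(result):
    """Return (bucket, key) of the output file, or raise if the run did not succeed."""
    status = result.get("statusCode")
    if status != 200:
        raise RuntimeError(f"S3 Protector returned status {status!r}")
    body = result["body"]
    if isinstance(body, str):
        # Tolerate a JSON-encoded body, in case the response is serialized.
        body = json.loads(body)
    target = body["target"]
    prefix = "s3://"
    if not target.startswith(prefix):
        raise ValueError(f"unexpected target URI: {target!r}")
    bucket, _, key = target[len(prefix):].partition("/")
    return bucket, key


# Hypothetical response; the bucket name and timestamp are made up.
response = {
    "statusCode": 200,
    "body": {"target": "s3://out.us-west-2.example.com/data.20240101T000000.csv"},
}
bucket, key = output_location(response)
print(bucket, key)
```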
        

        Restore production configuration:

        After the S3 Protector Function configuration has been verified, make sure that the following configuration is restored for the production environment:

        • CloudFormation configuration - restore the parameter values (DeleteInputFiles, MinLogLevel, OutputFileCompressionType) changed in the Before you begin steps at the beginning of this section.
        • IAM permissions - remove any additional S3 read/write IAM permissions used to manually upload test datasets to S3.

        Configure S3 Lambda Triggers

        Follow the steps below to configure Amazon S3 event notification on the input bucket. This will allow Amazon S3 to send an event to S3 Protector Lambda function when an object is created or updated.

        Steps to Add S3 Lambda trigger:

        1. Sign in to the AWS Management Console and open the AWS Lambda console.

        2. Select the Lambda function recorded as S3 Protector Lambda Name in the installation section.

        3. On the S3 Protector Function page, choose Aliases, then click the Production alias.

        4. In the Function overview pane, choose Add trigger.

        5. Select S3.

        6. Under Bucket, select the bucket recorded as Input S3 Bucket Name in the prerequisites section.

        7. Under Event types, select All object create events.

        8. Optionally enter a file prefix.

        9. Enter a file suffix, e.g.: .csv. You can find the full list of supported file formats in the Features section.

        10. Under Recursive invocation, select the check box to acknowledge that using the same Amazon S3 bucket for input and output is not recommended.

        11. Choose Add.

        12. Repeat these steps for additional file suffixes supported by S3 Protector.
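
        The console steps above result in an S3 event notification configuration on the input bucket, with one entry per file suffix. The Python sketch below builds that structure in the shape accepted by the S3 bucket notification API, as an illustration of what the trigger contains. The helper name, ARN, prefix, and suffix list are assumptions, and the suffixes actually supported are listed in the Features section.

```python
import json


def build_notification_config(lambda_alias_arn, suffixes, prefix=""):
    """Build one Lambda notification entry per file suffix."""
    entries = []
    for suffix in suffixes:
        rules = []
        if prefix:
            rules.append({"Name": "prefix", "Value": prefix})
        rules.append({"Name": "suffix", "Value": suffix})
        entries.append(
            {
                "LambdaFunctionArn": lambda_alias_arn,
                # Corresponds to "All object create events" in the console.
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {"Key": {"FilterRules": rules}},
            }
        )
    return {"LambdaFunctionConfigurations": entries}


# Hypothetical values for illustration.
config = build_notification_config(
    "arn:aws:lambda:us-west-2:123456789012:function:s3-protector:Production",
    suffixes=[".csv", ".json"],
    prefix="sales/",
)
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, a structure like this could
# be applied with put_bucket_notification_configuration(Bucket=...,
# NotificationConfiguration=config). That call is not executed here.
```

        Unlike the console, applying a configuration through the API does not grant Amazon S3 permission to invoke the function, so that permission would have to be added to the Lambda alias separately.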


        Example Usage

        This section describes typical usage of S3 Protector.

        Prepare data for testing:

        Sample datasets and mapping.json files are provided in appendix sections:

        • CSV with no header
        • CSV with pipe delimiter

        Create a new folder in the input S3 bucket:

        A new folder must be created in the S3 input bucket for each distinct file schema. Each folder can have a mapping.json file corresponding to the dataset type expected. It is recommended that input folders use S3 encryption:

        • From the AWS S3 console, search for and select the S3 input bucket created earlier for input files
        • Click the Create folder button
        • Provide a descriptive name for the type of dataset, e.g. sales orders
        • In Server-side encryption, select Enable
        • Use the default key type, Amazon S3 key (SSE-S3)
        • Click Create folder

        Upload the mapping.json and dataset to the folder:

        The appropriate mapping.json file must be uploaded to the folder prior to uploading the dataset.

        • Choose one of the sample dataset and mapping.json pairs from the appendix. Replace the data elements in mapping.json with similar data elements from your security policy
        • From the AWS console, navigate to Amazon S3, then search for and select the S3 input bucket created earlier for incoming files
        • Navigate to the desired folder
        • Click the Upload button
        • Click Add files
        • Upload the mapping.json file
        • Click the Upload button
        • Repeat the upload steps above for the sample dataset

        Verify output:

        Verify the output file was created:

        • From the AWS console, navigate to Amazon S3, then search for and select the S3 output (target) bucket created earlier for writing processed files
        • Navigate to the corresponding folder
        • There should be a non-zero byte file with protected values
        • Select the file
        • From the menu, select Actions | Query with S3 Select
        • Click Run SQL query
        • Click the Formatted tab of the result set
        • Verify the data is protected

        Troubleshooting / Logs:

        Logs are written to CloudWatch. This could provide helpful information if the results are not as expected:

        • From the AWS console, navigate to the Lambda service | Functions
        • Select and open the Lambda function created for protecting S3 files
        • At the top of function’s workspace, click the Monitoring tab
        • Click the button View logs in CloudWatch
        • Click the latest log stream
        • Scroll to the bottom of the stream for the latest log entries

        Troubleshooting

        By default, S3 Protector is set to log minimal information. While troubleshooting error conditions, it helps to increase the S3 Protector log level to either ‘config’ or ‘all’. Use the CloudFormation installation steps to change the ‘MinLogLevel’ parameter.

        S3 Protector Error States

        400 Error
          Description: A configuration error has occurred. The standard log should provide a descriptive error message. File processing has not started. Nothing was written to the target bucket.
          Action: Review the log for a descriptive error message. Most likely some configuration parameters will need to be updated before S3 Protector can be restarted for the failed file. Use the CloudFormation installation steps to update the function configuration.

        500 PermissionError
          Description: S3 Protector does not have enough permissions to access AWS resources.
          Action: Review the S3 Protector IAM policy.

        500 Exception
          Description: An error has occurred. The log may provide additional details. File processing may have started and a partial file may have been written to the target S3 bucket. S3 Protector does not write unprotected data to partially processed files, and it automatically removes these files on error.
          Action: Review the error log for additional information.

        Status: timeout
          Description: S3 Protector ran out of time while processing large files.
          Action: Review the S3 Protector Timeout section.

        AWS Lambda crash
          Description: Any AWS Lambda function may crash due to intermittent failures. If this occurs, a partial file may have been written to the target S3 bucket. Due to the crash, S3 will treat this file as an incomplete multi-part upload. Incomplete uploads do not appear as standard S3 files and are not shown in the AWS S3 console. You are still charged for incomplete uploads.
          Action:
            1. Discover and abort incomplete multi-part uploads for the target bucket (e.g. using AWS CLI).
            2. Restart S3 Protector for the failed file.
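
        The cleanup after an AWS Lambda crash is described above in terms of the AWS CLI. As an alternative illustration, the Python sketch below performs the same discovery and abort using the list_multipart_uploads and abort_multipart_upload operations of an S3 client such as the one boto3 provides. The helper name is an assumption. Note that it aborts every in-progress upload in the bucket, so it should only be run while no file is being processed.

```python
def abort_incomplete_uploads(s3_client, bucket):
    """Abort all in-progress multipart uploads in the bucket; return the aborted keys."""
    aborted = []
    kwargs = {"Bucket": bucket}
    while True:
        page = s3_client.list_multipart_uploads(**kwargs)
        for upload in page.get("Uploads", []):
            s3_client.abort_multipart_upload(
                Bucket=bucket,
                Key=upload["Key"],
                UploadId=upload["UploadId"],
            )
            aborted.append(upload["Key"])
        if not page.get("IsTruncated"):
            return aborted
        # Continue from where the previous page of results ended.
        kwargs["KeyMarker"] = page["NextKeyMarker"]
        kwargs["UploadIdMarker"] = page["NextUploadIdMarker"]


# Example usage (requires boto3 and AWS credentials; the bucket name is hypothetical):
#
#   import boto3
#   keys = abort_incomplete_uploads(boto3.client("s3"), "out.us-west-2.example.com")
#   print("aborted uploads for:", keys)
```

        The role running this code needs permission to list and abort multipart uploads on the output bucket, which the sample S3 Protector policy in this guide does not grant.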

        Restarting S3 Protector

        If S3 Protector fails, you can restart it for an existing source file from the AWS Lambda console, without re-uploading the file. With the input data file and mapping file already in the input bucket, follow the steps below to trigger the S3 Protector Function.

        Steps

        1. Sign in to the AWS Management Console and go to the Lambda console.

        2. Select the Lambda function recorded as S3 Protector Lambda Name in the CloudFormation installation section.

        3. On the S3 Protector Function page, choose the Test tab.

        4. Copy the following JSON test event into the Event JSON pane, replacing the bucket name placeholder with your input bucket name and the key with the key of the failed file:

        {
          "Records": [
            {
              "s3": {
                "bucket": {
                  "name": "<PLACEHOLDER_S3_IN_BUCKET_NAME>"
                },
                "object": {
                  "key": "data.csv"
                }
              }
            }
          ]
        }
        
        5. Select Test to execute the test event.
