S3 Protector Service Installation

Install the S3 protector service.

    Preparation

    Ensure that all the steps in Pre-Configuration are performed.

    1. Log in to the AWS sub-account console where Protegrity will be installed.

    2. Ensure that the required CloudFormation templates provided by Protegrity are available on your local computer.

    Create S3 Protector Lambda IAM Execution Policy

    The steps below create an IAM policy for use by the Protegrity Lambda function. The policy grants permissions to:

    • Write logs to CloudWatch
    • Read from input S3 bucket
    • Write to output S3 bucket
    • Invoke Cloud Protect API function

    Steps

    1. From the AWS IAM console, select Policies > Create Policy.

    2. Select the JSON tab and copy the following sample policy:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "CloudWatchWriteLogs",
          "Effect": "Allow",
          "Action": [
            "logs:CreateLogGroup",
            "logs:CreateLogStream",
            "logs:PutLogEvents"
          ],
          "Resource": "*"
        },
        {
          "Sid": "ReadS3In",
          "Effect": "Allow",
          "Action": [
            "s3:GetObject",
            "s3:GetObjectVersion",
            "s3:GetObjectAcl",
            "s3:ListBucket",
            "s3:DeleteObject"
          ],
          "Resource": [
            "arn:aws:s3:::PLACEHOLDER_S3_IN_BUCKET_NAME",
            "arn:aws:s3:::PLACEHOLDER_S3_IN_BUCKET_NAME/*"
          ]
        },
        {
          "Sid": "WriteS3Out",
          "Effect": "Allow",
          "Action": [
            "s3:PutObject",
            "s3:ListBucket",
            "s3:PutObjectAcl",
            "s3:DeleteObject"
          ],
          "Resource": [
            "arn:aws:s3:::PLACEHOLDER_S3_OUT_BUCKET_NAME",
            "arn:aws:s3:::PLACEHOLDER_S3_OUT_BUCKET_NAME/*"
          ]
        },
        {
          "Sid": "InvokeCloudProtectApi",
          "Effect": "Allow",
          "Action": [
            "lambda:InvokeFunction"
          ],
          "Resource": [
            "PLACEHOLDER_CLOUD_PROTECT_API_ARN"
          ]
        }
      ]
    }
    
    3. Replace the PLACEHOLDER values with the values recorded in earlier steps:

      • Cloud Protect API prerequisites
      • S3 Data Buckets prerequisites
    4. Select Review policy, type in a policy name (e.g., ProtegrityS3ProtectorLambdaPolicy) and Confirm.

    5. Record the policy name.

      S3 Protector Function Policy Name: __________________
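    The placeholder replacement can also be scripted. The following is a minimal sketch, not a Protegrity-provided tool: it substitutes the PLACEHOLDER tokens in the policy text and refuses to continue if any token is left unreplaced. The template below is an abbreviated stand-in for the sample policy, and the bucket name, account ID, and function name are hypothetical values.

```python
import json
import re

# Abbreviated stand-in for the sample policy; paste the full policy in practice.
POLICY_TEMPLATE = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadS3In",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::PLACEHOLDER_S3_IN_BUCKET_NAME",
        "arn:aws:s3:::PLACEHOLDER_S3_IN_BUCKET_NAME/*"
      ]
    },
    {
      "Sid": "InvokeCloudProtectApi",
      "Effect": "Allow",
      "Action": ["lambda:InvokeFunction"],
      "Resource": ["PLACEHOLDER_CLOUD_PROTECT_API_ARN"]
    }
  ]
}
"""

# Hypothetical example values; use the ones recorded in the prerequisites.
VALUES = {
    "PLACEHOLDER_S3_IN_BUCKET_NAME": "example-input-bucket",
    "PLACEHOLDER_CLOUD_PROTECT_API_ARN": (
        "arn:aws:lambda:us-east-1:123456789012:function:example-cloud-protect-api"
    ),
}


def fill_policy(template_text, values):
    """Replace PLACEHOLDER_* tokens and return the policy as a dict."""
    filled = template_text
    for token, value in values.items():
        filled = filled.replace(token, value)
    leftover = sorted(set(re.findall(r"PLACEHOLDER_[A-Z0-9_]+", filled)))
    if leftover:
        raise ValueError(f"unreplaced placeholders: {leftover}")
    # Parsing also confirms that the edited text is still valid JSON.
    return json.loads(filled)


policy = fill_policy(POLICY_TEMPLATE, VALUES)
print(json.dumps(policy, indent=2))
```

    The printed JSON can be pasted into the JSON tab of the policy editor.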


    Create S3 Protector Lambda IAM Role

    The following steps create the role to utilize the policy defined in the previous section.

    Steps

    1. From the AWS IAM console, select Roles > Create Role.

    2. Select AWS Service > Lambda > Permissions.

    3. In the list, search for and select the policy created in the previous step.

    4. Proceed to Tags.

    5. Proceed to final step of the wizard.

    6. Type the role name (e.g., ProtegrityS3ProtectorLambdaRole) and click Confirm.

    7. Record the role ARN.

      Protegrity S3 Protector Lambda Role ARN: ___________________
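    Choosing Lambda as the trusted AWS service in the wizard gives the role a trust policy that lets the Lambda service assume it. If the role is created by script instead of the console, an equivalent trust policy document is needed. The sketch below shows the usual shape of that document; it is an illustration and is not taken from the Protegrity templates.

```python
import json


def lambda_trust_policy():
    """Trust policy allowing the AWS Lambda service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


trust_policy = lambda_trust_policy()
print(json.dumps(trust_policy, indent=2))
```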


    Install through CloudFormation

    The following steps describe deployment of the S3 Protector Lambda Function using CloudFormation.

    1. Access CloudFormation and select the target AWS Region in the console.

    2. Click Create Stack and choose With new resources.

    3. Specify the template.

    4. Select Upload a template file.

    5. Upload the Protegrity-provided CloudFormation template called pty_s3_protector_cf.json and click Next.

    6. Specify the stack details. Enter a stack name.

    7. Enter the required parameters. All the values were generated in the pre-configuration steps.

    CloudFormation Parameters

    Each parameter is listed with its description, its constraints, and its default value where one exists.

    Parameter: ArtifactS3Bucket
      Description: The name of the S3 bucket containing deployment package for S3 Protector. Use Artifact S3 Bucket Name recorded in prerequisites.
      Allowed pattern: [a-zA-Z0-9.\-_]+

    Parameter: CloudApiProtectorLambdaArn
      Description: The ARN of the Cloud Protect API Lambda which will be invoked by S3 Protector Lambda. Use Cloud Protect API function ARN recorded in prerequisites.
      Allowed pattern: arn:(aws[a-zA-Z-]*)?:lambda:[a-z]{2}(-gov)?-[a-z]+-\d{1}:\d{12}:function:[a-zA-Z0-9-_\.]+(:(\$LATEST|[a-zA-Z0-9-_]+))?

    Parameter: DeleteInputFiles
      Description: Delete the input files after they have been successfully processed.
      Allowed values: [true, false]
      Default value: true

    Parameter: IncludeHeader
      Description: Add header to output data.
      Allowed values: [true, false]
      Default value: true

    Parameter: LambdaExecutionRoleArn
      Description: S3 Protector Lambda IAM execution role ARN allowing access to CloudWatch logs and S3 bucket. Use Protegrity S3 Protector Lambda Role ARN recorded previously.
      Allowed pattern: arn:(aws[a-zA-Z-]*)?:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+

    Parameter: MaxBatchSize
      Description: The maximum number of rows to process in single Cloud API invocation.
      Allowed pattern: [0-9]+
      Default value: 25000

    Parameter: MinLogLevel
      Description: Minimum log level for S3 protector function.
      Allowed values: [off, severe, warning, info, config, all]
      Default value: severe

    Parameter: OutputFileCompressionType
      Description: Compression type to apply to processed files in the output s3 bucket.
      Allowed values: [gzip, none]
      Default value: gzip

    Parameter: OutputFileNamePostfix
      Description: Postfix to append to processed file names in the output s3 bucket.
      Allowed values: [none, timestamp]
      Default value: timestamp

    Parameter: OutputFileFormat
      Description: Format of the processed file saved in the output s3 bucket.
      Allowed values: [csv, json, parquet, preserve_input_format, use_mapping_spec, xlsx]
      Default value: preserve_input_format

    Parameter: OutputS3BucketName
      Description: The name of the output S3 bucket where protected files will be saved. Use Output S3 Bucket Name recorded in prerequisites.
      Allowed pattern: [a-zA-Z0-9.\-_]+

    Parameter: PolicyUser
      Description: The name of the authorized user in the Protegrity security policy. This is the user which will be applied to every protect operation.

    Parameter: LambdaFunctionProductionVersion
      Description: S3 Protector Lambda version handling service requests.
      Allowed pattern: ([0-9]+|\$LATEST)
      Default value: $LATEST

    8. Click Next with defaults to complete CloudFormation.

    9. After CloudFormation is completed, select the Outputs tab in the stack.

    10. Record the S3 Protector Lambda Name and Arn.

      S3 Protector Lambda Name: __________________________

      S3 Protector Lambda Arn: ________________________________
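    Parameter values can be checked against the documented constraints before the stack is created, which avoids a failed deployment caused by a typo. The sketch below copies the allowed patterns and allowed values from the CloudFormation Parameters list and assumes that a pattern must match the entire value. The example values (bucket names, account ID, function and user names) are hypothetical.

```python
import re

# Constraints copied from the CloudFormation Parameters list.
ALLOWED_PATTERNS = {
    "ArtifactS3Bucket": r"[a-zA-Z0-9.\-_]+",
    "CloudApiProtectorLambdaArn": (
        r"arn:(aws[a-zA-Z-]*)?:lambda:[a-z]{2}(-gov)?-[a-z]+-\d{1}:\d{12}"
        r":function:[a-zA-Z0-9-_\.]+(:(\$LATEST|[a-zA-Z0-9-_]+))?"
    ),
    "LambdaExecutionRoleArn": (
        r"arn:(aws[a-zA-Z-]*)?:iam::\d{12}:role/?[a-zA-Z_0-9+=,.@\-_/]+"
    ),
    "MaxBatchSize": r"[0-9]+",
    "OutputS3BucketName": r"[a-zA-Z0-9.\-_]+",
    "LambdaFunctionProductionVersion": r"([0-9]+|\$LATEST)",
}
ALLOWED_VALUES = {
    "DeleteInputFiles": {"true", "false"},
    "IncludeHeader": {"true", "false"},
    "MinLogLevel": {"off", "severe", "warning", "info", "config", "all"},
    "OutputFileCompressionType": {"gzip", "none"},
    "OutputFileNamePostfix": {"none", "timestamp"},
    "OutputFileFormat": {
        "csv", "json", "parquet",
        "preserve_input_format", "use_mapping_spec", "xlsx",
    },
}


def validate_parameters(params):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, value in params.items():
        pattern = ALLOWED_PATTERNS.get(name)
        if pattern is not None and re.fullmatch(pattern, value) is None:
            problems.append(f"{name}: {value!r} does not match {pattern}")
        allowed = ALLOWED_VALUES.get(name)
        if allowed is not None and value not in allowed:
            problems.append(f"{name}: {value!r} not in {sorted(allowed)}")
    return problems


# Hypothetical values for illustration only.
example_params = {
    "ArtifactS3Bucket": "example-artifact-bucket",
    "CloudApiProtectorLambdaArn": (
        "arn:aws:lambda:us-east-1:123456789012:function:example-cloud-protect-api"
    ),
    "LambdaExecutionRoleArn": (
        "arn:aws:iam::123456789012:role/ProtegrityS3ProtectorLambdaRole"
    ),
    "MaxBatchSize": "25000",
    "MinLogLevel": "severe",
    "OutputS3BucketName": "example-output-bucket",
    "PolicyUser": "example_user",
    "LambdaFunctionProductionVersion": "$LATEST",
}
print(validate_parameters(example_params))
```

    Parameters without a documented constraint, such as PolicyUser, are passed through unchecked.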


    Test S3 Protector Function Configuration

    Perform the following steps to verify that the S3 Protector Function can read files from the input S3 bucket, call the Cloud API protector, and write data to the output S3 bucket.

    Before you begin:

    1. Update the S3 Protector CloudFormation stack with temporary settings used for testing:
      • In the AWS CloudFormation console, go to Stacks
      • Select the CloudFormation stack deployed in the installation step
      • In the stack details pane, choose Update
      • Select Use existing template and then choose Next
      • Change the following parameters:

        Parameter: DeleteInputFiles
          Value: false
          Note: For testing purposes, the input file will not be deleted after it is processed.

        Parameter: MinLogLevel
          Value: config
          Note: Config level prints verbose log messages.

        Parameter: OutputFileCompressionType
          Value: none
          Note: For testing purposes, compression is disabled for quicker visual verification of the output file.

      • Select Next and then Submit. Wait until the changes are deployed.
    2. Upload the sample data file to the S3 input bucket.

    data.csv:

    first name,last name,email
    tusqB,FrjKe,ebMgF.VoiDd@bqclblD.wOt
    JXVVW,acg,BikPa.ufb@UmPxcTD.bLh
    mDNJ,IZWCYkbnrAs,NWXD.GdrzMJwmwJG@fMZsuSE.Qlp
    jIqColWOss,XKfz,NVabzoUSgx.XRHM@BQleCST.Mnb
    muUxYvz,FLZxCHlca,eiNjzCm.UMRNYANwn@isvxpAV.PJk
    
    3. Upload mapping.json to the input S3 bucket next to the input data file. Replace the placeholders with data element names configured in your security policy. If your Cloud Protect API Function uses the sample policy, you can replace “protect” with “unprotect” as the operation and use “alpha” as the data element.
    {
     "columns":{
        "first name":{
          "operation":"protect",
          "data_element":"<data_element_1_name>"
        },
        "last name":{
          "operation":"protect",
          "data_element":"<data_element_2_name>"
        },
        "email":{
          "operation":"protect",
          "data_element":"<data_element_3_name>"
        }
     }
    }
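    For datasets with many columns, mapping.json can be generated rather than edited by hand. The sketch below builds the structure shown above from a column-to-data-element dictionary. The helper name build_mapping is hypothetical, and only the two operations named in this section are accepted.

```python
import json


def build_mapping(column_to_element, operation="protect"):
    """Build the mapping.json structure for the given columns."""
    if operation not in ("protect", "unprotect"):
        raise ValueError(f"unsupported operation: {operation!r}")
    return {
        "columns": {
            column: {"operation": operation, "data_element": element}
            for column, element in column_to_element.items()
        }
    }


# Sample-policy variant described in the step above: unprotect with "alpha".
mapping = build_mapping(
    {"first name": "alpha", "last name": "alpha", "email": "alpha"},
    operation="unprotect",
)
print(json.dumps(mapping, indent=2))
```

    Write the printed JSON to a file named mapping.json before uploading it.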
    

    Execute S3 Protector Function in AWS console:

    With the input data file and mapping file uploaded, follow the steps below to trigger the S3 Protector Function.

    1. Sign in to the AWS Management Console and go to the Lambda console.

    2. Select the Lambda function recorded as S3 Protector Lambda Name in the Install through CloudFormation section.

    3. On the S3 Protector Function page, choose the Test tab.

    4. Copy the JSON test event into the Event JSON pane and replace the bucket name placeholder with your input bucket name.

      {
        "Records": [
          {
            "s3": {
              "bucket": {
                "name": "<PLACEHOLDER_S3_IN_BUCKET_NAME>"
              },
              "object": {
                "key": "data.csv"
              }
            }
          }
        ]
      }
      
    5. Select Test to execute the test event.
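    The test event contains only the bucket name and the object key, so it is easy to generate for other files. The sketch below produces the same structure as the test event in step 4; the helper name and the bucket name are hypothetical.

```python
import json


def build_test_event(bucket, key):
    """Return a minimal S3-style event for one object."""
    return {
        "Records": [
            {"s3": {"bucket": {"name": bucket}, "object": {"key": key}}}
        ]
    }


event = build_test_event("example-input-bucket", "data.csv")
print(json.dumps(event, indent=2))
```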

    Verify execution results:

    1. Execution is successful if the output of the test contains the following:
    {
      "statusCode": 200,
      "body": {
        "target": "s3://<PLACEHOLDER_S3_OUT_BUCKET_NAME>/data.<timestamp>.csv"
      }
    }
    

    If the expected output is not present, please consult the Troubleshooting section for common errors and solutions.
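    When the check is automated, the response can be inspected in code. The sketch below assumes the response has the shape shown above and splits the “target” value into bucket and key. Handling a JSON-encoded string body is a defensive assumption, and the timestamp in the example file name is made up, since the exact timestamp format is not specified here.

```python
import json
from urllib.parse import urlparse


def extract_target(result):
    """Return (bucket, key) of the output file, or raise on failure."""
    status = result.get("statusCode")
    if status != 200:
        raise RuntimeError(f"S3 Protector returned status {status}")
    body = result["body"]
    if isinstance(body, str):  # assumption: body may arrive JSON-encoded
        body = json.loads(body)
    parsed = urlparse(body["target"])
    if parsed.scheme != "s3":
        raise ValueError(f"unexpected target: {body['target']}")
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical response; the timestamp format is illustrative only.
sample_result = {
    "statusCode": 200,
    "body": {"target": "s3://example-output-bucket/data.20260112T101500.csv"},
}
print(extract_target(sample_result))
```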

    2. Download the output file named in the “target” field of the response body. Verify that it was processed according to your mapping.json. If the sample policy was used with “unprotect” and the “alpha” data element, the output file should contain the values below:
    first name,last name,email
    Lorem,Ipsum,lorem.ipsum@example.com
    Dolor,Sit,dolor.sit@example.com
    Amet,Consectetur,amet.consectetur@example.com
    Adipiscing,Elit,adipiscing.elit@example.com
    Vivamus,Elementum,vivamus.elementum@example.com
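    A quick structural comparison can catch truncated or misaligned output before the values are inspected by eye. The sketch below assumes CSV output with IncludeHeader set to true and checks only that the header, the row count, and the number of fields per row match the input. It does not check the protected values themselves. The data is the first two rows of the sample input and expected output.

```python
import csv
import io


def same_shape(input_csv, output_csv):
    """True if both CSV texts share header, row count and field counts."""
    rows_in = list(csv.reader(io.StringIO(input_csv)))
    rows_out = list(csv.reader(io.StringIO(output_csv)))
    if not rows_in or not rows_out:
        return False
    return (
        rows_in[0] == rows_out[0]
        and len(rows_in) == len(rows_out)
        and all(len(a) == len(b) for a, b in zip(rows_in, rows_out))
    )


input_text = (
    "first name,last name,email\n"
    "tusqB,FrjKe,ebMgF.VoiDd@bqclblD.wOt\n"
    "JXVVW,acg,BikPa.ufb@UmPxcTD.bLh\n"
)
output_text = (
    "first name,last name,email\n"
    "Lorem,Ipsum,lorem.ipsum@example.com\n"
    "Dolor,Sit,dolor.sit@example.com\n"
)
print(same_shape(input_text, output_text))
```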
    

    Restore production configuration:

    After the S3 Protector Function configuration has been verified, restore the following configuration for the production environment:

    • CloudFormation configuration - restore the parameter values changed in the Before you begin steps of this section.
    • IAM permissions - remove any additional S3 read/write IAM permissions used to manually upload test datasets to S3.

    Configure S3 Lambda Triggers

    Follow the steps below to configure Amazon S3 event notification on the input bucket. This will allow Amazon S3 to send an event to S3 Protector Lambda function when an object is created or updated.

    Steps to Add S3 Lambda trigger:

    1. Sign in to the AWS Management Console and open the AWS Lambda console.

    2. Select the Lambda function recorded as S3 Protector Lambda Name in the installation section.

    3. On the S3 Protector Function page, choose Aliases, then click the Production alias.

    4. In the Function overview pane, choose Add trigger.

    5. Select S3.

    6. Under Bucket, select the bucket recorded in Input S3 Bucket Name in prerequisites section.

    7. Under Event types, select All object create events.

    8. Optionally enter a file prefix.

    9. Enter a file suffix, e.g., .csv. The full list of supported file formats is in the Features section.

    10. Under Recursive invocation, select the check box to acknowledge that using the same Amazon S3 bucket for input and output is not recommended.

    11. Choose Add.

    12. Repeat these steps for additional file suffixes supported by S3 Protector.
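    Because each trigger carries a single suffix, the console steps are repeated once per file type. For scripted setups, the equivalent bucket notification configuration holds one entry per suffix. The sketch below only builds that configuration document. The ARN, the alias qualifier, the prefix, and the suffix list are hypothetical examples, and applying the configuration (along with granting Amazon S3 permission to invoke the function, which the console does automatically) is left out.

```python
import json


def build_notification_config(lambda_arn, suffixes, prefix=None):
    """One Lambda notification entry per suffix, all object-create events."""
    entries = []
    for suffix in suffixes:
        rules = []
        if prefix:
            rules.append({"Name": "prefix", "Value": prefix})
        rules.append({"Name": "suffix", "Value": suffix})
        entries.append(
            {
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {"Key": {"FilterRules": rules}},
            }
        )
    return {"LambdaFunctionConfigurations": entries}


# Hypothetical ARN qualified with the Production alias.
ALIAS_ARN = (
    "arn:aws:lambda:us-east-1:123456789012:function:example-s3-protector:Production"
)
config = build_notification_config(
    ALIAS_ARN, [".csv", ".parquet"], prefix="sales-orders/"
)
print(json.dumps(config, indent=2))
```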


    Example Usage

    This section describes typical usage of S3 Protector.

    Prepare data for testing:

    Sample datasets and mapping.json files are provided in appendix sections:

    • CSV with no header
    • CSV with pipe delimiter

    Create a new folder in the input S3 bucket:

    A new folder must be created in the S3 input bucket for each distinct file schema. Each folder can have a mapping.json file corresponding to the dataset type expected. It is recommended that input folders use S3 encryption:

    • From the AWS S3 console, search and select the S3 input bucket created earlier for input files
    • Click the Create folder button
    • Provide a descriptive name for the type of dataset, e.g. sales orders
    • In Server-side encryption, select Enable
    • Use the default key type, Amazon S3 key (SSE-S3)
    • Click Create folder

    Upload the mapping.json and dataset to the folder:

    The appropriate mapping.json file must be uploaded to the folder prior to uploading the dataset.

    • Choose one of the sample dataset and mapping.json pairs from the appendix. Replace the data elements in mapping.json with similar data elements from your security policy
    • From the AWS console, navigate to Amazon S3, search and select the S3 input bucket created earlier for incoming files
    • Navigate to the desired folder
    • Click the Upload button
    • Click Add files
    • Upload the mapping.json file
    • Click the Upload button
    • Repeat the upload steps above for the sample dataset
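    The upload order matters because the mapping must already be in the folder when the dataset arrives. The sketch below enforces that order in a helper that accepts any client exposing upload_file with the same keyword arguments as the boto3 S3 client. The stub client exists only so the example runs without AWS access, and the folder, bucket, and file names are hypothetical. SSE-S3 is requested explicitly to match the folder encryption recommended above.

```python
import os


def upload_dataset(s3_client, bucket, folder, mapping_path, dataset_path):
    """Upload mapping.json first, then the dataset; return the keys used."""
    uploaded = []
    for path in (mapping_path, dataset_path):  # mapping always goes first
        key = f"{folder.strip('/')}/{os.path.basename(path)}"
        s3_client.upload_file(
            Filename=path,
            Bucket=bucket,
            Key=key,
            ExtraArgs={"ServerSideEncryption": "AES256"},  # SSE-S3
        )
        uploaded.append(key)
    return uploaded


class StubS3Client:
    """Stand-in for boto3.client("s3"); records calls instead of uploading."""

    def __init__(self):
        self.calls = []

    def upload_file(self, Filename, Bucket, Key, ExtraArgs=None):
        self.calls.append((Bucket, Key, ExtraArgs))


stub = StubS3Client()
keys = upload_dataset(
    stub, "example-input-bucket", "sales-orders", "mapping.json", "data.csv"
)
print(keys)
```

    With real credentials, pass boto3.client("s3") in place of the stub.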

    Verify output:

    Verify the output file was created:

    • From the AWS console, navigate to Amazon S3, search and select the S3 output or target bucket created earlier for writing processed files
    • Navigate to the corresponding folder
    • There should be a non-zero byte file with protected values
    • Select the file
    • From the menu, select Actions | Query with S3 Select
    • Click Run SQL query
    • Click the Formatted tab of the result set
    • Verify that the data is protected

    Troubleshooting / Logs:

    Logs are written to CloudWatch. They can provide helpful information if the results are not as expected:

    • From the AWS console, navigate to the Lambda service | Functions
    • Select and open the Lambda function created for protecting S3 files
    • At the top of the function’s workspace, click the Monitoring tab
    • Click the View logs in CloudWatch button
    • Click the latest log stream
    • Scroll to the bottom of the stream for the latest log entries
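    The same log entries can be read programmatically. The sketch below assumes the function writes to the default Lambda log group, /aws/lambda/<function name>, and accepts any client exposing describe_log_streams and get_log_events with the same keyword arguments as the boto3 CloudWatch Logs client. The stub client and its log messages are invented so the example runs offline.

```python
def latest_log_messages(logs_client, function_name, limit=20):
    """Return the most recent messages from the newest log stream."""
    group = f"/aws/lambda/{function_name}"  # default Lambda log group name
    streams = logs_client.describe_log_streams(
        logGroupName=group,
        orderBy="LastEventTime",
        descending=True,
        limit=1,
    )["logStreams"]
    if not streams:
        return []
    events = logs_client.get_log_events(
        logGroupName=group,
        logStreamName=streams[0]["logStreamName"],
        limit=limit,
        startFromHead=False,  # read from the end of the stream
    )["events"]
    return [event["message"] for event in events]


class StubLogsClient:
    """Stand-in for boto3.client("logs") with invented data."""

    def describe_log_streams(self, **kwargs):
        self.group = kwargs["logGroupName"]
        return {"logStreams": [{"logStreamName": "example-stream"}]}

    def get_log_events(self, **kwargs):
        return {"events": [{"message": "example start"}, {"message": "example end"}]}


stub_logs = StubLogsClient()
messages = latest_log_messages(stub_logs, "example-s3-protector")
print(messages)
```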

    Troubleshooting

    By default, S3 Protector is set to log minimal information. While troubleshooting error conditions, it is beneficial to raise the S3 Protector log level to either ‘config’ or ‘all’. Use the CloudFormation installation steps to change the ‘MinLogLevel’ parameter.

    S3 Protector Error States

    Error State: 400 Error
      Description: A configuration error has occurred. The standard log should provide a descriptive error message. File processing has not started. Nothing was written to the target bucket.
      Action: Review the log for a descriptive error message. Most likely, some configuration parameters will need to be updated before S3 Protector can be restarted for the failed file. Use the CloudFormation installation steps to update the function configuration.

    Error State: 500 PermissionError
      Description: S3 Protector does not have enough permissions to access AWS resources.
      Action: Review the S3 Protector IAM policy.

    Error State: 500 Exception
      Description: An error has occurred. The log may provide additional details. File processing may have started and a partial file may have been written to the target S3 bucket. S3 Protector never writes unprotected data to partially processed files, and it automatically removes these files on error.
      Action: Review the error log for additional information.

    Error State: Status: timeout
      Description: S3 Protector ran out of time while processing large files.
      Action: Review the S3 Protector Timeout section.

    Error State: AWS Lambda crash
      Description: Any AWS Lambda function may crash due to intermittent failures. If this occurs, a partial file may have been written to the target S3 bucket. Due to the crash, S3 will treat this file as an incomplete multi-part upload. Incomplete uploads do not appear as standard S3 files and are not shown in the AWS S3 console. You are still charged for incomplete uploads.
      Action:
        1. Discover and abort incomplete multi-part uploads for the target bucket (e.g., using AWS CLI)
        2. Restart S3 Protector for the failed file
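    Discovering and aborting incomplete multi-part uploads can also be done from code. The sketch below accepts any client exposing list_multipart_uploads and abort_multipart_upload with the same keyword arguments as the boto3 S3 client, and follows the key and upload ID markers across result pages. The stub client and its upload records are invented so the example runs offline. Note that the helper aborts every incomplete upload it finds under the prefix, including uploads still in progress, so run it only when S3 Protector is idle.

```python
def abort_incomplete_uploads(s3_client, bucket, prefix=None):
    """Abort all incomplete multi-part uploads; return (key, upload_id) pairs."""
    aborted = []
    kwargs = {"Bucket": bucket}
    if prefix:
        kwargs["Prefix"] = prefix
    while True:
        page = s3_client.list_multipart_uploads(**kwargs)
        for upload in page.get("Uploads", []):
            s3_client.abort_multipart_upload(
                Bucket=bucket, Key=upload["Key"], UploadId=upload["UploadId"]
            )
            aborted.append((upload["Key"], upload["UploadId"]))
        if not page.get("IsTruncated"):
            return aborted
        # Continue listing after the last entry of this page.
        kwargs["KeyMarker"] = page["NextKeyMarker"]
        kwargs["UploadIdMarker"] = page["NextUploadIdMarker"]


class StubS3Client:
    """Stand-in for boto3.client("s3") returning two invented pages."""

    def __init__(self):
        self.pages = [
            {
                "Uploads": [{"Key": "data.csv", "UploadId": "u1"}],
                "IsTruncated": True,
                "NextKeyMarker": "data.csv",
                "NextUploadIdMarker": "u1",
            },
            {"Uploads": [{"Key": "other.csv", "UploadId": "u2"}], "IsTruncated": False},
        ]
        self.aborted_ids = []

    def list_multipart_uploads(self, **kwargs):
        return self.pages.pop(0)

    def abort_multipart_upload(self, Bucket, Key, UploadId):
        self.aborted_ids.append(UploadId)


stub_s3 = StubS3Client()
result = abort_incomplete_uploads(stub_s3, "example-output-bucket")
print(result)
```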

    Restarting S3 Protector

    If S3 Protector fails, it can be restarted for an existing source file from the AWS Lambda console, without re-uploading the file. With the input data file and mapping file in place, follow the steps below to trigger the S3 Protector Function.

    Steps

    1. Sign in to the AWS Management Console and go to the Lambda console.

    2. Select the Lambda function recorded as S3 Protector Lambda Name in the CloudFormation installation section.

    3. On the S3 Protector Function page, choose the Test tab.

    4. Copy the JSON test event into the Event JSON pane and replace the bucket name placeholder with your input bucket name:

    {
      "Records": [
        {
          "s3": {
            "bucket": {
              "name": "<PLACEHOLDER_S3_IN_BUCKET_NAME>"
            },
            "object": {
              "key": "data.csv"
            }
          }
        }
      ]
    }
    
    5. Select Test to execute the test event.
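    When several files need to be restarted, invoking the function from code may be more practical than the console. The sketch below sends the same event through any client exposing invoke with the same keyword arguments as the boto3 Lambda client. The stub client, function name, bucket name, and response are invented so the example runs offline.

```python
import io
import json


def restart_for_key(lambda_client, function_name, bucket, key):
    """Invoke S3 Protector synchronously for one existing input object."""
    event = {
        "Records": [
            {"s3": {"bucket": {"name": bucket}, "object": {"key": key}}}
        ]
    }
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the result
        Payload=json.dumps(event).encode("utf-8"),
    )
    payload = json.loads(response["Payload"].read())
    if response.get("FunctionError"):
        raise RuntimeError(f"invocation failed for {key}: {payload}")
    return payload


class StubLambdaClient:
    """Stand-in for boto3.client("lambda") with an invented response."""

    def invoke(self, FunctionName, InvocationType, Payload):
        self.last_event = json.loads(Payload)
        body = {
            "statusCode": 200,
            "body": {"target": "s3://example-output-bucket/data.csv"},
        }
        return {"StatusCode": 200, "Payload": io.BytesIO(json.dumps(body).encode())}


stub_lambda = StubLambdaClient()
outcome = restart_for_key(
    stub_lambda, "example-s3-protector", "example-input-bucket", "data.csv"
)
print(outcome["statusCode"])
```

    A synchronous call waits for the whole file to be processed, so for large files the client read timeout may need to be raised above its default.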

    Last modified : January 12, 2026