Performance
Performance Considerations
The following factors may affect S3 Protector performance:
Number of protected columns in a file: The number of protected columns determines the number of parallel requests to Cloud API. For optimal performance, Cloud API should be configured to allow at least this many parallel requests. The more protected columns a file has, the longer it takes to process. To learn how to monitor and configure Cloud API for best performance, review the Performance section in the Cloud API on AWS documentation.
Maximum batch size: The maximum number of rows to process in a single Cloud API invocation. This value applies to all columns and controls the size of a single request to Cloud API. A higher batch size yields higher Cloud API throughput, but the request size is limited to 6 MB by the AWS Lambda infrastructure. Review AWS Lambda quotas and limitations for the latest information. Update the maximum batch size using the CloudFormation template.
Function timeout: The default S3 Protector execution time is 5 minutes. It can be increased up to 15 minutes, the maximum imposed by the AWS Lambda infrastructure. The execution time limits the maximum file size that S3 Protector can process. Review the Timeout section for more information.
Cloud API performance: S3 Protector uses Cloud API to apply protect operations to the data in a file, so Cloud API performance directly affects S3 Protector performance. Review the Performance section in the Cloud API on AWS documentation.
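The relationship between row count, protected columns, and maximum batch size can be sketched with a small calculation. This is only an illustration: it assumes one Cloud API request per protected column per batch, and the function names, the average value size, and the fixed per-request overhead are hypothetical and not part of S3 Protector. The 6 MB limit is treated here as 6 * 1024 * 1024 bytes.

```python
import math

# AWS Lambda request size limit mentioned above, taken here as 6 MiB (assumption).
LAMBDA_PAYLOAD_LIMIT_BYTES = 6 * 1024 * 1024


def estimate_requests(rows: int, protected_columns: int, max_batch_size: int) -> dict:
    """Estimate Cloud API request counts for one file.

    Assumes one request per protected column per batch (hypothetical model).
    """
    batches_per_column = math.ceil(rows / max_batch_size)
    return {
        "parallel_requests": protected_columns,
        "batches_per_column": batches_per_column,
        "total_requests": batches_per_column * protected_columns,
    }


def fits_payload_limit(max_batch_size: int, avg_value_bytes: int,
                       overhead_bytes: int = 1024) -> bool:
    """Rough check that a single batch stays under the request size limit.

    avg_value_bytes and overhead_bytes are illustrative guesses, not measured values.
    """
    request_bytes = max_batch_size * avg_value_bytes + overhead_bytes
    return request_bytes <= LAMBDA_PAYLOAD_LIMIT_BYTES


if __name__ == "__main__":
    print(estimate_requests(rows=1_000_000, protected_columns=6, max_batch_size=50_000))
    print(fits_payload_limit(max_batch_size=50_000, avg_value_bytes=100))
```

A check like this may help when choosing a maximum batch size: larger batches reduce the number of requests, but values that are long on average can push a single request past the limit.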
Benchmarks
The following table shows throughput and latency for three different files. Each file has 21 columns, 6 of which were protected by S3 Protector with tokenization data elements. The remaining 15 columns were configured to pass through without applying protection.
Two of the default S3 Protector settings were updated for this benchmark:
- The default function timeout was increased to its maximum of 15 minutes
- ‘MaxBatchSize’ was increased from the default ‘25000’ to ‘50000’ (via the CloudFormation template)
Performance Results
| Rows x Columns | Protected Columns | Number of Protect Operations | File Size (MB) | Total Duration (s) | Throughput (MB/s) | Throughput (Operations/s) |
|---|---|---|---|---|---|---|
| 100k x 21 | 6 | 600,000 | 22 | 5 | 4.34 | 118k |
| 1M x 21 | 6 | 6,000,000 | 220 | 50 | 4.36 | 119k |
| 10M x 21 | 6 | 60,000,000 | 2,200 | 510 | 4.31 | 118k |
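Because the function timeout limits the file size that can be processed, the benchmark figures can be used for a rough upper-bound estimate. The sketch below assumes throughput stays roughly constant as file size grows and that the workload resembles the benchmark (6 tokenized columns, ‘MaxBatchSize’ of ‘50000’). The helper names and the safety margin are hypothetical. Throughput computed from the rounded sizes and durations in the table differs slightly from the reported MB/s values.

```python
# (file size in MB, total duration in s), copied from the benchmark table.
BENCHMARKS = [(22, 5), (220, 50), (2200, 510)]


def observed_throughput_mb_per_s(size_mb: float, duration_s: float) -> float:
    """Throughput of one benchmark run in MB/s."""
    return size_mb / duration_s


def estimated_max_file_size_mb(throughput_mb_per_s: float, timeout_s: int,
                               safety_margin: float = 0.8) -> float:
    """Largest file expected to finish within the timeout.

    safety_margin is an illustrative allowance for startup time and
    throughput variation, not a documented value.
    """
    return throughput_mb_per_s * timeout_s * safety_margin


# Use the slowest observed run as a conservative basis for the estimate.
slowest = min(observed_throughput_mb_per_s(size, dur) for size, dur in BENCHMARKS)

for timeout_min in (5, 15):
    limit_mb = estimated_max_file_size_mb(slowest, timeout_min * 60)
    print(f"{timeout_min} min timeout: roughly {limit_mb:.0f} MB")
```

Actual limits depend on Cloud API performance, the number of protected columns, and the data itself, so an estimate like this is best confirmed with a test run on representative files.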