Troubleshooting
Accessing the PPC CLI
- Permission denied (publickey): Ensure you’re using the correct private key that matches the authorized_keys in the pod
- Connection refused: Verify the load balancer IP and hosts file configuration
- Key format issues: Ensure your private key is in the correct format (OpenSSH format for Linux/macOS, .ppk for PuTTY)
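The first and third checks can be sketched client-side as follows. This is a minimal sketch assuming the OpenSSH client tools are installed; the `check_ppc_key` function name and the example key path are illustrative, not part of the PPC tooling:

```shell
# Sketch: tighten key permissions and print the key's fingerprint so it
# can be compared with the pod's authorized_keys entry.
check_ppc_key() {
    key="$1"
    # OpenSSH ignores private keys that are readable by group/others.
    chmod 600 "$key" || return 1
    # Fails with a parse error if the key is not in OpenSSH format
    # (e.g. a PuTTY .ppk file); prints the fingerprint otherwise.
    ssh-keygen -l -f "$key"
}

# Example (hypothetical key file and placeholders):
# check_ppc_key ~/.ssh/ppc_key
# ssh -vvv -i ~/.ssh/ppc_key <user>@<load-balancer-ip>
```

Running `ssh` with `-vvv` shows which keys the client offers and why authentication is rejected, which usually pinpoints a key mismatch or hosts file problem quickly.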
Failure of init-resiliency script
Issue: When running the init_resiliency.sh script on a fresh RHEL 10.1 system as the root user, some required tools, such as the AWS CLI, kubectl, or Helm, are not detected during setup. The following error appears:
[2026-03-26 06:57:15] No credentials file found at ~/.aws/credentials. Triggering aws configure...
configuring credentials...
/home/ec2-user/bootstrap-scripts/setup-devtools-linux_redhat.sh: line 297: aws: command not found
[2026-03-26 06:57:15] Step failed: Tool installation (redhat) — command exited with non-zero status
[2026-03-26 06:57:15] ERROR: Step failed: Tool installation (redhat)
Cause: On RHEL systems, the default environment configuration for the root user does not include certain standard installation directories, such as /usr/local/bin, in the system PATH. As a result, tools that were installed successfully might not be immediately available to the script during execution.
Resolution: Before running the bootstrap or resiliency scripts as the root user on RHEL, ensure that /usr/local/bin (and the AWS CLI binary path, if applicable) is included in the $PATH. Alternatively, run the script using a non-root user (such as ec2-user) where /usr/local/bin is already part of the default PATH.
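A minimal sketch of the PATH fix, assuming the tools were installed under /usr/local/bin (the AWS CLI v2 default); the alternative path shown is illustrative, so adjust it to wherever your installer actually placed the binaries:

```shell
# Prepend /usr/local/bin so tools installed there are visible to the
# script when it runs as root.
export PATH="/usr/local/bin:$PATH"

# Only needed if the AWS CLI was installed with a custom --bin-dir
# (the path below is a placeholder, not a known default):
# export PATH="/opt/aws-cli/bin:$PATH"

# Verify the tools now resolve before re-running the script:
command -v aws kubectl helm || echo "some tools are still missing from PATH"
```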
Certificate Authority (CA) is not backed up leading to protector disruption
Issue: CA certificates are not backed up during cluster migration, causing SSL certificate errors for protectors trying to communicate with the new cluster.
Description: When the CA that Envoy uses is not migrated to the new cluster, protectors cannot establish secure connections. The connection fails with SSL certificate errors like “unable to get local issuer certificate”. This disrupts protector functionality and requires manual intervention to restore communication.
Workaround:
Workaround 1: Preserve the custom CA before the restore, then replace the default CA in the newly restored cluster with the preserved custom CA.
For more information, refer to Replacing the default Certificate Authority (CA) with a Custom CA in PPC.
This ensures that protectors continue to trust the cluster without any changes.
Workaround 2: Run the GetCertificates command on each protector after restore.
cd /opt/protegrity/rpagent/bin/
./GetCertificates -u <username> -p <password>
This command downloads new CA‑signed certificates, restoring secure communication with the cluster.
Important: This approach is functional but not user‑friendly and should be avoided in production by preserving the custom CA across restores.
make clean command destroys the wrong cluster
Issue: The make clean command affects an unintended cluster if the active Kubernetes context is incorrect
Description: Cleanup operations such as make clean act on the currently active Kubernetes context. If that context points at a different cluster than the one you intend to decommission, the cleanup destroys resources in the wrong cluster. Verifying that the environment is aligned with the intended cluster ensures that cleanup activities affect only the expected resources.
Resolution: Before running the make clean command, take the following precautions:
- Verify that the active kubectl context is set to the cluster you intend to decommission. To check the active kubectl context, run the following command:
kubectl config current-context
- When restoring or managing multiple clusters, use a separate jump box for each cluster to keep the environments isolated.
- When using the same jump box, run restore and cleanup operations from a separate working directory for each cluster.
- Always double‑check the active context and working directory before initiating any cleanup actions.
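The context check above can be wrapped in a small guard before cleanup. This is a sketch; the `confirm_context` function and the example context name are illustrative, not part of the PPC tooling:

```shell
# Refuse to run cleanup unless the active kubectl context matches the
# cluster intended for decommissioning. The function takes both the
# expected and the actual context so the check is explicit at the call site.
confirm_context() {
    expected="$1"
    actual="$2"    # normally: "$(kubectl config current-context)"
    if [ "$actual" = "$expected" ]; then
        echo "Context OK: $actual"
    else
        echo "Refusing cleanup: active context is '$actual', expected '$expected'" >&2
        return 1
    fi
}

# Example (hypothetical context name):
# confirm_context ppc-old "$(kubectl config current-context)" && make clean
```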