A Data Engineer needs to build a model using a dataset containing customer credit card information. How can the Data Engineer ensure the data remains encrypted and the credit card information is secure?
Use a custom encryption algorithm to encrypt the data and store the data on an Amazon SageMaker instance in a VPC. Use the SageMaker DeepAR algorithm to randomize the credit card numbers.
Use an IAM policy to encrypt the data on the Amazon S3 bucket, and use Amazon Kinesis to automatically discard credit card numbers and insert fake credit card numbers.
Use an Amazon SageMaker launch configuration to encrypt the data once it is copied to the SageMaker instance in a VPC. Use the SageMaker principal component analysis (PCA) algorithm to reduce the length of the credit card numbers.
Use AWS KMS to encrypt the data on Amazon S3 and Amazon SageMaker, and redact the credit card numbers from the customer data with AWS Glue.
Explanations:
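Using a custom encryption algorithm is not recommended because unvetted cryptography is prone to security vulnerabilities. Additionally, DeepAR is a time-series forecasting algorithm, not a tool for randomizing credit card numbers, and randomizing the numbers would not by itself secure the data; it could also make the data unusable for legitimate purposes.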
An IAM policy manages access permissions; it does not encrypt data. Amazon Kinesis is a streaming service and does not automatically discard credit card numbers, and inserting fake numbers in their place would compromise data integrity.
Launch configurations are not a mechanism for encrypting data in Amazon SageMaker. PCA is a dimensionality-reduction algorithm; using it to shorten credit card numbers does not address security and may also make the data invalid.
Using AWS KMS (Key Management Service) provides managed keys for strong encryption of data at rest on both Amazon S3 and Amazon SageMaker. Redacting the credit card numbers with AWS Glue removes the sensitive values before the data is used for modeling, which makes this the correct option.
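The two steps in the correct option can be illustrated with a minimal Python sketch. It is not an AWS Glue job: the regular expression is a simplified stand-in for the kind of redaction a Glue ETL script might apply, and the helper names, bucket, object key, and KMS key ARN are hypothetical placeholders. The only AWS-specific details assumed are the `ServerSideEncryption` and `SSEKMSKeyId` parameters of the S3 `put_object` call in boto3.

```python
import re

# Simplified pattern: 13 to 19 digits, optionally separated by single
# spaces or hyphens. A real job would likely need stricter detection.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def redact_card_numbers(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace anything that looks like a card number with a placeholder."""
    return CARD_PATTERN.sub(placeholder, text)


def sse_kms_put_args(bucket: str, key: str, body: bytes, kms_key_id: str) -> dict:
    """Build keyword arguments for an S3 put_object call that requests
    server-side encryption with a KMS key (SSE-KMS)."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": kms_key_id,
    }


# Hypothetical sample record: name, card number, date, amount.
record = "Jane Doe,4111 1111 1111 1111,2024-01-15,59.99"
clean = redact_card_numbers(record)

# Hypothetical bucket, object key, and KMS key ARN.
put_args = sse_kms_put_args(
    bucket="example-training-data",
    key="customers/redacted.csv",
    body=clean.encode("utf-8"),
    kms_key_id="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
)

# With boto3 installed and credentials configured, the upload would be:
#   import boto3
#   boto3.client("s3").put_object(**put_args)
print(clean)
```

Redacting before the upload means the card numbers never reach the encrypted bucket at all, so encryption protects the remaining customer data while the most sensitive field is simply absent.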