Which solution will meet these requirements?

An online retail company has more than 50 million active customers and receives more than 25,000 orders each day. The company collects purchase data for customers and stores this data in Amazon S3. Additional customer data is stored in Amazon RDS. The company wants to make all the data available to various teams so that the teams can perform analytics. The solution must provide the ability to manage fine-grained permissions for the data and must minimize operational overhead.

Which solution will meet these requirements?

Migrate the purchase data to write directly to Amazon RDS. Use RDS access controls to limit access.

Schedule an AWS Lambda function to periodically copy data from Amazon RDS to Amazon S3. Create an AWS Glue crawler. Use Amazon Athena to query the data. Use S3 policies to limit access.

Create a data lake by using AWS Lake Formation. Create an AWS Glue JDBC connection to Amazon RDS. Register the S3 bucket in Lake Formation. Use Lake Formation access controls to limit access.

Create an Amazon Redshift cluster. Schedule an AWS Lambda function to periodically copy data from Amazon S3 and Amazon RDS to Amazon Redshift. Use Amazon Redshift access controls to limit access.

Explanations:

Moving the purchase data into Amazon RDS means re-architecting the ingestion path and running the teams' analytics against a transactional database, which raises operational overhead rather than reducing it. RDS access controls are managed per database through database users and grants, so they do not give the company one central place to manage fine-grained permissions for all teams.

Copying RDS data to Amazon S3 with a scheduled Lambda function and querying it with Athena would work, but the copy job is custom code that the company must schedule, monitor, and maintain, which adds operational overhead. S3 policies control access at the bucket, prefix, or object level, so they cannot restrict access to individual columns or rows within the data.

AWS Lake Formation builds a data lake over the data already in Amazon S3 and, through an AWS Glue JDBC connection, can also bring in the customer data from Amazon RDS. Lake Formation access controls provide fine-grained permissions (down to the table, column, and row level) that are managed centrally for all teams, which keeps operational overhead low. This is the solution that meets the requirements.

Creating an Amazon Redshift cluster adds a cluster to provision and pay for, plus a scheduled Lambda copy job for both Amazon S3 and Amazon RDS that must be maintained. While Redshift can handle large datasets and has its own access controls, the extra data movement and cluster management conflict with the requirement to minimize operational overhead.
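To make the Lake Formation option more concrete, here is a minimal Python sketch of what registering the S3 bucket and granting column-level access might look like with boto3. The bucket ARN, account ID, role name, database, table, and column names are all hypothetical placeholders, not values from the question. The helper functions only build the request arguments; the `apply_requests` function that actually calls AWS is not run here and assumes boto3 is installed and credentials are configured.

```python
from typing import Any, Dict, List


def build_register_request(bucket_arn: str) -> Dict[str, Any]:
    """Build arguments for lakeformation.register_resource (registers an S3 location)."""
    return {"ResourceArn": bucket_arn, "UseServiceLinkedRole": True}


def build_column_grant(principal_arn: str, database: str, table: str,
                       columns: List[str]) -> Dict[str, Any]:
    """Build arguments for lakeformation.grant_permissions limited to named columns."""
    if not columns:
        raise ValueError("a column-level grant needs at least one column")
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


def apply_requests(register_request: Dict[str, Any],
                   grant_request: Dict[str, Any]) -> None:
    """Send both requests to Lake Formation (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builders above work without boto3

    client = boto3.client("lakeformation")
    client.register_resource(**register_request)
    client.grant_permissions(**grant_request)


# Hypothetical values for illustration only.
register_request = build_register_request("arn:aws:s3:::example-purchase-data")
grant_request = build_column_grant(
    principal_arn="arn:aws:iam::111122223333:role/MarketingAnalysts",
    database="retail_lake",
    table="purchases",
    columns=["order_id", "order_date", "total_amount"],
)
```

In this sketch the hypothetical analyst role can run SELECT on three columns of the purchases table and nothing else, which is the kind of restriction a bucket or prefix policy in S3 cannot express.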
