What would you recommend?
(Choose two.)
Have the mobile app access Amazon DynamoDB directly instead of JSON files stored on Amazon S3.
Write data directly into an Amazon Redshift cluster replacing both Amazon DynamoDB and Amazon S3.
Introduce an Amazon SQS queue to buffer writes to the Amazon DynamoDB table and reduce provisioned write throughput.
Introduce Amazon ElastiCache to cache reads from the Amazon DynamoDB table and reduce provisioned read throughput.
Create a new Amazon DynamoDB table each day and drop the one for the previous day after its data is on Amazon S3.
Explanations:
Allowing mobile apps to access DynamoDB directly is generally a bad practice from a security perspective. Embedding AWS credentials in the mobile app to make this possible would be a significant security risk. It also adds complexity to mobile app development.
Redshift is designed for analytical queries on large datasets, not for high-volume, low-latency writes like those coming from the mobile app. Writing each data point directly to Redshift would be very inefficient and expensive. Redshift is also not designed for the type of key-value lookups required for user authentication.
Introducing an SQS queue to buffer writes to DynamoDB is a good optimization. The write load is concentrated in the night hours. SQS can decouple the mobile app from DynamoDB, allowing you to smooth out the write traffic and reduce the required provisioned write capacity on the DynamoDB table. This lowers costs by reducing the need to over-provision for peak write throughput.
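The smoothing effect described above can be illustrated with a small simulation. This is only a sketch in plain Python, not an SQS or DynamoDB client: the function name `drain`, the hourly figures, and the fixed consumer rate are all hypothetical values chosen for illustration.

```python
def drain(arrivals, capacity):
    """Simulate a queue drained at a fixed hourly write capacity.

    arrivals: number of writes arriving in each hour.
    capacity: maximum writes the consumer sends to the table per hour.
    Returns (writes_per_hour, max_backlog).
    """
    backlog = 0
    max_backlog = 0
    writes = []
    for arrived in arrivals:
        backlog += arrived
        sent = min(backlog, capacity)
        backlog -= sent
        writes.append(sent)
        max_backlog = max(max_backlog, backlog)
    # Keep draining after the last arrival until the queue is empty.
    while backlog > 0:
        sent = min(backlog, capacity)
        backlog -= sent
        writes.append(sent)
    return writes, max_backlog


# Hypothetical load: 8 night hours at 9000 writes/hour, then 16 quiet hours.
arrivals = [9000] * 8 + [0] * 16
writes, max_backlog = drain(arrivals, capacity=3000)
print(max(arrivals), max(writes), max_backlog)
```

With these assumed numbers, the table only ever sees the consumer's fixed rate rather than the night-time peak, at the price of a backlog that takes the rest of the day to clear. In a real deployment the backlog would also have to drain within the queue's message retention period.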
Caching reads from DynamoDB with ElastiCache is not relevant in this scenario. The daily aggregation process involves a scan of the entire DynamoDB table, not individual reads. Caching wouldn’t improve the performance or cost of the scan operation. The scan is also done once per day, so there would be very little benefit from caching.
Creating a new DynamoDB table each day and dropping the previous day’s table after the data is copied to S3 is a good cost optimization strategy. This allows you to avoid paying for storage of old data in DynamoDB. DynamoDB is more expensive for long-term storage than S3. Since the aggregated data is already stored in S3, keeping the raw data in DynamoDB is redundant and costly.
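One way to picture the daily rotation is a date-based naming scheme plus a rule that the previous day's table is dropped only once its data has been exported. The sketch below is plain Python with no AWS calls; the `readings_` prefix, the function names, and the `previous_exported` flag are assumptions made for illustration, not part of the original scenario.

```python
from datetime import date, timedelta


def table_name(day, prefix="readings_"):
    """Build a per-day table name such as readings_20240301 (hypothetical scheme)."""
    return f"{prefix}{day:%Y%m%d}"


def rotation_plan(today, previous_exported, prefix="readings_"):
    """Decide which table to create and which, if any, to drop.

    The previous day's table is dropped only when its data is already
    on S3, signalled here by the hypothetical previous_exported flag.
    """
    plan = {"create": table_name(today, prefix), "drop": None}
    if previous_exported:
        plan["drop"] = table_name(today - timedelta(days=1), prefix)
    return plan


print(rotation_plan(date(2024, 3, 1), previous_exported=True))
print(rotation_plan(date(2024, 3, 1), previous_exported=False))
```

Gating the drop on a successful export keeps the raw data available if the daily aggregation job fails or runs late.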