Which design would you choose to meet these requirements?
Use AWS Data Pipeline to schedule a DynamoDB cross-region copy once a day; create a “LastUpdated” attribute in your DynamoDB table that represents the timestamp of the last update and use it as a filter.
Use EMR and write a custom script to retrieve data from DynamoDB in the current region using a SCAN operation and push it to DynamoDB in the second region.
Use AWS Data Pipeline to schedule an export of the DynamoDB table to S3 in the current region once a day, then schedule another task immediately after it that will import the data from S3 to DynamoDB in the other region.
Also send each write to an SQS queue in the second region; use an Auto Scaling group behind the SQS queue to replay the writes in the second region.
Explanations:
Using AWS Data Pipeline to schedule a cross-region copy of DynamoDB once a day with a “LastUpdated” attribute allows efficient synchronization of only modified data. This approach satisfies the Recovery Point Objective (RPO) of 24 hours, minimizes changes to the web application, and ensures that only updated records are transferred.
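As a rough illustration of the filtering idea (not the actual Data Pipeline activity definition), the sketch below assumes boto3, hypothetical region and table names, and a “LastUpdated” attribute stored as an epoch timestamp; a daily job copies only items touched in the last 24 hours:

```python
import time

import boto3
from boto3.dynamodb.conditions import Attr

SOURCE_REGION = "us-east-1"   # hypothetical source region
DEST_REGION = "eu-west-1"     # hypothetical second (DR) region
TABLE_NAME = "AppTable"       # hypothetical table name


def copy_recent_items(window_seconds=24 * 60 * 60):
    """Copy only items whose LastUpdated timestamp falls within the last window."""
    src = boto3.resource("dynamodb", region_name=SOURCE_REGION).Table(TABLE_NAME)
    dst = boto3.resource("dynamodb", region_name=DEST_REGION).Table(TABLE_NAME)

    cutoff = int(time.time()) - window_seconds
    scan_kwargs = {
        # Filter on the LastUpdated attribute so only recently modified
        # items are returned and transferred to the second region.
        "FilterExpression": Attr("LastUpdated").gte(cutoff),
    }

    with dst.batch_writer() as writer:
        while True:
            page = src.scan(**scan_kwargs)
            for item in page.get("Items", []):
                writer.put_item(Item=item)  # upsert the changed item in the DR table
            if "LastEvaluatedKey" not in page:
                break
            scan_kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
```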
Using EMR with a custom script and a SCAN operation is not efficient for large DynamoDB tables as SCAN is resource-intensive and can lead to high throughput costs. Additionally, this approach does not efficiently handle incremental changes or provide the RPO of 24 hours.
Exporting the DynamoDB table to S3 and then importing it into another region does not allow incremental synchronization of only the modified data. It also risks missing the RPO requirement: each export is a full snapshot, and depending on the timing of the export and import relative to a failure, the data loss could exceed 24 hours.
Sending each write to an SQS queue in the second region could achieve cross-region replication, but it is complex: it requires changing the web application to publish every write, and it adds operational overhead for the queue and the Auto Scaling consumer fleet behind it. It is a far heavier mechanism than is needed to meet a 24-hour RPO.
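To make that overhead concrete, a consumer behind the queue in the second region might look like this minimal sketch (the queue URL, table name, and message format are assumptions, and the application would also have to be changed to publish every write):

```python
import json
from decimal import Decimal

import boto3

DEST_REGION = "eu-west-1"     # hypothetical second region
TABLE_NAME = "AppTable"       # hypothetical table name
QUEUE_URL = "https://sqs.eu-west-1.amazonaws.com/123456789012/write-replay"  # hypothetical


def replay_writes():
    """Drain the replay queue and apply each queued write to the DR-region table."""
    sqs = boto3.client("sqs", region_name=DEST_REGION)
    table = boto3.resource("dynamodb", region_name=DEST_REGION).Table(TABLE_NAME)

    while True:
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,  # long polling to cut down on empty receives
        )
        for msg in resp.get("Messages", []):
            # Assumed message format: the full DynamoDB item serialized as JSON.
            item = json.loads(msg["Body"], parse_float=Decimal)
            table.put_item(Item=item)  # replay the write in the second region
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```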