How can you reduce the load on your on-premises database resources in the most cost-effective way?
Use Amazon Elastic MapReduce (EMR) S3DistCp as a synchronization mechanism between the on-premises database and a Hadoop cluster on AWS.
Modify the application to write to an Amazon SQS queue and develop a worker process to flush the queue to the on-premises database.
Modify the application to use DynamoDB to feed an EMR cluster which uses a map function to write to the on-premises database.
Provision an RDS read-replica database on AWS to handle the writes and synchronize the two databases using Data Pipeline.
Explanations:
S3DistCp copies large volumes of data between Amazon S3 and HDFS; it is a bulk data-processing tool, not a database synchronization mechanism. Running an EMR cluster adds cost and complexity and does nothing to reduce the write volume reaching the on-premises database.
Modifying the application to write to an Amazon SQS queue makes the writes asynchronous and decouples them from the database. The queue absorbs bursts, and a worker process flushes it to the on-premises database at a controlled rate, so the database sees a steady load instead of peaks. SQS is a managed, pay-per-request service, which keeps the added cost low.
DynamoDB could absorb the writes, but this design adds two services where one would do. An EMR cluster writing from DynamoDB back to the on-premises database could still overwhelm that database, and running the cluster makes this neither the most cost-effective nor the simplest way to reduce load.
An RDS read replica is read-only, so it can offload read traffic but cannot handle writes. Synchronizing RDS with the on-premises database through Data Pipeline also adds complexity and cost without addressing the main problem of write volume.
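
The SQS approach from the second option can be sketched roughly as follows. This is a minimal illustration, not a production design: the function names `enqueue_write` and `drain_queue`, the `write_to_db` callback, and the `max_batches` and `pause_seconds` parameters are all hypothetical. The `sqs` argument is assumed to be a boto3 SQS client (or any object exposing the same `send_message`, `receive_message`, and `delete_message` calls).

```python
import json
import time
from typing import Any, Callable


def enqueue_write(sqs: Any, queue_url: str, record: dict) -> None:
    """Application side: publish the write to the queue instead of the database."""
    sqs.send_message(QueueUrl=queue_url, MessageBody=json.dumps(record))


def drain_queue(
    sqs: Any,
    queue_url: str,
    write_to_db: Callable[[dict], None],
    max_batches: int,
    pause_seconds: float = 0.0,
) -> int:
    """Worker side: flush queued writes to the database at a controlled rate.

    Polls at most `max_batches` times and returns the number of records written.
    `write_to_db` is a hypothetical callback that performs the actual insert.
    """
    written = 0
    for _ in range(max_batches):
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,  # SQS returns at most 10 messages per call
            WaitTimeSeconds=20,      # long polling reduces empty responses
        )
        for message in response.get("Messages", []):
            write_to_db(json.loads(message["Body"]))
            # Delete only after the write succeeded, so a failed write is
            # redelivered once the visibility timeout expires.
            sqs.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            written += 1
        # Throttle: the pause between batches caps the write rate on the database.
        time.sleep(pause_seconds)
    return written
```

With boto3 installed, the client would typically be created with `boto3.client("sqs")` and passed in as `sqs`. Because a standard SQS queue delivers messages at least once, `write_to_db` should be idempotent (for example, an upsert keyed on a record ID) so that a redelivered message does not create a duplicate row.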