Which solution will meet these requirements MOST cost-effectively?

A company ingests and processes streaming market data. The data rate is constant. A nightly process that calculates aggregate statistics takes 4 hours to complete. The statistical analysis is not critical to the business, and data points are processed during the next iteration if a particular run fails. The current architecture uses a pool of Amazon EC2 Reserved Instances with 1-year reservations. These EC2 instances run full time to ingest and store the streaming data in attached Amazon Elastic Block Store (Amazon EBS) volumes. A scheduled script launches EC2 On-Demand Instances each night to perform the nightly processing. The instances access the stored data from NFS shares on the ingestion servers. The script terminates the instances when the processing is complete. The Reserved Instance reservations are expiring. The company needs to determine whether to purchase new reservations or implement a new design.

Which solution will meet these requirements MOST cost-effectively?

Update the ingestion process to use Amazon Kinesis Data Firehose to save data to Amazon S3. Use a scheduled script to launch a fleet of EC2 On-Demand Instances each night to perform the batch processing of the S3 data. Configure the script to terminate the instances when the processing is complete.

Update the ingestion process to use Amazon Kinesis Data Firehose to save data to Amazon S3. Use AWS Batch with Spot Instances to perform nightly processing with a maximum Spot price that is 50% of the On-Demand price.

Update the ingestion process to use a fleet of EC2 Reserved Instances with 3-year reservations behind a Network Load Balancer. Use AWS Batch with Spot Instances to perform nightly processing with a maximum Spot price that is 50% of the On-Demand price.

Update the ingestion process to use Amazon Kinesis Data Firehose to save data to Amazon Redshift. Use Amazon EventBridge to schedule an AWS Lambda function to run nightly to query Amazon Redshift to generate the daily statistics.

Explanations:

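Using Amazon Kinesis Data Firehose to save data to Amazon S3 removes the always-on ingestion fleet and reduces infrastructure management, but this option still runs the nightly processing on On-Demand Instances at full price. Because the job is not critical and any missed data points are processed during the next run, it can tolerate interruptions, so it does not need On-Demand capacity. The option does not take advantage of cheaper capacity such as Spot Instances.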
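To put rough numbers on the difference, the following is a minimal sketch that compares a month of nightly processing at an On-Demand rate with the same fleet at a rate capped at 50% of On-Demand. The fleet size and the hourly rate are hypothetical placeholders, not AWS prices; only the 4-hour run time and the 50% cap come from the question.

```python
from decimal import Decimal


def monthly_compute_cost(instances: int, hours_per_night: int,
                         hourly_rate: Decimal, nights: int = 30) -> Decimal:
    """Cost of running a nightly fleet for `nights` nights at a flat hourly rate."""
    return instances * hours_per_night * hourly_rate * nights


# Hypothetical inputs: 10 instances at an assumed On-Demand rate of $0.40/hour.
FLEET_SIZE = 10
HOURS_PER_NIGHT = 4                      # from the scenario
ON_DEMAND_RATE = Decimal("0.40")         # assumption, not a real price
SPOT_CAP_RATE = ON_DEMAND_RATE * Decimal("0.5")  # maximum Spot price: 50% of On-Demand

on_demand = monthly_compute_cost(FLEET_SIZE, HOURS_PER_NIGHT, ON_DEMAND_RATE)
spot_ceiling = monthly_compute_cost(FLEET_SIZE, HOURS_PER_NIGHT, SPOT_CAP_RATE)

print(f"On-Demand per month:      ${on_demand}")
print(f"Spot per month (at most): ${spot_ceiling}")
```

The Spot figure is an upper bound: the cap limits what the company is willing to pay, and the actual Spot price may be lower.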

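This option uses Kinesis Data Firehose to stream data to S3, which is cost-effective for storage and removes the need for always-on ingestion servers. By using AWS Batch with Spot Instances for nightly processing, the company can significantly reduce costs, because Spot Instances can be much cheaper than On-Demand Instances. Spot capacity can be interrupted, but that fits the non-critical nature of the nightly processing: if a run fails, the data points are processed during the next iteration. Of the four options, this one is the most cost-effective.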
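As an illustration of how the 50% cap could be expressed, here is a sketch of a managed AWS Batch compute environment definition written as a Python dictionary. The field names follow the shape of the AWS Batch `CreateComputeEnvironment` request as I understand it; the environment name, vCPU limit, subnet, security group, and instance role are hypothetical placeholders, and nothing here is sent to AWS.

```python
import json

# Sketch of a Spot-backed compute environment; all identifiers are placeholders.
compute_environment = {
    "computeEnvironmentName": "nightly-stats-spot",  # hypothetical name
    "type": "MANAGED",
    "state": "ENABLED",
    "computeResources": {
        "type": "SPOT",
        "allocationStrategy": "SPOT_CAPACITY_OPTIMIZED",
        "bidPercentage": 50,           # pay at most 50% of the On-Demand price
        "minvCpus": 0,                 # scale to zero outside the nightly run
        "maxvCpus": 64,                # assumed ceiling for the nightly job
        "instanceTypes": ["optimal"],
        "subnets": ["subnet-EXAMPLE"],           # placeholder
        "securityGroupIds": ["sg-EXAMPLE"],      # placeholder
        "instanceRole": "ecsInstanceRole",       # placeholder
    },
}

print(json.dumps(compute_environment, indent=2))
```

Setting `minvCpus` to 0 matters for cost here: the environment holds no instances during the 20 hours a day when no processing runs.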

This option keeps a fleet of EC2 Reserved Instances for ingestion and commits to it for 3 years, so the company continues to pay for and manage always-on servers, now with a Network Load Balancer in front of them, instead of moving ingestion to a managed service such as Kinesis Data Firehose. The nightly processing does use AWS Batch with Spot Instances, but that part is identical to the previous option, so any savings there are the same. The ingestion design is what makes this option likely to cost more overall.

While this option introduces Kinesis Data Firehose and Redshift, it is not the most cost-effective approach because of the potentially higher costs of running Redshift to hold the data and query it once a night. Lambda is also a poor fit for long batch work: a single invocation can run for at most 15 minutes, while the current nightly job takes 4 hours. The option also doesn’t use Spot Instances, which are the main source of savings for interruptible processing tasks like this one.
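The Lambda concern can be checked with simple arithmetic. The sketch below compares the 4-hour run time from the scenario with Lambda's 15-minute maximum timeout per invocation. It assumes the job would take about as long as it does today, which may not hold if the heavy lifting moves into Redshift queries; the function name is hypothetical.

```python
import math

LAMBDA_MAX_TIMEOUT_SECONDS = 15 * 60   # maximum duration of one Lambda invocation
NIGHTLY_JOB_SECONDS = 4 * 60 * 60      # from the scenario: the job takes 4 hours


def fits_in_single_invocation(job_seconds: int,
                              limit_seconds: int = LAMBDA_MAX_TIMEOUT_SECONDS) -> bool:
    """Return True if a job of this length can finish within one invocation."""
    return job_seconds <= limit_seconds


# Number of back-to-back invocations needed if the work were split evenly.
invocations_needed = math.ceil(NIGHTLY_JOB_SECONDS / LAMBDA_MAX_TIMEOUT_SECONDS)

print(fits_in_single_invocation(NIGHTLY_JOB_SECONDS))
print(invocations_needed)
```

Under that assumption the job would have to be split across many invocations or started asynchronously, which adds design work that the AWS Batch option avoids.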

