Which solution will meet these requirements MOST cost-effectively?
Update the ingestion process to use Amazon Kinesis Data Firehose to save data to Amazon S3. Use a scheduled script to launch a fleet of EC2 On-Demand Instances each night to perform the batch processing of the S3 data. Configure the script to terminate the instances when the processing is complete.
Update the ingestion process to use Amazon Kinesis Data Firehose to save data to Amazon S3. Use AWS Batch with Spot Instances to perform nightly processing with a maximum Spot price that is 50% of the On-Demand price.
Update the ingestion process to use a fleet of EC2 Reserved Instances with 3-year reservations behind a Network Load Balancer. Use AWS Batch with Spot Instances to perform nightly processing with a maximum Spot price that is 50% of the On-Demand price.
Update the ingestion process to use Amazon Kinesis Data Firehose to save data to Amazon Redshift. Use Amazon EventBridge to schedule an AWS Lambda function to run nightly to query Amazon Redshift to generate the daily statistics.
Explanations:
Using Amazon Kinesis Data Firehose to save data to Amazon S3 reduces infrastructure management, but this option still relies on On-Demand Instances for nightly processing, which can be costlier than other options, especially as the data volume grows. It does not take advantage of cost-saving features such as Spot Instances.
This option uses Kinesis Data Firehose to stream data to S3, which is cost-effective for storage. By utilizing AWS Batch with Spot Instances for nightly processing, the company can significantly reduce costs, as Spot Instances can be much cheaper than On-Demand Instances. The setup is efficient and aligns well with the non-critical nature of the nightly processing.
This option uses Reserved Instances with 3-year reservations for ingestion, which may not be cost-effective given the need to re-evaluate after current reservations expire. It matches the previous option in using Spot Instances for batch processing, but the long-term reservation does not provide significant savings compared to that option's combination of pay-as-you-go Kinesis Data Firehose ingestion and Spot Instances.
This option uses Kinesis Data Firehose to load data into Amazon Redshift, but it is not the most cost-effective approach because of the potentially higher costs of using Redshift and Lambda for querying, especially for batch processing tasks. It also doesn’t use Spot Instances, which can substantially reduce the cost of interruptible processing tasks.
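To make the AWS Batch with Spot Instances option concrete, the following is a minimal sketch of the request that could be passed to boto3's `create_compute_environment` call for AWS Batch. The `bidPercentage` value of 50 corresponds to the "maximum Spot price that is 50% of the On-Demand price" in the option text. The function name, the environment name, the vCPU limit, and the allocation strategy are assumptions chosen for illustration and do not come from the question; the sketch only builds the request and does not call AWS.

```python
def spot_compute_environment_request(subnet_ids, security_group_ids,
                                     instance_role_arn, max_vcpus=64):
    """Build keyword arguments for boto3's batch.create_compute_environment.

    Nothing is sent to AWS here; the caller decides whether to submit it.
    All default values below are illustrative assumptions.
    """
    return {
        "computeEnvironmentName": "nightly-stats-spot",  # hypothetical name
        "type": "MANAGED",    # AWS Batch launches and terminates the instances
        "state": "ENABLED",
        "computeResources": {
            "type": "SPOT",   # use Spot capacity instead of On-Demand
            # Assumed strategy; prefers Spot pools less likely to be interrupted.
            "allocationStrategy": "SPOT_CAPACITY_OPTIMIZED",
            # Launch only while the Spot price is below 50% of On-Demand.
            "bidPercentage": 50,
            "minvCpus": 0,    # scale to zero when no jobs are queued
            "maxvCpus": max_vcpus,
            "instanceTypes": ["optimal"],
            "subnets": list(subnet_ids),
            "securityGroupIds": list(security_group_ids),
            "instanceRole": instance_role_arn,
        },
    }


# Usage (placeholder identifiers; requires AWS credentials to actually submit):
#   import boto3
#   request = spot_compute_environment_request(
#       ["subnet-EXAMPLE"], ["sg-EXAMPLE"], "ecsInstanceRole")
#   boto3.client("batch").create_compute_environment(**request)
```

Setting `minvCpus` to 0 lets the environment scale down to no instances between nightly runs, so compute is paid for only while jobs are running.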