What should the solutions architect do to meet this requirement in the MOST cost-efficient way?
Send the image metadata from the application directly to a second ALB for the worker nodes that use an Auto Scaling group of EC2 Spot Instances as the target group.
Process the image metadata by sending it directly to EC2 Reserved Instances in an Auto Scaling group. With a dynamic scaling policy, use an Amazon CloudWatch metric for average CPU utilization of the Auto Scaling group as soon as the front-end application obtains the images.
Write messages to Amazon Simple Queue Service (Amazon SQS) when the front-end application obtains an image. Process the images with EC2 On-Demand Instances in an Auto Scaling group with instance scale-in protection and a fixed number of instances with periodic health checks.
Write messages to Amazon Simple Queue Service (Amazon SQS) when the application obtains an image. Process the images with EC2 Spot Instances in an Auto Scaling group with instance scale-in protection and a dynamic scaling policy using a custom Amazon CloudWatch metric for the current number of messages in the queue.
Explanations:
Sending metadata directly to a second ALB for worker nodes does not ensure that all images are processed, because there is no reliable queuing mechanism to hold work that has not yet completed. Spot Instances can also be interrupted or unavailable, and without a queue any request routed to a lost instance goes unprocessed.
Using EC2 Reserved Instances for processing is not cost-efficient compared to Spot Instances. Moreover, processing directly with Reserved Instances doesn’t guarantee that every image is processed, especially under variable load without a queuing mechanism.
While using Amazon SQS for message queuing is a good practice, processing with EC2 On-Demand Instances does not provide the cost-efficiency of Spot Instances. A fixed number of instances with scale-in protection cannot shrink when the queue is empty, which leads to underutilization and higher costs.
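Writing messages to SQS ensures that all images are queued for processing. Using EC2 Spot Instances is cost-efficient, and dynamic scaling based on the number of messages in the queue lets the worker nodes scale out or in according to demand, maximizing resource utilization while minimizing costs. Scale-in protection keeps the Auto Scaling group from terminating an instance that is still working on an image during scale-in, and if a Spot Instance is interrupted, its unfinished message becomes visible in the queue again after the visibility timeout and is picked up by another worker.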
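The queue-based scaling described above can be sketched as plain arithmetic. This is a minimal illustration, not part of the original question: the function names, the target of 100 messages per worker, and the group limits are all assumptions chosen for the example. In a real deployment, the message count would typically come from the SQS `ApproximateNumberOfMessages` attribute, and the resulting backlog value would be published as a custom CloudWatch metric for the scaling policy to act on.

```python
import math


def backlog_per_instance(visible_messages: int, in_service_instances: int) -> float:
    """Messages waiting in the queue per running worker.

    Zero workers is treated as one so the metric stays defined
    when the group has scaled in completely (an assumption of this sketch).
    """
    return visible_messages / max(in_service_instances, 1)


def desired_capacity(
    visible_messages: int,
    target_backlog_per_instance: int,
    min_size: int,
    max_size: int,
) -> int:
    """Workers needed to keep the backlog at or below the target,
    clamped to the Auto Scaling group's size limits."""
    if target_backlog_per_instance <= 0:
        raise ValueError("target_backlog_per_instance must be positive")
    needed = math.ceil(visible_messages / target_backlog_per_instance)
    return max(min_size, min(max_size, needed))


if __name__ == "__main__":
    # Hypothetical numbers: 1,500 queued images, 10 workers running,
    # and a target of 100 queued images per worker.
    print(backlog_per_instance(1500, 10))
    print(desired_capacity(1500, 100, min_size=1, max_size=20))
```

The target backlog is usually derived from how long one image takes to process and how much delay is acceptable, so it has to be measured for the actual workload rather than copied from this example.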