Which solution meets these requirements with the MOST operational efficiency?

A DevOps engineer manages a large commercial website that runs on Amazon EC2. The website uses Amazon Kinesis Data Streams to collect and process web logs. The DevOps engineer manages the Kinesis consumer application, which also runs on Amazon EC2. Sudden increases of data cause the Kinesis consumer application to fall behind, and the Kinesis data streams drop records before the records can be processed. The DevOps engineer must implement a solution to improve stream handling.

Which solution meets these requirements with the MOST operational efficiency?

Modify the Kinesis consumer application to store the logs durably in Amazon S3. Use Amazon EMR to process the data directly on Amazon S3 to derive customer insights. Store the results in Amazon S3.

Horizontally scale the Kinesis consumer application by adding more EC2 instances based on the Amazon CloudWatch GetRecords.IteratorAgeMilliseconds metric. Increase the retention period of the Kinesis data streams.

Convert the Kinesis consumer application to run as an AWS Lambda function. Configure the Kinesis data streams as the event source for the Lambda function to process the data streams.

Increase the number of shards in the Kinesis data streams to increase the overall throughput so that the consumer application processes the data faster.

Explanations:

While storing logs in Amazon S3 and using EMR for processing can handle large data, it offloads data processing from Kinesis and introduces additional storage and compute layers. This increases complexity and doesn’t address real-time processing needs.

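Scaling the Kinesis consumer application with more EC2 instances based on the GetRecords.IteratorAgeMilliseconds metric and increasing the Kinesis retention period directly addresses the processing delay: the added consumers work through the backlog, and the longer retention period keeps records available until they are processed. This lets the system absorb data spikes while remaining operationally efficient.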
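The two changes in this option can be sketched with boto3. This is a minimal illustration under stated assumptions, not a definitive setup: the stream name, the scaling policy ARN, the 60-second lag threshold, and the 72-hour retention value are all hypothetical, and the alarm is assumed to trigger an existing EC2 Auto Scaling scale-out policy for the consumer fleet.

```python
"""Sketch: alarm on consumer lag and extend stream retention (all names hypothetical)."""

KINESIS_DEFAULT_RETENTION_HOURS = 24    # retention of a newly created stream
KINESIS_MAX_RETENTION_HOURS = 8760      # 365 days, the service maximum


def iterator_age_alarm_params(stream_name, scale_out_policy_arn, threshold_ms=60_000):
    """Build keyword arguments for CloudWatch put_metric_alarm.

    The alarm fires when the oldest unread record is older than threshold_ms,
    which means the consumers are falling behind the producers.
    """
    return {
        "AlarmName": f"{stream_name}-iterator-age-high",
        "Namespace": "AWS/Kinesis",
        "MetricName": "GetRecords.IteratorAgeMilliseconds",
        "Dimensions": [{"Name": "StreamName", "Value": stream_name}],
        "Statistic": "Maximum",
        "Period": 60,                 # seconds per datapoint
        "EvaluationPeriods": 2,       # two consecutive breaching datapoints
        "Threshold": float(threshold_ms),
        "ComparisonOperator": "GreaterThanThreshold",
        # Assumed: ARN of an existing Auto Scaling scale-out policy.
        "AlarmActions": [scale_out_policy_arn],
    }


def retention_params(stream_name, hours):
    """Build keyword arguments for Kinesis increase_stream_retention_period."""
    if not KINESIS_DEFAULT_RETENTION_HOURS < hours <= KINESIS_MAX_RETENTION_HOURS:
        raise ValueError("retention must be above 24 and at most 8760 hours")
    return {"StreamName": stream_name, "RetentionPeriodHours": hours}


def apply(stream_name, scale_out_policy_arn, retention_hours=72):
    """Send both requests to AWS (requires boto3 and credentials)."""
    import boto3  # imported lazily so the builders above run without AWS access

    boto3.client("cloudwatch").put_metric_alarm(
        **iterator_age_alarm_params(stream_name, scale_out_policy_arn)
    )
    boto3.client("kinesis").increase_stream_retention_period(
        **retention_params(stream_name, retention_hours)
    )
```

Keeping the request builders separate from `apply` makes the parameters easy to inspect before anything is sent to AWS.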

Converting the application to Lambda could simplify scaling, but Lambda has limits on batch sizes and processing times, which may not handle heavy, continuous data streams as efficiently as horizontally scaling EC2 consumers.
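For comparison, the batch-size limit mentioned above appears as a parameter when a Kinesis stream is attached to a Lambda function through an event source mapping. The following is a rough sketch only: the stream ARN, function name, and the chosen batch size and parallelization factor are hypothetical values, not recommendations.

```python
"""Sketch: parameters for a Kinesis-to-Lambda event source mapping (names hypothetical)."""

MAX_KINESIS_BATCH_SIZE = 10_000      # upper bound on records per invocation
MAX_PARALLELIZATION_FACTOR = 10      # concurrent batches per shard


def kinesis_mapping_params(stream_arn, function_name,
                           batch_size=500, parallelization_factor=4):
    """Build keyword arguments for Lambda create_event_source_mapping."""
    if not 1 <= batch_size <= MAX_KINESIS_BATCH_SIZE:
        raise ValueError("batch_size must be between 1 and 10000")
    if not 1 <= parallelization_factor <= MAX_PARALLELIZATION_FACTOR:
        raise ValueError("parallelization_factor must be between 1 and 10")
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": "LATEST",   # start from new records only
        "BatchSize": batch_size,
        "ParallelizationFactor": parallelization_factor,
    }


def create_mapping(stream_arn, function_name):
    """Create the mapping in AWS (requires boto3 and credentials)."""
    import boto3  # imported lazily so the builder above runs without AWS access

    return boto3.client("lambda").create_event_source_mapping(
        **kinesis_mapping_params(stream_arn, function_name)
    )
```

Each invocation is also bounded by the function timeout, which Lambda caps at 15 minutes, so a batch must be fully processed within that window.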

Increasing shard count can improve stream throughput but does not resolve the processing bottleneck if the consumer application is falling behind. It’s only a partial solution and may still lead to dropped records.
