Which solution will meet these requirements with the LEAST amount of customization to transform and store the ingested data?
Use AWS Lambda to read and aggregate the data hourly. Transform the data and store it in Amazon S3 by using Amazon Kinesis Data Firehose.
Use Amazon Kinesis Data Firehose to read and aggregate the data hourly. Transform the data and store it in Amazon S3 by using a short-lived Amazon EMR cluster.
Use Amazon Kinesis Data Analytics to read and aggregate the data hourly. Transform the data and store it in Amazon S3 by using Amazon Kinesis Data Firehose.
Use Amazon Kinesis Data Firehose to read and aggregate the data hourly. Transform the data and store it in Amazon S3 by using AWS Lambda.
Explanations:
AWS Lambda is a poor fit for hourly aggregation. It is built for short-lived, event-driven invocations (each one is capped at 15 minutes), so you would have to write custom code to read from Kinesis Data Streams, carry aggregation state across invocations, and emit the hourly results. That is more customization than a managed analytics service requires.
Amazon Kinesis Data Firehose is a data delivery service. It buffers and delivers streaming data but has no built-in capability to aggregate records hourly. Using a short-lived Amazon EMR cluster to transform and store the data also means provisioning the cluster and writing and maintaining the job code, which is far more customization than the requirement calls for.
Amazon Kinesis Data Analytics is designed for real-time processing and aggregation of streaming data. It can aggregate the records over hourly windows and pass the output to Kinesis Data Firehose, which delivers it to Amazon S3. Both are managed services, so this option meets the requirements with the least customization.
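The hourly aggregation described here is a tumbling-window computation: each record falls into exactly one fixed, non-overlapping one-hour window. The plain-Python sketch below only illustrates that idea; it is not Kinesis Data Analytics code, and the record fields (`timestamp`, `sensor_id`, `value`) are hypothetical names chosen for the example.

```python
from collections import defaultdict

WINDOW_SECONDS = 3600  # one-hour tumbling window


def window_start(timestamp):
    """Return the epoch second at which the record's hourly window begins."""
    return int(timestamp // WINDOW_SECONDS) * WINDOW_SECONDS


def aggregate_hourly(records):
    """Group records by (window start, sensor_id) and compute count, sum, and average.

    Each record is assumed to be a dict with hypothetical keys:
    'timestamp' (epoch seconds), 'sensor_id', and 'value'.
    """
    totals = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for record in records:
        key = (window_start(record["timestamp"]), record["sensor_id"])
        totals[key]["count"] += 1
        totals[key]["sum"] += record["value"]
    # Derive the average once every record has been assigned to its window.
    return {
        key: {"count": t["count"], "sum": t["sum"], "avg": t["sum"] / t["count"]}
        for key, t in totals.items()
    }
```

A managed analytics service keeps this window state for you, which is the customization the other options would have to implement by hand.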
Kinesis Data Firehose does not aggregate data, so it cannot perform the hourly aggregation that this option assigns to it. A Lambda function attached to Firehose transforms the records in each delivery batch; it is not designed to aggregate across an hourly window.
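To show why a Firehose transformation Lambda is limited to per-batch work, here is a minimal sketch of such a handler. It follows the documented record contract (each input record carries a `recordId` and base64-encoded `data`, and each output record returns the same `recordId` with a `result` status), but the JSON payload format and the added `processed` field are assumptions made for illustration.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform each record in the batch that Firehose passes to the function."""
    output = []
    for record in event["records"]:
        # Firehose delivers each payload base64-encoded; JSON content is assumed here.
        payload = json.loads(base64.b64decode(record["data"]))
        payload["processed"] = True  # hypothetical per-record transformation

        # Re-encode the transformed payload, newline-delimited for storage in S3.
        encoded = base64.b64encode(
            (json.dumps(payload) + "\n").encode("utf-8")
        ).decode("utf-8")
        output.append(
            {"recordId": record["recordId"], "result": "Ok", "data": encoded}
        )
    return {"records": output}
```

The handler sees only the records in the current batch and returns one result per input record, so it has no natural place to accumulate an hour of data.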