Which solution will meet these requirements with the LEAST amount of customization to transform and store the ingested data?

A network security vendor needs to ingest telemetry data from thousands of endpoints that run all over the world. The data is transmitted every 30 seconds in the form of records that contain 50 fields. Each record is up to 1 KB in size. The security vendor uses Amazon Kinesis Data Streams to ingest the data. The vendor requires hourly summaries of the records that Kinesis Data Streams ingests. The vendor will use Amazon Athena to query the records and to generate the summaries. The Athena queries will target 7 to 12 of the available data fields.

Which solution will meet these requirements with the LEAST amount of customization to transform and store the ingested data?

Use AWS Lambda to read and aggregate the data hourly. Transform the data and store it in Amazon S3 by using Amazon Kinesis Data Firehose.

Use Amazon Kinesis Data Firehose to read and aggregate the data hourly. Transform the data and store it in Amazon S3 by using a short-lived Amazon EMR cluster.

Use Amazon Kinesis Data Analytics to read and aggregate the data hourly. Transform the data and store it in Amazon S3 by using Amazon Kinesis Data Firehose.

Use Amazon Kinesis Data Firehose to read and aggregate the data hourly. Transform the data and store it in Amazon S3 by using AWS Lambda.

Explanations:

AWS Lambda is not a good fit for hourly aggregation. It is designed for short, event-driven tasks: each invocation is stateless and can run for at most 15 minutes. Aggregating an hour of records would require custom code to read from Kinesis Data Streams, keep running totals in an external store between invocations, and hand the results to Firehose, which is more customization than the question allows.
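To give a rough sense of the volume an hourly aggregation has to cover, the arithmetic can be sketched as follows. The question only says "thousands of endpoints", so the fleet size of 5,000 is a hypothetical figure chosen for illustration, and 1 KB is taken as 1,024 bytes.

```python
# Back-of-the-envelope volume estimate for one hour of telemetry.
# ENDPOINTS is a hypothetical value; the question only says "thousands".
ENDPOINTS = 5_000
INTERVAL_SECONDS = 30      # one record per endpoint every 30 seconds
MAX_RECORD_BYTES = 1024    # "up to 1 KB", assumed here to mean 1,024 bytes

records_per_endpoint_per_hour = 3600 // INTERVAL_SECONDS
records_per_hour = ENDPOINTS * records_per_endpoint_per_hour
max_bytes_per_hour = records_per_hour * MAX_RECORD_BYTES

print(f"records per endpoint per hour: {records_per_endpoint_per_hour}")
print(f"records per hour (fleet):      {records_per_hour}")
print(f"upper bound on MiB per hour:   {max_bytes_per_hour / 2**20:.1f}")
```

Under these assumptions, each hourly summary covers several hundred thousand records, which is why keeping the running state in hand-written function code is unattractive.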

Amazon Kinesis Data Firehose cannot aggregate data on its own. It is a delivery service that buffers records and writes them to a destination, with no built-in aggregation capability. A short-lived Amazon EMR cluster also overcomplicates the solution: the cluster must be provisioned and the transformation job written and maintained, which adds customization rather than reducing it.

Amazon Kinesis Data Analytics is designed for real-time processing and aggregation of streaming data. It can read directly from Kinesis Data Streams, aggregate the records in hourly windows, and send the output to Kinesis Data Firehose, which delivers it to Amazon S3 for Athena to query. Both services are managed, so this option needs the least custom code and is the correct answer.
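The hourly aggregation described above is a tumbling-window computation: each record falls into exactly one non-overlapping one-hour bucket. The plain-Python sketch below only illustrates that idea; it is not the Kinesis Data Analytics API, and the field names `ts`, `endpoint_id`, and `bytes_sent` are hypothetical stand-ins for the 50 fields in the real records.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hour_bucket(epoch_seconds: float) -> str:
    """Return the start of the UTC hour that contains the timestamp."""
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return dt.replace(minute=0, second=0, microsecond=0).isoformat()


def summarize_hourly(records):
    """Group records into one-hour tumbling windows per endpoint.

    Each record is a dict with the hypothetical keys 'ts' (epoch seconds),
    'endpoint_id', and 'bytes_sent'.
    """
    summary = defaultdict(lambda: {"count": 0, "bytes_sent": 0})
    for rec in records:
        key = (hour_bucket(rec["ts"]), rec["endpoint_id"])
        summary[key]["count"] += 1
        summary[key]["bytes_sent"] += rec["bytes_sent"]
    return dict(summary)


# Two records in the first hour and one in the second hour.
sample = [
    {"ts": 0, "endpoint_id": "ep-1", "bytes_sent": 100},
    {"ts": 30, "endpoint_id": "ep-1", "bytes_sent": 50},
    {"ts": 3600, "endpoint_id": "ep-1", "bytes_sent": 7},
]
result = summarize_hourly(sample)
for (hour, endpoint), stats in sorted(result.items()):
    print(hour, endpoint, stats)
```

In the managed service, the equivalent windowing is declared in the application rather than hand-coded, which is the point of the "least customization" requirement.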

Kinesis Data Firehose does not perform data aggregation, so it cannot produce the hourly summaries. A Lambda function attached to Firehose can transform records as they pass through, but that transformation works on individual records within a delivery batch, not on an hour of data. Hourly aggregation would again require custom code.
