Which solution will meet these requirements with the LEAST operational overhead?
Send activity data to an Amazon Kinesis data stream. Configure the stream to deliver the data to an Amazon S3 bucket.
Send activity data to an Amazon Kinesis Data Firehose delivery stream. Configure the stream to deliver the data to an Amazon Redshift cluster.
Place activity data in an Amazon S3 bucket. Configure Amazon S3 to run an AWS Lambda function on the data as the data arrives in the S3 bucket.
Create an ingestion service on Amazon EC2 instances that are spread across multiple Availability Zones. Configure the service to forward data to an Amazon RDS Multi-AZ database.
Explanations:
Kinesis Data Streams is designed for real-time ingestion and processing. However, it cannot deliver data to Amazon S3 on its own and does not offer SQL-based querying for large-scale analytics, so both delivery and analysis would require custom consumer code.
Kinesis Data Firehose is a fully managed delivery service that can load data into Amazon Redshift, which supports SQL-based analytics on large datasets. There are no ingestion servers or consumer applications to run, so this solution has the least operational overhead.
While Amazon S3 and Lambda can work together, Lambda isn’t optimal for ingesting petabyte-scale data and performing on-demand analytics at this scale. It would require more management and scaling considerations.
Creating a custom ingestion service on EC2 instances introduces significant operational overhead, because the team must handle instance management, scaling, and fault tolerance itself. It is the least managed of the four options.
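To make the Kinesis Data Firehose option concrete, here is a minimal producer-side sketch in Python. It assumes a delivery stream that is already configured with Amazon Redshift as its destination, and that activity events are sent as newline-delimited JSON. The helper names (`to_firehose_records`, `chunk`, `send_activity`), the stream name, and the event fields are hypothetical. The `client` argument is expected to behave like a boto3 Firehose client, whose `put_record_batch` call accepts at most 500 records per request.

```python
import json
from typing import Any, Dict, Iterator, List

# PutRecordBatch accepts at most 500 records per call.
MAX_RECORDS_PER_BATCH = 500


def to_firehose_records(events: List[Dict[str, Any]]) -> List[Dict[str, bytes]]:
    """Encode each event as one newline-terminated JSON record (assumed format)."""
    return [{"Data": (json.dumps(e) + "\n").encode("utf-8")} for e in events]


def chunk(records: List[Dict[str, bytes]],
          size: int = MAX_RECORDS_PER_BATCH) -> Iterator[List[Dict[str, bytes]]]:
    """Yield consecutive slices of at most `size` records."""
    for i in range(0, len(records), size):
        yield records[i:i + size]


def send_activity(client: Any, stream_name: str,
                  events: List[Dict[str, Any]]) -> int:
    """Send events in batches; return the total number of failed records.

    `client` is assumed to expose put_record_batch like a boto3 Firehose client.
    """
    failed = 0
    for batch in chunk(to_firehose_records(events)):
        response = client.put_record_batch(
            DeliveryStreamName=stream_name,
            Records=batch,
        )
        # FailedPutCount reports records the service rejected in this call.
        failed += response.get("FailedPutCount", 0)
    return failed
```

In practice the client would come from `boto3.client("firehose")`, and a non-zero return value would signal that some records need to be retried. The point of the sketch is that the producer only has to call one API; buffering, delivery, and loading into Redshift are handled by the managed service.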