Which solution meets the company’s requirements?
A. Develop a Python script to capture the data from Amazon EC2 in real time and store the data in Amazon S3. Use a copy command to copy data from Amazon S3 to Amazon Redshift. Connect a business intelligence tool running on Amazon EC2 to Amazon Redshift and create the visualizations.
B. Use an Amazon Kinesis agent running on an EC2 instance in an Auto Scaling group to collect and send the data to an Amazon Kinesis Data Firehose delivery stream. The Kinesis Data Firehose delivery stream will deliver the data directly to Amazon ES. Use Kibana to visualize the data.
C. Use an in-memory caching application running on an Amazon EBS-optimized EC2 instance to capture the log data in near real time. Install an Amazon ES cluster on the same EC2 instance to store the log files as they are delivered to Amazon EC2 in near real time. Install a Kibana plugin to create the visualizations.
D. Use an Amazon Kinesis agent running on an EC2 instance to collect and send the data to an Amazon Kinesis Data Firehose delivery stream. The Kinesis Data Firehose delivery stream will deliver the data to Amazon S3. Use an AWS Lambda function to deliver the data from Amazon S3 to Amazon ES. Use Kibana to visualize the data.
Explanations:
A. This option relies on a custom Python script and a batch copy from Amazon S3 to Amazon Redshift. Batch loading does not support near-real-time analysis, and the custom script plus a self-managed business intelligence tool on Amazon EC2 add administrative overhead that a managed streaming solution avoids.
B. This option uses Amazon Kinesis Data Firehose to stream data in near real time directly to Amazon Elasticsearch Service (Amazon ES), where Kibana visualizes it. Firehose and Amazon ES are managed services and the agents run in an Auto Scaling group, so the solution scales and keeps administrative overhead low. It aligns well with the company’s requirements.
C. This option runs an in-memory caching application and a self-installed ES cluster on the same EC2 instance, which invites resource contention, limits scalability, and creates a single point of failure. It provides no managed streaming path for near-real-time analysis and leaves the company to install, patch, and operate every component.
D. This option also uses Kinesis Data Firehose, but it delivers the data to Amazon S3 first and relies on an AWS Lambda function to move it into Amazon ES. The extra hop adds latency and another component to build and maintain, so it supports near-real-time analysis less effectively than option B, which delivers directly to Amazon ES.
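To make the direct delivery path of the second option more concrete, the following is a minimal sketch of how a Firehose delivery stream with an Amazon ES destination might be configured through boto3. The stream name, ARNs, index name, and buffering values are all hypothetical placeholders, not values from the question, and the helper function names are illustrative only. The sketch assumes the Kinesis agent writes to the stream directly (`DirectPut`) and that an S3 bucket exists for backing up documents that fail delivery.

```python
def build_firehose_es_config(stream_name, domain_arn, role_arn, backup_bucket_arn,
                             index_name="app-logs", buffer_seconds=60, buffer_mb=1):
    """Build keyword arguments for the Firehose create_delivery_stream call.

    All argument values are supplied by the caller; the defaults here are
    illustrative assumptions, not recommendations.
    """
    return {
        "DeliveryStreamName": stream_name,
        # DirectPut: producers such as the Kinesis agent write straight to Firehose.
        "DeliveryStreamType": "DirectPut",
        "ElasticsearchDestinationConfiguration": {
            "RoleARN": role_arn,
            "DomainARN": domain_arn,
            "IndexName": index_name,
            "IndexRotationPeriod": "OneDay",
            # Smaller buffers reduce delivery latency at the cost of more requests.
            "BufferingHints": {
                "IntervalInSeconds": buffer_seconds,
                "SizeInMBs": buffer_mb,
            },
            "RetryOptions": {"DurationInSeconds": 300},
            # Only documents that fail delivery are written to the backup bucket.
            "S3BackupMode": "FailedDocumentsOnly",
            "S3Configuration": {
                "RoleARN": role_arn,
                "BucketARN": backup_bucket_arn,
            },
        },
    }


def create_stream(config):
    """Create the delivery stream; requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the config builder works without boto3

    firehose = boto3.client("firehose")
    return firehose.create_delivery_stream(**config)
```

Calling `create_stream(build_firehose_es_config(...))` with real ARNs would create the stream. Because Firehose writes to Amazon ES itself, no Lambda function or intermediate S3 read is needed on the main path, which is the difference between the second and fourth options.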