Which solution would allow the use of SQL to query the stream with the LEAST latency?
Amazon Kinesis Data Analytics with an AWS Lambda function to transform the data.
AWS Glue with a custom ETL script to transform the data.
An Amazon Kinesis Client Library to transform the data and save it to an Amazon ES cluster.
Amazon Kinesis Data Firehose to transform the data and put it into an Amazon S3 bucket.
Explanations:
Amazon Kinesis Data Analytics runs SQL queries directly against streaming data with low latency. An AWS Lambda function attached as a preprocessing step can decompress the GZIP data on the fly, so the SQL application receives readable records and can deliver near real-time insights.
AWS Glue is designed for batch ETL processes and does not provide low-latency querying. It is better suited to preparing data for analysis than to querying a live stream.
The Amazon Kinesis Client Library can transform the data and send it to an Amazon ES cluster, but it does not offer SQL querying natively, and the extra hop can add latency compared with running SQL directly on the stream.
Amazon Kinesis Data Firehose is primarily used for loading data into storage (e.g., S3) and is not designed for real-time SQL querying. The data must first be written to S3 and then queried there, and that added latency makes this option a poor fit for real-time insights.
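To make the first explanation concrete, the Lambda preprocessing step could look roughly like the sketch below. It assumes the record contract that Kinesis Data Analytics uses for preprocessing functions (base64-encoded `data`, and a response carrying `recordId`, `result`, and `data` for every record). The handler name and the choice to mark undecodable records as `ProcessingFailed` are illustrative assumptions, not part of the question.

```python
import base64
import gzip
import zlib


def lambda_handler(event, context):
    """Decompress GZIP records so the SQL application can read them.

    Assumed input:  {"records": [{"recordId": ..., "data": <base64 GZIP bytes>}]}
    Assumed output: one entry per input record, with the same recordId.
    """
    output = []
    for record in event["records"]:
        try:
            compressed = base64.b64decode(record["data"])
            payload = gzip.decompress(compressed)
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                # The service expects the transformed payload re-encoded as base64.
                "data": base64.b64encode(payload).decode("utf-8"),
            })
        except (OSError, EOFError, ValueError, zlib.error):
            # Not valid base64 or not valid GZIP: report the failure
            # and hand the original payload back unchanged.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Because decompression happens per batch, before the SQL code sees the records, the query itself stays a plain streaming SQL statement over uncompressed rows.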