Which solution will meet these requirements?
Store data in Amazon S3. Use Amazon Redshift Spectrum to query data.
Store data in Amazon S3. Use the AWS Glue Data Catalog and Amazon Athena to query data.
Store data in EMR File System (EMRFS). Use Presto in Amazon EMR to query data.
Store data in Amazon Redshift. Use Amazon Redshift to query data.
Explanations:
Amazon Redshift Spectrum can query data in S3, but Spectrum queries run through Amazon Redshift, so you pay for Redshift compute in addition to the charge for data scanned. That makes it more expensive than a fully serverless option such as Athena.
Storing data in Amazon S3 and querying it with Amazon Athena through the AWS Glue Data Catalog is a serverless architecture: there is no cluster to provision or keep running, you pay per query based on the data scanned, and capacity scales with varying query loads. This keeps costs low when queries are intermittent.
EMR File System (EMRFS) lets Amazon EMR read and write data in Amazon S3, and Presto on EMR can query that data efficiently. However, a persistent EMR cluster must be provisioned and kept running, which costs more and takes more operational effort than serverless options like Athena.
Storing data in Amazon Redshift may not be cost-effective for this scenario: a Redshift cluster carries a fixed cost for as long as it runs, whether or not queries are being executed, which is wasteful if queries occur only during a limited timeframe.
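
The cost argument in the explanations above (paying per query versus paying for a cluster that is always running) can be sketched as simple arithmetic. This is only a rough illustration: the function names are hypothetical, and every price and workload figure in the example is a placeholder chosen for easy arithmetic, not actual AWS pricing. Check the current pricing pages before drawing conclusions for a real workload.

```python
"""Illustrative comparison: pay-per-scan querying vs. an always-on cluster.

All prices and workload figures used here are hypothetical placeholders,
NOT current AWS pricing.
"""

HOURS_PER_MONTH = 730  # common approximation for the hours in one month


def pay_per_scan_monthly_cost(queries_per_month, tb_scanned_per_query, price_per_tb):
    """Monthly cost when each query is billed by the amount of data it scans."""
    return queries_per_month * tb_scanned_per_query * price_per_tb


def always_on_cluster_monthly_cost(hourly_rate_per_node, node_count, hours=HOURS_PER_MONTH):
    """Monthly cost of a cluster billed per node-hour, regardless of query volume."""
    return hourly_rate_per_node * node_count * hours


def break_even_queries(cluster_monthly_cost, tb_scanned_per_query, price_per_tb):
    """Number of queries per month at which both models cost the same."""
    return cluster_monthly_cost / (tb_scanned_per_query * price_per_tb)


if __name__ == "__main__":
    # Placeholder inputs (assumptions for illustration only).
    price_per_tb = 5.0          # hypothetical price per TB scanned
    tb_per_query = 0.5          # hypothetical data scanned by one query
    hourly_rate = 1.0           # hypothetical price per node-hour
    nodes = 2                   # hypothetical cluster size

    serverless = pay_per_scan_monthly_cost(100, tb_per_query, price_per_tb)
    cluster = always_on_cluster_monthly_cost(hourly_rate, nodes)
    threshold = break_even_queries(cluster, tb_per_query, price_per_tb)

    print(f"pay-per-scan: {serverless:.2f}")
    print(f"always-on cluster: {cluster:.2f}")
    print(f"break-even queries per month: {threshold:.0f}")
```

Under these assumed inputs, the per-query model stays cheaper until the monthly query count reaches the break-even threshold, which is why a workload that runs queries only during a limited timeframe tends to favor the serverless option.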