Which combination of steps will meet these requirements MOST cost-effectively?

A research company uses on-premises devices to generate data for analysis. The company wants to use the AWS Cloud to analyze the data. The devices generate .csv files and support writing the data to an SMB file share. Company analysts must be able to use SQL commands to query the data. The analysts will run queries periodically throughout the day.

Which combination of steps will meet these requirements MOST cost-effectively?

(Choose three.)

Deploy an AWS Storage Gateway on premises in Amazon S3 File Gateway mode.

Deploy an AWS Storage Gateway on premises in Amazon FSx File Gateway mode.

Set up an AWS Glue crawler to create a table based on the data that is in Amazon S3.

Set up an Amazon EMR cluster with EMR File System (EMRFS) to query the data that is in Amazon S3. Provide access to analysts.

Set up an Amazon Redshift cluster to query the data that is in Amazon S3. Provide access to analysts.

Set up Amazon Athena to query the data that is in Amazon S3. Provide access to analysts.

Explanations:

Deploying an AWS Storage Gateway in Amazon S3 File Gateway mode exposes an SMB file share on premises and stores each file written to it as an object in Amazon S3. The devices can keep writing .csv files to an SMB share, and the data becomes available in S3 for analysis. This is a cost-effective way to integrate on-premises data with AWS cloud storage.
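To make the file-to-object mapping concrete, here is a minimal sketch of where a file written to the share would be expected to appear in S3. It assumes the object key mirrors the file's path relative to the share root; the bucket name `research-data-bucket`, the optional prefix, and the function name are hypothetical.

```python
def s3_uri_for_share_path(share_relative_path: str, bucket: str, prefix: str = "") -> str:
    """Return the S3 URI a file on the SMB share would be expected to map to.

    Assumption: the object key mirrors the path relative to the share root.
    """
    # SMB clients usually write backslash-separated paths; S3 keys use "/".
    key = share_relative_path.replace("\\", "/").lstrip("/")
    if prefix:
        key = f"{prefix.strip('/')}/{key}"
    return f"s3://{bucket}/{key}"


# Hypothetical device output file and bucket name.
print(s3_uri_for_share_path(r"sensors\2024\run-001.csv", "research-data-bucket"))
```

Keeping a predictable folder layout on the share makes it easier to point later steps at a single S3 path.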

Deploying an AWS Storage Gateway in Amazon FSx File Gateway mode is not suitable because it provides on-premises access to file shares on Amazon FSx for Windows File Server rather than storing the files in Amazon S3. The query options in this scenario all read data from S3, so this gateway mode would not make the .csv files available to them, and it adds unnecessary cost and complexity.

Setting up an AWS Glue crawler scans the .csv data in Amazon S3, infers its schema, and creates a table in the AWS Glue Data Catalog. Analysts can then query that table with SQL without defining the schema by hand. The crawler is cost-effective because it is billed only for the time it runs.
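As a rough illustration of the crawler step, the sketch below builds the parameters for the Glue `CreateCrawler` call and, separately, submits them with boto3. The crawler name, IAM role ARN, database name, S3 path, and hourly schedule are all hypothetical placeholders, and the submit function assumes boto3 is installed and AWS credentials are configured.

```python
def glue_crawler_request(name, role_arn, database, s3_path, schedule=None):
    """Build keyword arguments for the Glue CreateCrawler API call."""
    request = {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        # Crawl everything under this S3 path and infer a table schema from it.
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }
    if schedule:
        # Optional cron expression so newly written files are picked up.
        request["Schedule"] = schedule
    return request


def create_and_start_crawler(request):
    """Create the crawler and start one run (requires boto3 and credentials)."""
    import boto3  # third-party AWS SDK, imported here so the sketch loads without it

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


# All values below are hypothetical placeholders.
request = glue_crawler_request(
    name="device-csv-crawler",
    role_arn="arn:aws:iam::123456789012:role/GlueCrawlerRole",
    database="research",
    s3_path="s3://research-data-bucket/sensors/",
    schedule="cron(0 * * * ? *)",  # assumed hourly schedule
)
print(request)
```

Calling `create_and_start_crawler(request)` would submit the request; it is left out of the example run because it needs a real AWS account.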

Setting up an Amazon EMR cluster with EMRFS can query the data in S3, but the cluster accrues charges for as long as it is running, whether or not queries are being executed. Because the analysts run queries only periodically, it is not the most cost-effective solution.

Setting up an Amazon Redshift cluster is costly because Redshift is a data warehouse whose provisioned resources are billed while the cluster runs. That is overkill for periodic SQL queries against .csv files in S3 and leads to unnecessary expense.

Setting up Amazon Athena to query the data in Amazon S3 is a highly cost-effective solution for ad-hoc queries. Athena is serverless, so analysts can run SQL queries directly on data stored in S3 without provisioning or managing a cluster, and the company pays only for the data scanned by each query.
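The following sketch shows what an analyst's query and its pay-per-scan cost might look like. The table `device_readings`, its columns, the database name, and the results location are hypothetical. The rate of 5 USD per TB, the 10 MB per-query minimum, and the use of binary terabytes are assumptions for illustration only; check current Athena pricing for your Region.

```python
def athena_query_request(sql, database, output_location):
    """Build keyword arguments for the Athena StartQueryExecution API call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def estimate_scan_cost_usd(bytes_scanned, usd_per_tb=5.0, min_bytes=10 * 1024**2):
    """Estimate the cost of one query from the bytes it scans.

    Assumptions: usd_per_tb and min_bytes are illustrative defaults,
    and 1 TB is treated as 1024**4 bytes.
    """
    billed_bytes = max(bytes_scanned, min_bytes)
    return billed_bytes / 1024**4 * usd_per_tb


def run_query(request):
    """Submit the query (requires boto3 and credentials); returns the execution ID."""
    import boto3  # third-party AWS SDK, imported here so the sketch loads without it

    athena = boto3.client("athena")
    return athena.start_query_execution(**request)["QueryExecutionId"]


# Hypothetical table, columns, database, and results bucket.
sql = "SELECT device_id, AVG(reading) AS avg_reading FROM device_readings GROUP BY device_id"
request = athena_query_request(sql, "research", "s3://research-data-bucket/athena-results/")
print(request["QueryString"])

# Under the assumed rate, a query that scans 2 GiB of .csv data:
print(round(estimate_scan_cost_usd(2 * 1024**3), 6))
```

Because cost scales with bytes scanned, a query that is never run costs nothing, which is the main contrast with a cluster that is billed while idle.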
