Which storage solution should a solutions architect recommend to meet these requirements?
Run AWS DataSync as a scheduled cron job to migrate the data to an Amazon S3 bucket on an ongoing basis.
Deploy an AWS Storage Gateway file gateway with an Amazon S3 bucket as the target storage. Migrate the data to the Storage Gateway appliance.
Deploy an AWS Storage Gateway volume gateway with cached volumes with an Amazon S3 bucket as the target storage. Migrate the data to the Storage Gateway appliance.
Configure an AWS Site-to-Site VPN connection from the on-premises environment to AWS. Migrate data to an Amazon Elastic File System (Amazon EFS) file system.
Explanations:
AWS DataSync is a data transfer service that copies data between on-premises storage and AWS storage services. It can migrate the data to Amazon S3, but it provides no on-premises cache, so it gives researchers neither immediate access to a subset of the data nor low-latency access to frequently accessed data, both of which are key requirements.
The AWS Storage Gateway file gateway exposes objects in Amazon S3 as files over NFS or SMB. It does keep a local cache of recently accessed data, so it is not entirely without low-latency access, but this answer key rejects it because researchers may still face delays when reading large datasets that are not already in the cache.
The AWS Storage Gateway volume gateway with cached volumes provides low-latency access to frequently accessed data by keeping that subset in a local cache while the bulk of the data is stored in Amazon S3. This solution meets the requirement for immediate availability of data with minimal lag, and it reduces both on-premises storage needs and ongoing capital expenses.
Configuring a Site-to-Site VPN and migrating the data to Amazon EFS does not efficiently provide immediate access to a subset of the data. EFS is a managed network file system with no on-premises cache, so reads from on-premises clients would travel over the VPN, which may introduce latency.
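The caching behavior described for cached volumes can be illustrated with a small conceptual model. The sketch below is not Storage Gateway's implementation and calls no AWS API; the class name CachedVolumeModel, the block IDs, and the cache capacity are all hypothetical, and least-recently-used eviction is an assumption chosen for simplicity. It only shows why repeated reads of a small working set are served locally while the full dataset stays in remote storage.

```python
from collections import OrderedDict


class CachedVolumeModel:
    """Toy model: a small local LRU cache in front of a large remote store."""

    def __init__(self, remote_blocks, cache_capacity):
        self.remote = remote_blocks      # stands in for the data held in Amazon S3
        self.capacity = cache_capacity   # number of blocks the local cache can hold
        self.cache = OrderedDict()       # block_id -> data, least recently used first
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.cache:
            # Cache hit: served locally, no round trip to remote storage.
            self.cache.move_to_end(block_id)
            self.hits += 1
            return self.cache[block_id]
        # Cache miss: fetch from the remote store, then keep a local copy.
        self.misses += 1
        data = self.remote[block_id]
        self.cache[block_id] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used block
        return data


# Hypothetical workload: 1000 remote blocks, room for 10 locally.
remote = {i: f"block-{i}" for i in range(1000)}
volume = CachedVolumeModel(remote, cache_capacity=10)

# Researchers repeatedly read a small working set, then one cold block.
for _ in range(5):
    for block_id in (1, 2, 3):
        volume.read(block_id)
volume.read(999)

print(volume.hits, volume.misses)  # 12 4
```

In this model only the first read of each block pays the remote round trip, which is the property the explanation relies on when it says frequently accessed data is available with minimal lag.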