Which solution meets these requirements?
Use regularly scheduled AWS Snowball Edge devices to transfer the sequencing data into AWS. When AWS receives the Snowball Edge device and the data is loaded into Amazon S3, use S3 events to trigger an AWS Lambda function to process the data.
Use AWS Data Pipeline to transfer the sequencing data to Amazon S3. Use S3 events to trigger an Amazon EC2 Auto Scaling group to launch custom-AMI EC2 instances running the Docker containers to process the data.
Use AWS DataSync to transfer the sequencing data to Amazon S3. Use S3 events to trigger an AWS Lambda function that starts an AWS Step Functions workflow. Store the Docker images in Amazon Elastic Container Registry (Amazon ECR) and trigger AWS Batch to run the container and process the sequencing data.
Use an AWS Storage Gateway file gateway to transfer the sequencing data to Amazon S3. Use S3 events to trigger an AWS Batch job that executes on Amazon EC2 instances running the Docker containers to process the data.
Explanations:
AWS Snowball Edge is designed for offline, bulk data transfer and does not fit a recurring daily transfer that must complete quickly. The physical shipping round trip would increase turnaround time rather than reduce it.
AWS Data Pipeline is not a good fit for recurring large data transfers from on-premises environments. In addition, S3 event notifications cannot invoke an EC2 Auto Scaling group directly (supported destinations are Lambda, SQS, SNS, and EventBridge), and this option provides no orchestration for the Docker-based processing workflow, making it inefficient for this use case.
AWS DataSync enables fast, automated, and reliable online transfer of the sequencing data to Amazon S3, so new data can be moved daily without physical media. The S3 event triggers a Lambda function that starts a Step Functions workflow, which orchestrates AWS Batch to run the Docker containers stored in Amazon ECR. This combination provides scalable, on-demand processing and reduces turnaround time, meeting the requirements.
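As a minimal sketch of the glue between S3 and Step Functions in this option, the Lambda handler below starts one workflow execution per uploaded object. The state machine ARN, the environment variable name, and the execution-name prefix are illustrative assumptions, not details from the question.

```python
import json
import os
import urllib.parse
import uuid

import boto3

sfn = boto3.client("stepfunctions")

# Hypothetical: the state machine ARN is supplied via configuration.
STATE_MACHINE_ARN = os.environ["STATE_MACHINE_ARN"]


def handler(event, context):
    """Triggered by an S3 ObjectCreated event; starts one workflow per object."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # S3 event keys are URL-encoded (e.g. spaces arrive as '+').
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        sfn.start_execution(
            stateMachineArn=STATE_MACHINE_ARN,
            # Execution names must be unique per state machine.
            name=f"seq-run-{uuid.uuid4()}",
            input=json.dumps({"bucket": bucket, "key": key}),
        )
```

Passing the bucket and key as workflow input lets the state machine hand them straight to the Batch job without re-reading the S3 event.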
An AWS Storage Gateway file gateway can write data to S3, but it is intended for hybrid cloud storage with a local cache rather than for frequent, high-throughput bulk transfers. This option also triggers AWS Batch directly from the S3 event (which would itself require an intermediary such as Lambda or EventBridge) and lacks the Step Functions orchestration needed to manage a multi-step processing workflow.
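For the AWS Batch step of the correct option, a Step Functions Task state can submit the job natively through the optimized `arn:aws:states:::batch:submitJob.sync` integration; the boto3 sketch below shows the equivalent API call. The job queue name, job definition name, and environment variable names are assumptions for illustration; the job definition would reference the Docker image stored in Amazon ECR.

```python
import re

import boto3

batch = boto3.client("batch")


def submit_sequencing_job(bucket: str, key: str) -> str:
    """Submit an AWS Batch job for one sequencing file.

    Assumes a job queue named 'genomics-queue' and a job definition named
    'sequencing-pipeline' whose container image lives in Amazon ECR.
    """
    # Batch job names allow only letters, digits, hyphens, and underscores.
    safe_name = re.sub(r"[^A-Za-z0-9_-]", "-", key)[:100]
    response = batch.submit_job(
        jobName=f"seq-{safe_name}",
        jobQueue="genomics-queue",
        jobDefinition="sequencing-pipeline",
        containerOverrides={
            "environment": [
                {"name": "INPUT_BUCKET", "value": bucket},
                {"name": "INPUT_KEY", "value": key},
            ]
        },
    )
    return response["jobId"]
```

Overriding the container environment per job lets a single job definition process any uploaded object, which is what makes the processing on-demand and scalable.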