How should the company accomplish this with the least amount of administrative overhead?
Run an Amazon EMR cluster that uses a MapReduce job to examine the CloudTrail trails.
Use the events history feature of the CloudTrail console to query the CloudTrail trails.
Write an AWS Lambda function to query the CloudTrail trails. Configure the Lambda function to be executed whenever a new file is created in the CloudTrail S3 bucket.
Create an Amazon Athena table that looks at the S3 bucket the CloudTrail trails are being written to. Use Athena to run queries against the trails.
Explanations:
Running an Amazon EMR cluster with MapReduce jobs requires significant setup and ongoing maintenance, introducing high administrative overhead compared to other solutions. It is also not well-suited for simple, ad hoc querying of CloudTrail logs.
The event history feature in the CloudTrail console covers only the last 90 days of management events, viewed per account and per Region. It cannot query historical CloudTrail logs across multiple accounts over a span of 3 years, so this option is not feasible for the scenario described.
A Lambda function triggered by new files in the CloudTrail S3 bucket processes each log file as it arrives, so it can automate actions but does not provide ad hoc querying of logs that already exist. Lambda suits event-driven processing, not querying large datasets over long periods. The company would also have to write and maintain the query logic itself, which adds unnecessary complexity for this use case.
Amazon Athena is a serverless query service that can directly query CloudTrail logs stored in Amazon S3. By creating an Athena table over the S3 bucket that the logs are written to, the company can run ad hoc SQL queries on the logs without provisioning or managing any infrastructure, including on logs dating back several years. This option has the least administrative overhead.
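As an illustration of the Athena approach, the following is a minimal sketch of an ad hoc query, not a definitive implementation. It assumes an Athena table over the CloudTrail bucket already exists; the table name `cloudtrail_logs`, the helper names `build_event_query` and `run_query`, and the database and output location passed to `run_query` are all hypothetical. The column names (`eventtime`, `eventsource`, `eventname`, `useridentity`, `sourceipaddress`) follow the usual CloudTrail table definition for Athena, where `eventtime` is an ISO 8601 string.

```python
import re

# Hypothetical table name: replace with the table you actually created.
DEFAULT_TABLE = "cloudtrail_logs"

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_event_query(event_name, start_time, table=DEFAULT_TABLE, limit=100):
    """Return an Athena SQL string listing occurrences of one API call."""
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    # Double any single quotes so the values stay inside their SQL literals.
    safe_event = event_name.replace("'", "''")
    safe_start = start_time.replace("'", "''")
    return (
        "SELECT eventtime, eventsource, eventname, "
        "useridentity.arn AS caller_arn, sourceipaddress\n"
        f"FROM {table}\n"
        f"WHERE eventname = '{safe_event}'\n"
        # eventtime is an ISO 8601 string, so string comparison orders by time.
        f"  AND eventtime >= '{safe_start}'\n"
        "ORDER BY eventtime DESC\n"
        f"LIMIT {int(limit)}"
    )


def run_query(sql, database, output_location):
    """Submit the query to Athena. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the query builder works without it

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]
```

For example, `build_event_query("DeleteBucket", "2022-01-01T00:00:00Z")` produces a query for every `DeleteBucket` call since the start of 2022. The same SQL can be pasted into the Athena console instead of being submitted through `run_query`.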