Querying the AWS CloudTrail using Amazon Athena - Part II

In this blog, we will look at the continuous compliance solution by querying the AWS CloudTrail with Amazon Athena.

AWS CloudTrail

AWS CloudTrail is a service provided by AWS that allows you to enable governance, compliance, operational, and risk auditing of your AWS account. Events in CloudTrail are actions taken by a user, role, or AWS service. Events include AWS Management Console, AWS Command Line Interface, and AWS SDKs and APIs actions.

Amazon Athena

Amazon Athena is an interactive query service that allows you to use standard SQL to analyze data directly in Amazon Simple Storage Service (Amazon S3). Athena is serverless, so there is no infrastructure to set up or manage, and you can start analyzing data immediately. Amazon Athena is ideal for quick, ad-hoc querying. It can also handle complex analysis, including large joins, window functions, and arrays.

How to Query CloudTrail Logs using Amzon Athena

  1. The CloudTrail service is enabled, and the logs are stored in an S3 bucket.
  2. Initially, an Athena Table is created for querying the CloudTrail Logs from S3.
  3. The below external table creates a query with partition projection enabled, so if any additional data is added in the mentioned S3 path, the data will be automatically pulled by the query.

The package manager for python, pip is required for installing AWS CLI and boto3. Then for installation use the following commands:

CREATE EXTERNAL TABLE cloudtrail_logs (

    • eventversion STRING,
    • useridentity STRUCT < type: STRING,
    • principalid: STRING,
    • arn: STRING,
    • accountid: STRING,
    • invokedby: STRING,
    • accesskeyid: STRING,
    • userName: STRING,
    • sessioncontext: STRUCT < attributes: STRUCT < mfaauthenticated: STRING,
    • creationdate: STRING >,
    • sessionissuer: STRUCT < type: STRING,
    • principalId: STRING,
    • arn: STRING,
    • accountId: STRING,
    • userName: STRING >>>,
    • eventtime STRING,
    • eventsource STRING,
    • eventname STRING,
    • awsregion STRING,
    • sourceipaddress STRING,
    • useragent STRING,
    • errorcode STRING,
    • errormessage STRING,
    • requestparameters STRING,
    • responseelements STRING,
    • additionaleventdata STRING,
    • requestid STRING,
    • eventid STRING,
    • resources ARRAY < STRUCT < ARN: STRING,
    • accountId: STRING,
    • type: STRING >>,
    • eventtype STRING,
    • apiversion STRING,
    • readonly STRING,
    • recipientaccountid STRING,
    • serviceeventdetails STRING,
    • sharedeventid STRING
    • ) PARTITIONED BY (accountid string, day string, region string) ROW FORMAT SERDE
    • 'com.amazon.emr.hive.serde.CloudTrailSerde' STORED AS INPUTFORMAT
    • 'com.amazon.emr.cloudtrail.CloudTrailInputFormat' OUTPUTFORMAT
    • 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 's3://cloudtrails-
    • logsevents/AWSLogs/' TBLPROPERTIES (
    • 'projection.enabled' = 'true',
    • 'projection.accountid.range' = '0,9999999999999',
    • 'projection.accountid.type' = 'integer',
    • 'projection.day.format' = 'yyyy/MM/dd',
    • 'projection.day.interval' = '1',
    • 'projection.day.interval.unit' = 'DAYS',
    • 'projection.day.range' = '2015/01/01,NOW',
    • 'projection.day.type' = 'date',
    • 'projection.region.type' = 'enum',
    • 'projection.region.values' = 'us-east-1, us-east-2',
    • 'storage.location.template' =
    • 's3:///AWSLogs/${accountid}/CloudTrail/${region}/${day}/'
    • )
  1. Creation of Athena view from the external table to parse the data from the column user-identity
    • create or replace view cloudtrail_logs_vw as
    • select
    • eventversion,
    • useridentity.type as useridentity_type,
    • eventname,
    • useridentity.principalid as useridentity_principalid,
    • useridentity.arn as useridentity_arn,
    • useridentity.accountid as useridentity_accountid,
    • useridentity.invokedby as useridentity_invokedby,
    • useridentity.username as useridentity_username,
    • useridentity.sessioncontext as useridentity_sessioncontext,
    • eventtime,
    • eventsource,
    • awsregion,
    • sourceipaddress,
    • useragent,
    • errorcode,
    • errormessage,
    • requestparameters,
    • responseelements,
    • additionaleventdata,
    • requestid,
    • eventid,
    • resources,
    • eventtype,
    • apiversion,
    • readonly,
    • recipientaccountid,
    • serviceeventdetails,
    • sharedeventid,
    • accountid,
    • day,
    • region
    • from cloudtrail_logs

2. For the querying particular Event Source and Event Name, the Below Query is run and the desired result will be shown as mentioned in the below image

      • select *
      • from cloudtrail_logs_vw
      • where accountid in ('')
      • and region = 'us-east-1'
      • and day = '2022/05/04'
      • and eventsource = 's3.amazonaws.com'
      • and eventname = 'PutBucketPublicAccessBlock'

Output

Picture1
Picture2

If you have any questions or suggestions, please reach out to us at contactus@1cloudhub.com

Written by:  Sriram Narayanan  &   Umashankar N

Sharing is caring!

In Blog
Subscribe to our Newsletter1CloudHub