Do you want to know how much outgoing traffic your S3 bucket generates? Maybe your account costs have increased and you need to find out what is causing all the traffic. Or maybe you are just curious whether your API on S3 is at all popular. Fortunately, it is quite easy to get the number of requests and the amount of outgoing traffic in four simple steps.
1. Create a new S3 bucket for your access logs. I will call mine my-log-bucket.
2. Enable logging for the S3 buckets you want to analyze.
This can easily be done in the AWS console. Right-click the bucket and expand the logging section. Check the enable checkbox, choose the bucket that you created in step one as the target, and use the name of the current bucket as the prefix.
The access logs will now be copied from the S3 servers to my-log-bucket on an hourly basis.
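If you would rather script this step than click through the console, here is a minimal sketch in Python. It assumes the boto3 library is installed and that your AWS credentials are already configured; the helper names build_logging_status and enable_logging are my own, and my-bucket is a placeholder for the bucket you want to analyze. The log bucket must also allow S3 log delivery to write to it, which this sketch does not set up.

```python
def build_logging_status(target_bucket, source_bucket):
    """Build the logging configuration for one source bucket.

    The source bucket name is used as the key prefix, mirroring the
    console setup described above.
    """
    return {
        "LoggingEnabled": {
            "TargetBucket": target_bucket,
            "TargetPrefix": source_bucket,
        }
    }


def enable_logging(source_bucket, target_bucket):
    """Turn on access logging for source_bucket (assumes boto3 + credentials)."""
    import boto3  # imported here so the helper above works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_logging(
        Bucket=source_bucket,
        BucketLoggingStatus=build_logging_status(target_bucket, source_bucket),
    )


if __name__ == "__main__":
    # "my-bucket" is a placeholder; replace it with your own bucket name.
    print(build_logging_status("my-log-bucket", "my-bucket"))
```

Calling enable_logging("my-bucket", "my-log-bucket") should then have the same effect as the console steps, provided the permissions are in place.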
3. Collect the log files.
s3cmd sync s3://my-log-bucket/ logs/
4. Analyze the files with a tool.
To analyze the logs I used the Ruby gem request-log-analyzer.
If you don’t have Ruby, you need to install it first. The gem is then installed the usual way:
gem install request-log-analyzer
Now you can create a request report with the following command:
request-log-analyzer -f amazon_s3 --output html --file report.html logs/my-bucket*
The report will include the request distribution per hour, the most popular items, request durations, the amount of traffic per item, and the distribution of HTTP status codes, among other things. That is basically everything you need to figure out whether something is wrong or your public files are just remarkably popular.
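If you only want a quick look at the traffic per item and the status codes without generating the full report, the log lines can also be summed up by hand. The following is a rough sketch in Python, not what request-log-analyzer does internally. It assumes the standard space-separated S3 access log layout (owner, bucket, time, IP, requester, request ID, operation, key, request URI, status, error code, bytes sent, ...); the function name summarize and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Leading fields of an S3 access log line, up to and including bytes sent.
LOG_LINE = re.compile(
    r'^(?P<owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] (?P<ip>\S+) '
    r'(?P<requester>\S+) (?P<request_id>\S+) (?P<operation>\S+) (?P<key>\S+) '
    r'(?:"(?P<request_uri>[^"]*)"|-) (?P<status>\S+) (?P<error>\S+) '
    r'(?P<bytes_sent>\S+)'
)

# Made-up sample lines in the assumed layout, for illustration only.
SAMPLE_LINES = [
    'owner123 my-bucket [06/Feb/2019:00:00:38 +0000] 192.0.2.3 - REQ1 '
    'REST.GET.OBJECT photos/a.jpg "GET /photos/a.jpg HTTP/1.1" 200 - 2048 2048 7 5 "-" "curl/7.0" -',
    'owner123 my-bucket [06/Feb/2019:00:00:52 +0000] 192.0.2.4 - REQ2 '
    'REST.GET.OBJECT docs/b.pdf "GET /docs/b.pdf HTTP/1.1" 200 - 1024 1024 9 6 "-" "curl/7.0" -',
    'owner123 my-bucket [06/Feb/2019:00:01:10 +0000] 192.0.2.3 - REQ3 '
    'REST.GET.OBJECT photos/a.jpg "GET /photos/a.jpg HTTP/1.1" 304 - - 2048 5 - "-" "curl/7.0" -',
]


def summarize(lines):
    """Return (bytes sent per key, request count per HTTP status)."""
    bytes_per_key = Counter()
    status_counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        status_counts[match["status"]] += 1
        sent = match["bytes_sent"]
        if sent != "-":  # "-" means no bytes were sent for this request
            bytes_per_key[match["key"]] += int(sent)
    return bytes_per_key, status_counts


if __name__ == "__main__":
    traffic, statuses = summarize(SAMPLE_LINES)
    for key, sent in traffic.most_common():
        print(key, sent)
    print(dict(statuses))
```

To run it on real data, pass the lines of the files you synced into logs/ instead of SAMPLE_LINES.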
So what are you waiting for? Enable logging now and you can look forward to nice statistics tomorrow. :)