More S3 analysis

Sat 23 May 2015 by Patrick Pierson

I wanted to do some semi-deep inspection of S3 access logs, so I wrote a Python script to do it.
S3 log parsing is a GitHub repo I am going to start updating with scripts for S3 log analysis in Python, Bash, etc. I'd also like to get back into Elastic MapReduce and Pig scripts on AWS. To use the s3parse.py script, do the following:

  1. Download your S3 logs to a "logs" folder (one way to do this is sketched right after these steps)
  2. Run ./s3parse.py
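
For step 1, anything that lands the raw log files in ./logs works. As one possibility, here is a minimal boto sketch; the bucket name and prefix are hypothetical placeholders for wherever your access logging is configured to deliver:

    # Hypothetical: pull S3 access logs down to ./logs with boto.
    # 'my-logging-bucket' and the 'logs/' prefix are placeholders.
    import os
    import boto

    conn = boto.connect_s3()                       # credentials from env/boto config
    bucket = conn.get_bucket('my-logging-bucket')  # placeholder bucket name

    if not os.path.isdir('logs'):
        os.makedirs('logs')

    for key in bucket.list(prefix='logs/'):
        if key.name.endswith('/'):
            continue  # skip the prefix entry itself
        # each key is one access log file; save it under ./logs/
        key.get_contents_to_filename(os.path.join('logs', os.path.basename(key.name)))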

When you run s3parse.py, you will see output that looks like this:

Run 'parse s3 logs'
Task:

Type "parse s3 logs"

Task: parse s3 logs

Next it will ask for your logs location; the default is ./logs/*. Press enter to keep it.

Logs location: "./logs/*"
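
Pressing enter kicks off the parse. The real code lives in the repo; purely as a sketch of the idea, the core of that step might look like the following. Each S3 access log line is whitespace-separated, with the timestamp in brackets and the request line, referrer, and user agent quoted. The column names below follow the documented S3 server access log format and are my own naming, not necessarily the script's:

    import csv
    import glob
    import re

    # One field per match: a [bracketed] timestamp, a "quoted" string, or a bare token.
    FIELD = re.compile(r'\[([^\]]+)\]|"([^"]*)"|(\S+)')

    # Column names per the documented S3 server access log format.
    HEADER = ['bucket_owner', 'bucket', 'time', 'remote_ip', 'requester',
              'request_id', 'operation', 'key', 'request_uri', 'http_status',
              'error_code', 'bytes_sent', 'object_size', 'total_time',
              'turn_around_time', 'referrer', 'user_agent', 'version_id']

    def parse_line(line):
        # lastindex says which alternative matched; take that group's text
        return [m.group(m.lastindex) for m in FIELD.finditer(line)]

    with open('test.csv', 'w', newline='') as out:
        writer = csv.writer(out)
        writer.writerow(HEADER)
        for path in glob.glob('./logs/*'):
            with open(path) as f:
                for line in f:
                    writer.writerow(parse_line(line))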

The parsing pass creates a test.csv that you can then run basic analytics on. Run s3parse.py again:

./s3parse.py 

You will see the following:

Available Tasks
----------------
count bytes sent #prints total bytes in KB
show 404 count #prints 404s
show ips count #prints ip counts
show useragent count #prints useragent counts
quit #quits
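
Each of these tasks boils down to a small aggregation over test.csv. Assuming the file carries the header row from the parsing sketch above (again, my column names, not necessarily the script's), the counts could be computed like this:

    import csv
    from collections import Counter

    with open('test.csv') as f:
        rows = list(csv.DictReader(f))

    # count bytes sent -- a '-' in the log means no bytes were sent
    total_bytes = sum(int(r['bytes_sent']) for r in rows if r['bytes_sent'].isdigit())
    print('total bytes sent: %d KB' % (total_bytes // 1024))

    # show 404 count
    print('404s: %d' % sum(1 for r in rows if r['http_status'] == '404'))

    # show ips count / show useragent count -- top 10 of each
    for column in ('remote_ip', 'user_agent'):
        print('\n%s:' % column)
        for value, n in Counter(r[column] for r in rows).most_common(10):
            print('%6d  %s' % (n, value))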