I wanted to do some semi-deep inspection of S3 logs, so I wrote the following Python script to do it.
S3 log parsing is a GitHub repo I am going to keep updating with scripts for S3 log analysis in Python/Bash/etc. I'd also like to get back into Elastic MapReduce and Pig scripts on AWS. To use the s3parse.py script, do the following:
- Download your S3 logs to a "logs" folder
- Run s3parse.py

You will see output that looks like this:
```
Run 'parse s3 logs' Task:
```

Type "parse s3 logs" at the prompt:

```
Task: parse s3 logs
```

Next, it will ask for your logs location; the default is ./logs/*. Press Enter to keep it.

```
Logs location: "./logs/*"
```
This will create a test.csv that you can then run basic analytics on. Run s3parse.py again.
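The parse step boils down to reading each log file and turning the space-delimited S3 server access log lines into CSV rows. Here's a minimal sketch of how that might look; the regex and field names follow the documented S3 server access log format, but this is illustrative, not the actual s3parse.py internals:

```python
import csv
import glob
import re

# Matches one S3 server access log line: owner, bucket, [time], ip,
# requester, request id, operation, key, "request", status, error code,
# bytes sent, object size, total time, turnaround time, "referrer", "agent".
LOG_PATTERN = re.compile(
    r'(\S+) (\S+) \[([^\]]+)\] (\S+) (\S+) (\S+) (\S+) (\S+) '
    r'"([^"]*)" (\S+) (\S+) (\S+) (\S+) (\S+) (\S+) "([^"]*)" "([^"]*)"'
)

FIELDS = [
    "owner", "bucket", "time", "ip", "requester", "request_id", "operation",
    "key", "request_uri", "status", "error_code", "bytes_sent", "object_size",
    "total_time", "turnaround_time", "referrer", "user_agent",
]

def parse_logs(log_glob="./logs/*", out_path="test.csv"):
    """Parse every file matching log_glob into a single CSV."""
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(FIELDS)
        for path in glob.glob(log_glob):
            with open(path) as f:
                for line in f:
                    match = LOG_PATTERN.match(line)
                    if match:  # silently skip malformed lines
                        writer.writerow(match.groups())
```

Lines that don't match the pattern are simply skipped, which keeps the sketch robust to the occasional truncated log entry.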
You will see the following:

```
Available Tasks
----------------
count bytes sent      #prints total bytes in KB
show 404 count        #prints 404s
show ips count        #prints ip counts
show useragent count  #prints useragent counts
quit                  #quits
```
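Each of those tasks is a simple aggregation over test.csv. As a hedged sketch, assuming the CSV has `bytes_sent`, `status`, `ip`, and `user_agent` columns (these helpers are illustrative, not the script's actual code):

```python
import csv
from collections import Counter

def count_bytes_sent(csv_path="test.csv"):
    """Total bytes sent, in KB; non-numeric values like '-' count as 0."""
    total = 0
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            value = row["bytes_sent"]
            total += int(value) if value.isdigit() else 0
    return total / 1024.0

def status_count(csv_path="test.csv", status="404"):
    """Number of requests that returned the given HTTP status."""
    with open(csv_path, newline="") as f:
        return sum(1 for row in csv.DictReader(f) if row["status"] == status)

def field_counts(csv_path="test.csv", field="ip"):
    """Counter of values in a column, e.g. 'ip' or 'user_agent'."""
    with open(csv_path, newline="") as f:
        return Counter(row[field] for row in csv.DictReader(f))
```

`Counter.most_common()` then gives you the top IPs or user agents directly.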