January 13th, 2010 — geek
The log format for Amazon S3 is slightly annoying. Not overwhelmingly so, but the date field contains the field separator (a space) and isn’t enclosed in quote characters, so splitting on spaces breaks it into two fields. Here’s some code to split the fields up, assuming you’ve downloaded the log file already (it’s easy enough to list all logs and retrieve them with boto):
[cce lang="Python"]
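import csv

log_entries = []
with open('logfilename') as f:
    reader = csv.reader(f, delimiter=' ', quotechar='"')
    for row in reader:
        # The date contains an unquoted space, so it arrives split
        # across fields 2 and 3; join the halves back together.
        row[2] = row[2] + ' ' + row[3]
        del row[3]
        log_entries.append(row)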
[/cce]
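Once the date field is repaired, it can be turned into a real timestamp. The following is only a sketch under a few assumptions: it targets Python 3, the `parse_s3_log` helper name is made up for illustration, and the sample line is invented and shortened (real log entries carry more fields).

```python
import csv
import io
from datetime import datetime


def parse_s3_log(lines):
    """Split S3 access-log lines into fields, rejoining the split date."""
    entries = []
    for row in csv.reader(lines, delimiter=' ', quotechar='"'):
        # Fields 2 and 3 are the two halves of "[dd/Mon/yyyy:HH:MM:SS +zzzz]".
        row[2] = row[2] + ' ' + row[3]
        del row[3]
        entries.append(row)
    return entries


# Invented, shortened sample line; not taken from a real log.
sample = io.StringIO(
    'ownerid mybucket [13/Jan/2010:10:15:30 +0000] 192.0.2.1 - REQ1 '
    'REST.GET.OBJECT index.html "GET /index.html HTTP/1.1" 200\n'
)
entries = parse_s3_log(sample)

# Parse the repaired date field, brackets included, into a datetime.
when = datetime.strptime(entries[0][2], '[%d/%b/%Y:%H:%M:%S %z]')
```

Because the quoted request line is handled by `quotechar`, it survives as a single field even though it contains spaces. Only the date needs the manual repair.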
December 9th, 2009 — geek
Here’s a little Python snippet I just came up with. It has been immensely useful, because I couldn’t find an obvious method like filter that applies to dictionaries instead of lists. This code pulls specific key-value pairs out of a dictionary and puts them in a new dictionary.
[cce lang="Python"]
>>> x = {'test1': 1, 'test2': 2, 'test3': 3}
>>> my_keys = ('test1', 'test2')
>>> y = dict(filter(lambda t: t[0] in my_keys, x.items()))
>>> y
{'test1': 1, 'test2': 2}
[/cce]
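Obviously this won’t scale: it walks every item in the dictionary and checks each key against my_keys, so the cost grows with both the size of the dictionary and the number of keys. But if you just want a few keys out of a dictionary then you should be fine.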
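If the dictionary is large, one possible alternative is to loop over the wanted keys instead of over every item. This is a sketch, not the approach from the snippet above: it assumes a Python version with dict comprehensions (2.7 or 3), and the `pick` name is made up for illustration.

```python
def pick(d, keys):
    """Return a new dict holding only the given keys that exist in d."""
    # Each "k in d" test is a hash lookup, so the work depends on
    # len(keys) rather than on the size of d.
    return {k: d[k] for k in keys if k in d}


x = {'test1': 1, 'test2': 2, 'test3': 3}
y = pick(x, ('test1', 'test2', 'missing'))
```

Keys that are not in the dictionary are silently skipped here. Dropping the `if k in d` clause would raise a `KeyError` for them instead, which may be preferable when a missing key indicates a bug.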