Upload a file to S3 with Boto

Amazon S3 is a distributed storage service which I’ve recently been working with. Boto is a Python library for working with Amazon Web Services, of which S3 is one facet. This post will demonstrate how to upload a file using boto (a future post will demonstrate how to create the parameters for a POST multi-part request that another client can use to upload to S3 without knowing your AWS key ID or secret access key).

I will assume you’ve signed up for S3 already, and successfully downloaded and installed boto.

import boto

# Fill these in - you get them when you sign up for S3
AWS_ACCESS_KEY_ID = ''
AWS_SECRET_ACCESS_KEY = ''

bucket_name = AWS_ACCESS_KEY_ID.lower() + '-mah-bucket'
conn = boto.connect_s3(AWS_ACCESS_KEY_ID,
            AWS_SECRET_ACCESS_KEY)

If you’ve previously learnt about S3 you’ll know that bucket names need to be unique across all S3 users, so that’s why we’ve used the AWS Key ID as a prefix (as this is your unique ID, it’s unlikely someone else will be using it as a prefix). We’ve also converted the bucket name to lowercase, as DNS is case-insensitive and it’s nice to use vanity domains of the form http://[bucket_name].s3.amazonaws.com/.
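The naming step can be sketched as a small helper. This is only an illustration: make_bucket_name is a hypothetical function, not part of boto, and the check is deliberately stricter than S3 itself (it rejects dots, which S3 allows). The example key ID is a placeholder.

```python
import re

# Lowercase letters, digits and hyphens only, 3 to 63 characters,
# starting and ending with a letter or digit. Dots are rejected here
# even though S3 permits them (an assumption made to keep this simple).
_BUCKET_NAME_RE = re.compile(r'^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$')


def make_bucket_name(access_key_id, suffix):
    """Build a lowercase bucket name of the form '<key id>-<suffix>'.

    Hypothetical helper: raises ValueError if the result would not be
    usable as a DNS-style bucket name.
    """
    name = ('%s-%s' % (access_key_id, suffix)).lower()
    if not _BUCKET_NAME_RE.match(name):
        raise ValueError('not a DNS-compatible bucket name: %r' % name)
    return name
```

Note that an empty key ID (as in the unfilled template above) would produce '-mah-bucket', which this helper rejects because the name starts with a hyphen.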

We create the connection object with boto.connect_s3() and this object lets us interact with S3. We now create ourselves a bucket and upload a file:

import boto.s3.connection  # makes boto.s3.connection.Location available
bucket = conn.create_bucket(bucket_name,
        location=boto.s3.connection.Location.DEFAULT)

testfile = "replace this with an actual filename"
print('Uploading %s to Amazon S3 bucket %s' %
      (testfile, bucket_name))

import sys
def percent_cb(complete, total):
    sys.stdout.write('.')
    sys.stdout.flush()

from boto.s3.key import Key
k = Key(bucket)
k.key = 'my test file'
k.set_contents_from_filename(testfile,
        cb=percent_cb, num_cb=10)

There you have it! set_contents_from_filename is a particularly nifty method that takes care of streaming the data to S3. Amazon’s prototype Python S3 library required that the whole file be loaded into memory, which doesn’t work too well if you are working with large media files.

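Oh, and the percent_cb function is a callback that gets called as the upload progresses. Boto passes it the number of bytes sent so far and the total size, and calls it at most num_cb times.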
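As a variation on the callback, here is a minimal sketch that prints a percentage instead of dots. It assumes boto calls the callback with two integers (bytes sent, total bytes); format_progress and the optional out parameter are hypothetical additions for illustration.

```python
import sys


def format_progress(complete, total):
    """Return progress as a whole-number percentage string."""
    if total <= 0:
        # Nothing to send (e.g. an empty file): report it as done.
        return '100%'
    return '%d%%' % (100 * complete // total)


def percent_cb(complete, total, out=sys.stdout):
    """Progress callback: overwrite the current line with a percentage."""
    # '\r' returns the cursor to the start of the line, so each call
    # overwrites the previous figure instead of appending to it.
    out.write('\r' + format_progress(complete, total))
    out.flush()
```

It is passed in the same way as before, for example k.set_contents_from_filename(testfile, cb=percent_cb, num_cb=10).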



11 comments ↓

#1   Matt on 01.07.10 at 5:46 am

Is it possible to write a few bytes to the S3 outgoing socket every so often (rather than giving it a file to transfer?)

#2   Joel on 01.07.10 at 7:05 am

It’s possible to send just a string with:

boto.s3.key.Key.set_contents_from_string

But if you want to use the same HTTP request, you’ll have to look at how

boto.s3.key.Key.send_file

works.

Note that if you keep the socket open, there is no guarantee that the S3 key object will be updated until after the request completes. And if you wait too long between writes then S3 might close the connection…

#3   Stephen Reese on 01.30.11 at 8:00 am

Is there a way to make the percentage ticks appear after the printed statement instead of on the next line?

print 'Uploading %s to Amazon S3 bucket %s' % \
(testfile, bucket_name)

#4   Joel on 01.31.11 at 7:59 pm

@Stephen It depends on your console. If it’s ANSI compatible (most are), you can use ANSI escape codes to move the cursor.

http://en.wikipedia.org/wiki/ANSI_escape_code

(search for “cursor position”)

Should be easy enough to implement…
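A rough sketch of the escape-code approach, assuming an ANSI-compatible terminal and a message shorter than the terminal width. The names announce and cursor_forward are hypothetical; the escape sequences are the standard "cursor up" and "cursor forward" codes.

```python
import sys

CURSOR_UP_ONE = '\x1b[1A'  # ANSI CUU: move the cursor up one line


def cursor_forward(columns):
    """Return the ANSI CUF sequence that moves the cursor right."""
    return '\x1b[%dC' % columns


def announce(message, out=sys.stdout):
    """Print message, then park the cursor just after its last character.

    Anything written next (such as the progress dots) then appears on
    the same line as the message.
    """
    out.write(message + '\n')
    if message:
        out.write(CURSOR_UP_ONE + cursor_forward(len(message)))
    out.flush()
```

A simpler alternative that needs no escape codes is to write the message with sys.stdout.write and no trailing newline, then write a newline once the upload finishes.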

#5   Adii on 09.05.12 at 12:41 am

Is it possible to get the uploaded file’s URL?

#6   twistedlog on 10.05.12 at 8:22 am

Can you tell me how to work with large video files?

#7   Saqib Ali on 11.11.12 at 3:55 pm

There is an error in this example. Here is the line:

bucket = conn.create_bucket(bucket_name, location=s3.connection.Location.DEFAULT)

The variable s3 is undefined. Please post working, tested code.

#8   Joel on 11.12.12 at 10:59 am

@Saqib Thanks for the heads up. I’ve updated the example code… forgot to import boto.s3

#9   Uploading images to Amazon S3 with Tornado using Boto | & such & such on 01.02.13 at 1:39 pm

[...] S3 upload and download using Python/Django Upload a file to S3 with Boto How to upload an image with python-tornado from an HTML form? Uploading image to S3 (boto + GAE) [...]

#10   Jeff on 01.29.13 at 4:19 am

@twistedlog You need to use S3 Multipart Upload if the file is bigger than 5GB. See notes here:

http://aws.amazon.com/about-aws/whats-new/2010/11/10/Amazon-S3-Introducing-Multipart-Upload/

The boto Python library has classes that support this.
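A minimal sketch of how that might look, assuming a boto 2 bucket object like the one created in the post. iter_parts, multipart_upload and the 50 MB default part size are hypothetical choices; the boto calls (initiate_multipart_upload, upload_part_from_file with a size argument, complete_upload, cancel_upload) are written from memory of boto 2 and should be checked against your installed version.

```python
import os

MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum for every part except the last


def iter_parts(file_size, part_size):
    """Yield (part_number, offset, length) tuples covering the file.

    Part numbers start at 1, as S3 expects.
    """
    if part_size < MIN_PART_SIZE:
        raise ValueError('part_size must be at least 5 MB')
    offset = 0
    part_number = 1
    while offset < file_size:
        length = min(part_size, file_size - offset)
        yield part_number, offset, length
        offset += length
        part_number += 1


def multipart_upload(bucket, filename, key_name,
                     part_size=50 * 1024 * 1024):
    """Upload a non-empty file in parts (sketch, assumes boto 2's API)."""
    file_size = os.path.getsize(filename)
    mp = bucket.initiate_multipart_upload(key_name)
    try:
        with open(filename, 'rb') as fp:
            for part_number, offset, length in iter_parts(file_size,
                                                          part_size):
                fp.seek(offset)
                # Assumption: size= limits how many bytes are read.
                mp.upload_part_from_file(fp, part_number, size=length)
        mp.complete_upload()
    except Exception:
        # Abandon the upload so the stored parts are discarded.
        mp.cancel_upload()
        raise
```

Only one part is read at a time, so memory use stays bounded regardless of file size.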

#11   Jeff on 01.29.13 at 4:23 am

@Adii The uploaded file will have a URL like:

s3://bucket_name/folder_name/file_name
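To get an HTTP address a browser can open, one option is to build the virtual-hosted URL form mentioned in the post. This is a sketch only: public_url is a hypothetical helper, it assumes a bucket name without dots, and the link only works if the object has been made publicly readable.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote        # Python 2


def public_url(bucket_name, key_name):
    """Return the virtual-hosted HTTPS URL for an S3 object.

    The key is percent-encoded; '/' is left alone so that
    'folder_name/file_name' style keys keep their path shape.
    """
    return 'https://%s.s3.amazonaws.com/%s' % (bucket_name,
                                               quote(key_name))
```

For private objects, boto 2’s Key object also has a generate_url method that takes an expiry in seconds, for example k.generate_url(3600), and returns a signed, time-limited URL.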
