Wednesday, 11 June 2014

Using Python boto to store content in AWS S3



Recently I needed to store some plain text files in AWS S3. As my main program is written in Python, I used boto for the task. Here are the basic steps.

Install the boto package into the Python environment

If the environment has internet access, boto can be set up easily with 'easy_install'. Otherwise you have to download the source from GitHub and install it manually.


easy_install boto
git clone https://github.com/boto/boto
cd boto
python setup.py install
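
After either install route, it can be handy to confirm that the package is visible to the interpreter the main program runs under. The following is a minimal sketch using only the standard library; the helper name `is_installed` is made up for this illustration and is not part of boto.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the named top-level package can be found for import."""
    # find_spec returns None when no such top-level package is importable.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Prints True only if boto is importable in this Python environment.
    print("boto installed:", is_installed("boto"))
```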

Set up key configuration

When accessing S3 you need to authenticate (security keys, X.509 certificates, etc.). In Python/boto, security keys are the easiest way (though maybe not the most appropriate one). A security key is a pair of key_id and access_key generated by AWS to identify the user. See more about IAM.

Usually, when you create an IAM user, AWS lets you download the key CSV file to your local machine. There are two ways to use the keys in Python/boto. One is to code them into your program:
aws_access_key = "aaaaaaaaa"
aws_secret_key = "sssssssssssssssss"
import boto
conn = boto.connect_s3(aws_access_key, aws_secret_key)
The other is to put your security keys into /etc/boto.cfg and call connect_s3 without any parameters:
import boto
conn = boto.connect_s3()
I used the second way as it is more flexible (it is easy to change the user) and more secure (you can restrict the file permissions). The config file looks like this:
[Credentials]

aws_access_key_id = *******************
aws_secret_access_key = ********************
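
As a rough illustration of the "restrict the file permissions" point, the sketch below writes a [Credentials] section in the same layout and makes the file readable by its owner only. It uses only the standard library; the helper names, the temporary demo path, and the EXAMPLE values are assumptions for illustration, and the permission handling assumes a POSIX system. On a real machine the target would be the boto config file rather than a temporary path.

```python
import configparser
import os
import stat
import tempfile


def write_boto_config(path, access_key_id, secret_access_key):
    """Write a [Credentials] section to `path`, readable by the owner only."""
    # Interpolation is disabled so a '%' in a value is stored literally.
    config = configparser.ConfigParser(interpolation=None)
    config["Credentials"] = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
    }
    # Create the file with mode 0600 so other users cannot read the keys.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        config.write(handle)
    # If the file already existed, tighten its permissions as well.
    os.chmod(path, 0o600)


def read_boto_config(path):
    """Read the [Credentials] section back as a plain dict."""
    config = configparser.ConfigParser(interpolation=None)
    config.read(path)
    return dict(config["Credentials"])


# Demo against a throwaway file; the values are placeholders, not real keys.
demo_path = os.path.join(tempfile.mkdtemp(), "boto.cfg")
write_boto_config(demo_path, "EXAMPLE_KEY_ID", "EXAMPLE_SECRET")
credentials = read_boto_config(demo_path)
file_mode = stat.S_IMODE(os.stat(demo_path).st_mode)
```

Keeping the keys in a separate file like this also means the program itself never has to change when the IAM user is swapped.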

Program flow:

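import boto
bucketname = 'rafaxubucket'                  # the bucket name
filename = 'recports/myfile'                 # the key name (file name including path)
reportContent = 'report text'                # placeholder for the text to store
conn = boto.connect_s3()                     # reads the keys from the boto config file
b = conn.get_bucket(bucketname)
key = b.new_key(filename)
key.set_contents_from_string(reportContent)  # upload the content to the key
key.set_canned_acl('public-read')            # make the object publicly readable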

Explanation:

To store a file, we first need to get the bucket. The file name is called a key in S3, and the key can contain '/', which acts as a path separator. Then you write your content to the key (like writing content to a file in file I/O) and set the ACL. After that you can view the content in a browser, because the object is publicly readable and S3 serves it as a web service.
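
To make the key and browser points concrete, here is a small sketch that splits a key into its path-like prefix and name, and builds the address a browser could use for a public-read object. It assumes the virtual-hosted-style address on the default s3.amazonaws.com endpoint; buckets in other regions or with dots in their names may need a different form. The helper names `split_key` and `public_object_url` are invented for this example and are not boto functions.

```python
from urllib.parse import quote


def split_key(key_name):
    """Split a key into its path-like prefix and its final name."""
    # S3 keys are flat strings; '/' only looks like a directory separator.
    prefix, _, name = key_name.rpartition("/")
    return prefix, name


def public_object_url(bucket_name, key_name):
    """Build a browser URL for a public-read object (assumed default endpoint)."""
    # Keep '/' unescaped so the key's path-like structure stays visible.
    return "https://{}.s3.amazonaws.com/{}".format(
        bucket_name, quote(key_name, safe="/")
    )


prefix, name = split_key("recports/myfile")
url = public_object_url("rafaxubucket", "recports/myfile")
print(prefix, name)
print(url)
```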
