Today we will be talking about automating EC2 backups. Backups are boring and no one ever cares about them, until you need them. It’s at that point that you wish you had set up some sort of automated backup system, because life goes by too quickly to do anything manually that isn’t absolutely required.

First, let’s talk about Amazon EC2 backups or, in AWS lingo, snapshots. Snapshots, as they are implemented with AWS, are pretty cool. If you happen to be a VMware guy and think you’re familiar with AWS snapshots, you aren’t. Because of the whole cloud deal, Amazon is able to implement them quite differently. This difference means a snapshot causes only a slight performance hit while it is being created, and none after that. With the traditional VMware installation, having too many delta snapshots can cause host and virtual machine performance issues. Sometimes you just pray that it will work when you want to delete a bunch of them.

When creating a snapshot with Amazon, any block-level differences since the last snapshot are recorded and then dumped off to S3. Simple. The original disk is never touched and there is no performance hit of ‘writing through’ a bunch of delta disks. Additionally, since the changes are dumped into S3, you get the crazy redundancy (99.999999999% durability) associated with it. The good, or bad, part of this process is that only changed data is written off. So if you have a web server with mostly static data on it (and you should), then backups are super cheap! However, if the server has extremely dynamic data, every snapshot stores more changed blocks, so you may want to do a cost/benefit calculation to determine the optimal RPO (recovery point objective).
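
Before automating anything, it can help to see what a single snapshot request looks like. The following is a minimal sketch, not part of the backup script used later: the function name `create_snapshot` and the `AWS_CMD` override (handy for printing the command instead of running it) are my own assumptions, while `aws ec2 create-snapshot` with `--volume-id` and `--description` is the standard AWS CLI call.

```shell
# Hypothetical helper: request one snapshot of a single EBS volume.
# AWS_CMD is an assumed override so the command can be previewed
# (for example AWS_CMD=echo) without touching a real account.
create_snapshot() {
  local volume_id="$1"
  # Default description is an illustrative choice, not an AWS convention.
  local description="${2:-manual snapshot $(date +%Y-%m-%d)}"
  ${AWS_CMD:-aws} ec2 create-snapshot \
    --volume-id "$volume_id" \
    --description "$description"
}
```

Running `AWS_CMD=echo create_snapshot vol-xxxxx "before upgrade"` prints the command that would be sent. Dropping the override sends the real request.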

Now, let’s dive in!

1. First you will need to ensure that the AWS CLI tools are installed on your instance.  If you’re running Amazon Linux, they are already there.  If they aren’t, the tools can probably be installed for your distro with the following command:

sudo pip install awscli
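
If you aren't sure whether the tools are already present, a quick check saves a confusing cron failure later. This is just a sketch, and `require_command` is a name I made up for it. It relies only on the shell's built-in `command -v`.

```shell
# Hypothetical helper: report whether a command is on the PATH.
require_command() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found at $(command -v "$1")"
  else
    echo "$1 not found; install it before continuing" >&2
    return 1
  fi
}
```

For example, `require_command aws && aws --version` prints the installed CLI version only when the tool exists.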

2. Download the most excellent backup script from the amazing AWS Missing Tools repo and mark it executable.  Since I primarily use Amazon Linux, I usually just put this stuff in a scripts folder under the ec2-user account.  That said, here are the commands:

mkdir -p /home/ec2-user/scripts
cd /home/ec2-user/scripts
chmod +x

3. Add a cron job to snapshot the volume(s). The following command should be added to the ec2-user crontab (edit it with crontab -e). It runs the backup every day at 02:30 in the instance's system time zone, which is UTC by default on Amazon Linux, and purges any backups over 30 days old. For a full feature list, see this site.

30 2 * * * /home/ec2-user/scripts/ -v "vol-xxxxx vol-yyyyy" -n -k 30 -p
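
To make the 30-day retention rule concrete, here is a rough sketch of the date arithmetic behind "purge anything older than N days". It is not taken from the backup script, which may well decide this differently. The function name `snapshot_expired` is hypothetical, and it assumes GNU `date` (the `-d` option), which is what Amazon Linux ships.

```shell
# Hypothetical helper: succeed (exit 0) if a snapshot taken on $1 is more
# than $2 days old relative to $3 (defaults to today). Dates are YYYY-MM-DD.
# Assumes GNU date for the -d option.
snapshot_expired() {
  local taken="$1" keep_days="$2" today="${3:-$(date +%Y-%m-%d)}"
  local taken_s today_s
  taken_s=$(date -u -d "$taken" +%s) || return 2
  today_s=$(date -u -d "$today" +%s) || return 2
  # Whole days elapsed between the two dates, compared to the retention window.
  [ $(( (today_s - taken_s) / 86400 )) -gt "$keep_days" ]
}
```

With a 30-day window, a snapshot taken on 2024-01-01 counts as expired on 2024-02-15 (45 days old), while one taken on 2024-02-01 does not (14 days old).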


Optional/troubleshooting: If you happen to run into any problems where the instance is unable to create snapshots on its own, it could be related to an IAM permissions error. The easiest way to fix this is to create an IAM Role for the server that allows for full access to both EC2 and S3. Full access is broader than snapshots strictly need, so consider tightening the role once everything works. I might do an upcoming post on IAM. In the meantime, check the docs.
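
One way to confirm a permissions problem without creating anything is the EC2 `--dry-run` flag, which makes the CLI report `DryRunOperation` when the call would have been allowed and `UnauthorizedOperation` when it would not. The sketch below wraps that check. The function name `can_create_snapshot` and the `AWS_CMD` override are assumptions of mine, not part of the AWS CLI.

```shell
# Hypothetical helper: succeed if the current credentials may snapshot $1.
# AWS_CMD is an assumed override used to substitute a stand-in for aws.
can_create_snapshot() {
  local volume_id="$1" output
  # A dry run always exits non-zero, so inspect the message instead.
  output=$(${AWS_CMD:-aws} ec2 create-snapshot \
    --volume-id "$volume_id" --dry-run 2>&1) || true
  case "$output" in
    *DryRunOperation*) return 0 ;;
    *) printf '%s\n' "$output" >&2; return 1 ;;
  esac
}
```

Run it on the instance as `can_create_snapshot vol-xxxxx && echo "permissions look fine"`. On failure it prints whatever error the CLI returned.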

This post originally appeared on