Launch by Lunch

Databases, DevOps, and Development

AWS Tips, Tricks, and Techniques

Background

AWS is one of the most popular cloud computing platforms. It provides everything from object storage (S3), elastically provisioned servers (EC2), and databases as a service (RDS) to payment processing (DevPay), virtualized networking (VPC and AWS Direct Connect), content delivery (CloudFront), monitoring (CloudWatch), queueing (SQS), and a whole lot more.

In this post I'll be going over some tips, tricks, and general advice for getting started with Amazon Web Services (AWS). The majority of these are lessons we've learned in deploying and running our cloud SaaS product, JackDB, which runs entirely on AWS.

Billing

AWS billing is invoiced at the end of the month and AWS services are generally provided on a "per use" basis. For example, EC2 servers are quoted in $/hour. If you spin up a server for 6 hours and then turn it off, you'll only be billed for those 6 hours.

Unfortunately, AWS does not provide a way to cap your monthly expenses. If you accidentally spin up too many servers and forget to turn them off, then you could get a bit of a shock at the end of the month. Similarly, AWS charges you for total outbound bandwidth used. If you have a spike in activity to a site hosted on AWS, or just excessive usage of S3, you could end up with a sizable bill.

AWS does allow you to set up billing alerts. Amazon CloudWatch allows you to use your projected monthly bill as a metric for alerts. You can have a notification sent to you when it exceeds a preset dollar amount.
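If you'd rather script this than click through the console, here's a minimal sketch using the AWS CLI. The account ID, topic name, email address, and $100 threshold below are placeholders; note that billing metrics are only published to CloudWatch in the us-east-1 region and that billing alerts must first be enabled in your account's billing preferences.

# Create an SNS topic for the notifications and subscribe your email address to it:
$ aws sns create-topic --region us-east-1 --name billing-alerts
$ aws sns subscribe --region us-east-1 --protocol email \
    --topic-arn arn:aws:sns:us-east-1:123456789012:billing-alerts \
    --notification-endpoint you@example.com

# Alarm when the estimated monthly bill exceeds $100:
$ aws cloudwatch put-metric-alarm --region us-east-1 \
    --alarm-name billing-over-100 \
    --namespace AWS/Billing --metric-name EstimatedCharges \
    --dimensions Name=Currency,Value=USD \
    --statistic Maximum --period 21600 --evaluation-periods 1 \
    --threshold 100 --comparison-operator GreaterThanOrEqualToThreshold \
    --alarm-actions arn:aws:sns:us-east-1:123456789012:billing-alerts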

Do this immediately!

Seriously, just go add this immediately. Even better, add a couple of these at a variety of dollar figures. A good starting set is: $1/mo, $10/mo, $50/mo, $100/mo, $250/mo, $500/mo, and $1,000/mo.

If you ever end up with a runaway server or accidentally provision 10 servers instead of 1 (which is surprisingly easy when scripting automated deployments...), then you'll be happy you set up these billing alerts.

Our company, JackDB, is entirely hosted on AWS and our bills are fairly consistent month to month. Once that monthly bill stabilized, we set up billing alerts, as described above, at the expected value as well as at 1/3 and 2/3 of it. This means that on approximately the 10th and 20th of each month we get a notification that our bill is at 1/3 or 2/3 of our expected monthly spend.

If either of those alerts is triggered sooner than that, it's a sign that our monthly spend is on the rise.

Security

Security is a big deal, especially so in the cloud. Countless articles have been written about it, and the advice here is just a couple of quick points. At a later date I'd like to do a more detailed write-up.

Multi-factor Authentication

Multi-factor authentication, also known as two-factor authentication or 2FA, is an enhanced approach to authentication that requires combining multiple, separate factors of authentication. Most commonly these are something you know (ex: a password) and something you have (ex: a hardware token or a virtual token generator on your phone).

If one of these gets compromised (ex: someone steals your password), an attacker would still need the other factor to log in. A single factor alone isn't enough.

AWS supports multi-factor authentication using standard TOTP pin codes. It supports both free software tokens (ex: Google Authenticator on your smartphone) and hardware tokens ($12.99 as of Jan 2014).

Do this immediately!

There is no reason not to have this enabled and I recommend immediately enabling it. In fact, you should enable 2FA on every service you use that supports it. If you're using Google Apps or even just regular Gmail, you should enable it there as well.

SSH

It's a good idea to use unique SSH keys for unrelated projects. It makes it much easier to deprovision them later on. You should generate a new pair of SSH keys (with a long passphrase) to use with your new servers. If you have multiple people sharing access to the same servers, each person should have their own unique SSH keys.
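For example, generating a dedicated key pair for a new project might look like the following (the file name and comment are placeholders; ssh-keygen will prompt you for the passphrase):

# Generate a new 4096-bit RSA key pair, stored separately from your default key:
$ ssh-keygen -t rsa -b 4096 -C "my-project-ec2" -f ~/.ssh/my-ec2-private-key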

Rather than specifying your key files on the command line, you should add them to your ~/.ssh/config file so they'll be used automatically.

Host ec2-10-20-30-40.compute-1.amazonaws.com
  User ubuntu
  IdentityFile "~/.ssh/my-ec2-private-key"

By default, SSH will offer all of the keys in your keyring or listed in your config file to a remote server. If you have a lot of SSH keys, you may get an error from a remote server when you try to connect to it:

Too many authentication failures for username

From the remote server's perspective, each key offered counts as an authentication attempt. If you have too many SSH keys for it to try, it may never get to the correct one. To force SSH to only send the key specific to the server you are connecting to, as listed in your config file, add the following to the top of your ~/.ssh/config:

Host *
  IdentitiesOnly yes

This will force you to explicitly list the SSH key to use for every remote server. If you want to restrict this to just a subset of them, you can replace the "*" in the Host section with a wildcard matching the DNS names of the servers, for example *.example.com.

VPC

Amazon Virtual Private Cloud (VPC) is a networking feature of EC2 that allows you to define a private network for a group of servers. Using it greatly simplifies fencing off components of your infrastructure and minimizing the externally facing pieces.

The basic idea is to separate your infrastructure into two halves, a public half and a private half. The external endpoints for whatever you are creating go in the public half. For a web application this would be your web server or load balancer.

Services that are only consumed internally such as databases or caching servers belong in the private half. Components in the private half are not directly accessible from the public internet.

This is a form of the principle of least privilege and it's a good idea to implement it. If your server infrastructure involves more than one server, then you probably should be using a VPC.

Bastion Host

To access internal components in the private half of your VPC you'll need a bastion host. This is a dedicated server, sitting in the public half of your VPC, that acts as an SSH proxy for connecting to your other internal components.

Using a bastion host with a VPC greatly simplifies network security on AWS by minimizing the number of external firewall rules you need to manage. Here's how to set one up:

  1. Spin up a new server in the public half of your VPC
  2. Create a security group Bastion SG and assign it to the new server
  3. Edit the security groups for your private half servers to allow inbound access on port 22 (SSH) from Bastion SG
  4. Add your whitelisted IPs to the inbound ACL for Bastion SG (see the next section)
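Once the bastion host is up, you can configure SSH to transparently proxy connections to the private servers through it. Here's a sketch of the relevant ~/.ssh/config entries; the host names, private IP range, and key path are hypothetical:

# The bastion host itself, in the public half:
Host bastion.example.com
  User ubuntu
  IdentityFile "~/.ssh/my-ec2-private-key"

# Servers in the private half, reached by tunneling through the bastion:
Host 10.0.1.*
  User ubuntu
  IdentityFile "~/.ssh/my-ec2-private-key"
  ProxyCommand ssh -W %h:%p bastion.example.com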

An SSH proxy server uses very little CPU and practically zero disk. A t1.micro instance (the cheapest one that AWS offers) is more than enough for this. As of Jan 2014, at on-demand rates this comes out to ~$15/mo. With reserved instances you can bring this down to about $7/mo.

The added cost is nothing compared to the simplicity and security it adds to your overall environment.

Firewall Whitelists

Any server with port 22 (SSH) open to the public internet will get a lot of hacking attempts. Within 24 hours of turning on such a server you'll see plenty of entries in your SSH logs from bots trying to brute force a login. If you only allow SSH key based authentication this is a pointless exercise for them, but it's still annoying to see all the entries in the log (i.e. it's extra noise).

One way to avoid this is to whitelist the IP addresses that can connect to your server. If you have a static IP address at your office, or one that's "mostly static" (ex: most dynamic IPs for cable modems and DSL don't change very often), then you can set up the firewall rules for your servers to only allow inbound SSH access from those IPs. Other IP addresses will not even be able to tell there is an SSH server running: port scanning for it will fail since the initial TCP connection will never be established.

Normally this would be a pain to manage across multiple servers, but with a bastion host it only needs to be done in one place. Later on, if your IP address changes or you need to connect from a new location (ex: on the road at a hotel), just add your current IP address to the whitelist. When you're done, simply remove it from the list.
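Adding and removing your current IP address from the bastion's security group can also be scripted. Here's a rough sketch with the AWS CLI; the security group ID is a placeholder, and checkip.amazonaws.com is just one way to look up your current public IP:

# Allow inbound SSH from your current public IP only:
$ MY_IP=$(curl -s https://checkip.amazonaws.com)
$ aws ec2 authorize-security-group-ingress --group-id sg-12345678 \
    --protocol tcp --port 22 --cidr "$MY_IP/32"

# Remove the rule when you're done:
$ aws ec2 revoke-security-group-ingress --group-id sg-12345678 \
    --protocol tcp --port 22 --cidr "$MY_IP/32"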

Email

Amazon Simple Email Service (SES) is Amazon's send-only email service for AWS. You can use it to send email from your application or command line scripts. It includes both an AWS-specific programmatic API and a standard SMTP interface. Both cost the same (it's pay per use), but for portability purposes I recommend using the SMTP interface. That makes it easier to change your email provider down the road.
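As a quick sanity check that your SMTP credentials work, you can send a test message with nothing more than curl. This is only a sketch: the SMTP username, password, and addresses are placeholders, and it assumes the SES SMTP endpoint in us-east-1 with TLS on port 465.

# message.txt is a plain email message (From:, To:, Subject:, a blank line, then the body).
$ curl --url "smtps://email-smtp.us-east-1.amazonaws.com:465" \
    --user "SMTP_USERNAME:SMTP_PASSWORD" \
    --mail-from hello@example.com \
    --mail-rcpt someone@example.org \
    --upload-file message.txt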

If you're only sending a handful of emails then SES can be used free of charge. The first 2,000 emails per day are free when you send them from an EC2 server. For a lot of people this should be more than enough. After the first 2,000 per day it's $0.10 per 1,000 emails, plus outbound bandwidth costs.

Verification

When you first set it up you'll need to verify ownership of the email addresses you'll be sending from. For example, if you want to send email from hello@example.com then you'll need to prove you actually control example.com. This is done either by verifying that you can receive email at that address or by verifying the entire domain via DNS TXT records that prove domain ownership.

You cannot send email until this verification is complete, and it can take a little while to propagate through the system. Additionally, to prevent spammers from using SES for nefarious purposes, Amazon restricts your initial usage of SES and only gradually increases your sending limit. If you plan on sending email from your application you should set up SES immediately so that it's available when you need it.

DKIM and SPF

To ensure that your emails are actually received by your recipients and not rejected by their spam filters you should set up both DKIM and SPF for your domain. Each is a way for recipients to verify that Amazon SES is a legitimate sender of email for your domain.
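For SPF, the usual approach is a DNS TXT record on your sending domain that lists Amazon SES as a permitted sender. A hypothetical record for example.com might look something like this:

example.com.  TXT  "v=spf1 include:amazonses.com ~all"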

You can read more about setting up SPF and DKIM on SES in Amazon's documentation.

Also, if you haven't already, you should set up DKIM and SPF for your domain's regular email servers as well. If you're using Google Apps for email hosting, Google provides instructions for both.

Testing via Port 25

Once you have it set up, a simple way to test out both DKIM and SPF is using the email verification service provided by Port 25. Simply send them an email and a short while later they'll respond back with a report saying whether SPF and DKIM are properly configured. They'll also indicate whether their spam filters would flag your message as junk mail.

Note that you can use Port 25 to verify your personal email address too, not just Amazon SES. Just manually send an email to Port 25 and wait for the response.

EC2 (on the cheap)

Amazon EC2 allows you to spin up servers on demand and you only pay for what you use, billed hourly. This makes it particularly well suited to usage patterns that involve scaling up to a large number of servers for a short period of time.

The flip side of EC2's fine grained billing is that the on-demand price of the servers is more expensive than at other cloud providers. Here are a couple of tips to lower your costs.

Reserved Instances

If you are running a server that is always online, you should look into reserved instances. With reserved instances you pay an upfront fee in exchange for greatly reduced hourly rates. Amazon offers three levels of reserved instances: Light (the lowest upfront cost), Medium, and Heavy (the highest upfront cost). Each is offered in 1-year (cheaper upfront) or 3-year (more expensive upfront) terms.

Each level costs progressively more up front, but also progressively reduces the hourly cost of running a server. For example, a standard m1.small instance costs $0.06/hour on-demand. If you buy a 3-year Light reserved instance for $96, the hourly price drops to $0.027/hour. Factoring in the upfront cost and assuming the server runs 24 hours a day, it would take about 4 months to break even versus paying on-demand rates ($96 divided by the $0.033/hour savings is roughly 2,900 hours, or about four months of continuous use).

Billing for Heavy reserved instances is a bit different than Light or Medium: you'll be billed for the underlying instance regardless of whether it's actually running. This means that if you buy a 3-year Heavy reserved instance, you're agreeing to pay for running that instance 24 hours a day for the next three years.

The savings for reserved instances are anywhere from 25% to 65% over on-demand pricing, and the break-even point is anywhere from 3 months to a year. Once your AWS usage has stabilized, it's well worth the time to investigate the cost savings of buying reserved instances.

If you're unsure whether you'll still be running the same instance types a year or two from now, I suggest sticking to Light or Medium instances. The savings differential is only another 10-15%, and they allow you to more cheaply change your infrastructure down the road.

Reserved Instance Marketplace

There's also a reserved instance marketplace where you can buy reserved instances from third parties or sell those that you no longer need. Since reserved instances only affect billing, there's no difference in functionality when buying them via the third party marketplace. In fact, it's all part of the same UI in the admin console.

The only real difference with third party reserved instances is that the duration of the reservation can be just about anything. This can be very useful if you anticipate a usage term other than 1 year or 3 years. Just make sure to check the actual price, as there are usually a couple listed with wildly inflated prices compared to the Amazon offered ones.

Spot Instances

Spot instances allow you to bid on excess EC2 capacity. As long as your bid price is above the current spot price, your instance will continue to run and you'll pay the current (lower) spot price per hour. If the spot price rises above your bid, your instance may be terminated, and that termination can happen at any time.

As they can be terminated at any time, spot instances work best when application state and results are stored outside the instance itself. Idempotent operations and reproducible or parallelizable work are usually great candidates for spot instances. One of the most popular use cases is running continuous integration (CI) servers. If the CI server crashes and restarts an hour later, it's not that big of a deal; the only result we really care about is whether the final application build/test was successful.

More complicated setups are possible with spot instances as well. If you fully automate their provisioning (i.e. security groups, user data, startup scripts), you can even have them automatically register themselves via Route 53 DNS and then join an existing load balancer. In a later article I'll write about how to build such a setup, one that cheaply scales out a stateless web service via a phalanx of spot instances while remaining fault tolerant to their random termination.
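To give a rough idea of what scripted provisioning looks like, here's a sketch of requesting a single spot instance with the AWS CLI. The AMI ID, key name, security group, and bid price are placeholders, and the base64-encoded user data is just a two-line bootstrap script (#!/bin/bash followed by a call to a hypothetical /opt/bootstrap.sh) that would handle DNS registration and other startup steps:

# launch-spec.json describes the instance to launch; note that UserData must be base64 encoded.
$ cat > launch-spec.json <<'EOF'
{
  "ImageId": "ami-12345678",
  "InstanceType": "m1.small",
  "KeyName": "my-ec2-key",
  "SecurityGroupIds": ["sg-12345678"],
  "UserData": "IyEvYmluL2Jhc2gKL29wdC9ib290c3RyYXAuc2gK"
}
EOF

# Bid $0.03/hour for a single one-time spot instance:
$ aws ec2 request-spot-instances --spot-price "0.03" \
    --instance-count 1 --type "one-time" \
    --launch-specification file://launch-spec.json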

S3

S3 is Amazon's object store. It allows you to store arbitrary objects (up to 5TB in size) and access them over HTTP or HTTPS. It's designed to provide 99.999999999% durability and 99.99% availability. For what it offers, S3 is really cheap, and Amazon periodically drops its prices as well, most recently just a week ago.

The S3 API allows you to create signed URLs that provide fine grained access to S3 resources with custom expirations. For example, an object stored in S3 that is not publicly readable can be made temporarily accessible via a signed URL. This is a great way for web apps to give users access to objects stored in private S3 buckets.
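Most of the AWS SDKs can generate these signed URLs, and recent versions of the AWS CLI can as well. A quick sketch with a hypothetical bucket and key:

# Generate a URL for a private object that expires in one hour (3600 seconds):
$ aws s3 presign s3://my-s3-bucket/private/report.pdf --expires-in 3600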

GPG

If you're using S3 to store sensitive data (ex: database backups) then you should encrypt the data before you upload it to S3. That way only encrypted data is stored on Amazon's servers.

The easiest way to do this is using GPG. Generate a GPG key pair for your backup server on your local machine and add the public key of the key pair to the server. Then use that public key to encrypt any data before uploading to S3.

# Set the bucket/path on S3 where we will put the backup:
$ S3_PATH="s3://my-s3-bucket/path/to/backup/backup-$(date +%Y%m%d)"

# Temp file for encryption:
$ GPG_TEMP_FILE=$(mktemp)

# Encrypt it with GPG:
$ gpg --recipient aws.backup@example.com --output "$GPG_TEMP_FILE" --encrypt mydata.foo

# Upload it via s3cmd:
$ s3cmd put "$GPG_TEMP_FILE" "$S3_PATH"

# Clean up temp file:
$ rm "$GPG_TEMP_FILE"

The only downside to encrypting data prior to storing it on S3 is that you will need to decrypt it to read it. You can't hand out the S3 URL to someone else to download the data (ex: a direct link to a user's content that you're storing on their behalf) as they will not be able to decrypt it. For backups this is the right approach, as only you (or your company, your team, ...) should be able to read your data.
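Restoring is just the reverse of the backup script above. A sketch, assuming the same key pair and a hypothetical backup object name:

# Download the encrypted backup from S3:
$ s3cmd get "s3://my-s3-bucket/path/to/backup/backup-20140101" backup.gpg

# Decrypt it with the private half of the backup key pair (you'll be prompted for its passphrase):
$ gpg --output mydata.foo --decrypt backup.gpg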

The S3 CLI tool s3cmd includes an option to specify a GPG key. If set, it will automatically encrypt all objects you PUT to S3. It can be a convenient option as you only need to set it in one place (the s3cmd config file). I prefer explicitly adding the GPG steps to my backup scripts though.

Encryption At Rest

S3 also supports what is referred to as Server Side Encryption. With this option enabled, Amazon will encrypt your data before persisting it to disk, and will transparently decrypt it prior to serving it back to a valid S3 request. The documentation describes how they keep the decryption keys separate from the S3 API servers and request them on demand.

Since Amazon can still read your data, I don't consider this to be that useful of a feature. If you want to ensure that nobody else can read your data, then you need to do the encryption yourself. If you trust a third party to do it, then by definition that third party is able to read your unencrypted data.

Still, it doesn't hurt to enable it, and apparently there is no real performance penalty for doing so.

Object Expiration

Once you start using S3 for backups you'll notice that your S3 bill will grow fairly linearly (or faster!). By default, S3 persists objects forever so you'll need to take some extra steps to clean things up.

The obvious solution is to delete objects as they get old. If you're using S3 for automated backups then you can have your backup script delete older entries. With additional logic you can have your backup scripts keep various ages of backups (ex: monthly for a year, weekly for a month, daily for a week).

A simpler, albeit coarser, approach is to use S3 object expiration. It allows you to define a maximum age for S3 objects; once that age is reached, an object is automatically deleted with no user interaction. For example, you could set a 6-month expiration on your S3 backup bucket and it will automatically delete any entries older than that.
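Expiration rules can be set from the S3 console or scripted. Here's a rough sketch using the AWS CLI; the bucket name and prefix are placeholders:

# lifecycle.json: expire everything under the backups/ prefix after 180 days.
$ cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "expire-old-backups",
      "Prefix": "backups/",
      "Status": "Enabled",
      "Expiration": { "Days": 180 }
    }
  ]
}
EOF

$ aws s3api put-bucket-lifecycle-configuration --bucket my-s3-bucket \
    --lifecycle-configuration file://lifecycle.json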

Warning: If you use S3 object expiration make sure that your backups are actually working. It will delete your old objects regardless of whether your latest backups are actually valid. Make sure to test your backups regularly!

Object expiration is also a simple way to delete all the objects in an S3 bucket. From the AWS S3 console, simply set the expiration for the entire bucket to 1 day and the objects will be automatically deleted after a day. S3 expiration is asynchronous, so you may still see the objects in S3 for a short while past the 24 hours, but you will not be billed for them.

Glacier

Glacier is Amazon's long term, persistent storage service. It's about an order of magnitude cheaper than S3 but with much slower access times (on the order of multiple hours).

Rather than deleting old S3 objects, you can configure S3 to automatically transition them to Glacier. The storage cost will be about 10x less, and if you ever really need the data (ex: you accidentally destroyed all your S3 backups) you can slowly restore it from Glacier.
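This uses the same lifecycle mechanism as object expiration. A hypothetical rule (applied the same way as the expiration example above) that archives backups to Glacier after 30 days instead of deleting them:

{
  "Rules": [
    {
      "ID": "archive-old-backups",
      "Prefix": "backups/",
      "Status": "Enabled",
      "Transitions": [ { "Days": 30, "StorageClass": "GLACIER" } ]
    }
  ]
}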

Final Thoughts

This post ended up longer than originally planned, but it's still a very incomplete list. There are plenty of other tips, tricks, and techniques to use with AWS; this is just a (hopefully helpful) start.

Do you have something to add to this list, a better way of solving some of these problems, or just want to be notified when I have a new post? Let me know!