Don't gift your money to Jeff - rein(deer) in your AWS bill this Christmas (and all year round)
Wed, Dec 16, 2015
Christmas comes but once a year
Now it’s here, now it’s here
Bringing lots of joy and cheer
Tra la la la la
You and me and he and she
And we are glad because
because because because
There is a Santa Claus
Christmas comes but once a year
Now it’s here, now it’s here
Bringing lots of joy and cheer
Tra la la la la
Unlike Christmas, the AWS bill comes every month and does not generally bring joy and cheer. In this post I want to share details on two tools that we use at Fairfax Media to reduce our AWS costs all year round, and a simple action you can take to reduce your bill even more over the holiday period.
Like so many other companies, we have been through the AWS boom times - ease of use leading to rapid expansion of cloud resources and the resultant billing - and the bust - when those bills start becoming so significant they attract attention from the finance department. To help reduce our costs we have developed a couple of tools and both have been open sourced on our GitHub page.
The tools were both written by the very smart David Baggerman and are called ‘Cloudcycler’ and ‘Flywheel’.
Cloudcycler allows you to shut down and start up AWS resources on a scheduled basis. There are quite a lot of tools around that do this but none met our needs. In particular Cloudcycler allows us to:
- shut down EC2 instances
- shut down RDS instances
- shut down entire CloudFormation stacks (and all their resources)
- snapshot RDS volumes on shutdown
- start up resources, including from snapshots (e.g. restoring RDS data)
- start up resources from CloudFormation templates
We use Cloudcycler to turn off the majority of our non-production environments outside of office hours. The default configuration is to only run these environments from 08:00 to 17:00 on weekdays - with some exceptions for environments that are used by offshore development teams. This produces significant savings. A 30 day month has 720 hours in total. Of those 30 days, around 22 are working days, so we now run these environments for 22 (days) * 9 (hours) = 198 hours - a saving of 522 hours, which brings the cost down to roughly 28% of the full monthly running cost (198 / 720 = 27.5%).
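The arithmetic above is easy to rerun for your own schedule. This is a minimal sketch, not part of Cloudcycler; the function name and its defaults are made up for illustration, and it assumes cost scales linearly with running hours.

```python
def monthly_running_hours(days_in_month=30, working_days=22, hours_per_day=9):
    """Return (total, scheduled, saved, fraction) for one environment.

    Assumes cost is proportional to running hours, which ignores
    storage and other charges that continue while instances are off.
    """
    total = days_in_month * 24              # hours if left on all month
    scheduled = working_days * hours_per_day  # hours on the office-hours schedule
    saved = total - scheduled
    fraction = scheduled / total            # share of the full-time cost still paid
    return total, scheduled, saved, fraction


total, scheduled, saved, fraction = monthly_running_hours()
print(f"{scheduled} of {total} hours, saving {saved} hours ({fraction:.1%} of full cost)")
```

Changing `hours_per_day` or `working_days` shows quickly how much an extended schedule for an offshore team would cost.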
Flywheel takes this a step further. Turning on non-production resources on a scheduled basis is better than having them running full time, but even better would be to not have them running at all until they are needed. Flywheel does this by being an EC2 ‘proxy’. Instances controlled by Flywheel are by default shut down. Flywheel provides a simple web site with one-click startup for those instances, so the end user, such as a developer or tester, can easily start up the instance on demand. After a (configurable) time period the instance is automatically shut down again. A particular use case is for test instances which are only required to be on for an hour or so during the day.
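The start-on-demand, stop-after-a-timeout idea can be sketched in a few lines. This is not Flywheel's code and makes no AWS calls; the class, its method names and the one hour default are assumptions chosen only to illustrate the behaviour described above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class OnDemandInstance:
    """Hypothetical in-memory model of an instance that is off by default."""
    name: str
    timeout: timedelta = timedelta(hours=1)  # assumed default, configurable
    stop_at: Optional[datetime] = None       # None means the instance is stopped

    def start(self, now: datetime) -> None:
        # The "one click" action: run from now until the timeout expires.
        self.stop_at = now + self.timeout

    def is_running(self, now: datetime) -> bool:
        return self.stop_at is not None and now < self.stop_at

    def tick(self, now: datetime) -> None:
        # Called periodically; shuts the instance down once the timeout has passed.
        if self.stop_at is not None and now >= self.stop_at:
            self.stop_at = None


instance = OnDemandInstance("test-server")
clicked = datetime(2015, 12, 16, 9, 0)
instance.start(clicked)
instance.tick(clicked + timedelta(hours=1))
print(instance.is_running(clicked + timedelta(hours=1)))
```

A real implementation would replace the `stop_at` bookkeeping with calls to start and stop the actual EC2 instance.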
Using both of these tools has significantly reduced our AWS running hours and therefore our costs. This graph, taken from CloudCheckr (our cost & security tool of choice) shows the impact that Cloudcycler has on our running instance hours over time.
Yes, it is an overall upward trend as we add more projects to AWS but you can clearly see the effect of turning off resources (particularly clear are the dips at weekends).
On to the special December/January cost saving tip.
Fairfax Media encourages staff to take leave around Christmas. There are a few reasons for this - it is partly a relic of the old print days when presses were run with a skeleton staff over Christmas, partly a concern for staff welfare, and no doubt partly to reduce the financial liability from some staff having accrued high amounts of leave. Whatever the reason, we can use this to our advantage in saving AWS costs. We also, like many large companies, have a change freeze in place over the holidays.
Adding all these together, we spoke to our development teams to let them know that there would be a new Cloudcycler schedule in place from Monday 21st December to Friday 8th January 2016 inclusive. During that time, environments will be turned off completely, all day. This is an opt-out system - by default all systems are in scope and exceptions can be added on demand, for the small number of projects which will be active over this time.
This has the potential to save 15 (weekdays) * 9 (hours) = 135 hours of running costs per environment over the Christmas period - a good saving gained just by sending a few emails and changing a configuration file.
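The 15 day figure is the number of weekdays in the shutdown window, which can be checked with the standard library. A minimal sketch; the helper name is made up, and the 9 hours per day comes from the office-hours schedule described earlier.

```python
from datetime import date, timedelta


def weekdays_between(start: date, end: date) -> int:
    """Count Monday-to-Friday days from start to end, both inclusive."""
    span = (end - start).days + 1
    return sum(
        1
        for offset in range(span)
        if (start + timedelta(days=offset)).weekday() < 5  # 0-4 are Mon-Fri
    )


shutdown_days = weekdays_between(date(2015, 12, 21), date(2016, 1, 8))
hours_saved = shutdown_days * 9  # hours each environment would otherwise run
print(shutdown_days, hours_saved)
```

The count treats the public holidays in that window as ordinary weekdays, as the figure in the text does.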
Why not contact your development teams and see if you can do the same?
P.S. excluding Christmas there are seven weekday public holidays in New South Wales, Australia in 2016 - I would welcome a pull request to the Cloudcycler repo with changes to make it read those dates (from a static file and/or a web service) and extend the scheduler so those days are treated as weekend days.
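To make the request concrete, here is a rough sketch of the check such a change would need. It is written in Python for illustration and is not Cloudcycler code; the function name, the 08:00 to 17:00 defaults and the idea of passing holidays in as a set of dates are all assumptions. The single date shown is Australia Day 2016; a real list would be loaded from a file or web service.

```python
from datetime import date, datetime, time

# Illustrative only: one known 2016 NSW public holiday (Australia Day).
PUBLIC_HOLIDAYS = {date(2016, 1, 26)}


def should_be_running(now: datetime, holidays=PUBLIC_HOLIDAYS,
                      start=time(8, 0), stop=time(17, 0)) -> bool:
    """Return True if an environment should be up at `now`.

    Public holidays are treated exactly like weekend days.
    """
    if now.weekday() >= 5:        # Saturday or Sunday
        return False
    if now.date() in holidays:    # weekday public holiday
        return False
    return start <= now.time() < stop


print(should_be_running(datetime(2016, 1, 26, 10, 0)))  # Australia Day
print(should_be_running(datetime(2016, 1, 27, 10, 0)))  # ordinary Wednesday
```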