We here at PipelineDeals also abandoned EBS-backed instances after their second outage.
Instead we rely on instances that use an instance-store root device. During the EBS outage, our instance-store servers did not have any issues, while our EBS-backed servers really struggled throughout the day, with crazy high loads.
We still use EBS for database backups via ec2-consistent-snapshot. The slave DB that performs this is not production-facing.
We have two separate Chef recipes for our DBs:
One is for a full-time slave. This recipe sets up the DB to use the EBS volume.
The other is for a slave that will be promoted to master. In this case, we do a little extra legwork to make a bit-by-bit copy of a recent EBS snapshot onto the ephemeral disk.
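For anyone curious what that bit-by-bit copy amounts to, here is a rough Python sketch of the idea, not our actual recipe. It assumes a volume created from the snapshot is already attached, and it reads the source in fixed-size blocks and writes them to the destination unchanged, much like `dd` would. The function name `copy_block_device` and the device paths in the comment are hypothetical, and I have only reasoned about it against regular files standing in for devices.

```python
import os


def copy_block_device(src_path, dst_path, block_size=4 * 1024 * 1024):
    """Copy src_path to dst_path in fixed-size blocks and return bytes copied.

    Every block is written exactly as it was read, so the destination ends
    up as a byte-for-byte image of the source.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                # An empty read means the end of the source was reached.
                break
            dst.write(block)
            copied += len(block)
        # Push buffered data to the disk before reporting success.
        dst.flush()
        os.fsync(dst.fileno())
    return copied


# Hypothetical usage; these device names are assumptions, not our real layout:
# copy_block_device("/dev/xvdf", "/dev/xvdb")
```

The point of copying raw blocks rather than files is that the database volume arrives on the ephemeral disk exactly as it was snapshotted, so the promoted master no longer depends on EBS at run time.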
Great post! It's fascinating to see that other people have come up with the exact same solution -- we also run a "skeleton crew" server setup in us-west.
http://devblog.pipelinedeals.com/pipelinedeals-dev-blog/2012...