News broke a couple of days ago that Amazon Web Services is adding Amazon Glacier to its product catalog. Glacier is designed for one thing only: long-term archival of data that isn’t routinely accessed. It’s not optimized for speed; it is optimized for price. From AWS’s blurb:
Amazon Glacier is an extremely low-cost storage service that provides secure and durable storage for data archiving and backup. In order to keep costs low, Amazon Glacier is optimized for data that is infrequently accessed and for which retrieval times of several hours are suitable. With Amazon Glacier, customers can reliably store large or small amounts of data for as little as $0.01 per gigabyte per month, a significant savings compared to on-premises solutions.
Currently much archiving occurs on expensive kit, primarily because the price of cloud storage has been too high for this type of data. By slashing the price considerably, AWS makes it viable to move archiving to the cloud.
There are two ways of looking at this: from a customer perspective and from the perspective of AWS itself. For customers this is a compelling offering, since it gets the cost of archival storage down to ridiculously low levels. While it is true that customers still need to think about the practicalities of getting all that data into Glacier (sending disks by FedEx, anyone?), once the data is there the pricing makes it viable to archive the ever-increasing amounts of data an organization holds cheaply and easily, without having to think about the underlying infrastructure ever again. There go the decades-old approaches of archiving on tape. As Lydia Leong opined:
@samj I think Glacier’s price-point is going to make a lot of people re-think their archiving strategy.
— Lydia Leong (@cloudpundit) August 22, 2012
The more interesting angle, however, is that of AWS. Moving significant volumes of data onto its own service further increases the economies of scale that Amazon enjoys, allowing it to compete on cost with other vendors across its other services. It already enjoys some of the largest scale benefits of any cloud vendor, and this drives those unit costs further south.
Of course Glacier is, as yet, unproven. AWS needs to show that it is resilient, and enterprises need to understand that Glacier is no greyhound: data stored on Glacier takes between three and five hours to access. Add to that the fact that retrieval is only free for 5% of total storage per month, after which users are charged a retrieval fee, and it becomes obvious that TCO calculations for switching to Glacier are going to be fairly complex.
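To make the shape of that calculation concrete, here is a rough back-of-the-envelope sketch in Python. It uses the two figures quoted in this post ($0.01 per gigabyte per month for storage, and free retrieval of up to 5% of stored data per month). The retrieval fee itself is not given here, so `retrieval_fee_per_gb` is a purely hypothetical flat rate supplied by the caller; real retrieval pricing may well be structured differently, and a full TCO model would also need transfer and migration costs.

```python
# Rough monthly cost sketch for Glacier-style archival storage.
# The two constants come from the figures quoted in this post; the
# flat per-GB retrieval fee is a hypothetical simplification.

STORAGE_PRICE_PER_GB_MONTH = 0.01  # USD, from AWS's blurb
FREE_RETRIEVAL_FRACTION = 0.05     # 5% of total storage per month


def monthly_glacier_cost(stored_gb, retrieved_gb, retrieval_fee_per_gb):
    """Return (storage_cost, retrieval_cost) in USD for one month."""
    storage_cost = stored_gb * STORAGE_PRICE_PER_GB_MONTH
    # Only the portion retrieved beyond the free allowance is billed.
    free_gb = stored_gb * FREE_RETRIEVAL_FRACTION
    billable_gb = max(0.0, retrieved_gb - free_gb)
    retrieval_cost = billable_gb * retrieval_fee_per_gb
    return storage_cost, retrieval_cost


if __name__ == "__main__":
    # 10 TB archived, with an assumed (hypothetical) fee of $0.05 per GB.
    for retrieved in (400, 1500):
        storage, retrieval = monthly_glacier_cost(10_000, retrieved, 0.05)
        print(f"retrieve {retrieved} GB: "
              f"storage ${storage:.2f}, retrieval ${retrieval:.2f}")
```

With 10 TB stored, the free allowance is 500 GB a month, so the first scenario incurs no retrieval charge while the second is billed for 1,000 GB. The point is less the numbers than the dependence on access patterns: the same archive can cost very different amounts depending on how often it is read back.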
Also interesting is the indication from AWS that over time it will allow the seamless movement of data between S3 and Glacier storage based on data lifecycle policies. That could prove a compelling workflow aid for organizations.