Rainstor, formerly based in the UK under the name Clearspace Software Ltd., is announcing the release of a new version of its product, Rainstor 4. With this release, the company, now based in San Francisco, is positioning itself as a big player in the data retention and retrieval market for Big Data. Building on its expertise in online information preservation, Rainstor is now helping companies take advantage of low-cost cloud based storage for their data retention needs. Most companies in the Big Data space focus on the analytics side. Rainstor hopes to establish itself as a leader in the data retention, retrieval and compliance space, and its focus lies entirely on solving this problem.
Organizations across various industries are generating extremely large amounts of data. In this era of big data, not all of that data is actively used. In the case of transactional data, only a part is actively used and the rest sits static without being accessed. Over time, 90% of this data becomes static, with only 10% being actively used by the organization. For machine generated data such as call logs, almost all of the data is static from the moment it is created. Archiving this data in traditional databases is not only difficult to manage but also wastes resources, costing organizations huge amounts of money. Clearly, there is a need for better and more cost effective technologies.
A better approach is to use cloud based storage to archive the static data. However, some issues need to be addressed before cloud storage can be used for data retention and retrieval:
- Storing huge volumes of data in the cloud could result in big storage costs negating any cost advantage of cloud based storage
- Moving huge amounts of data is not only difficult but also expensive due to bandwidth costs. This calls for proper compression technology coupled with smart de-duplication of repetitive strings
- Not every organization is comfortable putting its data in the cloud. Some are just unnecessarily worried about security in the cloud, while others face regulatory requirements that demand strict compliance. This calls for smart security options
- The total cost of ownership (TCO) should be really low
Rainstor is clearly taking this approach to the storage of historical data in this Big Data era. Designed to run either inside an organization’s firewall or in the cloud, its technology combines extreme compression with smart, pattern based data de-duplication to achieve anything from a 40:1 reduction (in the case of transactional data) to 100:1 (in the case of machine generated data). Once the data is reduced, it can be encrypted and stored in a private cloud like EMC Atmos or a public cloud like AWS. The encryption key is stored inside the firewall, so the data stays secure even inside public clouds. This size reduction is only a part of the story. The most interesting thing about Rainstor’s technology is that organizations can query the historical data stored in private or public clouds using SQL or BI tools without any need to re-inflate the compressed data. It is also possible to set rules on the stored data for compliance needs, including the ability to automatically delete data once the requisite retention period is over.
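To make the idea of querying de-duplicated data without re-inflating it more concrete, here is a minimal sketch in Python. It uses plain dictionary encoding, a generic technique, and is not Rainstor's actual algorithm, which the announcement does not describe. The function names `encode_column` and `count_equal` and the sample values are hypothetical.

```python
def encode_column(values):
    """Dictionary-encode a column: each distinct value is stored only once."""
    dictionary = []   # distinct values, in first-seen order
    index = {}        # value -> position in `dictionary`
    codes = []        # one small integer per record
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def count_equal(dictionary, codes, target):
    """Count records equal to `target` without rebuilding the original column."""
    if target not in dictionary:
        return 0
    code = dictionary.index(target)
    # The comparison runs on the integer codes, not on the original strings.
    return sum(1 for c in codes if c == code)


# Hypothetical machine generated data: a status field that repeats heavily.
statuses = ["OK", "OK", "FAIL", "OK", "TIMEOUT", "OK", "FAIL"]
dictionary, codes = encode_column(statuses)
fail_count = count_equal(dictionary, codes, "FAIL")
```

The more often values repeat, as they tend to in call logs and similar machine generated data, the smaller the dictionary stays relative to the number of records, which is one plausible reason such data compresses better than transactional data.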
Two of the biggest advantages of Rainstor's technology are
- The ease with which large volumes of data can be moved to private or public clouds. There is no need to send media by FedEx to Amazon Web Services
- Extremely high cost savings, because cloud based storage costs less, the extreme compression greatly reduces the amount of storage needed, and there is no need for an admin to manage the data, maintain indexes, etc.
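As a rough illustration of what the quoted reduction ratios mean for the stored footprint, the arithmetic can be sketched as follows. The 100 TB raw volume is a hypothetical figure chosen for the example; only the 40:1 and 100:1 ratios come from the announcement.

```python
def reduced_size_tb(raw_tb, ratio):
    """Size after reduction, for a ratio of `ratio`:1."""
    return raw_tb / ratio


def space_saved_fraction(ratio):
    """Fraction of the raw volume that no longer needs to be stored or moved."""
    return 1 - 1 / ratio


raw_tb = 100.0  # hypothetical raw archive size in terabytes

transactional_tb = reduced_size_tb(raw_tb, 40)    # 40:1, transactional data
machine_tb = reduced_size_tb(raw_tb, 100)         # 100:1, machine generated data

print(f"Transactional: {transactional_tb} TB stored, "
      f"{space_saved_fraction(40):.1%} saved")
print(f"Machine generated: {machine_tb} TB stored, "
      f"{space_saved_fraction(100):.1%} saved")
```

Under these assumptions, 100 TB of raw data shrinks to 2.5 TB at 40:1 and to 1 TB at 100:1, which reduces both the storage bill and the volume that has to cross the network.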
The new release, Rainstor 4, comes with superior data management, including record level expiry, legal hold by record, automatic deletion of records based on business rules, and the ability to group and tag records for efficient future discovery. This version delivers up to a 50% increase in data ingestion and query performance. Rainstor has also added support for Windows, with support for other platforms expected in the near future.
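The interplay between record level expiry and legal hold can be sketched as below. This is an illustration of the general idea only, not Rainstor's interface: the `Record` fields, the `purge` function and the sample dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Record:
    record_id: str
    created: date
    retention_days: int        # hypothetical per-record retention period
    legal_hold: bool = False   # a held record must never be auto deleted


def is_expired(record, today):
    """True once the record's retention period has fully elapsed."""
    return today > record.created + timedelta(days=record.retention_days)


def purge(records, today):
    """Return the records to keep: those on legal hold or not yet expired."""
    return [r for r in records if r.legal_hold or not is_expired(r, today)]


records = [
    Record("a", date(2010, 1, 1), 365),                   # expired, deleted
    Record("b", date(2010, 1, 1), 365, legal_hold=True),  # expired but held
    Record("c", date(2011, 6, 1), 365),                   # still in retention
]
kept = purge(records, today=date(2011, 12, 1))
kept_ids = [r.record_id for r in kept]
```

Checking the legal hold before the expiry rule means a hold always wins, which is the behavior a compliance setting would normally require.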