On the subject of old and new technologies in data storage

16.12.2016

Just a few days ago, I came across the article “Storage Market: Out With The Old, In With The New”. It’s about new business models and products poised to oust today’s prominent industry players. I’d rather not give you any spoilers – the article is moderately sized and makes for a nice read. However, I’d like to add my two cents on the subject.

Long story short, I believe most of the features the article refers to as future technology are, in fact, already part of today’s development cycle at many data storage companies out there.

Some experts predicted that SSDs would reach the price level of HDDs by 2017. How does this impact the industry landscape? All in all, SSD technology allows for greater savings. It’s not just the hardware costs we are talking about; infrastructure expenses like rack space, electricity and cooling should be factored in as well. Following the ergonomic trend, a new Samsung storage device with 512 GB capacity weighs a minuscule one gram. That’s a nice way to trim the power and floor-space budgets! By the same token, SSDs are orders of magnitude more cost-effective than HDDs in terms of $/IOps.
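
To make the $/IOps point concrete, here is a back-of-the-envelope comparison. The prices and IOPS figures below are made-up ballpark values for illustration only, not vendor data:

```python
# Illustrative $/GB vs $/IOps comparison; all prices and IOPS figures
# are rough, assumed ballpark values, not quotes for real products.
drives = {
    "HDD (7.2k SATA)": {"price_usd": 250.0, "capacity_gb": 4000, "iops": 150},
    "SSD (SATA)":      {"price_usd": 400.0, "capacity_gb": 1000, "iops": 75000},
}

for name, d in drives.items():
    cost_per_gb = d["price_usd"] / d["capacity_gb"]
    cost_per_iops = d["price_usd"] / d["iops"]
    print(f"{name}: ${cost_per_gb:.3f}/GB, ${cost_per_iops:.4f}/IOps")
```

Even with these crude numbers, the HDD wins on $/GB while the SSD wins on $/IOps by a factor of several hundred – which is exactly why the right metric depends on the workload.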

If we consider the cost per GB, there is a whole range of scenarios depending on the workload. Disks optimized for different access patterns show a wide spread of per-GB cost figures, driven by variables such as spare block size, buffers and controller performance. Moreover, disk reliability does not necessarily translate into durability. Disks that beat regular HDDs on reliability parameters (e.g., RBER or MTBF) can be put out of service within a week if used improperly.
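
A rough wear-out estimate shows how quickly that can happen. Every figure in this sketch is an illustrative assumption, not the specification of any particular drive:

```python
# Rough wear-out estimate: how long an SSD's rated write endurance lasts
# under a sustained write-heavy load. All figures below are assumptions.
capacity_tb = 0.5            # drive capacity
dwpd = 0.3                   # rated Drive Writes Per Day (a read-optimized model)
warranty_years = 5
rated_tbw = capacity_tb * dwpd * 365 * warranty_years   # total TB writable

sustained_write_mb_s = 400   # drive misused under a constant write stream
tb_written_per_day = sustained_write_mb_s * 86400 / 1e6

print(f"Rated endurance: {rated_tbw:.0f} TBW")
print(f"Worn out in about {rated_tbw / tb_written_per_day:.0f} days at this load")
```

A read-optimized SSD with excellent MTBF figures still burns through its rated endurance in roughly a week under this kind of load.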

It’s crucial to select the right hardware for specific tasks and to take care of the data management software. TCO is influenced by a number of factors: quality of service (QoS), RAID operation in degraded mode, write optimization techniques, support for on-the-fly compression and deduplication, tackling write amplification – the extra internal data transfers triggered by each write (a setback for SSDs, whose cells survive a limited number of rewrites) – and so on.
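
As a rough illustration of the write amplification issue, the sketch below computes the write amplification factor (WAF) and the host-visible endurance it leaves. The write volumes and rated endurance are hypothetical:

```python
# Write amplification factor (WAF): internal NAND writes divided by host
# writes. The volumes below are assumed values for a small random-write load.
host_gb_written = 100.0   # data the application actually wrote
nand_gb_written = 350.0   # data the controller wrote to flash (GC, metadata, parity)

waf = nand_gb_written / host_gb_written
rated_tbw = 300.0         # hypothetical rated endurance of the drive
effective_tbw = rated_tbw / waf

print(f"WAF = {waf:.1f}; host-visible endurance is only about {effective_tbw:.0f} TBW")
```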

Another vital metric of Flash and SSD efficiency is sustained performance at consistently low latency. Many vendors play it cunning by publishing benchmark results obtained with zero-fill writes while hardware compression is enabled, with the tests run in ideal conditions and the buffer doing all the heavy lifting. Such test sets have little to do with reality.
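
A quick way to see why zero-fill benchmarks mislead when compression is on: a zero-filled buffer compresses to almost nothing, so the device hardly touches the flash, whereas realistic, incompressible data does not shrink at all. The sketch below uses zlib merely as a stand-in for the drive’s hardware compression:

```python
import os
import zlib

# Compare how well a zero-filled buffer and a random (incompressible)
# buffer compress; zlib stands in for the drive's hardware compression.
block = 1 << 20  # 1 MiB

zero_buf = b"\x00" * block
random_buf = os.urandom(block)

for name, buf in (("zero-fill", zero_buf), ("random", random_buf)):
    ratio = len(buf) / len(zlib.compress(buf))
    print(f"{name:9s}: compression ratio of roughly {ratio:.1f}x")
```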

When building enterprise solutions, particularly in cloud environments where we face the IO Blender effect, it’s key to evaluate average performance under a latency cap. HGST drives, for one, do a great job under these conditions: by committing to QoS at the device level, they greatly simplify the construction of All-Flash and All-SSD arrays.
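
To illustrate why an average alone is not enough, here is a small sketch that contrasts mean latency with the 99th percentile under a latency cap. The samples are synthetic, not measurements of any real device:

```python
import random
import statistics

# Synthetic latency trace: mostly fast I/Os plus rare long stalls
# (e.g. garbage collection). Mean looks fine; the tail does not.
random.seed(1)
latencies_ms = [random.expovariate(1 / 0.4) for _ in range(100_000)]  # mostly ~0.4 ms
latencies_ms += [random.uniform(20, 50) for _ in range(2_000)]        # rare stalls

mean = statistics.mean(latencies_ms)
p99 = sorted(latencies_ms)[int(0.99 * len(latencies_ms))]
cap_ms = 5.0
within_cap = sum(l <= cap_ms for l in latencies_ms) / len(latencies_ms)

print(f"mean = {mean:.2f} ms, p99 = {p99:.2f} ms, "
      f"{within_cap:.1%} of I/Os within the {cap_ms} ms cap")
```

A drive (or array) that commits to QoS keeps that tail bounded, which is what matters once dozens of virtual machines blend their I/O streams together.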

The RAIDIX solutions address the aforementioned issues in a number of ways. For instance, we provide fast access to the deduplication index and its cache for accelerated data reconstruction. With SSD devices, we can optimize the use of the hash function and resolve collisions (different data producing identical hash values) by byte-by-byte comparison of the detected duplicates. Whereas hard drives need considerable time and effort to process such redundant requests, SSDs show no tangible performance decline from the extra read operations.
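
Here is a minimal sketch of that idea – a content hash as the index key, with a byte-by-byte check on hash matches before a block is deduplicated. It illustrates the technique only and is not the actual RAIDIX implementation:

```python
import hashlib

# Minimal inline-deduplication sketch: blocks are indexed by a content hash;
# on a hash match, a byte-by-byte comparison confirms the blocks are truly
# identical before the write is deduplicated.
class DedupIndex:
    def __init__(self):
        self._by_hash = {}  # digest -> list of stored blocks with that digest

    def store(self, block: bytes) -> bytes:
        digest = hashlib.sha256(block).digest()
        for candidate in self._by_hash.get(digest, []):
            if candidate == block:      # byte-by-byte collision check (extra read)
                return candidate        # duplicate found, reuse the stored block
        self._by_hash.setdefault(digest, []).append(block)
        return block                    # new unique block stored

index = DedupIndex()
first = index.store(b"A" * 4096)
second = index.store(b"A" * 4096)
print(first is second)  # True: the second write was deduplicated
```

The byte-by-byte verification costs an extra read per candidate, which is cheap on SSDs but noticeably slower on spinning disks – hence the asymmetry mentioned above.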

Software-defined storage is old news, too. However, it makes sense to revisit the approaches to solution management. The emerging StarRAIN technology by RAIDIX will address the daunting enterprise challenges of storage, performance and serviceability. Speaking of new program interfaces, the key trend here is eliminating the legacy layers in the write path that generate unwanted latency. The day is not far off when software will communicate with data storage directly through an API, bypassing the file system and block device drivers. The RAIDIX engineers are on it!