Software-defined Data Storage Systems: Stepping Stone to the Storage of Tomorrow

24.11.2016

By Rufat Ibragimov

This blog post is about software-defined storage (SDS) and its development prospects. I don’t claim a monopoly on the truth, but I’d like to offer a sneak peek into the future of data storage from a short- to mid-term perspective. I’ll start off with a brief overview of the current situation and SDS growth forecasts.

Many IT specialists I have talked with select data storage systems with a service horizon of two to three years. According to IDC, the volume of global data will double every two years and reach 44 zettabytes, or 44 trillion gigabytes, by 2020. This boils down to more than 5 TB for each earthling, including babies (!).
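The per-person figure is easy to check. The sketch below assumes decimal (SI) units and a world population of about 7.6 billion; the population figure is an assumption for illustration and is not stated in the IDC forecast quoted above.

```python
ZETTABYTE = 10**21  # bytes, decimal (SI) prefix
TERABYTE = 10**12
GIGABYTE = 10**9

def per_capita_terabytes(total_zettabytes: float, population: float) -> float:
    """Split a global data volume evenly across a population, in TB."""
    return total_zettabytes * ZETTABYTE / population / TERABYTE

# 44 ZB expressed in gigabytes: 44 * 10**12, i.e. 44 trillion GB.
total_gb = 44 * ZETTABYTE // GIGABYTE

# Assumed population (hypothetical round figure): 7.6 billion people.
share_tb = per_capita_terabytes(44, 7.6e9)
print(f"{total_gb:,} GB in total, {share_tb:.1f} TB per person")
```

With a larger population estimate the share drops, but it stays above 5 TB for any figure below 8.8 billion.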

It’s pretty easy to see how the figures add up. Just 10 years ago, an amateur digital camera (say, the Canon EOS 350D) had an 8 MP sensor and a photo size of 7.5 MB. Nowadays, the Canon EOS 750D has a 24 MP sensor, and the picture size is 30 MB. Simple math reveals that photo size has quadrupled in the course of 10 years, which means it has doubled every five years.
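For readers who want to rerun the camera arithmetic, here is a minimal sketch. It assumes steady exponential growth between the two data points, which is a simplification; the function name is made up for this example.

```python
import math

def doubling_time(start: float, end: float, years: float) -> float:
    """Years needed to double, assuming steady exponential growth."""
    growth_factor = end / start
    return years / math.log2(growth_factor)

# Photo size: 7.5 MB then, 30 MB now, 10 years apart.
photo_doubling = doubling_time(7.5, 30, 10)

# Sensor resolution: 8 MP then, 24 MP now, over the same period.
sensor_doubling = doubling_time(8, 24, 10)

print(f"photo size doubles every {photo_doubling:.1f} years")
print(f"resolution doubles every {sensor_doubling:.1f} years")
```

A fourfold increase is two doublings, so ten years of growth gives one doubling every five years.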

Storing ever-increasing data volumes requires a huge number of data storage systems. These systems should meet the classic requirements that have been around for some 15 to 20 years:

  • Performance
  • Storage cost
  • Storage management

When it comes to performance, the trend is rather straightforward. Year after year, applications become more resource-hungry in terms of IOPS and throughput. Any new IT project, for instance a virtual desktop infrastructure, requires a high-performance data storage system, which implies a larger storage budget.

Everything we do to ensure high performance adds to the system cost. And it’s not all about cutting-edge technology (like all-flash/SSD storage): the customer also has to run a cycle of hardware and software updates and perform data migration.

Once implementation is complete, customers find themselves in a tricky situation: they have a bunch of siloed storage systems from various vendors that are largely incompatible with one another, and no common tool to manage them. As a result, the customer faces the challenges of maintenance planning, upgrades and data migration across heterogeneous environments.

Until recently, many vendors could come up with just one viable solution: deploy a new “box” to replace a pool of old ones, as if it were a universal remedy. In the last couple of years, the data storage market leaders (EMC, NetApp, HP, IBM, Dell), along with second-tier vendors, have announced their software-defined storage (SDS) solutions.

What is software-defined storage, and why is there so much fuss about it? As of today, there is no fixed definition of SDS, just a set of features it should deliver. All in all, SDS platforms provide IT companies with:

  • Increased flexibility and vendor independence when purchasing data storage resources. With no risk of vendor lock-in, IT enterprises are free to choose a suitable, cost-efficient hardware or virtual platform for deploying SDS.
  • Unification of data storage infrastructures, made possible by affordable servers and industry-standard JBOD systems.
  • Orchestration of hardware platforms and workloads across multiple system generations, and decoupling of hardware and software cycles, which increases ROI for data storage infrastructures.

Functionally, software-defined storage systems hold their own against classic storage and include all the key features: deduplication, replication, thin provisioning, snapshots, backup, tiering, etc. The main differentiators are flexibility and management automation (the storage-as-a-service concept), along with lower TCO and CapEx.

Why SDS? As a rule of thumb, data storage software has to deliver on a few important factors:

  • Be compatible with third-party hardware. This helps avoid vendor lock-in and decrease CapEx.
  • Decouple software and hardware cycles. In other words, we don’t have to update the software when the hardware is upgraded, and vice versa.
  • Use the entire storage pool across various hardware platforms and manage it from a centralized location. When brand-new technologies and standards emerge, there has to be a programmatic way to integrate them into the existing data storage system.
  • Facilitate hardware scaling and make data migration easy and seamless.
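To make these requirements a bit more tangible, here is a toy model in Python. It is only a sketch: the `Backend` and `StoragePool` classes, their methods and the device names are all hypothetical and do not describe any real SDS product or API. The point is that the pool logic never depends on what hardware sits behind a backend.

```python
from dataclasses import dataclass, field

@dataclass
class Backend:
    """One piece of hardware behind the pool (hypothetical model)."""
    name: str
    capacity_gb: int
    volumes: dict[str, int] = field(default_factory=dict)  # volume name -> size in GB

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - sum(self.volumes.values())

class StoragePool:
    """A single management point over backends from any vendor."""

    def __init__(self) -> None:
        self.backends: list[Backend] = []

    def add_backend(self, backend: Backend) -> None:
        # Scaling out is just registering one more backend.
        self.backends.append(backend)

    @property
    def free_gb(self) -> int:
        return sum(b.free_gb for b in self.backends)

    def create_volume(self, name: str, size_gb: int) -> Backend:
        # Place the volume on the backend with the most free space.
        target = max(self.backends, key=lambda b: b.free_gb)
        if target.free_gb < size_gb:
            raise ValueError(f"no backend can hold {size_gb} GB")
        target.volumes[name] = size_gb
        return target

    def migrate(self, name: str, source: Backend, target: Backend) -> None:
        # Move a volume between backends; its name stays the same.
        size_gb = source.volumes[name]
        if target.free_gb < size_gb:
            raise ValueError(f"{target.name} lacks space for {name}")
        target.volumes[name] = size_gb
        del source.volumes[name]

pool = StoragePool()
old_array = Backend("vendor-a-array", capacity_gb=10_000)
x86_node = Backend("x86-jbod-node", capacity_gb=40_000)

pool.add_backend(old_array)
home = pool.create_volume("vdi-images", 2_000)  # lands on the only backend
pool.add_backend(x86_node)                      # new hardware, same software
pool.migrate("vdi-images", old_array, x86_node)
print(pool.free_gb)
```

In this sketch the old array can be retired after the migration without touching the code that manages the pool, which is the decoupling the list above asks for.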

The benefits of SDS and the ability to decouple software from hardware were too tempting to ignore. The new approach revolutionized the world of data storage, allowing robust systems to run on standard x86 servers. Although reports of the death of hardware-bound storage are greatly exaggerated, all classic storage vendors have jumped on the bandwagon and unveiled their SDS offerings in the last couple of years. According to IDC forecasts, the software-defined storage market will grow from $1.4B in 2014 to $6.2B in 2019.
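The IDC figures imply an annual growth rate of roughly 35%. A quick way to check this is the compound annual growth rate (CAGR) formula; the sketch below assumes even year-over-year growth, which the forecast itself does not claim.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate, as a fraction (0.10 means 10%)."""
    return (end_value / start_value) ** (1 / years) - 1

# Market size in billions of USD: $1.4B in 2014, $6.2B in 2019.
sds_growth = cagr(1.4, 6.2, 2019 - 2014)
print(f"implied growth: {sds_growth:.1%} per year")
```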

Generally speaking, SDS solutions may be split into these categories:

  • Storage Hypervisor. Software that operates on a server, in a virtual machine, inside a hypervisor or in a storage network. This segment will grow from the current ~$600M to $1.8B in 2019.
  • Storage Virtual Software. Open-source scalable software that prevents vendor lock-in and ensures unrestricted, safe and scalable data management at minimal cost. Experts expect this niche to grow from $215M to $1.9B by 2019.
  • Control Planes. Software for processing storage policies and enforcing them on lower-level resources and services. The control planes market will grow from $453M to $1.44B by 2019.
  • Data Services. Software that adds extra features to a data storage system. The forecast growth here is from $650M to $1B through 2019.
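The segment forecasts are easier to compare as growth multiples. The snippet below only divides the figures quoted in the list; since the base year for the segment numbers is not stated, it does not try to derive annual rates.

```python
# (current size, 2019 forecast) in millions of USD, as quoted in the list.
segments = {
    "Storage Hypervisor": (600, 1800),
    "Storage Virtual Software": (215, 1900),
    "Control Planes": (453, 1440),
    "Data Services": (650, 1000),
}

# How many times larger each segment is expected to be in 2019.
multiples = {name: end / start for name, (start, end) in segments.items()}

# Sum of the 2019 segment forecasts, in millions of USD.
total_2019 = sum(end for _, end in segments.values())

for name, multiple in sorted(multiples.items(), key=lambda kv: -kv[1]):
    print(f"{name}: x{multiple:.1f}")
print(f"2019 total: ${total_2019 / 1000:.2f}B")
```

Storage Virtual Software is expected to grow fastest by a wide margin, and the four 2019 figures add up to about $6.1B, close to the $6.2B total forecast for the SDS market.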

What drives the expansion of SDS? At this point, the main factors are:

  • Big Data. Data volumes grow exponentially every year, driving demand for scalable and reliable data storage systems. Not surprisingly, SDS is gaining traction in this field, fulfilling the needs of large companies as well as SMBs looking to cut their infrastructure costs.
  • Expense optimization. Needless to say, most enterprises are concerned with optimizing their hardware costs. SDS enables the use of standard x86 servers and off-the-shelf components for building high-performance data storage systems.
  • Complexity of storage networks. For a long time, the growing complexity of SANs has been a challenge for data center operators as well as SMB integrators. SDS allows the customer to simplify SAN operation and reduce expenses for smaller businesses.

Software-defined storage is no longer a new concept. As technology evolves, the RAIDIX engineers come up with new approaches to better performance and QoS. Does direct communication between software and data storage through an API sound like a fantasy? It doesn’t seem surreal to us.

Stay tuned for more blog posts and insights into storage technology, and leave your feedback!