RAIDIX Data Storage for a VMware Virtualization Cluster

21.02.2018

This article covers VMware infrastructure requirements, RAIDIX technical characteristics, and the RAIDIX configurations recommended for virtualization workloads.

 

Introduction

As of today, server virtualization is one of the most efficient methods of deploying private and public clouds, development and testing environments, and corporate applications. Server virtualization reduces TCO through power and space savings, eliminates hardware vendor lock-in, and improves operational uptime.

The RAIDIX solution for data storage in a virtual environment allows the administrator to simplify and automate system management and to extract greater ROI from every dollar invested in the corporate infrastructure.

The key storage parameters include:

Connectivity tools

VMware ESXi and the data storage system can be connected over several protocols, such as iSCSI and NFS. Virtual machines (VMs) keep their files (configuration and virtual disks) on any available datastore, and the administrator retains the full set of VMware storage features (vMotion, VMware DRS, VMware HA, and Storage vMotion).
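As an illustration of this connectivity, the datastores visible to an ESXi host can be enumerated programmatically. Below is a minimal sketch using the pyVmomi SDK; the host address and credentials are placeholders, not values from this article:

```python
# Minimal sketch: list datastores visible to an ESXi host via pyVmomi.
# The host address and credentials are placeholders.
import ssl

from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab shortcut; use real certificates in production
si = SmartConnect(host="esxi.example.com", user="root", pwd="secret", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.Datastore], True)
    for ds in view.view:
        s = ds.summary
        print(f"{s.name}: type={s.type}, "
              f"capacity={s.capacity / 2**30:.1f} GiB, free={s.freeSpace / 2**30:.1f} GiB")
    view.Destroy()
finally:
    Disconnect(si)
```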

Performance

Performance depends on the server hardware used for storage (RAID computation and drive characteristics). Thanks to its unique algorithms and efficient parallelization of calculations, RAIDIX extracts the maximum performance the underlying hardware can deliver. Moreover, RAIDIX-based systems scale elastically without degradation as the number of VMs grows or the workload increases.

Compatibility

RAIDIX supports the virtualization platforms VMware ESXi 5.0/5.1/5.5/6.0, KVM (Kernel-based Virtual Machine), RHEV (Red Hat Enterprise Virtualization), Microsoft Hyper-V Server, and XenServer.

 

RAIDIX Architecture for Virtualization

Image 1. VMware and RAIDIX Data Storage

 

Configuring a Virtual Infrastructure on RAIDIX Storage

1. RAIDIX Installation

For virtualization tasks, it is recommended to use a RAID 6 array: block-level striping with distributed double parity, built on RAIDIX's proprietary mathematical algorithms. RAID 6 provides high performance due to optimized parallelization of calculations (I/O requests are processed by each drive independently) and solid fault tolerance, sustaining the complete failure of any two drives in the same group.
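RAIDIX's parity mathematics is proprietary, but the general RAID 6 principle can be illustrated with the classic P/Q scheme: P is the XOR of the data blocks in a stripe, and Q is a weighted sum over the Galois field GF(2^8). The sketch below shows this textbook construction only; it is not RAIDIX's actual algorithm:

```python
# Textbook RAID 6 dual parity over GF(2^8): P = XOR of data blocks,
# Q = sum of 2^i * D_i in the field (not RAIDIX's proprietary algorithm).

def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reduction polynomial 0x11D."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return r

def pq_parity(blocks: list[bytes]) -> tuple[bytes, bytes]:
    """Compute the P and Q parity blocks for one stripe."""
    p = bytearray(len(blocks[0]))
    q = bytearray(len(blocks[0]))
    coeff = 1                                # 2^i in GF(2^8) for drive i
    for block in blocks:
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
        coeff = gf_mul(coeff, 2)
    return bytes(p), bytes(q)

# With both P and Q intact, any two failed blocks in a stripe can be
# recovered by solving two linear equations over GF(2^8).
p, q = pq_parity([b"\x01\x02", b"\x10\x20", b"\x55\xaa"])
print(p.hex(), q.hex())
```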

For increased performance in the enterprise environment, it is advisable to initialize RAID 6 arrays (RAID 6i, where "i" stands for "initialized"). During initialization, the drives are first filled with zeros, which accelerates transactional operations: parity for a small write can be updated without first reading the old block contents.
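Why zero-filling helps can be seen from the standard parity update for a small write: P_new = P_old XOR D_old XOR D_new. When the old block is known to still hold zeros, reading D_old can be skipped, saving one I/O per parity update. A minimal sketch of this reasoning (an assumption about the mechanism, not RAIDIX internals):

```python
# XOR parity update for a single-block write:
#     P_new = P_old ^ D_old ^ D_new
# On a zero-initialized (RAID 6i) array, D_old is known to be all zeros,
# so the read of the old block can be skipped entirely.

def update_parity(p_old: bytes, d_new: bytes, d_old: bytes | None = None) -> bytes:
    """Update a parity block; pass d_old=None when the block is known to be zero."""
    if d_old is None:
        return bytes(p ^ d for p, d in zip(p_old, d_new))
    return bytes(p ^ o ^ d for p, o, d in zip(p_old, d_old, d_new))
```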

To ensure fast failover, high speed, and reliability, RAIDIX recommends activating the Advanced Reconstruction feature. This mechanism optimizes read performance by excluding the slowest drives from read operations and reconstructing their data from parity.

Advanced Reconstruction can be used in two modes (a simplified sketch follows the list):

  • Permanent: RAIDIX continuously excludes the drives with the highest response time and restores their data in the background by solving a system of parity equations. The system stops sending read requests to those drives for a short interval, then re-evaluates which drives are currently the slowest.
  • On-demand: RAIDIX detects slow drives in a group and stops sending them read requests; the missing data is quickly restored by solving a system of equations. The drives with the highest response time are assigned the "Slow" status in the UI, so the administrator can replace them promptly and restore normal system performance.
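A simplified model of this idea, using single (XOR) parity for brevity: skip the slowest drive in the stripe, read the remaining blocks, and recompute the missing one from parity. RAIDIX solves the analogous equations for its own RAID levels; the code below is an illustration, not the product's implementation:

```python
# Advanced Reconstruction, simplified: do not wait for the slowest drive;
# reconstruct its block from the other drives and the parity block instead.
# XOR (single) parity is used here for brevity.

def read_stripe(drives: list[dict]) -> list[bytes]:
    """drives = data drives followed by one parity drive;
    each entry is {'data': bytes, 'latency_ms': float}."""
    data, parity = drives[:-1], drives[-1]
    slowest = max(range(len(data)), key=lambda i: data[i]["latency_ms"])
    blocks = [d["data"] if i != slowest else None for i, d in enumerate(data)]
    # D_slow = P ^ XOR(all other data blocks)
    rec = bytearray(parity["data"])
    for b in blocks:
        if b is not None:
            for j, byte in enumerate(b):
                rec[j] ^= byte
    blocks[slowest] = bytes(rec)
    return blocks

stripe = [
    {"data": b"\x01", "latency_ms": 0.2},
    {"data": b"\x02", "latency_ms": 9.5},   # degraded drive: never read
    {"data": b"\x03", "latency_ms": 0.3},
    {"data": b"\x00", "latency_ms": 0.2},   # parity = XOR of the data blocks
]
print(read_stripe(stripe))                   # [b'\x01', b'\x02', b'\x03']
```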

2. Setting up ESXi

The first step is to configure a target (e.g., Fibre Channel) and connect it to VMware ESXi. The next step is to configure datastores in dual-controller (Active-Active) mode. In this mode both nodes are active: they work in parallel and provide access to the same pool of drives. The nodes are hardware-agnostic components of the storage system (processors, cache memory, motherboards) that can be combined into a cluster.

RAIDIX guarantees continuity of data access and provides fault tolerance through:

  • Duplication of nodes
  • Duplication of drive connection channels (both nodes are connected to a single set of drives)

Nodes interact through InfiniBand, iSCSI (over Ethernet), SAS, and Fibre Channel interfaces that allow for data and cache status synchronization.

Due to bi-directional standby cache synchronization, the remote node always contains an up-to-date copy of data in the local node cache. Therefore, should one node fail, the other node will transparently take over the entire workload, allowing the administrator to fix errors on the fly.
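A toy model of this write path (an assumed illustration, not RAIDIX source code): every write is placed in the local cache and synchronously mirrored to the partner node before it is acknowledged, so an acknowledged write can never be lost to a single node failure:

```python
# Mirrored write-back cache, simplified: a write is acknowledged only after
# it exists in both the local cache and the partner node's standby copy.

class MirroredCache:
    def __init__(self, local: dict, remote: dict):
        self.local = local    # this node's cache
        self.remote = remote  # stands in for the partner reached over IB/iSCSI/SAS/FC

    def write(self, lba: int, data: bytes) -> None:
        self.local[lba] = data
        self.remote[lba] = data   # synchronous mirror; only now is the host acked

    def failover(self) -> dict:
        """On partner failure, the survivor serves the workload from its copy."""
        return self.local

node_a, node_b = {}, {}
cache = MirroredCache(node_a, node_b)
cache.write(42, b"payload")
assert node_b[42] == b"payload"   # the standby copy is always up to date
```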

Duplication of hardware components and interfaces provides protection against the following incidents:

  • Failure of a single hardware component (CPU, motherboard, power supply unit, controller, system drive)
  • Power failure for one of the nodes
  • Connection failure of a drive shelf interface (SAS cable, I/O module)
  • Software errors detected on one of the nodes

 

Technical Characteristics

Supported RAID levels: RAID 0/5/6/7.3/10/N+M
Max. number of drives in a RAID: 64
Max. number of drives in the system: 600
Scalability unit: 12 drives
Hot spare: dedicated reserve drives and shared-access drives
Max. LUN size: unlimited
Max. number of LUNs: 487
iSCSI features: MPIO, ACLs, CHAP authorization, LUN masking, CRC Digest
Supported number of sessions: 1024
Max. number of directly connected hosts: 32
Supported operating systems: Mac OS X 10.6.8 and higher; Microsoft Windows Server 2008/2008 R2/2012; Microsoft Windows XP/Vista/7/8; Red Hat Linux, SUSE, ALT Linux, CentOS, Ubuntu; Solaris 10
Supported virtualization platforms: VMware ESX/ESXi 3.5/4.0/4.1/5.0/5.1/5.5/6.0; KVM (Kernel-based Virtual Machine); RHEV (Red Hat Enterprise Virtualization); Microsoft Hyper-V Server; XenServer
Supported high-performance interfaces: Fibre Channel 8 Gb/16 Gb; InfiniBand (DDR, QDR, FDR, EDR); iSCSI; 12G SAS
Supported NAS protocols: SMB, NFS, FTP, AFP
Integration with MS Active Directory: yes
WORM (Write Once, Read Many): yes
Number of nodes: 2 in Active/Active mode
Data caching: two-tier (RAM and flash), WriteBack and ReadAhead for multiple streams
QoS support: on the host/application level

 

Sample Project in an IT Company

The hardware infrastructure included 10 Supermicro servers with Broadcom HBA cards and Mellanox InfiniBand adapters. For synchronization, iSCSI over InfiniBand was selected as the fastest method available in this hardware configuration; iSCSI over Ethernet was used for failover.

RAIDIX employed three RAID 6i partitions, with an average of three LUNs per partition, on each server. Every server hosted VMware ESXi 5.1 and vCenter 5.1 with virtual machines (VMs). The VMs served a variety of IT functions: MS SQL database storage, backup servers, file servers for corporate users, virtual server farms for software engineers, office services, and more.

The chosen configuration delivered efficient processing of random workloads, a small system footprint, and high reliability.

Infrastructure scheme

Storage   Disks            RAID         LUNs                                     VMs
1         24               4 RAIDs 6i   3 LUNs per RAID                          67
2         16 + 36 (JBOD)   6 RAIDs 6i   1 LUN per RAID                           73
3         24               3 RAIDs 6i   3 LUNs per RAID                          70
4         24               4 RAIDs 6i   3 LUNs per RAID                          69
5         12               2 RAIDs 6i   3 LUNs on one RAID, 2 on the other       33
6         24               3 RAIDs 6i   3 LUNs on one RAID, 2 on the other two   71
7         24               3 RAIDs 6i   3 LUNs per RAID                          66
8         24               4 RAIDs 6i   3 LUNs per RAID                          75
9         24               3 RAIDs      3 LUNs per RAID                          72

 

Business Impact

  • Reliable, fault-tolerant storage of critical data
  • Flexible virtualization of existing infrastructures
  • High performance of transactional operations
  • "Five nines" (99.999%) data availability
  • Optimized IT expenses