RAIDIX Data Storage for a VMware Virtualization Cluster
This article covers VMware infrastructure requirements, RAIDIX technical characteristics, and recommended RAIDIX configurations for virtualization workloads.
Introduction
As of today, server virtualization is one of the most efficient methods of deploying private and public clouds, development and testing environments, as well as corporate applications. Server virtualization reduces TCO through power and space savings, eliminates hardware vendor lock-in, and increases operational uptime.
The RAIDIX solution for data storage in a virtual environment allows the administrator to simplify and automate system management, achieving greater ROI and more IT resources for every dollar invested in the corporate infrastructure.
The key storage parameters include:
Connectivity tools
VMware ESXi and the data storage system can be connected via a variety of protocols (iSCSI, NFS, etc.). Virtual machines (VMs) store their files (configuration and virtual disks) on any of the available datastores, and the administrator can use all of VMware's storage features (vMotion, VMware DRS, VMware HA and VMware Storage vMotion).
Performance
Performance depends on the server used for data storage (the RAID controller and drive performance). Thanks to its unique algorithms and efficient parallelization of calculations, RAIDIX extracts the maximum performance the underlying hardware can deliver. Moreover, RAIDIX-based systems scale elastically without degradation as the number of VMs or the workload grows.
Compatibility
RAIDIX supports the following virtualization platforms: VMware ESX 5.0/5.1/5.5/6.0, KVM (Kernel-based Virtual Machine), RHEV (Red Hat Enterprise Virtualization), Microsoft Hyper-V Server and XenServer.
RAIDIX Architecture for Virtualization
Image 1. VMware and RAIDIX Data Storage
Configuring a Virtual Infrastructure on RAIDIX Storage
1. RAIDIX Installation
For virtualization tasks, it is recommended to use a RAID 6 array: a block-striping level with dual distributed parity, based on RAIDIX's proprietary mathematical algorithms. RAID 6 provides improved performance thanks to optimized parallelization of calculations (I/O requests are processed by each drive independently) and ensures solid fault tolerance, sustaining the complete failure of two drives in the same group.
For increased performance in an enterprise environment, it is advisable to initialize RAID 6 arrays (RAID 6i, where "i" stands for initialization). During initialization the drives are first written with zeros, so parity is consistent across the array from the start, which accelerates the processing of transactional operations.
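As a rough illustration of the dual-parity idea behind RAID 6 (a generic textbook P/Q construction, not RAIDIX's proprietary algorithm; the chunk contents and the GF(2^8) polynomial below are illustrative assumptions), the sketch computes two independent parity blocks for one stripe and recovers a single lost data chunk:

```python
# Minimal RAID 6 style dual-parity sketch (generic P/Q scheme, not RAIDIX's
# proprietary algorithm). P is plain XOR; Q is a weighted XOR over GF(2^8).
from __future__ import annotations

def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result

def pq_parity(chunks: list[bytes]) -> tuple[bytes, bytes]:
    """Compute P (XOR) and Q (weighted by generator 2^i) for one stripe."""
    p = bytearray(len(chunks[0]))
    q = bytearray(len(chunks[0]))
    for i, chunk in enumerate(chunks):
        g = 1
        for _ in range(i):              # generator coefficient g = 2^i
            g = gf_mul(g, 2)
        for j, byte in enumerate(chunk):
            p[j] ^= byte
            q[j] ^= gf_mul(g, byte)
    return bytes(p), bytes(q)

def recover_one(chunks: list[bytes | None], p: bytes) -> bytes:
    """Recover a single missing chunk from P parity alone (XOR of the rest)."""
    missing = bytearray(p)
    for chunk in chunks:
        if chunk is not None:
            for j, byte in enumerate(chunk):
                missing[j] ^= byte
    return bytes(missing)

if __name__ == "__main__":
    stripe = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]      # 4 data chunks
    p, q = pq_parity(stripe)
    damaged = [stripe[0], None, stripe[2], stripe[3]]   # one drive failed
    assert recover_one(damaged, p) == b"BBBB"
```

With both P and Q available, any two missing chunks in a stripe can be recovered by solving the corresponding pair of equations, which is what gives RAID 6 its tolerance of two simultaneous drive failures.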
To ensure fast failover, high speed and reliability, RAIDIX recommends activating the Advanced Reconstruction feature. This mechanism optimizes read performance by excluding the slowest drives from read operations and reconstructing their data from parity.
Advanced Reconstruction can be used in two modes (a simplified illustration follows the list):
- Permanent: RAIDIX continuously identifies the drives with the highest response time, stops sending them read requests for a short interval, and restores their data in the background by solving a system of parity equations; it then re-evaluates the group and excludes other slow drives as needed.
- On-Demand: RAIDIX detects slow drives in a group and stops sending them read requests, restoring the data by solving a system of equations. The drives with the highest response time are marked with the "Slow" status in the UI so the administrator can replace them promptly and maintain normal system performance.
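The sketch below is a conceptual illustration of this idea, not RAIDIX code: it assumes simple XOR parity and hypothetical per-drive latency measurements, skips the slowest drive on read, and rebuilds its chunk from the remaining chunks plus parity.

```python
# Conceptual sketch of reconstruction-assisted reads: skip the drive with the
# highest measured latency and rebuild its chunk from the others plus XOR parity.
# (Simplified single-parity math; not RAIDIX's actual implementation.)

def xor_blocks(blocks: list[bytes]) -> bytes:
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def read_stripe(chunks: list[bytes], parity: bytes, latencies_ms: list[float]) -> list[bytes]:
    """Return the data chunks, reconstructing the slowest drive's chunk instead of reading it."""
    slow = max(range(len(chunks)), key=lambda i: latencies_ms[i])  # hypothetical latency stats
    survivors = [c for i, c in enumerate(chunks) if i != slow]
    rebuilt = xor_blocks(survivors + [parity])   # missing chunk = XOR of the rest and parity
    result = list(chunks)
    result[slow] = rebuilt
    return result

if __name__ == "__main__":
    data = [b"D0D0", b"D1D1", b"D2D2"]
    parity = xor_blocks(data)
    # Drive 1 responds slowly; its chunk is reconstructed rather than read.
    assert read_stripe(data, parity, latencies_ms=[0.4, 25.0, 0.5]) == data
```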
2. Setting up ESXi
The first step is to configure a target (e.g., Fibre Channel) and connect it to VMware ESXi. The next step is to configure datastores in dual-controller (Active-Active) mode. In this mode both nodes are active: they work in parallel and provide access to the same pool of drives. The nodes are hardware-agnostic components of the data storage system that include processors, cache memory and motherboards, and they can be combined into a cluster.
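Once the RAIDIX target is configured and a LUN is presented, the new device has to become visible to ESXi. As a minimal sketch using the pyVmomi SDK (the vCenter address, host name and credentials below are placeholders, not values from this article), the host's HBAs and VMFS volumes can be rescanned programmatically:

```python
# Minimal pyVmomi sketch: rescan a host's HBAs and VMFS volumes so that a newly
# presented RAIDIX LUN shows up in ESXi. Addresses and credentials are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

context = ssl._create_unverified_context()   # lab only; verify certificates in production
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="secret",
                  sslContext=context)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], recursive=True)
    hosts = {h.name: h for h in view.view}
    view.DestroyView()

    host = hosts["esxi-01.example.com"]
    storage = host.configManager.storageSystem
    storage.RescanAllHba()   # pick up the new target/LUN on every adapter
    storage.RescanVmfs()     # discover VMFS datastores on the new device
finally:
    Disconnect(si)
```

The same rescan can of course be triggered from the vSphere client; the script is only meant to show where the new LUN enters the ESXi storage stack.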
RAIDIX guarantees continuity of data access and provides fault-tolerance due to:
- Duplication of nodes
- Duplication of drive connection channels (both nodes are connected to a single set of drives).
Nodes interact through InfiniBand, iSCSI (over Ethernet), SAS, and Fibre Channel interfaces that allow for data and cache status synchronization.
Due to bi-directional standby cache synchronization, the remote node always holds an up-to-date copy of the local node's cache. Should one node fail, the other node transparently takes over the entire workload, allowing the administrator to fix errors on the fly.
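A highly simplified sketch of this write-back cache mirroring idea (conceptual only, not RAIDIX's actual protocol): a write is acknowledged to the host only after it has been placed both in the local cache and in the partner node's standby copy, so either node can serve or flush the data after a failover.

```python
# Conceptual write-back cache mirroring between two controller nodes
# (illustration only, not RAIDIX's actual protocol).
class Node:
    def __init__(self, name: str):
        self.name = name
        self.cache: dict[int, bytes] = {}   # LBA -> data
        self.partner: "Node | None" = None

    def write(self, lba: int, data: bytes) -> None:
        """Accept a host write: cache locally, mirror to the partner, then acknowledge."""
        self.cache[lba] = data
        if self.partner is not None:
            self.partner.cache[lba] = data  # synchronous mirror before the ack
        # only at this point would the write be acknowledged to the host

    def take_over(self) -> dict[int, bytes]:
        """On partner failure, serve and flush from the mirrored cache copy."""
        return dict(self.cache)

node_a, node_b = Node("A"), Node("B")
node_a.partner, node_b.partner = node_b, node_a

node_a.write(100, b"payload")                   # host writes through node A
assert node_b.take_over()[100] == b"payload"    # node B already holds the data
```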
Duplication of hardware components and interfaces provides protection against the following incidents:
- Failure of a single hardware component (CPU, motherboard, power supply unit, controller, system drive)
- Connection failure of a drive shelf interface (SAS cable, I/O module)
- Power failure for one of the nodes
- Software errors detected on one of the nodes
Technical Characteristics
| Parameter | Value |
| --- | --- |
| Supported RAID levels | RAID 0/5/6/7.3/10/N+M |
| Max. number of drives in a RAID | 64 |
| Max. number of drives in the system | 600 |
| Scalability unit | 12 drives |
| Hot spare | Dedicated and shared spare drives |
| Max. LUN size | Unlimited |
| Max. number of LUNs | 487 |
| iSCSI | MPIO, ACLs, CHAP authentication, LUN masking, CRC Digest |
| Supported number of sessions | 1024 |
| Max. number of hosts with direct connection | 32 |
| Supported operating systems | Mac OS X 10.6.8 and higher; Microsoft Windows Server 2008/2008 R2/2012; Microsoft Windows XP/Vista/7/8; Red Hat Linux, SuSE, ALT Linux, CentOS Linux, Ubuntu Linux; Solaris 10 |
| Supported virtualization platforms | VMware ESX 3.5/4.0/4.1/5.0/5.1/5.5/6.0; KVM (Kernel-based Virtual Machine); RHEV (Red Hat Enterprise Virtualization); Microsoft Hyper-V Server; XenServer |
| Supported high-performance interfaces | Fibre Channel 8Gb, 16Gb; InfiniBand (FDR, QDR, DDR, EDR); iSCSI; 12G SAS |
| Supported NAS protocols | SMB, NFS, FTP, AFP |
| Integration with MS Active Directory | Yes |
| WORM (Write Once, Read Many) | Yes |
| Number of nodes | 2 in Active-Active mode |
| Data caching | Two-tier (RAM and flash), WriteBack and ReadAhead for multiple streams |
| QoS support | At the host/application level |
Sample Project in an IT Company
The hardware infrastructure included 10 Supermicro servers with Broadcom HBA cards and Mellanox InfiniBand adapters. iSCSI over InfiniBand was selected for synchronization as the fastest method in this hardware configuration; iSCSI over Ethernet was used for failover.
RAIDIX employed an average of three RAID 6i arrays per server, with roughly three LUNs per array (see the infrastructure table below). Every server hosted VMware ESXi 5.1 and vCenter 5.1 with virtual machines (VMs). The VMs performed a variety of IT functions: MS SQL database storage, a backup server, file servers for corporate users, virtual server farms for software engineers, office services, etc.
The chosen configuration allowed for efficient processing of random workloads, a small system footprint and high reliability.
Infrastructure scheme
| Storage | Drives | RAIDs | LUNs | VMs |
| --- | --- | --- | --- | --- |
| 1 | 24 | 4 RAIDs 6i | 3 LUNs per RAID | 67 |
| 2 | 16+36 JBOD | 6 RAIDs 6i | 1 LUN per RAID | 73 |
| 3 | 24 | 3 RAIDs 6i | 3 LUNs per RAID | 70 |
| 4 | 24 | 4 RAIDs 6i | 3 LUNs per RAID | 69 |
| 5 | 12 | 2 RAIDs 6i | 3 LUNs on one RAID, 2 on the other | 33 |
| 6 | 24 | 3 RAIDs 6i | 3 LUNs on one RAID, 2 on the other two | 71 |
| 7 | 24 | 3 RAIDs 6i | 3 LUNs per RAID | 66 |
| 8 | 24 | 4 RAIDs 6i | 3 LUNs per RAID | 75 |
| 9 | 24 | 3 RAIDs | 3 LUNs per RAID | 72 |
Business Impact
- Reliable fault-tolerant storage of critical data
- Flexible virtualization of existing infrastructures
- High performance of transactional operations
- Data availability at ‘five nines’ (99.999%, i.e. about five minutes of downtime per year)
- Optimized IT expenses