Highly Scalable Data Storage System for Postproduction Studios and Broadcasting Corporations

02.02.2018

Data Storage Solution for Media & Entertainment

Storing and accessing large files in data-rich industries, such as medical diagnostics or resource-intensive research, can be a challenge for any IT infrastructure. Media & Entertainment, in turn, works with files that are truly gargantuan. With digital content shot in 2K or 4K (over four times the standard resolution), an average feature film takes up to 2TB in Full HD and up to 15TB in 8K.

Postproduction professionals operate with terabytes of data and perform video editing and grading in resource-intensive applications such as Blackmagic Design's DaVinci Resolve. One of the key objectives of a multimedia studio is to sustain the high throughput required for uninterrupted video processing, with no frames dropped, from multiple workstations.
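To put these workloads in perspective, here is a rough, back-of-the-envelope estimate of the sustained throughput uncompressed playback demands. The bit depth, frame rate and number of concurrent streams below are illustrative assumptions, not figures from a specific production.

```python
# Back-of-the-envelope estimate of sustained throughput for uncompressed
# video playback (illustrative assumptions only).

RESOLUTIONS = {
    "Full HD": (1920, 1080),
    "2K":      (2048, 1080),
    "4K":      (4096, 2160),
}

BITS_PER_PIXEL = 30   # e.g. 10-bit RGB; real codecs and packing vary
FPS = 24              # typical cinema frame rate
STREAMS = 4           # concurrent workstations (assumption)

for name, (w, h) in RESOLUTIONS.items():
    frame_bytes = w * h * BITS_PER_PIXEL / 8
    stream_mb_s = frame_bytes * FPS / 1e6          # MB/s for one stream
    total_mb_s = stream_mb_s * STREAMS
    print(f"{name:8s}: {stream_mb_s:7.0f} MB/s per stream, "
          f"{total_mb_s:7.0f} MB/s for {STREAMS} streams")
```

Even a handful of uncompressed 4K streams adds up to several gigabytes per second, which is why sustained throughput, not capacity alone, drives the storage design.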

Given the ever-growing data volumes and limited storage capacity, postproduction companies and TV channels require highly scalable solutions. As a rule, petabyte volumes and clustered data architectures call for Scale-Out storage. When building a storage cluster from multiple nodes, a system administrator runs into the limitations of traditional file systems: metadata and data stored on the same volumes; low scalability in terms of capacity, performance, file count and directory depth; lack of cross-platform compatibility, and so on. In this scenario, a distributed clustered file system is the optimal choice.

In this article, we’ll share details of typical M&E tasks, performance metrics and Scale-Out solutions based on RAIDIX and the HyperFS file system for major IT infrastructures.

 

Video Production and Broadcasting Requirements

Key video production and broadcasting requirements include:

  • High throughput and sustainable performance even in case of drive failure
  • Hot spare capability with no downtime or performance degradation
  • Workload prioritization on the application level
  • Flexible support for Fibre Channel, iSCSI, NFS, SMB, FTP, AFP and other access protocols
  • High Scale-Up and Scale-Out potential.

When operating under high workloads, it is crucial to balance storage performance, density and TCO. System performance depends on the number of drives and the individual performance of each drive. As a rule, high-speed software-defined technology running on HDDs fully meets sequential workload requirements. Combined with high-density JBOD enclosures, storage professionals can build cost-effective configurations that serve a large number of parallel threads or high-definition streams. Greater scalability requires cluster solutions, such as the system based on the RAIDIX management software and HyperFS.
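As a minimal sketch of that sizing logic, the snippet below estimates how many HDDs a sequential workload needs. The per-drive throughput and the derating factor are assumptions and will differ with the actual drives, RAID level and enclosure.

```python
# Rough sizing sketch: how many HDDs a sequential workload needs.
# Per-drive throughput and the derating factor are assumptions, not
# vendor figures.

import math

required_mb_s = 3200    # e.g. several uncompressed 4K streams
hdd_seq_mb_s = 200      # sustained sequential throughput of one HDD
derating = 0.6          # parity overhead, seeks, mixed I/O, etc.

effective_per_drive = hdd_seq_mb_s * derating
drives_needed = math.ceil(required_mb_s / effective_per_drive)
print(f"~{drives_needed} drives to sustain {required_mb_s} MB/s")
```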

Aside from high performance and low latencies, the list of key features expected from a fully functional Scale-Out solution includes:

  • Single namespace for multiple storage clusters
  • Concurrent access via versatile protocols
  • File and block access to the same data.

 

Challenges and the Solution

Video editing implies real-time reading, writing and processing of uncompressed video streams. Storage latencies may lead to dropped frames, in which case the processing has to start over. A key task for film companies is to complete video production within minimal timeframes; in technical terms, this calls for high throughput and fault tolerance throughout the production lifecycle.

Media holdings running dozens of parallel projects need new data storage systems, or regular upgrades to existing configurations, to keep up with ever-growing content volumes. An eligible data storage system should deliver high performance and reliable storage of multi-petabyte volumes with minimal investment. Other critical factors are maximum data processing speed and reliable failover all the way from ingest to broadcast.

In this paper, we'll focus on a high-capacity storage system (from 2U/48TB up to 4U/108TB) that is easily expandable with JBOD enclosures, performs at a high level and handles numerous 2K/4K video streams. Data integrity with no dropped frames is ensured by proprietary RAIDIX algorithms, including RAID 6 with double parity and RAID 7.3 with triple parity; RAID 7.3 keeps the system running without interruption even if up to three drives fail.
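RAIDIX's RAID 7.3 mathematics is proprietary, so the example below only illustrates the general principle behind parity-based redundancy, using a single XOR parity strip as in RAID 5: any one lost strip can be rebuilt from the survivors. RAID 6 and RAID 7.3 extend the same idea with two and three independent parity functions, which is what lets them survive two or three simultaneous drive failures.

```python
# Simplified illustration of parity-based redundancy (single XOR parity,
# as in RAID 5). RAID 6 / RAID 7.3 add further independent parity
# functions; RAIDIX's actual algorithms are proprietary and not shown here.

def xor_blocks(blocks):
    """XOR byte strings of equal length together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

data = [b"STRIPE_0", b"STRIPE_1", b"STRIPE_2"]   # data strips on 3 drives
parity = xor_blocks(data)                        # parity strip on a 4th drive

# Simulate losing drive 1 and rebuilding its strip from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]
print("rebuilt strip:", rebuilt)
```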

As data storage management software, RAIDIX runs on commodity x86-64 components (chassis, drives, interface controllers, memory, processors, etc.) and lets the end customer tailor RAID arrays to specific M&E tasks while reducing overall implementation and maintenance costs.

RAIDIX supports professional equipment from Apple, AJA and Blackmagic Design, as well as Xsan, metaSAN, StorNext and FalconStor environments and video editing software (Adobe Premiere, Final Cut Pro, Avid, Smoke, DaVinci Resolve, SGO Mistika, etc.). Moreover, RAIDIX allows the administrator to install professional editing and grading software right on the storage node, minimizing hardware overhead.

What are the RAIDIX capabilities in terms of building multi-node storage clusters? As mentioned above, this task goes beyond the functionality of traditional file systems (FS). Classic file systems impose the following limitations:

  • Metadata and data are stored on the same partitions
  • Files are ‘scattered’ across the partition, causing access latencies
  • No protection against fragmentation
  • Low scalability by capacity, performance, file number, directory depth, etc.
  • Lack of ‘native’ cross-platform support.

These issues can be resolved with a cluster file system. HyperFS from Scale Logic, for instance, ensures high scalability with full process transparency for the customer and shared access to data from different OSs (in particular, through a dedicated NAS gateway). Integrated with HyperFS, the RAIDIX software provides a single namespace for SAN and NAS.

Technical benefits of the comprehensive solution include:

  • Up to 4B files in a single directory
  • Up to 4096 partitions that can be consolidated within a single FS
  • No single point of failure (SPOF)
  • Dynamic FS scalability by volume and performance
  • Support for the latest versions of popular OSes: Mac, Windows and Linux.

The HyperFS SAN file system ensures the required redundancy, high data availability, and mirroring of paths and data. HyperFS for SAN turns multiple file systems or iSCSI drive arrays into a unified storage cluster. This cluster enables parallel editing and playback from several client machines, along with high performance and shared data access within a single namespace. The system includes an optional redundant metadata controller (MDC) as well as a fully redundant SAN structure with metadata mirroring. HyperFS SAN also supports multipath configurations in Fibre Channel and iSCSI environments. As a result, the system has no SPOF and ensures high storage reliability.

Image 1. HyperFS SAN Infrastructure

The capabilities of Scale-Out NAS systems employed in major M&E infrastructures include consolidation of up to 64 nodes in a single cluster, concurrent access via versatile protocols (SMB v2/v3, NFS v3/v4, FTP/FTPS, HTTP/HTTPS/WebDAV), workload balancing across the nodes (Round-Robin, Connection Count, node load) and Active Directory support.
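The balancing policies listed above can be pictured with a short sketch. It is a conceptual model with hypothetical node names, not the gateway's actual implementation.

```python
# Conceptual sketch of two of the balancing policies mentioned above.
# Node names and connection counts are hypothetical.

from itertools import cycle

nodes = ["nas-01", "nas-02", "nas-03", "nas-04"]

# Round-Robin: hand out nodes in a fixed rotation.
rr = cycle(nodes)
round_robin_choice = next(rr)

# Connection Count: send the new client to the least-loaded node.
open_connections = {"nas-01": 17, "nas-02": 9, "nas-03": 21, "nas-04": 12}
least_loaded = min(open_connections, key=open_connections.get)

print("round-robin ->", round_robin_choice)
print("connection count ->", least_loaded)
```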

Enhanced RAIDIX and HyperFS features include:

  • System optimization for large and small files
  • Support for user and folder quotas
  • SNMP monitoring for SONG and MDC
  • LDAP/Active Directory: option to use a local user database or integrate with Active Directory
  • ACL support: ability to apply ACLs on all supported operating systems.

Bottom line: the system based on RAIDIX and HyperFS provides film companies, postproduction studios and TV channels with a high-performance solution featuring a single namespace, concurrent access via various protocols, low latencies, high scalability, and file and block access to the same data.

 

Solution Architecture

The data storage architecture (see Image 2) based on RAIDIX and the HyperFS cluster file system comprises three key components:

  • Storage nodes (SharedDisk/data storage system): HDD-based systems intended for fault-tolerant data storage
  • Directory services (MDS): intended for storing data references, resource arbitration and access management; the directory services themselves do not store data or metadata
  • Clients: servers or computers with pre-installed client software for shared access.

In a scenario where a large number of clients connect without specific client software installed, the solution architecture allows setting up NAS gateways that handle data operations on their behalf.

Image 2. Data storage architecture based on the RAIDIX software and cluster file system
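Conceptually, a SAN client in this architecture works with the two tiers separately: it asks the directory/metadata service where a file's extents live, then reads the blocks straight from the storage nodes over Fibre Channel or iSCSI. The sketch below models that flow; the class, node and path names are hypothetical, and this is not HyperFS client code.

```python
# Conceptual model of the read path in the architecture above:
# the client resolves extent locations through the metadata service,
# then reads blocks directly from the storage nodes. All names are
# hypothetical; this is not the HyperFS client implementation.

from dataclasses import dataclass

@dataclass
class Extent:
    node: str       # which storage node holds the blocks
    offset: int     # byte offset on that node's LUN
    length: int     # extent length in bytes

class MetadataService:
    """Keeps only references: which extents make up which file."""
    def __init__(self):
        self.catalog = {"/projects/film_a/shot_001.dpx":
                        [Extent("storage-01", 0, 8 << 20),
                         Extent("storage-02", 0, 8 << 20)]}

    def locate(self, path):
        return self.catalog[path]

def read_file(mds, path, read_block):
    """Ask the MDS for extents, then pull data from storage nodes."""
    return b"".join(read_block(e.node, e.offset, e.length)
                    for e in mds.locate(path))

# Stand-in for the block-level read over FC/iSCSI.
fake_read = lambda node, off, length: f"[{length}B from {node}]".encode()
print(read_file(MetadataService(), "/projects/film_a/shot_001.dpx", fake_read))
```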

 

Technical Characteristics

  • System capacity: 64 ZB (theoretical limit)
  • Max. number of files/objects/folders: up to 4,000,000,000 with a 4TB metadata volume
  • Max. file size: 64 ZB (theoretical limit)
  • File name length: 255 ASCII characters (Windows, Linux, Mac)
  • Directory depth: Windows: 244 characters; Linux: 4096 bytes
  • Max. number of LUNs: 4093
  • Exported paths: 512
  • Number of metadata controllers (MDC): up to 2, configurable in HA mode
  • Number of concurrent file systems: 16
  • Full redundancy configuration: supported, no single point of failure (SPOF)
  • Dynamic file system expansion: yes, LUNs can be added with no downtime
  • Supported SAN client OSs: Windows 7/8/10 (32-bit/x86_64); Windows Server 2008/2008 R2/2012/2012 R2 (32-bit/x86_64) and 2016 (x86_64); SUSE 11 SP1-3; OS X 10.7-10.12
  • SSD support: yes

 

Business Impact

RAIDIX allows the administrator to employ multiple storage nodes, distributing data dynamically and balancing workloads across the nodes. The solution architecture makes it possible to add new system nodes on demand, without data migration or system reconfiguration.

The main benefit of RAIDIX is concurrent, high-performance block-level data processing from one or several data storage systems and numerous workstations, which is not possible within a classic SAN architecture.

RAIDIX technology combined with the HyperFS file system meets the high performance and fault-tolerance requirements and ensures shared access to video content from multiple workstations. It also minimizes hardware overhead when building a storage cluster by providing effective scale-out of existing infrastructures with no downtime or performance slump.