(This article was originally published on the DCIG web site on August 28, 2017)
Next-generation all-flash arrays will provide dramatic improvements in performance and density over the prior generation of all-flash arrays. These new levels of performance and density will bring the benefits of real-time analysis to a whole new set of problems and organizations, creating tremendous value. They will also enable organizations to achieve significant budget savings through a fresh wave of data center consolidations. But unlocking the ability of any next-generation array to deliver these savings depends on a key set of features that enable workload consolidation and simplified management.
Next-generation All-flash Arrays Provide Massive Performance in Micro Form Factors
The prior generation of all-flash arrays provided up to several hundred thousand IOPS at latencies around one millisecond. As revealed at the recent Flash Memory Summit, the next generation of all-flash arrays, enabled by NVMe, will provide millions of IOPS at latencies of under 200 microseconds. For example, E8 Storage showed its E8-D24, a 2RU appliance for which it claims 10 million IOPS and 40 GB/second of bandwidth at a read latency of 120 microseconds.
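To put the quoted E8-D24 figures in perspective, the following is a rough back-of-the-envelope sketch, not a benchmark. It assumes the vendor's peak IOPS, bandwidth and latency numbers all apply at the same time and that "GB" means decimal gigabytes. Neither assumption is confirmed by the source, and vendors often quote these peaks under different test conditions.

```python
# Figures quoted above for the E8-D24 (vendor claims, not measurements).
iops = 10_000_000            # I/O operations per second
read_latency_s = 120e-6      # 120 microseconds
bandwidth_bytes_s = 40e9     # 40 GB/second, assuming decimal gigabytes

# Little's Law: average I/Os in flight = throughput x latency.
outstanding_ios = iops * read_latency_s

# Average transfer size implied if both peaks were reached simultaneously.
avg_io_bytes = bandwidth_bytes_s / iops

print(f"I/Os in flight needed: {outstanding_ios:.0f}")
print(f"Implied average I/O size: {avg_io_bytes:.0f} bytes")
```

Under these assumptions, sustaining the claimed rate requires roughly 1,200 I/Os in flight at any moment, with an implied average transfer of about 4 KB. Application servers therefore need enough parallelism to keep that many requests outstanding before the array's headline numbers become reachable.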
New SSD Form Factors Will Contribute to a 3x to 5x Increase in Storage Density
The highest raw storage density achieved by any product in the DCIG 2017-18 All-Flash Array Buyer’s Guide was 192 TB/RU. This will double once the recently announced 32 TB SSDs are qualified for existing all-flash arrays. But new SSD form factors will also contribute to increased storage density. At the recent Flash Memory Summit, Samsung showed its new 16 TB NGSFF SSD and described an NGSFF-based reference system that can use 36 NGSFF modules to provide 576 TB of raw flash capacity in a one rack unit (1RU) appliance.
Intel showed its new “ruler” form factor SSD and a 1RU 32-slot design that can achieve 1 PB per rack unit based on its 32TB “ruler” SSDs.
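The density figures above can be checked with simple arithmetic. The sketch below uses only the module counts and capacities quoted in this article. The helper name `raw_tb_per_ru` is illustrative, and the results are raw capacity before any RAID, sparing or data reduction.

```python
# Highest raw density in the DCIG 2017-18 All-Flash Array Buyer's Guide.
BASELINE_TB_PER_RU = 192

def raw_tb_per_ru(modules, tb_per_module, rack_units=1):
    """Raw flash capacity per rack unit for a given module count and size."""
    return modules * tb_per_module / rack_units

designs = {
    "Samsung NGSFF reference system": raw_tb_per_ru(36, 16),
    "Intel ruler design": raw_tb_per_ru(32, 32),
}

for name, tb in designs.items():
    ratio = tb / BASELINE_TB_PER_RU
    print(f"{name}: {tb:.0f} TB/RU, {ratio:.1f}x the baseline")
```

The Samsung design works out to 576 TB/RU (3.0x the baseline) and the Intel design to 1,024 TB/RU (about 5.3x), which is the source of the 3x to 5x range in the heading above.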
Features that Enable Workload Consolidation and Simplification are Key to Creating Business Value
The greatest value of all-flash storage is that it enables organizations to move faster. And moving faster than one’s competitors creates wins. As Eric Pearson, the CIO of InterContinental Hotels Group, has said, “It’s no longer the big beating the small. It’s the fast beating the slow.” 1
Consolidating many workloads onto an all-flash array accelerates all those workloads, and helps create competitive wins. It also enables significant reductions in overall data center costs. This flash-enabled consolidation extends beyond storage consolidation to include server and even data center consolidation.
As noted above, the next generation of all-flash arrays clearly has the storage capacity, density and low-latency performance to handle many workloads concurrently. Therefore, features that enable workload consolidation are key to unlocking business value, and are a reasonable focus for evaluation.
Features that Enable Consolidation
- Concurrent multi-protocol support (unified SAN and NAS) accelerates both block and file-based workloads
- High-speed Ethernet and/or Fibre Channel (FC) connectivity to application servers for maximum front-end bandwidth
- Non-disruptive Upgrades (NDU) and redundancy features that maximize uptime
- Quality of Service (QoS) features, especially QoS based on predefined service levels, enabling an administrator to quickly and easily assign each application or volume to a priority classification
- Multi-tenancy that enables distributed administration and secure sharing of the array’s physical resources
- Certified support for enterprise applications
- REST API to enable integration into automation frameworks that are the foundation for public cloud-like self-service capabilities
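As an illustration of QoS based on predefined service levels, the sketch below models how an administrator might assign a volume to a priority class with a single call. Everything in it is hypothetical: the tier names, the IOPS ceilings, the latency targets and the `assign` function are assumptions made for this example and do not describe any particular array's interface.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ServiceLevel:
    name: str
    max_iops: int             # ceiling enforced for the volume
    target_latency_ms: float  # latency the array tries to hold

# Hypothetical predefined tiers; real arrays define their own.
SERVICE_LEVELS = {
    "gold": ServiceLevel("gold", 500_000, 0.5),
    "silver": ServiceLevel("silver", 100_000, 1.0),
    "bronze": ServiceLevel("bronze", 20_000, 5.0),
}

def assign(policies, volume, level_name):
    """Attach a predefined service level to a volume."""
    if level_name not in SERVICE_LEVELS:
        raise ValueError(f"unknown service level: {level_name}")
    policies[volume] = SERVICE_LEVELS[level_name]
    return policies[volume]

policies = {}
assign(policies, "erp-db-01", "gold")
assign(policies, "test-share-07", "bronze")
```

Because the limits live in the tier definition rather than on each volume, changing what "gold" means updates every volume assigned to it. That is what makes this style of QoS quick to administer when many workloads share one array.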
Features that enable simplification can significantly improve IT agility and enable the entire organization to move faster. Therefore, these features also deserve careful consideration.
Features that Enable Simplification
- Automated intelligent caching and/or storage tiering that keeps the hottest data on the lowest latency media without manual tuning
- Automated, policy-based provisioning that eliminates much routine storage administration
- Integration into hypervisor management consoles that empowers application and server administrators to quickly allocate and assign storage to new virtual machines
- Proactive remediation based on fault data that prevents component failures from becoming system failures
- Proactive intervention based on storage analytics that optimizes performance and avoids service interruptions
Cautions and Best Practice Recommendations for Next-generation All-flash Arrays
Mind the failure domain. Consolidation can yield dramatic savings, but it is prudent to consider the failure domain and how much of an organization’s infrastructure should depend on any one component, including an all-flash array.
Focus on accelerating apps. Eliminating storage bottlenecks may reveal other bottlenecks in the application path. Getting the maximum performance benefit from an AFA may require more or faster network connections to application servers and/or the storage system, more server DRAM, adjusting cache sizes and adjusting other server and network configuration details. Some AFAs include utilities that will help identify the bottlenecks wherever they occur along the data path.
Revisit assumptions. Optimal configuration changes may not be obvious. For example, one all-flash proof of concept revealed that a database application performed much better when local DRAM caching was reduced to less than one quarter of the size recommended by existing best practice guidelines. This discovery resulted in both higher performance and greater server consolidation savings.
Leverage multi-tenancy features. Use multi-tenancy to enable secure sharing of the array while limiting the percentage of array resources any server administrator or software developer can allocate.
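A minimal sketch of the kind of guardrail this recommendation describes follows. The tenant names, the percentage shares and the `can_allocate` function are hypothetical, and the 576 TB array size is borrowed from the Samsung example earlier purely for illustration. A real array would enforce such limits in its own management layer.

```python
ARRAY_RAW_TB = 576  # illustrative array size

# Hypothetical per-tenant caps, as a percentage of array capacity.
TENANT_SHARE_PCT = {"dev": 10, "analytics": 30}

def can_allocate(tenant, allocated_tb, request_tb):
    """Return True if the request keeps the tenant within its share."""
    limit_tb = ARRAY_RAW_TB * TENANT_SHARE_PCT[tenant] / 100
    return allocated_tb + request_tb <= limit_tb

# The "dev" tenant is capped at 57.6 TB of the 576 TB array.
small_request_ok = can_allocate("dev", allocated_tb=50, request_tb=5)
large_request_ok = can_allocate("dev", allocated_tb=50, request_tb=10)
```

With 50 TB already allocated, the 5 TB request fits under the cap while the 10 TB request does not. A check like this lets self-service provisioning proceed without allowing any single team to exhaust the shared array.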
Pursue automation. Automation can dramatically reduce the amount of time spent on storage management and enable new levels of enterprise agility. This is another place where multi-tenancy and/or robust QoS capabilities add a layer of safety.
Conduct a proof of concept implementation. This can validate feature claims and uncover performance-limiting bottlenecks elsewhere in the infrastructure.
1. Pat Gelsinger on Stage at VMworld 2015, 15:50. YouTube. YouTube, 01 Sept. 2015. <https://www.youtube.com/watch?v=U6aFO0M0bZA&list=PLeFlCmVOq6yt484cUB6N4LhXZnOso5VC7&index=3>.