There is a strong movement in the industry towards “software defined everything” – compute, network, security, and storage – leading to the vision of a software-defined data center. In this blog post, I share my perspectives on the major storage trends and the opportunities and challenges that lie ahead in realizing that vision.
Server virtualization brought efficiencies to the compute infrastructure. However, as compute virtualization in the data center matured and delivered a flexible, programmable compute architecture, it exposed the rigidity of networking and storage functions. Software Defined Networking (SDN) rose to prominence in 2012 and promised to bring the same level of flexibility and programmability to network and security functions that server virtualization brought to compute.
The next target was storage, so the industry coined the term Software Defined Storage (SDS), borrowing similar concepts from SDN. Before we get into the SDS discussion, it is worth briefly looking at SDN and drawing parallels between SDS and SDN.
VMware recently announced NSX, a network virtualization (NV) platform that isn’t quite the same as SDN. Martin Casado, founder of Nicira and CTO of Networking at VMware, explains the differences and acknowledges that the lack of a clear definition of SDN is causing confusion among customers. Cisco responded with ACI, sparking an ACI vs. NSX debate that will be a topic for another blog post.
Just as server virtualization provides software constructs such as vCPU, vRAM, and vNIC decoupled from hardware, network virtualization provides L2–L7 software constructs for switches, routers, load balancers, and firewalls, allowing the network to be provisioned programmatically along with compute.
Similar trends can be seen in the telecom industry, where functions such as the session border controller (SBC) are moving off appliances and onto general-purpose servers. The industry term for this trend is network function virtualization (NFV). What drove it was essentially operators tiring of the lifecycle management issues of juggling multiple appliances from multiple vendors, and seeing an opportunity to virtualize those functions on commodity hardware. Again, I’ll cover this topic in a separate blog post.
Software Defined Storage Trends
I bring SDN, NV, and NFV into the discussion because the same broader IT trend is shaping the storage industry:
- Agility: Enterprise IT organizations need agility in managing the infrastructure to meet the business needs, or risk losing the business to public cloud providers.
- Cost: Datacenter architecture is trending towards more and more general purpose and homogeneous hardware with virtualized infrastructure services layered on top to allow automation and flexible provisioning to serve application needs of the business. This trend is disruptive to traditional hardware appliance vendors.
So how is this broader IT trend shaping the storage industry? I would classify it into four major areas:
- Storage Evolution will follow Networking Evolution
Storage, in many ways, is similar to networking, and software defined storage is causing the same terminology confusion that SDN did. If you look around, you’ll find many definitions of Software Defined Storage, mostly tweaked by vendors to suit their own interests. Established vendors such as NetApp are saying they have had Software Defined Storage (SDS) features for years – they just didn’t call it SDS.
EMC is taking a different approach and recently introduced ViPR, a new software-only offering in the form of a controller, similar in approach to an OpenFlow controller. And then there is a whole slew of startups, each defining SDS on its own terms. For instance, Nexenta provides a storage solution based on OpenSolaris and ZFS running on commodity JBODs, adding yet another angle to the SDS definition.
Regardless of the lack of a clear definition of SDS, the key objective the industry is trying to accomplish, just as with SDN, is to reduce the complexity of managing storage in a virtualized world of compute and network. Server virtualization has created a new set of challenges here: the granularity of storage management has moved from the physical server to the VMs hosted on it. Existing storage boxes had no notion of a VM, creating an opportunity for startups such as Tintri to build VM-aware storage. Meanwhile, VMware announced vVols to allow storage vendors to build VM awareness into their appliances.
- Converged storage and compute emerging as a SAN/NAS killer for certain workloads
Motivated by the architectures of hyperscale companies such as Google and Facebook, another key trend is to bring compute and storage together in a scale-out architecture that obviates the need for traditional SAN/NAS devices. Nutanix is one of the leading players in that space, and VMware recently announced VSAN, which is built on the same principles and brings SAN-like benefits at a much lower cost. Microsoft Windows Server 2012 has several new features, such as Storage Spaces and the SMB 3.0 Scale-Out File Server (SoFS), that build upon similar architectural principles.
Maxta is another startup that recently came out of stealth mode. They provide a storage pool created from local storage and exposed as an NFS appliance to each hypervisor.
- Object storage as the fastest-growing segment, driving major architectural changes
Object storage is the fastest-growing segment in cloud storage. Unlike file/block storage, object storage stores an entire object and its metadata together and provides an object identifier for access. Object storage implementations typically expose REST interfaces with simple put, get, and delete operations.
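To make that access model concrete, here is a minimal in-memory sketch of object-store semantics. The `ObjectStore` class and the content-derived identifier are illustrative assumptions, not any vendor’s API:

```python
import hashlib

class ObjectStore:
    """Minimal model of an object store: a flat namespace with
    whole-object put/get/delete, metadata stored alongside the data."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes, metadata: dict) -> str:
        # A real store returns an opaque identifier; here we derive it
        # from the content, as content-addressed stores do.
        oid = hashlib.sha256(data).hexdigest()
        self._objects[oid] = (data, dict(metadata))
        return oid

    def get(self, oid: str):
        data, metadata = self._objects[oid]
        return data, metadata

    def delete(self, oid: str) -> None:
        del self._objects[oid]

store = ObjectStore()
oid = store.put(b"hello", {"content-type": "text/plain"})
```

Note there is no partial update: callers replace the whole object and address it only by `oid`, which is what lets implementations scale out behind a simple REST facade.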
While one can build object-based storage arrays with software and commodity JBODs, Seagate is removing many layers of the hardware/software stack by taking the object interface directly to its drives, through a new platform called Seagate Kinetic.
- Flash will deliver IOPS while HDD delivers capacity
Finally, flash is emerging as a mainstream storage technology in the enterprise. One can find a variety of flash-based products, both as pure flash arrays or PCIe cards and within hybrid disk arrays. Companies such as Skyera are trying to bring the cost of flash down to $1–$2 per GB without deduplication and compression (close to the price point of HDD) in their 1U 65TB to 250TB appliances offering up to 5 million IOPS. The price, performance, and density economics of these devices will start changing the disk vs. flash array mix in data centers in a big way.
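A back-of-the-envelope calculation shows why the mix shifts: flash wins on cost per IOPS even while HDD wins on cost per GB. The flash figures below come from the Skyera numbers above; the HDD figures are illustrative assumptions for a 2013-era enterprise drive, not vendor data:

```python
def cost_per_iops(price_per_gb, capacity_gb, iops):
    """Total device cost divided by the IOPS it delivers."""
    return price_per_gb * capacity_gb / iops

# All-flash appliance: $1.50/GB, 65 TB, 5M IOPS (figures quoted above)
flash = cost_per_iops(price_per_gb=1.5, capacity_gb=65_000, iops=5_000_000)

# Assumed enterprise HDD: $0.50/GB, 600 GB, ~200 IOPS
hdd = cost_per_iops(price_per_gb=0.5, capacity_gb=600, iops=200)

print(f"flash: ${flash:.4f}/IOPS, HDD: ${hdd:.2f}/IOPS")
# prints: flash: $0.0195/IOPS, HDD: $1.50/IOPS
```

Under these assumptions flash is roughly two orders of magnitude cheaper per IOPS, which is exactly why IOPS-bound tiers move to flash while capacity stays on disk.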
Opportunities and Challenges
While the SDS space is still evolving, customers can start taking advantage of some of these offerings for certain types of workloads. However, so much has changed in storage over the last few years that it has become difficult for customers to make the right decisions for their business problems.
The key dilemma: storage decisions must be made along many dimensions, such as location (cloud or on-premises), technology (flash, disk, or hybrid), type of storage (file/block or object), and so forth. Imagine making all of these decisions at scale across thousands of VMs and applications, and managing the dynamic storage needs of those workloads to meet application SLAs while optimizing cost.
In an ideal world, a CIO would buy storage capacity in all of these categories from best-of-breed vendors and layer a storage controller on top to manage application requirements (performance, availability, etc.) dynamically at optimal cost. That would be true zero-admin storage nirvana.
With the current trends, we seem to be heading in that direction but are a few years away from seeing mature solutions. Can EMC ViPR evolve to be that intelligent storage controller? Or would that be a new offering from another startup?
Another key challenge is enforcing application SLAs in a complex and heterogeneous data center environment, especially from a storage perspective. In a virtualized world, the path of an IO request from a VM traverses many layers, making end-to-end policies hard to define and meet. Just as in OpenFlow, one needs to define the notion of a “storage flow” and instrument all layers, from the OS, drivers, and hypervisors to the network and storage endpoints, to rate-limit the flow and manage storage policies through a control plane. Microsoft Research has recently published an architecture that attempts to address this challenge.
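The per-layer control point such a design needs can be sketched as a token bucket per storage flow. The class below is a minimal illustration under assumed names; it is not the published architecture’s API:

```python
import time

class StorageFlow:
    """Token-bucket rate limiter for one 'storage flow' (VM -> storage
    endpoint). A control plane would install one of these at each layer
    and tune its rate to enforce the flow's SLA share."""

    def __init__(self, iops_limit: float, burst: float):
        self.rate = iops_limit        # tokens (IOs) replenished per second
        self.capacity = burst         # maximum burst of IOs allowed at once
        self.tokens = burst
        self.last = time.monotonic()

    def admit(self, n_ios: int = 1) -> bool:
        # Replenish tokens for the time elapsed since the last check.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n_ios:
            self.tokens -= n_ios
            return True               # forward the IO downstream
        return False                  # queue or throttle: flow over its limit
```

A flow configured with `StorageFlow(iops_limit=10, burst=5)` admits its first five IOs immediately, then rejects further IOs until tokens replenish, which is the basic mechanism an end-to-end storage control plane composes across layers.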
To conclude, there is a lot of innovation happening in storage software that is going to disrupt the traditional storage industry once again. It’s an exciting time to be in IT, where everything is becoming “Software Defined”. As Marc Andreessen explained in “Why Software Is Eating the World”, the storage industry appears to be software’s next prey.