“No army can withstand the strength of an idea whose time has come” – Victor Hugo.
We seem to have reached that moment for storage, much as we did for compute back circa 2003.
Why 2003? The 2-socket 2U became the workhorse compute engine for a range of workloads (the web cloud and enterprise virtualization).
This form factor spawned a whole category of compute that had not existed before. The same underlying silicon technology (processor and DRAM) accelerated the adoption of the architectural shift. I pick 2003 because between 2002 and 2004 a number of events conspired to bring about the rise of compute virtualization while simultaneously enabling distributed computing. Here are some notable events.
- Throughput (multi-core/threading) over latency (MHz): AMD announced Opteron (x86-64) in 2003, with Intel following suit with Gallatin in 2003 and then its own 64-bit parts in 2004. It was the beginning of the multi-core era for x86 (it had dawned earlier on others, such as SPARC). DDR2 memory was introduced at the same time.
- Compute virtualization: EMC's acquisition of VMware, but more importantly a major update to ESX with support for VMotion and Virtual SMP. VMware drove the definition of virtualization support in x86 CPUs, which showed up 2-3 years later (VT-x etc.).
- Distributed computing: Google shared the ‘GFS’ paper, which was perhaps the motivation for MapReduce and eventually Hadoop.
The common theme across these three events was the availability of dual-socket, multi-core CPU-based systems (the Dell 2U) with more than adequate compute and memory at low cost, which enabled both server consolidation (virtualization) and the emergence of distributed computing outside of Google (Hadoop, for example).
2017 = 2003 for storage. The emergence of 2U 24-drive NVMe storage
- Storage throughput: the emergence of NVMe as the performant flash/storage interface, with a potential cost crossover with SSDs and, more importantly, high throughput, much like multi-core/multi-socket CPUs delivered for compute. With each drive sustaining 2-3 GB/s, a commodity storage platform can deliver 30-40 GB/s. This is timely given the emergence of RoCE and 100Gb Ethernet (4x100Gb).
- The emergence of 16 TB flash drives (3D NAND), with NAND on the cusp of a dramatic cost decrease as capacity increases.
- Distributed systems: the emergence of a variety of robust distributed data store solutions (some call it software-defined storage), mostly from emerging startups (see below).
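The throughput and capacity figures above can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below is only illustrative: the per-drive range, drive count, drive size and 4x100Gb links come from this post, while the constant and function names are made up for the example, and filling all 24 slots with 16 TB drives is an assumption. It ignores protocol, parity/replication and CPU overheads.

```python
# Back-of-the-envelope numbers for a 2U, 24-drive NVMe box.
# All names here are illustrative; figures come from the post unless noted.

DRIVES_PER_BOX = 24                 # "2U 24-drive" form factor
DRIVE_GBYTES_PER_SEC = (2.0, 3.0)   # sustained per-drive range quoted above
DRIVE_CAPACITY_TB = 16              # assumption: every slot holds a 16 TB drive
NIC_PORTS = 4                       # 4x100Gb Ethernet
NIC_GBITS_PER_PORT = 100


def raw_drive_throughput(drives: int, gbytes_per_drive: float) -> float:
    """Aggregate drive bandwidth in GB/s, before any overhead."""
    return drives * gbytes_per_drive


def network_ceiling(ports: int, gbits_per_port: int) -> float:
    """Line-rate network bandwidth in GB/s (8 bits per byte)."""
    return ports * gbits_per_port / 8


raw_low = raw_drive_throughput(DRIVES_PER_BOX, DRIVE_GBYTES_PER_SEC[0])
raw_high = raw_drive_throughput(DRIVES_PER_BOX, DRIVE_GBYTES_PER_SEC[1])
net = network_ceiling(NIC_PORTS, NIC_GBITS_PER_PORT)

# What the box can push to the rack is capped by the slower of the two.
deliverable_low = min(raw_low, net)
deliverable_high = min(raw_high, net)

raw_capacity_tb = DRIVES_PER_BOX * DRIVE_CAPACITY_TB

print(f"raw drive bandwidth: {raw_low:.0f}-{raw_high:.0f} GB/s")
print(f"network ceiling:     {net:.0f} GB/s")
print(f"deliverable (ideal): {deliverable_low:.0f}-{deliverable_high:.0f} GB/s")
print(f"raw capacity:        {raw_capacity_tb} TB")
```

Under these assumptions the drives and the 4x100Gb links land in the same ballpark, which is the point of the pairing. The 30-40 GB/s quoted above sits below both ideal numbers, as one would expect once real-world overheads are included, though the exact derivation of that figure is not given here.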
Once again, thanks to the underlying silicon technology (3D NAND and NVMe in this case), throughput, capacity and cost have reached a perfect storm that will enable a whole range of new categories for storage. The value shifts to the software, as it did then with VMware. The time is ripe for the ‘VMware’ of storage to emerge.
The emergence of Top Of the Rack Storage (TORS), much like TOR emerged with the transition to commodity 2-socket rack-mount systems, will enable a whole new class of systems to be deployed. For example, it's now possible to go ‘diskless’ in each server, and with the advent of NVMoF coupled to a TB 2U box, it's very likely that most cloud-scale infrastructure could be built with compute that has just CPU and memory, with all storage consolidated into this TORS within the rack.
Shrijeet Mukherjee of Cumulus Networks makes a good analogy and a similar observation for networking (see trident). He asks:
- Which OS will unlock the networking innovations and thinking like Linux vendors like RedHat, SuSE, and TurboLinux did for compute applications? ….
A corollary question is who will emerge as the ‘VMware’ of storage, and what the key attributes are. The who is likely to be a company founded a few years back, much as Google and VMware were founded a few years before the 2003 moment. The key attributes are a truly distributed data management system that tolerates drive, node, rack and DC failures, offers continuous availability, and exposes file, block, object and emerging data access methods (KV, tables, streams, queues etc.).
Looking back, 1998 was an interesting year. That was the year VMware and Google were founded. Coincidentally, in 1998 I led the team that enabled Sun to deliver the lowest-cost ($1K) workstation and server running Unix that was faster than Intel CPUs, while at the same time Sun announced the E10K at $1M apiece. Little did I know then that the seeds of the shift that arrived 5 years later were sown in 1998.