In a paper he co-wrote in 2012, Matias Bjorling calls for the necessary death of the block device interface in the Linux kernel as we know it. Back then, flash was just emerging as a storage tier in the enterprise and infrastructure I/O stack.
Go back to the era in which the block device abstraction was created in Unix (late 1970s to early 1980s). The IEEE standardization effort that became POSIX followed with file and directory access APIs that are OS independent. Around the same time, the 3.5″ HDD (circa 1983) came into existence, enabling both the PC and the workstation form factors. The operating-system-level abstraction and the IEEE standardization process allowed storage to be separated out behind a set of well-defined APIs, which made storage an industry in its own right, one that over the past 30 years has grown to more than $50B in size.
30 years later, flash entered the enterprise and infrastructure segment. Around the same time, a number of KV stores emerged that tried to map application use cases (NoSQL, databases, and messaging, to name a few) to flash, using a variety of KV abstraction APIs to improve the integration of flash into the platform. In the same period, object stores emerged, and with the rise of the cloud, S3 has effectively become the default standard for object storage, particularly for users of AWS.
With the emergence of NVRAM (or 3D XPoint), the reasons and rationale outlined in the paper are even more obvious. Until recently, I believed that a well-defined and well-designed KV store is the new ‘block device’. While that remains true, without a standardization process it will never gain wide acceptance or become the new ‘block device’. As in the late 1970s, three things are forming the storm clouds that point to a new block device. They are:
- Emergence of 3D XPoint or SCM as an intermediate tier between memory and flash, with both memory semantics and storage (or persistence) semantics.
- Emergence of S3 as the dominant API through which application programmers use cloud-based storage, and more generally as a dominant API for today’s programmer.
- The need for a POSIX-like, OS-independent (today you would call it cloud-independent) ‘KV store’ that addresses the new stack (SCM + flash) and exposes the latency and throughput these new media offer, which the old block interface would otherwise limit.
It’s obvious that the new storage API will be some variant of a KV store.
It’s obvious that the new storage will be ‘memory centric’, in the sense that it has to treat the SCM and flash tiers as the primary storage tiers and therefore meet their latency, throughput, and failure-mode requirements.
If the new interface is necessarily KV-like, why not make an ‘S3 compatible’ interface for the emerging persistence tier (SCM and flash)? Standardization is key, so why not co-opt the ever-popular S3 API?
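To make the idea concrete, here is a minimal sketch of what an S3-shaped KV interface might look like. It is an assumption-laden illustration, not the actual S3 API or any AWS library: the class name `ObjectKVStore` and its methods are hypothetical names that only loosely echo S3's put/get/delete/list verbs, and a plain in-memory dict stands in for the SCM and flash tiers.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ObjectKVStore:
    """Hypothetical S3-shaped key-value store (illustrative names only).

    A plain dict stands in for the SCM/flash persistence tier.
    """
    _buckets: Dict[str, Dict[str, bytes]] = field(default_factory=dict)

    def put_object(self, bucket: str, key: str, body: bytes) -> None:
        # Whole-value write addressed by name, not by block address.
        self._buckets.setdefault(bucket, {})[key] = bytes(body)

    def get_object(self, bucket: str, key: str) -> bytes:
        # Raises KeyError if the bucket or key does not exist.
        return self._buckets[bucket][key]

    def delete_object(self, bucket: str, key: str) -> None:
        # In this sketch, deleting a missing key is a silent no-op.
        self._buckets.get(bucket, {}).pop(key, None)

    def list_objects(self, bucket: str, prefix: str = "") -> List[str]:
        # Prefix listing gives a directory-like view over a flat namespace.
        keys = self._buckets.get(bucket, {})
        return sorted(k for k in keys if k.startswith(prefix))


store = ObjectKVStore()
store.put_object("logs", "2024/01/a", b"hello")
store.put_object("logs", "2024/02/b", b"world")
print(store.get_object("logs", "2024/01/a"))
print(store.list_objects("logs", prefix="2024/01/"))
```

The point of the sketch is the shape of the contract: callers name values by bucket and key and never see sectors or logical block addresses, which leaves the layer underneath free to place data on SCM or flash as it sees fit.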
AWS has a unique opportunity to re-imagine the new memory stack (SCM and flash), propose a ‘high performance’ S3-compatible API, and offer it as the new ‘POSIX’ standard.