Pizza to Dojo…(Open *)

30+ years back (April 1989, to be exact) was the launch of the Pizza box form factor for servers (which eventually became the 1U/2U nomenclature). That simple mechanical form factor was the beginning of the server building block, but it was also an approach to design that we learnt from Andy Bechtolsheim and that stayed imprinted in memory. As the Wikipedia article notes, Andy “specified that the motherboard would be the size of a sheet of paper and the SBus expansion cards would be the size of index cards, resulting in an extremely compact footprint”.

But there was more to that simple statement than just form factor. The system design, i.e. the main silicon components, had to be designed to fit. That was his view, and that is where I got my chops in thinking about silicon in the context of the system. It was the beginning of the journey of more silicon integration (floating point, then cache, followed by MMU, then memory controller and IO controller, all within 3 years) and better IO (SBus, compared with the AT bus in PCs). That outside-in thinking was carried forward, and Andy continued to lead the industry in designing 2-way and 4-way SMP in small form factors, which forced silicon integration, a re-think of interfaces, and innovation in packaging. We were in the ‘green zone’ of Moore’s law. It was a full stack solution, as we had control over the silicon, platform hardware, operating system and perhaps even user experience.

Silicon, and thus hardware, was getting increasingly integrated, while software started its journey that decade to disintegrate, starting with domains in big SMPs (E10K) and moving on to virtualization and the massively distributed systems that we have come to accept as the modern mainframe. On reflection, the core form factor and the components inside it have not changed in the last 30 years: more integration, smaller coherence domains, DRAM and processors following Moore’s law, and IO evolving from SBus -> PCI -> PCIe. EDA, MCAD and HPC codes loved Pizza (boxes), i.e. it was then the favourite of the emerging high-performance computing workloads. But within 10 years, it was no longer the recipe for just those esoteric EDA engineers or HPC folks. It grew big in the form factor of the E10K to run the world’s enterprises, and eventually a separate stream took those Pizza boxes (Akamai was first, before Google, Amazon and Azure) and used step+repeat to build the cloud as we know it today.

The ‘system’, in retrospect, was fairly simple: it started from the bottom up, using a simple form factor to build more complex and scalable platforms.

30 years of systems innovation have been focused on integrating more into silicon and using software to both scale and disintegrate that silicon (virtualization, distributed systems), fueled in large part by the congruent evolution of Ethernet and networking.

At Hot Chips this year, folks at Tesla took an approach that has not been tried since 1989. Like then, they took a form factor view and defined some key attributes that had to be solved (cost of communication, memory/compute/IO ratios, latency across the span and many more). This time, the form factor was an aisle in a data center.

Source: Beyond Compute

Instead of being beholden to the current sources of silicon and platforms, Dojo engineers re-imagined all layers of this hierarchy (because they can) into this.

Source: Beyond Compute

With that they got back to this.

Source: Beyond Compute

Distracting note: at the other end of this spectrum, Apple with its M1 is pursuing a different full stack optimization.

While there has been tremendous innovation in ‘distributed systems’ over the last 20 years, as manifested by the clouds at Google, Amazon and Microsoft, the basic hardware building block, the hierarchies and the metrics for platform design followed the same rules of thumb, because the Moore exponential was still in vogue.

Looking back, progress has been incremental in large part because new workloads like ML training and inferencing, which demand heterogeneity in compute, did not appear until recently. That heterogeneity drives the need to deal with disaggregated memory and with differing ratios of compute, memory and IO based on the model/app. There is another reason beyond workloads to look at this from a new perspective: the same Moore curve is not ahead of us. So the path forward is to flatten the hierarchy in hardware and software, re-imagine the network/communication pathways, and disaggregate the system, but ‘integrate’ at the platform software level.

Ganesh Venkataramanan calls out the following software stack opportunities: use new APIs for ML (the Unix system call was the API between software and hardware back then), think massively parallel, more compilers and less kernel, a renewed focus on distributed compilers, reduced OS roles, flexible ratios of memory, compute and IO (we had fixed ratios for 30 years), and disaggregated memory.

Source: Beyond Compute

This time, we have to do the reverse. Instead of starting from the Pizza box to build the data center, work backwards from the data center system design to derive the new Pizza box or perhaps Pizza stack.

An at-scale disaggregated system has to be ‘shrunk’ to meet the demands of deployment everywhere, not just as a training supercomputer, just as the Pizza box of 1989 evolved to run all workloads and eventually ran everywhere.

While Dojo might look like a training supercomputer for the select few, shrinking from the data center to the Pizza stack as the new basic building block could well let it spread from the ML training data center to every enterprise workload, past, present and future. Perhaps this is akin to building the Roadster, then the Model S and Model 3, and perhaps throwing in a Cybertruck along the way!

The only way to do that is to open up the platform for others to innovate on, or for others to build the same with open standards, open source and the modern version of Open Systems. Both will happen. The era of the Open Cloud is opening. We have reached an inflection point like that of 1989.

We cannot predict what is possible in 10 years…but we can imagine and realize it.

Tesla’s Ganesh Venkataramanan shared the rationale behind their approach.

Author: renuraman

Always connecting the dots....
