As a millennial, I often see everything that ‘Gen Z’ does in the media get lumped into the millennial category. Thankfully there’s another type of Gen-Z in the world: the cache coherent memory-semantic standard. Where standards like CXL are designed to work inside a node, Gen-Z is meant to work between nodes, providing a switched fabric or point-to-point connectivity for memory, storage, accelerators, and even other servers.
Earlier this year we saw the announcement of a Gen-Z switch, which provides a fabric backbone to which hardware can be connected. The switch handles fabric management, switching, routing, and security, and supports mixed hardware configurations of storage, compute, and accelerators. We found one such add-on at this year’s Supercomputing: the ZMM, or Gen-Z Memory Module.
What we had in front of us was actually a dummy unit for show purposes, but the ZMM is a 3-inch wide memory device that adds distributed memory to the network so that different nodes can take advantage of it when needed. Inside is 256 GB of DRAM, providing 30 GB/s of bandwidth through the Gen-Z interface: that’s the equivalent of dual-channel DDR4-1866. The total latency is listed as 400 ns, which is several times that of locally attached main memory (typically on the order of 100 ns). Ultimately this is slower than traditional memory-controller supported DRAM, but it aims to be faster than network attached storage.
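For anyone who wants to check the dual-channel DDR4-1866 comparison, here is a minimal sketch of the peak-bandwidth arithmetic. It assumes a standard 64-bit data bus per DDR4 channel and a round 1866 MT/s transfer rate; the function name is made up for illustration, and the result is a theoretical peak, not a measured figure.

```python
def ddr_peak_bandwidth_gbs(transfer_rate_mts: float,
                           channels: int,
                           bus_width_bits: int = 64) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes).

    Assumes every channel moves bus_width_bits of data per transfer.
    """
    bytes_per_transfer = bus_width_bits / 8
    bytes_per_second = transfer_rate_mts * 1e6 * bytes_per_transfer * channels
    return bytes_per_second / 1e9


# Dual-channel DDR4-1866: 1866 MT/s x 8 bytes x 2 channels
dual_channel_ddr4_1866 = ddr_peak_bandwidth_gbs(1866, channels=2)
print(f"{dual_channel_ddr4_1866:.1f} GB/s")  # prints "29.9 GB/s"
```

That lands just under 30 GB/s, which lines up with the bandwidth quoted for the module.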
We also saw marketing for a PCIe Gen 5.0 compatible Gen-Z connector.
With these modules, the ultimate goal of Gen-Z is to have a 4U unit in a rack into which customers can install any number of memory modules, storage drives, accelerators, or other compute resources, without worrying about exactly where they are in the system or how the system can access them. The Gen-Z consortium is aiming for ‘rack-scale compatibility’, and wants to make these rack-level adjustments seamless to existing ecosystems without OS changes.