Hot Chips 2020 Live Blog: Silicon Photonics for AI (6:00pm PT)
08:28PM EDT – Lightmatter develops silicon photonics based AI systems
08:29PM EDT – A new type of computer
08:29PM EDT – Lightmatter Mars
08:29PM EDT – multi-chip solution
08:31PM EDT – Workloads grow out to datacenter scales
08:31PM EDT – a new hw approach is needed
08:31PM EDT – Using standard CMOS
08:31PM EDT – Optical transport
08:32PM EDT – Perform computation in the optical domain, even in parallel
08:32PM EDT – 1000x faster than electronics at 10x speed with 1000x less power for the same die area
08:32PM EDT – Single MAC at microwatt in photonics vs a milliwatt in electronics
08:32PM EDT – Takes about the same area
08:33PM EDT – 10s watts for data transport – free with optics
08:33PM EDT – Free from RC time constants
08:33PM EDT – 10s of watts to single digit microwatts
08:34PM EDT – MZI phase shift interference detection
08:34PM EDT – Mach Zehnder Interferometer
08:34PM EDT – Interference creates a multiplier
08:34PM EDT – No fundamental energy required
08:34PM EDT – near-zero
08:35PM EDT – independent of process, voltage
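To make the "interference creates a multiplier" point concrete, here is a minimal sketch of an ideal, lossless Mach-Zehnder interferometer. It assumes the textbook model (equal split, phase shift on one arm, recombination), not Lightmatter's actual device; the function names and the choice of encoding the weight as cos²(φ/2) are illustrative assumptions.

```python
import cmath
import math

def mzi_output_intensity(input_intensity, phase):
    """Intensity at one output port of an ideal, lossless MZI (textbook model)."""
    amplitude = math.sqrt(input_intensity)
    # Split the light equally, phase-shift one arm, then recombine.
    field = amplitude * (1 + cmath.exp(1j * phase)) / 2
    return abs(field) ** 2

def phase_for_weight(weight):
    """Phase setting that scales the input intensity by `weight` (0..1).

    Follows from the output intensity being input * cos^2(phase / 2).
    """
    return 2 * math.acos(math.sqrt(weight))

x = 0.8   # input value encoded as optical intensity (arbitrary units)
w = 0.25  # weight encoded in the phase shifter
y = mzi_output_intensity(x, phase_for_weight(w))
print(round(y, 6))  # 0.2, i.e. x * w
```

In this idealized model the multiplication itself consumes no energy; the light that does not reach this port exits the other one.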
08:35PM EDT – Many ways to build phase shifters
08:35PM EDT – Mars uses Nano Optical Electro Mechanical System NOEMS
08:35PM EDT – Run at 100s of MHz vs 10s of kHz
08:36PM EDT – Mars uses a mechanical solution
08:36PM EDT – affect the refractive index
08:36PM EDT – suspend in air during manufacture and etch under it
08:36PM EDT – Cdyn is super low
08:37PM EDT – Optical Vector MAC
08:37PM EDT – Directional couplers
08:38PM EDT – 2×2 matrix multiplied by a 1×2 vector
08:38PM EDT – At speed of light, almost zero power
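The 2×2 matrix times two-element vector idea can be sketched as below. This assumes the standard textbook model of an MZI as two 50:50 directional couplers around a single phase shifter; the matrix convention and helper names are assumptions for illustration, not details from the talk.

```python
import cmath
import math

def matmul(a, b):
    """Multiply two small matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(m, v):
    """Multiply a matrix by a vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def coupler():
    """Ideal 50:50 directional coupler (assumed convention)."""
    s = 1 / math.sqrt(2)
    return [[s, 1j * s], [1j * s, s]]

def phase_shifter(theta):
    """Phase shift applied to the upper arm only."""
    return [[cmath.exp(1j * theta), 0], [0, 1]]

def mzi(theta):
    """2x2 transfer matrix: coupler, phase shifter, coupler."""
    return matmul(coupler(), matmul(phase_shifter(theta), coupler()))

fields_in = [1.0, 0.5]  # optical field amplitudes on the two inputs
fields_out = matvec(mzi(math.pi / 3), fields_in)

# The matrix is unitary, so total optical power is conserved.
power_in = sum(abs(f) ** 2 for f in fields_in)
power_out = sum(abs(f) ** 2 for f in fields_out)
print(round(power_in, 6), round(power_out, 6))  # 1.25 1.25
```

The light simply propagates through the structure, so the matrix-vector product is computed in the time of flight; the phase setting selects which 2×2 matrix is applied.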
08:38PM EDT – Array of MZIs
08:38PM EDT – Build large matrix vector structures
08:38PM EDT – 1000×1000 or larger
08:39PM EDT – 1000s MACs per 100ps
08:39PM EDT – limitation is surrounding electronics
08:39PM EDT – High speed data photonics at the edge
08:39PM EDT – Performance scales with area
08:39PM EDT – power scales with sqrt(area)
08:40PM EDT – 64 DAC and 64 ADC = 4096 MACs
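One way to read the scaling claims: an N×N array performs N² MACs per pass but only needs N DACs on the way in and N ADCs on the way out. The sketch below just works through that arithmetic; it assumes area grows with N² and that converters dominate the electrical cost, and it ignores the converters used to load weights. Those are assumptions, not statements from the talk.

```python
def array_stats(n):
    """MACs per pass, converter count, and MACs per converter for an n x n array."""
    macs = n * n          # one MAC per matrix element
    converters = 2 * n    # n input DACs plus n output ADCs (weight loading ignored)
    return macs, converters, macs / converters

for n in (64, 256, 1000):
    macs, converters, ratio = array_stats(n)
    print(n, macs, converters, ratio)
# First line printed: 64 4096 128 32.0
```

Under these assumptions the work done per converter grows linearly with N, which is the square root of the array area.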
08:40PM EDT – Limit is pushing the weights into the array
08:40PM EDT – 3 orders of magnitude faster than electronics
08:41PM EDT – Each element can take multiple data points – parallel processing
08:42PM EDT – optics in different colors etc
08:42PM EDT – like fiber optics
08:42PM EDT – 1 GHz vector rate – set by data conversions
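As a back-of-the-envelope check, the 64×64 example and the 1 GHz vector rate can be combined into a peak MAC rate. This assumes both figures describe the same array and that a new vector is processed every cycle, which these notes do not confirm; the function name is hypothetical.

```python
def macs_per_second(array_dim, vector_rate_hz):
    """Peak MAC rate for an array_dim x array_dim array at a given vector rate."""
    return array_dim * array_dim * vector_rate_hz

rate = macs_per_second(64, 1e9)
print(f"{rate / 1e12:.3f} TMAC/s")  # 4.096 TMAC/s
```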
08:42PM EDT – 50mW laser
08:42PM EDT – 90nm GloFo standard photonics process
08:42PM EDT – 150mm2
08:42PM EDT – yield very well
08:43PM EDT – Mars SoC 14nm custom ASIC
08:43PM EDT – mm2
08:43PM EDT – Analog interfaces to photonics
08:43PM EDT – SRAM for weights and activations
08:44PM EDT – single fully synchronous pipeline scheduler
08:47PM EDT – 3W TDP…
08:48PM EDT – Most power is data movement
08:48PM EDT – 3D integration
08:49PM EDT – optical core and ASIC are stacked
08:49PM EDT – Laser power coming in from external to chip
08:49PM EDT – Support for ML Frameworks – PyTorch, TensorFlow, ONNX
08:51PM EDT – Q&A Time
08:51PM EDT – Q: 3W TDP? A: That’s for SiPh, Laser, SoC, everything
08:52PM EDT – Q: Perf on Resnet? A: Not publishing yet, but we have simulator results and demo chips in the lab
08:52PM EDT – Q: MLPerf? A: We’re working on it!
08:53PM EDT – Q: Models bigger than on-system memory? A: We know it’s an important problem to solve! We think we can solve it through photonics. Looking at scale out solutions for training and inference
08:54PM EDT – HBM roadmap is good, but it doesn’t scale with the BW that we need, so we need solutions that scale at factors of 10, that’s what we’re looking at
08:56PM EDT – Q: How robust are the interferometers? A: MEMS have a yield – you can enhance based on scales on feature sizes. Reliability – these things have to live in a datacenter for 10 years, so we’re looking at robust devices. High reliability MEMS device technology has been around a while. These devices are really small – we’re not pushing to a limit. These are tiny movements to affect the effective refractive index
08:56PM EDT – Q: DAC precision? A: Not as critical as you think. Generally 8-bit DAC. We can scale to 12-bit and still build high perf system. We’re building something that matches the rest of the industry. DAC/ADC are the rate limits of the design
08:57PM EDT – Q: 200ps, does that include digital? A: No, just the photonics and analog
08:58PM EDT – Q: Other neural networks? A: We’re looking at them, but our goal is a general purpose accelerator. We have space on the 14nm ASIC of course
08:58PM EDT – Q: Limitations on weight matrices? A: No, we can represent any matrix
08:59PM EDT – That’s the end of Hot Chips! I hope you’ve enjoyed the Live Blogs. There’s so much to piece through after the fact. The Slack channel for attendees was going crazy. There’s a small wrap-up talk for a few minutes
09:00PM EDT – 2294 total participants this year!
09:00PM EDT – 2302! this slide was made too early
09:00PM EDT – Last year was a record 1250
09:01PM EDT – ~3.5% press in previous years
09:02PM EDT – Public slide decks and videos will be available later this year
09:07PM EDT – Thanks to everyone for tuning in. If you loved our content, sign up next year to watch it live 🙂