AI Hardware and Real-World AI
AI is fast becoming a significant consumer of the world’s computational power, so it is crucial to use that power wisely and efficiently. Our approaches to doing so must span all levels of the research stack: from fundamental theoretical understanding of the loss surfaces and regularization properties of machine learning models, to efficient transistor-level layout of floating-point multipliers and RAM. I will talk about projects such as real-time computer vision on the Microsoft HoloLens HPU (about 3.5 GFLOPS), which required extreme efficiency in both objective and gradient computations, and how this relates to the training of massive AI models on Graphcore’s IPU (about 350 TFLOPS). Key to this work is how we empower programmers to communicate effectively with such hardware, and how we design frameworks and languages to ensure we can put theory into practice.
This talk therefore touches on mathematical optimization, automatic differentiation, programming languages, and silicon design. Despite this range of topics, the plan is for it to be accessible and useful to anyone who loves computers.
Recording (RSECon, Nov 2022): https://www.youtube.com/watch?v=DGF1FtfbO6o