Making computer vision systems that work: Boujou, Kinect, HoloLens
I have been lucky enough to be involved in the development of real-world computer vision systems for over twenty years. In 1999, prize-winning research from Oxford University was spun out to become the Emmy-award-winning camera tracker “boujou”, which has been used to insert computer graphics into live-action footage in pretty much every movie made since its release, from the “Harry Potter” series to “Bridget Jones’s Diary”. In 2007, I was part of the team that delivered human body tracking in Kinect for Xbox 360, and in 2015 I moved from Microsoft Research to work on Microsoft’s HoloLens, an AR headset brimming with cutting-edge computer vision technology, where I worked on the fully articulated hand tracking for HoloLens 2. In all of these projects, the academic state of the art had to be leapfrogged in accuracy and efficiency, sometimes by orders of magnitude. Sometimes that takes just raw engineering; sometimes it means completely new ways of looking at the research. If I had to nominate one key to success, it would be a focus on, well, everything: from low-level coding to algorithms to user interface design, and on always being willing to change one’s mind.
<html> <iframe width="560" height="315" src="https://www.youtube.com/embed/xqd1L6ubP64" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe> </html>