Title: AutoML for Efficiently Designing Efficient Neural Network Architectures

Abstract: Efficient deep learning computing requires algorithm and hardware co-design to enable specialization and acceleration: we usually need to change the algorithm to reduce the memory footprint and improve energy efficiency. However, the design space is large: neural network models are notoriously hard to tune to balance accuracy and latency, and human engineers can hardly exhaust such a space with heuristics; manual design is labor-intensive and sub-optimal. This talk will describe our recent AutoML techniques that systematically guide neural network architecture design. It will cover: automatically designing small and fast models (ProxylessNAS), automatic channel pruning (AMC), and automatic mixed-precision quantization (HAQ). We demonstrate that such learning-based, automated design achieves better performance and efficiency than rule-based human design. Moreover, we shorten the design cycle by 200x compared with previous work, so that we can afford to design specialized neural network models for different hardware platforms.
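To make the "learning-based, automated design" idea concrete, the following is a minimal, illustrative Python sketch of an automated search loop in the spirit of these techniques: a searcher proposes candidate architectures (here, per-layer channel widths), scores each against an accuracy/latency trade-off, and keeps the best. Everything in it is an assumption for illustration: the proxy accuracy and latency functions, the random-search "learner", and the latency budget are hypothetical stand-ins, not the actual ProxylessNAS, AMC, or HAQ algorithms.

    # Illustrative sketch only: a toy automated search over per-layer channel
    # widths. The accuracy/latency models below are stand-ins, not measurements.
    import random

    LAYERS = 4
    CHOICES = [16, 32, 64, 128]  # candidate channel counts per layer (assumed)

    def proxy_accuracy(widths):
        # Stand-in for validation accuracy: wider helps, diminishing returns.
        return sum(w ** 0.5 for w in widths)

    def proxy_latency(widths):
        # Stand-in for latency on a target device: grows with total width.
        return sum(widths)

    def reward(widths, latency_budget=200.0):
        # Reward accuracy, heavily penalizing candidates over the budget.
        acc = proxy_accuracy(widths)
        lat = proxy_latency(widths)
        return acc if lat <= latency_budget else acc - 10.0 * (lat - latency_budget)

    best, best_r = None, float("-inf")
    for _ in range(1000):  # simple random search standing in for the learner
        cand = [random.choice(CHOICES) for _ in range(LAYERS)]
        r = reward(cand)
        if r > best_r:
            best, best_r = cand, r

    print("best widths:", best, "reward: %.1f" % best_r)

In the actual systems, the random search is replaced by a learned agent (reinforcement learning or gradient-based search) and the proxies by real accuracy and hardware measurements; re-running such a loop per device is what a 200x shorter design cycle makes affordable.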

Bio: Song Han is an assistant professor at MIT EECS. He received his Ph.D. in Electrical Engineering from Stanford University, advised by Prof. Bill Dally. Dr. Han's research focuses on efficient deep learning computing. He proposed "Deep Compression" and the "EIE" accelerator, which have had a significant impact on industry. His work received the Best Paper Award at ICLR'16 and FPGA'17. He is the co-founder and chief scientist of DeePhi Tech (a leading provider of efficient deep learning solutions), which was acquired by Xilinx in 2018.