There is an increasing need to bring machine learning to a wide diversity of hardware devices, from the datacenter to the edge. Current frameworks rely on vendor-specific operator libraries and optimize for a narrow range of server-class GPUs. Deploying workloads to new platforms such as mobile phones, embedded devices, and accelerators (e.g., FPGAs, ASICs) requires significant manual effort. In this talk, we will present Apache TVM, an end-to-end optimizing deep learning compiler stack that deploys deep learning models on diverse hardware back-ends with performance competitive with state-of-the-art hand-tuned libraries for low-power CPUs, mobile GPUs, and server-class GPUs.
TVM: An Automated End-to-End Optimizing Compiler for Deep Learning
Tianqi Chen
September 13, 2019