DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion (PLDI 2021 - PLDI Research Papers)

Who

Wei Niu, Jiexiong Guan, Yanzhi Wang, Gagan Agrawal, Bin Ren

Track

PLDI 2021

Time Zone

Times are shown in the conference time zone, (GMT-04:00) Eastern Time (US & Canada).

The GMT offset shown reflects the offset at the time of the conference.

When

Wed 23 Jun 2021 13:50 - 13:55 at PLDI-A - Talks 2A: Machine Learning
Thu 24 Jun 2021 01:50 - 01:55 at PLDI-A - Talks 2A: Machine Learning

Abstract

Deep Neural Networks (DNNs) have emerged as the core enabler of many major applications on mobile devices. To achieve high accuracy, DNN models have become increasingly deep, with hundreds or even thousands of operator layers, leading to high memory and computational requirements for inference. Operator fusion (or kernel/layer fusion) is a key optimization in many state-of-the-art DNN execution frameworks, such as TensorFlow, TVM, and MNN, that aim to improve the efficiency of DNN inference. However, these frameworks usually adopt fusion approaches based on certain patterns that are too restrictive to cover the diversity of operators and layer connections, especially those seen in many extremely deep models. Polyhedral-based loop fusion techniques, on the other hand, work on a low-level view of the computation without operator-level information, and can also miss potential fusion opportunities. To address this challenge, this paper proposes a novel and extensive loop fusion framework called DNNFusion. The basic idea is to work at the operator view of DNNs, but to expand fusion opportunities by developing a classification of both individual operators and their combinations. In addition, DNNFusion includes 1) a novel mathematical-property-based graph rewriting framework to reduce evaluation costs and facilitate subsequent operator fusion, 2) an integrated fusion plan generation that leverages high-level analysis and accurate light-weight profiling, and 3) additional optimizations during fusion code generation. DNNFusion is extensively evaluated on 15 DNN models with varied types of tasks, model sizes, and layer counts. The evaluation results demonstrate that DNNFusion finds up to $8.8\times$ higher fusion opportunities and outperforms four state-of-the-art DNN execution frameworks with a $9.3\times$ speedup. The memory requirement reduction and speedups can enable the execution of many of the target models on mobile devices and even make them part of a real-time application.
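
To make the term "operator fusion" concrete, here is a minimal Python sketch. It is an illustration only, not DNNFusion's implementation (which the abstract does not detail): the function names `run_unfused` and `run_fused` and the Mul, Add, ReLU chain are assumptions chosen for the example. It shows three element-wise operators executed separately, each producing an intermediate buffer, and the same computation fused into a single loop.

```python
from typing import List


def run_unfused(x: List[float], scale: float, bias: float) -> List[float]:
    """Run Mul, Add, and ReLU as three separate operators.

    Each operator reads a full tensor and writes a full intermediate tensor.
    """
    t1 = [v * scale for v in x]        # Mul operator, materializes t1
    t2 = [v + bias for v in t1]        # Add operator, materializes t2
    return [max(v, 0.0) for v in t2]   # ReLU operator


def run_fused(x: List[float], scale: float, bias: float) -> List[float]:
    """Run the same chain as one fused operator.

    A single pass over x computes the result per element, so the
    intermediates t1 and t2 are never stored.
    """
    return [max(v * scale + bias, 0.0) for v in x]


if __name__ == "__main__":
    data = [-1.0, 0.5, 2.0]
    print(run_unfused(data, 2.0, 1.0))
    print(run_fused(data, 2.0, 1.0))
```

Both functions return the same values; the fused version simply avoids the two intermediate lists, which is the memory and data-movement saving that fusion targets in general.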

DOI

https://doi.org/10.1145/3453483.3454083

Wei Niu

College of William & Mary

United States

Jiexiong Guan

College of William & Mary

United States

Yanzhi Wang

Northeastern University

United States

Gagan Agrawal

Augusta University

United States

Bin Ren

College of William & Mary

United States

Session Program

Wed 23 Jun
Displayed time zone: Eastern Time (US & Canada)

13:30 - 14:05	Talks 2A: Machine Learning (PLDI) at PLDI-A (repeated +12h)

13:30 5m Talk	Learning to Find Naming Issues with Big Code and Small Supervision. Jingxuan He (ETH Zurich), Cheng-Chun Lee (EPFL), Veselin Raychev (DeepCode), Martin Vechev (ETH Zurich)
13:35 5m Talk	Fast and Precise Certification of Transformers. Gregory Bonaert (ETH Zurich), Dimitar I. Dimitrov (ETH Zurich), Maximilian Baader (ETH Zurich), Martin Vechev (ETH Zurich)
13:40 5m Talk	Web Question Answering with Neurosymbolic Program Synthesis. Jocelyn Qiaochu Chen (University of Texas at Austin), Aaron Lamoreaux (University of Texas at Austin), Xinyu Wang (University of Michigan), Greg Durrett (University of Texas at Austin), Osbert Bastani (University of Pennsylvania), Işıl Dillig (University of Texas at Austin)
13:45 5m Talk	Robustness Certification with Generative Models. Matthew Mirman (ETH Zurich), Alexander Hägele (ETH Zurich), Timon Gehr (ETH Zurich), Pavol Bielik (ETH Zurich), Martin Vechev (ETH Zurich)
13:50 5m Talk	DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion. Wei Niu (College of William & Mary), Jiexiong Guan (College of William & Mary), Yanzhi Wang (Northeastern University), Gagan Agrawal (Augusta University), Bin Ren (College of William & Mary)
13:55 5m Talk	Vectorized Secure Evaluation of Decision Forests. Raghav Malik (Purdue University), Vidush Singhal (Purdue University), Benjamin Gottfried (Purdue University), Milind Kulkarni (Purdue University)
14:00 5m Talk	AKG: Automatic Kernel Generation for Neural Processing Units using Polyhedral Transformations. Jie Zhao (Hunan University, Changsha, Hunan), Bojie Li (Huawei Technologies), Wang Nie (Huawei Technologies), Zhen Geng (Huawei Technologies), Renwei Zhang (Huawei Technologies), Xiong Gao (Huawei Technologies), Bin Cheng (Huawei Technologies), Chen Wu (Huawei), Yun Cheng (Huawei Technologies), Zheng Li (Huawei Technologies), Peng Di (Huawei Technologies), Kun Zhang (Huawei Technologies), Xuefeng Jin (Huawei Technologies)

Thu 24 Jun
Displayed time zone: Eastern Time (US & Canada)

01:30 - 02:05	Talks 2A: Machine Learning (PLDI) at PLDI-A

01:30 5m Talk	Learning to Find Naming Issues with Big Code and Small Supervision. Jingxuan He (ETH Zurich), Cheng-Chun Lee (EPFL), Veselin Raychev (DeepCode), Martin Vechev (ETH Zurich)
01:35 5m Talk	Fast and Precise Certification of Transformers. Gregory Bonaert (ETH Zurich), Dimitar I. Dimitrov (ETH Zurich), Maximilian Baader (ETH Zurich), Martin Vechev (ETH Zurich)
01:40 5m Talk	Web Question Answering with Neurosymbolic Program Synthesis. Jocelyn Qiaochu Chen (University of Texas at Austin), Aaron Lamoreaux (University of Texas at Austin), Xinyu Wang (University of Michigan), Greg Durrett (University of Texas at Austin), Osbert Bastani (University of Pennsylvania), Işıl Dillig (University of Texas at Austin)
01:45 5m Talk	Robustness Certification with Generative Models. Matthew Mirman (ETH Zurich), Alexander Hägele (ETH Zurich), Timon Gehr (ETH Zurich), Pavol Bielik (ETH Zurich), Martin Vechev (ETH Zurich)
01:50 5m Talk	DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion. Wei Niu (College of William & Mary), Jiexiong Guan (College of William & Mary), Yanzhi Wang (Northeastern University), Gagan Agrawal (Augusta University), Bin Ren (College of William & Mary)
01:55 5m Talk	Vectorized Secure Evaluation of Decision Forests. Raghav Malik (Purdue University), Vidush Singhal (Purdue University), Benjamin Gottfried (Purdue University), Milind Kulkarni (Purdue University)
02:00 5m Talk	AKG: Automatic Kernel Generation for Neural Processing Units using Polyhedral Transformations. Jie Zhao (Hunan University, Changsha, Hunan), Bojie Li (Huawei Technologies), Wang Nie (Huawei Technologies), Zhen Geng (Huawei Technologies), Renwei Zhang (Huawei Technologies), Xiong Gao (Huawei Technologies), Bin Cheng (Huawei Technologies), Chen Wu (Huawei), Yun Cheng (Huawei Technologies), Zheng Li (Huawei Technologies), Peng Di (Huawei Technologies), Kun Zhang (Huawei Technologies), Xuefeng Jin (Huawei Technologies)