AKG: Automatic Kernel Generation for Neural Processing Units using Polyhedral Transformations
Thu 24 Jun 2021 02:00 - 02:05 at PLDI-A - Talks 2A: Machine Learning
Existing tensor compilers have proven their effectiveness in deploying deep neural networks on general-purpose hardware such as CPUs and GPUs, but optimizing for neural processing units (NPUs) remains challenging because of their heterogeneous compute units and complicated memory hierarchies.
In this paper, we present AKG, a tensor compiler for NPUs. AKG first lowers the tensor expression language to a polyhedral representation, which is used to automate the memory management of NPUs. Unlike existing approaches that resort to manually written schedules, AKG leverages polyhedral schedulers to perform a much wider class of transformations, and extends the semantics of the polyhedral representation to combine complex tiling techniques and hierarchical fusion strategies. We also implement domain-specific optimizations for convolution in AKG. Moreover, to achieve optimal performance, we introduce complementary optimizations in code generation, followed by an auto-tuner.
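To make the scheduling step concrete, below is a minimal sketch of what a polyhedral scheduler does, written with islpy (Python bindings for isl, the library underlying many polyhedral schedulers). AKG's actual representation and APIs are not spelled out in this abstract, so the statement name, domain sizes, and dependence here are illustrative assumptions, not AKG's pipeline.

import islpy as isl

# Iteration domain of one statement S over a 1024x1024 tensor.
# (Hypothetical statement; AKG derives such domains from the
# tensor expression language.)
domain = isl.UnionSet("{ S[i, j] : 0 <= i < 1024 and 0 <= j < 1024 }")

# A toy flow dependence: each S[i, j] must run before S[i + 1, j],
# e.g. a reduction carried along the i axis.
deps = isl.UnionMap(
    "{ S[i, j] -> S[i + 1, j] : 0 <= i < 1023 and 0 <= j < 1024 }"
)

# Ask the isl scheduler for a schedule that respects the dependence
# (validity) while keeping dependent iterations close (proximity).
constraints = (
    isl.ScheduleConstraints.on_domain(domain)
    .set_validity(deps)
    .set_proximity(deps)
)
schedule = constraints.compute_schedule()

# The resulting schedule tree is the kind of representation a
# compiler can further transform (e.g., tiling and fusion) before
# emitting target code.
print(schedule)

The division of labor mirrors the abstract's claim: the scheduler automatically searches a wide space of legal loop transformations from dependence constraints, while tiling and fusion decisions are layered on the resulting schedule tree afterward.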
We conduct extensive experiments on benchmarks ranging from single operators to end-to-end networks. The experimental results show that AKG achieves performance superior to both manual scheduling approaches and vendor-provided libraries. We believe AKG will shed light on follow-up compiler work for NPUs.