Machine Learning for Autotuning Production Machine Learning Compilers (MAPS 2021)

Track

MAPS 2021

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Mon 21 Jun 2021 16:45 - 17:45 at MAPS - Session C

Abstract

Search-based techniques have proven effective at solving complex optimization problems that arise in domain-specific compilers for machine learning (ML). Unfortunately, deploying such techniques in production compilers is impeded by two limitations. First, prior work requires factoring a computation graph into smaller subgraphs over which search is applied. This decomposition is not only non-trivial but also significantly limits the scope of optimization. Second, prior work requires search to be applied at a single stage of the compilation flow, which does not fit the multi-stage, layered architecture of most production ML compilers.

I will present an autotuner for production ML compilers that can tune both graph-level and subgraph-level optimizations at multiple compilation stages. The autotuner applies a flexible search methodology that defines a search formulation for joint optimizations by accurately modeling the interactions between different compiler passes. The autotuner tunes tensor layouts, operator fusion decisions, tile sizes, and code generation parameters in XLA, a production ML compiler, using various search strategies. We demonstrate how to incorporate machine learning techniques such as a learned cost model and various learning-based search strategies to reduce autotuning time. In an evaluation across 150 ML training and inference models on Tensor Processing Units (TPUs), the autotuner offers a runtime speedup of up to 2.4x, and 5% on average, over the heavily optimized XLA compiler.
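As a rough illustration of the idea of using a cost model to reduce autotuning time, the toy sketch below ranks every candidate configuration with a cheap estimate and then measures only a short list. It is not the autotuner described in the talk and does not use any XLA API: the search space, the option names, and the functions `measure_runtime`, `cheap_cost_model`, and `autotune` are all hypothetical, and the "runtime" is a made-up formula standing in for a real compile-and-run measurement.

```python
import itertools

# Hypothetical joint search space; option names and values are illustrative only.
SEARCH_SPACE = {
    "layout": ["row_major", "col_major"],
    "fuse": [False, True],
    "tile_size": [8, 16, 32, 64],
}


def measure_runtime(config):
    """Stand-in for compiling and running on hardware (expensive in practice)."""
    cost = 100.0
    cost -= 10.0 if config["fuse"] else 0.0
    cost += abs(config["tile_size"] - 32) * 0.5
    # Interaction between decisions: fusion helps more with one particular layout.
    if config["fuse"] and config["layout"] == "col_major":
        cost -= 15.0
    return cost


def cheap_cost_model(config):
    """Stand-in for a learned cost model: fast, but only approximately right."""
    estimate = 100.0
    estimate -= 12.0 if config["fuse"] else 0.0
    estimate += abs(config["tile_size"] - 32) * 0.4
    if config["layout"] == "col_major":
        estimate -= 5.0
    return estimate


def all_configs(space):
    """Enumerate every combination of options in the search space."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def autotune(space, top_k=4):
    """Rank all candidates with the cost model, then measure only the top_k."""
    ranked = sorted(all_configs(space), key=cheap_cost_model)
    shortlist = ranked[:top_k]
    best = min(shortlist, key=measure_runtime)
    return best, measure_runtime(best), len(shortlist)


best_config, best_runtime, num_measured = autotune(SEARCH_SPACE)
print(best_config, best_runtime, num_measured)
```

In this toy setup only 4 of the 16 candidates are measured. Searching layout, fusion, and tile size jointly matters here because the best fusion decision depends on the layout, which a one-parameter-at-a-time search could miss.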

Bio

Mangpo is a research scientist at Google Brain, where she leads the Machine Learning for Machine Learning Compilers effort (one of Google Brain's moonshots in 2020). Her research interests include compilers, machine learning for systems, program synthesis, and efficient computing. Mangpo completed her PhD in Computer Science at UC Berkeley. Her dissertation focuses on synthesis-aided compilation and programming models for emerging architectures, ranging from an ultra-low-power processor to a programmable network card.

Session Program

Mon 21 Jun
Displayed time zone: Eastern Time (US & Canada)

16:45 - 19:15	Session C (MAPS) at MAPS

16:45 60m Talk		Machine Learning for Autotuning Production Machine Learning Compilers. MAPS. Phitchaya Mangpo Phothilimthana (Google)
17:45 30m Talk		Pure, Low-Level Tensor Program Rewriting via Access Patterns (Representation Pearl). MAPS. Gus Henry Smith (University of Washington), Andrew Liu (University of Washington), Steven Lyubomirsky (University of Washington, USA), Scott Davidson (University of Washington), Joseph McMahan (University of Washington), Michael Bedford Taylor (University of Washington), Luis Ceze (University of Washington), Zachary Tatlock (University of Washington, Seattle)
18:15 30m Talk		ControlFlag: A Self-supervised Idiosyncratic Pattern Detection System for Software Control Structures. MAPS. Niranjan Hasabnis (Intel Labs), Justin Gottschlich (Intel Labs / Penn)
18:45 30m Talk		Predictive Data Locality Optimization for Higher-Order Tensor Computations. MAPS. Tharindu Patabandi (University of Utah), Anand Venkat, Abhishek Kulkarni (Intel), Pushkar Ratnalikar (Intel Labs), Mary Hall (University of Utah), Justin Gottschlich (Intel Labs / Penn)
