PLDI 2021
Sun 20 - Sat 26 June 2021 PLDI
Mon 21 Jun 2021 18:00 - 18:25 at ARRAY - Session 4 (short talks) Chair(s): Jonathan Ragan-Kelley

This paper demonstrates performance enhancements to DGEMM, the high-performance dense matrix-matrix multiply kernel widely implemented by vendors in Basic Linear Algebra Subprograms (BLAS) libraries. The Mathematics of Arrays (MoA) paradigm due to Mullin (1988) yields contiguous memory accesses in combination with Church-Rosser complete language constructs optimized for target processor architectures. Our performance studies demonstrate that the MoA implementation of DGEMM, combined with optimal cache-blocking strategies, achieves at least a 25% performance gain on both Intel Xeon Skylake and IBM Power-9 processors over the vendor-supplied Intel MKL and IBM ESSL basic linear algebra libraries.
Results are presented for the NREL Eagle and ORNL Summit supercomputers.
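To make the two ideas named in the abstract concrete, here is a minimal sketch of a cache-blocked matrix multiply whose innermost loop walks memory with unit stride. It is not the authors' MoA implementation or any vendor kernel. The function name `dgemm_blocked`, the flat row-major storage, and the default block size of 64 are all assumptions chosen for illustration. Python is used only to show the loop structure, not to measure performance.

```python
def dgemm_blocked(a, b, m, k, n, block=64):
    """Return C = A * B as a flat row-major list.

    a is m x k and b is k x n, both stored as flat row-major lists.
    `block` is a hypothetical tile size; a real kernel would tune it
    to the cache sizes of the target processor.
    """
    c = [0.0] * (m * n)
    # Outer three loops step over tiles so each tile can stay in cache.
    for ii in range(0, m, block):
        i_end = min(ii + block, m)
        for pp in range(0, k, block):
            p_end = min(pp + block, k)
            for jj in range(0, n, block):
                j_end = min(jj + block, n)
                # Inner loops use i-p-j order: the j loop reads a row of B
                # and updates a row of C at consecutive addresses.
                for i in range(ii, i_end):
                    row_c = i * n
                    for p in range(pp, p_end):
                        a_ip = a[i * k + p]
                        row_b = p * n
                        for j in range(jj, j_end):
                            c[row_c + j] += a_ip * b[row_b + j]
    return c


if __name__ == "__main__":
    # 2 x 3 times 3 x 2, with a tile size smaller than the matrices.
    a = [1.0, 2.0, 3.0,
         4.0, 5.0, 6.0]
    b = [7.0, 8.0,
         9.0, 10.0,
         11.0, 12.0]
    print(dgemm_blocked(a, b, 2, 3, 2, block=2))
```

The tile loops change only the order in which partial products are accumulated, so the result matches an unblocked multiply for any positive block size.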

Extended abstract (ARRAY_2021_paper_4 (revised).pdf), 547 KiB

Mon 21 Jun

Displayed time zone: Eastern Time (US & Canada)

18:00 - 21:00
Session 4 (short talks) at ARRAY
Chair(s): Jonathan Ragan-Kelley MIT CSAIL
18:00
25m
Talk
Improving the Performance of DGEMM with MoA and Cache-Blocking
ARRAY
Stephen Thomas National Renewable Energy Laboratory, Lenore Mullin SUNY Albany, USA, Kasia Swirydowicz Pacific Northwest National Laboratory
File Attached
18:25
25m
Talk
Nested Object Support in a Structure-of-Arrays Dynamic Object Allocator
ARRAY
Jizhe Chenxin Tokyo Institute of Technology, Hidehiko Masuhara Tokyo Institute of Technology
File Attached
18:50
25m
Talk
Data Layouts are Important (Extended Abstract)
ARRAY
Doru Thom Popovici Lawrence Berkeley National Lab, Andrew Canning Lawrence Berkeley National Laboratory, Zhengji Zhao Lawrence Berkeley National Laboratory, Lin-Wang Wang Lawrence Berkeley National Laboratory, John Shalf Lawrence Berkeley National Laboratory
File Attached