Date |
Paper |
July 30, 2025 |
John Ousterhout
"Always Measure One Level Deeper"
|
July 23, 2025 |
Zheng Liu, Meng Hao, Weizhe Zhang, Gangzhao Lu, Xueyang Tian, Siyu Yang, Mingdong Xie, Jie Dai, Chenyu Yuan, Desheng Wang & Hongwei Yang
"Optimizing depthwise separable convolution on DCU"
|
July 16, 2025 |
Robert Szafarczyk, Syed Waqar Nabi, Wim Vanderbauwhede
"Compiler Support for Speculation in Decoupled Access/Execute Architectures"
|
July 09, 2025 |
Gerasimos Gerogiannis and Josep Torrellas
"Micro-Armed Bandit: Lightweight & Reusable Reinforcement
Learning for Microarchitecture Decision-Making", MICRO 2023
|
July 02, 2025 |
Marcel Ullrich and Sebastian Hack
"Synthesis of Sorting Kernels", Code Generation and Optimization (CGO) 2025
|
June 25, 2025 |
Giorgis Georgakoudis, Konstantinos Parasyris, David Beckingsale
"Proteus: Portable Runtime Optimization of GPU Kernel Execution with Just-in-Time Compilation", Code Generation and Optimization (CGO) 2025
|
June 18, 2025 |
Chaitanya Mamatha Ananda, Rajiv Gupta, Sriraman Tallam, Han Shen, Xinliang David Li
"PreFix: Optimizing the Performance of Heap-Intensive Applications", Code Generation and Optimization (CGO) 2025
|
June 11, 2025 |
Martin Paul Lücke, Oleksandr Zinenko, William S. Moses, Michel Steuwer, Albert Cohen
"The MLIR Transform Dialect: Your Compiler Is More Powerful Than You Think", Code Generation and Optimization (CGO) 2025
|
June 04, 2025 |
Ningxin Zheng, Bin Lin, Quanlu Zhang, Lingxiao Ma, Yuqing Yang, Fan Yang, Yang Wang, Mao Yang, and Lidong Zhou
"SparTA: Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute", USENIX Symposium on Operating
Systems Design and Implementation 2022
|
May 28, 2025 |
Maksim Panchenko, Rafael Auler, Bill Nell, Guilherme Ottoni
"BOLT: A Practical Binary Optimizer for Data Centers and Beyond", CGO 2019
|
May 21, 2025 |
Chris Lattner, Mehdi Amini, Uday Bondhugula, Albert Cohen, Andy Davis, Jacques Pienaar, River Riddle, Tatiana Shpeisman, Nicolas Vasilache, Oleksandr Zinenko
"MLIR: Scaling Compiler Infrastructure for Domain Specific Computation", CGO 2021
|
May 14, 2025 |
Matthias Braun, Sebastian Buchwald, Sebastian Hack, Roland Leißa,
Christoph Mallon, and Andreas Zwinkau
"Simple and Efficient Construction of Static Single Assignment Form", CC 2013
|
March 25, 2025 |
Matteo Basso, Aleksandar Prokopec, Andrea Rosa, and Walter Binder
"Improving Native-Image Startup Performance", CGO 2025
|
February 04, 2025 |
Sebastian Kim and Alberto Ros
"Effective Context-Sensitive Memory Dependence Prediction", HPCA 2024
|
November 27, 2024 |
Akash Dutta, Ali Jannesari,
"MIREncoder: Multi-modal IR-based Pretrained Embeddings for Performance Optimizations," PACT 2024
|
October 30, 2024 |
Zhiying Xu, Jiafan Xu, Hongding Peng, Wei Wang, Xiaoliang Wang, Haoran Wan, Haipeng Dai, Yixu Xu, Hao Cheng, Kun Wang, and Guihai Chen,
"ALT: Breaking the Wall between Data Layout and Loop Optimizations for Deep Learning Compilation,"
EuroSys 2023
|
October 03, 2024 |
Quang Duong, Akanksha Jain, Calvin Lin,
"A New Formulation of Neural Data Prefetching,"
ISCA 2024
|
September 18, 2024 |
Roberto L. Castro, Andrei Ivanov, Diego Andrade, Tal Ben-Nun, Basilio B. Fraguela, Torsten Hoefler,
"VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores"
SuperComputing, 2023.
|
August 07, 2024 |
R. Bera, A. Ranganathan, J. Rakshit, S. Mahto, A. V. Nori, J. Gaur, A. Olgun, K. Kanellopoulos, M. Sadrosadati, S. Subramoney, O. Mutlu
"Constable: Improving Performance and Power Efficiency by Safely Eliminating Load Instruction Execution,"
ISCA, 2024.
|
July 31, 2024 |
T. Theodoridis, Z. Su,
"Refined Input, Degraded Output: The Counterintuitive World of Compiler Behavior,"
PLDI, 2024.
|
July 24, 2024 |
H. Xu, F. Kjolstad,
"Copy-and-Patch Compilation,"
OOPSLA, 2021.
|
July 17, 2024 |
F. Kjolstad, S. Kamil, S. Chou, D. Lugato, S. Amarasinghe
"The Tensor Algebra Compiler,"
OOPSLA, 2017.
|
July 10, 2024 |
P. Patel, E. Choukse, C. Zhang, A. Shah, Í. Goiri, S. Maleki, R. Bianchini,
"Splitwise: Efficient Generative LLM Inference Using Phase Splitting,"
International Symposium on Computer Architecture (ISCA), Buenos Aires, July, 2024.
|
July 03, 2024 |
A. Castelló, J. Bellavita, G. Dinh, Y. Ikarashi and H. Martínez,
"Tackling the Matrix Multiplication Micro-Kernel Generation with Exo,"
Code Generation and Optimization (CGO), Edinburgh, March, 2024.
|
June 25, 2024 |
Tommy McMichen, Nathan Greiner, Peter Zhong, Federico Sossai, Atmn Patel, Simone Campanoni,
"Representing Data Collections in an SSA Form,"
Code Generation and Optimization (CGO), Edinburgh, March, 2024.
|
June 19, 2024 |
Volker Seeker, Chris Cummins, Murray Cole, Bjorn Franke, Kim Hazelwood, Hugh Leather,
"Revealing Compiler Heuristics Through Automated Discovery and Optimization,"
Code Generation and Optimization (CGO), Edinburgh, March, 2024.
|
June 11, 2024 |
Alnis Murtovi, Giorgis Georgakoudis, Konstantinos Parasyris, Chunhua Liao, Ignacio Laguna, Bernhard Steffen,
"Enhancing Performance Through Control-Flow Unmerging and Loop Unrolling on GPUs,"
Code Generation and Optimization (CGO), Edinburgh, March, 2024.
|
June 05, 2024 |
Ben L. Titzer,
"Whose Baseline Compiler is it Anyway?,"
Code Generation and Optimization (CGO), Edinburgh, March, 2024.
|
May 29, 2024 |
Martin Maas, David G. Andersen, Michael Isard, Mohammad Mahdi Javanmard, Kathryn S. McKinley, and Colin Raffel
"Combining Machine Learning and Lifetime-Based Resource Management for Memory Allocation and Beyond"
Communications of the ACM, April, 2024.
|
May 22, 2024 |
Ettore Tiotto, Víctor Pérez, Whitney Tsang, Lukas Sommer, Julian Oppermann, Victor Lomüller, Mehdi Goli, James Brodman,
"Experiences Building an MLIR-Based SYCL Compiler,"
Code Generation and Optimization (CGO), Edinburgh, March, 2024.
|
May 15, 2024 |
Florian Drescher, Alexis Engelke
"Fast Template-Based Code Generation for MLIR,"
Compiler Construction, Edinburgh, March, 2024.
|
May 08, 2024 |
Amir Ayupov, Maksim Panchenko, Sergey Pupyrev
"Stale Profile Matching,"
Compiler Construction, Edinburgh, March, 2024.
|
May 02, 2024 |
Dorit Nuzman, Ayal Zaks, Ziv Ben-Zion
"If-Convert as Early as You Must,"
Compiler Construction, Edinburgh, March, 2024.
|
April 18, 2024 |
Cyrus Zhou, Zack Hassman, Dhirpal Shah, Vaughn Richard, Yanjing Li
"YFlows: Systematic Dataflow Exploration and Code Generation for Efficient Neural Network Inference using SIMD Architectures on CPUs,"
Compiler Construction, Edinburgh, March, 2024.
|
April 11, 2024 |
"oneDNN Graph Compiler: A Hybrid Approach for High-Performance Deep Learning Compilation,"
Code Generation and Optimization (CGO), Edinburgh, March, 2024.
|
April 04, 2024 |
Nassim Abderaouf Amalou, Elisa Fromont, Isabelle Puaut,
"Fast and Accurate Context-Aware Basic Block Timing Prediction using Transformers,"
Compiler Construction, Edinburgh, March, 2024.
|
March 14, 2024 |
James Reed, Zachary DeVito, Horace He, Ansley Ussery, and Jason
Ansel. 2022. Torch.fx: practical program capture and transformation
for deep learning in python. In Proceedings of Machine Learning and
Systems. D. Marculescu, Y. Chi, and C. Wu, (Eds.) Vol. 4, 638–651.
|
February 22, 2024 |
Jason Ansel et al., "PyTorch 2: Faster Machine Learning Through Dynamic
Python Bytecode Transformation and Graph Compilation," In ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) ’24, April 27-May 1, 2024, La Jolla, CA, USA. |
February 15, 2024 |
Weihao Cui, Zhenhua Han, Lingji Ouyang, Yichuan Wang, Ningxin Zheng, Lingxiao Ma, Yuqing Yang, Fan Yang, Jilong Xue, Lili Qiu, Lidong Zhou, Quan Chen, Haisheng Tan, Minyi Guo. "Optimizing Dynamic Neural Networks with Brainstorm," In
17th USENIX Symposium on Operating Systems Design and
Implementation (OSDI 23). USENIX
Association, Boston, MA, July 2023. |
January 25, 2024 |
Hongyu Zhu, Ruofan Wu,
Yijia Diao, Shanbin Ke, Haoyu Li, Chen Zhang, Jilong
Xue, Lingxiao Ma, Yuqing Xia, Wei Cui, Fan Yang, Mao
Yang, "ROLLER: Fast and Efficient Tensor Compilation for
Deep Learning," 16th USENIX Symposium on Operating
Systems Design and Implementation, Carlsbad, CA, USA,
July, 2022. |
January 11, 2024 |
Y. Xing, J. Weng, Y. Wang, L. Sui, Y. Shan and Y. Wang, "An in-depth
comparison of compilers for deep neural networks on hardware", 2019 IEEE International Conference on
Embedded Software and Systems (ICESS), pp. 1-8, 2019. |
November 30, 2023 |
X Yi, S Zhang, L Diao, C Wu, Z Zheng, S Fan, S Wang, J Yang, W Lin,
"Optimizing DNN Compilation for Distributed Training With Joint OP and Tensor Fusion"
IEEE Transactions on Parallel and Distributed Systems, 2022 |
November 23, 2023 |
M Bansal, O Hsu, K Olukotun, F Kjolstad, "Mosaic: An Interoperable
Compiler for Tensor Algebra"
Proceedings of the ACM on Programming Languages, 2023 |
November 7, 2023 |
Chen, T., Moreau, T., Jiang, Z., Zheng, L., Yan, E., Shen, H., ... &
Krishnamurthy, A. (2018). "TVM: An automated End-to-End optimizing compiler for deep learning" In 13th
USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). |
October 31, 2023 |
M Cowan, D Dangwal, A Alaghi, C Trippel, VT Lee, B Reagen. "Porcupine: A
synthesizing compiler for vectorized homomorphic encryption" In
Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and
Implementation |
July 20, 2023 |
I Korostelev, JP L. De Carvalho, J Moreira, JN Amaral. "YaConv:
Convolution with low cache footprint" ACM Transactions on Architecture and Code Optimization, 2023 |
July 13, 2023 |
W Praharenka, D Pankratz, JPL De Carvalho, E Amiri, JN Amaral.
"Vectorizing divergent control flow with active-lane consolidation on long-vector architectures" The Journal
of Supercomputing |
July 6, 2023 |
T Furtak, JN Amaral, R Niewiadomski.
"Using SIMD registers and instructions to enable instruction-level parallelism in sorting algorithms" In
Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures |
June 29, 2023 |
I Ireland, JN Amaral, R Silvera, S Cui.
"SafeType: detecting type violations for type-based alias analysis of C" Software: Practice and
Experience |
June 23, 2023 |
JPL De Carvalho, B Kuzma, I Korostelev, JN Amaral, C Barton, J Moreira, G Araujo.
"KernelFaRer: replacing native-code idioms with high-performance library calls" ACM Transactions On
Architecture And Code Optimization (TACO) |
June 15, 2023 |
J Huynh, JN Amaral, P Berube, SAA Touati.
"Evaluating address register assignment and offset assignment algorithms" ACM Transactions on Embedded
Computing Systems (TECS) |
June 8, 2023 |
R Govindarajan, H Yang, JN Amaral, C Zhang, GR Gao.
"Minimum register instruction sequencing to reduce register spills in out-of-order issue superscalar
architectures" IEEE Transactions on Computers |
June 1, 2023 |
P Berube, JN Amaral.
"Combined profiling: A methodology to capture varied program behavior across multiple inputs" In 2012 IEEE
International Symposium on Performance Analysis of Systems & Software |
May 25, 2023 |
A Wang, M Gaudet, P Wu, M Ohmacht, JN Amaral, C Barton, R Silvera, MM
Michael.
"Software support and evaluation of hardware transactional memory on Blue Gene/Q" IEEE Transactions on
Computers |
May 18, 2023 |
S Curial, P Zhao, JN Amaral, Y Gao, S Cui, R Silvera, R Archambault.
"MPADS: memory-pooling-assisted data splitting" In Proceedings of the 7th international symposium on Memory
management |
May 11, 2023 |
A Chikin, T Lloyd, JN Amaral, E Tiotto, M Usman.
"Memory-access-aware safety and profitability analysis for transformation of accelerator-bound OpenMP loops"
ACM Transactions on Architecture and Code Optimization (TACO) |
May 4, 2023 |
R Nabinger-Sanchez, JN Amaral, D Szafron, M Pirvu, and M Stoodley.
"Using machines to learn method-specific compilation strategies" In International Symposium on Code
Generation and Optimization (CGO 2011) |
April 20, 2023 |
U Hölzle, C Chambers, D Ungar.
"Optimizing dynamically-typed object-oriented languages with polymorphic inline caches" In ECOOP'91 European
Conference on Object-Oriented Programming: Geneva, Switzerland |
April 6, 2023 |
SS Mannarswamy, R Govindarajan, R Surendran.
"Region based structure layout optimization by selective data copying" In 2009 18th International Conference
on Parallel Architectures and Compilation Techniques |
March 9, 2023 |
TA Khan, M Ugur, K Nathella, D Sunwoo, H Litz, DA Jiménez, B Kasikci.
"Whisper: Profile-guided branch misprediction elimination for data center applications" In 2022 55th
IEEE/ACM International Symposium on Microarchitecture (MICRO) |
February 23, 2023 |
J Savage, TM Jones.
"Halo: Post-link heap-layout optimisation" In Proceedings of the 18th ACM/IEEE International Symposium on
Code Generation and Optimization |
February 8, 2023 |
L Ye, M Lis, A Fedorova.
"A unifying abstraction for data structure splicing" In Proceedings of the International Symposium on Memory
Systems |
January 12, 2023 |
R Bruno, V Jovanovic, C Wimmer, G Alonso.
"Compiler-assisted object inlining with value fields" In Proceedings of the 42nd ACM SIGPLAN International
Conference on Programming Language Design and Implementation |
November 30, 2022 |
B Liu, A Laird, WH Tsang, B Mahjour, MM Dehnavi.
"Combining Run-time Checks and Compile-time Analysis to Improve Control Flow Auto-Vectorization" In
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques |
November 9, 2022 |
T Kistler, M Franz.
"Continuous program optimization: A case study" ACM Transactions on Programming Languages and Systems
(TOPLAS) |
October 20, 2022 |
BC Schwedock, P Yoovidhya, J Seibert, N Beckmann.
"Täkō: A polymorphic cache hierarchy for general-purpose optimization of data movement" In Proceedings of
the 49th Annual International Symposium on Computer Architecture |
September 15, 2022 |
NR Tallent, JM Mellor-Crummey, MW Fagan.
"Binary Analysis for Measurement and Attribution of Program Performance" ACM Sigplan Notices |