Session Type
Quick Talks
Date & Time
Wednesday, May 10, 2023, 11:30 AM - 12:30 PM
Name
Quick Talks
Talk Order
  1. Multiple-Entry, Multiple-Exit MLIR Regions - Jeff Niu
  2. ML-on-CPU: should vectorization happen in the LLVM backend or higher up the stack? - Elen Kalda
  3. Tensor Evolution - An ML Graph Optimization Technique - Javed Absar
  4. OpenMP as GPU Kernel Language - Johannes Doerfert
  5. Improving Vectorization for Loops with Control Flow - Ashutosh Nema
  6. Iterative Compilation - Give the Compiler a Second Chance - Ziv Ben Zion
Abstracts


Multiple-Entry, Multiple-Exit MLIR Regions - Jeff Niu

MLIR regions provide a natural representation of the structured control flow found in many applications, with implicit SSA value captures and automatic memory scopes, but they have been limited to single-entry, single-exit regions. In this talk, we present a new MLIR region-based control-flow representation for single-entry, multiple-exit regions and show how it provides a faithful IR model of the control flow in source languages. We have also integrated LLVM coroutine intrinsics into our compiler, and we will discuss how they interact with our control-flow representation and how the latter enables trivial implementations of coroutine frame optimizations.
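
To make the motivation concrete, here is a hedged, hand-written C++ sketch (our own illustration, not an example from the talk) of source-level control flow with one entry but several exits, which a single-entry, single-exit region cannot model directly:

    // One entry, multiple exits: the loop body can leave the region
    // through a break, an early return, or normal loop termination.
    int find_first(const int *a, int n, int key) {
        for (int i = 0; i < n; ++i) {
            if (a[i] < 0)
                break;        // exit 1: abandon the search
            if (a[i] == key)
                return i;     // exit 2: leave the enclosing function
        }
        return -1;            // exit 3: fall through after the loop
    }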


ML-on-CPU: should vectorization happen in the LLVM backend or higher up the stack? - Elen Kalda

This talk is about how TVM, one of the most mature machine learning compilation stacks in the ML space, interacts with LLVM. TVM is a domain-specific compiler that consumes a machine learning model expressed in a high-level ML framework like TensorFlow or PyTorch and compiles it for a chosen target, such as the Arm(R) architecture. For CPU targets, it does this by using LLVM as a backend, directly translating TVM's IR into LLVM IR.

In TVM, just like in other machine learning stacks using LLVM as a backend for CPU code generation, one needs to decide where optimizations like vectorization should happen: in the LLVM backend, or higher up in the ML stack. This is further complicated by the emergence of scalable vectors, like the Scalable Vector Extension (SVE). While generating code for fixed-length vectors can mostly be left to LLVM, there is a case to be made for representing variable-length vectors in the TVM stack, to make more effective use of the capabilities of SVE. In this talk, we present our experiences and insights on the trade-offs of targeting SVE in the TVM+LLVM stack.
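
As a hedged illustration of what the scalable-vector programming model looks like (hand-written ACLE intrinsics in C++, not code generated by TVM or LLVM), the following loop is vector-length agnostic, i.e. it runs unchanged on any SVE vector width:

    #include <arm_sve.h>
    #include <cstdint>

    // Predicated, vector-length-agnostic elementwise add.
    // svcntw() is the number of 32-bit lanes, unknown at compile time.
    void vadd(float *dst, const float *a, const float *b, int64_t n) {
        for (int64_t i = 0; i < n; i += svcntw()) {
            svbool_t pg = svwhilelt_b32_s64(i, n);  // mask off out-of-bounds lanes
            svfloat32_t va = svld1_f32(pg, a + i);  // predicated loads
            svfloat32_t vb = svld1_f32(pg, b + i);
            svst1_f32(pg, dst + i, svadd_f32_x(pg, va, vb));
        }
    }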
 

Tensor Evolution - An ML Graph Optimization Technique - Javed Absar

We present ‘Tensor Evolution’ (TEV), a new analysis for tensors such as those found in the loops of ML graphs. It extends the well-known Scalar Evolution (SCEV) analysis to tensors and tensor expressions. In an ML graph, tensors can be added, multiplied, sliced, reshaped, and concatenated. We describe how each of these tensor ops can be handled to generate TEV expressions and rewrite rules. TEV enables optimizations such as loop-invariant code motion.
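
As a hedged sketch of the idea (our own rendering by analogy with SCEV add-recurrences; the talk's actual notation may differ), written in LaTeX:

    A SCEV add-recurrence describes a scalar updated every iteration:
    \[ i = \{0, +, 1\} \quad\Rightarrow\quad i_k = k . \]
    TEV lifts the same idea to a loop-carried tensor $T$ updated by a
    loop-invariant step tensor $S$:
    \[ T = \{T_0, +, S\} \quad\Rightarrow\quad T_k = T_0 + k \cdot S
       \ \text{(elementwise)} , \]
    and such closed forms expose which subexpressions are loop-invariant
    and can therefore be hoisted out of the loop.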


OpenMP as GPU Kernel Language - Johannes Doerfert

In this talk, we discuss the use of OpenMP as a kernel language (think CUDA or HIP). While OpenMP has long come with offloading capabilities, its execution model differed from that of kernel languages and was generally associated with overheads. Further, users did not have the same level of control, at least not without target-specific builtins. With our new OpenMP extensions, we can match native CUDA (and HIP) code while retaining the portability of OpenMP as well as interoperability with the existing capabilities.
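
For context, standard OpenMP offloading already expresses a GPU kernel implicitly; a minimal, hedged sketch in C++ (plain OpenMP 5.x, not the new extensions discussed in the talk) looks like this:

    // Offload a vector add to the device: "target" moves execution to the
    // GPU, and "teams distribute parallel for" spreads iterations over
    // thread blocks and threads, much like a CUDA grid/block launch.
    void vadd(float *a, const float *b, const float *c, int n) {
        #pragma omp target teams distribute parallel for \
            map(to: b[0:n], c[0:n]) map(from: a[0:n])
        for (int i = 0; i < n; ++i)
            a[i] = b[i] + c[i];
    }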


Improving Vectorization for Loops with Control Flow - Ashutosh Nema

Auto-vectorization is an essential compiler optimization, but it becomes challenging in the presence of control flow. We introduce our implementation of Branch-On-Superword-Condition-Codes (BOSCC) vectorization for loops with conditional statements. BOSCC introduces a branch instruction that is conditionally taken based on the result of comparing two vector variables: the vector instructions guarded by a vector predicate are enclosed inside an if-statement, so they can be skipped entirely when no vector lane needs them.
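
The effect of the transformation can be sketched in scalar C++ as follows (a hedged, 4-lane illustration of the idea only; a real implementation emits a vector compare and a single "any lane set" test on the predicate):

    // Stand-in for the guarded computation worth branching around.
    float expensive(float x) { return x * x; }

    // Vectorized-by-4 loop with a BOSCC-style bypass: compute the 4-lane
    // predicate once, then branch over the whole guarded block when no
    // lane is active. Assumes n is a multiple of 4.
    void foo(float *b, const float *a, int n) {
        for (int i = 0; i < n; i += 4) {
            bool lane[4];
            bool any = false;
            for (int l = 0; l < 4; ++l) {
                lane[l] = a[i + l] > 0.0f;  // the "vector compare"
                any |= lane[l];             // reduce the predicate to one bit
            }
            if (any) {                      // the BOSCC branch
                for (int l = 0; l < 4; ++l)
                    if (lane[l])
                        b[i + l] = expensive(a[i + l]);
            }
        }
    }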
 

Iterative Compilation - Give the Compiler a Second Chance - Ziv Ben Zion

Compiler heuristics play a crucial role in improving the performance of generated code. Some of the decisions taken by the compiler can be revisited using different compilation flags, which can sometimes overcome wrong compiler decisions. This talk introduces a different approach, in which the compiler itself triggers new compilation runs with different heuristics. I will briefly outline how we implemented this approach in our LLVM-based compiler.
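
As a hedged, simplified sketch of the general idea (an external C++ driver with a crude code-size score; the talk's implementation lives inside the compiler itself, and the threshold values below are arbitrary):

    #include <cstdio>
    #include <cstdlib>
    #include <string>
    #include <vector>

    int main() {
        // Candidate heuristic overrides to retry with ("" = defaults).
        const std::vector<std::string> trials = {
            "",
            "-mllvm -unroll-threshold=300",   // more aggressive unrolling
            "-mllvm -inline-threshold=500",   // more aggressive inlining
        };
        std::string best;
        long bestScore = -1;
        for (const auto &flags : trials) {
            std::string cmd = "clang -O2 " + flags + " kernel.c -c -o out.o";
            if (std::system(cmd.c_str()) != 0)
                continue;                     // skip configurations that fail
            // Use object size as a stand-in score; a real driver would use
            // profile data or a performance model instead.
            std::system("wc -c < out.o > score.txt");
            long score = 0;
            if (FILE *f = std::fopen("score.txt", "r")) {
                std::fscanf(f, "%ld", &score);
                std::fclose(f);
            }
            if (bestScore < 0 || score < bestScore) {
                bestScore = score;
                best = flags;
            }
        }
        std::printf("best flags: '%s' (score %ld)\n", best.c_str(), bestScore);
    }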

Location Name
Imperial Suite