- Using LLMs to avoid costly bisections in the LLVM Buildbot - Carlos Seo
- Optimizing generic code lowering to LLVM-IR through function equivalence coalescing - Alina Sbirlea
- Project Widen Your Char-izons: Adding wchar support to LLVM-libc - Uzair Nawaz, Sriya Pratipati
- Extending ThinLTO Support for AMDGPU - Shilei Tian
- TangoLLVM: An LLVM Backend for the Go compiler - Tianxiao Gu
Using LLMs to avoid costly bisections in the LLVM Buildbot - Carlos Seo
In this talk, we will cover the use of Large Language Models (LLMs) to find regressions in the LLVM Buildbot and avoid costly bisections. This presentation will showcase the design of the solution Linaro has been using for the past 10 months to monitor the Arm buildbots, and will highlight the lessons learned while using LLMs in CI.
Optimizing generic code lowering to LLVM-IR through function equivalence coalescing - Alina Sbirlea
This talk describes a solution to the problem of duplicate LLVM IR functions being emitted when lowering generic code such as C++ templates and generics in Rust, Swift, or Carbon. The aim is to tackle the resulting code size and compile-time costs, which originally affected C++ templates, and for which a front-end approach is expected to be more efficient than LLVM's function merging pass. We present an algorithm for coalescing distinct front-end-level functions into a single LLVM IR function when those functions are equivalent in LLVM IR. To do this, we use LLVM types to build a canonical fingerprint for functions, even when the corresponding types are distinct in the language's front end. We implement this proof of concept in Carbon's front end during lowering to LLVM IR. The algorithm determines whether two functions are equivalent by considering their SemIR (Carbon's IR) representation and their lowered LLVM type information, handling recursion through strongly-connected-component (SCC) analysis of the call graph, and using two kinds of fingerprints to identify potential equivalences. We also discuss alternatives and future improvements.
Project Widen Your Char-izons: Adding wchar support to LLVM-libc - Uzair Nawaz, Sriya Pratipati
Project Widen Your Char-izons adds wide character functionality to LLVM-libc. This includes implementing parallels to the standard string utilities (e.g., concatenation, length) and facilitating conversions between multibyte and wide characters, currently supporting the UTF-8 and UTF-32 encodings. There were many interesting implementation details and design choices to make when implementing these functions, such as "how do we handle someone partially converting a character" or "should we support multiple wide character sizes" (we wanted the answer to be no). This talk elaborates on the design decisions and challenges encountered while implementing these libc functions. It is primarily targeted at runtime developers interested in character encodings. The content is moderately technical, focusing on our design rationale and the trade-offs behind the alternatives we did not pursue. The ultimate goal is to explain the established framework and demonstrate its potential for future expansion to UTF-16.
Extending ThinLTO Support for AMDGPU - Shilei Tian
In this talk, we'll briefly introduce the ongoing effort to support ThinLTO for AMDGPU. We'll start by discussing the motivation for enabling ThinLTO and the current limitations in the AMDGPU ABI that prevent us from using it out of the box. By default, ThinLTO compiles modules from each translation unit in parallel, effectively following a split scheme based on translation units. To work around some of the limitations, we've made targeted modifications to the existing ThinLTO infrastructure. However, not all limitations can be addressed with workarounds. To properly support ThinLTO, we'll introduce a new split scheme that divides the program based on a graph constructed from the module summary. The remaining ThinLTO infrastructure will then compile the resulting splits in parallel, instead of compiling modules per translation unit as ThinLTO does by default. We also expect this new scheme to benefit other GPU targets that don’t share the same ABI constraints as AMDGPU.
TangoLLVM: An LLVM Backend for the Go compiler - Tianxiao Gu
We add LLVM as an alternative backend for the Go compiler. The LLVM backend can be used to generate code for selected functions only. Unlike TinyGo or GOLLVM, we do not aim to build everything with LLVM. Instead, we still use the Go compiler to parse and compile the source code and produce an object file. Before the Go compiler lowers its SSA into a platform-dependent form, we translate the generic Go SSA into LLVM bitcode, compile the bitcode and generate the necessary auxiliary data (e.g., GC stack maps), and patch the code and auxiliary data into the object file generated by the Go compiler. This approach has three benefits. First, we reuse the many optimizations the Go compiler applies before lowering (e.g., escape analysis, nil-check elimination). Second, we do not need to handle Go's internal ABI when lowering every call instruction to LLVM IR, nor do we need to generate type descriptors and other module data in LLVM. Third, we can reuse the many optimizations available in LLVM; for example, we perform additional inlining on the LLVM side to further improve performance. We have implemented TangoLLVM on top of Go 1.19/1.24 and LLVM 19 and evaluated it on the go1 benchmark suite, measuring a geomean improvement of 10.41%.