- Using LLMs to avoid costly bisections in the LLVM Buildbot - Carlos Seo
- Optimizing generic code lowering to LLVM-IR through function equivalence coalescing - Alina Sbirlea
- Project Widen Your Char-izons: Adding wchar support to LLVM-libc - Uzair Nawaz, Sriya Pratipati
- Extending ThinLTO Support for AMDGPU - Shilei Tian
- TangoLLVM: An LLVM Backend for the Go compiler - Tianxiao Gu
Using LLMs to avoid costly bisections in the LLVM Buildbot - Carlos Seo
In this talk, we will cover the use of Large Language Models (LLMs) to find regressions in the LLVM Buildbot and avoid costly bisections. This presentation will showcase the design of the solution Linaro has been using for the past 10 months to monitor the Arm buildbots, and will highlight the lessons learned while using LLMs in CI.
Optimizing generic code lowering to LLVM-IR through function equivalence coalescing - Alina Sbirlea
This talk describes a solution to the problem of duplicate LLVM IR functions being emitted when lowering generic code such as C++ templates and generics in Rust, Swift, or Carbon. The aim is to tackle the resulting code size and compile-time costs, which originally affected C++ templates, and for which a front-end approach is expected to be more efficient than LLVM's function merging pass. We present an algorithm for coalescing distinct front-end-level functions into a single LLVM IR function when those functions are equivalent in LLVM IR. To do this, we use LLVM types to build a canonical fingerprint for functions, even when the corresponding types are distinct in the language's front end. We implement this proof of concept in Carbon's front end during lowering to LLVM IR. The algorithm determines whether two functions are equivalent by considering their SemIR (Carbon's IR) representation and their lowered LLVM type information, handling recursion through strongly-connected-component (SCC) analysis of the call graph, and using two kinds of fingerprints to identify potential equivalences. We also discuss alternatives and future improvements.
Project Widen Your Char-izons: Adding wchar support to LLVM-libc - Uzair Nawaz, Sriya Pratipati
Project Widen Your Char-izons adds wide character functionality to LLVM-libc. This includes implementing parallels to the standard string utilities (e.g., concatenation, length) and facilitating conversions between multibyte and wide characters, currently supporting the UTF-8 and UTF-32 encodings. There were many interesting implementation details and design choices to make when implementing these functions, such as "how do we handle someone partially converting a character" or "should we support multiple wide character sizes" (we wanted the answer to be no). This talk elaborates on the design decisions and challenges encountered while implementing these libc functions. It is primarily targeted at runtime developers interested in character encodings. The content is moderately technical, focusing on our design rationale and the trade-offs behind the alternatives we did not pursue. The ultimate goal is to explain the established framework and demonstrate its potential for future expansion to UTF-16.
Extending ThinLTO Support for AMDGPU - Shilei Tian
In this talk, we'll briefly introduce the ongoing effort to support ThinLTO for AMDGPU. We'll start by discussing the motivation for enabling ThinLTO and the current limitations in the AMDGPU ABI that prevent us from using it out of the box. By default, ThinLTO compiles modules from each translation unit in parallel, effectively following a split scheme based on translation units. To work around some of the limitations, we've made targeted modifications to the existing ThinLTO infrastructure. However, not all limitations can be addressed with workarounds. To properly support ThinLTO, we'll introduce a new split scheme that divides the program based on a graph constructed from the module summary. The remaining ThinLTO infrastructure will then compile the resulting splits in parallel, instead of compiling modules per translation unit as ThinLTO does by default. We also expect this new scheme to benefit other GPU targets that don’t share the same ABI constraints as AMDGPU.
TangoLLVM: An LLVM Backend for the Go compiler - Tianxiao Gu
We add LLVM as an alternative backend for the Go compiler. The LLVM backend can be used to generate code for selected functions only. Unlike TinyGo or GOLLVM, we do not aim to build everything with LLVM. Instead, we still use the Go compiler to parse and compile the source code and produce an object file. Before the Go compiler lowers its SSA into a platform-dependent form, we translate the generic Go SSA into LLVM bitcode, compile the bitcode and generate the necessary auxiliary data (e.g., GC stack maps), and patch the code and auxiliary data into the object file generated by the Go compiler. This approach has three benefits. First, we reuse the many optimizations the Go compiler applies before lowering (e.g., escape analysis, nil-check elimination). Second, we do not need to handle Go's internal ABI when lowering every call instruction to LLVM IR, nor do we need to generate type descriptors and other module data in LLVM. Third, we can reuse the many optimizations available in LLVM; for example, we perform additional inlining on the LLVM side to further improve performance. We have implemented TangoLLVM on top of Go 1.19/1.24 and LLVM 19 and evaluated it on the go1 benchmark suite, measuring a geomean improvement of 10.41%.