- Leveraging BOLT to improve data prefetching for AArch64 binaries - Shanzhi Chen, Wei Wei
- LLVM JIT — Upcoming Challenges and Opportunities - Lang Hames
- Attack of the Clones: Speeding Up Coroutine Compilation - Artem Pianykh
- State of Lifetime Safety in Clang - Utkarsh Saxena
- Challenges in binary rewriting: enabling BOLT to optimize CFI-hardened binaries - Gergely Balint
Leveraging BOLT to improve data prefetching for AArch64 binaries
Speaker(s): Shanzhi Chen, Wei Wei
The post-link optimizer BOLT provides a range of binary-level optimizations that mostly focus on code layout and effectively reduce front-end stalls in the Top-Down performance analysis view. We found that BOLT can also be a handy tool for emitting prefetch instructions into binaries to alleviate back-end stalls caused by cache misses. In this talk, we will cover how to leverage BOLT to improve data prefetching for AArch64 binaries. A new pass is added to BOLT to provide prefetching support for different variants of AArch64 load instructions. The existing dataflow analysis framework in BOLT is also enabled for AArch64 to provide register liveness information for prefetch addresses. In addition, the ARM SPE-based profiling technique is employed to provide insights into memory operations and to complete the overall profile-guided data prefetching optimization in BOLT.
LLVM JIT — Upcoming Challenges and Opportunities
Speaker(s): Lang Hames
LLVM's JIT can now run arbitrary real-world applications, as demonstrated by Xcode's Previews feature. Despite this success, enormous opportunities for improvement remain—especially in performance, memory consumption, tooling, and optimization. This talk will describe the most promising opportunities in these areas, sketch a roadmap for tackling them, and discuss how the community can collaborate to accelerate progress.
Attack of the Clones: Speeding Up Coroutine Compilation
Speaker(s): Artem Pianykh
Compiling coroutines with full debug information shouldn't be dramatically slower than with line tables — but we found CoroSplitPass running over 100x slower, adding minutes to compilation time. The cause traced back to LLVM's function cloning, where processing debug info metadata was O(Module) rather than O(Function). This talk covers the investigation, the upstream patches, and how the fix ended up benefiting all users of the function cloning API.
State of Lifetime Safety in Clang
Speaker(s): Utkarsh Saxena
This talk provides a status update on the evolution of Clang’s intra-procedural, flow-sensitive lifetime analysis, building upon the "Origins and Loans" model introduced at the 2025 US LLVM Dev Meet. We outline our strategies for scaling this analysis to Google’s codebase of 1 billion lines of C++, focusing on lifetime annotation adoption through automated inference and targeted suggestions. We share lessons learned from internal rollouts, balancing bug detection against compile-time regressions and false positives. We highlight our future roadmap for subprojects like iterator invalidation and conclude with a call for participation, inviting new contributors to join our efforts in advancing temporal memory safety.
Challenges in binary rewriting: enabling BOLT to optimize CFI-hardened binaries
Speaker(s): Gergely Balint
BOLT is increasingly adopted because it can provide additional performance uplift on top of LTO+PGO optimized binaries. At the same time, AArch64 binaries are commonly deployed with Control Flow Integrity features (PAC and BTI) enabled. This creates a practical challenge: until recently, BOLT couldn’t optimize such binaries. It would either crash or, worse, emit incorrect binaries that crash at runtime. The talk introduces our work on supporting these features in BOLT and describes key engineering challenges, including how implementing such features in a binary rewriter differs from implementing them in a compiler.