Date & Time
Thursday, November 18, 2021, 10:45 AM - 11:15 AM
Name
Byte types, or how to get rid of i8 abuse for chars in LLVM IR
Description

LLVM IR does not have a universal data type that can hold arbitrary memory contents, as C (unsigned char) and C++ (std::byte) do. Instead, integers (in particular i8, i16, i32, and i64) are used for this purpose, which makes it possible for them to carry pointers. Since LLVM’s alias analyses do not take integer operations into account, this can lead to wrong aliasing results. The abuse of integers as universal data containers also shows up in optimizations of memory transfer functions: calls to memcpy, memmove, etc. are sometimes lowered into integer loads and stores of the corresponding bit width, and this type punning, combined with other LLVM optimizations, can lead to miscompilations. In this talk, we present a new byte type for LLVM IR that can be used as a universal data type and that solves these load type punning problems. We discuss its semantics, the engineering effort required to introduce it in LLVM, and its performance impact.

Session Type
Technical Talk