How do you write an LLVM analysis or transformation pass using LLVM’s new pass infrastructure? Since the documentation available online is sparse, we thought it would be useful to write a blog article on how to use LLVM’s new pass manager. Under the legacy pass manager, the pass interface is defined through inheritance. Passes under the new pass manager instead rely on concept-based polymorphism: there is no explicit interface to inherit from, and a pass only has to provide the expected member functions. Our team is happy to help you with any issues regarding this matter and others.
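To make the idea of concept-based polymorphism more tangible, here is a minimal, LLVM-free sketch of the idiom. It does not use LLVM's API: ToyFunction, CountingPass, RenamingPass, ToyPassManager and runBothPasses are hypothetical names invented for this illustration. The point is only that the two pass types share no base class, yet the manager can store and run both because each provides a matching run() member.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an IR unit. This is NOT llvm::Function.
struct ToyFunction {
  std::string Name;
  int NumInstructions = 0;
};

// Two unrelated pass types: no common base class, only a matching run().
struct CountingPass {
  std::string run(ToyFunction &F) {
    return F.Name + ": " + std::to_string(F.NumInstructions) + " instructions";
  }
};

struct RenamingPass {
  std::string run(ToyFunction &F) {
    F.Name += ".renamed";
    return "renamed to " + F.Name;
  }
};

// Hypothetical manager that accepts any type providing run(ToyFunction &).
// Inheritance is used only internally to erase the concrete pass type.
class ToyPassManager {
  struct PassConcept {
    virtual ~PassConcept() = default;
    virtual std::string run(ToyFunction &F) = 0;
  };

  template <typename PassT> struct PassModel final : PassConcept {
    explicit PassModel(PassT P) : Pass(std::move(P)) {}
    std::string run(ToyFunction &F) override { return Pass.run(F); }
    PassT Pass;
  };

  std::vector<std::unique_ptr<PassConcept>> Passes;

public:
  template <typename PassT> void addPass(PassT P) {
    Passes.push_back(std::make_unique<PassModel<PassT>>(std::move(P)));
  }

  // Runs all passes in registration order and collects their messages.
  std::vector<std::string> run(ToyFunction &F) {
    std::vector<std::string> Log;
    for (auto &P : Passes)
      Log.push_back(P->run(F));
    return Log;
  }
};

// Usage: register both passes and run them on one toy function.
inline std::vector<std::string> runBothPasses(ToyFunction &F) {
  ToyPassManager PM;
  PM.addPass(CountingPass{});
  PM.addPass(RenamingPass{});
  std::vector<std::string> Log = PM.run(F);
  assert(Log.size() == 2 && "one log entry per registered pass");
  return Log;
}
```

A pass author in this sketch never names an interface. Adding a third pass means writing another plain struct with a run() member and handing it to addPass().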
The LLVM compiler infrastructure (https://llvm.org/) provides everything needed to write custom compiler passes. LLVM distinguishes between analysis and transformation passes. Both kinds of passes run on some given target code in the form of LLVM intermediate representation (LLVM IR for short), which is commonly stored as an LLVM bitcode file. Analysis passes only inspect the target code without modifying it. They generally compute static analysis information about the target code or try to prove some interesting property. Transformation passes, on the other hand, modify the target code. They usually rely on the information provided by one or more analysis passes to decide what to change.
To show how custom LLVM passes can be used in practice, we present a simple transformation pass built on top of an analysis pass. The transformation pass replaces calls to an "unsafe" and "dangerous" function, foo(). The corresponding code can be found at https://github.com/GaZAR-UG/llvm-opt-pass. For this presentation, we assume that foo() is unsafe, so we would like to build a transformation pass that replaces every direct call to foo() with the safer alternative bar(int). Indirect calls are left out because handling them would require additional points-to information, which would unnecessarily complicate the example. We further assume the following target code:
#include <cstdio>

void foo() {
  printf("This function is dangerous and should be replaced!\n");
}

void bar(int i) {
  printf("This function is safe! We found call site: '%d'.\n", i);
}

void baz() { foo(); }

int main(int argc, char **argv) {
  foo();
  for (int i = 0; i < argc; ++i) {
    printf("%s\n", argv[i]);
  }
  baz();
  return 0;
}
To solve this problem, we first implement an analysis pass that determines all direct call sites at which the “dangerous” foo() function is called. The result of the analysis pass is the set of call sites that we wish to have fixed. We then automate the fix by implementing a transformation pass that takes this set of call sites as input and rewrites each of them so that bar(int) becomes the callee. Finally, we set up the required pass infrastructure to run our passes on a target bitcode file. Our custom passes can, of course, also be integrated into LLVM’s existing infrastructure so that they can be run with LLVM’s opt tool.