Optimizing brainfuck compiler using LLVM
% Remco Bloemen
% 2010-02-27, last updated 2014-03-04
LLVM is currently the hype in the world of
compilers. This C++-based compiler infrastructure is well on its way to
replacing the GNU Compiler Collection. In contrast with GCC, LLVM has a
nice C++ interface which allows you to easily implement your own
language, or so I thought.
It turns out that LLVM is still somewhat of a moving target and examples
from previous versions no longer compile. It was quite a struggle to get
the simplest example to work, since the API is very unintuitive and
complex at first.
In the end I managed to get it working.
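For reference, the smallest end-to-end example I could get running looks roughly like this. It uses the same (LLVM 2.x era) API as the full source at the bottom of this post: it JIT-compiles a function that does nothing and then calls it.

#include <llvm/Module.h>
#include <llvm/Function.h>
#include <llvm/Support/IRBuilder.h>
#include <llvm/ExecutionEngine/ExecutionEngine.h>
#include <llvm/ExecutionEngine/JIT.h>
#include <llvm/Target/TargetSelect.h>
using namespace llvm;

int main()
{
    InitializeNativeTarget();
    Module* module = new Module("minimal", getGlobalContext());

    // void nop(): a single basic block that immediately returns
    Function* nop = cast<Function>(module->getOrInsertFunction("nop",
        Type::getVoidTy(getGlobalContext()), NULL));
    IRBuilder<> builder(BasicBlock::Create(getGlobalContext(), "entry", nop));
    builder.CreateRetVoid();

    // JIT-compile the function and call it through a plain function pointer
    ExecutionEngine* engine = EngineBuilder(module).create();
    void (*f)() = (void (*)())engine->getPointerToFunction(nop);
    f();
    return 0;
}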
Then I played with it some more, trying out other ways to translate
brainfuck to LLVM IR.
Eventually I found that I could get remarkable optimization results. The
key was to:
- enable some complex loop optimization passes
- allocate the brainfuck memory on the stack
- not create a loop to initialize the memory, but explicitly set every
  cell (see the sketch right after this list)
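The last point matters, I assume, because it leaves the initial value of every cell in plain sight of the constant propagation passes. In the code generator it is nothing more than a stack allocation followed by an unrolled sequence of stores; this is condensed from the full source below, so the names refer to that code:

// Allocate the tape on the stack and zero it with one explicit store per
// cell, instead of emitting a runtime loop that the optimizer would first
// have to reason about.
Value* head = codeIR.CreateAlloca(cellType, ConstantInt::get(indexType, tapeSize));
Value* it = head;
for(int i = 0; i < tapeSize; i++)
{
    codeIR.CreateStore(zero, it);
    it = codeIR.CreateGEP(it, one);
}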
If you have done all that, the compiler is capable of compiling the
following brainfuck "Hello World!" program:
>+++++++++[<++++++++>-]<.>+++++++[<++++>-]<+.+++++++..+++.>>>
++++++++[<++++>-]<.>>>++++++++++[<+++++++++>-]<---.<<<<.+++
.------.--------.>>+.
into
define void @main() {
code:
call void @putchar(i8 72)
call void @putchar(i8 101)
call void @putchar(i8 108)
call void @putchar(i8 108)
call void @putchar(i8 111)
call void @putchar(i8 32)
call void @putchar(i8 87)
call void @putchar(i8 111)
call void @putchar(i8 114)
call void @putchar(i8 108)
call void @putchar(i8 100)
call void @putchar(i8 33)
ret void
}
which is optimal! All the loops and operations are eliminated! Compare
that with the esotope compiler!
Benchmark!
Okay, now that I have my super-optimizing LLVM-based brainfuck compiler,
it is time to benchmark it against the direct-threaded virtual machine I
developed earlier.
The direct-threaded VM has two modes: non-optimizing, where it executes
brainfuck instructions as-is, and optimizing, where it replaces certain
sequences of instructions with more efficient ones.
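A classic example of such a substitution, though not necessarily the exact set my VM uses, is collapsing the clear-loop idiom [-] into a single "set cell to zero" superinstruction before execution:

#include <string>

// Hypothetical peephole over the brainfuck source: rewrite every "[-]"
// into a single CLEAR pseudo-instruction (spelled '0' here) so the VM can
// zero the cell in one step instead of looping.
std::string peephole(const std::string& code)
{
    std::string out;
    for(std::string::size_type i = 0; i < code.size(); )
    {
        if(code.compare(i, 3, "[-]") == 0) { out += '0'; i += 3; }
        else out += code[i++];
    }
    return out;
}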
The LLVM JIT compiler has a broad selection of optimization passes that
can be chained in countless ways. After a lot of trial and error I
found the chain presented in the source below. In the benchmark I also
included no optimization passes at all, only an instruction combining pass,
the default selection for amd64, the aggressive default selection for amd64
and my own extensive chain.
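As an illustration of what such a configuration looks like, the "combining pass" row is essentially the following (same API and variable names as in the full source at the end of this post):

FunctionPassManager pm(new ExistingModuleProvider(module));
pm.add(new TargetData(*(engine->getTargetData())));
pm.add(createVerifierPass());
pm.add(createInstructionCombiningPass()); // the only optimization pass in this run
pm.run(*func);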
The first benchmark program, Bottles, prints the lyrics of "99 bottles
of beer". Developed by Andrew Paczkowski (origin).
The second, Hanoi, solves the three-peg, nine-disk Towers of Hanoi
problem and animates the solution with ASCII art. Developed by Clifford
Wolf (origin).
The last, Mandelbrot, calculates a 512×192 Mandelbrot fractal and displays
it with ASCII characters. Developed by Erik Bosman (origin).
| Virtual machine | Optimization   | Bottles | Hanoi  | Mandelbrot |
|-----------------|----------------|--------:|-------:|-----------:|
| direct threaded | none           |   0.008 | 29.647 |   1978.041 |
| direct threaded | simple         |   0.017 |  1.091 |    273.033 |
| llvm jit        | none           |   0.129 | 11.731 |    111.611 |
| llvm jit        | combining pass |   0.121 |  7.245 |     52.894 |
| llvm jit        | default        |   0.157 |  9.491 |     42.440 |
| llvm jit        | aggressive     |   0.174 |  9.618 |     42.048 |
| llvm jit        | extensive      |   0.375 | 50.141 |     37.745 |
The results are in seconds and include everything from starting the
compiler until the exit of the brainfuck program.
One result stands out: the extensively optimized Hanoi takes a lot of
time! Well, actually, the optimization takes a lot of time; the
resulting program executes in less than a second. Further analysis
(using callgrind, call graph shown below) showed that all the time was
spent in the "Loop Invariant Code Motion" pass. This pass is largely
responsible for the incredible optimization result shown earlier. I
suspect that LLVM was also able to reduce Hanoi to a near-minimal set of
output instructions, although the resulting code is too messy to confirm
this.
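If you want a feel for the cost of that one pass, you can build a pass manager that schedules nothing but LICM (same API and variable names as the full source below). This will not reproduce the exact cost measured inside the full chain, since there LICM runs on already partly optimized IR, but it isolates the pass:

// Run only the Loop Invariant Code Motion pass; its required analyses
// (dominators, loop info) are scheduled automatically by the pass manager.
FunctionPassManager licmOnly(new ExistingModuleProvider(module));
licmOnly.add(new TargetData(*(engine->getTargetData())));
licmOnly.add(createLICMPass());
licmOnly.run(*func);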

Source!
As always, here is the source:
#include <iostream>
#include <string>
#include <stack>
#include <fstream>
#include <llvm/Module.h>
#include <llvm/Function.h>
#include <llvm/PassManager.h>
#include <llvm/CallingConv.h>
#include <llvm/Analysis/Verifier.h>
#include <llvm/Support/IRBuilder.h>
#include <llvm/ExecutionEngine/ExecutionEngine.h>
#include <llvm/Target/TargetData.h>
#include <llvm/ModuleProvider.h>
#include <llvm/LinkAllPasses.h>
#include <llvm/ExecutionEngine/JIT.h>
#include <llvm/Target/TargetSelect.h>
using namespace std;
using namespace llvm;
// Bookkeeping for one open brainfuck loop: the tape pointer before, at the
// start of, and at the end of the loop body, plus the surrounding blocks.
struct MyLoopInfo
{
    Value* beforeValue;
    PHINode* startValue;
    Value* endValue;
    Value* afterValue;
    BasicBlock* beforeBlock;
    BasicBlock* startBlock;
    BasicBlock* endBlock;
    BasicBlock* afterBlock;
};
Function* makeFunc(Module* module, const char* source, int tapeSize = 300)
{
    // Types for the cells, the cell index and the tape pointer
    const Type* voidType = Type::getVoidTy(getGlobalContext());
    const IntegerType* cellType = IntegerType::get(getGlobalContext(), 8);
    const IntegerType* indexType = IntegerType::get(getGlobalContext(), 32);
    const PointerType* tapeType = PointerType::get(cellType, 0);
    Value* zero = ConstantInt::get(cellType, 0);
    Value* one = ConstantInt::get(cellType, 1);
    Value* minOne = ConstantInt::get(cellType, -1);

    // Declare getchar and putchar (using i8 cells for convenience) and the
    // brainfuck main function
    Function* getchar = cast<Function>(module->getOrInsertFunction("getchar", cellType, NULL));
    getchar->setCallingConv(CallingConv::C);
    Function* putchar = cast<Function>(module->getOrInsertFunction("putchar", voidType, cellType, NULL));
    putchar->setCallingConv(CallingConv::C);
    Function* main = cast<Function>(module->getOrInsertFunction("main", voidType, NULL));
    main->setCallingConv(CallingConv::C);

    BasicBlock* block = BasicBlock::Create(getGlobalContext(), "code", main);
    stack<MyLoopInfo> loops;

    // Allocate the tape on the stack and zero every cell explicitly
    IRBuilder<> codeIR(block);
    Value* head = codeIR.CreateAlloca(cellType, ConstantInt::get(indexType, tapeSize));
    Value* it = head;
    for(int i = 0; i < tapeSize; i++)
    {
        codeIR.CreateStore(zero, it);
        it = codeIR.CreateGEP(it, one);
    }
    // Translate the brainfuck program one instruction at a time
    while(*source)
    {
        IRBuilder<> builder(block);
        switch(*source++)
        {
        case '>': head = builder.CreateGEP(head, one); break;
        case '<': head = builder.CreateGEP(head, minOne); break;
        case '+':
        {
            Value* headValue = builder.CreateLoad(head);
            Value* result = builder.CreateAdd(headValue, one);
            builder.CreateStore(result, head);
            break;
        }
        case '-':
        {
            Value* headValue = builder.CreateLoad(head);
            Value* result = builder.CreateSub(headValue, one);
            builder.CreateStore(result, head);
            break;
        }
        case '.':
        {
            Value* output = builder.CreateLoad(head);
            builder.CreateCall(putchar, output);
            break;
        }
        case ',':
        {
            Value* input = builder.CreateCall(getchar);
            builder.CreateStore(input, head);
            break;
        }
        case '[':
        {
            // Open a loop: test the current cell and either enter the loop
            // body or skip to the block after the loop. The tape pointer is
            // threaded through a PHI node so it stays in SSA form.
            MyLoopInfo loop;
            loop.beforeBlock = block;
            loop.startBlock = BasicBlock::Create(getGlobalContext(), "", main);
            loop.afterBlock = BasicBlock::Create(getGlobalContext(), "", main);
            loop.beforeValue = head;
            Value* headValue = builder.CreateLoad(head);
            Value* condition = builder.CreateIsNotNull(headValue);
            builder.CreateCondBr(condition, loop.startBlock, loop.afterBlock);
            IRBuilder<> sbuilder(loop.startBlock);
            loop.startValue = sbuilder.CreatePHI(tapeType);
            loop.startValue->addIncoming(loop.beforeValue, loop.beforeBlock);
            loops.push(loop);
            block = loop.startBlock;
            head = loop.startValue;
            break;
        }
        case ']':
        {
            // Close the innermost loop: branch back to the loop header while
            // the current cell is non-zero, and merge the two possible tape
            // pointers in the block after the loop.
            MyLoopInfo loop = loops.top(); loops.pop();
            loop.endValue = head;
            loop.endBlock = block;
            Value* headValue = builder.CreateLoad(head);
            Value* condition = builder.CreateIsNotNull(headValue);
            builder.CreateCondBr(condition, loop.startBlock, loop.afterBlock);
            loop.startValue->addIncoming(loop.endValue, loop.endBlock);
            block = loop.afterBlock;
            IRBuilder<> abuilder(block);
            PHINode* headPhi = abuilder.CreatePHI(tapeType);
            headPhi->addIncoming(loop.beforeValue, loop.beforeBlock);
            headPhi->addIncoming(loop.endValue, loop.endBlock);
            head = headPhi;
            break;
        }
        default:
            // Everything else is a comment in brainfuck
            break;
        }
    }
    {
        IRBuilder<> builder(block);
        builder.CreateRetVoid();
    }
    return main;
}
int main(int argc, char* argv[])
{
    if(argc < 2)
    {
        cerr << "Usage: " << argv[0] << " bf_file" << endl;
        return -1;
    }

    // Read the brainfuck source
    ifstream sourceFile(argv[1]);
    string line, source;
    while(getline(sourceFile, line)) source += line;

    // Create the JIT execution engine
    std::string error;
    InitializeNativeTarget();
    Module* module = new Module("bfcode", getGlobalContext());
    ExecutionEngine* engine = EngineBuilder(module)
        .setErrorStr(&error)
        .setOptLevel(CodeGenOpt::Aggressive)
        .create();
    if(!engine)
    {
        cout << "No engine created: " << error << endl;
        return -1;
    }

    cout << "Parsing..." << flush;
    Function* func = makeFunc(module, source.c_str());
    cout << " done" << endl;

    // The extensive optimization chain used in the benchmark
    cout << "Optimizing..." << flush;
    FunctionPassManager pm(new ExistingModuleProvider(module));
    pm.add(new TargetData(*(engine->getTargetData())));
    pm.add(createVerifierPass());
    pm.add(createInstructionCombiningPass());
    pm.add(createLICMPass());
    pm.add(createIndVarSimplifyPass());
    pm.add(createLoopDeletionPass());
    for(int repeat = 0; repeat < 3; repeat++)
    {
        pm.add(createGVNPass());
        pm.add(createSCCPPass());
        pm.add(createCFGSimplificationPass());
        pm.add(createInstructionCombiningPass());
        pm.add(createCondPropagationPass());
        pm.add(createAggressiveDCEPass());
        pm.add(createCFGSimplificationPass());
        pm.add(createDeadStoreEliminationPass());
    }
    pm.run(*func);
    cout << "done" << endl;
    cout << *module << endl;

    // JIT-compile and run the brainfuck program
    cout << "Compiling..." << flush;
    void (*bf)() = (void (*)())engine->getPointerToFunction(func);
    cout << " done" << endl;
    bf();
    return 0;
}
Compile with:
g++ ./main.cpp `llvm-config --cppflags --ldflags --libs core jit native all` -o ./test
Note: The order of the arguments is important! The source file must
come before the llvm-config part or the linking will fail, because the
linker only pulls symbols out of the static LLVM libraries if something
listed before them needs those symbols. I lost about an hour on this
non-intuitive behaviour.
Awesome! This is exactly what I’ve been looking for.
— B L 2011-03-03