Brainfuck compiler / optimizer / VM

Remco Bloemen

2010-02-09, last updated 2014-03-04

Last night I read this paper by M. Ertl and David Gregg about the efficient design of virtual machines. The majority of the paper is about the comparison between two techniques to run trough the bytecode.

The first method is to run through the bytecode by making a big switch statement. This method is called “instruction dispatch using switch”:

enum Instructions {
        add,
        // Other instructions...
        end
};

void engine()
{
        static Instruction program[] = {
                add,
                // ...
                end
        };

        Instruction* instructionPointer = program;
        for(;;) switch(*instructionPointer++) {
                case add:
                        // Add some things...
                        break;
                // Other instructions...
                case end:
                        return;
        }
}

The second method uses a GCC extension to the language called “labels as values”. This allows one to take the memory addresses of goto labels. A virtual machine could be implemented as:

void engine()
{
        static Instruction program[] = {
                &&add,
                // ...
                &&end
        };

        Instruction* instructionPointer = program;
        goto **instructionPointer;

        add:
                // Add some things...
                goto **++instructionPointer;
        // Other instructions...
        end:
                return;
}

Shorter and faster, exactly the way I like my optimizations! If you are interested in why this implementation is faster I encourage you to read the paper. It basically boils down to three things: no bounds checking, no jumptable lookups and better branch prediction.

To make the virtual machine even faster the paper suggests to inline certain combinations of instructions into superinstructions. This is commonly known as “selective inlining”.

Okay, now that I know all that, I need a simple language to try all this out. Brainfuck will do nicely with only eight language commands and a simple program state.

The first problem was how to get the labels addresses outside of the engine functions. My solution was to create a special condition when an empty program is entered to store the addresses in an instruction table:

void engine(const Program& program)
{
        if(program.size() == 0) {
                instructionTable[L">"] = &&right;
                instructionTable[L"<"] = &&left;
                instructionTable[L"]"] = &&exitLoop;
                // ...
                instructionTable[L"end"] = &&end;
                return;
        }

        // ...
}

With the label addresses stored I can now implement the compiler:

Program compile(const wstring& source)
{
        InstructionTable::iterator insItt;
        Program program;
        wstring token(L" ");
        for(int i=0; i < source.size(); i++) {
                token[0] = source[i];
                insItt = instructionTable.find(token);
                if(insItt != instructionTable.end())
                        program.push_back((*insItt).second);
        }
        program.push_back(instructionTable[L"end"]);
        return program;
}

To the eight brainfuck instructions I added seven superinstructions. The first four represent repeated occurrences of [, <, + or -, which happens often in brainfuck. They take an argument containing the number of repetitions. The next superinstruction implements [-]. The last two super instructions are conditional jumps forward and backward. The brainfuck looping instructions are transformed into pre-calculated conditional jumps.

I then created an optimization pass that takes a program and transforms it by applying the superinstruction replacement rules described above.

Benchmark

One of the more interesting brainfuck programs I found is mandelbrot by Erik Bosman. I used a version that computes a 256×96 mandelbrot set. The virtual machine without optimization pass takes 7:24 minutes to complete. With optimization pass it takes exactly one minute.

A 128×48 mandelbrot
A 128×48 mandelbrot

Conclusion

Virtual machines are easy and fast.

Oh yes, and the whole thing properly sets up an unicode compliant input/output and treats cell values as utf-32. So you can output all the unicode characters from your brainfuck programs.

Source code

main.cpp: The compiler. optimizer and virtual machine. In a little under three hundred lines :)

Makefile: A simple Makefile.

fixups.h and fixups.cpp: Some C++ fixups to make the language less unbearable. All explained in this post.