What is this?
This is a compiler. When you run it, it compiles ‘input.program’ and produces a binary a.out.
The frontend is written in clojure, and the back-end is LLVM.
How to Run
- Install LLVM library.
- lein deps
- lein run
NOTE: The code assumes LLVM is at /opt/local/lib/libLLVM-2.9.dylib For different paths, update jna.library.path java property in native.clj.
Input Program
Here our toy language is used as input into compiler frontend.
func cos(Double) Double;
func printf(...) Integer;
func bar() Integer {
printf("foo");
return 3;
}
func main() Integer {
printf("hello %d", 3);
printf("bar ");
bar();
return 0;
}
Program Output
- The intermediate representation is printed.
- The object file and linked binary is created.
- Binary is executed, output is shown at bottom.
How does it work?
- parser.clj is for transforming Input=>AST (using clarsec, a combinator parser)
- ast.clj is for transforming AST=>LLVM IR
- main.clj is for LLVM IR=>object file=>linking.
- llvm.clj contains needed llvm bindings for clojure.
- native.clj is a thin JNA wrapper. It’s like clj-native, but less efficient. (I couldn’t get arrays of pointers working in clj-native.)
If input program is this:
func main() Integer { printf("foo"); printf("hello"); return 1; }
AST looks like this:
((:func main [] Integer (:call printf "foo") (:call printf "hello") (:return 1)))
LLVM IR looks like this:
@0 = internal constant [4 x i8] c"foo\00" @1 = internal constant [9 x i8] c"hello %d\00" define i32 @main() { %1 = call i32 (...)* @printf([9 x i8]* @1, i32 3) %2 = call i32 (...)* @printf([5 x i8]* @2) ret i32 1 }
x86 Disassembly looks like this:
0000000100000ef0 <_main>: 100000ef0: 50 push %rax 100000ef1: 48 8d 3d 3c 00 00 00 lea 0x3c(%rip),%rdi 100000ef8: be 03 00 00 00 mov $0x3,%esi 100000efd: 30 c0 xor %al,%al 100000eff: e8 26 00 00 00 callq 100000f2a <_printf$stub> 100000f04: 48 8d 3d 32 00 00 00 lea 0x32(%rip),%rdi 100000f0b: 30 c0 xor %al,%al 100000f0d: e8 18 00 00 00 callq 100000f2a <_printf$stub> 100000f12: 48 8d 3d 17 00 00 00 lea 0x17(%rip),%rdi 100000f19: 30 c0 xor %al,%al 100000f1b: e8 0a 00 00 00 callq 100000f2a <_printf$stub> 100000f20: 31 c0 xor %eax,%eax 100000f22: 5a pop %rdx 100000f23: c3 retq