LLVM-based JIT Compiler for Ruby

LLRB

LLRB is an LLVM-based JIT compiler for Ruby.

Project status

I'm currently working on another JIT approach: YARV-MJIT.

What's LLRB?

This is an experimental project to implement an idea presented by @evanphx in his RubyKaigi 2015 keynote:
a method JIT compiler that inlines CRuby core functions using LLVM.

How does it work?

At build time, some CRuby core functions are compiled to LLVM IR and then to LLVM bitcode files (the binary form of LLVM IR).

 ________     _________     ______________
|        |   |         |   |              |
| CRuby  |   | CRuby   |   | CRuby        |
| C code |-->| LLVM IR |-->| LLVM bitcode |
|________|   |_________|   |______________|

The bitcode is split into one file per function so that only the necessary functions are loaded. You can see how they are split in the ext directory.

Once LLRB's profiler is started, LLRB compiles the YARV instruction sequences (ISeqs) of Ruby methods to native machine code while Ruby is running.

 ______     ______________      __________      ___________      _________
|      |   |              |    |          |    |           |    |         |
| YARV |   | ISeq Control |    | LLVM IR  |    | Optimized |    | Machine |
| ISeq |-->| Flow Graph   |-*->| for ISeq |-*->| LLVM IR   |-*->| code    |
|______|   |______________| |  |__________| |  |___________| |  |_________|
                            |               |                |
                            | Link          | Optimize       | JIT compile
                      ______|_______     ___|____          __|____
                     |              |   |        |        |       |
                     | CRuby        |   | LLVM   |        | LLVM  |
                     | LLVM Bitcode |   | Passes |        | MCJIT |
                     |______________|   |________|        |_______|

Does it improve performance?

Basic instruction inlining is now implemented. Let's see its effect.

Consider the following Ruby method, which is the same loop as ruby/benchmark/bm_loop_whileloop.rb.

def while_loop
  i = 0
  while i<30_000_000
    i += 1
  end
end

Its YARV ISeq, which is what LLRB compiles, looks like this:

> puts RubyVM::InstructionSequence.of(method(:while_loop)).disasm
== disasm: #<ISeq:while_loop@(pry)>=====================================
== catch table
| catch type: break  st: 0015 ed: 0035 sp: 0000 cont: 0035
| catch type: next   st: 0015 ed: 0035 sp: 0000 cont: 0012
| catch type: redo   st: 0015 ed: 0035 sp: 0000 cont: 0015
|------------------------------------------------------------------------
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] i
0000 trace            8                                               (   1)
0002 trace            1                                               (   2)
0004 putobject_OP_INT2FIX_O_0_C_
0005 setlocal_OP__WC__0 3
0007 trace            1                                               (   3)
0009 jump             25
0011 putnil
0012 pop
0013 jump             25
0015 trace            1                                               (   4)
0017 getlocal_OP__WC__0 3
0019 putobject_OP_INT2FIX_O_1_C_
0020 opt_plus         <callinfo!mid:+, argc:1, ARGS_SIMPLE>, <callcache>
0023 setlocal_OP__WC__0 3
0025 getlocal_OP__WC__0 3                                             (   3)
0027 putobject        30000000
0029 opt_lt           <callinfo!mid:<, argc:1, ARGS_SIMPLE>, <callcache>
0032 branchif         15
0034 putnil
0035 trace            16                                              (   6)
0037 leave                                                            (   4)
=> nil

The LLRB compiler compiles this YARV ISeq to LLVM IR like the following:

define i64 @llrb_exec(i64, i64) {
label_0:
  call void @llrb_insn_trace(i64 %0, i64 %1, i32 8, i64 52)
  call void @llrb_insn_trace(i64 %0, i64 %1, i32 1, i64 52)
  call void @llrb_insn_setlocal_level0(i64 %1, i64 3, i64 1)
  call void @llrb_insn_trace(i64 %0, i64 %1, i32 1, i64 52)
  br label %label_25

label_15:                                         ; preds = %label_25
  call void @llrb_insn_trace(i64 %0, i64 %1, i32 1, i64 52)
  %2 = call i64 @llrb_insn_getlocal_level0(i64 %1, i64 3)
  call void @llrb_set_pc(i64 %1, i64 94225474387824)
  %opt_plus = call i64 @llrb_insn_opt_plus(i64 %2, i64 3)
  call void @llrb_insn_setlocal_level0(i64 %1, i64 3, i64 %opt_plus)
  br label %label_25

label_25:                                         ; preds = %label_15, %label_0
  %3 = call i64 @llrb_insn_getlocal_level0(i64 %1, i64 3)
  call void @llrb_set_pc(i64 %1, i64 94225474387896)
  %opt_lt = call i64 @llrb_insn_opt_lt(i64 %3, i64 60000001)
  %RTEST_mask = and i64 %opt_lt, -9
  %RTEST = icmp ne i64 %RTEST_mask, 0
  br i1 %RTEST, label %label_15, label %label_34

label_34:                                         ; preds = %label_25
  call void @llrb_insn_trace(i64 %0, i64 %1, i32 16, i64 8)
  call void @llrb_set_pc(i64 %1, i64 94225474387960)
  call void @llrb_push_result(i64 %1, i64 8)
  ret i64 %1
}
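
The block labels in this IR (label_0, label_15, label_25, label_34) line up with the basic-block boundaries of the ISeq shown earlier. As a rough illustration of how a control flow graph can be derived from an ISeq, here is a minimal sketch of the classic "find block leaders" walk in plain Ruby. The instruction table is transcribed from the disasm above; the `block_leaders` helper is hypothetical and is not LLRB's actual implementation, which is written in C.

```ruby
require 'set'

# [position, instruction, jump target or nil], transcribed from the disasm above.
INSNS = [
  [0,  :trace, nil],    [2,  :trace, nil],     [4,  :putobject, nil],
  [5,  :setlocal, nil], [7,  :trace, nil],     [9,  :jump, 25],
  [11, :putnil, nil],   [12, :pop, nil],       [13, :jump, 25],
  [15, :trace, nil],    [17, :getlocal, nil],  [19, :putobject, nil],
  [20, :opt_plus, nil], [23, :setlocal, nil],  [25, :getlocal, nil],
  [27, :putobject, nil], [29, :opt_lt, nil],   [32, :branchif, 15],
  [34, :putnil, nil],   [35, :trace, nil],     [37, :leave, nil]
].freeze

# Hypothetical helper: returns the start positions of all basic blocks that
# are reachable from the first instruction.
def block_leaders(insns)
  index_of = insns.each_with_index.to_h { |(pos, _name, _target), i| [pos, i] }
  entry = insns.first[0]
  leaders = Set[entry]
  visited = Set.new
  worklist = [entry]

  until worklist.empty?
    i = index_of.fetch(worklist.shift)
    while i < insns.size
      pos, name, target = insns[i]
      break unless visited.add?(pos) # already walked from here

      case name
      when :jump
        # Unconditional jump: only the target starts a new block.
        leaders << target
        worklist << target
        break
      when :branchif
        # Conditional branch: both the target and the fallthrough start blocks.
        fallthrough = insns[i + 1][0]
        [target, fallthrough].each do |t|
          leaders << t
          worklist << t
        end
        break
      when :leave
        break
      end
      i += 1
    end
  end

  leaders.sort
end

p block_leaders(INSNS) # => [0, 15, 25, 34]
```

Because the walk only follows reachable code, the instructions at 0011 to 0013 never become a block, which is consistent with the IR above having no label for them.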

Because the LLRB compiler links the precompiled LLVM bitcode of CRuby functions, LLVM's FunctionInliningPass can inline some of those C functions (a "Pass" is a unit of LLVM's optimizer), and the inlined code is then further optimized by passes such as InstCombinePass.

define i64 @llrb_exec(i64, i64) #0 {
  ...

land.lhs.true.i:                                  ; preds = %label_25
  %49 = load %struct.rb_vm_struct*, %struct.rb_vm_struct** @ruby_current_vm, align 8, !dbg !3471, !tbaa !3472
  %arrayidx.i = getelementptr inbounds %struct.rb_vm_struct, %struct.rb_vm_struct* %49, i64 0, i32 39, i64 7, !dbg !3471
  %50 = load i16, i16* %arrayidx.i, align 2, !dbg !3471, !tbaa !3473
  %and2.i = and i16 %50, 1, !dbg !3471
  %tobool6.i = icmp eq i16 %and2.i, 0, !dbg !3471
  br i1 %tobool6.i, label %if.then.i, label %if.else11.i, !dbg !3475, !prof !3380

if.then.i:                                        ; preds = %land.lhs.true.i
  call void @llvm.dbg.value(metadata i64 %48, i64 0, metadata !2680, metadata !3361) #7, !dbg !3477
  call void @llvm.dbg.value(metadata i64 60000001, i64 0, metadata !2683, metadata !3361) #7, !dbg !3478
  %cmp7.i = icmp slt i64 %48, 60000001, !dbg !3479
  %..i = select i1 %cmp7.i, i64 20, i64 0, !dbg !3481
  br label %llrb_insn_opt_lt.exit

if.else11.i:                                      ; preds = %land.lhs.true.i, %label_25
  %call35.i = call i64 (i64, i64, i32, ...) @rb_funcall(i64 %48, i64 60, i32 1, i64 60000001) #7, !dbg !3483
  br label %llrb_insn_opt_lt.exit, !dbg !3486

llrb_insn_opt_lt.exit:                            ; preds = %if.then.i, %if.else11.i
  %retval.1.i = phi i64 [ %..i, %if.then.i ], [ %call35.i, %if.else11.i ]
  %RTEST_mask = and i64 %retval.1.i, -9
  %RTEST = icmp eq i64 %RTEST_mask, 0

  ...
}

In this example, you can see that a lot has been inlined. The compiled code fetches the VM state and checks whether the < method has been redefined. If it has not, the if.then.i block is taken, which uses icmp slt instead of calling the Ruby method #<. Note that this comes from simply inlining YARV's opt_lt instruction, so it is not hard to implement.
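
The integer constants in the IR above can be read as CRuby's tagged VALUE encoding on a typical 64-bit build: 60000001 is the Fixnum 30_000_000 (encoded as 2n + 1), 20 and 0 are the true and false values chosen by the select, and `and ..., -9` clears the nil bit (8) before the truthiness test. The following plain-Ruby sketch models that arithmetic. The constant and method names are made up for illustration, and it does not model the redefinition check or the rb_funcall fallback.

```ruby
QFALSE = 0   # the `i64 0` arm of the select
QNIL   = 8   # the bit cleared by `and ..., -9`
QTRUE  = 20  # the `i64 20` arm of the select

# Encode an Integer as a tagged Fixnum VALUE.
def int2fix(n)
  (n << 1) | 1
end

# Fast path of opt_lt: tagging preserves order, so the tagged values
# can be compared directly, as `icmp slt` does in the IR.
def fixnum_lt(a, b)
  a < b ? QTRUE : QFALSE
end

# Truthiness test: everything except false (0) and nil (8) is truthy.
def rtest(value)
  (value & ~QNIL) != 0
end

limit = int2fix(30_000_000)
puts limit                                # 60000001
puts rtest(fixnum_lt(int2fix(5), limit))  # true
puts rtest(fixnum_lt(limit, limit))       # false
```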

As a result, the following benchmark shows improved performance.

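require 'benchmark/ips'
# LLRB::JIT must already be loaded (from the llrb gem) before this script runs.

ruby = Class.new
def ruby.script
  i = 0
  while i < 30_000_000
    i += 1
  end
end

llrb = Class.new
def llrb.script
  i = 0
  while i < 30_000_000
    i += 1
  end
end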

LLRB::JIT.compile(llrb, :script)

Benchmark.ips do |x|
  x.report('Ruby') { ruby.script }
  x.report('LLRB') { llrb.script }
  x.compare!
end

Results on Wercker:

Calculating -------------------------------------
                Ruby      7.449  (± 0.0%) i/s -     38.000  in   5.101634s
                LLRB     36.974  (± 0.0%) i/s -    186.000  in   5.030540s

Comparison:
                LLRB:       37.0 i/s
                Ruby:        7.4 i/s - 4.96x  slower
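
As a quick check of the comparison line, the ratio can be recomputed from the two iterations-per-second figures reported above. The variable names here are only for illustration.

```ruby
ruby_ips = 7.449   # i/s reported for Ruby
llrb_ips = 36.974  # i/s reported for LLRB

speedup = llrb_ips / ruby_ips
puts format('%.2fx', speedup) # 4.96x

# The same numbers expressed as time per call of the 30M-iteration loop.
puts format('Ruby: %.1f ms, LLRB: %.1f ms', 1000 / ruby_ips, 1000 / llrb_ips)
```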

How is it designed?

Built as C extension

LLRB is currently experimental. To develop quickly and to keep up with changes in CRuby core, LLRB is built as a C extension. That lets us use bundler, benchmark-ips, pry, and everything else available in a normal C extension repository, which would be hard in a plain CRuby fork.

Unfortunately, the optimizations need some symbols that CRuby does not export, so you need to build the llrb branch of k0kubun/ruby and install the llrb gem with that Ruby.

However, I kept a rule of making no modifications to CRuby other than exporting symbols. LLRB code refers to those exported functions and variables in CRuby core by including CRuby headers wherever possible. I believe this helps keep the LLRB prototype maintainable.

Conservative design for safety

YARV is already proven to be reliable. In LLRB, YARV is not modified at all. Then, how is JIT achieved?

YARV has an opt_call_c_function instruction, which is described as "call native compiled method". Why not use that?

LLRB compiles any ISeq to the following ISeq:

0000 opt_call_c_function
0002 leave

It's that simple. Note that opt_call_c_function can handle YARV's internal exceptions, so everything can be done inside that instruction.

Another conservative choice in LLRB is that it fills the remaining positions with leave instructions. For YARV's catch table to work, the program counter must be updated properly, and there must be an instruction at whatever position the program counter points to.

To exit safely when a catch table is used, every position after the first opt_call_c_function is filled with leave instructions.
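
To make the layout concrete, here is a simplified model in plain Ruby that treats an ISeq body as a flat array of slots. It is only an illustration: the `jit_iseq_body` helper and the `:native_code_ptr` placeholder are hypothetical, and real ISeqs are C structures rather than Ruby arrays. The size of 38 comes from the while_loop disasm earlier, whose last instruction sits at position 0037.

```ruby
# Hypothetical helper: builds the slot layout of a JIT-ed ISeq body.
# opt_call_c_function occupies slots 0 and 1 (opcode plus one operand),
# and every remaining slot of the original body becomes :leave.
def jit_iseq_body(original_size, funcptr)
  head = [:opt_call_c_function, funcptr]
  head + Array.new(original_size - head.size, :leave)
end

body = jit_iseq_body(38, :native_code_ptr)

p body.first(4) # => [:opt_call_c_function, :native_code_ptr, :leave, :leave]

# The catch table in the disasm continues at positions 12, 15 and 35.
# Wherever the program counter lands, it finds a leave instruction.
p [12, 15, 35].map { |pc| body[pc] } # => [:leave, :leave, :leave]
```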

Sampling-based lightweight profiler

A sampling profiler is a promising way to reduce profiling overhead without sacrificing the usefulness of the profile. LLRB's profiler, which schedules ISeqs for JIT compilation, is implemented in almost the same way as stackprof, which is widely used in production and has proven reliable.

The profiler fires at a certain CPU-time interval, and that parameter can be changed if necessary. I measured the profiler overhead with the optcarrot benchmark, and with the current parameter LLRB's profiler did not reduce optcarrot's fps.

It also uses the rb_postponed_job_register_one API, which stackprof uses too, to trigger JIT compilation, so compilation happens at a safe point.
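
The general idea of scheduling compilation from samples can be sketched in a few lines of plain Ruby. This is a toy model under stated assumptions, not LLRB's profiler: the `SamplingScheduler` class, the threshold value, and feeding samples by hand are all hypothetical, whereas the real profiler takes samples from a CPU-time timer in C and defers the compile through rb_postponed_job_register_one.

```ruby
# Hypothetical model: count how often each method is seen on top of the
# stack and hand it to the compiler once it looks hot enough.
class SamplingScheduler
  def initialize(threshold:, &compile)
    @threshold = threshold
    @compile = compile
    @samples = Hash.new(0)
    @compiled = {}
  end

  # Called once per profiler tick with the currently running method.
  def sample(method_name)
    return if @compiled[method_name] # never compile twice

    @samples[method_name] += 1
    return if @samples[method_name] < @threshold

    @compiled[method_name] = true
    @compile.call(method_name)
  end
end

compiled = []
scheduler = SamplingScheduler.new(threshold: 3) { |name| compiled << name }

# Simulated ticks: while_loop dominates the samples, puts shows up once.
%i[while_loop puts while_loop while_loop while_loop].each do |name|
  scheduler.sample(name)
end

p compiled # => [:while_loop]
```

Only methods that keep appearing in samples are compiled, so rarely executed code costs nothing beyond a counter increment.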

Less compilation effort

The CRuby C functions to be inlined are precompiled to LLVM bitcode during the LLRB build. LLRB's compiler builds LLVM IR with LLVM's IRBuilder, so the bitcode files can be linked directly into that IR.

This means LLRB has no runtime overhead from parsing and compiling C source code, so compilation takes less effort.

Unfortunately, the current performance bottleneck is not in the compiler, so for now LLRB does not use a separate thread for JIT compilation.

Build dependency

Usage

Once the build dependencies are met, run gem install llrb and then:

require 'llrb/start'

The llrb/start file calls LLRB::JIT.start. You can also do the same with ruby -rllrb/start -e "...".

If you want to see which methods are compiled, build the gem with #define LLRB_ENABLE_DEBUG 1. Again, LLRB is experimental and currently does not improve performance in real-world applications.

TODOs

  • Improve performance...
  • Implement ISeq method inlining
  • Support all YARV instructions
    • expandarray, reverse, reput, defineclass, once, opt_call_c_function are not supported yet.
  • Guard against objects created during compilation being unexpectedly GCed

License

The same as Ruby.
