• Stars
    star
    194
  • Rank 200,219 (Top 4 %)
  • Language
    Lua
  • License
    MIT License
  • Created almost 9 years ago
  • Updated over 7 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

A self-hosting compiler for the Lua language.

Welcome to Lua, in Lua!

Hey! I'm glad you've found me. I am a somewhat correct compiler for the Lua 5.2 language into the Lua Bytecode format, but more than that, I hope that I can become a useful guide for those who are interested in language implementation.

I grew out of a month-long sprint from an initiative to complete a side-project within the span of the November of 2015, but I wanted to keep tredding on.

Installation

Make sure that you have Lua 5.2 with luarocks installed. Note that it's possible that you may have luarocks configured with luajit. In this case, the LUA_PATH ought to stay the same, so you can still use the luainlua package; you just won't be able to run lua.lua.

git clone https://github.com/leegao/LuaInLua.git
cd LuaInLua
luarocks make

And voila, the package luainlua and the script lua.lua will be installed.

Usage

luainlua comes with a scripted "interpreter" (lua.lua) as well as the entire compilation API.

lua.lua

lua.lua filename.lua will compile and run filename.lua using the luainlua compiler. In addition, you may pass in the -d flag (at the very end) in order to print a human-readable disassembly, like so

leegao@DESKTOP-3RST9I3:/mnt/c/Users/leegao/Documents/IdeaProjects/LuaInLua$ lua.lua testing/hello_world.lua -d
Level 0
Code
1       (line 10)       CLOSURE(A=r(0), Bx=v(0))
2       (line 13)       MOVE(A=r(1), B=r(0))
3       (line 13)       LOADK(A=r(2), Bx=Kst(0))
4       (line 13)       LOADK(A=r(3), Bx=Kst(1))
5       (line 13)       LOADK(A=r(4), Bx=Kst(2))
6       (line 13)       CALL(A=r(1), B=v(4), C=v(1))
7       (line 9)        RETURN(A=r(0), B=v(1))
Constants
1       1
2       2
3       3
Upvalues
0       _ENV    upval   0
   Level 1
   Code
   1    (line 10)       GETTABUP(A=r(0), B=v(0), C=Kst(0):rk(256))
   2    (line 10)       VARARG(A=r(1), B=v(0))
   3    (line 10)       CALL(A=r(0), B=v(0), C=v(1))
   4    (line 10)       RETURN(A=r(0), B=v(1))
   Constants
   1    print
   Upvalues
   0    _ENV    upval   0
--- END OF DUMP ---
1       2       3

Additionally, if you don't specify a file to run, lua.lua will read from stdin.

Compilation API

You can directly invoke the luainlua compiler using the luainlua.luac package. It exposes a single compile method that compiles source code (not file) into a quadruple of a lua function, its raw bytecode (in 5.2 compliant format), its internal representation in luainlua, and a human-readable dumper.

local luac = require 'luainlua.luac'
local func, bytecode, proto, dumper = luac.compile('print("Hello World")')
-- Inspect the disassembly of bytecode
dumper(proto)
-- Run func
func()

There's a wealth of internal APIs revolved around the compilation process (luainlua.lua.*) as well as the bytecode format (luainlua.bytecode.*) that can be useful for anyone who wishes to understand the Lua internals.

Bootstrapping Test

The ultimate test of a compiler written in its own language is the bootstrap test. That is, can we use the compiler to compile itself, and then use the resulting compiler to compile an arbitrary program.

Here is hello world in lua.lua in lua.

-- lua.lua
luac = require 'luainlua.luac'
func, bytecode, prototype, dumper = luac.compile('print("Hello World")')
dumper(prototype) -- Human readable disassembly
func()
-- <ctrl + d>
--[[ Level 0
   Code
   1       (line 1)        GETTABUP(A=r(0), B=v(0), C=Kst(0):rk(256))
   2       (line 1)        LOADK(A=r(1), Bx=Kst(1))
   3       (line 1)        CALL(A=r(0), B=v(2), C=v(1))
   4       (line 1)        RETURN(A=r(0), B=v(1))
   Constants
   1       print
   2       Hello World
   Upvalues
   0       _ENV    upval   0
   Hello World
]]--

Parser

In addition to the vanilla compiler, luainlua also comes with its own recursive-descent powered parser generator and its own implementation of "regular" regular expression. Both of these are within the luainlua.parsing.* and the luainlua.ll1.ll1 packages. In addition, you can augment the vanilla lua grammar (luainlua/lua/grammar.ylua) and recompile it by just requiring luainlua.generate_parser.

A sample grammar for the handmade parser generator is given here:

%file "experimental_parser"
%require "luainlua.parsing.lex"
%require "luainlua.parsing.re"

/*
root   = expr
rexpr  = %eps | $expr | PLUS $expr
expr   = $consts $rexpr | ID $rexpr | FUN ID -> $expr | LPAREN $expr RPAREN $rexpr
consts = NUMBER | STRING | TRUE | FALSE
*/

%code {:
local string_stack = {}
local function id(token) return function(...) return {token, ...} end end
local function ignore(...) return end
local function pop(stack) return table.remove(stack) end
local function push(item, stack) table.insert(stack, item) end
local tokenizer = lex.lex {
  root = {
    {'+', id 'PLUS'},
    {'fun', id 'FUN'},
    {'->', id 'ARROW'},
    {'(', id 'LPAREN'},
    {')', id 'RPAREN'},
    {'true', id 'TRUE'},
    {'false', id 'FALSE'},
    {re '%s+', ignore},
    {re '%d+', id 'NUMBER'},
    {re '%d+%.%d+', id 'NUMBER'},
    {re '(%a|_)(%a|%d|_|\')*', id 'ID'},
    {'"', function(piece, lexer) lexer:go 'string'; push('', string_stack) end},
  },
  string = {
    {'"', function(piece, lexer) 
      lexer:go 'root'
      return {'STRING', pop(string_stack)}
    end},
    {re '.', function(piece, lexer) 
      push(pop(string_stack) .. piece, string_stack)
    end}
  },
}
:}

%default.action {:
  function(item)
    return item
  end
:}

%prologue {:
  function(stream)
    local tokens = {}
    for token in tokenizer(stream) do
      table.insert(tokens, token)
    end
    return tokens
  end
:}

%convert {:
  function(token)
    return token[1]
  end
:}

%epilogue {:
  function(result)
    return result
  end
:}

%quote '(' LPAREN
%quote ')' RPAREN
%quote 'fun' FUN
%quote '->' ARROW
%quote '+' PLUS
%quote 'true' TRUE
%quote 'false' FALSE

consts := NUMBER
        | STRING
        | 'true'
        | 'false'

expr		:=    ('(' $expr ')' | $consts | ID ;) ( '+' $expr | $expr;)? | 'fun' ID+ '->' $expr;
root		:=    $expr;

This presents a lambda-calculus-esque language.

Philosophy

Why Lua? If you're interested in language implementation, you typically see two types of tutorials out there depending on your background.

  1. If you're in college, chances are you'll find your local Programming Language department to be filled with OCaml/Haskell enthusiasts pushing for compilers for mini-OCaml/Haskell/Imp. You'll be shown the purple Dragon Book and the way out.
  2. If you're on your own, chances are you'll stumble across various tutorials on how to use Lex/Yacc masquerading as "How to write your own compiler!" guides. You'll learn about words, and then you get shown this weird parser language called BNF, and finally you get a grammar for some subset of C. "Parsing is the hardest part of a compiler" they'll tell you, "the rest is trivial and is subsequently left as an exercise for the reader." Finally, they'll tell you to go out and buy the purple Dragon Book.

I've tried both of these approaches, and neither worked. Obviously, the Dragon Book is to be blamed.

My philosophy: don't read, just do. This is one of the core tenets of people who love to program. No matter how hard you try, you will rarely be able to have a full end-to-end perspective on a large and complex system. Stop falling into never-ending rabbit holes and start iterating. There's no better way to learn about a domain than to get your hands dirty.

So this is my sell for my grand ambition of getting free labor. Lua turns out to be a rather curious language, both from the language design perspective as well as the choice of tooling that it uses. No matter if you're a fresh developer or a seasoned Clang hacker (that's me!), you're sure to find something fresh and interesting because the design choices made by Lua are somewhat unconventional. So come hack along and squash some bugs. Hopefully you'll gain some valuable insights that'll finally bridge you across the deep chasm between the "How to even Compilers for Dummies" and actually fleshing out your own language.


Lua

More Repositories

1

readme2tex

Renders TeXy Math for Github Readmes
Python
906
star
2

float-hacks

Floating Point Hacks
C++
162
star
3

see.lua

A Lua 5.x/LuaJIT introspection library for humans
Lua
85
star
4

pyXLSX

A python class library that can read Office 2007 Excel Files
Python
43
star
5

RegexEnumerator

Combinatorial tricks to enumerate regular expressions in closed-form.
Python
33
star
6

BullshitMath.lol

You can't do that! Can you?
HTML
22
star
7

Rocket-for-Lua

A Sqlite ORM modeler (implements Django's model paradigm in Lua) for all data related projects.
Lua
18
star
8

pyChroma

Chroma-Hash in python as implemented by http://mattt.me/2009/07/chroma-hash-a-belated-introduction
JavaScript
9
star
9

Yuck

A Compiler Tutorial for a contemporary dynamic language.
Java
6
star
10

Xi

The Xi Language
C
5
star
11

NotepadThing

JavaScript
3
star
12

luadis

Lua Disassembler and decompiler
Lua
3
star
13

notetag

blargh
Python
3
star
14

pybird

A simple and fast python twitter library that doesn't need all of the hassels of python-twitter
Python
2
star
15

compiling-with-cps

Compiling with CPS
OCaml
2
star
16

mu

A small experimental language
C++
2
star
17

laughing-octo-bot

I have no fucking idea
C
2
star
18

AMX2D

AMX2D For CS2D - Advanced bootstrapping of all CS2D functionalities. Created out of boredom
Lua
1
star
19

Urban-RealEstate

Geo Project
Python
1
star
20

MatrixParser

cool reduction of CFG parsing into matrix multiplication
Python
1
star
21

monomer-dimer-tilings

Convoluted solution for the monomer-dimer tiling problem.
Python
1
star
22

LuaPrettify

Cleans up and formats your Lua Code (In Python)
Python
1
star
23

LetsBuildACompiler

A set of tools developed for let's build a compiler in python series
Python
1
star
24

rpn_jit

Experimenting with primitive JIT mechanisms.
C++
1
star
25

Simple-Parser

A simple LL(1) topdown parser generator implemented in the D programming language. (Sorry, no lexers yet)
D
1
star
26

Open2DServer

Open source implementation of CS2D 0.1.0.4 beta protocol (outdated)
Python
1
star