  • Stars: 142
  • Rank: 258,495 (Top 6 %)
  • Language: Ballerina
  • License: Apache License 2.0
  • Created: over 3 years ago
  • Updated: 12 months ago

Repository Details

Ballerina compiler that generates native executables.

nBallerina

Goals

The long-term goal of the nBallerina project is to create a new compiler for the Ballerina language that is written in Ballerina and can generate native code using LLVM. It will implement the Ballerina language as defined by the 2021R1 version of the language specification; we will aim to track any agreed changes to the spec.

Eventually we expect nBallerina to replace the existing jBallerina compiler, which is written in Java. The long-term vision for Ballerina is to support execution both natively and on top of the JVM, which means it will eventually need to be possible also to generate JVM bytecode using nBallerina.

At a high level, nBallerina and jBallerina share a common architecture: they are divided into a frontend and a backend, which communicate using an intermediate representation called BIR. Since we expect it to be several years until nBallerina can replace jBallerina, in the meantime we want to set things up so that we can combine the jBallerina frontend with the nBallerina backend. Eventually we may also want to use the jBallerina backend to allow nBallerina to produce JVM bytecode.

One of the most fundamental features of the Ballerina language is that subtyping is semantic. This means that a type corresponds to a set of values, and the subtype relationship between types is defined in terms of the subset relationship between the corresponding sets of values: a type S is a subtype of T if and only if the set of values denoted by S is a subset of the set of values denoted by T. This is a very simple and powerful idea, but unfortunately is rather challenging to implement.
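
To make the set-based definition concrete, here is a toy model in Python (not nBallerina code). It assumes a tiny finite universe of tagged values so that each type can be written out as an explicit set. The names NIL, INT, SMALL_INT, union and is_subtype are invented for this illustration. A real implementation cannot enumerate values like this, which is part of why the idea is hard to implement.

```python
# Toy model of semantic subtyping: a type is the set of values it denotes.
# Values are tagged with their basic type so that, for example, the int 1 and
# the boolean true stay distinct (in Python, True == 1).

NIL = frozenset({("nil", None)})
BOOLEAN = frozenset({("boolean", True), ("boolean", False)})
INT = frozenset(("int", n) for n in range(-2, 3))  # pretend int is just -2..2
STRING = frozenset(("string", s) for s in ("", "a", "b"))


def union(*types):
    """A union type denotes the union of the member value sets."""
    return frozenset().union(*types)


def is_subtype(s, t):
    """S is a subtype of T iff every value of S is also a value of T."""
    return s <= t


SMALL_INT = frozenset(("int", n) for n in range(0, 3))  # hypothetical subrange
INT_OR_STRING = union(INT, STRING)
OPTIONAL_INT = union(INT, NIL)

print(is_subtype(SMALL_INT, INT))               # True
print(is_subtype(INT, INT_OR_STRING))           # True
print(is_subtype(INT_OR_STRING, INT))           # False
print(is_subtype(OPTIONAL_INT, INT_OR_STRING))  # False: nil is not in int|string
```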

The implementation of jBallerina was begun many years ago, before typing was fully worked out. Its most significant current limitation is that its implementation of subtyping is not semantic. Instead it implements a syntactic approximation to semantic subtyping. Although this is good enough for many uses, it is a high priority to fix this. Semantic subtyping needs to be consistently applied in the frontend, the backend and the runtime.
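
The gap between a syntactic approximation and semantic subtyping can be sketched with a toy example in Python. The structural rules below are a generic, naive rule set written for this illustration; they are not a description of jBallerina's actual rules, and the type encoding and function names are hypothetical. In this toy value-set model, the one-member tuple type [int|string] denotes the same values as [int]|[string], but the rule-based check misses that.

```python
# Type expressions are nested tuples:
#   ("int",), ("string",), ("union", A, B), ("tuple", A)   # [A], one member
INTS = [0, 1]          # tiny stand-in for the int values
STRS = ["a", "b"]      # tiny stand-in for the string values


def denote(t):
    """Return the set of (tagged) values denoted by a type expression."""
    kind = t[0]
    if kind == "int":
        return frozenset(("int", n) for n in INTS)
    if kind == "string":
        return frozenset(("string", s) for s in STRS)
    if kind == "union":
        return denote(t[1]) | denote(t[2])
    if kind == "tuple":
        return frozenset(("tuple", v) for v in denote(t[1]))
    raise ValueError(f"unknown type constructor: {kind}")


def semantic_subtype(s, t):
    """Subset test on the denoted value sets."""
    return denote(s) <= denote(t)


def syntactic_subtype(s, t):
    """Naive structural rules that only look at the shape of the types."""
    if s == t:
        return True
    if s[0] == "union":   # every member of S must fit in T
        return syntactic_subtype(s[1], t) and syntactic_subtype(s[2], t)
    if t[0] == "union":   # S must fit entirely in one member of T
        return syntactic_subtype(s, t[1]) or syntactic_subtype(s, t[2])
    if s[0] == "tuple" and t[0] == "tuple":
        return syntactic_subtype(s[1], t[1])
    return False


INT, STRING = ("int",), ("string",)
S = ("tuple", ("union", INT, STRING))             # [int|string]
T = ("union", ("tuple", INT), ("tuple", STRING))  # [int]|[string]

print(semantic_subtype(S, T))   # True
print(syntactic_subtype(S, T))  # False: neither [int] nor [string] alone covers S
```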

The initial goals for nBallerina are the things that jBallerina does not yet do:

  • generating LLVM IR from BIR
  • semantic subtyping.

There is an earlier project that was called nBallerina. It took a different approach: a backend written in C++, which read serialized BIR from the jBallerina frontend. This is in the nballerina-cpp repository and is no longer being developed.

Structure

The nBallerina compiler, which is organized as a Ballerina project in the compiler directory, is structured into the following components written in Ballerina:

  • Semantic subtyping implementation (in the types module). This provides a normalized representation of Ballerina types, and operations on that normalized representation.
  • BIR (in the bir module). This is a definition of BIR as a Ballerina type, together with some utility functions. BIR (as used in nBallerina) represents types using the normalized representation provided by the semantic subtyping implementation. This also includes a verifier that uses the semantic subtyper to verify that the BIR is well-typed. This depends on the types module.
  • Front-end (in the front module). This generates BIR from the source code of a Ballerina module using the front.syntax module. This depends on the bir module and the front.syntax module.
  • Parser (in the front.syntax module). This produces an abstract syntax tree from the source code for a single file.
  • LLVM API (in the print.llvm module). This provides a Ballerina API to LLVM. The implementation of this API builds a textual representation of the LLVM IR as LLVM assembly language, which can be written to an .ll file and then compiled with LLVM's clang command. This is designed to be very similar to the LLVM C API. This does not depend on any of the other modules.
  • LLVM API on top of JNI (in the jni.llvm module). This implements the same API as the print.llvm module on top of the LLVM C API via JNI. (So this is Ballerina, on top of Java, on top of C via JNI, on top of C++.)
  • Native backend (in the nback module). This builds the LLVM IR representation of a Ballerina module from the BIR representation of a Ballerina module. It depends on the bir and the print.llvm modules (but not the front module).
  • A compiler driver (in the default module). This calls the frontend to generate the BIR and then calls the backend to generate LLVM IR. It depends on the bir, front and nback modules.
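
The idea behind the print.llvm module, an API that builds LLVM IR as text which can then be written to an .ll file, can be sketched in a few lines of Python. This is only an illustration of the technique: the FunctionBuilder class and its methods are hypothetical and far smaller than any real API, and the actual module is written in Ballerina and modelled on the LLVM C API.

```python
class FunctionBuilder:
    """Accumulates the textual LLVM IR for one function (hypothetical API)."""

    def __init__(self, name, ret_type, params):
        # params is a list of (type, name) pairs, e.g. [("i64", "a")]
        self.name = name
        self.ret_type = ret_type
        self.params = params
        self.lines = []
        self.next_temp = 0

    def _fresh(self):
        """Return a new, unused register name such as %t0."""
        reg = f"%t{self.next_temp}"
        self.next_temp += 1
        return reg

    def add(self, ty, lhs, rhs):
        """Append an integer add instruction and return its result register."""
        result = self._fresh()
        self.lines.append(f"  {result} = add {ty} {lhs}, {rhs}")
        return result

    def ret(self, ty, value):
        """Append a return instruction."""
        self.lines.append(f"  ret {ty} {value}")

    def to_ll(self):
        """Render the function as LLVM assembly text."""
        sig = ", ".join(f"{ty} %{name}" for ty, name in self.params)
        header = f"define {self.ret_type} @{self.name}({sig}) {{"
        return "\n".join([header, *self.lines, "}"]) + "\n"


fb = FunctionBuilder("add", "i64", [("i64", "a"), ("i64", "b")])
total = fb.add("i64", "%a", "%b")
fb.ret("i64", total)
print(fb.to_ll(), end="")
# define i64 @add(i64 %a, i64 %b) {
#   %t0 = add i64 %a, %b
#   ret i64 %t0
# }
```

Because the builder only produces text, it needs no native LLVM bindings, which matches the README's note that this module does not depend on any of the others.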

The starting point for the types and front modules was the semtype project.

As well as a compiler, nBallerina needs a runtime, which is in the runtime directory. This is currently in C and fairly minimal. The major components will include

  • runtime type checker (we hope to write this mostly in Ballerina);
  • implementation of the langlib;
  • garbage collector (the current plan is for this eventually to be written in Rust and built on top of MMTk);
  • scheduler.

Implementation plan

The implementation strategy is to start by implementing a tiny subset of the language, and then implement progressively larger subsets. The plan is that each subset will be implemented correctly. This implies that

  • if the compiler accepts the program, then the result when executed will behave as the language spec requires it to behave;
  • the compiler will not accept a program that the language spec says is not a valid program.

In accordance with the initial goals of the project, we will not initially devote much engineering effort to usability aspects of the front end, such as high-quality error messages and error recovery. The initial priority for the front-end is correctness; we will evolve it into something more usable later.

In designing the sequence of subsets, we want to

  • maintain correctness
  • be able to write the appropriate parts of the runtime needed for each subset in Ballerina
  • define subsets syntactically as much as possible (i.e. the subset is all programs that have this grammar)

and then work towards implementing a subset that is sufficient for each of the following milestones

  1. Supporting all the BIR instructions needed to self-host the compiler. This will allow us to run the entire nBallerina compiler natively, by compiling it using the jBallerina frontend in conjunction with the nBallerina backend.
  2. A full implementation of semantic subtyping. This provides a foundation for jBallerina to switch over to doing semantic subtyping.
  3. Self-hosting the compiler.
  4. Implementing useful Ballerina clients.

Note that as regards the third of the above milestones, we will allow the nBallerina compiler to make use of all the relevant language features implemented by jBallerina, since a secondary goal of the nBallerina project is to help with testing jBallerina. This should not affect how long it takes us to get to the first of the above milestones, but will mean it takes us longer to get to the third. We will, however, initially restrict nBallerina's use of concurrency: we don't want to have to implement Ballerina's concurrency features before we can self-host.

Usage

The compiler has not yet got to a stage where it is useful. But if you want to play with it or help with development, this is the way:

  1. Clone the nBallerina repository.
  2. Download and install the latest Ballerina distribution (Swan Lake, not 1.2.x).
  3. You can build the compiler by using the command bal build in the compiler directory; this will generate a file target/bin/nballerina.jar. This should work on any system that Ballerina works on.
  4. You can use java -jar nballerina.jar example.bal to compile a Ballerina module into an LLVM assembly file example.ll (note that only a tiny subset of the language is currently implemented, as described in the Status section).
  5. If you want to be able to turn the LLVM assembly file into something you can execute, there are additional requirements: Linux or OS X, LLVM 16 and GNU make. With these, you can build the runtime and run the tests by running make test in the top-level directory. This compiles and executes all the test cases and checks that they produce the right outputs. You can use e.g. make -j8 to make it run tests in parallel.

If you want to turn the LLVM assembly into an executable, you can use the test/run.sh command.

You can try out the semantic subtyping implementation using

java -jar nballerina.jar --showTypes example.bal

where example.bal is a Ballerina module containing only type definitions and const definitions. It will print out the subtype relationships between all the defined types.

To output the Ballerina intermediate representation (BIR) for a module, use

java -jar nballerina.jar --bir example.bal

Testing

The compiler is tested using the test cases in the compiler/testSuite directory. The bal build command performs a first level of testing on these: it checks that the test cases that should get compile errors do, and that the test cases that should not get compile errors do not. This should work on any platform on which the Ballerina distribution works.

For those test cases that are valid Ballerina programs, the Makefile in the test directory is used to further test that the generated LLVM assembly files can be compiled with LLVM and give the correct output when executed. This Makefile has the following targets:

  • test (the default target) forces a rebuild of all the .ll files if the compiler jar has changed, and then does compile and testll
  • compile builds any out of date .ll files (but does not consider the compiler jar when determining whether a .ll is out of date)
  • testll builds an executable for each test case, executes it and checks its output
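
The difference between the test and compile targets, as described above, comes down to which files count when deciding whether a .ll file is out of date. The Python sketch below models that decision for a single test case. It is an assumption-laden approximation: the README does not say how the Makefile detects that the compiler jar has changed, so the sketch treats it as "the jar is newer than the .ll file", and the function name and parameters are hypothetical.

```python
def needs_rebuild(target, bal_mtime, ll_mtime, jar_mtime):
    """Decide whether one test case's .ll file should be regenerated.

    target    -- "test" or "compile"
    bal_mtime -- modification time of the .bal source
    ll_mtime  -- modification time of the .ll file, or None if it is missing
    jar_mtime -- modification time of the compiler jar
    """
    if ll_mtime is None or ll_mtime < bal_mtime:
        return True   # missing, or older than its source
    if target == "test" and ll_mtime < jar_mtime:
        return True   # assumption: "jar has changed" means jar newer than .ll
    return False      # compile ignores the jar entirely


# The .ll is newer than its source but older than the compiler jar.
print(needs_rebuild("compile", bal_mtime=10, ll_mtime=20, jar_mtime=30))  # False
print(needs_rebuild("test", bal_mtime=10, ll_mtime=20, jar_mtime=30))     # True
```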

Status

We have completed subset 14 and are working on subset 15.

The semantic subtyping implementation is further along than the backend. It implements the subset of Ballerina's type syntax described by this grammar.
