The master repository for this work is located at: https://savannah.nongnu.org/projects/stage0/
If you wish to contribute:
pull requests can be made at https://github.com/oriansj/stage0 and https://gitlab.com/janneke/stage0 or patches/diffs can be sent via email to Jeremiah (at) pdp10 [dot] guru or join us on libera.chat’s #bootstrappable or update the wiki at https://bootstrapping.miraheze.org/wiki/Stage0
Those wishing to work on POSIX ports of stage0 can do so here: https://github.com/oriansj/stage0-posix Those wishing to do work with CPM/DOS porting let me know (I have some easy work for you)
Goal
This is a set of manually created hex programs in a Cthulhu Path to madness fashion. Which only have the goal of creating a bootstrapping path to a C compiler capable of compiling GCC, with only the explicit requirement of a single 1 KByte binary or less.
Additionally, all code must be able to be understood by 70% of the population of programmers. If the code can not be understood by that volume, it needs to be altered until it satisfies the above requirement.
Also found within
This repo contains a few of my false start pieces that may be of interest to people who want to independently create the root binary. I welcome all bug fixes and code that aids in the above stated goal.
A link to the successful POSIX ports for x86, AMD64 and AArch64 in the POSIX submodule
FYI
I’ll be adding more code and documentation as I build pieces. ALL code in this REPO is under the GPLv3 or Later.
In order to build stage0 and all the pieces, one only needs to run make all. Each individual piece can be built by simply running make $piece with $piece being replaced by the actual part you want to make.
The only pieces that have any external dependencies are the Web IDE (Python3+CherryPy), libvm (GCC) and vm (GCC+GNU getopt) Those wishing to work in Python, please checkout https://github.com/markjenkins/knightpies He does an amazing job
Future
Software
Add more ports to more hardware platforms.
Hardware
Implement the Knight processor in FPGA and then convert into TTL.
Need to know information
This repository utilizes submodules, so you need to clone this repository using `git clone –recursive`. If you have already cloned it run `git submodule update –init` or after a pull be sure to do: `git submodule update –recursive`
stage0
The stage0 is the ultimate lowest level of bootstrap that is useful for systems without firmware, operating systems nor any other provided software functionality. Those with such capabilities can skip this stage as it requires human input.
Hex0_monitor
The Hex0_monitor provides dual functionality:
- It assembles hex0 programs manually typed in
- It writes the characters, providing minimal text input functionality.
The first is essential for creating of the root binaries. The second is essential for creating source files before you have an editor. The distinction is important because only the Hex0 assembler in stage1 is built by the Hex0_monitor and from that point onwards it is used as a minimal text editor until a more advanced text editor can be bootstrapped.
stage1
The stage1 is dependent on the availability of text source files and at least a hex0 monitor or assembler. The steps in this stage can be fully automated should one trust their automation or performed manually on any hardware they trust.
Regardless of which method selected, the resulting binaries MUST be identical.
Hex0
The Hex0 assembler or stage1_assembler-0 is the head node of the stage1 bootstrap. Its functionality is reduced compared to the stage0 monitor simply because it only performs half of the required functions; that of generating binaries from hex0 source files.
Its most important features of note are: ; line comments and
As careful notes are essential for this stage.
Hex1
The Hex1 assembler or stage1_assembler-1 is the next logical extension of the Hex0 assembler, single character labels and relative displacement using a prefix. In this case labels start with : thus the label a must be written :a and the prefix for relative offsets is @ thus the pointer must be written @a Further because of the mescc-tools standardization of syntax @label indicates a 16bit relative displacement.
Alternative architectures porting this need not limit themselves to 16bit displacements should they so choose, rather they must provide at least 1 size of displacement or if they so desire, they may skip and write their Hex2 assembler in Hex0 but as it is a much larger program, I recommend against it.
Hex2
The Hex2 assembler or stage1_assembler-2 or hex2_linker is as complex of a hex language that is both meaningful and worth the effort.
Hex2’s important advances over Hex1 are as follows: Support for long labels (Minimal 42 chars, ideally unlimited) Support for Absolute addressing ($label for 16bit absolute addresses) Support for Alternative pointer sizes (%label for 32bit relative and &label for 32bit absolute addresses)
Optionally support for !label (8bit relative addressing) and ?label (Architecture specific size/properties) and/or @label1>label2 %label1>label2 displacements may be implemented should the specific architecture require it for human readable hex2 source files (such as ELF headers).
M0
M0 or M0-macro or M1-macro is the minimal string replacement program with string processing functionality required to convert an Assembly like syntax into Hex2 programs that can be compiled. Its rules are merely an extension of Hex2 with the goal of reducing the amount of hex that one would need to write.
The 3 essential pieces are:
- DEFINE STRING1 HEX_CHARACTERS (No extra whitespace nor \t or \n inside
definition)
- “Raw strings” allow every character except ” as there is no support for
string escapes, including NULL; which are converted to Hex chars for Hex2 To convert back to the chars inside of the “quotes” with the addition of a trailing NULL character or the number desired (Must be at least 1, no upper bound) and restrictions such as padding to word boundaries are acceptable.
- ‘Raw char strings’ will be passing anything inside of them (except ’ which
terminates the string).
Thus by combining :label, @label, DEFINE SYSCALL 0F05, Raw strings and chars; one has created a rather flexible and powerful Assembler capable of building far more ambitious pieces in “Macro Assembly”.
stage2
The stage2 is dependent on the availability of text source files and at least a functional macro assembler and can be used to build operating systems or other “Bootstrap” functionality that might be required to enable functional binaries; such as programs that set execute bits or generate dwarf stubs.
FORTH
Because a great many people stated FORTH would be an ideal bootstrapping language the time and effort was put forth by Caleb and Jeremiah to provide a framework for those people to contribute immediately; thus the FORTH was born.
Several efforts were taken to make the FORTH more standard but ultimately it was determined, Assembly was preferable as the underlying architecture wasn’t total garbage.
It now sits waiting for any FORTH programmer who wishes to prove FORTH is a real bootstrapping language.
Lisp
The next recommendation in bootstrapping was Lisp, so efforts were taken to design the most minimal Lisp with all of the functionality described in the original Lisp papers. The task was completed relatively quickly compared to the FORTH and even had enhancements such as a compacting garbage collector.
Ultimately it was found, the lisp that many rave about isn’t entirely compatible with modern lisps or schemes; thus was shelved for any Lisper who wishes to pick it up.
C
After being told for months there is no way to write a proper C compiler in assembly and months of research without any human written C compilers in assembly found. To prove the point Jeremiah decided the First C compiler on the bootstrap would actually be a cross-compiler for x86, such that everyone would be able to verify it did exactly what it was supposed to and see it self-host its C version.
stage3
The stage3 is dependent on the availability of text source files and at least a functional M2-Planet level C compiler, FORTH and a Minimal Garbage collecting Lisp and can be used to build more advanced tools that can be used in bootstrapping whole operating systems with modern tool stacks.
initial_library
A library collection of very useful FORTH functionality designed to make the lives of any FORTH programmer easier.
It now sits waiting for any FORTH programmer who wishes to build upon it.
ascension
A library collection of useful Lisp functionality designed to make the lives of any Lisp programmer easier.
As it depends on archaic Lisp dialect; it will likely need to be replaced should the Lisp be properly fixed.
blood-elf_x86
The x86 program for a dwarf stub generator used in mescc-tools bootstrapping. Specifically mescc-tools-seed generation, which can be used to build M2-Planet and thus complete the circle.
get_machine_x86
The trivial x86 program that allows one to skip tests or scripts that will not run on that specific platform or run alternative commands depending upon the architecture.
hex2_linker_x86
The program that allows one to build the hex2 programs for any hardware platform on x86 and thus verify software builds for hardware one does not even have.
M1-macro_x86
The program that allows one to build the M1 program for any hardware platform on x86 and thus verify software builds for hardware one does not even have.
M2-Planet_x86
The x86 port of the M2-Planet C compiler v1.0 used as one of the paths in bootstrapping M2-Planet on x86 hardware.
Inspirations
This work wouldn’t have come so far without the inspirational work of others They are in alphabetical order of the Author’s last names
GRIMLEY EVANS, Edmund - bcompiler [http://homepage.ntlworld.com/edmund.grimley-evans/bcompiler.html] :: The inspiration for hex0, hex1 and hex2 GRIMLEY EVANS, Edmund - cc500 [http://homepage.ntlworld.com/edmund.grimley-evans/cc500] :: The inspiration for M2-Planet Jones, Richard W.M. - jonesforth [http://git.annexia.org/?p=jonesforth.git] :: The inspiration for stage2 FORTH and initial_library Piner, Steve and Deutsch, L. Peter - Expensive Typewriter [http://archive.computerhistory.org/resources/text/DEC/pdp-1/DEC.pdp_1.1972.102650079.pdf] :: The inspiration for SET kragensitaker - The Monitor [https://old.reddit.com/r/programming/comments/9x15g/programming_thought_experiment_stuck_in_a_room/c0ewj2c/] :: The inspiration for the hex0-monitor