• Stars
    star
    111
  • Rank 314,510 (Top 7 %)
  • Language
    Haskell
  • Created over 6 years ago
  • Updated over 5 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

theseus, functional programming language with fully reversible computation

Credit:

Credit for design and original implementation of Theseus goes to Roshan P. James and Amr Sabry of Indiana University. This is my fork of their implementation, which I intend to keep as a pet project.

The original project can be found here.

Theseus, the programming language.

Why create another programming language in this time and age? And why make such an obscure finicky one wherein it's not obvious how one can write a web server or an iphone app? Theseus exists due to a certain philosophical point of view on computation. What follows is a casual overview.

The paradox of the Ship of Theseus was originally proposed by Plutarch in the setting of the ship that the Greek hero Theseus used for his adventures. Since the Athenians admired Theseus they took on the task of preserving the ship in his memory. Over time, as parts of the ship got spoiled, they replaced them with equivalent new parts. Thus the ship was always in good shape.

However one day, many years later, someone makes the observation that the ship no longer had any original parts. The entirety of the ship had been replaced piece by piece. This raises the question: is this ship really still the ship of Theseus?

And, if not, when does the change take place? When the first part is replaced? When the last one is replaced? When about half the parts are replaced?

Further, to compound the question, Thomas Hobbes added the following corollary: Imagine a junk yard outside Athens where all the discarded parts of the ship had been collected. If the proprietor of the junk assembled the parts into a ship, which ship is now the real ship of Theseus?

This is a philosophical paradox about the nature of equality and identity. The question also applies to people. Since we are constantly changing, cells and thought patterns continuously being replaced, is any person the same person they were some time ago? Can anyone ever not change? Is a lactose free, sugar free, gluten free cupcake still a cupcake?

Our programming language is named Theseus because, like in the paradox, the main computation step involves replacing a value by an isomorphic value. Since isomorphic things really are the same, have we changed the value? And if not, what have we computed? Have we computed at all? As we keep doing these replacement in our program we will have transformed the value that was the input to the value that is the output.

There are many answers to the paradox of the ship of Theseus. The obvious answers of saying either 'yes, it is the same ship' or 'no it's not' have their justifications. Other answers exist too. One is to say that the question is wrong and that it meaningless to ask this sort of question about identity. Like a river whose very nature is to flow and change from moment to moment, so is nature of the ship. Over time the ship changes, just like the river, and it is meaningless to ask if is the same ship. It is our flawed notion of time and identity that makes us assume that the ship is not like the river and to expect one to have static identity and the other one to not.

Theseus owes its design to a certain philosophical point of view about computation. Turing machines and the lambda-calculus are abstract models of computation. They were invented by people as mental models of how computation may be expressed. These are meant to serve as conceptual entities and we may think of them as being free of physical constraints, i.e. it does not matter if you have the latest laptop, a whole data center or no computer at all, the ideas underlying the abstract models apply equally well. However there is something wrong with this view.

Abstract models of computation are meant to apply to our physical reality. We might be constrained by the resources we have at hand to do some computations. For instance, we might not have enough memory to run some programs. However, we do expect that models are compatible with our physics. For example, if we imagined that each step of a Turing machine took only half the amount of time the previous step took to execute, after 2 units of time time all Turing machines would have observable results. Such a machine is impossible to build (except maybe in the movie Inception where you could fall asleep causing time to speed up in each recursive dream step). Its not just the difficulty of building it thats at issue here. This violates the our notion of space and time in a deep way. Futher it gives rise to issues in our mathemaics and formal logics since it resolves the halting problem. We think of such models of computations that violate physics merely as curiosities and not as modeling computation in the physical world.

The line of work that Theseus comes from stems from the idea that abstract models of computation, must in their essence be compatible with our physics. Mordern physics at the level of quantum mechanics describes a universe where every fundamental action is reversible and fundamental quantities such as energy, matter and information are conserved. The notion of conservation of information stems from a line of thought originating from Maxwell's paradox of the "demon" that seemed to violate the Second law of Thermodynamics and its current accepted resolution set forth by Ralph Landauer. Landauer argued that the demon must do thermodynamic work to forget the information that its learns about the speed of each particle. This act of forgetting information implied that information should be treated as a physical quantity subject to conservation laws. In more recent years, Pasquele Malacaria and others wrote about the entropy of computation and Eric Lutz and others experimentally verified Landauer's principle. ("Maxwell's Demon 2" by Leff and Rex is a pretty good read on the general topic; the first part of the book is pretty accessible and the later part is largely a collection of historically relevant papers.)

So the physics that Theseus is concerned with is this strange new physics where information is not longer a concept of the human mind, like love and peace, but a physical entity. This has happened to computation before; before Turing worked out that computation itself had a formal notion that can be cpatured by abstract computational models, computation itself was considered an activity of the human mind not subject to the rigors of mathematics and logic.

The model of computation that was originally devised in this regard came out under the longish name of 'dagger symmetric traced bimonoidal categories'. The name was too long, was somewhat imprecise and didn't really say much at all. The underlying idea was that we could look for computation in the rules of equality.

If every allowed operation is a transformation of one quantity for another equivalent quantity, then computation should preserve information. However, can we even compute like this? After all, our conventional models of computation allows us to do "logical or" and "logical and" reducing two bools into one bool. Changing the value of a variable means loosing the information in that variable forever. Conditional statements and loops seem to inherently be lossy operations because it is unclear how to execute them in reverse without knowing which banches were originally taken. And more importantly, in the words of Barry Mazur, when is one thing equal to some other thing?

We settled on simplest notions of equality, those familiar from arithmetic. Rules like:

0 + x       = x
x + y       = y + x
(x + y) + z = x + (y + z)

1 * x       = x
x * y       = y * x
(x * y) * z = x * (y * z)

x * 0       = 0
x * (y + z) = x * y + x * y

if x = y and y = z, then x = z
if x = y and w = z, then x + w = y + z
if x = y and w = z, then x * w = y * z
if x + y = x + z, then y = z

We then took the numbers represented by x, y etc to be a measure of the "amount" of information and we only allowed operation that corresponded to these equalities. Thus each operation preserved the amount of information.

For a while it was unclear that what we had was even a model of computation, i.e. it took us a while to learn to express programs in it. Figuring out how to do the equivalent of a conditional took a long time in that setting. The resulting model of computation was one where every primitive operation was a sort of type isomorphism. When we add recursive types to the mix, the model becomes Turing complete. Programs in this early model were easier to represent as diagrams. Each program was essntially a complex wiring diagram where each wire had a type and process of "running" the program was the process of tracing the flow of particles through these wires. In the references below you can find lots of details about all this.

A Gentle Introduction to Theseus

While all of this worked out in theory, it was tedious constructing programs. For attempts of programming directly in the above model see PDF, PDF and PDF. This was when Theseus happened and we realized that we could express the computations of the above information preserving model in a format that looked somewhat like a regular functional programming language. Much like the situation of replacing ship parts with other equivalent ship parts, Theseus computes by replacing values with equivalent values.

All the programs in Theseus are reversible and the type system will prevent you from writing anything that isnt. You can program in Theseus without knowing anything about the body of theory that motivated it.

-- booleans
data Bool = True | False

-- boolean not 
iso not :: Bool <-> Bool
| True  <-> False
| False <-> True

Theseus has algebraic data types. It has no GADTs or polymorphism yet, but those are boring and can be added later. For the purpose of this presentation I will pretend that we do have polymorphism though. The equivalent of functions in Theseus are the things called iso. The iso called not maps Bool to Bool. In a coventional functional language the part on the left hand side of the <-> is called the pattern and the part on the right is called the expression. In Theseus, both the LHS and the RHS of the <-> are called patterns. In Theseus patterns and expressions are the same thing.

Now here is the interesting bit: The patterns on the right hand side and the patterns on the left hand side should entirely cover the type. On the RHS we have

:: Bool <->
| True  <->
| False <->

which do indeed cover all the cases of the type Bool. On the LHS we have

:: <-> Bool
|  <-> False
|  <-> True

which also covers the type bool. Patterns cover a type, when every value of the type is matched by one and only one pattern. So the following single pattern does not cover the type Bool since there the value True is unmatched.

:: Bool
| False

The following also does not cover the type Bool since the value False can be matched by both patterns. Here the variable x is a pattern that matches any value.

:: Bool
| False
| x

The following does cover the type Bool:

:: Bool
| x

Here is another type definition and another iso:

data Num = Z | S Num 

iso parity :: Num * Bool <-> Num * Bool
| n, x                   <-> lab $ n, Z, x
| lab $ S n, m, x        <-> lab $ n, S m, not x
| lab $ Z, m, x          <-> m, x
where lab :: Num * Num * Bool

Here the lab is called a label and labels are followed by a $ sign. All the patterns that have no label should cover the type of the corresponding side of the function.

:: Num * Bool <-> Num * Bool
| n, x        <-> 
|             <-> 
|             <-> m, x

Here n, x on the LHS do cover the type Num * Bool. The variable n matches any Num and the variable x matches any Bool. The RHS is covered similalry by m, x. The label lab has the type Num * Num * Bool and the intention is that the patterns of the label on the LHS and the RHS must each should cover the type of label.

:: Num * Num * Bool <->
|                   <-> 
| lab $ S n, m, x   <-> 
| lab $ Z, m, x     <-> 

On the LHS we see that every value of the type Num * Num * Bool is matched by one of the two patterns. The same applies to the RHS patterns of the lab label.

Labels are a way of doing loops. When a pattern on the LHS results in a label application on the RHS, control jumps to the LHS again and we try to match the resulting value against a pattern of teh label on the RHS. This continues till we end up in a non-labelled pattern on the right. So lets trace the execution of parity when we given it the input S S S Z, True of the type Num * Bool.

S S S Z, True           -> lab $ S S S Z, Z, True 
lab $ S S S Z, Z, True	-> lab $ S S Z, S Z, False
lab $ S S Z, S Z, False -> lab $ S Z, S S Z, True
lab $ S Z, S S Z, True	-> lab $ Z, S S S Z, False
lab $ Z, S S S Z, False -> S S S Z, False 

So parity of S S S Z, True applied not to True three times, resulting in S S S Z, False. Here is one more iso:

iso add1 :: Num <-> Num 
| x            <-> ret $ inR x
| lab $ Z      <-> ret $ inL () 
| lab $ S n    <-> lab $ n
| ret $ inL () <-> Z
| ret $ inR n  <-> S n
where ret :: 1 + Num
      lab :: Num

This iso has two labels lab and ret and one can verify that the same coverage contraints hold. Here the type 1 + Num would be written as Either () Num in Haskell, i.e. 1 is the unit type that has only one value. The value is denoted by () and is read as "unit". One can run add1 on any n of type Num and verify that we get S n back as the result.

Now here is the next interesting bit: For any Theseus iso, we can get its inverse iso by simply swapping the LHS and RHS of the clauses. This also get at the essence of the idea of information preservation: if we have a program and an input to it, we can run the input through the program and get the program and the output. Using the program and the output, we can recover the input.

Some programs may not terminate on some inputs and in such cases it is meaningless to ask about reverse execution. Given that Theseus is a Turing complete language it is not surprising that some executions are non-terminating. What however may be surprising that information preservation holds in the presence of non-termination.

Here the reverse execution of add1, lets call it sub1 does indeed diverge on the input Z. For every other value of the form S n it returns n. Its worth tracing this execution and thinking about why this comes about in Theseus.

Theseus also supports the notion of parametrizing an iso with another one. For example:

iso if :: then:(a <-> b) -> else:(a <-> b) -> (Bool * a <-> Bool * b)
| True, x  <-> True, then x
| False, x <-> False, else x

Here the iso called if takes two arguments, then and else, and decides which one to call depending on the value of the boolean argument. Theseus doesn't really have higher-order or first-class functions. The parameter isos are expected to be fully inlined before transofrmation of the value starts. Theseus will only run a value v:t1 on an iso of type t1 <-> t2, and the -> types in the above should be fully instantiated away.

Running if ~then:add1 ~else:sub1 on the input True, S Z gives us the output True, S S Z and running the same function with input False, S Z gives us False, Z. The syntax of labelled arguments is similar to that used by OCaml.

Reverse evaluation works by flipping LHS and RHS in the same way as before. However, you wonder, what happens when there is a function call in a left hand side pattern? Here is a simple example:

iso adjoint :: f:(a <-> b) -> (b <-> a)
| f x <-> x

Here is the interesting bit again: An iso call on the left side pattern is the dual of an iso call on the right hand side. When f x appears on the right we know that we have an x in hand and the result we want is the result of the application f x. When f x appears on the left it means that we have the result of the application f x and we want to determine x. We can infer the value of x by tracing the flow of the given value backwards through f and the result of the this backward execution of f is x. We can do this because isos represent information preserving transformations.

So if we had add1 and its inverse sub1, then adjoint ~f:add1 is equivalent to sub1 and adjoint ~f:(adjoint ~f:add1) is equivalent to add1.

Some references for additional reading.

  • Roshan P. James and Amr Sabry. Theseus: A High Level Language for Reversible Computing. Work-in-progress report in the Conference on Reversible Computation, 2014. PDF

    This is the paper that introduces Theseus. The syntax of Theseus as presented here differs somewhat from what is in the paper, but most of the these syntactic differences are superficial and largely an artifact of the difficulty of having Parsec understand location sensitive syntax. Please see below for some notes on how the implementation of Theseus differs from that in the paper. For more of the academically relevant citations, you can chase the references at the end of the paper.

  • Harvey Leff and Andrew F. Rex. Maxwell's Demon 2 Entropy, Classical and Quantum Information, Computing. CRC Press, 2002. Amazon

    This book is a survey of about a century and a half of thought about Maxwell's demon with significant focus on Landauer's principle and the work surrounding it.

Running Theseus Programs

For now Theseus does not have a well developed REPL and is still work in progress. We use the Haskell REPL to run it. Roughly:

$ cd examples/
$ ghci ../src/Theseus.hs 
[...]
Ok, modules loaded: Theseus.
*Theseus> run "peano.ths"
[...]
-- {Loading bool.ths}
Typechecking...
Evaluating...
eval toffoli True, (True, True) = True, (True, False)
eval toffoli True, (True, False) = True, (True, True)
[...]
*Theseus> 

Like the run function above, we also have echo which parses the given file and echoes it on screen and echoT which prints out all the definitions after parsing the given file and inlining all the imports. echoT also checks for violations of non-overlapping and exhaustive pattern coverage and reports these. For instance:

*Theseus> echoT "test.ths"
[...]
iso f :: Bool <-> Bool
| True <-> False
| x <-> True
 
Error: LHS of f: Multiple patterns match values of the form : True
*Theseus> 

Differences between the implementation and the RC 2014 paper

The current implementation of Theseus is of experimental status. There are many niceties and tools required for a proper programming language that are not available as yet. When editing Theseus programs turn on Haskell mode in Emacs and that tends to work out ok.

Here are some notable differences from that of the paper.

  • The type checker is not yet implmented. However, Theseus will check strict coverage and complain if there are clauses that are overlapping or non-exhaustve. It does not check types of the variables or that they are used exaclty once. The strict coverage of function call contexts is also not checked.

  • We need to specify a keyword called iso when defining maps.

  • The import and file load semantics currently creates one flat list of definitions. This can violate the expected static scope of top level definitions.

  • The paper allows for dual definitions as shown below. The current implementation does not handle simulataneous definition of the inverse function :: sub1

  add1 :: Num <-> Num :: sub1
  | ... 
  • All constructors take only one type or value arugment, similar to OCaml constructor definitions. Hence one has to write
data List = E | Cons (Num * List)

f :: ...
| ... Cons (n, ls) ...  

instead of

data List = E | Cons Num List

f :: ...
| ... (Cons n ls) ...  

More Repositories

1

nix-std

no-nixpkgs standard library for the nix expression language
Nix
102
star
2

semirings

semirings and *-semirings of types in base/haskell-platform
Haskell
40
star
3

hsdatalog

BDD-based implementation of Datalog
Haskell
33
star
4

pump

reverse dependency build matrix generator
Haskell
31
star
5

shwifty

DEPRECATED: use https://github.com/MercuryTechnologies/moat
Haskell
23
star
6

eigen

Haskell bindings to the Eigen C++ library
C++
16
star
7

nixos-configs

my nixos configs
Nix
14
star
8

streaming-fft

sliding fast fourier transform using haskell streaming
Haskell
13
star
9

ring-buffers

concurrent mutable ring buffers with atomic updates in GHC Haskell
Haskell
11
star
10

patience

Patience diff and longest increasing subsequence
Haskell
11
star
11

string

DEPRECATED: `text` is now UTF-8: https://github.com/haskell/text
Haskell
9
star
12

vector-circular

cycling vectors with O(1) rotation and O(1), total access
Haskell
9
star
13

freq

frequency analysis tool in Haskell
Haskell
8
star
14

haskell-library-picker

domain-based library choice assistant for Haskell
7
star
15

nike

nixos remote installation via `justdoit`
Nix
7
star
16

emmy

custom event manager in ghc haskell
Haskell
7
star
17

st2

ST2 monad as described in the functional pearl "Ghosts of Departed Proofs"
Haskell
6
star
18

diet

discrete interval encoding trees (DIETs) - deprecated in favour of https://github.com/andrewthad/primitive-containers/tree/master/src-diet-map-indef
SMT
6
star
19

we-tha-best-slack

DJ Khaled SlackBot
Haskell
5
star
20

these-skinny

clone of haskell 'these' package without the dependency bloat
Haskell
5
star
21

silvi

Haskell library for generating fake data.
Haskell
5
star
22

hedgehog-generic

GHC Generics automatically derived hedgehog generators
Haskell
5
star
23

lacroix

crack open a cold one with the boys
Haskell
5
star
24

knuckles

generate typeclass test suites using hedgehog-classes and ghc-api
Haskell
4
star
25

grab

fetch hash information for nixpkgs from github repos
Haskell
4
star
26

yi-chessai

my yi configuration
Haskell
4
star
27

bytestring-substring-old

break ByteStrings up into substrings
Haskell
4
star
28

simd

GHC Haskell NCG SIMD library (non-portable)
Haskell
4
star
29

unpacked-either

Haskell
3
star
30

wizard

the wizard monad
Haskell
3
star
31

acme-cuteboy

maybe gives you a cute boy.
Haskell
3
star
32

rose

rose trees
Haskell
3
star
33

dyna

haskell growable vector library
Haskell
3
star
34

std

Haskell
3
star
35

cfr

A Haskell implementation of Counterfactual Regret Minimization for poker
Haskell
3
star
36

plutus-flake-utils

a nix flake for plutus projects
Nix
3
star
37

TicTacToe

TicTacToe (Naughts and Crosses) implementation to learn Haskell
Haskell
3
star
38

gw

ghcWithPackages cmdline util
Haskell
2
star
39

acme-npm

the heart and soul of web development
Haskell
2
star
40

wa

commandline calculator
Haskell
2
star
41

streaming-lz78

Haskell
2
star
42

kafka

Nix
2
star
43

disjoint-sets

Haskell Disjoint Sets (Union Find) Library
Haskell
2
star
44

bytestring-substring

See README for more info
Haskell
2
star
45

chessai

2
star
46

cmf

concurrent (commutative) monoidal folds
Haskell
2
star
47

always

a typeclass that is satisfied by all types, always
Haskell
2
star
48

union-find

rust union find (disjoint sets) library
Rust
2
star
49

morse

Haskell
2
star
50

pong

GHC Haskell command-line ping utility
Haskell
2
star
51

poketext

Haskell
2
star
52

hs-onesixtyone

haskell snmp scanner
Haskell
2
star
53

levity

levity-polymorphic versions of typeclasses in Haskell's `base` library
Haskell
2
star
54

contiguous-mmap

memory map contiguous data structures
Haskell
2
star
55

asn

Haskell
2
star
56

cool

Haskell
2
star
57

useless

useless: random haskell thoughts, either mine or collected from elsewhere
Haskell
2
star
58

contiguous-bogo

package i wrote to annoy @andrewthad
Haskell
2
star
59

reflex-mono

monomorphised reflex
Haskell
2
star
60

rtq

purely functional real time queues in haskell
Haskell
2
star
61

kittypath

typesafe bytestring-based path library with a simpler/"user-friendly" api
Haskell
2
star
62

nvim-configs

Lua
2
star
63

1password-beta-nix

nix expression to test out the 1password beta (linux only)
Nix
2
star
64

unpacked-validation

Haskell
2
star
65

poke-type

pokemon type that maintains the invariant of type uniqueness per pokemon
Haskell
2
star
66

nom

bytearray parser combinators
Haskell
2
star
67

nonempty

lift potentially empty structures to one guaranteed to be nonempty
Haskell
2
star
68

generics-bench

generate modules to benchmark generics deriving
Haskell
1
star
69

gp

openconnect convenience wrapper for globalprotect protocol
Nix
1
star
70

vellian

fibonacci algebra
Haskell
1
star
71

initialize

initializing and deinitializing unmanaged memory of recursive structures
Haskell
1
star
72

bytestring-twiddle

bit-twiddling optimisations on bytestrings
Haskell
1
star
73

wc

See README for more info
Haskell
1
star
74

uuid-uhm

don't use this
Haskell
1
star
75

fractran

[FR]eaking [A]wful [C]rappy fractran interpreter
Haskell
1
star
76

jabber-jaw

Haskell
1
star
77

xmlbf

github fork of xmlbf
Haskell
1
star
78

soylent

never eat food again
Haskell
1
star
79

toposort

Haskell
1
star
80

coerce-util

some utilities for Data.Coerce and some newtype utilities
Haskell
1
star
81

aks

AKS primality test in haskell
Haskell
1
star
82

bazd

Haskell
1
star
83

oxyclean

uses fsnotify to call cargo check on rust projects
Nix
1
star
84

us-weather-history

us weather history data
1
star
85

dependent-map-aeson

aeson instance for dependent-map-Data.Dependent.Map.DMap
Haskell
1
star
86

rstream

vector stream-fusion framework as a package (playground)
Haskell
1
star
87

hjd

hydra jobset configuration diff tool
Haskell
1
star
88

hadamard

Haskell
1
star
89

simple-fft

Haskell
1
star
90

dame

Haskell
1
star
91

bitarray

bitarray supporting efficient writes/reads
Haskell
1
star
92

multihash

Haskell
1
star
93

chessai-summoner

See README for more info
Haskell
1
star
94

yaml-with-cabal-file

don't use this. needed for fetchFromGitHub
Haskell
1
star
95

qsem

simple and general quantity semaphores in ghc haskell
Haskell
1
star
96

pktparse

haskell port of rust's pktparse
Haskell
1
star
97

vflow-types

types for https://github.com/VerizonDigital/vflow
Haskell
1
star
98

hedgehog-classes-old

Haskell
1
star
99

go

Haskell implementation of the Logical Rules of Go
Haskell
1
star
100

afpg

afpg talks
Haskell
1
star