• Stars
    star
    298
  • Rank 139,663 (Top 3 %)
  • Language
  • Created about 8 years ago
  • Updated over 2 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

The start page about my efforts around smart contract verification

Formal Verification of Ethereum Contracts (Yoichi's attempts)

This is the start page about my efforts around smart contract verification.

Goal

The goal is to establish a method how to verify a smart contract so that no surprises happen after their deployment.

Talks

  • gave a talk in DEVCON 1
  • Berlin, 2016-11-1; gave a talk in Berlin Ethereum Meetup (video)
  • Paris, EDCON, 2017-02-17 slide video
  • Cambridge, ARM Research Summit, 2017-09-13 slide

Links

The rest of the page

What are formal methods

What are formal methods? The word "formal" here is about looking at the shape, not at the meaning, of mathematical proofs.

For a long time, mathematical proofs were read, understood and then checked. In some cases you can do calculations like "a + b - b = a" by just looking at the form, saying something like "this cancels that". Still, in the usual practice of mathematics, you ought to be able to explain what is going on. You need to understand.

In the first half of 20th century, rigid languages appeared where proofs can be checked by machines. You can see examples here. Maybe the computers do not understand the meaning, but they can check the proofs.

When you translate mathematical texts into such machine-readable proofs, you are "formalizing" mathematics. Recent decades saw a tantalizing progress in this area. The Flyspeck project and CoqFiniteGroups formalized big results from the past centuries. The Homotopy Type Theory was formalized from its very early stages.

So far this was about mathematics. The take away is, you can obtain infinitely many truths at one shot. The equality "a + b - b = a" is true for any natural numbers. By the way, here we have a smart contract, whose input is actually only finitely many. Can we do something about this?

Path a: Verification of EVM bytecodes (currently followed)

One way is to look at the EVM bytecodes. They are executed on a simple virtual machine. The rules of the virtual machine is well understood by different Ethereum clients which usually match (otherwise they fix the difference with uttermost priority). The current attempt in Coq is in evmverif repository, and there is a screen cast.

Obstacles

  • Coq proofs are currently lengthy. Isabelle/HOL made the proofs much shorter.
    • I believe Isabelle/HOL provides much easier user experience (because it has a well-polished machine word library). I've ported the Coq attempt into Isabelle/HOL.
  • At the bytecode level, it's harder to see what the code is doing and what to expect.
    • One solution is to make the Solidity compiler annotate the bytecode with the expected properties at specific code location.
  • At the bytecode level, Solidity array access looks like storage access with Kaccek hashes, and somehow we need to assume no collisions. If I assume ∀ a, b. keccak(a) = keccak(b) -> a = b, using the pigeon hole argument, I can prove 0 = 1 and everything.
    • I can locally say sorry or admit to indicate that I give up proofs whenever I have to prove keccak(a) != keccak(b) because a != b. Then the proofs would be incomplete and I cannot call my results "theorems", but I'm happy that way. Anyway the compiler assumes no hash collisions without logical justification (but empirical).
    • The module system of Coq provides a clever workaround.

Steps on Path a

  1. try Isabelle/HOL to see if proofs are really 5 times easier there (less than a week)
    • By the way, about the trustworthiness, I would put Isabelle/HOL at least as good as Coq because Isabelle/HOL is based on a simpler deduction system (a deduction system is not a piece of software; it is a bunch of inference rules usually written in LaTeX)
  2. choose Coq or Isabelle/HOL (overnight) I chose Isabelle/HOL over Coq.
  3. verify Deed and some other simple bytecode programs against simple properties (6-10 days) This is finished and there is a report.
  4. translate the definitions into Lem (2 weeks; almost done except the Keccak hash) done.
  5. develop a method how to verify assertions between opcodes (this is evmverif#5) (a week; there is a milestone for this item)
  6. cover all opcodes (3 days): this is already in progress. A milestone. mostly done, but LOGx does nothing right now
  7. test the new EVM against the standard VM tests (4 weeks; started, now parsing the tests) mostly done. but VM tests involving multiple contracts are still skipped
  8. ~~build a simple Hoare logic
  9. combine the Hoare logic with an invariant tracking technique
  10. develop Ethereum ABI
  11. implement the packaging ABI
  12. extend the model so that it can run state-tests
  13. implement verification-condition generation -- the desired post-condition and invariant annotations at JUMPDESTs would generate the formulas to prove.
  14. implement a wallet (might be between 8. and 9.)
  15. build decompilation of EVM bytecode into a recursive function in a theorem prover
  16. try to automate the process of verification / finding vulnerabilities

Personal Account of the Struggle

  • My EVM in Isabelle/HOL says "your memory contents might change after you CALL another account", and I'm like "no, that's paranoid."
  • The verification of Deed contract finishes. I break the contract to see what happens, then the proofs still work. I had a bug in my JUMPI implementation.
  • I try to verify the Deed contract again, I cannot prove the safety property anymore, and I find that one precondition was missing about active flag.
  • I added gas to the EVM and run the Deed proofs again, it now takes overnight. I forgot to time it.
  • The OCaml extraction and the Isabelle/HOL extraction were using different endianness in the hash implementation.

Path b: Verification of Solidity programs

Another way would be to verify Solidity sources somehow, not looking at the EVM bytecode.

Obstacles

  • The only way to know the meaning of a Solidity program is to compile it into bytecodes.
  • The Solidity compiler is changing fast and the verification tools would need to follow the changes
  • The verification tools tend to believe a different behavior from that of actually deployed bytecode

Steps on this path

The steps that have to be taken:

  1. Know Solidity program's meaning without checking the bytecode. For this, one way is to write a Solidity interpreter in Coq or in an ML language. Another way is to compile a Solidity program into WhyML or F*.
  2. Check the above against the reality. We need a big set of tests to compare the above translation against the bytecode.
  3. Translate the ML implementation into a proof assistant (or we rely on Why3, in that case we need to trust the SMT solvers)
  4. Try to assert properties in a shallow-embedded way and write proofs
  5. Automate the process, maybe by deep-embedding some kind of Hoare logic

Related Projects

Sideway: Formal Verificaton of Proof-of-Stake Protocols

Sideway: a safe programming language

This alone does not verify smart contracts, but I have a chance to make it static analyzer friendly.

The bamboo compiler is now producing snippets of bytecode for the empty contract (still lots to be done).

The language has

  • no loops or recursions
    • this makes life much easier for static analyzers
    • gas consumption is stabler
    • users would need to call it again and again
  • persistent program counter
    • this discourages multiple workflows in a single contract
    • reentrancy does not jump to far away in the source code, but reentrancy arrives around the last sentence executed
  • protection against reentrancy
  • explcit storage setting at every exit

It is inspired by an existing language but I will talk about it when it's actually working.

Sideway: Dr-Y's contract analyzer

The contract analyzer needs an overhaul in the functionality and the UX. I want to see the meaning of each basic block, and how the execution can jump around the basic blocks.

Sideway: transaction visualization

A quick hack to visualize the dataflow in a transaction. This is useful after a surprise.

Resources

An Incomplete List of Papers I Saw

More Repositories

1

awesome-ethereum-virtual-machine

Ethereum Virtual Machine Awesome List
839
star
2

bamboo

Bamboo see https://github.com/cornellblockchain/bamboo
OCaml
324
star
3

eth-isabelle

A Lem formalization of EVM and some Isabelle/HOL proofs
Isabelle
236
star
4

coq2rust

Coq to Rust program extraction. The whole tree is on the original Coq code base.
OCaml
214
star
5

evmverif

An EVM code verification framework in Coq
Coq
44
star
6

dry-analyzer

Dr. Y's Ethereum Contract Analyzer
Coq
42
star
7

vmtrace_visualizer

A program that annotates a vm trace with dataflow information
Ruby
35
star
8

ethereum-word-list

Words are Hard: Defining Common Terms in the Ethereum / Crypto Space
22
star
9

gohantabeyo

Gohantabeyo is a web site where people can make a wish whom they want to eat out with. If the other makes a similar wish, their wishes are told to both.
Ruby
15
star
10

cbc_casper

Isabelle formalization of binary consensus
Isabelle
8
star
11

record

6
star
12

neta

neta notes
HTML
5
star
13

rlp-ocaml

RLP serialization for OCaml
OCaml
4
star
14

practice

JavaScript
3
star
15

kissdb-rust

kissdb ported to rust
Rust
3
star
16

sql2lisp

Verilog
3
star
17

verbose-code-reading

Verboselly Logged Code Reading
2
star
18

token_why3

A Why3 modelling of a token contract
Coq
2
star
19

surreal

surreal numbers in Coq
Verilog
2
star
20

proofmarket

2
star
21

kietter

kietter
Ruby
2
star
22

js-graph-it-with-containers

a fork of http://js-graph-it.sourceforge.net/
JavaScript
2
star
23

ethereum-formal-list

A list of formal method applications on smart contracts
1
star
24

undecidable

undecidability club's web page
1
star
25

async_lazy_rpc

Haskell
1
star
26

taocp_in_coq

taocp_in_coq
1
star
27

htodo

a todo manager
Haskell
1
star
28

CLTT

Verilog
1
star
29

lifepicker

A randomized activity picker for your 5 minites
Python
1
star
30

waitfree

A combinator library for asynchronous waitfree computation among forkIO threads.
Haskell
1
star
31

haskell-twitter

1
star
32

thesis

1
star
33

game

combinatorial game
Verilog
1
star
34

ConcurrentSet

Haskell
1
star
35

pirapira.github.io

HTML
1
star
36

salary-nego

Salary negotiation app on the Nexus zkVM
Rust
1
star