• Stars
    star
    191
  • Rank 202,877 (Top 4 %)
  • Language
    Haskell
  • License
    BSD 3-Clause "New...
  • Created about 9 years ago
  • Updated about 9 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

GHC plugin that improves Haskell code's numerical stability

Herbie GHC Plugin

The Herbie GHC Plugin automatically improves the numerical stability of your Haskell code. The Herbie plugin fully supports the SubHask numeric prelude, and partially supports the standard prelude (see the known bugs section below).

This README is organized into the following sections:

Example: linear algebra and numerical instability

The popular linear library contains the following calculation:

w :: Double -> Double -> Double
w far near = -(2 * far * near) / (far - near)

This code looks correct, but it can give the wrong answer. When the values of far and near are both very small (or very large), the product far * near will underflow to 0 (or overflow to infinity). In the worst case scenario using Doubles, this calculation can lose up to 14 bits of information.

The Herbie plugin automatically prevents this class of bugs (with some technical caveats, see how it works below). If you compile the linear package with Herbie, the code above gets rewritten to:

w :: Double -> Double -> Double
w far near = if far < -1.7210442634149447e81
    then ((-2 * far) / (far - near)) * near
    else if far < 8.364504563556443e16
        then -2 * far * (near / (far - near))
        else ((-2 * far) / (far - near)) * near

This modified code is numerically stable. The if statements check to see which regime we are in (very small or very large) and select the calculation that is most appropriate for this regime.

The test suite contains MANY more examples of the types of expressions the Herbie plugin can analyze.

How it works

GHC Plugins let library authors add new features to the Haskell compiler. The Herbie plugin gets run after type checking, but before any optimizations. Because GHC is so good at optimizing, the code generated by the Herbie plugin is just as fast as hand-written code. The plugin has two key components: first it finds the floating point computations in your code; then it replaces them with numerically stable versions.

Finding the computations

When the Herbie plugin is run, it traverses your code's abstract syntax tree looking for mathematical expressions. These expressions may:

  • consist of the following operators: / , * , - , * , ** , ^ , ^^ , $ , cos , sin , tan , acos , asin , atan , cosh , sinh , tanh , exp , log , sqrt , abs

  • contain an arbitrary number of free variables

  • contain arbitrary non-mathematical subexpressions

For example, in the function below:

test :: String -> String
test str = show ( sqrt (1+fromIntegral (length str))
                - sqrt (fromIntegral (length str))
                :: Float
                )

the Herbie plugin extracts the expression sqrt (x+1) - sqrt x; calculates the numerically stable version 1 / (sqrt (x+1) + sqrt x); and then substitutes back in.

The Herbie plugin performs this procedure on both concrete types (Float and Double) and polymorphic types. For example, the plugin will rewrite

f :: Field a => a -> a -> a
f far near = (far+near)/(far*near*2)

to

f :: Field a => a -> a -> a
f far near = 0.5/far + 0.5/near

(The Field constraint comes from SubHask.)

These polymorphic rewrites are always guaranteed to preserve the semantics when the expression is evaluated on an exact numeric type. So both versions of f above are guaranteed to give the same result when called on Rational values, but the rewritten version will be more accurate when called on Float or Double.

The main limitation of the Herbie plugin is that any recursive part of an expression is ignored. This is because analyzing the numeric stability of a Turing complete language is undecidable (and no practical heuristics are known).

Fortunately, the Herbie plugin can still analyze the non-recursive subexpressions of a recursive expression. So in the code:

go :: Float -> Float -> Float
go 0 b = b
go a b = go (a-1) (sqrt $ (a+b) * (a+b))

the expression sqrt $ (a+b) * (a+b) gets rewritten by Herbie into abs (a+b).

Disabling Herbie

You can prevent Herbie from analyzing a function using an annotation pragma. In the following example:

{-# ANN foo "NoHerbie" #-}
foo x = (x + 1) - x

Herbie will NOT rewrite the code into

foo x = 1

These GHC annotations can only be applied to top level bindings. They prevent Herbie from searching anywhere inside the binding.

Improving the stability of expressions

The Herbie plugin uses two sources of information to find numerically stable replacements for expressions. The simplest source is a sqlite3 database. This database contains about 400 expressions that are known to be used by Haskell libraries.

The more important source is the Herbie program. Herbie is a recent research project on using probabilistic searches to find numerically stable expressions. Because Herbie is probabilistic, it provides weak theoretic guarantees on the numerical stability of the resulting expressions; but in practice, the improved expressions are significantly better. To ensure reproducible builds, the same random seed is used on all calls to the Herbie program. For more details on how Herbie works, check out the PLDI15 paper.

The Herbie program can take a long time to run. If the program doesn't return a solution within two minutes, then the Herbie plugin assumes that no better solution is possible and continues processing. To improve compile times, every time the Herbie program returns a new solution, the solution is added to the Herbie.db database. When compiling a file, if all the math expressions are already in the database, then the Herbie plugin imposes essentially no overhead on compile times.

Installing and Using the Herbie Plugin

The Herbie plugin requires GHC 7.10.1 or 7.10.2. It is installable via cabal using the command:

cabal update && cabal install HerbiePlugin

It is recommended (but not required) that you also install the Herbie program. (Without installing the program, the Herbie plugin can only replace expressions in the standard database.) The Herbie program is written in racket, so you must first install Racket. Go to the download page for Racket 6.1.1 and install the version for your platform. Then run the commands:

git clone http://github.com/mikeizbicki/HerbiePlugin --recursive
cd HerbiePlugin/herbie
raco exe -o herbie-exec herbie/interface/inout.rkt

The last line creates an executable herbie-exec that the Herbie plugin will try to call. You must move the program somewhere into your PATH for these calls to succeed. One way to do this is with the command:

sudo mv herbie-exec /usr/local/bin

To compile a file with the Herbie plugin, you need to pass the flag -fplugin=Herbie to when calling GHC. You can have cabal install automatically apply the Herbie plugin by passing the following flags

--ghc-option=-fplugin=Herbie
--ghc-option=-package-id herbie-haskell-0.1.0.0-50ba55c8f248a3301dce2d3339977982

Running Herbie on all of stackage

Stackage LTS-3.5 is a collection of 1351 of the most popular Haskell libraries. The script install.sh compiles all of stackage using the Herbie plugin.

48 of the 1351 packages (3.5%) use floating point computations internally. Of these, 40 packages (83%) contain expressions whose numerical stability is improved with the Herbie plugin. In total, there are 303 distinct numerical expressions used in all stackage packages, and 164 of these expressions (54%) are more stable using the Herbie plugin.

The table below shows a detailed breakdown of which packages contain the unstable expressions. Notice that the unstable and stable expression columns may not add up to the total expressions column. The difference is the expressions could not be analyzed because the Herbie program timed out.

package total math expressions unstable expressions stable expressions
math-functions-0.1.5.2 92 50 34
colour-2.3.3 28 8 18
linear-1.19.1.3 28 19 8
diagrams-lib-1.3.0.3 25 15 10
diagrams-solve-0.1 25 14 11
statistics-0.13.2.3 15 5 7
plot-0.2.3.4 12 9 2
random-fu-0.2.6.2 11 4 5
Chart-1.5.3 10 8 2
circle-packing-0.1.0.4 9 3 6
mwc-random-0.13.3.2 9 1 7
Rasterific-0.6.1 8 3 5
diagrams-contrib-1.3.0.5 6 6 0
log-domain-0.10.2.1 5 0 5
repa-algorithms-3.4.0.1 5 3 2
rasterific-svg-0.2.3.1 4 2 2
sbv-4.4 4 2 2
clustering-0.2.1 3 1 2
erf-2.0.0.0 3 2 1
hsignal-0.2.7.1 3 2 0
hyperloglog-0.3.4 3 1 2
integration-0.2.1 3 1 2
intervals-0.7.1 3 2 1
shake-0.15.5 3 2 1
Chart-diagrams-1.5.1 2 2 0
JuicyPixels-3.2.6.1 2 2 0
Yampa-0.10.2 2 2 0
YampaSynth-0.2 2 1 1
diagrams-rasterific-1.3.1.3 2 1 1
fay-base-0.20.0.1 2 1 1
histogram-fill-0.8.4.1 2 0 2
parsec-3.1.9 2 0 2
smoothie-0.4.0.2 2 2 0
Octree-0.5.4.2 1 1 0
ad-4.2.4 1 1 0
approximate-0.2.2.1 1 1 0
crypto-random-0.0.9 1 1 0
diagrams-cairo-1.3.0.3 1 1 0
diagrams-svg-1.3.1.4 1 1 0
dice-0.1 1 0 1
force-layout-0.4.0.2 1 1 0
gipeda-0.1.2.1 1 0 1
hakyll-4.7.2.3 1 0 1
hashtables-1.2.0.2 1 1 0
hmatrix-0.16.1.5 1 0 1
hscolour-1.23 1 0 1
metrics-0.3.0.2 1 1 0

The easiest way to find the offending code in each package is to compile the package using the Herbie plugin.

Known Bugs

There are no known bugs when compiling programs that use the SubHask numeric prelude.

The standard Prelude is only partially supported. In particular, the Herbie plugin is able to extract mathematical expressions correctly and will print the stabilized version to stdout. But the plugin can only substitute the stabilized version on polymorphic expressions, and does not perform the substitution on non-polymorphic ones. The problem is that the solveWantedsTcM function (called within the getDictionary function inside of Herbie/CoreManip.hs) is unable to find the Num dictionary for Float and Double types. I have no idea why this is happening, and I'd be very happy to accept pull requests that fix the issue :)

More Repositories

1

HLearn

Homomorphic machine learning
Haskell
1,607
star
2

ucr-cs100

open source software construction course
C++
483
star
3

subhask

Type safe interface for working in subcategories of Hask
Haskell
413
star
4

ifcxt

constraint level if statements
Haskell
110
star
5

cmc-csci046

CMC's Data Structures and Algorithms Course Materials
TeX
52
star
6

typeparams

Lens-like interface for type level parameters; allows unboxed unboxed vectors and supercompilation
Haskell
41
star
7

cmc-csci143

big data course materials
TeX
40
star
8

cmc-csci040

Computing for the Web
HTML
37
star
9

hmm

hidden markov models in haskell
Haskell
34
star
10

cmc-csci145-math166

Data Mining
TeX
32
star
11

datasets

A collection of publicly available datasets
Scilab
26
star
12

simd

simple interface to ghc's simd vector support
Haskell
23
star
13

gitlearn

a course management system (similar to ilearn) based on git
Shell
22
star
14

parsed

a haskellified version of the classic sed unix tool
TeX
21
star
15

homoiconic

Constructs FAlgebras from typeclasses, making Haskell functions homoiconic
Haskell
18
star
16

deep-tda

TeX
17
star
17

typespeed

fork of the popular typespeed program designed for the UCR cs100 curriculum
C
16
star
18

cmc-csci181-deeplearning

deep learning course materials
Python
15
star
19

vector-heterogenous

Arbitrary size tuples in Haskell
Haskell
12
star
20

american-shit

primary and secondary sources documenting America's misdeads
11
star
21

geolocation

Image/text geolocation with tensorflow and the MvMF
Python
9
star
22

modulus-magnus-linguae

Python
8
star
23

ConstraintKinds

Implements common Haskell type classes using the constraint kinds pattern to allow constraints.
Haskell
7
star
24

vector-functorlazy

vectors supporting lazy fmap application; asymptotically faster in some cases
Haskell
7
star
25

metahtml

Python
5
star
26

chajda

Python
5
star
27

wiktionary_bli

TeX
4
star
28

cmc-csci181-languages

HTML
3
star
29

twitter_coronavirus

Python
2
star
30

rapache

a simple webserver built with bash commands
Shell
2
star
31

pagila-hw3

Shell
1
star
32

lab-timeit2

Python
1
star
33

lambda-server

Shell
1
star
34

pagila-hw2

Shell
1
star
35

binary_search

Python
1
star
36

korean

1
star
37

wardial

Python
1
star
38

lab-goodreads

1
star
39

multiskipgram

Python
1
star
40

mediabiasfactcheck

Python
1
star
41

reddit

Python
1
star
42

fake-news

Python
1
star
43

novichenko

Python
1
star
44

twitter_postgres2

Python
1
star
45

word_ladder

Python
1
star
46

2023spring-FinTechPracticum

1
star
47

pgrollup

easy creation of rollup tables in postgresql (compute count(*) queries in constant time)
PLpgSQL
1
star
48

NovichenkoBot

Python
1
star
49

2021fintech

Jupyter Notebook
1
star
50

sorting

Python
1
star
51

2020summer

TeX
1
star
52

cv

my curriculum vitae
TeX
1
star
53

search_engine

Python
1
star
54

Classification

Haskell
1
star
55

histogram

A Haskell package for easily creating histograms
Haskell
1
star
56

dominion

Haskell
1
star
57

haskell-lecture

Outline of material for an intro to haskell course
1
star
58

pagila-midterm

PLpgSQL
1
star
59

cmc-advising

1
star
60

containers

Python
1
star
61

glowing-wookie

Haskell
1
star
62

twitter_postgres

Python
1
star
63

2021summer

1
star
64

hperfstat

haskell bindings to linux's cpu hw counters (i.e. results of `perf stat`)
1
star