  • Stars: 133
  • Rank: 272,600 (Top 6 %)
  • Language: C++
  • License: MIT License
  • Created about 5 years ago
  • Updated 5 months ago

Repository Details

C++ Large Scale Genetic Programming

Modern C++ framework for Symbolic Regression

Operon is a modern C++ framework for symbolic regression that uses genetic programming to explore a hypothesis space of possible mathematical expressions in order to find the best-fitting model for a given regression target. Its main purpose is to help develop accurate and interpretable white-box models in the area of system identification. More in-depth documentation is available at https://operongp.readthedocs.io/.

How does it work?

Broadly speaking, genetic programming (GP) evolves a population of "computer programs" (AST-like structures encoding behavior for a given problem domain) following the principles of natural selection. It repeatedly recombines random program parts and keeps only the best results, the "fittest". Here, the biological concept of fitness is defined as a measure of a program's ability to solve a certain task.
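As a rough illustration of the "keep the fittest" step, the sketch below implements tournament selection, one common selection scheme in GP. It is not taken from Operon's code base: the function name tournament_select and the convention that lower fitness values are better (fitness as an error measure) are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper for this sketch (not part of Operon's API).
// Draws `tournament_size` individuals uniformly at random (with replacement)
// and returns the index of the one with the lowest fitness value.
// Assumption: fitness is an error measure, so lower is better.
std::size_t tournament_select(const std::vector<double>& fitness,
                              std::size_t tournament_size,
                              std::mt19937& rng) {
    assert(!fitness.empty());
    std::uniform_int_distribution<std::size_t> pick(0, fitness.size() - 1);

    std::size_t best = pick(rng);  // first contestant
    for (std::size_t i = 1; i < tournament_size; ++i) {
        std::size_t challenger = pick(rng);
        if (fitness[challenger] < fitness[best]) {
            best = challenger;  // challenger wins the tournament so far
        }
    }
    return best;
}
```

Larger tournaments increase selection pressure: the best individual wins whenever it is drawn at all, while the worst one wins only if every draw happens to pick it.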

In symbolic regression, the programs represent mathematical expressions typically encoded as expression trees. Fitness is usually defined as goodness of fit between the dependent variable and the prediction of a tree-encoded model. Iterative selection of best-scoring models followed by random recombination leads naturally to a self-improving process that is able to uncover patterns in the data.
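To make the notions of a tree-encoded model and its fitness concrete, the following minimal sketch encodes expressions as binary trees over a single input variable and scores them with the mean squared error. The Node layout, the helper names (make_const, make_var, make_binary, evaluate, mean_squared_error) and the choice of MSE as the fitness measure are assumptions for illustration only; they do not reflect Operon's actual data structures or API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical expression-tree encoding for this sketch (not Operon's).
enum class Kind { Constant, Variable, Add, Mul };

struct Node {
    Kind kind;
    double value;                       // payload, used by Kind::Constant only
    std::unique_ptr<Node> left, right;  // children, used by Add and Mul only
};
using Tree = std::unique_ptr<Node>;

Tree make_const(double v) { return Tree(new Node{Kind::Constant, v, nullptr, nullptr}); }
Tree make_var() { return Tree(new Node{Kind::Variable, 0.0, nullptr, nullptr}); }
Tree make_binary(Kind k, Tree l, Tree r) {
    return Tree(new Node{k, 0.0, std::move(l), std::move(r)});
}

// Recursively evaluate the expression for a single input value x.
double evaluate(const Node& n, double x) {
    switch (n.kind) {
        case Kind::Constant: return n.value;
        case Kind::Variable: return x;
        case Kind::Add: return evaluate(*n.left, x) + evaluate(*n.right, x);
        case Kind::Mul: return evaluate(*n.left, x) * evaluate(*n.right, x);
    }
    return 0.0;  // not reached for valid trees
}

// Fitness as goodness of fit: mean squared error between the model's
// predictions and the observed targets (lower is better).
double mean_squared_error(const Node& model,
                          const std::vector<double>& xs,
                          const std::vector<double>& ys) {
    assert(!xs.empty() && xs.size() == ys.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        double residual = evaluate(model, xs[i]) - ys[i];
        sum += residual * residual;
    }
    return sum / static_cast<double>(xs.size());
}
```

For the target y = x*x + 1 sampled at x = -2, -1, 0, 1, 2, the tree for x*x + 1 has an error of 0, while the tree for x + 1 has an error of 8.8, so a selection step would favor the former.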

Build instructions

The project requires CMake and a C++17 compliant compiler (C++20 if you're on the cpp20 branch). The recommended way to build Operon is via either nix or vcpkg.

Check out https://github.com/heal-research/operon/blob/master/BUILDING.md for detailed build instructions and how to enable/disable certain features.

Nix

First, you have to install nix and enable flakes. For a portable install, see nix-portable.

To create a development shell:

nix develop github:heal-research/operon --no-write-lock-file

To build Operon (a symlink to the nix store called result will be created):

nix build github:heal-research/operon --no-write-lock-file

Vcpkg

Select the build generator appropriate for your system and point CMake to the vcpkg.cmake toolchain file. The example below targets the Windows command prompt, where ^ continues a line:

cmake -S . -B build -G "Visual Studio 16 2019" -A x64 ^
-DCMAKE_TOOLCHAIN_FILE=..\vcpkg\scripts\buildsystems\vcpkg.cmake ^
-DVCPKG_OVERLAY_PORTS=.\ports

The file CMakePresets.json contains some presets that you may find useful. To use clang-cl instead of cl, pass -TClangCL to the command above (see the official documentation).

Python wrapper

Python bindings for the Operon library are available as a separate project: PyOperon, which also includes a scikit-learn compatible regressor.

Bibtex info

If you find Operon useful you can cite our work as:

@inproceedings{10.1145/3377929.3398099,
    author = {Burlacu, Bogdan and Kronberger, Gabriel and Kommenda, Michael},
    title = {Operon C++: An Efficient Genetic Programming Framework for Symbolic Regression},
    year = {2020},
    isbn = {9781450371278},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3377929.3398099},
    doi = {10.1145/3377929.3398099},
    booktitle = {Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion},
    pages = {1562–1570},
    numpages = {9},
    keywords = {symbolic regression, genetic programming, C++},
    location = {Canc\'{u}n, Mexico},
    series = {GECCO '20}
}

Operon was also featured in a recent survey of symbolic regression methods, where it showed good results:

@article{DBLP:journals/corr/abs-2107-14351,
    author    = {William G. La Cava and
                 Patryk Orzechowski and
                 Bogdan Burlacu and
                 Fabr{\'{\i}}cio Olivetti de Fran{\c{c}}a and
                 Marco Virgolin and
                 Ying Jin and
                 Michael Kommenda and
                 Jason H. Moore},
    title     = {Contemporary Symbolic Regression Methods and their Relative Performance},
    journal   = {CoRR},
    volume    = {abs/2107.14351},
    year      = {2021},
    url       = {https://arxiv.org/abs/2107.14351},
    eprinttype = {arXiv},
    eprint    = {2107.14351},
    timestamp = {Tue, 03 Aug 2021 14:53:34 +0200},
    biburl    = {https://dblp.org/rec/journals/corr/abs-2107-14351.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

More Repositories

1. SimSharp (C#, 126 stars): Sim# is a .NET port of SimPy, a process-based discrete event simulation framework.
2. HeuristicLab (C#, 39 stars): HeuristicLab - An environment for heuristic and evolutionary optimization.
3. pyoperon (C++, 37 stars): Python bindings and scikit-learn interface for the Operon library for symbolic regression.
4. vstat (C++, 18 stars): SIMD-enabled descriptive statistics (mean, variance, covariance, correlation).
5. HEAL.Attic (C#, 13 stars): HEAL.Attic is a serialization and persistence framework for .NET. It serializes and deserializes complete object graphs and uses Google Protocol Buffers for compact storage.
6. arPLS (C++, 9 stars): arPLS algorithm from "Baseline correction using asymmetrically reweighted penalized least squares smoothing".
7. TreesearchLib (C#, 7 stars): A modeling framework for optimization problems and a collection of algorithms for finding solutions.
8. HEAL.Entities (C#, 6 stars): Entity focused domain-driven design (DDD) framework containing data access libraries for Excel, RDBMS or the dwh schema variant Data Vault V2.
9. HEAL.VarPro (C#, 4 stars): A C# implementation of variable projection with L1 regularization for separable non-linear least squares.
10. HEAL.Parsers.DIAdem (C#, 3 stars): C# parser and wrapper methods for National Instruments' DIAdem C library (NiLibDdc).
11. HEAL.NonlinearRegression (C#, 3 stars): Fit and evaluate nonlinear regression models.
12. HEAL.Bricks (C#, 2 stars): HEAL.Bricks is a package framework for .NET. It discovers, loads, and executes packages at runtime and supports isolation in separate processes or Docker containers.
13. HEAL.SCS (C#, 1 star): .NET wrapper for the splitting conic solver.
14. pappus (C++, 1 star): Pappus is a modern C++ library for affine arithmetic.
15. HEAL.Bricks.Demo (C#, 1 star): Demo projects and examples for using HEAL.Bricks.