
SciML Style Guide for Julia


The SciML Style Guide is a style guide for the Julia programming language. It is used by the SciML Open Source Scientific Machine Learning Organization. As such, it is open to discussion with the community. Please file an issue or open a PR to discuss changes to the style guide.


Code Style Badge

Let contributors know your project is following the SciML Style Guide by adding the badge to your README.md.

[![SciML Code Style](https://img.shields.io/static/v1?label=code%20style&message=SciML&color=9558b2&labelColor=389826)](https://github.com/SciML/SciMLStyle)

Overarching Dogmas of the SciML Style

Consistency vs Adherence

According to PEP8:

A style guide is about consistency. Consistency with this style guide is important. Consistency within a project is more important. Consistency within one module or function is the most important.

But most importantly: know when to be inconsistent -- sometimes the style guide just doesn't apply. When in doubt, use your best judgment. Look at other examples and decide what looks best. And don't hesitate to ask!

Some code within the SciML organization is old, on life support, donated by researchers to be maintained. Consistency is the number one goal, so updating to match the style guide should happen on a repo-by-repo basis, i.e. do not update one file to match the style guide (leaving all other files behind).

Community Contribution Guidelines

For a comprehensive set of community contribution guidelines, refer to ColPrac. A relevant point to highlight is that PRs should do one thing. In the context of style, this means that PRs which update the style of a package's code should not be mixed with fundamental code contributions. This separation makes it easier to ensure that large style improvements are isolated from substantive (and potentially breaking) code changes.

Open source contributions are allowed to start small and grow over time

If the standard for code contributions is that every PR needs to support every possible input type that anyone can think of, the barrier would be too high for newcomers. Instead, the principle is to be as correct as possible to begin with, and to grow the generic support over time. All recommended functionality should be tested, and any known generality issues should be documented in an issue (and with a @test_broken test when possible). However, a function that is known not to be GPU-compatible is not grounds to block merging; rather, a follow-up PR to improve the general type support is encouraged!
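
For example, a known generality gap can be recorded in the test suite without blocking the merge. The following is a minimal sketch with a made-up my_func; the specific gap shown (element type not preserved for Int32 input) is purely illustrative:

using Test

my_func(x) = 2 .* x   # hypothetical function under test

# Test what is currently supported...
@test my_func([1.0, 2.0]) ≈ [2.0, 4.0]

# ...and record a known generality gap without blocking the merge. In practice,
# the corresponding issue would explain why this input type is not yet supported.
@test_broken my_func(Int32[1, 2]) isa Vector{Int32}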

Generic code is preferred unless code is known to be specific

For example, the code:

function f(A, B)
    for i in 1:length(A)
        A[i] = A[i] + B[i]
    end
end

would not be preferred for two reasons. One is that it assumes A uses one-based indexing, which would fail in cases like OffsetArrays and FFTViews. Another issue is that it requires indexing, while not all array types support indexing (for example, CuArrays). A more generic, compatible implementation of this function would use broadcasting, for example:

function f(A, B)
    @. A = A + B
end

which would allow support for a wider variety of array types.

Internal types should match the types used by users when possible

If f(A) takes the input of some collections and computes an output from those collections, then it should be expected that if the user gives A as an Array, the computation should be done via Arrays. If A was a CuArray, then it should be expected that the computation should be internally done using a CuArray (or appropriately error if not supported). For these reasons, constructing arrays via generic methods, like similar(A), is preferred when writing f instead of using non-generic constructors like Array(undef,size(A)) unless the function is documented as being non-generic.
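
A minimal sketch of the difference; the function names are made up for illustration:

# Generic: the output container matches the user's input type (Array, CuArray, OffsetArray, ...)
function f_cache(A)
    return similar(A)
end

# Non-generic: always builds a plain CPU Array regardless of what the user passed in
function f_cache_nongeneric(A)
    return Array{eltype(A)}(undef, size(A))
end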

Trait definition and adherence to generic interface is preferred when possible

Julia provides many different interfaces, for example the iteration, indexing, broadcasting, and AbstractArray interfaces described in the interfaces chapter of the Julia manual.

Those interfaces should be followed when possible. For example, when defining broadcast overloads, one should implement a BroadcastStyle as suggested by the documentation instead of simply attempting to bypass the broadcast system via copyto! overloads.

When interface functions are missing, these should be added to Base Julia or an interface package, like ArrayInterface.jl. Such traits should be declared and used when appropriate. For example, if a line of code requires mutation, the trait ArrayInterface.ismutable(A) should be checked before attempting to mutate, and informative error messages should be written to capture the immutable case (or, an alternative code which does not mutate should be given).
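
A minimal sketch of such a guard, using the ArrayInterface.ismutable trait mentioned above; set_first! is a hypothetical function used only for illustration:

using ArrayInterface

function set_first!(A, val)
    if ArrayInterface.ismutable(A)
        A[1] = val
        return A
    else
        throw(ArgumentError("set_first! requires a mutable array type, got $(typeof(A)). " *
                            "Pass a mutable array or use an out-of-place alternative."))
    end
end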

One example of this principle is demonstrated in the generation of Jacobian matrices. In many scientific applications, one may wish to generate a Jacobian cache from the user's input u0. A naive way to generate this Jacobian is J = similar(u0,length(u0),length(u0)). However, this will generate a Jacobian J such that J isa Matrix.

Macros should be limited and only be used for syntactic sugar

Macros define new syntax, and for this reason they tend to be less composable than other coding styles and require prior familiarity to be easily understood. One principle to keep in mind is, "can the person reading the code easily picture what code is being generated?". For example, a user of Soss.jl may not know what code is being generated by:

@model (x, α) begin
    σ ~ Exponential()
    β ~ Normal()
    y ~ For(x) do xj
        Normal(α + β * xj, σ)
    end
    return y
end

and thus using such a macro as the interface is not preferred when possible. However, a macro like @muladd is trivial to picture in code (it recursively transforms a*b + c into muladd(a,b,c) for better accuracy and efficiency), so using such a macro, for example:

julia> @macroexpand(@muladd k3 = f(t + c3 * dt, @. uprev + dt * (a031 * k1 + a032 * k2)))
:(k3 = f((muladd)(c3, dt, t), (muladd).(dt, (muladd).(a032, k2, (*).(a031, k1)), uprev)))

is recommended. Other macros in this category follow the same principle: the generated code is easy to picture.

Some performance macros, like @simd, @threads, or @turbo from LoopVectorization.jl, are an exception in that their generated code may be foreign to many users. However, they are still classified as appropriate uses of syntactic sugar, since they do not (or should not) change the behavior of the program in measurable ways other than performance.

Errors should be caught as high as possible, and error messages should be contextualized for newcomers

Whenever possible, defensive programming should be used to check for potential errors before they are encountered deeper within a package. For example, if one knows that f(u0, p) will error unless u0 is the same size as p, this should be caught at the start of the function and a domain-specific error thrown, for example "parameters and initial condition should be the same size".
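
For instance, a size check at the top of the function turns a cryptic downstream BoundsError into a contextual message. A minimal sketch, using the f, u0, and p names from the paragraph above and a placeholder computation:

function f(u0, p)
    if length(u0) != length(p)
        throw(DimensionMismatch("parameters and initial condition should be the same size: " *
                                "got length(u0) = $(length(u0)) and length(p) = $(length(p))"))
    end
    return u0 .+ p   # placeholder for the real computation
end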

Subpackaging and interface packages are preferred over conditional modules via Requires.jl

Requires.jl should be avoided at all costs. If an interface package exists, such as ChainRulesCore.jl for defining automatic differentiation rules without requiring a dependency on the whole ChainRules.jl system, or RecipesBase.jl which allows for defining Plots.jl plot recipes without a dependency on Plots.jl, a direct dependency on these interface packages is preferred.

Otherwise, instead of resorting to a conditional dependency using Requires.jl, it is preferred that one creates subpackages, i.e. smaller independent packages kept within the same GitHub repository with independent versioning and package management. An example of this is seen in Optimization.jl, which has subpackages like OptimizationBBO.jl for BlackBoxOptim.jl support.

Some important interface packages to know about are ChainRulesCore.jl, RecipesBase.jl, and ArrayInterface.jl.

Functions should either attempt to be non-allocating and reuse caches, or treat inputs as immutable

Mutating codes and non-mutating codes fall into different worlds. When a code is fully immutable, the compiler can better reason about dependencies, optimize the code, and check for correctness. However, many times a code making the fullest use of mutation can outperform even what the best compilers of today can generate. That said, the worst of all worlds is when code mixes mutation with non-mutating code. Not only is this a mishmash of coding styles, it has the potential non-locality and compiler proof issues of mutating code while not fully benefiting from the mutation.

Out-of-Place and Immutability is preferred when sufficiently performant

Mutation is used to get more performance by decreasing the amount of heap allocations. However, if it's not helpful for heap allocations in a given spot, do not use mutation. Mutation is scary and should be avoided unless it gives an immediate benefit. For example, if matrices are sufficiently large, then A*B is as fast as mul!(C,A,B), and thus writing A*B is preferred (unless the rest of the function is being careful about being fully non-allocating, in which case this should be mul! for consistency).
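
A minimal sketch contrasting the two styles:

using LinearAlgebra

A = rand(200, 200)
B = rand(200, 200)

# Out-of-place: fine when the surrounding code is not trying to be non-allocating
C = A * B

# In-place: reuse a preallocated cache when the whole function is kept allocation-free
C_cache = similar(A)
mul!(C_cache, A, B)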

Similarly, when defining types, using struct is preferred to mutable struct unless mutating the struct is a common occurrence. Even if mutating the struct is a common occurrence, see whether using Setfield.jl is sufficient. The compiler will optimize the construction of immutable structs, and thus this can be more efficient if it's not too much of a code hassle.
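
For example, Setfield.jl can "update" a field of an immutable struct by constructing a new instance. A minimal sketch with a hypothetical options type:

using Setfield

struct SolverOptions   # immutable by default
    abstol::Float64
    reltol::Float64
end

opts = SolverOptions(1e-6, 1e-3)
opts = @set opts.abstol = 1e-8   # returns an updated copy; the original is never mutated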

Tests should attempt to cover a wide gamut of input types

Code coverage numbers are meaningless if one does not consider the input types. For example, one can hit all of the code with Array, but that does not test whether CuArray is compatible! Thus it's always good to think of coverage not in terms of lines of code but in terms of type coverage. A good list of number types to think about are:

  • Float64
  • Float32
  • Complex
  • Dual
  • BigFloat

Array types to think about testing are Array, OffsetArray (from OffsetArrays.jl), and CuArray (from CUDA.jl).

When in doubt, a submodule should become a subpackage or separate package

Keep packages to one core idea. If there's something separate enough to be a submodule, could it instead be a separate well-tested and documented package to be used by other packages? Most likely yes.

Globals should be avoided whenever possible

Global variables should be avoided whenever possible. When required, global variables should be constants and have an all-uppercase name separated by underscores (e.g. MY_CONSTANT). They should be defined at the top of the file, immediately after imports and exports but before an __init__ function. If you truly want mutable global-style behaviour, you may want to look into mutable containers.
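
A minimal sketch of both cases, assuming a package that needs one true constant and one piece of mutable global-like state:

const DEFAULT_TOLERANCE = 1e-8      # true constant: SCREAMING_SNAKE_CASE, defined near the top

# Mutable global-style behaviour: keep the binding `const` and mutate the container's contents
const CALL_COUNT = Ref(0)

count_call!() = (CALL_COUNT[] += 1)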

Type-stable and Type-grounded code is preferred wherever possible

Type-stable and type-grounded code helps the compiler create not only more optimized code, but also faster to compile code. Always keep containers well-typed, functions specializing on the appropriate arguments, and types concrete.
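
A minimal sketch of keeping a container well-typed:

# Yes: a concretely typed container keeps the code type-grounded
results = Float64[]
push!(results, 1.0)

# No: `[]` creates a Vector{Any}, so downstream code cannot be inferred
results_any = []
push!(results_any, 1.0)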

Closures should be avoided whenever possible

Closures can cause accidental type instabilities that are difficult to track down and debug; in the long run it saves time to always program defensively and avoid writing closures in the first place, even when a particular closure would not have been problematic. A similar argument applies to reading code with closures; if someone is looking for type instabilities, this is faster to do when code does not contain closures. Furthermore, if you want to update variables in an outer scope, do so explicitly with Refs or self defined structs. For example,

map(Base.Fix2(getindex, i), vector_of_vectors)

is preferred over

map(v -> v[i], vector_of_vectors)

or

[v[i] for v in vector_of_vectors]
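
When state really does need to cross a scope boundary, passing a Ref (or a small user-defined struct) keeps the update explicit instead of capturing a local variable in a closure. A minimal sketch with made-up names:

counter = Ref(0)

function count_evens!(c::Ref{Int}, xs)
    for x in xs
        if iseven(x)
            c[] += 1
        end
    end
    return c[]
end

count_evens!(counter, 1:10)   # counter[] == 5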

Numerical functionality should use the appropriate generic numerical interfaces

While you can use A\b to do a linear solve inside a package, that does not mean that you should. This interface is only sufficient for performing factorizations, and so that limits the scaling choices, the types of A that can be supported, etc. Instead, linear solves within packages should use LinearSolve.jl. Similarly, nonlinear solves should use NonlinearSolve.jl. Optimization should use Optimization.jl. Etc. This allows the full generic choice to be given to the user without depending on every solver package (effectively recreating the generic interfaces within each package).

Functions should capture one underlying principle

Functions mean one thing. Every dispatch of + should be "the meaning of addition on these types". While in theory you could add dispatches to + that mean something different, that will fail in generic code for which + means addition. Thus, for generic code to work, code needs to adhere to one meaning for each function. Every dispatch should be an instantiation of that meaning.

Internal choices should be exposed as options whenever possible

Whenever possible, numerical values and choices within scripts should be exposed as options to the user. This promotes code reusability beyond the few cases the author may have expected.
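
A minimal sketch: a hard-coded tolerance hidden inside a function versus the same value exposed as a keyword argument with the old value as its default. The function names are illustrative:

using LinearAlgebra

# No: the tolerance is buried inside the function body
is_converged_hardcoded(residual) = norm(residual) < 1e-8

# Yes: the same choice exposed to the caller, with the old value as the default
is_converged(residual; abstol = 1e-8) = norm(residual) < abstol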

Prefer code reuse over rewrites whenever possible

If a package has a function you need, use the package. Add a dependency if you need to. If the function is missing a feature, prefer to add that feature to said package and then add it as a dependency. If the dependency is potentially troublesome, for example because it has a high load time, prefer to spend time helping said package fix these issues and add the dependency. Only when it does not seem possible to make the package "good enough" should using the package be abandoned. If it is abandoned, consider building a new package for this functionality as you need it, and then make it a dependency.

Prefer to not shadow functions

Two functions can have the same name in Julia by having different namespaces. For example, X.f and Y.f can be two different functions, with different dispatches, but the same name. This should be avoided whenever possible. Instead of creating MyPackage.sort, consider adding dispatches to Base.sort for your types if these new dispatches match the underlying principle of the function. If it doesn't, prefer to use a different name. While using MyPackage.sort is not conflicting, it is going to be confusing for most people unfamiliar with your code, so MyPackage.special_sort would be more helpful to newcomers reading the code.

Avoid unmaintained dependencies

Packages should only be depended on if they have maintainers who are responsive. Good code requires good communities. If maintainers do not respond to breakage within two weeks despite multiple notices, then all dependencies from that organization should be considered for removal. Note that some issues may legitimately take longer than two weeks to fix; the point is simply that the communication should be open, consistent, and timely.

Specific Rules

High Level Rules

  • Use 4 spaces per indentation level, no tabs.
  • Try to adhere to a 92 character line length limit.

General Naming Principles

  • All type names should be CamelCase.
  • All struct names should be CamelCase.
  • All module names should be CamelCase.
  • All function names should be snake_case (all lowercase).
  • All variable names should be snake_case (all lowercase).
  • All constant names should be SNAKE_CASE (all uppercase).
  • All abstract type names should begin with Abstract.
  • All type variable names should be a single capital letter, preferably related to the value being typed.
  • Whole words are usually better than abbreviations or single letters.
  • Variables meant to be internal or private to a package should be denoted by prepending two underscores, i.e. __.
  • Single letters can be okay when naming a mathematical entity, i.e. an entity whose purpose or non-mathematical "meaning" is likely only known by downstream callers. For example, a and b would be appropriate names when implementing *(a::AbstractMatrix, b::AbstractMatrix), since the "meaning" of those arguments (beyond their mathematical meaning as matrices, which is already described by the type) is only known by the caller.
  • Unicode is fine within code where it increases legibility, but in no case should Unicode be used in public APIs. This is to allow support for terminals which cannot use Unicode: if a keyword argument must be η, then it can be exclusionary to uses on clusters which do not support Unicode inputs.

Comments

  • Use TODO to mark todo comments and XXX to mark comments about currently broken code.
  • Quote code in comments using backticks (e.g. `variable_name`).
  • When possible, code should be changed to incorporate information that would have been in a comment. For example, instead of commenting # fx applies the effects to a tree, simply rename the function and variable so the call reads apply_effects(tree).
  • Comments referring to GitHub issues and PRs should add the URL in the comments. Only use inline comments if they fit within the line length limit. If your comment cannot fit inline, then place the comment above the content to which it refers:
# Yes:

# Number of nodes to predict. Again, an issue with the workflow order. Should be updated
# after data is fetched.
p = 1

# No:

p = 1  # Number of nodes to predict. Again, an issue with the workflow order. Should be
# updated after data is fetched.
  • In general, comments above a line of code or function are preferred to inline comments.

Modules

  • Module imports should occur at the top of a file or right after a module declaration.
  • Module imports in packages should either use import or explicitly declare the imported functionality, for example using Dates: Year, Month, Week, Day, Hour, Minute, Second, Millisecond.
  • Import and using statements should be separated, and should be divided by a blank line.
# Yes:
import A: a
import C

using B
using D: d

# No:
import A: a
using B
import C
using D: d
  • Large sets of imports are preferred to be written as space-filling lines separated by commas.
# Yes:
using A, B, C, D

# No:
using A
using B
using C
using D

# No:
using A,
      B,
      C,
      D
  • Exported variables should be considered as part of the public API, and changing their interface constitutes a breaking change.
  • Any exported variables should be sufficiently unique. I.e., do not export f as that is very likely to clash with something else.
  • A file that includes the definition of a module should not include any other code that runs outside that module, i.e. the module should be declared at the top of the file with the module keyword and end at the bottom of the file. No other code should come before or after (except for a module docstring before). In this case the code within the module block should not be indented.
  • Sometimes, e.g. for tests, or for namespacing an enumeration, it is desirable to declare a submodule midway through a file. In this case the code within the submodule should be indented.

Functions

  • Only use short-form function definitions when they fit on a single line:
# Yes:
foo(x::Int64) = abs(x) + 3

# No:
foobar(array_data::AbstractArray{T}, item::T) where {T <: Int64} = T[
    abs(x) * abs(item) + 3 for x in array_data
]
  • Inputs should be required unless a default is historically expected or likely to be applicable to >95% of use cases. For example, the tolerances of a differential equation solver default to abstol=1e-6, reltol=1e-3, which give a generally correct plot in most cases and match an expectation from back in the 90's. In that case, using the historically expected and most often useful default tolerances is justified. However, if one implements GradientDescent, the learning rate needs to be adjusted for each application (based on the size of the gradient), and thus a default of GradientDescent(learning_rate = 1) is not recommended.
  • Arguments which do not have defaults should preferably be made positional arguments. The newer syntax of required keyword arguments can be useful but should not be abused. Notable exceptions are "either or" arguments: for example, if defining either g or dgdu is sufficient, then making them both keyword arguments defaulting to = nothing and checking that at least one is not nothing (throwing an appropriate error otherwise) is recommended when distinct dispatches with different types are not possible.
  • When calling a function, always separate your keyword arguments from your positional arguments with a semicolon. This avoids mistakes in ambiguous cases (such as splatting a Dict); a short sketch follows this list.
  • When writing a function that sends a lot of keyword arguments to another function, say sending keyword arguments to a differential equation solver, use a named tuple keyword argument instead of splatting the keyword arguments. For example, use diffeq_solver_kwargs = (; abstol=1e-6, reltol=1e-6,) as the API and use solve(prob, alg; diffeq_solver_kwargs...) instead of splatting all keyword arguments.
  • Functions which mutate arguments should be appended with !.
  • Avoid type piracy. I.e., do not add methods to functions you don't own on types you don't own. Either own the types or the function.
  • Functions should prefer instances instead of types for arguments. For example, for a solver type Tsit5, the interface should use solve(prob,Tsit5()), not solve(prob,Tsit5). The reason for this is multifold. For one, passing a type has different specialization rules, so functionality can be slower unless ::Type{Tsit5} is written in the dispatches which use it. Secondly, this allows for default and keyword arguments to extend the choices, which may become useful for some types down the line. Using this form allows adding more options in a non-breaking manner.
  • If the number of arguments is too large to fit into a 92 character line, then use as many arguments as possible within a line and start each new row with the same indentation, preferably at the same column as the ( but this can be moved left if the function name is very long. For example:
# Yes
function my_large_function(argument1, argument2,
                           argument3, argument4,
                           argument5, x, y, z)

# No
function my_large_function(argument1,
                           argument2,
                           argument3,
                           argument4,
                           argument5,
                           x,
                           y,
                           z)
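
As mentioned in the keyword-argument bullet above, a semicolon at the call site removes the ambiguity when splatting a Dict. A minimal sketch; describe and opts are made-up names:

describe(x; digits = 2) = round(x; digits = digits)   # hypothetical function taking a keyword

opts = Dict(:digits => 4)

describe(1.23456; opts...)   # keyword splat: returns 1.2346
describe(1.23456, opts...)   # positional splat of a Pair: throws a MethodError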

Function Argument Precedence

  1. Function argument. Putting a function argument first permits the use of do blocks for passing multiline anonymous functions.

  2. I/O stream. Specifying the IO object first permits passing the function to functions such as sprint, e.g. sprint(show, x).

  3. Input being mutated. For example, in fill!(x, v), x is the object being mutated and it appears before the value to be inserted into x.

  4. Type. Passing a type typically means that the output will have the given type. In parse(Int, "1"), the type comes before the string to parse. There are many such examples where the type appears first, but it's useful to note that in read(io, String), the IO argument appears before the type, which is in keeping with the order outlined here.

  5. Input not being mutated. In fill!(x, v), v is not being mutated and it comes after x.

  6. Key. For associative collections, this is the key of the key-value pair(s). For other indexed collections, this is the index.

  7. Value. For associative collections, this is the value of the key-value pair(s). In cases like fill!(x, v), this is v.

  8. Everything else. Any other arguments.

  9. Varargs. This refers to arguments that can be listed indefinitely at the end of a function call. For example, in Matrix{T}(undef, dims), the dimensions can be given as a Tuple, e.g. Matrix{T}(undef, (1,2)), or as Varargs, e.g. Matrix{T}(undef, 1, 2).

  10. Keyword arguments. In Julia keyword arguments have to come last anyway in function definitions; they're listed here for the sake of completeness.

The vast majority of functions will not take every kind of argument listed above; the numbers merely denote the precedence that should be used for any applicable arguments to a function.

Tests and Continuous Integration

  • The high level runtests.jl file should only be used to shuttle to other test files.
  • Every set of tests should be included into a @safetestset. A standard @testset does not fully enclose all defined values, such as functions defined in a @testset, and thus can "leak".
  • Test includes should be written in one line, for example:
@time @safetestset "Jacobian Tests" include("interface/jacobian_tests.jl")
  • Every test script should be fully reproducible in isolation. I.e., one should be able to copy paste that script and receive the results.
  • Test scripts should be grouped based on categories, for example tests of the interface vs tests for numerical convergence. Grouped tests should be kept in the same folder.
  • A GROUP environment variable should be used to specify test groups for parallel testing in continuous integration. A fallback group All should be used to specify all of the tests that should be run when a developer runs ]test Package locally. As an example, see the OrdinaryDiffEq.jl test structure and the runtests.jl sketch after this list.
  • Tests should include downstream tests to major packages which use the functionality, to ensure continued support. Any update which breaks the downstream tests should follow with a notification to the downstream package of why the support was broken (preferably in the form of a PR which fixes support), and the package should be given a major version bump in the next release if the changed functionality was part of the public API.
  • CI scripts should use the default settings unless required.
  • CI scripts should test the Long-Term Support (LTS) release and the current stable release. Nightly tests are only necessary for packages with a heavy reliance on specific compiler details.
  • Any package supporting GPUs should include continuous integration for GPUs.
  • Doctests should be enabled, except for examples which are computationally prohibitive to run as part of continuous integration.
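
A minimal sketch of a GROUP-aware runtests.jl in this style; the group names and the second file path are illustrative:

using SafeTestsets, Test

const GROUP = get(ENV, "GROUP", "All")

if GROUP == "All" || GROUP == "Interface"
    @time @safetestset "Jacobian Tests" include("interface/jacobian_tests.jl")
end

if GROUP == "All" || GROUP == "Convergence"
    @time @safetestset "Convergence Tests" include("convergence/convergence_tests.jl")
end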

Whitespace

  • Avoid extraneous whitespace immediately inside parentheses, square brackets or braces.

    # Yes:
    spam(ham[1], [eggs])
    
    # No:
    spam( ham[ 1 ], [ eggs ] )
  • Avoid extraneous whitespace immediately before a comma or semicolon:

    # Yes:
    if x == 4 @show(x, y); x, y = y, x end
    
    # No:
    if x == 4 @show(x , y) ; x , y = y , x end
  • Avoid whitespace around : in ranges. Use brackets to clarify expressions on either side.

    # Yes:
    ham[1:9]
    ham[9:-3:0]
    ham[1:step:end]
    ham[lower:upper-1]
    ham[lower:upper - 1]
    ham[lower:(upper + offset)]
    ham[(lower + offset):(upper + offset)]
    
    # No:
    ham[1: 9]
    ham[9 : -3: 1]
    ham[lower : upper - 1]
    ham[lower + offset:upper + offset]  # Avoid as it is easy to read as `ham[lower + (offset:upper) + offset]`
  • Avoid using more than one space around an assignment (or other) operator to align it with another:

    # Yes:
    x = 1
    y = 2
    long_variable = 3
    
    # No:
    x             = 1
    y             = 2
    long_variable = 3
  • Surround most binary operators with a single space on either side: assignment (=), updating operators (+=, -=, etc.), numeric comparison operators (==, <, >, !=, etc.), and the lambda operator (->). Binary operators that may be excluded from this guideline include: the range operator (:), rational operator (//), exponentiation operator (^), and optional arguments/keywords (e.g. f(x = 1; y = 2)).

    # Yes:
    i = j + 1
    submitted += 1
    x^2 < y
    
    # No:
    i=j+1
    submitted +=1
    x^2<y
  • Avoid using whitespace between unary operands and the expression:

    # Yes:
    -1
    [1 0 -1]
    
    # No:
    - 1
    [1 0 - 1]  # Note: evaluates to `[1 -1]`
  • Avoid extraneous empty lines. Avoid empty lines between single line method definitions and otherwise separate functions with one empty line, plus a comment if required:

    # Yes:
    # Note: an empty line before the first long-form `domaths` method is optional.
    domaths(x::Number) = x + 5
    domaths(x::Int) = x + 10
    function domaths(x::String)
        return "A string is a one-dimensional extended object postulated in string theory."
    end
    
    dophilosophy() = "Why?"
    
    # No:
    domath(x::Number) = x + 5
    
    domath(x::Int) = x + 10
    
    
    
    function domath(x::String)
        return "A string is a one-dimensional extended object postulated in string theory."
    end
    
    
    dophilosophy() = "Why?"
  • Function calls which cannot fit on a single line within the line limit should be broken up such that the lines containing the opening and closing brackets are indented to the same level while the parameters of the function are indented one level further. In most cases the arguments and/or keywords should each be placed on separate lines. Note that this rule conflicts with the typical Julia convention of indenting the next line to align with the open bracket in which the parameter is contained. If working in a package with a different convention follow the convention used in the package over using this guideline.

    # Yes:
    f(a, b)
    constraint = conic_form!(SOCElemConstraint(temp2 + temp3, temp2 - temp3, 2 * temp1),
                             unique_conic_forms)
    
    # No:
    # Note: `f` call is short enough to be on a single line
    f(
        a,
        b,
    )
    constraint = conic_form!(SOCElemConstraint(temp2 + temp3,
                                               temp2 - temp3, 2 * temp1),
                             unique_conic_forms)
  • Group similar one line statements together.

    # Yes:
    foo = 1
    bar = 2
    baz = 3
    
    # No:
    foo = 1
    
    bar = 2
    
    baz = 3
  • Use blank-lines to separate different multi-line blocks.

    # Yes:
    if foo
        println("Hi")
    end
    
    for i in 1:10
        println(i)
    end
    
    # No:
    if foo
        println("Hi")
    end
    for i in 1:10
        println(i)
    end
  • After a function definition, and before an end statement, do not include a blank line.

    # Yes:
    function foo(bar::Int64, baz::Int64)
        return bar + baz
    end
    
    # No:
    function foo(bar::Int64, baz::Int64)
    
        return bar + baz
    end
    
    # No:
    function foo(bar::Int64, baz::Int64)
        return bar + baz
    
    end
  • Use line breaks between control flow statements and returns.

    # Yes:
    function foo(bar; verbose = false)
        if verbose
            println("baz")
        end
    
        return bar
    end
    
    # Ok:
    function foo(bar; verbose = false)
        if verbose
            println("baz")
        end
        return bar
    end

NamedTuples

The = character in NamedTuples should be spaced as in keyword arguments. Space should be put between the name and its value. The empty NamedTuple should be written NamedTuple(), not (;).

# Yes:
xy = (x = 1, y = 2)
x = (x = 1,)  # Trailing comma required for correctness.
x = (; kwargs...)  # Semicolon required to splat correctly.

# No:
xy = (x=1, y=2)
xy = (;x=1,y=2)

Numbers

  • Floating-point numbers should always include a leading and/or trailing zero:
# Yes:
0.1
2.0
3.0f0

# No:
.1
2.
3.f0
  • Always prefer the type Int to Int32 or Int64 unless one has a specific reason to choose the bit size.

Ternary Operator

Ternary operators (?:) should generally only consume a single line. Do not chain multiple ternary operators. If chaining many conditions, consider using an if-elseif-else conditional, dispatch, or a dictionary.

# Yes:
foobar = foo == 2 ? bar : baz

# No:
foobar = foo == 2 ?
    bar :
    baz
foobar = foo == 2 ? bar : foo == 3 ? qux : baz

As an alternative, you can use an if-elseif-else expression:

# Yes:
foobar = if foo == 2
    bar
else
    baz
end

foobar = if foo == 2
    bar
elseif foo == 3
    qux
else
    baz
end

For loops

For loops should always use in, never = or ∈. This also applies to array comprehensions and generators.

# Yes
for i in 1:10
    #...
end

[foo(x) for x in xs]

# No:
for i = 1:10
    #...
end

[foo(x) for x ∈ xs]

Function Type Annotations

Annotations for function definitions should be as general as possible.

# Yes:
splicer(arr::AbstractArray, step::Integer) = arr[begin:step:end]

# No:
splicer(arr::Array{Int}, step::Int) = arr[begin:step:end]

Using types that are as generic as possible allows for a variety of inputs and makes your code more general:

julia> splicer(1:10, 2)
1:2:9

julia> splicer([3.0, 5, 7, 9], 2)
2-element Array{Float64,1}:
 3.0
 7.0

Struct Type Annotations

Annotations on type fields need to be given a little more thought since field access is not concrete unless the compiler can infer the type (see type-dispatch design for details). Since well-inferred code is preferred, abstract type annotations, i.e.

mutable struct MySubString <: AbstractString
    string::AbstractString
    offset::Integer
    endof::Integer
end

are not recommended. Instead a concretely-typed struct:

mutable struct MySubString <: AbstractString
    string::String
    offset::Int
    endof::Int
end

is preferred. If generality is required, then parametric typing is preferred, i.e.:

mutable struct MySubString{T<:Integer} <: AbstractString
    string::String
    offset::T
    endof::T
end

Untyped fields should be explicitly typed Any, i.e.:

struct StructA
    a::Any
end

Macros

  • Do not put spaces around = when a macro call contains multiple assignments.
Yes:
@parameters a = b
@parameters a=b c=d

No:
@parameters a = b c = d

Types and Type Annotations

  • Avoid elaborate union types. Vector{Union{Int,AbstractString,Tuple,Array}} should probably be Vector{Any}. This will reduce the amount of extra strain on compilation checking many branches.
  • Unions should be kept to two or three types, which the compiler can handle via branch splitting; even then, keep them to a minimum for the sake of compile times.
  • Do not use === to compare types. Use isa or <: instead.

Package version specifications

  • Use Semantic Versioning
  • For simplicity, avoid including the default caret specifier when specifying package version requirements.
# Yes:
DataFrames = "0.17"

# No:
DataFrames = "^0.17"
  • For accuracy, do not use constructs like >= to avoid upper bounds.
  • Every dependency should have a bound.
  • All packages should use CompatHelper and attempt to stay up to date with the dependencies.
  • The lower bound on dependencies should be the last tested version.

Documentation

  • Documentation should always attempt to be at the highest level possible. I.e., documentation of an interface that all methods follow is preferred to documenting every method, and documenting the interface of an abstract type is preferred to documenting all of the subtypes individually. All instances should then refer to the higher level documentation.
  • Documentation should use Documenter.jl.
  • Tutorials should come before reference materials.
  • Every package should have a starting tutorial that covers "the 90% use case", i.e. the ways that most people will want to use the package.
  • The tutorial should show a complete workflow and be opinionated in said workflow. For example, when writing a tutorial about a simulator, pick a plotting package and show how to plot the results.
  • Variable names in tutorials are important. If you use u0, then all other codes will copy that naming scheme. Show potential users the right way to use your code with the right naming.
  • When applicable, tutorials on how to use the "high performance advanced features" should be separated from the beginning tutorial.
  • All documentation should summarize contents before going into specifics of API docstrings.
  • Most modules, types and functions should have docstrings.
  • Prefer documenting accessor functions instead of fields when possible. Documented fields are part of the public API and changing their contents/name constitutes a breaking change.
  • Only exported functions are required to be documented.
  • Avoid documenting common method overloads such as ==.
  • Try to document a function and not individual methods where possible as typically all methods will have similar docstrings.
  • If you are adding a method to a function which already has a docstring, only add a docstring if the behaviour of your method deviates from the existing docstring.
  • Docstrings are written in Markdown and should be concise.
  • Docstring lines should be wrapped at 92 characters.
"""
    bar(x[, y])

Compute the Bar index between `x` and `y`. If `y` is missing, compute the Bar index between
all pairs of columns of `x`.
"""
function bar(x, y) ...
  • It is recommended that you have a blank line between the headings and the content when the content is of sufficient length.
  • Try to be consistent within a docstring whether you use this additional whitespace.
  • Follow one of the following templates for types and functions when possible:

Type Template (should be skipped if is redundant with the constructor(s) docstring):

"""
    MyArray{T, N}

My super awesome array wrapper!

# Fields
- `data::AbstractArray{T, N}`: stores the array being wrapped
- `metadata::Dict`: stores metadata about the array
"""
struct MyArray{T, N} <: AbstractArray{T, N}
    data::AbstractArray{T, N}
    metadata::Dict
end

Function Template (only required for exported functions):

"""
    mysearch(array::MyArray{T}, val::T; verbose = true) where {T} -> Int

Searches the `array` for the `val`. For some reason we don't want to use Julia's
builtin search :)

# Arguments
- `array::MyArray{T}`: the array to search
- `val::T`: the value to search for

# Keywords
- `verbose::Bool = true`: print out progress details

# Returns
- `Int`: the index where `val` is located in the `array`

# Throws
- `NotFoundError`: I guess we could throw an error if `val` isn't found.
"""
function mysearch(array::AbstractArray{T}, val::T) where {T}
    ...
end
  • The @doc doc""" """ formulation from the Markdown standard library should be used whenever there is LaTeX.
  • Only public fields of types must be documented. Undocumented fields are considered non-public internals.
  • If your method contains lots of arguments or keywords you may want to exclude them from the method signature on the first line and instead use args... and/or kwargs....
"""
    Manager(args...; kwargs...) -> Manager

A cluster manager which spawns workers.

# Arguments

- `min_workers::Integer`: The minimum number of workers to spawn or an exception is thrown
- `max_workers::Integer`: The requested number of workers to spawn

# Keywords

- `definition::AbstractString`: Name of the job definition to use. Defaults to the
    definition used within the current instance.
- `name::AbstractString`: ...
- `queue::AbstractString`: ...
"""
function Manager(...)
    ...
end
  • Feel free to document multiple methods for a function within the same docstring. Be careful to only do this for functions you have defined.
"""
    Manager(max_workers; kwargs...)
    Manager(min_workers:max_workers; kwargs...)
    Manager(min_workers, max_workers; kwargs...)

A cluster manager which spawns workers.

# Arguments

- `min_workers::Int`: The minimum number of workers to spawn or an exception is thrown
- `max_workers::Int`: The requested number of workers to spawn

# Keywords

- `definition::AbstractString`: Name of the job definition to use. Defaults to the
    definition used within the current instance.
- `name::AbstractString`: ...
- `queue::AbstractString`: ...
"""
function Manager end
  • If the documentation for a bullet point exceeds 92 characters, the line should be wrapped and slightly indented. Avoid aligning the text to the :.
"""
...

# Keywords
- `definition::AbstractString`: Name of the job definition to use. Defaults to the
    definition used within the current instance.
"""

Error Handling

  • error("string") should be avoided. Defining and throwing exception types is preferred. See the manual on exceptions for more details.
  • Try to avoid try/catch. Use it as minimally as possible. Attempt to catch potential issues before running code, not after.

Arrays

  • Avoid splatting (...) whenever possible. Prefer operating on the collection directly, e.g. reduce(vcat, xs) instead of vcat(xs...).

Line Endings

Always use Unix-style \n line endings.

VS-Code Settings

If you are a user of VS Code, we recommend that you have the following options in your Julia syntax-specific settings. To modify these settings, open your VS Code settings with CMD+, (macOS) or CTRL+, (other OS), and add the following to your settings.json:

{
    "[julia]": {
        "editor.detectIndentation": false,
        "editor.insertSpaces": true,
        "editor.tabSize": 4,
        "files.insertFinalNewline": true,
        "files.trimFinalNewlines": true,
        "files.trimTrailingWhitespace": true,
        "editor.rulers": [92],
        "files.eol": "\n"
    },
}

Additionally you may find the Julia VS-Code plugin useful.

JuliaFormatter

Note: the sciml style is only available in JuliaFormatter v1.0 or later

One can add .JuliaFormatter.toml with the content

style = "sciml"

in the root of a repository, and run

using JuliaFormatter, SomePackage
format(joinpath(dirname(pathof(SomePackage)), ".."))

to format the package automatically.

Add FormatCheck.yml to enable the formatting CI. The CI will fail if the repository needs additional formatting. Thus, one should run format before committing.

References

Many of these style choices were derived from the Julia style guide, the YASGuide, and the Blue style guide.
