  • Language
    Julia
  • License
    MIT License

For working with dimensions of arrays by name

NamedDims

NamedDimsArray is a zero-cost abstraction to add names to the dimensions of an array.

Core functionality:

The examples below use nda = NamedDimsArray{(:x, :y, :z)}(rand(10, 20, 30)).

  • Indexing: nda[y=2] is the same as nda[x=:, y=2, z=:] which is the same as nda[:, 2, :].
  • Functions taking a dims keyword: sum(nda; dims=:y) is the same as sum(nda; dims=2).
  • Accessing Names: dimnames(nda) returns (:x, :y, :z), a tuple with the dimension names.
  • Identifying a dimension by name: dim(nda, :y) returns 2, the numerical dimension named :y. Similarly dim(nda, (:y, :z)) returns (2, 3).
  • Unwrapping: parent(nda) returns the underlying AbstractArray that is wrapped by the NamedDimsArray.
  • Unnaming: unname(a) ensures an AbstractArray is not a NamedDimsArray; if passed a NamedDimsArray it unwraps it, otherwise just returns the given AbstractArray.
  • Renaming: rename(nda, new_names) returns a new NamedDimsArray with the new_names but still wrapping the same data.
  • Refining Names: NamedDimsArray(nda, names) returns a new NamedDimsArray with any unnamed dimensions of nda getting their names from names. It errors if any names present in both disagree.
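The operations listed above can be tried together in a short session. This is a minimal sketch that only uses the calls named in the list; the variable names (raw, renamed, partial, refined) are illustrative and not part of the package.

```julia
using NamedDims

raw = rand(10, 20, 30)
nda = NamedDimsArray{(:x, :y, :z)}(raw)

# Indexing by name selects along the named dimension only.
slice = nda[y=2]
size(slice)                      # (10, 30)

# Names and their numerical positions.
dimnames(nda)                    # (:x, :y, :z)
dim(nda, :y)                     # 2
dim(nda, (:y, :z))               # (2, 3)

# Reductions accept a dimension name in place of a number.
total = sum(nda; dims=:y)

# Unwrapping and unnaming both give back the wrapped array here.
parent(nda) === raw
unname(nda) === raw
unname(raw) === raw              # a plain array is returned unchanged

# Renaming keeps the same data under new names.
renamed = rename(nda, (:a, :b, :c))

# Refining fills in the wildcard name :_ from the supplied names.
partial = NamedDimsArray{(:x, :_, :z)}(raw)
refined = NamedDimsArray(partial, (:x, :y, :z))
```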

Dimensionally Safe Operations

Any operation on multiple NamedDimsArrays requires them to have compatible dimension names. For example, trying NamedDimsArray{(:time,)}(ones(5)) + NamedDimsArray{(:place,)}(ones(5)) will throw an error. If you perform an operation between a plain AbstractArray and a NamedDimsArray, then the result will take its names from the NamedDimsArray. You can use this to bypass the protection, e.g. NamedDimsArray{(:time,)}(ones(5)) + parent(NamedDimsArray{(:place,)}(ones(5))) is allowed.
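The behaviour described above can be checked directly. The sketch below catches the error generically rather than assuming a particular exception type; the names by_time, by_place, and mismatch_threw are illustrative only.

```julia
using NamedDims

by_time = NamedDimsArray{(:time,)}(ones(5))
by_place = NamedDimsArray{(:place,)}(ones(5))

# Adding arrays whose dimension names disagree is rejected.
mismatch_threw = try
    by_time + by_place
    false
catch
    true
end

# Unwrapping one operand bypasses the check; the result keeps the
# names of the remaining NamedDimsArray.
bypassed = by_time + parent(by_place)
dimnames(bypassed)   # (:time,)
```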

Partially Named Dimensions (:_)

To allow for arrays where only some dimensions have names, the name :_ is treated as a wildcard. A dimension named :_ is not checked for name compatibility: it can be combined with a dimension of any name, and the result takes the concrete (non-wildcard) name if either operand has one. For example, NamedDimsArray{(:time, :_)}(ones(5, 2)) + NamedDimsArray{(:_, :place)}(ones(5, 2)) is allowed, and the result is NamedDimsArray{(:time, :place)}(2 * ones(5, 2)). As such, unless you want this wildcard behaviour, you should not use :_ as a dimension name. (It is also a terrible dimension name, and goes against the whole point of this package.)

When you perform matrix multiplication between an AbstractArray and a NamedDimsArray, the new dimension that comes from the unnamed operand is given the wildcard name :_. Similarly, when you take the transpose of an AbstractVector, the new first dimension is named :_.
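A short sketch of the wildcard rules in this section, using the same arrays as the prose example plus one matrix product. The variable names are illustrative, and the product example assumes the plain Matrix is the left operand.

```julia
using NamedDims

# Each operand names only one of its two dimensions.
left = NamedDimsArray{(:time, :_)}(ones(5, 2))
right = NamedDimsArray{(:_, :place)}(ones(5, 2))

# Wildcards unify with the concrete name from the other operand.
combined = left + right
dimnames(combined)   # (:time, :place)

# A plain Matrix times a NamedDimsArray: the dimension contributed by
# the unnamed operand gets the wildcard name.
named = NamedDimsArray{(:a, :b)}(ones(5, 2))
product = ones(3, 5) * named
dimnames(product)    # (:_, :b)
```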

Usage

Writing functions that accept NamedDimsArrays or AbstractArrays

It is common to want to write code that anyone can call, whether or not they are using NamedDimsArrays, while still using NamedDimsArrays internally and still asserting, when a NamedDimsArray is passed in, that it has the expected dimensions. The way to do this is to call the NamedDimsArray constructor with the expected names inside the function. This operation corresponds to PyTorch's refine_names, as in the following example:

using NamedDims
using Statistics  # provides var

function total_variance(data::AbstractMatrix)
    n_data = NamedDimsArray(data, (:times, :locations))  # apply or check the expected names
    location_variance = var(n_data; dims=:times)  # calculate variance at each location
    return sum(location_variance; dims=:locations)  # total them
end

If this function is given (say) a Matrix, then it will apply the names to it in n_data, so the function just works on unnamed types. If data is a NamedDimsArray with incompatible names, an error will be thrown; for example, if data was mistakenly transposed and so had the dimension names (:locations, :times) instead of (:times, :locations). If data was partially named, e.g. (:_, :locations), then those names would be combined with the names from the constructor, yielding n_data with the expected names (:times, :locations). This pattern allows both assertions of correctness (for named inputs) and convenience and compatibility (for unnamed inputs). And since NamedDimsArray is a zero-cost abstraction, this will basically compile out of existence most of the time.
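The three cases described above (unnamed, correctly named, and wrongly named input) can be exercised as follows. This sketch repeats the definition of total_variance so that it runs on its own; the sample data and the variable names are illustrative, and the error is caught generically rather than assuming a specific exception type.

```julia
using NamedDims
using Statistics

function total_variance(data::AbstractMatrix)
    n_data = NamedDimsArray(data, (:times, :locations))
    location_variance = var(n_data; dims=:times)
    return sum(location_variance; dims=:locations)
end

# Rows are times, columns are locations.
data = [1.0 2.0; 3.0 6.0]

# Unnamed input: names are applied inside the function.
from_matrix = total_variance(data)

# Partially named input: the wildcard is filled in by the constructor.
from_partial = total_variance(NamedDimsArray{(:_, :locations)}(data))

# Wrongly ordered names are rejected instead of silently computed.
transposed_threw = try
    total_variance(NamedDimsArray{(:locations, :times)}(data))
    false
catch
    true
end
```

Here the per-location variances are 2.0 and 8.0, so both successful calls return a 1×1 result holding 10.0.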

Extending support for more functions

There are two common things to do to make a function support NamedDimsArrays. These are:

  • Adding support for referring to a dimension by name to an existing function.
  • Making the operation return a NamedDimsArray rather than an Array. (Many operations fall back to dropping the names.)

Often they are done together.

They are illustrated by the following example:

function foo(nda::NamedDimsArray, args...; dims=:)
    numerical_dims = dim(nda, dims)  # convert any form of dims into numerical dims
    raw_result = foo(parent(nda), args...; dims=numerical_dims)  # call it on the backing data
    new_names = determine_foo_names(nda, args...)  # work out what the new names will be
    return NamedDimsArray{new_names}(raw_result)  # wrap the result up
end

You can do this to your own functions in your own packages, to add NamedDimsArray support. If you implement it for any functions in a standard library, a PR would be very appreciated.
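As a concrete instance of the pattern, here is a minimal sketch for a made-up function. The function spread is hypothetical and defined only for this illustration; because it is a reduction that keeps the reduced dimension at size 1, the new names are assumed to be the same as the input names.

```julia
using NamedDims

# Hypothetical function that knows nothing about names:
# the range (max minus min) along the given dimension(s).
spread(x::AbstractArray; dims) = maximum(x; dims=dims) .- minimum(x; dims=dims)

# NamedDimsArray method following the pattern above.
function spread(nda::NamedDimsArray; dims)
    numerical_dims = dim(nda, dims)                         # name(s) -> number(s)
    raw_result = spread(parent(nda); dims=numerical_dims)   # work on the backing data
    return NamedDimsArray{dimnames(nda)}(raw_result)        # names are unchanged here
end

nda = NamedDimsArray{(:x, :y)}([1 5; 2 9])
by_row = spread(nda; dims=:y)   # 2×1 result holding 4 and 7
```

For functions that drop or reorder dimensions, the names passed to the final constructor would need to be computed accordingly.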

Caveats

If multiple dimensions have the same names, indexing by name is considered undefined behaviour and should not be relied upon.
