  • Stars: 99
  • Rank 331,243 (Top 7 %)
  • Language: Swift
  • License: MIT License
  • Created about 5 years ago
  • Updated 6 months ago

Repository Details

Accelerated tensor operations and dynamic neural networks based on reverse mode automatic differentiation for every device that can run Swift - from watchOS to Linux

DL4S

Supports Linux, macOS, iOS, tvOS and watchOS.

DL4S provides a high-level API for many accelerated operations common in neural networks and deep learning. It also has automatic differentiation built in, which allows you to create and train neural networks without manually implementing backpropagation and without a special Swift toolchain.

Features include implementations for many basic binary and unary operators, broadcasting, matrix operations, convolutional and recurrent neural networks, commonly used optimizers, second derivatives and much more. DL4S provides implementations for common network architectures, such as VGG, AlexNet, ResNet and Transformers.

While its primary purpose is deep learning and optimization, DL4S can also be used as a library for vectorized mathematical operations, similar to NumPy.

Read the full documentation

Overview

  1. Installation
  2. Features
    1. Layers
    2. Optimizers
    3. Losses
    4. Tensor Operations
    5. Engines
    6. Architectures
  3. Examples

Installation

iOS / tvOS / macOS

  1. In Xcode, select "File" > "Swift Packages" > "Add Package Dependency"
  2. Enter https://github.com/palle-k/DL4S.git into the Package URL field and click "Next".
  3. Select "Branch", "master" and click "Next".
  4. Enable the package product DL4S, select your app in the "Add to Target" column and click "Next".

Note: Installation via CocoaPods is no longer supported for newer versions.

Swift Package

Add the dependency to your Package.swift file:

.package(url: "https://github.com/palle-k/DL4S.git", .branch("master"))

Then add DL4S as a dependency to your target:

.target(name: "MyPackage", dependencies: ["DL4S"])
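
For context, the two fragments above could sit in a manifest like the following minimal sketch. The tools version is an assumption made for illustration; the DL4S URL, the branch and the target dependency come from the instructions above.

```swift
// swift-tools-version:5.1
// Package.swift (sketch; the tools version above is an assumption)
import PackageDescription

let package = Package(
    name: "MyPackage",
    dependencies: [
        // Tracks the master branch of DL4S, as in the fragment above.
        .package(url: "https://github.com/palle-k/DL4S.git", .branch("master"))
    ],
    targets: [
        // Makes the DL4S module importable from MyPackage.
        .target(name: "MyPackage", dependencies: ["DL4S"])
    ]
)
```

Newer tools versions may prefer a different spelling of the branch requirement, so adjust the manifest to the toolchain you actually use.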

MKL / IPP / OpenMP Support

DL4S can be accelerated with Intel's Math Kernel Library, Integrated Performance Primitives and OpenMP (Installation Instructions).

On Apple devices, DL4S uses vectorized functions provided by the built-in Accelerate framework by default. If no acceleration library is available, a fallback implementation is used.
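
The idea of an accelerated path with a generic fallback can be sketched in plain Swift with conditional compilation. This is not DL4S's actual implementation; the function name elementwiseSum is hypothetical, and the only library call used is vDSP_vadd from Accelerate.

```swift
#if canImport(Accelerate)
import Accelerate
#endif

// Adds two equally sized vectors element by element.
// Hypothetical helper for illustration, not part of DL4S.
func elementwiseSum(_ lhs: [Float], _ rhs: [Float]) -> [Float] {
    precondition(lhs.count == rhs.count, "Vectors must have the same length")
    var result = [Float](repeating: 0, count: lhs.count)
    #if canImport(Accelerate)
    // Vectorized path on Apple platforms.
    vDSP_vadd(lhs, 1, rhs, 1, &result, 1, vDSP_Length(lhs.count))
    #else
    // Generic fallback for platforms without Accelerate, such as Linux.
    for index in lhs.indices {
        result[index] = lhs[index] + rhs[index]
    }
    #endif
    return result
}

print(elementwiseSum([1, 2, 3], [10, 20, 30])) // prints "[11.0, 22.0, 33.0]"
```

Both branches produce the same result, so callers do not need to know which path was compiled in.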

Compiling with MKL/IPP:

# After adding the APT repository as described in the installation instructions
sudo apt-get install intel-mkl-64bit-2019.5-075 intel-ipp-64bit-2019.5-075 libiomp-dev

export MKLROOT=/opt/intel/mkl
export IPPROOT=/opt/intel/ipp
export LD_LIBRARY_PATH=${MKLROOT}/lib/intel64:${IPPROOT}/lib/intel64:${LD_LIBRARY_PATH}

swift build -c release \
    -Xswiftc -DMKL_ENABLE \
    -Xlinker -L${MKLROOT}/lib/intel64 \
    -Xlinker -L${IPPROOT}/lib/intel64

TensorBoard Support

DL4S-Tensorboard provides a summary writer that can write TensorBoard-compatible logs.

LLDB Extension

DL4S includes an LLDB Python script that provides custom descriptions for Tensors (util/debugger_support/tensor.py).

To use enhanced summaries, either execute command script import /path/to/DL4S/util/debugger_support/tensor.py directly in LLDB or add the command to your ~/.lldbinit file.

Then you can use the print or frame variable commands to print human-readable descriptions of tensors.

Features

Layers

Core:

  • Convolution
  • Transposed Convolution
  • Dense/Linear/Fully Connected
  • LSTM
  • Gated Recurrent Unit (GRU)
  • Vanilla RNN
  • Embedding
  • Multi-head Attention
  • Transformer Block

Pooling:

  • Max Pooling
  • Average Pooling
  • Adaptive Max Pooling
  • Adaptive Average Pooling

Norm:

  • Batch Norm
  • Layer Norm

Utility:

  • Bidirectional RNNs
  • Sequential
  • Lambda
  • Dropout

Activation:

  • Relu
  • LeakyRelu
  • Gelu
  • Tanh
  • Sigmoid
  • Softmax
  • Log Softmax
  • Dropout
  • Swish
  • Mish
  • LiSHT

Transformer:

  • Positional Encoding
  • Scaled Dot Product Attention
  • Multihead Attention
  • Pointwise Feed Forward
  • Transformer Encoder Block
  • Transformer Decoder Block

Optimizers

  • SGD
  • Momentum
  • Adam
  • AMSGrad
  • AdaGrad
  • AdaDelta
  • RMSProp

Losses

  • Binary Cross-Entropy
  • Categorical Cross-Entropy
  • Negative Log Likelihood (NLL Loss)
  • MSE
  • L1 & L2 regularization

Tensor Operations

The behavior of broadcast operations is consistent with NumPy's broadcasting rules.

  • broadcast-add
  • broadcast-sub
  • broadcast-mul
  • broadcast-div
  • matmul
  • neg
  • exp
  • pow
  • log
  • sqrt
  • sin
  • cos
  • tan
  • tanh
  • sum
  • max
  • relu
  • leaky relu
  • gelu
  • elu
  • elementwise min
  • elementwise max
  • reduce sum
  • reduce max
  • scatter
  • gather
  • conv2d
  • transposed conv2d
  • max pool
  • avg pool
  • subscript
  • subscript range
  • transpose
  • axis permute
  • reverse
  • im2col
  • col2im
  • stack / concat
  • swish activation
  • mish activation
  • lisht activation
  • diagonal matrix generation
  • diagonal extraction
  • band matrix generation
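
As a reminder of what the NumPy broadcasting rules mentioned at the top of this list imply for shapes, here is a small sketch in plain Swift. The function broadcastShape is a hypothetical helper written for this illustration and is not part of the DL4S API.

```swift
// Returns the broadcast shape of two shapes under NumPy's rules,
// or nil if the shapes are incompatible.
// Hypothetical helper for illustration, not part of DL4S.
func broadcastShape(_ lhs: [Int], _ rhs: [Int]) -> [Int]? {
    let rank = max(lhs.count, rhs.count)
    // Left-pad the shorter shape with ones so both have the same rank.
    let a = Array(repeating: 1, count: rank - lhs.count) + lhs
    let b = Array(repeating: 1, count: rank - rhs.count) + rhs
    var result: [Int] = []
    for (x, y) in zip(a, b) {
        if x == y || y == 1 {
            result.append(x)
        } else if x == 1 {
            result.append(y)
        } else {
            // Sizes differ and neither is 1: the shapes cannot be broadcast.
            return nil
        }
    }
    return result
}

print(broadcastShape([3, 2], [2]) ?? [])      // prints "[3, 2]"
print(broadcastShape([3, 1], [1, 4]) ?? [])   // prints "[3, 4]"
print(broadcastShape([3, 2], [3]) == nil)     // prints "true"
```

Axes are compared from the trailing dimension backwards, which is why a vector of length 2 combines with a 3 x 2 matrix while a vector of length 3 does not.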

Engines

  • CPU (Accelerate framework for Apple Devices)
  • CPU (Intel Math Kernel Library and Integrated Performance Primitives)
  • CPU (Generic)
  • GPU (ArrayFire: OpenCL, CUDA)

For an experimental, early-stage GPU-accelerated version, check out the feature/arrayfire branch.

Architectures

Default implementations are provided for the following architectures:

  • ResNet18
  • VGG (11, 13, 16, 19)
  • AlexNet
  • Transformer

Examples

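Some high-level examples have been implemented in other repositories, including Seq2Seq-DL4S (neural machine translation), DL4S-WGAN-GP (Wasserstein GAN with gradient penalty) and REINFORCE-DL4S (the REINFORCE algorithm).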

Arithmetic & Differentiation

DL4S provides a high-level interface to many vectorized operations on tensors.

let a = Tensor<Float, CPU>([[1,2],[3,4],[5,6]], requiresGradient: true)
let prod = a.transposed().matrixMultiplied(with: a)
let s = prod.reduceSum()
let l = log(s)
print(l) // 5.1873856

When a tensor is marked as requiring a gradient, a compute graph is captured. The graph stores all operations that use that tensor directly or indirectly as an operand.

It is then possible to backpropagate through that graph using the gradients(of:) function:

// Backpropagate
let dl_da = l.gradients(of: [a])[0]

print(dl_da)
/*
[[0.034, 0.034]
 [0.078, 0.078]
 [0.123, 0.123]]
*/
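
To make the idea of a captured compute graph concrete, the following is a minimal scalar reverse-mode sketch in plain Swift that reproduces the numbers above. It does not show how DL4S is implemented; the Node class, naturalLog and backpropagate are hypothetical names used only for this illustration.

```swift
import Foundation

// One node of a scalar compute graph. `parents` records each operand together
// with the local derivative of this node with respect to that operand.
final class Node {
    let value: Double
    var grad: Double = 0
    let parents: [(Node, Double)]

    init(_ value: Double, parents: [(Node, Double)] = []) {
        self.value = value
        self.parents = parents
    }
}

func + (a: Node, b: Node) -> Node {
    return Node(a.value + b.value, parents: [(a, 1), (b, 1)])
}

func * (a: Node, b: Node) -> Node {
    return Node(a.value * b.value, parents: [(a, b.value), (b, a.value)])
}

func naturalLog(_ a: Node) -> Node {
    return Node(log(a.value), parents: [(a, 1 / a.value)])
}

// Walks from the output towards the leaves, multiplying local derivatives
// along each path and summing the contributions per node (chain rule).
// Simple but not efficient: shared subgraphs are visited once per path.
func backpropagate(from node: Node, upstream: Double = 1) {
    node.grad += upstream
    for (parent, localDerivative) in node.parents {
        backpropagate(from: parent, upstream: upstream * localDerivative)
    }
}

let values: [[Double]] = [[1, 2], [3, 4], [5, 6]]
let a = values.map { row in row.map { Node($0) } }

// Sum of all entries of transpose(a) * a.
var s = Node(0)
for i in 0..<2 {
    for j in 0..<2 {
        for k in 0..<3 {
            s = s + a[k][i] * a[k][j]
        }
    }
}
let l = naturalLog(s)
backpropagate(from: l)

print(l.value)                  // approximately 5.18739
for row in a {
    print(row.map { $0.grad })  // each row holds 2 * rowSum / 179
}
```

The sum of the product matrix is 179, so every gradient entry is twice the sum of its row divided by 179: 6/179, 14/179 and 22/179, which round to the values printed above.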

Second derivatives

The operations used during backpropagation are themselves differentiable. Therefore, second derivatives can be computed by computing the gradient of the gradient.

When higher order derivatives are required, the compute graph of the backwards pass has to be explicitly retained.

let t = Tensor<Float, CPU>([1,2,3,4], requiresGradient: true)

let result = t * t * t
print(result) // [1, 8, 27, 64]

let grad = result.gradients(of: [t], retainBackwardsGraph: true)[0]
print(grad) // [3, 12, 27, 48]

let secondGrad = grad.gradients(of: [t], retainBackwardsGraph: true)[0]
print(secondGrad) // [6, 12, 18, 24]

let thirdGrad = secondGrad.gradients(of: [t])[0]
print(thirdGrad) // [6, 6, 6, 6]
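
The printed values can be checked against the closed-form derivatives of t³. The sketch below uses plain Swift arrays instead of DL4S tensors, so it only verifies the arithmetic and does not exercise the library.

```swift
let t: [Double] = [1, 2, 3, 4]

let result = t.map { $0 * $0 * $0 }   // t^3
let grad = t.map { 3 * $0 * $0 }      // first derivative: 3t^2
let secondGrad = t.map { 6 * $0 }     // second derivative: 6t
let thirdGrad = t.map { _ in 6.0 }    // third derivative: 6

print(result)
print(grad)
print(secondGrad)
print(thirdGrad)
```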

Convolutional Networks

Example for MNIST classification

// Input must have shape batchSize x 1 x 28 x 28
var model = Sequential {
   Convolution2D<Float, CPU>(inputChannels: 1, outputChannels: 6, kernelSize: (5, 5))
   Relu<Float, CPU>()
   MaxPool2D<Float, CPU>(windowSize: 2, stride: 2)
   
   Convolution2D<Float, CPU>(inputChannels: 6, outputChannels: 16, kernelSize: (5, 5))
   Relu<Float, CPU>()
   MaxPool2D<Float, CPU>(windowSize: 2, stride: 2)
   
   Flatten<Float, CPU>()
   
   Dense<Float, CPU>(inputSize: 256, outputSize: 120)
   Relu<Float, CPU>()
   
   Dense<Float, CPU>(inputSize: 120, outputSize: 10)
   LogSoftmax<Float, CPU>()
}

var optimizer = Adam(model: model, learningRate: 0.001)

// Single iteration of minibatch gradient descent
let batch: Tensor<Float, CPU> = ... // shape: [batchSize, 1, 28, 28]
let y_true: Tensor<Int32, CPU> = ... // shape: [batchSize]

// use optimizer.model, not model
let pred = optimizer.model(batch)
let loss = categoricalNegativeLogLikelihood(expected: y_true, actual: pred)

let gradients = loss.gradients(of: optimizer.model.parameters)
optimizer.update(along: gradients)
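
The inputSize of 256 for the first Dense layer follows from the spatial sizes after each convolution and pooling step. The sketch below reproduces that arithmetic in plain Swift, assuming the convolutions use stride 1 and no padding (an assumption that is consistent with the sizes in the example). The helper outputSize is hypothetical and not part of DL4S.

```swift
// Output length of a convolution or pooling window along one spatial axis,
// assuming no padding. Hypothetical helper for illustration.
func outputSize(input: Int, window: Int, stride: Int) -> Int {
    return (input - window) / stride + 1
}

var side = 28
side = outputSize(input: side, window: 5, stride: 1)  // first 5x5 convolution: 24
side = outputSize(input: side, window: 2, stride: 2)  // first 2x2 max pooling: 12
side = outputSize(input: side, window: 5, stride: 1)  // second 5x5 convolution: 8
side = outputSize(input: side, window: 2, stride: 2)  // second 2x2 max pooling: 4

// 16 output channels, each with side x side values, are flattened.
let flattenedFeatures = 16 * side * side
print(side, flattenedFeatures) // prints "4 256"
```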

Recurrent Networks

Example for MNIST classification

The Gated Recurrent Unit scans the image from top to bottom and uses the final hidden state for classification.

let model = Sequential {
    GRU<Float, CPU>(inputSize: 28, hiddenSize: 128, direction: .forward)
    Lambda<GRU<Float, CPU>.Outputs, Tensor<Float, CPU>, Float, CPU> { inputs in
        inputs.0
    }
    Dense<Float, CPU>(inputSize: 128, outputSize: 10)
    LogSoftmax<Float, CPU>()
}

var optimizer = Adam(model: model, learningRate: 0.001)

let batch: Tensor<Float, CPU> = ... // shape: [batchSize, 28, 28]
let y_true: Tensor<Int32, CPU> = ... // shape: [batchSize]

let x = batch.permuted(to: 1, 0, 2) // Swap first and second axis
let pred = optimizer.model(x)
let loss = categoricalNegativeLogLikelihood(expected: y_true, actual: pred)

let gradients = loss.gradients(of: optimizer.model.parameters)
optimizer.update(along: gradients)
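
Both MNIST examples end in a LogSoftmax layer followed by categoricalNegativeLogLikelihood. What that combination computes can be sketched on plain Swift arrays as follows. The function names here are hypothetical, and averaging over the batch is an assumption of this sketch rather than a statement about DL4S's reduction.

```swift
import Foundation

// Numerically stable log-softmax over one row of logits.
func logSoftmax(_ logits: [Double]) -> [Double] {
    let maxLogit = logits.max() ?? 0
    let sumOfExponentials = logits.map { exp($0 - maxLogit) }.reduce(0, +)
    let logSumExp = log(sumOfExponentials)
    return logits.map { $0 - maxLogit - logSumExp }
}

// Mean negative log likelihood of the expected class indices.
// Averaging over the batch is an assumption of this sketch.
func negativeLogLikelihood(expected: [Int], logProbabilities: [[Double]]) -> Double {
    precondition(!expected.isEmpty && expected.count == logProbabilities.count)
    var total = 0.0
    for (label, row) in zip(expected, logProbabilities) {
        total -= row[label]
    }
    return total / Double(expected.count)
}

// Two samples with two classes each and uniform logits.
let logits: [[Double]] = [[0, 0], [0, 0]]
let loss = negativeLogLikelihood(expected: [0, 1], logProbabilities: logits.map(logSoftmax))
print(loss) // ln 2, approximately 0.693
```

With uniform logits every class has probability 1/2, so the loss equals ln 2 regardless of the labels.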
