Discover idiap/sparch Open Source project

Code for experiments regarding importance sampling for training neural networks

1,586

importance-sampling

Bob is a free signal-processing and machine learning toolbox originally developed by the Biometrics group at Idiap Research Institute, in Switzerland.

319

bob

266

fullgrad-saliency

Full-gradient saliency maps

ESLAM

multicamera-calibration

Multi-Camera Calibration Suite

Generalizing NeRF with Geometry Priors

178

GeoNeRF

This Python package enables the training and inference of deep learning models for very large data, such as megapixel images, using attention-sampling

104

attention-sampling

Implementation of audio degradation processes

acoustic-simulator

Linear time Maximally Stable Extremal Regions implementation

mser

Extension to Kaldi implementing the standard i-vector hyperparameter estimation and i-vector extraction procedure

kaldi-ivector

Multilingual hierarchical attention networks toolkit

mhan

A PyTorch wrapper for LF-MMI training and parallel training in Kaldi

pkwrap

Document-Level Neural Machine Translation with Hierarchical Attention Networks

HAN_NMT

JavaScript

gafro

An efficient C++ library targeting robotics applications using geometric algebra

Juicer is a Weighted Finite State Transducer (WFST) based decoder for Automatic Speech Recognition (ASR).

juicer

Compare your face recognition algorithm to baseline algorithms

facereclib

g2g-transformer

PyTorch implementation of “Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement”

Code for the paper "Uncertainty Reduction for Model Adaptation in Semantic Segmentation" at CVPR 2021

model-uncertainty-for-adaptation

Implementation of fast exact k-means algorithms

eakmeans

Speech Signal Processing - a small collection of routines in Python to do signal processing

ssp

A Corpus for Research on Robust Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications

atco2-corpus

potr

Residual Pose: A Decoupled Approach for Depth-based 3D Human Pose Estimation

residual_pose

Implementation of the work presented in "CNN based Query by Example Spoken Term Detection"

CNN_QbE_STD

w2v2-air-traffic

Neural Network based Sound Source Localization Models

nnsslm

Code for the PyTorch implementation of "Spatially-Variant CNN-based Point Spread Function Estimation for Blind Deconvolution and Depth Estimation in Optical Microscopy", IEEE Transactions on Image Processing, 2020.

psfestimation

C++ Implementation of the Information Bottleneck System

IBDiarization

A generalized input-label embedding for text classification

gile

Code for "Semi-Blind Spatially-Variant Deconvolution in Optical Microscopy with Local Point Spread Function Estimation By Use Of Convolutional Neural Networks" ICIP 2018

semiblindpsfdeconv

A Python-based modular toolbox for building Deep Neural Network models (using PyTorch) for statistical parametric speech synthesis

IdiapTTS

Enables computing the gradient of the parameters of Hidden Markov Models (HMMs)

HMMGradients.jl

Julia

inv-tn

A collection of scripts that use several tools to perform inverse text normalization (ITN)

PyTorch implementation of "DeepFocus: a Few-Shot Microscope Slide Auto-Focus using a Sample Invariant CNN-based Sharpness Function"

deepfocus

Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering

zff_vad

Implementation of contextual biasing for ASR decoding on GPUs without lattice generation. The code supports a submission to Interspeech 2023.

contextual-biasing-on-gpus

Data and code related to the ICASSP submission "A comparison of methods for OOV-word recognition"

icassp-oov-recognition

This repo provides the training and testing code for our paper "A Modular Multimodal Architecture for Gaze Target Prediction: Application to Privacy-Sensitive Settings" published at the GAZE workshop at CVPR 2022

multimodal_gaze_target_prediction

Phonetic and phonological vocoding platform

phonvoc

Various scripts that facilitate the preparation of Automatic Speech Recognition related resources

asrt

Efficient Pose Machine for Multi-Person Pose Estimation

fast_pose_machines

The APAM toolkit is built on PyTorch and provides recipes to adapt pretrained acoustic models with a variety of sequence discriminative training criteria.

apam

Speech Signal Processing - C++ port of a subset of the Python library SSP

libssp

Content-based Recommendation Generator

cbrec

A Kaldi recipe for training automatic speech recognition systems on the Torgo corpus of dysarthric speech

torgo_asr

Weighted multiple-instance learning algorithm based on stochastic gradient descent

wmil-sgd

A PyTorch implementation of TTGO algorithm and the applications presented in the paper "Tensor Train for Global Optimization Problems in Robotics"

ttgo

Scripts for speech processing

iss

PyTorch implementation for HyperMixing, a linear-time token-mixing technique used in HyperMixer architecture

hypermixing

A PyTorch-based program that estimates 3D depth maps from multiple video frames of an active structured-light sensor

DepthInSpace

rgbd

Tracter is a data flow framework.

tracter

Deep residual output layers for neural language generation

drill

nvib_transformers

bert-text-diarization-atc

Supervised Speech Representation Learning for Parkinson's Disease Classification

pddetection-reps-learning

Partitional data clustering around centers

zentas

Experiments using fast linear transformer

linear-transformer-experiments

Emotion-based Recommendation Generator

emorec

OpenEdge ABL

DocRec

Keyword extraction and document recommendation in conversations

MATLAB

depth_human_synthesis

DepthHuman: A tool for depth image synthesis for human pose estimation

Geometry-aware Face Reconstruction

gafar

nvib

hallucination-detection

CNNs for voice antispoofing detection

cnn-for-voice-antispoofing

MATLAB

wav2vec-lfmmi

Recipes for fine-tuning a pre-trained wav2vec 2.0 model using the Espresso toolkit

A C++ iLQR library that lets you solve iLQR optimization problems on any robot, as long as you provide a URDF file describing the kinematic chain of the robot

ilqr_planner

A reference-based metric to evaluate the accuracy of pronoun translation (APT)

APT

sentence-planner

ISS scripts for handling pronunciation dictionaries

iss-dicts

cncsharedtask

Methods to estimate the visual focus of attention

slog

MATLAB

vfoa

BuSLR: Build System for Speech and Language Research

buslr

CMake

Node_weighted_GCN_for_depression_detection

Node-weighted Graph Convolutional Network for Depression Detection in Transcribed Clinical Interviews

HTML

abroad-re

Towards an end-to-end Relation Extraction system for the natural product literature: datasets, strategies and models

ML3 classifier (Multiclass Latent Locally Linear Support Vector Machines)

ML3

Sense-aware Neural Machine Translation

sense_aware_NMT

Source code for the paper 'Can Self-Supervised Neural Representations Pre-Trained on Human Speech distinguish Animal Callers?' by E. Sarkar and M. Magimai Doss (2023).

ssl-caller-detection

Extracting pre-trained self-supervised embeddings for ICML ExVO 2022 challenge

ExVo-2022

PHP Generic Registration Module [GPLv3]

php-geremo

PHP

idiap.github.com

Main page for idiap@github

CSS

TIDIGITSRecipe.jl

A Julia recipe for training an ASR system using the TIDIGITS database

Julia

hpca

bayesian-recurrence

A Bayesian Interpretation of Recurrence in Neural Networks

Reference implementation of the ICLR 2021 paper "Rethinking the Role of Gradient-Based Attribution Methods for Model Interpretability".

rethinking-saliency

Classifier models and feature extractors for discourse relations

DiscoConn-Classifier

Perl

pydhn

Calibrates a gaze estimator in an unsupervised fashion by automatically collecting calibration samples using task-related priors

unsupervised_gaze_calibration

Implementation and output data of "Global-Context Neural Machine Translation through Target-Side Attentive Residual Connections"

Attentive_Residual_Connections_NMT

JavaScript

FiniteStateTransducers.jl

Play with Weighted Finite State Transducers (WFST) in the Julia language.

Julia

iss-wsj

ISS scripts for the Wall Street Journal task

PyTorch network architectures for audio perception

archs

A Python module for generating District Heating Network layouts

dhgen

A lightweight URDF parser library, based on TinyXML2, that converts a URDF file into a KDL object

tinyurdfparser

PyTorch implementation of "Estimating Nonplanar Flow from 2D Motion-blurred Widefield Microscopy Images via Deep Learning", submitted to IEEE ISBI, 2021

flowestimation

apkit

Audio processing toolkit

The trimed algorithm for obtaining the medoid of a set

trimed

Linux Imaging and Deployment Made Easy

100

simple-imager