Fast truncated singular value decompositions

irlba

Implicitly-restarted Lanczos methods for fast truncated singular value decomposition of sparse and dense matrices (also referred to as partial SVD). IRLBA stands for Augmented, Implicitly Restarted Lanczos Bidiagonalization Algorithm. The package provides the following functions (see help on each for details and examples).

  • irlba() partial SVD function
  • ssvd() l1-penalized matrix decomposition for sparse PCA (based on Shen and Huang's algorithm)
  • prcomp_irlba() principal components function similar to the prcomp function in stats package for computing the first few principal components of large matrices
  • svdr() alternate partial SVD function based on randomized SVD (see also the rsvd package by N. Benjamin Erichson for an alternative implementation)
  • partial_eigen() a very limited partial eigenvalue decomposition for symmetric matrices (see the RSpectra package for more comprehensive truncated eigenvalue decomposition)
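
As a rough illustration of the first and third functions in the list above, the sketch below computes a few singular values and principal components of a small random matrix and compares them with the full decompositions from base R. The matrix size and the variable names are arbitrary choices made for this example, not part of the package.

```r
library(irlba)

set.seed(1)
# A modest dense test matrix; the dimensions are arbitrary illustration values.
A <- matrix(rnorm(200 * 50), nrow = 200, ncol = 50)

# Three largest singular triplets via irlba().
s <- irlba(A, nv = 3)

# Reference values from the full SVD in base R.
d_full <- svd(A)$d[1:3]
max_abs_diff <- max(abs(s$d - d_full))

# First three principal components, compared with stats::prcomp().
p_fast <- prcomp_irlba(A, n = 3)
p_full <- prcomp(A)
max_sdev_diff <- max(abs(p_fast$sdev - p_full$sdev[1:3]))
```

Both differences should be small relative to the singular values themselves. For a matrix this small the full svd() is the sensible choice; the partial methods are intended for large problems where only a few components are needed.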

The help page for each function includes extensive documentation and examples. Also see the package vignette, vignette("irlba", package="irlba").

An overview web page is here: https://bwlewis.github.io/irlba/.

What's new in Version 2.3.2?

  • Fixed a regression in prcomp_irlba() discovered by Xiaojie Qiu, see #25, and other related problems reported in #32.
  • Added rchk testing to pre-CRAN submission tests.
  • Fixed a sign bug in ssvd() found by Alex Poliakov.

What's new in Version 2.3.1?

  • Fixed an irlba() bug associated with centering (PCA), see #21.
  • Fixed irlba() scaling to conform to scale, see #22.
  • Improved prcomp_irlba() from a suggestion by N. Benjamin Erichson, see #23.
  • Significantly changed/improved svdr() convergence criterion.
  • Added a version of Shen and Huang's Sparse PCA/SVD L1-penalized matrix decomposition (ssvd()).
  • Fixed valgrind errors.

Deprecated features

I will remove partial_eigen() in a future version. As its documentation states, users are better off using the RSpectra package for eigenvalue computations (although not generally for singular value computations).
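
For readers moving away from partial_eigen(), a minimal sketch of the equivalent computation with RSpectra might look like the following. It assumes the RSpectra package is installed and uses its eigs_sym() function; the test matrix and variable names are made up for illustration.

```r
library(RSpectra)

set.seed(1)
X <- matrix(rnorm(100 * 20), nrow = 100, ncol = 20)
S <- crossprod(X)   # symmetric positive semi-definite, 20 x 20

# Three largest eigenvalues; which = "LA" requests the largest algebraic values.
e <- eigs_sym(S, k = 3, which = "LA")

# Reference values from the full symmetric eigendecomposition in base R.
ref <- eigen(S, symmetric = TRUE)$values[1:3]
max_eig_diff <- max(abs(sort(e$values, decreasing = TRUE) - ref))
```

The values are sorted before comparison so that the check does not depend on the order in which the solver returns them.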

The mult argument is deprecated and will be removed in a future version. We now recommend simply defining a custom class with a custom multiplication operator. The example below illustrates the old and new approaches.

library(irlba)
set.seed(1)
A <- matrix(rnorm(100), 10)

# ------------------ old way ----------------------------------------------
# A custom matrix multiplication function that scales the columns of A
# (cf the scale option). This function scales the columns of A to unit norm.
col_scale <- sqrt(apply(A, 2, crossprod))
mult <- function(x, y)
        {
          # check if x is a vector
          if (is.vector(x))
          {
            return((x %*% y) / col_scale)
          }
          # else x is the matrix
          x %*% (y / col_scale)
        }
irlba(A, 3, mult=mult)$d
## [1] 1.820227 1.622988 1.067185

# Compare with:
irlba(A, 3, scale=col_scale)$d
## [1] 1.820227 1.622988 1.067185

# Compare with:
svd(sweep(A, 2, col_scale, FUN=`/`))$d[1:3]
## [1] 1.820227 1.622988 1.067185

# ------------------ new way ----------------------------------------------
setClass("scaled_matrix", contains="matrix", slots=c(scale="numeric"))
setMethod("%*%", signature(x="scaled_matrix", y="numeric"), function(x ,y) x@.Data %*% (y / x@scale))
setMethod("%*%", signature(x="numeric", y="scaled_matrix"), function(x ,y) (x %*% y@.Data) / y@scale)
a <- new("scaled_matrix", A, scale=col_scale)

irlba(a, 3)$d
## [1] 1.820227 1.622988 1.067185

We have learned that using R's existing S4 system is simpler, easier, and more flexible than using custom arguments with idiosyncratic syntax and behavior. We've even used the new approach to implement distributed parallel matrix products for very large problems with amazingly little code.

Wishlist / help wanted...

  • More Matrix classes supported in the fast code path
  • Help improving the solver for singular values in tricky cases (basically, for ill-conditioned problems and especially for the smallest singular values); in general this may require a combination of more careful convergence criteria and use of harmonic Ritz values; Dmitriy Selivanov has proposed alternative convergence criteria in #29 for example.

References

  • Baglama, James, and Lothar Reichel. "Augmented implicitly restarted Lanczos bidiagonalization methods." SIAM Journal on Scientific Computing 27.1 (2005): 19-42.
  • Halko, Nathan, Per-Gunnar Martinsson, and Joel A. Tropp. "Finding structure with randomness: Stochastic algorithms for constructing approximate matrix decompositions." (2009).
  • Shen, Haipeng, and Jianhua Z. Huang. "Sparse principal component analysis via regularized low rank matrix approximation." Journal of Multivariate Analysis 99.6 (2008): 1015-1034.
  • Witten, Daniela M., Robert Tibshirani, and Trevor Hastie. "A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis." Biostatistics 10.3 (2009): 515-534.
