Implementations of selected sparse matrix formats for linear algebra supporting scientific and machine learning applications. Compatible with the APIs in the Gonum package and interoperable with Gonum dense matrix types.
Machine learning applications typically model entities as vectors of numerical features so that they can be compared and analysed quantitatively. Typically, the majority of the elements in these vectors are zeros. In text mining applications, for example, each document within a corpus is represented as a vector whose features correspond to the vocabulary of unique words. A corpus of several thousand documents might use a vocabulary of hundreds of thousands (or perhaps even millions) of unique words, but each document will typically contain only a couple of hundred unique words. This means the proportion of non-zero values in the matrix might be only around 1%, or less.
Sparse matrix formats capitalise on this by storing only the non-zero values, thereby reducing both the storage/memory requirements and the processing effort needed to manipulate the data.
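As a rough illustration of the saving, the sketch below uses only the standard library and is not this package's implementation. The `dok` type and its methods are hypothetical names chosen for the example: the type keeps just the non-zero entries in a map keyed by coordinate and treats every missing key as zero.

```go
package main

import "fmt"

// key identifies one cell of the matrix by row and column.
type key struct{ i, j int }

// dok is a hypothetical dictionary-of-keys matrix used for illustration only:
// it stores non-zero elements and treats every missing key as zero.
type dok struct {
	rows, cols int
	elements   map[key]float64
}

func newDOK(rows, cols int) *dok {
	return &dok{rows: rows, cols: cols, elements: make(map[key]float64)}
}

// set stores v at (i, j); writing a zero removes the entry so that
// only non-zero values ever occupy memory. Bounds are not checked.
func (m *dok) set(i, j int, v float64) {
	if v == 0 {
		delete(m.elements, key{i, j})
		return
	}
	m.elements[key{i, j}] = v
}

// at returns the value at (i, j), or zero if nothing is stored there.
func (m *dok) at(i, j int) float64 { return m.elements[key{i, j}] }

// density is the fraction of cells holding a non-zero value.
func (m *dok) density() float64 {
	return float64(len(m.elements)) / float64(m.rows*m.cols)
}

func main() {
	// 1,000 documents over a 100,000 word vocabulary.
	m := newDOK(1000, 100000)
	m.set(0, 42, 3)
	m.set(999, 7, 1)
	m.set(5, 5, 2)
	m.set(5, 5, 0) // overwriting with zero frees the entry again

	fmt.Println("stored values:", len(m.elements))
	fmt.Println("dense cells:  ", m.rows*m.cols)
	fmt.Println("value at (0,42):", m.at(0, 42))
}
```

A dense representation would allocate one float64 for every cell regardless of content, whereas the map grows only with the number of non-zero values.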
- Implementations of Sparse BLAS standard routines.
- Compatible with Gonum's APIs and interoperable with Gonum's dense matrix types.
- Implemented Formats:
  - Sparse Matrix Formats:
    - DOK (Dictionary Of Keys) format
    - COO (COOrdinate) format (sometimes referred to as 'triplet')
    - CSR (Compressed Sparse Row) format
    - CSC (Compressed Sparse Column) format
    - DIA (DIAgonal) format
    - Sparse vectors
  - Other Formats:
    - Binary (Bit) vectors and matrices
- Matrix multiplication, addition and subtraction, and vector dot products.
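To make two of the listed formats more concrete, here is a minimal sketch of how COO triplets can be compressed into a CSR layout. The `triplet` and `csr` types and the `cooToCSR` function are hypothetical names for this example and do not reflect the package's internal representation. The same idea applied to columns instead of rows gives CSC.

```go
package main

import "fmt"

// triplet is one COO entry: a row index, a column index and a value.
type triplet struct {
	row, col int
	val      float64
}

// csr is a hypothetical compressed-sparse-row layout for illustration.
// The non-zeros of row i are data[indptr[i]:indptr[i+1]] and their column
// indices are the matching slice of indices.
type csr struct {
	rows, cols int
	indptr     []int
	indices    []int
	data       []float64
}

// cooToCSR compresses COO triplets into CSR using a counting sort on the
// row index. Duplicate coordinates are kept as separate entries, and column
// indices within a row keep their input order.
func cooToCSR(rows, cols int, entries []triplet) csr {
	m := csr{
		rows: rows, cols: cols,
		indptr:  make([]int, rows+1),
		indices: make([]int, len(entries)),
		data:    make([]float64, len(entries)),
	}
	// Count the entries in each row, offset by one.
	for _, e := range entries {
		m.indptr[e.row+1]++
	}
	// A running sum turns the counts into row start offsets.
	for i := 0; i < rows; i++ {
		m.indptr[i+1] += m.indptr[i]
	}
	// Place each entry into the next free slot of its row.
	next := make([]int, rows)
	copy(next, m.indptr[:rows])
	for _, e := range entries {
		p := next[e.row]
		m.indices[p] = e.col
		m.data[p] = e.val
		next[e.row]++
	}
	return m
}

// at scans the stored entries of row i for column j.
func (m csr) at(i, j int) float64 {
	for p := m.indptr[i]; p < m.indptr[i+1]; p++ {
		if m.indices[p] == j {
			return m.data[p]
		}
	}
	return 0
}

func main() {
	m := cooToCSR(3, 2, []triplet{{2, 1, 7}, {0, 0, 5}})
	fmt.Println("indptr: ", m.indptr)
	fmt.Println("indices:", m.indices)
	fmt.Println("data:   ", m.data)
}
```

COO and DOK are convenient for building a matrix incrementally, while the compressed layouts give contiguous access to each row (or column), which is why conversions of this kind usually precede arithmetic.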
The sparse matrices in this package implement the Gonum Matrix interface and so are fully interoperable with the Gonum APIs and dense matrix types.
// Construct a new 3x2 DOK (Dictionary Of Keys) matrix
dokMatrix := sparse.NewDOK(3, 2)
// Populate it with some non-zero values
dokMatrix.Set(0, 0, 5)
dokMatrix.Set(2, 1, 7)
// Demonstrate accessing values (could use Gonum's mat.Formatted()
// function to pretty print but this demonstrates element access)
m, n := dokMatrix.Dims()
for i := 0; i < m; i++ {
	for j := 0; j < n; j++ {
		fmt.Printf("%.0f,", dokMatrix.At(i, j))
	}
	fmt.Printf("\n")
}
// Convert DOK matrix to CSR (Compressed Sparse Row) matrix
// just for fun (not required for upcoming multiplication operation)
csrMatrix := dokMatrix.ToCSR()
// Create a random 2x3 COO (COOrdinate) matrix with
// density of 0.5 (half the elements will be non-zero)
cooMatrix := sparse.Random(sparse.COOFormat, 2, 3, 0.5)
// Convert CSR matrix to Gonum mat.Dense matrix just for fun
// (not required for upcoming multiplication operation)
// then transpose so it is the right shape/dimensions for
// multiplication with the original CSR matrix
denseMatrix := csrMatrix.ToDense().T()
// Multiply the 2 matrices together and store the result in the
// sparse receiver (multiplication with sparse product)
var csrProduct sparse.CSR
csrProduct.Mul(csrMatrix, cooMatrix)
// As an alternative, use the sparse BLAS routines for efficient
// sparse matrix multiplication with a Gonum mat.Dense product
// (multiplication with dense product)
denseProduct := sparse.MulMatMat(false, 1, csrMatrix, denseMatrix, nil)
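For readers curious about what a sparse-by-dense multiplication involves, the following standard-library sketch computes alpha * A * B for a CSR matrix A and a row-major dense matrix B. It is an illustration only, not the package's `MulMatMat` implementation, and the `csr` type and `mulDense` function are hypothetical names.

```go
package main

import "fmt"

// csr is a hypothetical compressed-sparse-row matrix for illustration.
type csr struct {
	rows, cols int
	indptr     []int     // row i occupies positions indptr[i] up to indptr[i+1]
	indices    []int     // column index of each stored value
	data       []float64 // the stored non-zero values
}

// mulDense returns alpha * a * b, where b is a dense a.cols x bCols matrix
// in row-major order. The result is dense, a.rows x bCols, row-major.
func mulDense(alpha float64, a csr, b []float64, bCols int) []float64 {
	c := make([]float64, a.rows*bCols)
	for i := 0; i < a.rows; i++ {
		// Visit only the stored entries of row i.
		for p := a.indptr[i]; p < a.indptr[i+1]; p++ {
			k, v := a.indices[p], alpha*a.data[p]
			// Add v times row k of b into row i of c.
			for j := 0; j < bCols; j++ {
				c[i*bCols+j] += v * b[k*bCols+j]
			}
		}
	}
	return c
}

func main() {
	// The 3x2 matrix from the example above: 5 at (0,0) and 7 at (2,1).
	a := csr{
		rows: 3, cols: 2,
		indptr:  []int{0, 1, 1, 2},
		indices: []int{0, 1},
		data:    []float64{5, 7},
	}
	// Its 2x3 transpose, stored densely in row-major order.
	b := []float64{
		5, 0, 0,
		0, 0, 7,
	}
	c := mulDense(1, a, b, 3)
	for i := 0; i < a.rows; i++ {
		fmt.Println(c[i*3 : i*3+3])
	}
}
```

The work done is proportional to the number of stored values in A times the number of columns in B, so zero elements of A cost nothing.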
With Go installed, install the package using go get.
go get -u github.com/james-bowman/sparse/...
- Gonum
- Netlib. BLAS. Chapter 3: Sparse BLAS
- J.R. Gilbert, C. Moler, and R. Schreiber. Sparse matrices in MATLAB: Design and implementation. SIAM Journal on Matrix Analysis and Applications, 13:333–356, 1992.
- F.G. Gustavson. Some basic techniques for solving sparse systems of linear equations. In D.J. Rose and R.A. Willoughby, eds., Sparse Matrices and Their Applications, 41–52, New York: Plenum Press, 1972.
- F.G. Gustavson. Efficient algorithm to perform sparse matrix multiplication. IBM Technical Disclosure Bulletin, 20:1262–1264, 1977.
- Wikipedia. Sparse Matrix
- A. Fog. 2. Optimizing subroutines in assembly language: An optimization guide for x86 platforms, 1996.
MIT