- Introduction
- What is a sparse matrix?
- Creating a sparse matrix
- Adding and subtracting sparse matrices
- Multiplying sparse matrices
- Multiplying a matrix and a vector
- The dot function
- Summing over a dimension
- Wall time comparison
- Using smn within a Numba compiled function
- Conclusion
Introduction
Numba is a widely used just-in-time (JIT) compiler for Python that optimizes the execution of numerical computations by translating them to efficient machine code. It is popular for its ability to accelerate performance in various scientific and computational tasks. However, one limitation of Numba is the absence of a built-in implementation for sparse matrices (while many parts of SciPy have been ported to Numba, sparse matrices never were). Sparse matrices – meaning matrices where most of the elements are zero – are prevalent in several scientific fields such as graph theory, natural language processing, and computational biology.
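To illustrate, here is a minimal, self-contained example of standard Numba usage (the toy function row_sums below is our own, written purely for illustration):

import numba
import numpy as np

# A toy function: Numba compiles it to machine code on its first call,
# specialized to the argument types it receives.
@numba.njit
def row_sums(M):
    out = np.zeros(M.shape[0])
    for i in range(M.shape[0]):
        for j in range(M.shape[1]):
            out[i] += M[i, j]
    return out

row_sums(np.ones((3, 3)))  # the first call triggers the JIT compilation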
In what follows, we will introduce the smn (Sparse Matrices in Numba) library, which can be used in conjunction with Numba to handle sparse matrices – meaning it can be used inside functions compiled through Numba – providing an effective solution for scientific computing tasks that involve sparse data structures. The present document can be seen both as a brief lesson on sparse matrices and as documentation of smn, with mathematical details.
First, let's import smn, which can be found in my GitHub repository; for the code presented here to work, it suffices to have the file smn.py in the same directory. We will also import the standard numpy library for numerical analysis in Python, which will further be used to check the accuracy of the functions implemented in smn.
import smn
import numpy as np
import numba
What is a sparse matrix?
A sparse matrix is a data structure meant to represent matrices for which the vast majority of elements are zero. Instead of allocating memory for all elements in the matrix, these data structures store only the non-zero elements (and their corresponding indices). This can vastly conserve memory and allows for faster computations. For example, let us consider the following matrix

[latex]A = \begin{pmatrix} 0 & 2 & 0 & 0 & 0 & 0 \\ 1 & 0 & 2 & 0 & 0 & 0 \\ 0 & 1 & 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 0 & 2 & 0 \\ 0 & 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}[/latex]

This could instead be represented by the triad of one-dimensional arrays:
Array index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
lines | 0 | 1 | 1 | 2 | 2 | 3 | 3 | 4 | 4 | 5 |
columns | 1 | 0 | 2 | 1 | 3 | 2 | 4 | 3 | 5 | 4 |
values | 2.0 | 1.0 | 2.0 | 1.0 | 2.0 | 1.0 | 2.0 | 1.0 | 2.0 | 1.0 |
This sparse matrix representation is known as the "coordinate list" (COO) format. In the COO format, the non-zero elements of the matrix are represented by their corresponding row indices, column indices, and values. This is the way in which smn represents sparse matrices. Other formats, such as the "compressed sparse row" (CSR), "compressed sparse column" (CSC), and "dictionary of keys" (DOK), can be used instead, with higher efficiency in some specific applications, but for smn we opted for COO for its more general applicability and ease of understanding.
Given that matrix [latex]A[/latex] is 6×6, it has a total of 36 elements. In a dense representation, we would need to store all 36 elements, even though the majority of them are zero. On the other hand, the representation presented above only stores the non-zero elements (10 in this example) along with their corresponding row and column indices (30 numbers in total). While this is a modest reduction in memory usage, it becomes significantly more advantageous for larger matrices that exhibit sparsity: as the size of the matrix increases while the number of non-zero elements remains relatively small, the memory savings grow accordingly. In the present document we proceed using 6×6 matrices for pedagogical purposes, but the smn library is able to deal with arbitrarily large matrix dimensions.
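A back-of-the-envelope calculation makes the point concrete. Assuming 8 bytes per stored int64 index or float64 value (this is plain arithmetic, not smn code):

# Rough memory footprint of dense vs. COO storage, at 8 bytes per number.
def dense_bytes(n):
    return 8 * n * n    # an n-by-n dense matrix stores every element

def coo_bytes(nnz):
    return 3 * 8 * nnz  # a row index, a column index, and a value per non-zero

print(dense_bytes(6), coo_bytes(10))         # 288 vs 240 bytes for our example
print(dense_bytes(10**4), coo_bytes(10**4))  # 800 MB vs 240 kB with 10**4 non-zeros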
Creating a sparse matrix
The smn library has a specialized class called sparse_matrix, designed to store sparse matrices in the COO format. Our first example consists of generating a sparse matrix using the smn library. For this, one has to explicitly give the line, column, and value of each non-zero term in the matrix. One example that will be further useful in this document is presented below.

smn.sparse_matrix(lines, columns, values, shape)
Parameters:
lines – numpy.array of integers.
columns – numpy.array of integers.
values – numpy.array of floats.
shape – integer: the dimension of the matrix (if not specified, it is assumed to be the maximum element of lines); as of now, only square matrices of dimension (shape, shape) are implemented.
Returns: the equivalent sparse matrix as an object of the smn.sparse_matrix class.
lines = np.array((0,1,2,3,4,5,4,3,2,1),dtype=np.int64)
columns = np.array((1,2,3,4,5,4,3,2,1,0),dtype=np.int64)
values = np.array((2,2,2,2,2,1,1,1,1,1),dtype=np.float64)
A = smn.sparse_matrix(lines,columns,values)
This creates our sparse matrix. One can recover the lines, columns, and values arrays using the respective attributes:
A.lines
array([0, 1, 2, 3, 4, 5, 4, 3, 2, 1])
A.columns
array([1, 2, 3, 4, 5, 4, 3, 2, 1, 0])
A.values
array([2., 2., 2., 2., 2., 1., 1., 1., 1., 1.])
Sometimes, for ease of inspection, one may want to see the sparse matrix as a dense matrix. For this, we have the to_dense method
A.to_dense()
array([[0., 2., 0., 0., 0., 0.],
[1., 0., 2., 0., 0., 0.],
[0., 1., 0., 2., 0., 0.],
[0., 0., 1., 0., 2., 0.],
[0., 0., 0., 1., 0., 2.],
[0., 0., 0., 0., 1., 0.]])
which returns a 2-dimensional numpy array equivalent to the sparse matrix. Now, we proceed to show how the basic matrix operations are implemented in smn.
Adding and subtracting sparse matrices
In order to present how to add and subtract matrices within our library, let us first generate another sparse matrix, [latex]B[/latex],
lines = np.arange(6,dtype=np.int64)
columns = np.arange(5,-1,-1,dtype=np.int64)
values = np.ones(6,dtype=np.float64)
B = smn.sparse_matrix(lines,columns,values)
B.to_dense()
array([[0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1., 0.],
[0., 0., 0., 1., 0., 0.],
[0., 0., 1., 0., 0., 0.],
[0., 1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0., 0.]])
Then we can add the two sparse matrices, [latex]A[/latex] and [latex]B[/latex], using the usual + operator
C = A+B
C.to_dense()
array([[0., 2., 0., 0., 0., 1.],
[1., 0., 2., 0., 1., 0.],
[0., 1., 0., 3., 0., 0.],
[0., 0., 2., 0., 2., 0.],
[0., 1., 0., 1., 0., 2.],
[1., 0., 0., 0., 1., 0.]])
and, similarly, subtract matrices using the usual - operator
D = A-B
D.to_dense()
array([[ 0., 2., 0., 0., 0., -1.],
[ 1., 0., 2., 0., -1., 0.],
[ 0., 1., 0., 1., 0., 0.],
[ 0., 0., 0., 0., 2., 0.],
[ 0., -1., 0., 1., 0., 2.],
[-1., 0., 0., 0., 1., 0.]])
Here it is important to understand how smn adds or subtracts sparse matrices: it simply concatenates the arrays of values and indices. The result may therefore contain duplicated index pairs with their corresponding values. Let's look at the matrix [latex]D = A-B[/latex] as an example
D.lines,D.columns,D.values
(array([0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5]),
array([1, 2, 3, 4, 5, 4, 3, 2, 1, 0, 5, 4, 3, 2, 1, 0]),
array([ 2., 2., 2., 2., 2., 1., 1., 1., 1., 1., -1., -1., -1.,
-1., -1., -1.]))
Note that even though the element (3,2) of [latex]D[/latex] is 0, it still appears twice in the attributes of [latex]D[/latex]: once associated with the value 1 and once with the value -1, whose sum, 0, is the correct final value.
It is important to mention that such redundancy still leads to correct results when performing matrix operations, as the redundant elements contribute to the final result as they should.
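Conceptually, the addition amounts to something like the following sketch (an illustration of the idea, not smn's internal code):

# Sketch of COO addition/subtraction by concatenating the triads (illustrative).
import numpy as np

def coo_add(A_lines, A_columns, A_values, B_lines, B_columns, B_values, sign=1.0):
    lines = np.concatenate((A_lines, B_lines))
    columns = np.concatenate((A_columns, B_columns))
    values = np.concatenate((A_values, sign * B_values))  # sign=-1.0 subtracts
    return lines, columns, values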
Nevertheless, eliminating redundancy in sparse matrix representations can significantly optimize memory usage and computational performance, particularly for large matrices. As a result, we recommend minimizing the use of the + and - operators. However, if one still needs them and aims to optimize performance, we also provide the simplify method, which efficiently eliminates these redundancies.
D.simplify()
D.lines,D.columns,D.values
(array([0, 0, 1, 1, 1, 2, 2, 3, 4, 4, 4, 5, 5]),
array([1, 5, 0, 2, 4, 1, 3, 4, 1, 3, 5, 0, 4]),
array([ 2., -1., 1., 2., -1., 1., 1., 2., -1., 1., 2., -1., 1.]))
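For intuition, the elimination of redundancies can be sketched with numpy as follows (our own illustration, not smn's implementation):

# Sketch of simplification (illustrative): merge duplicate (line, column)
# pairs by summing their values, then drop entries that became exactly zero.
import numpy as np

def coo_simplify(lines, columns, values, shape):
    keys = lines * shape + columns                  # unique id per matrix position
    uniq, inverse = np.unique(keys, return_inverse=True)
    summed = np.zeros(len(uniq))
    np.add.at(summed, inverse, values)              # accumulate duplicated entries
    keep = summed != 0.0
    return uniq[keep] // shape, uniq[keep] % shape, summed[keep]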
For a more dramatic example, let's create the null matrix [latex]N = A-A[/latex].
N = A-A
N.to_dense()
array([[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.]])
Even though [latex]N[/latex] is a null matrix, it still occupies far more memory than necessary if the simplify method is not used, as can be seen below:
N.lines,N.columns,N.values
(array([0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5, 4, 3, 2, 1]),
array([1, 2, 3, 4, 5, 4, 3, 2, 1, 0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0]),
array([ 2., 2., 2., 2., 2., 1., 1., 1., 1., 1., -2., -2., -2.,
-2., -2., -1., -1., -1., -1., -1.]))
N.simplify()
N.lines,N.columns,N.values
(array([], dtype=int64), array([], dtype=int64), array([], dtype=float64))
After simplification, [latex]N[/latex] is an empty sparse matrix, as it has no non-zero elements.
Multiplying sparse matrices
Similarly to addition and subtraction, we can multiply sparse matrices using the usual * operator
H = A*B
H.to_dense()
array([[0., 0., 0., 0., 2., 0.],
[0., 0., 0., 2., 0., 1.],
[0., 0., 2., 0., 1., 0.],
[0., 2., 0., 1., 0., 0.],
[2., 0., 1., 0., 0., 0.],
[0., 1., 0., 0., 0., 0.]])
H.lines,H.columns,H.values
(array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]),
array([4, 3, 5, 2, 4, 1, 3, 0, 2, 1]),
array([2., 2., 1., 2., 1., 2., 1., 2., 1., 1.]))
which is consistent with the dot function in numpy
np.dot(A.to_dense(),B.to_dense())
array([[0., 0., 0., 0., 2., 0.],
[0., 0., 0., 2., 0., 1.],
[0., 0., 2., 0., 1., 0.],
[0., 2., 0., 1., 0., 0.],
[2., 0., 1., 0., 0., 0.],
[0., 1., 0., 0., 0., 0.]])
Note, however, that this is not interchangeable with the numpy.ndarray * operator, which multiplies the elements term by term
A.to_dense()*B.to_dense()
array([[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 2., 0., 0.],
[0., 0., 1., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.]])
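To make the mechanics of the sparse product concrete, here is a deliberately naive sketch of a COO matrix product, in which every stored element [latex]A_{ik}[/latex] is matched against every stored element [latex]B_{kj}[/latex] (an illustration with O(nnz_A · nnz_B) cost, not smn's actual algorithm):

# Naive COO matrix product sketch (illustrative only).
def coo_matmul(A_lines, A_columns, A_values, B_lines, B_columns, B_values):
    out = {}
    for i, k1, v1 in zip(A_lines, A_columns, A_values):
        for k2, j, v2 in zip(B_lines, B_columns, B_values):
            if k1 == k2:  # A[i,k] * B[k,j] contributes to C[i,j]
                out[(i, j)] = out.get((i, j), 0.0) + v1 * v2
    lines = [ij[0] for ij in out]
    columns = [ij[1] for ij in out]
    return lines, columns, list(out.values())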
Multiply matrix and vector
Although matrix-matrix multiplication is interesting, in many scientific applications the focus lies in multiplying matrices by vectors. This is especially common in physics, where vectors are typically represented as column matrices and multiplied on the right-hand side,

[latex](A v)_i = \sum_j A_{ij} v_j[/latex]
Here is where the sparse matrix format has a further opportunity to shine, since it does not need to compute the terms for which the matrix element is zero. Note the contrast with matrix-matrix products: multiplying two sparse matrices can introduce non-zero elements in previously empty positions, leading to a denser result. Meanwhile, when multiplying a sparse matrix by a (dense) vector, the result is again a vector of the same size, so the memory usage remains limited to the vector's size and the efficiency associated with sparse representations is preserved.
Because of this, for better performance it is recommended to avoid direct matrix-matrix multiplications and opt for successive matrix-times-vector products whenever possible.
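The core of such a product can be sketched in a few lines (an illustration of the idea, not smn's code):

# Sketch of a COO matrix-vector product (illustrative): only the stored
# non-zero elements are visited, instead of all shape**2 terms.
import numpy as np

def coo_times_vector(lines, columns, values, v):
    res = np.zeros(len(v))
    for i, j, a in zip(lines, columns, values):
        res[i] += a * v[j]  # (A v)_i = sum_j A_ij v_j
    return res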
In smn, the multiplication of a sparse matrix times a (column) vector as above is implemented in the function sm_times_array
v = np.array((.25,.5,.75,1.,1.25,1.5))
smn.sm_times_array(A,v)
array([1. , 1.75, 2.5 , 3.25, 4. , 1.25])
which is equivalent to the numpy dot function for numpy arrays
np.dot(A.to_dense(),v)
array([1. , 1.75, 2.5 , 3.25, 4. , 1.25])
Similarly, the multiplication of a (row) vector, represented by an array, times a matrix, as is often preferred in some fields such as stochastic processes, is implemented in the function array_times_sm
smn.array_times_sm(v,A)
array([0.5 , 1.25, 2. , 2.75, 3.5 , 2.5 ])
which is equivalent to the numpy version.
np.dot(v,A.to_dense())
array([0.5 , 1.25, 2. , 2.75, 3.5 , 2.5 ])
The dot function
Both sm_times_array and array_times_sm can be used within Numba. However, in order to provide a syntax familiar to numpy users, we also implemented an equivalent dot function that handles, at the same time, matrix multiplication
I = smn.dot(A,B)
I.lines,I.columns,I.values
(array([0, 1, 1, 2, 2, 3, 3, 4, 4, 5]),
array([4, 3, 5, 2, 4, 1, 3, 0, 2, 1]),
array([2., 2., 1., 2., 1., 2., 1., 2., 1., 1.]))
multiplication by a column vector
smn.dot(A,v)
array([1. , 1.75, 2.5 , 3.25, 4. , 1.25])
multiplication by a row vector
smn.dot(v,A)
array([0.5 , 1.25, 2. , 2.75, 3.5 , 2.5 ])
and the regular dot product of numpy arrays.
smn.dot(v,v)
5.6875
Note that since dot returns results of different data types depending on its inputs, it is not compiled with Numba and, because of this, cannot be used inside a Numba compiled function. If this is necessary, we recommend using the product operator *, sm_times_array, array_times_sm, and numpy.dot, respectively.
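The type-dependent behavior can be pictured as a plain-Python dispatch along these lines (an illustration of why the return type varies, not smn's source):

# Sketch of the dispatch behind a dot-like wrapper (illustrative only).
def dot_sketch(x, y):
    x_sparse = isinstance(x, smn.sparse_matrix)
    y_sparse = isinstance(y, smn.sparse_matrix)
    if x_sparse and y_sparse:
        return x * y                     # sparse times sparse -> sparse matrix
    if x_sparse:
        return smn.sm_times_array(x, y)  # matrix times column vector -> array
    if y_sparse:
        return smn.array_times_sm(x, y)  # row vector times matrix -> array
    return np.dot(x, y)                  # plain numpy arrays -> numpy result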
Summing over a dimension
Now, let us implement a function that sums the elements of a matrix line by line, meaning [latex]f[/latex] such that

[latex]f_i(A) = \sum_j A_{ij},[/latex]

which is implemented in smn through the method line_sum
A.line_sum()
array([2., 3., 3., 3., 3., 1.])
which is equivalent to the numpy function sum with the parameter axis=1 for dense matrices
np.sum(A.to_dense(),axis=1)
array([2., 3., 3., 3., 3., 1.])
and the column sum, meaning [latex]g[/latex] such that

[latex]g_j(A) = \sum_i A_{ij},[/latex]

implemented in smn as column_sum
A.column_sum()
array([1., 3., 3., 3., 3., 2.])
equivalent to the numpy function sum with the parameter axis=0
np.sum(A.to_dense(),axis=0)
array([1., 3., 3., 3., 3., 2.])
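Both reductions are simple accumulations over the stored triad; a minimal numpy sketch (illustrative, not smn's code) would be:

# Sketch of COO line and column sums (illustrative): each stored value is
# accumulated into the bin of its line (or column) index.
import numpy as np

def coo_line_sum(lines, values, shape):
    res = np.zeros(shape)
    np.add.at(res, lines, values)  # res[lines[k]] += values[k]
    return res

def coo_column_sum(columns, values, shape):
    res = np.zeros(shape)
    np.add.at(res, columns, values)
    return res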
Now, in a manner similar to the dot function implemented above, we also implemented the sum function for greater similarity with numpy
smn.sum(A,axis=1) #line
array([2., 3., 3., 3., 3., 1.])
smn.sum(A,axis=0) #column
array([1., 3., 3., 3., 3., 2.])
and when the parameter axis is not specified, it gives the sum of all elements, as in numpy
smn.sum(A)
15.0
np.sum(A.to_dense())
15.0
Please note that, as happened with dot, smn.sum is not compiled with Numba.
Wall time comparison
It is reasonable to anticipate that algorithms implemented using sparse matrices would exhibit greater efficiency than their equivalent dense counterparts, as the memory requirements are significantly reduced. However, to validate this assumption, it is essential to run benchmarks measuring and comparing the actual performance of algorithms using sparse and dense matrices. For this, we shall first create a large sparse matrix
N = 1000 #size of the matrix, change for further testing
lines_1 = np.arange(N,dtype=np.int64)
columns_1 = np.arange(N,dtype=np.int64)
values_1 = np.random.rand(N) #diagonal matrix with random elements
lines_2 = np.arange(N,dtype=np.int64)
columns_2 = np.random.randint(N,size=N).astype(np.int64) #one extra element per line, in a random column
values_2 = np.random.rand(N)
Z = smn.sparse_matrix(lines_1,columns_1,values_1)+smn.sparse_matrix(lines_2,columns_2,values_2)
Z_dense = Z.to_dense()
Z_dense
array([[0.28965617, 0. , 0. , ..., 0. , 0. ,
0. ],
[0. , 0.04267883, 0. , ..., 0. , 0. ,
0. ],
[0. , 0. , 0.14100828, ..., 0. , 0. ,
0. ],
...,
[0. , 0. , 0. , ..., 0.91822673, 0. ,
0. ],
[0. , 0. , 0. , ..., 0. , 0.33303376,
0. ],
[0. , 0. , 0. , ..., 0. , 0. ,
0.66147908]])
In order to compare the execution time of matrix products, we shall use the timeit machinery (here, through IPython's %timeit magic)

import timeit

allowing us to verify that the time spent to calculate a matrix product using an smn sparse matrix is considerably smaller than the numpy product with the equivalent dense matrix.
%timeit (Z*Z)
3.44 ms ± 49.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Z_dense = Z.to_dense()
%timeit np.dot(Z_dense,Z_dense)
15.8 ms ± 841 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
The time gained is even more noticeable when doing matrix-times-vector operations (when compared to the dense numpy version)
rho = np.random.rand(N)
%timeit smn.dot(rho,Z)
4.71 µs ± 46.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
%timeit np.dot(rho,Z_dense)
141 µs ± 9.32 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
and when summing by dimension
%timeit smn.sum(Z,axis=0)
2.83 µs ± 35.8 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
%timeit np.sum(Z_dense,axis=0)
375 µs ± 19.8 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Using smn within a Numba compiled function
While the previous examples show that the sparse matrix structure makes matrix operations more efficient, that is not the main reason why smn is useful. Rather, the reason that led to smn's development is that objects of the smn.sparse_matrix class can be used within Numba compiled functions.
As an example, let us analyze a function with three parameters: [latex]\rho[/latex], a row vector; [latex]A[/latex], a (sparse) matrix; and [latex]k[/latex], a positive integer. The goal is to calculate [latex]\rho A^k[/latex]. Using smn, this function can be written as
@numba.njit
def evolve_smnandnumba(rho,A,k):
    res = rho
    for i in range(k):
        res = smn.array_times_sm(res,A)  # after iteration i, res holds rho A^(i+1)
    return res
The @numba.njit decorator above instructs Numba to compile the decorated function using just-in-time (JIT) compilation. The first time the function is run, Numba analyzes its code and generates machine code based on the specific data types encountered during execution. The JIT compilation occurs in the background, and subsequent calls use the compiled version.
With that in mind, we can benchmark the execution time of the JIT-compiled code,
void = evolve_smnandnumba(rho,Z,50) #Forces JIT compilation
%timeit evolve_smnandnumba(rho,Z,50)
150 µs ± 969 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
and compare it to another version that also uses the sparse matrix structure, but is not compiled
def evolve_smn(rho,A,k):
    res = rho*1.0
    for i in range(k):
        res = smn.array_times_sm(res,A)
    return res
%timeit evolve_smn(rho,Z,50)
225 µs ± 7.69 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
The reduction is even more dramatic when compared to compiled code using a regular dense numpy array
@numba.njit
def evolve_numpy(rho,A,k):
    res = rho*1.0
    for i in range(k):
        res = np.dot(res,A)
    return res
void = evolve_numpy(rho,Z_dense,50)
%timeit evolve_numpy(rho,Z_dense,50)
6.09 ms ± 53.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Conclusion
The present post introduces the world of sparse matrices, which offers a plethora of possibilities for optimizing data processing and numerical analysis. With a deeper understanding of their underlying concepts and advantages, it becomes evident that sparse matrices are indispensable for handling large datasets and dynamical processes efficiently.
The smn library presented here enables operations on sparse matrices within Numba compiled functions. By leveraging smn, users can have the advantages of compiled code while still using Python syntax and programming tools.