In this Python tutorial, we will learn about “Scipy Sparse” where we will cover how to create a sparse matrix. Additionally, we will cover the following topics.
- Scipy Sparse rand
- Scipy Sparse linalg
- Scipy Sparse CSR matrix
- Scipy Sparse matrix to NumPy array
- Scipy Sparse hstack
- Scipy Sparse coo matrix
- Scipy Sparse eigsh
- Scipy Sparse to dense
- Scipy Sparse matrix from pandas dataframe
What is Scipy Sparse
A sparse matrix is a matrix in which most of the elements are zero. SciPy provides the module scipy.sparse to deal with sparse data or matrices. It offers several storage formats, including CSR (Compressed Sparse Row), CSC (Compressed Sparse Column), and COO (coordinate format). This tutorial works mainly with CSR and COO.
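As a small illustration of the idea above, the following sketch stores the same mostly-zero array in CSR and CSC form. The array and the variable names are made up for this example.

```python
import numpy as np
from scipy.sparse import csr_matrix, csc_matrix

# A 4 x 5 dense array in which only 3 of the 20 elements are non-zero.
dense = np.zeros((4, 5))
dense[0, 1] = 3.0
dense[2, 4] = 7.0
dense[3, 0] = 1.5

csr = csr_matrix(dense)  # Compressed Sparse Row
csc = csc_matrix(dense)  # Compressed Sparse Column

# Only the non-zero elements are stored explicitly.
print(csr.format, csr.nnz)
print(csc.format, csc.nnz)

# Both formats describe the same matrix.
same = np.array_equal(csr.toarray(), csc.toarray())
print(same)
```

The attribute nnz reports the number of stored elements, which is what makes the sparse formats cheaper than a dense array when most entries are zero.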
Scipy Sparse Rand
The scipy.sparse package contains a function rand() to generate a sparse matrix of uniformly distributed random values with a given shape and density.
The syntax to create a sparse matrix using the function rand() is given below.
scipy.sparse.rand(m, n, density=0.01, format='coo', dtype=None, random_state=None)
Where parameters are:
- m, n: The shape of the matrix. For example, to build a matrix of shape 2 by 3, m and n are 2 and 3 respectively.
- density: The fraction of elements that are non-zero. A density of 1 gives a full matrix, and a density of 0 gives a matrix with no non-zero items.
- format: The sparse format of the returned matrix, such as "coo" or "csr".
- dtype: The data type of the returned matrix values.
- random_state: A seed or random number generator used to make the generated values reproducible.
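To make the density and random_state parameters concrete, here is a minimal sketch. The shapes, seed, and variable names are chosen only for illustration.

```python
import numpy as np
from scipy.sparse import rand

# 20% of the 100 elements of a 10 x 10 matrix are stored as non-zero.
sparse_20 = rand(10, 10, density=0.2, format="csr", random_state=0)

# density=1.0 fills every element, density=0.0 stores nothing.
full = rand(3, 3, density=1.0, random_state=0)
empty = rand(3, 3, density=0.0, random_state=0)
print(sparse_20.nnz, full.nnz, empty.nnz)

# The same random_state reproduces the same matrix.
again = rand(10, 10, density=0.2, format="csr", random_state=0)
reproducible = np.array_equal(sparse_20.toarray(), again.toarray())
print(reproducible)
```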
In the demonstration below, we generate a sparse matrix using the function rand().
Import the function rand() using the below code.
from scipy.sparse import rand
Create a matrix by specifying a shape of 4 by 3 with density=0.30, format="csr" and random_state=40 using the below code.
matrix_data = rand(4, 3, density=0.30, format="csr", random_state=40)
Check the matrix data type and its format.
matrix_data
Now, check the elements of the created matrix using the method toarray() on that matrix.
matrix_data.toarray()
Scipy Sparse linalg
In SciPy, the subpackage scipy.sparse has a module linalg to deal with sparse linear algebra problems. Its functions fall into several categories, some of which are given below.
Abstract linear operators
It has two functions:
- LinearOperator(*args, **kwargs): It is a common interface for performing matrix-vector products.
- aslinearoperator(A): It returns A as a LinearOperator.
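The two functions above can be sketched as follows. The helper function double and the matrix A are hypothetical and exist only for this example.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import LinearOperator, aslinearoperator

# A hypothetical operator that doubles every vector, defined only by
# its matrix-vector product. No matrix is ever stored.
def double(v):
    return 2.0 * v

op = LinearOperator((3, 3), matvec=double, dtype=float)
x = np.array([1.0, 2.0, 3.0])
y = op.matvec(x)
print(y)

# An existing sparse matrix can be wrapped in the same interface.
A = csr_matrix([[1.0, 0.0, 0.0],
                [0.0, 0.0, 2.0],
                [0.0, 3.0, 0.0]])
A_op = aslinearoperator(A)
z = A_op.matvec(x)
print(z)
```

Iterative solvers and eigenvalue routines in scipy.sparse.linalg accept such operators, so a matrix that is too large to store can still be used through its matvec function.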
Matrix norms
It also has two functions to calculate the norm of a matrix.
- norm(x[, ord, axis]): It returns the norm of a given sparse matrix.
- onenormest(A[, t, itmax, compute_v, compute_w]): It computes a lower bound of the 1-norm of a sparse matrix.
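A short sketch of both functions on a small matrix whose norms are easy to check by hand. The matrix is made up for this example.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import norm, onenormest

A = csr_matrix([[1.0, -2.0],
                [3.0, 4.0]])

fro = norm(A)              # Frobenius norm (default): sqrt(1 + 4 + 9 + 16)
one = norm(A, 1)           # 1-norm: largest absolute column sum
inf = norm(A, np.inf)      # infinity norm: largest absolute row sum
print(fro, one, inf)

# onenormest() returns an estimate that never exceeds the true 1-norm.
est = onenormest(A)
print(est)
```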
Matrix Operations
These functions calculate the inverse and the exponential of a given sparse matrix. There are three of them.
- inv(A): It is used to calculate the inverse of a given sparse matrix.
- expm(A): It is used to calculate the matrix exponential with the help of the Pade approximation.
- expm_multiply(A, B[, start, stop, num, endpoint]): It is used to calculate the action of the matrix exponential of A on B.
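The three functions can be tried on a diagonal matrix, where the expected results are known in closed form. This is only a sketch with an illustrative matrix. It uses the CSC format because the SciPy documentation recommends CSC or CSR input for these routines.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import inv, expm, expm_multiply

A = csc_matrix([[2.0, 0.0],
                [0.0, 4.0]])

# Inverse of a diagonal matrix: the reciprocals of the diagonal.
A_inv = inv(A)
print(A_inv.toarray())

# Matrix exponential of a diagonal matrix: exp() of the diagonal.
A_exp = expm(A)
print(A_exp.toarray())

# Action of exp(A) on a vector, without forming exp(A) explicitly.
b = np.array([1.0, 1.0])
A_exp_b = expm_multiply(A, b)
print(A_exp_b)
```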
Solving linear problems
It has direct methods to solve linear problems. There are many functions, and here we cover two of them.
- spsolve(A, b[, permc_spec, use_umfpack]): It is used to find the solution of the sparse linear system Ax=b, where b represents a vector or a matrix.
- spsolve_triangular(A, b[, lower, …]): It is used to solve the equation A x = b for x, where A is a triangular matrix.
There are other functions available in the official documentation of “scipy.sparse.linalg”.
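A minimal sketch of the two solvers described above. The matrices and right-hand sides are invented for this example and are small enough to verify by hand.

```python
import numpy as np
from scipy.sparse import csc_matrix, csr_matrix
from scipy.sparse.linalg import spsolve, spsolve_triangular

# General sparse system A x = b.
A = csc_matrix([[3.0, 2.0, 0.0],
                [1.0, -1.0, 0.0],
                [0.0, 5.0, 1.0]])
b = np.array([2.0, 4.0, -1.0])
x = spsolve(A, b)
print(x)

# Lower triangular system L y = c.
L = csr_matrix([[3.0, 0.0, 0.0],
                [1.0, -1.0, 0.0],
                [2.0, 0.0, 1.0]])
c = np.array([3.0, 0.0, 3.0])
y = spsolve_triangular(L, c, lower=True)
print(y)
```

Multiplying back, A @ x should reproduce b and L @ y should reproduce c, which is a quick way to check any solution.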
Scipy Sparse CSR matrix
csr stands for Compressed Sparse Row, so we can create a csr matrix using the function csr_matrix() in the subpackage scipy.sparse of SciPy.
If we want a matrix on which we can perform addition, subtraction, multiplication, matrix power, and division, then the csr matrix is suitable for that.
The csr matrix can be created in many ways as shown below.
csr_matrix(D): Using a rank-2 ndarray or dense matrix D.
Import the necessary libraries using the below code.
import numpy as np
from scipy.sparse import csr_matrix
Create a rank-2 matrix using the below code.
D = np.array([[1, 0, 1, 0, 0, 0], [2, 0, 2, 0, 0, 1],\
[0, 0, 0, 2, 0, 1]])
Check the created matrix using the below code.
print(D)
Pass the created matrix to the function csr_matrix() to create a csr matrix using the below code.
# Creating csr matrix
csr_m = csr_matrix(D)
csr_matrix(S): Create a new csr matrix from another already created sparse matrix. Use the matrix that we created above, named csr_m.
csr_mat = csr_matrix(csr_m)
Check the data type and other information related to the matrix.
csr_mat
View the created matrix using the below code.
csr_mat.toarray()
csr_matrix((M, N), [dtype]): It is used to create an empty matrix by specifying the shape M and N, with an optional dtype.
Create a csr matrix using the below code.
csr_matrix((5, 4), dtype=np.int8).toarray()
csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)]): It is used to construct a matrix where data, row_ind and col_ind satisfy the relationship a[row_ind[k], col_ind[k]] = data[k].
Create a csr matrix using the below code.
import numpy as np
from scipy.sparse import csr_matrix
row_idx = np.array([0, 1, 0, 1, 2, 2])
col_idx = np.array([0, 1, 1, 0, 1, 2])
data = np.array([2, 3, 4, 5, 6, 7])
csr_matrix((data, (row_idx,col_idx)), shape=(4, 4)).toarray()
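One detail of this constructor is worth a quick sketch: when the same (row, column) pair appears more than once, SciPy sums the values. The arrays below are chosen only to show this behaviour.

```python
import numpy as np
from scipy.sparse import csr_matrix

# The position (0, 0) appears twice, with the values 1 and 2.
row_idx = np.array([0, 0, 1])
col_idx = np.array([0, 0, 2])
data = np.array([1, 2, 5])

summed = csr_matrix((data, (row_idx, col_idx)), shape=(2, 3))

# The duplicate entries are added together: element (0, 0) becomes 3.
print(summed.toarray())
```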
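csr_matrix((data, indices, indptr), [shape=(M, N)]): This is the standard CSR representation, where the column indices for row i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]].
import numpy as np
from scipy.sparse import csr_matrix
# indptr has M + 1 non-decreasing entries and ends at len(data)
indptr_a = np.array([0, 2, 3, 6])
# every column index must be smaller than the number of columns
indices_b = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
csr_matrix((data, indices_b, indptr_a), shape=(3, 3)).toarray()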
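To see how the three arrays of this constructor fit together, the sketch below decodes them by hand and compares the result with SciPy. It assumes a valid layout: indptr has one more entry than the number of rows, never decreases, and ends at len(data). The column indices of row i are indices[indptr[i]:indptr[i+1]].

```python
import numpy as np
from scipy.sparse import csr_matrix

indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])

from_scipy = csr_matrix((data, indices, indptr), shape=(3, 3)).toarray()

# Decode the same arrays manually, one row at a time.
by_hand = np.zeros((3, 3), dtype=data.dtype)
for i in range(3):
    start, stop = indptr[i], indptr[i + 1]
    by_hand[i, indices[start:stop]] = data[start:stop]

print(by_hand)
print(np.array_equal(by_hand, from_scipy))
```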
Scipy Sparse matrix to NumPy array
A NumPy array (ndarray) is a dense representation of a matrix, so here we will take a csr matrix and convert it into a dense ndarray using the method toarray().
The syntax is given below.
csr_matrix.toarray(order=None, out=None)
where parameters are:
- order: It specifies whether to store the multi-dimensional data in row-major (C) or column-major (F) order. By default, it is None, which means no particular order is guaranteed.
- out: It is an optional output buffer (numpy.ndarray) in which to store the result, instead of creating a new array while returning the result.
Let’s take an example using the below steps:
Import the necessary libraries using the below code.
import numpy as np
from scipy.sparse import csr_matrix
Create the arrays using the below code.
row_data = np.array([0, 0, 2, 2, 1, 1, 4, 4, 5, 5, 1, 1, 2])
col_data = np.array([0, 3, 1, 0, 6, 1, 6, 3, 5, 3, 1, 4, 3])
array_data = np.array([1]*len(row_data))
Create the csr matrix using the below code.
csr_mat = csr_matrix((array_data, (row_data, col_data)), shape=(7, 7))
Check the data type and stored elements within the matrix using the below code.
csr_mat
Convert the csr matrix to a NumPy array by applying the method toarray() on the matrix csr_mat using the below code.
csr_to_array = csr_mat.toarray()
Check the elements of the dense array csr_to_array using the below code.
csr_to_array
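The order and out parameters of toarray() can be explored with a small sketch like the one below. The matrix and the name buffer are illustrative. Note that the output buffer is assumed to have the same shape and dtype as the sparse matrix.

```python
import numpy as np
from scipy.sparse import csr_matrix

m = csr_matrix(np.array([[1.0, 0.0, 2.0],
                         [0.0, 3.0, 0.0]]))

# Same values, different memory layout.
c_arr = m.toarray(order="C")   # row-major
f_arr = m.toarray(order="F")   # column-major
print(c_arr.flags["C_CONTIGUOUS"], f_arr.flags["F_CONTIGUOUS"])

# Write the result into an existing array instead of allocating a new one.
buffer = np.zeros((2, 3), dtype=m.dtype)
m.toarray(out=buffer)
print(buffer)
```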
Scipy Sparse hstack
To stack sparse matrices column-wise (horizontally), scipy.sparse has the function hstack().
The syntax is given below.
scipy.sparse.hstack(blocks, format=None, dtype=None)
Where parameters are:
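- blocks: It is the sequence of sparse matrices, with compatible shapes, that we want to stack.
- format: It is used to specify the sparse format of the result, such as "csr". By default, it is None, in which case an appropriate sparse format is returned.
- dtype: It is the data type of the returned matrix.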
Run the below steps to create a horizontal stack matrix.
Import the required libraries using the below code.
from scipy.sparse import hstack, coo_matrix
Create two sparse matrices and pass these two matrices to the function hstack().
first_mat = coo_matrix([[2, 3], [4, 5]])
second_mat = coo_matrix([[6], [7]])
hstack([first_mat,second_mat]).toarray()
Notice in the output how the two matrices are stacked horizontally.
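The format and dtype parameters can be combined with the same two matrices. This is a brief sketch, and the variable names are chosen for this example.

```python
import numpy as np
from scipy.sparse import coo_matrix, hstack

left = coo_matrix([[2, 3], [4, 5]])   # shape (2, 2)
right = coo_matrix([[6], [7]])        # shape (2, 1)

# Both blocks have 2 rows, so they can be placed side by side.
stacked = hstack([left, right], format="csr", dtype=float)

print(stacked.format, stacked.shape, stacked.dtype)
print(stacked.toarray())
```

The column counts add up (2 + 1 = 3), while the row count must be the same for every block.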
Scipy Sparse coo matrix
In SciPy, the subpackage scipy.sparse contains the function coo_matrix() to generate a new sparse matrix in coordinate format.
The coo matrix can be created in many ways as shown below.
- coo_matrix(D): Using a rank-2 ndarray or dense matrix D.
Import the necessary libraries using the below code.
import numpy as np
from scipy.sparse import coo_matrix
Create a rank-2 matrix using the below code.
D = np.array([[1, 0, 1, 0, 0, 0], [2, 0, 2, 0, 0, 1],\
[0, 0, 0, 2, 0, 1]])
Check the created matrix using the below code.
print(D)
Pass the created matrix to the function coo_matrix() to create a coo matrix using the below code.
# Creating coo matrix
coo_m = coo_matrix(D)
- coo_matrix(S): Create a new coo matrix from another already created sparse matrix. Use the matrix that we created above, named coo_m.
coo_mat = coo_matrix(coo_m)
Check the data type and other information related to the matrix.
coo_mat
View the created matrix using the below code.
coo_mat.toarray()
- coo_matrix((M, N), [dtype]): It is used to create an empty matrix by specifying the shape M and N, with an optional dtype.
Create a coo matrix using the below code.
coo_matrix((4, 6), dtype=np.int8).toarray()
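The name coordinate format refers to how the matrix is stored: one (row, column, value) triple per non-zero element. The sketch below inspects the row, col and data attributes of a small coo matrix. The array D_small is made up for this example.

```python
import numpy as np
from scipy.sparse import coo_matrix

D_small = np.array([[0, 5, 0],
                    [7, 0, 9]])
coo = coo_matrix(D_small)

# One triple per non-zero element.
triples = set(zip(coo.row.tolist(), coo.col.tolist(), coo.data.tolist()))
print(triples)

# The triples are enough to rebuild the dense array.
rebuilt = np.zeros(coo.shape, dtype=coo.dtype)
rebuilt[coo.row, coo.col] = coo.data
print(rebuilt)
```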
Scipy Sparse eigsh
To find the eigenvalues and eigenvectors of a given symmetric square matrix, the function eigsh() is used, which exists within the subpackage scipy.sparse.linalg.
The syntax is given below.
scipy.sparse.linalg.eigsh(A, k=6)
where parameters are:
- A: It accepts an ndarray, a LinearOperator, or a sparse matrix. It is a square operator representing the operation (A * x), where A is a real symmetric or complex Hermitian matrix.
- k: It is used to specify the number of eigenvalues and eigenvectors we want from that matrix. It must be smaller than the size of the matrix.
Let’s take an example using the below steps.
Import the function eigsh() using the below code.
from scipy.sparse.linalg import eigsh
import numpy as np
Create an identity matrix using the function np.eye().
identity_data = np.eye(15)
Find the eigenvalues and eigenvectors of the created matrix using the below code.
eigenval, eigenvect = eigsh(identity_data, k=8)
Check the eigenvalues using the below code.
eigenval
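Every eigenvalue of an identity matrix is 1, so a matrix with distinct eigenvalues makes the result easier to interpret. The following sketch uses a diagonal matrix, whose eigenvalues are its diagonal entries, and asks for the three largest with which="LA". The matrix and names are illustrative.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import eigsh

# Symmetric 10 x 10 matrix with the eigenvalues 1, 2, ..., 10.
A = csr_matrix(np.diag(np.arange(1.0, 11.0)))

# which="LA" requests the largest algebraic eigenvalues.
vals, vecs = eigsh(A, k=3, which="LA")
print(np.sort(vals))

# Each column of vecs is an eigenvector: A v = lambda v.
residual = np.abs(A @ vecs - vecs * vals).max()
print(residual)
```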
Scipy Sparse to dense
A dense matrix stores every element explicitly, whether it is zero or not, so here we will take a csr matrix and convert it into a dense matrix using the method todense().
The syntax is given below.
csr_matrix.todense(order=None, out=None)
where parameters are:
- order: It specifies whether to store the multi-dimensional data in row-major (C) or column-major (F) order. By default, it is None, which means no particular order is guaranteed.
- out: It is an optional output buffer in which to store the result, instead of creating a new array while returning the result. For a csr_matrix, the result is returned as a numpy.matrix.
Let’s take an example using the below steps:
Import the necessary libraries using the below code.
import numpy as np
from scipy.sparse import csr_matrix
Create the arrays using the below code.
row_data = np.array([0, 0, 2, 2, 1, 1, 4, 4, 5, 5, 1, 1, 2])
col_data = np.array([0, 3, 1, 0, 6, 1, 6, 3, 5, 3, 1, 4, 3])
array_data = np.array([1]*len(row_data))
Create the csr matrix using the below code. The largest column index is 6, so the matrix needs at least 7 columns.
csr_mat = csr_matrix((array_data, (row_data, col_data)), shape=(7, 7))
Check the data type and stored elements within the matrix using the below code.
csr_mat
Convert the csr matrix to the dense matrix by applying the method todense() on the matrix csr_mat using the below code.
csr_to_dense = csr_mat.todense()
Check the elements of the dense matrix csr_to_dense using the below code.
csr_to_dense
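The practical difference between todense() and toarray() is the type of the result. For a csr_matrix, todense() gives a numpy.matrix and toarray() gives a plain ndarray, and the two treat the * operator differently. The matrix below is made up to show this.

```python
import numpy as np
from scipy.sparse import csr_matrix

m = csr_matrix([[1, 2],
                [0, 3]])

as_matrix = m.todense()   # numpy.matrix
as_array = m.toarray()    # numpy.ndarray
print(type(as_matrix).__name__, type(as_array).__name__)

# On numpy.matrix, * is matrix multiplication.
mat_prod = as_matrix * as_matrix
# On ndarray, * is element-wise multiplication.
elem_prod = as_array * as_array
print(mat_prod)
print(elem_prod)
```

If the result is passed on to code that expects ordinary NumPy arrays, toarray() is usually the safer choice.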
Scipy Sparse matrix from pandas dataframe
Here we will create a sparse matrix from a pandas dataframe using the function csr_matrix().
First, create a new dataframe using the below code.
# importing pandas as pd
import pandas as pd
# Creating the DataFrame
dataf = pd.DataFrame({'Marks':[50, 90, 60, 20, 80],
'Age':[14, 25, 55, 8, 21]})
# Create the index
index_ = pd.Series(range(1,6))
# Set the index
dataf.index = index_
# Print the DataFrame
print(dataf)
Now convert the above-created dataframe into a sparse matrix using the below code.
# Importing the function csr_matrix()
from scipy.sparse import csr_matrix
# Creating the sparse matrix from the values of dataf.
# dataf.values.T is the transposed array, so each dataframe column becomes a row.
sparse_matrix_from_dataframe = csr_matrix(dataf.values.T)
# viewing the elements of created sparse matrix
sparse_matrix_from_dataframe.toarray()
So, in this tutorial, we have learned the “Scipy Sparse” and covered the following topics.
- Scipy Sparse rand
- Scipy Sparse linalg
- Scipy Sparse CSR matrix
- Scipy Sparse matrix to NumPy array
- Scipy Sparse hstack
- Scipy Sparse coo matrix
- Scipy Sparse eigsh
- Scipy Sparse to dense
- Scipy Sparse matrix from pandas dataframe
I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I started working on Python, Machine learning, and artificial intelligence for the last 5 years. During this time I got expertise in various Python libraries also like Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, Tensorflow, Scipy, Scikit-Learn, etc… for various clients in the United States, Canada, the United Kingdom, Australia, New Zealand, etc. Check out my profile.