Scipy Sparse – Helpful Tutorial

In this Python tutorial, we will learn about “Scipy Sparse” where we will cover how to create a sparse matrix. Additionally, we will cover the following topics.

  • Scipy Sparse rand
  • Scipy Sparse linalg
  • Scipy Sparse CSR matrix
  • Scipy Sparse matrix to NumPy array
  • Scipy Sparse hstack
  • Scipy Sparse coo matrix
  • Scipy Sparse eigsh
  • Scipy Sparse to dense
  • Scipy Sparse matrix from pandas dataframe

What is Scipy Sparse

A sparse matrix is a matrix in which most of the elements are zero. SciPy provides the module scipy.sparse to store and work with such data. The module supports several storage formats; two of the most commonly used are CSR (Compressed Sparse Row) and CSC (Compressed Sparse Column), and this tutorial also uses the COO (coordinate) format.
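To illustrate why a sparse format helps, the following minimal sketch compares the memory used by a dense array with the memory used by the three arrays behind a CSR matrix. The 1000 by 1000 size and the variable names are illustrative assumptions, not part of the original tutorial.

```python
import numpy as np
from scipy.sparse import csr_matrix

# A 1000 x 1000 dense array that is zero everywhere except on the main diagonal.
dense = np.zeros((1000, 1000))
np.fill_diagonal(dense, 1.0)

sparse = csr_matrix(dense)

# Bytes used by the dense array versus the three arrays that back the CSR matrix.
dense_bytes = dense.nbytes
sparse_bytes = sparse.data.nbytes + sparse.indices.nbytes + sparse.indptr.nbytes

print(sparse.nnz)                  # number of stored entries
print(dense_bytes, sparse_bytes)   # the sparse representation is far smaller
```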

Scipy Sparse Rand

The scipy.sparse package contains a function rand() to generate a matrix containing uniformly distributed values by specifying shape and density.

The syntax to create a sparse matrix using the function rand() is given below.

scipy.sparse.rand(m, n, density=0.01, format='coo', dtype=None, random_state=None)

Where parameters are:

  • m, n: They define the shape of the matrix. For example, to build a matrix of shape 2 by 3, m and n are 2 and 3 respectively.
  • density: It is used to specify the density of the matrix that we want to generate. A density of 1 means the matrix is full, and a density of 0 means the matrix contains no non-zero items.
  • format: It is used to specify the format of the matrix.
  • dtype: It is used to define the data type of the returned matrix values.
  • random_state: It is used to generate the reproducible values.
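As a rough illustration of the density parameter, the sketch below builds three 4 by 3 matrices and prints how many entries each one stores. The seed value 0 and the variable names are arbitrary choices made for this example.

```python
from scipy.sparse import rand

# Roughly density * m * n entries are stored, e.g. 0.5 * 4 * 3 = 6.
empty = rand(4, 3, density=0.0, format="csr", random_state=0)
half_full = rand(4, 3, density=0.5, format="csr", random_state=0)
full = rand(4, 3, density=1.0, format="csr", random_state=0)

# nnz is the number of stored entries in each matrix.
print(empty.nnz, half_full.nnz, full.nnz)
```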

In the below demonstration, we are going to generate the sparse matrix using the function rand().

Import the function rand() using the below code.

from scipy.sparse import rand

Create a 4 by 3 matrix with density=0.30, format="csr" and random_state=40 using the below code.

matrix_data = rand(4, 3, density=0.30, format="csr", random_state=40)

Check the matrix data type and its format.

matrix_data

Now, check the elements of a created matrix using the function toarray() on that matrix.

matrix_data.toarray()


Scipy Sparse linalg

In Scipy, the subpackage scipy.sparse has a module linalg for sparse linear algebra problems. Its functions fall into several categories, some of which are given below.

Abstract linear operators

It has two members:

  • LinearOperator(*args, **kwargs): It is a common interface for performing matrix-vector products.
  • aslinearoperator(A): It returns A as a LinearOperator.
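A minimal sketch of both members follows. The helper function double and the example values are hypothetical and only serve to show that an operator can be defined by its matrix-vector product without storing a matrix.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import LinearOperator, aslinearoperator

# Hypothetical matrix-free operator: doubles every entry of a length-3 vector.
def double(v):
    return 2 * v

op = LinearOperator((3, 3), matvec=double, dtype=np.float64)
x = np.array([1.0, 2.0, 3.0])
y = op.matvec(x)

# Wrapping an existing sparse matrix exposes the same interface.
A = csr_matrix(np.diag([1.0, 2.0, 3.0]))
A_op = aslinearoperator(A)
z = A_op.matvec(x)

print(y, z)
```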

Matrix norms

It also has two functions to calculate the norm of a matrix.

  • norm(x[, ord, axis]): It returns the norm of a given sparse matrix.
  • onenormest(A[, t, itmax, compute_v, compute_w]): It computes a lower bound of the 1-norm of a sparse matrix.
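The two norm functions can be sketched on a small diagonal matrix whose norms are easy to verify by hand. The matrix values are an assumption chosen for this example.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import norm, onenormest

A = csr_matrix(np.array([[3.0, 0.0], [0.0, 4.0]]))

fro = norm(A)         # Frobenius norm (the default): sqrt(3**2 + 4**2)
one = norm(A, 1)      # 1-norm: largest absolute column sum
est = onenormest(A)   # estimated lower bound of the 1-norm

print(fro, one, est)
```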

Matrix Operations

It provides three functions to calculate the inverse and the exponential of a given sparse matrix.

  • inv(A): It is used to calculate the inverse of a given sparse matrix.
  • expm(A): It is used to calculate the matrix exponential with the help of the Pade approximation.
  • expm_multiply(A, B[, start, stop, num, endpoint]): It is used to calculate the action of the matrix exponential of A on B.
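A short sketch of the three functions on a diagonal matrix, where the expected results are easy to check, is given below. CSC format is used as an assumption here, because the direct solvers tend to prefer it and may emit an efficiency warning for other formats.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import inv, expm, expm_multiply

A = csc_matrix(np.array([[2.0, 0.0], [0.0, 4.0]]))

A_inv = inv(A)    # sparse inverse, here diag(1/2, 1/4)
A_exp = expm(A)   # sparse matrix exponential, here diag(e**2, e**4)

# Apply the exponential of A to a vector without forming expm(A) explicitly.
b = np.array([1.0, 1.0])
action = expm_multiply(A, b)

print(A_inv.toarray())
print(A_exp.toarray())
print(action)
```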

Solving linear problems

It has many functions for solving linear problems directly. Here we will look at two of them.

  • spsolve(A, b[, permc_spec, use_umfpack]): It is used to find the solution of the sparse linear system Ax = b, where b represents a vector or a matrix.
  • spsolve_triangular(A, b[, lower, …]): It is used to solve the equation Ax = b for x, where A is a triangular matrix.
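Both solvers can be sketched on 2 by 2 systems that are small enough to check by hand. The matrices and right-hand sides are illustrative assumptions.

```python
import numpy as np
from scipy.sparse import csc_matrix, csr_matrix
from scipy.sparse.linalg import spsolve, spsolve_triangular

# General system: 3x + y = 9 and x + 2y = 8, so x = 2 and y = 3.
A = csc_matrix(np.array([[3.0, 1.0], [1.0, 2.0]]))
b = np.array([9.0, 8.0])
x = spsolve(A, b)

# Lower-triangular system: 2u = 2 and u + 4v = 9, so u = 1 and v = 2.
L = csr_matrix(np.array([[2.0, 0.0], [1.0, 4.0]]))
c = np.array([2.0, 9.0])
y = spsolve_triangular(L, c, lower=True)

print(x, y)
```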

Other functions are listed in the official documentation of “scipy.sparse.linalg”.

Scipy Sparse CSR matrix

CSR stands for Compressed Sparse Row. We can create a CSR matrix using csr_matrix() from the subpackage scipy.sparse of Scipy.

A CSR matrix is suitable when we want to perform arithmetic on the matrix, such as addition, subtraction, multiplication, division, and matrix power.

The csr matrix can be created in many ways as shown below.

csr_matrix(D): create a csr matrix from a dense matrix or rank-2 ndarray D.

Import the necessary libraries using the below code.

import numpy as np
from scipy.sparse import csr_matrix

Create a rank-2 matrix using the below code.

D = np.array([[1, 0, 1, 0, 0, 0], [2, 0, 2, 0, 0, 1],\
 [0, 0, 0, 2, 0, 1]])

Check the created matrix using the below code.

print(D)

Pass the created matrix to the function csr_matrix() to create the csr matrix using the below code.

# Creating csr matrix
csr_m = csr_matrix(D)

csr_matrix(S): create a new csr matrix from another sparse matrix S. Here we reuse the matrix csr_m created above.

csr_mat = csr_matrix(csr_m)

Check the data type and other information related to the matrix.

csr_mat

View the created matrix using the below code.

csr_mat.toarray()

csr_matrix((M, N), [dtype]): It is used to create an empty matrix of shape (M, N) with an optional data type dtype.

Create a csr matrix using the below code.

csr_matrix((5, 4), dtype=np.int8).toarray()

csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)]): It is used to construct a matrix a where data, row_ind and col_ind satisfy the relationship a[row_ind[k], col_ind[k]] = data[k].

Create a csr matrix using the below code.

import numpy as np
from scipy.sparse import csr_matrix

row_idx = np.array([0, 1, 0, 1, 2, 2])
col_idx = np.array([0, 1, 1, 0, 1, 2])
data = np.array([2, 3, 4, 5, 6, 7])
csr_matrix((data, (row_idx,col_idx)), shape=(4, 4)).toarray()

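csr_matrix((data, indices, indptr), [shape=(M, N)]): It is the standard CSR representation, where the column indices for row i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]].

import numpy as np
from scipy.sparse import csr_matrix

# indptr must start at 0, never decrease, and end at len(data)
indptr_a = np.array([0, 2, 3, 6])
# every column index must be smaller than the number of columns (3)
indices_b = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
csr_matrix((data, indices_b, indptr_a), shape=(3, 3)).toarray()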
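To make the role of the three CSR arrays more concrete, the sketch below rebuilds a dense matrix by hand, reading row i from the slice indptr[i]:indptr[i+1], and compares it with what csr_matrix() produces. The array values are an assumption chosen so that they form a valid 3 by 3 CSR structure.

```python
import numpy as np
from scipy.sparse import csr_matrix

indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])

# Rebuild the dense matrix row by row from the three CSR arrays.
dense = np.zeros((3, 3), dtype=data.dtype)
for i in range(3):
    start, stop = indptr[i], indptr[i + 1]
    dense[i, indices[start:stop]] = data[start:stop]

# The hand-built result should match SciPy's own conversion.
reference = csr_matrix((data, indices, indptr), shape=(3, 3)).toarray()
print(dense)
```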


Scipy Sparse matrix to NumPy array

A NumPy array (ndarray) is a dense representation of a matrix, so here we will take a csr matrix and convert it into a dense ndarray using the method toarray().

The syntax is given below.

csr_matrix.toarray(order=None, out=None)

where parameters are:

  • order: It is used to specify whether the multi-dimensional data is stored in row-major (C) or column-major (F) order. By default, it is None, which gives no ordering guarantee.
  • out: It is used to pass an existing array (numpy.ndarray) as an output buffer, so that the result is written into it instead of a newly created array.
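Both parameters can be sketched as follows. The small matrix and the names c_order, f_order and buffer are assumptions made for illustration; the buffer is given the same shape and dtype as the sparse matrix.

```python
import numpy as np
from scipy.sparse import csr_matrix

sparse = csr_matrix(np.array([[1, 0, 2], [0, 3, 0]]))

c_order = sparse.toarray()            # default layout
f_order = sparse.toarray(order="F")   # column-major layout, same values

# Write the result into a preallocated buffer instead of a new array.
buffer = np.zeros((2, 3), dtype=sparse.dtype)
sparse.toarray(out=buffer)

print(f_order.flags["F_CONTIGUOUS"])
print(buffer)
```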

Let’s take an example using the below steps:

Import the necessary libraries using the below code.

import numpy as np
from scipy.sparse import csr_matrix

Create the arrays using the below code.

row_data = np.array([0, 0, 2, 2, 1, 1, 4, 4, 5, 5, 1, 1, 2])
col_data = np.array([0, 3, 1, 0, 6, 1, 6, 3, 5, 3, 1, 4, 3])
array_data = np.array([1]*len(row_data))

Create the csr matrix using the below code. The largest column index is 6, so the matrix needs 7 columns.

csr_mat = csr_matrix((array_data, (row_data, col_data)), shape=(6, 7))

Check the data type and stored elements within the matrix using the below code.

csr_mat

Convert the csr matrix to the numpy array matrix by applying the method toarray() on the matrix csr_mat using the below code.

csr_to_array = csr_mat.toarray()

Check the elements of the dense matrix csr_to_array using the below code.

csr_to_array


Scipy Sparse hstack

To stack sparse matrices column-wise (horizontally), scipy.sparse provides the function hstack().

The syntax is given below.

scipy.sparse.hstack(blocks, format=None, dtype=None)

Where parameters are:

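  • blocks: It is the sequence of sparse matrices that we want to stack. They must have compatible shapes.
  • format: It is used to specify the sparse format of the result (for example "csr"). By default it is None, and an appropriate sparse format is returned.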
  • dtype: It is the data type of the returned matrix.
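The format and dtype parameters can be sketched as follows. The input values and the choice of "csr" and float64 are assumptions made for this example.

```python
import numpy as np
from scipy.sparse import coo_matrix, hstack

left = coo_matrix([[1, 0], [0, 2]])
right = coo_matrix([[3], [4]])

# Request a CSR result with floating-point values.
stacked = hstack([left, right], format="csr", dtype=np.float64)

print(stacked.format, stacked.dtype, stacked.shape)
print(stacked.toarray())
```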

Run the below steps to create a horizontal stack matrix.

Import the required libraries using the below code.

from scipy.sparse import hstack, coo_matrix

Create two sparse matrices and pass these two matrices to a method hstack.

first_mat = coo_matrix([[2, 3], [4, 5]])
second_mat = coo_matrix([[6], [7]])
hstack([first_mat,second_mat]).toarray()

The result is a 2 by 3 array in which the column of second_mat is appended to the right of first_mat, which shows that the two matrices are stacked horizontally.


Scipy Sparse coo matrix

In Scipy, the subpackage scipy.sparse contains coo_matrix() to generate a new sparse matrix in coordinate format.

The coo matrix can be created in many ways as shown below.

coo_matrix(D): create a coo matrix from a dense matrix or rank-2 ndarray D.

Import the necessary libraries using the below code.

import numpy as np
from scipy.sparse import coo_matrix

Create a rank-2 matrix using the below code.

D = np.array([[1, 0, 1, 0, 0, 0], [2, 0, 2, 0, 0, 1],\
 [0, 0, 0, 2, 0, 1]])

Check the created matrix using the below code.

print(D)

Pass the created matrix to the function coo_matrix() to create the coo matrix using the below code.

# Creating coo matrix
coo_m = coo_matrix(D)

coo_matrix(S): create a new coo matrix from another sparse matrix S. Here we reuse the matrix coo_m created above.

coo_mat = coo_matrix(coo_m)

Check the data type and other information related to the matrix.

coo_mat

View the created matrix using the below code.

coo_mat.toarray()

coo_matrix((M, N), [dtype]): It is used to create an empty matrix of shape (M, N) with an optional data type dtype.

Create a coo matrix using the below code.

coo_matrix((4, 6), dtype=np.int8).toarray()
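In coordinate format each stored entry is kept as a (row, column, value) triple, available through the attributes row, col and data. A minimal sketch with an assumed 2 by 3 input:

```python
import numpy as np
from scipy.sparse import coo_matrix

coo = coo_matrix(np.array([[0, 5, 0], [7, 0, 0]]))

# Collect the stored entries as (row, column, value) triples.
triples = list(zip(coo.row.tolist(), coo.col.tolist(), coo.data.tolist()))
print(triples)
```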


Scipy Sparse eigsh

To find the eigenvalues and eigenvectors of a given symmetric square matrix, the function eigsh() is used, which exists within the subpackage scipy.sparse.linalg.

The syntax is given below.

scipy.sparse.linalg.eigsh(A, k=6)

where parameters are:

  • A: It accepts an ndarray, a LinearOperator, or a sparse matrix. It is a square operator representing the operation A * x, where A is real symmetric or complex Hermitian.
  • k: It is used to specify the number of eigenvalues and eigenvectors we want from that matrix. It must be smaller than the size of the matrix.
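As a sketch of how A and k are used, the example below runs eigsh() on a diagonal matrix whose eigenvalues are known in advance. The argument which="LA" (largest algebraic eigenvalues) is an additional eigsh() option that the syntax above does not list, and the 10 by 10 size is an arbitrary choice.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import eigsh

# Symmetric (diagonal) test matrix with eigenvalues 1, 2, ..., 10.
A = csr_matrix(np.diag(np.arange(1.0, 11.0)))

# Ask for the 3 largest eigenvalues and their eigenvectors.
values, vectors = eigsh(A, k=3, which="LA")

print(np.sort(values))
print(vectors.shape)   # one column per requested eigenvector
```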

Let’s take an example using the below steps.

Import the method eigsh using the below code.

from scipy.sparse.linalg import eigsh
import numpy as np

Create an identity matrix using the function np.eye().

identity_data = np.eye(15)

Find the eigenvalues and eigenvectors of the created matrix using the below code.

eigenval, eigenvect = eigsh(identity_data, k=8)

Check the eigenvalues using the below code.

eigenval

Scipy Sparse to dense

A dense matrix stores every element explicitly, including the zeros. Here we will take a csr matrix and convert it into a dense matrix using the method todense().

The syntax is given below.

csr_matrix.todense(order=None, out=None)

where parameters are:

  • order: It is used to specify whether the multi-dimensional data is stored in row-major (C) or column-major (F) order. By default, it is None, which gives no ordering guarantee.
  • out: It is used to pass an existing array (or numpy.matrix) as an output buffer, so that the result is written into it instead of a newly created array.
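The practical difference between todense() and toarray() on a csr_matrix is the return type, which the following minimal sketch prints. The 2 by 2 input is an assumption for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

sparse = csr_matrix(np.array([[1, 0], [0, 2]]))

as_matrix = sparse.todense()   # returns a numpy.matrix
as_array = sparse.toarray()    # returns a numpy.ndarray

print(type(as_matrix).__name__, type(as_array).__name__)
```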

Let’s take an example using the below steps:

Import the necessary libraries using the below code.

import numpy as np
from scipy.sparse import csr_matrix

Create the arrays using the below code.

row_data = np.array([0, 0, 2, 2, 1, 1, 4, 4, 5, 5, 1, 1, 2])
col_data = np.array([0, 3, 1, 0, 6, 1, 6, 3, 5, 3, 1, 4, 3])
array_data = np.array([1]*len(row_data))

Create the csr matrix using the below code. The largest column index is 6, so the matrix needs 7 columns.

csr_mat = csr_matrix((array_data, (row_data, col_data)), shape=(6, 7))

Check the data type and stored elements within the matrix using the below code.

csr_mat

Convert the csr matrix to the dense matrix by applying the method todense() on the matrix csr_mat using the below code.

csr_to_dense = csr_mat.todense()

Check the elements of the dense matrix csr_to_dense using the below code.

csr_to_dense

Scipy Sparse matrix from pandas dataframe

Here we will create a sparse matrix from the pandas dataframe using the function csr_matrix().

First, create a new dataframe using the below code.

# importing pandas as pd
import pandas as pd

# Creating the DataFrame
dataf = pd.DataFrame({'Marks':[50, 90, 60, 20, 80],
                 'Age':[14, 25, 55, 8, 21]})

# Create the index
index_ = pd.Series(range(1,6))

# Set the index
dataf.index = index_

# Print the DataFrame
print(dataf)

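Now convert the above-created dataframe into a sparse matrix using the below code. Because the values are transposed with .T, the resulting matrix has shape 2 by 5, with one row per dataframe column.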

# Importing the method csr_matrix()
from scipy.sparse import csr_matrix

# Creating the sparse matrix by passing the values of dataf 
# using dataf.values.T to method csr_matrix()
sparse_matrix_from_dataframe = csr_matrix(dataf.values.T)

# viewing the elements of created sparse matrix
sparse_matrix_from_dataframe.toarray()

So, in this tutorial, we have learned the “Scipy Sparse” and covered the following topics.

  • Scipy Sparse rand
  • Scipy Sparse linalg
  • Scipy Sparse CSR matrix
  • Scipy Sparse matrix to NumPy array
  • Scipy Sparse hstack
  • Scipy Sparse coo matrix
  • Scipy Sparse eigsh
  • Scipy Sparse to dense
  • Scipy Sparse matrix from pandas dataframe