Python Scipy Sparse Csr_matrix

We will learn about the “Python Scipy Sparse Csr_matrix” in this tutorial so that we may generate a CSR matrix and use various techniques including multiplication, dot, and transposition.

The following concepts will be covered, along with locating the index of the max and min items in the CSR matrix.

  • What is Scipy Sparse Csr_matrix
  • Python Scipy Sparse Csr_matrix
  • Python Scipy Sparse Csr_matrix Multiply
  • Python Scipy Sparse Csr_matrix Indptr
  • Python Scipy Sparse Csr_matrix Dot
  • Python Scipy Sparse Csr_matrix Shape
  • Python Scipy Sparse Csr_matrix Sort indices
  • Python Scipy Sparse Csr_matrix Nnz
  • Python Scipy Sparse Csr_matrix Transpose
  • Python Scipy Sparse Csr_matrix Argmax
  • Python Scipy Sparse Csr_matrix Argmin

What is Scipy Sparse Csr_matrix?

There are two common types of matrices: sparse and dense. In contrast to dense matrices, which have a majority of non-zero elements, sparse matrices have a majority of zeros.

In a large matrix, it is typical for the majority of the elements to be zeros. It is therefore reasonable to conduct operations using only the non-zero values, because zero times anything always results in zero.

  • The sparse matrix classes in Scipy store only the non-zero elements, which reduces the amount of memory needed for data storage. This matters in machine learning, where data frames frequently need to be held in memory.
  • A data frame that is too large is otherwise divided so that it can fit in RAM. Compressing it lets the data fit in RAM more easily, and performing operations on only the non-zero values of the sparse matrix can significantly speed up the algorithm.
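
The memory saving can be checked with a small sketch. The 1000 x 1000 matrix below is made-up example data (an assumption for illustration, not part of the tutorial), and the sparse size is estimated by adding up the three arrays that a CSR matrix stores.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical example data: 1000 x 1000 values, only the diagonal is non-zero
dense = np.zeros((1000, 1000), dtype=np.float64)
dense[np.arange(1000), np.arange(1000)] = 1.0

sparse_m = csr_matrix(dense)

# Bytes used by the dense array versus the three CSR arrays
dense_bytes = dense.nbytes
sparse_bytes = (sparse_m.data.nbytes
                + sparse_m.indices.nbytes
                + sparse_m.indptr.nbytes)

print("dense bytes :", dense_bytes)
print("sparse bytes:", sparse_bytes)
```

The exact numbers depend on the data type and the index type that Scipy picks, but the sparse version should be far smaller for data like this.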

One of the storage formats offered by Scipy is called Compressed Sparse Row (CSR). Here’s how it functions.

Suppose that a text document contains lines like those given below.

This is a Tsinfo

This is not Tsinfo. It is Tsinfotechnologies, Is it?

The first step is indexing: The words are given numbers. Repeated words are assigned the same number. From this step we can determine the number of distinct words in the document.

This is a Tsinfo
 0   1  2   3
This is not Tsinfo. It is Tsinfotechnologies, Is it?
 0   1  4    3     5   1       6              1  5  

Indexing begins at zero. The first word is “This,” which has an index of “0.” Every other word that is distinct will also have an index. Because the word “This” appears twice in the document, it receives the identical index value of “0” each time.

The second step is the document vector representation: There is a vector representation built for each line in the document.

How many distinct indices are there? Since there are 7 indices in this situation, ranging from 0 to 6, each document (or line) is represented by 7 values, where each value indicates the frequency with which a particular term associated with each index occurs.

(word-index): (This-0), (is-1), (a-2), (Tsinfo-3), (not-4), (It-5), (Tsinfotechnologies-6)
[1, 1, 1, 1, 0, 0, 0], [1, 3, 0, 1, 1, 2, 1]

The third step is creating a sparse vector for every document: The sparse representation of each document is shown below. It eliminates all values that are 0 and stores only the non-zero values as (index, count) pairs.

Doc 1: <(0, 1), (1, 1), (2, 1), (3, 1)>
Doc 2: <(0, 1), (1, 3), (3, 1), (4, 1), (5, 2), (6, 1)>
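
The three steps can be sketched in code as follows. This is only an illustration: the tokenizer (lowercasing and splitting on word characters) and the names docs, vocabulary and term_matrix are assumptions made for this sketch.

```python
import re
from scipy.sparse import csr_matrix

docs = ["This is a Tsinfo",
        "This is not Tsinfo. It is Tsinfotechnologies, Is it?"]

vocabulary = {}                 # step 1: word -> index
data, indices, indptr = [], [], [0]

for doc in docs:
    counts = {}                 # step 2: index -> frequency in this line
    for word in re.findall(r"\w+", doc.lower()):
        idx = vocabulary.setdefault(word, len(vocabulary))
        counts[idx] = counts.get(idx, 0) + 1
    # step 3: keep only the non-zero (index, count) pairs
    for idx in sorted(counts):
        indices.append(idx)
        data.append(counts[idx])
    indptr.append(len(indices))

term_matrix = csr_matrix((data, indices, indptr),
                         shape=(len(docs), len(vocabulary)), dtype=int)
print(vocabulary)
print(term_matrix.toarray())
```

Each row of term_matrix is the frequency vector of one line, and only the non-zero counts are stored.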

Python Scipy Sparse Csr_matrix

The class csr_matrix in the Scipy subpackage scipy.sparse allows us to generate a CSR matrix, that is, a compressed sparse row matrix.

The CSR matrix is appropriate if what we need is a matrix that can execute addition, multiplication, subtraction, matrix power, and division.

There are numerous ways to generate a CSR matrix, but here we will use only one of them. To know more, please visit the official documentation of Python Scipy.

We are going to use the form csr_matrix(D), where D is a rank-2 ndarray or a dense matrix.

Import the necessary libraries using the below code.

import numpy as np
from scipy.sparse import csr_matrix

Create a rank-2 matrix using the below code.

D = np.array([[1, 0, 1, 0, 0, 0],
              [2, 0, 2, 0, 0, 1],
              [0, 0, 0, 2, 0, 1]])

Check the created matrix using the below code.

print(D)

Pass the created matrix to the function csr_matrix() to create a CSR matrix, and view it using the below code.

# Creating csr matrix
csr_m = csr_matrix(D)
csr_m.toarray()

This is how to create a CSR matrix using the method csr_matrix() of Python Scipy.
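
To see what is actually stored, the three internal arrays of a CSR matrix can be printed. The sketch below rebuilds the same matrix D so that it runs on its own; data, indices and indptr are attributes of the CSR matrix.

```python
import numpy as np
from scipy.sparse import csr_matrix

D = np.array([[1, 0, 1, 0, 0, 0],
              [2, 0, 2, 0, 0, 1],
              [0, 0, 0, 2, 0, 1]])
csr_m = csr_matrix(D)

print(csr_m.data)     # stored values, row by row
print(csr_m.indices)  # column index of each stored value
print(csr_m.indptr)   # where each row starts and ends in the two arrays above
```

Only the 7 non-zero values of D are kept, together with their column positions and one pointer per row boundary.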

Python Scipy Sparse Csr_matrix Multiply

A sparse matrix is one in which the majority of its elements are zeros. SciPy’s 2-D sparse matrix package for numerical data is called “scipy.sparse”. It offers us various classes to build sparse matrices. Two of these classes are csc_matrix and csr_matrix.

In contrast to csr_matrix(), which is used to build a compressed sparse row matrix, the csc_matrix() creates a compressed sparse column matrix.

To multiply two sparse matrices element by element, we use the multiply() method offered by the CSR matrix class. Note that this is point-wise multiplication, not the matrix product. Let’s take an example for the demonstration by following the below code.

Import the required libraries using the below Python code.

from scipy import sparse
import numpy as np

Create the first CSR matrix using the below code.

row_1 = np.array([0, 1, 2, 0])
col_1 = np.array([0, 3, 0, 1])
data_1 = np.array([3, 4, 9, 8])

# data_1[k] is placed at position (row_1[k], col_1[k])
csr_matrix_A = sparse.csr_matrix((data_1, (row_1, col_1)), shape=(3, 4))
print("First CSR matrix:\n", csr_matrix_A.toarray())

Create the second CSR matrix using the below code.

row_2 = np.array([1, 2, 0, 0])
col_2 = np.array([3, 0, 0, 1])
data_2 = np.array([8, 3, 4, 9])

# data_2[k] is placed at position (row_2[k], col_2[k])
csr_matrix_B = sparse.csr_matrix((data_2, (row_2, col_2)), shape=(3, 4))
print("Second CSR matrix:\n", csr_matrix_B.toarray())

Multiply both matrices using the method multiply().

sparse_matrix_AB = csr_matrix_A.multiply(csr_matrix_B)
print("Multiplication of Sparse Matrix:\n",
      sparse_matrix_AB.toarray())

The output matrix contains the element-wise product of csr_matrix_A and csr_matrix_B: each entry is the product of the entries at the same position in the two matrices.

This is how to apply the method multiply() on CSR matrices to get the element-wise product of two CSR matrices.
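
The difference between multiply() and the matrix product can be seen in a short sketch. It rebuilds the two matrices from this section so that it runs on its own, and uses the @ operator with the transpose of the second matrix (a choice made here only so that the shapes are compatible).

```python
import numpy as np
from scipy import sparse

csr_matrix_A = sparse.csr_matrix(
    (np.array([3, 4, 9, 8]), (np.array([0, 1, 2, 0]), np.array([0, 3, 0, 1]))),
    shape=(3, 4))
csr_matrix_B = sparse.csr_matrix(
    (np.array([8, 3, 4, 9]), (np.array([1, 2, 0, 0]), np.array([3, 0, 0, 1]))),
    shape=(3, 4))

# Element-wise product: both operands have the same shape (3, 4)
elementwise = csr_matrix_A.multiply(csr_matrix_B)

# Matrix product: (3, 4) times (4, 3) gives a (3, 3) result
matrix_product = csr_matrix_A @ csr_matrix_B.T

print(elementwise.toarray())
print(matrix_product.toarray())
```

The first result keeps the shape (3, 4), while the second one has the shape (3, 3).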

Python Scipy Sparse Csr_matrix Indptr

The indptr attribute of a csr_matrix is the matrix’s index pointer array in CSR format.

The syntax is given below.

csr_matrix((data, indices, indptr), [shape=(M, N)])

Where parameters are:

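  • data: The matrix’s CSR format data array, holding the stored values.
  • indices: The matrix’s index array in CSR format, holding the column index of each stored value.
  • indptr: The matrix’s CSR-format index pointer array. The entries of row i are found at positions indptr[i] to indptr[i+1] - 1 of indices and data.
  • shape: It is used to specify the shape of the matrix.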

Let’s create a CSR matrix using indptr by following the below steps:

Import the required libraries or methods using the below Python code.

import numpy as np
from scipy import sparse

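Create a sparse CSR matrix using indptr, indices and data values. The index pointer array must start at 0, must not decrease, and must end at the number of stored values, and every column index must be smaller than the number of columns.

indptr_ = np.array([0, 2, 3, 6])
indices_ = np.array([0, 2, 2, 0, 1, 2])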
data_ = np.array([1, 2, 3, 4, 5, 6])
matrix_csr = sparse.csr_matrix((data_, indices_, indptr_), shape=(3, 3))
matrix_csr.toarray()

To check the index pointer use the attribute indptr on the above-created matrix.

matrix_csr.indptr

This is how to get the matrix’s index pointer array in CSR format using the attribute indptr on the CSR matrix.
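
How indptr is read can be sketched by slicing indices and data row by row. The small matrix below is defined inside the sketch so that it runs on its own; the variable names are chosen for illustration.

```python
import numpy as np
from scipy import sparse

indptr = np.array([0, 2, 3, 6])
indices = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])
m = sparse.csr_matrix((data, indices, indptr), shape=(3, 3))

rows = []
for i in range(m.shape[0]):
    # Row i owns the slice indptr[i]:indptr[i + 1] of indices and data
    start, end = m.indptr[i], m.indptr[i + 1]
    cols = m.indices[start:end].tolist()
    vals = m.data[start:end].tolist()
    rows.append((cols, vals))
    print(f"row {i}: columns {cols}, values {vals}")

print(m.toarray())
```

The difference indptr[i + 1] - indptr[i] is the number of stored values in row i.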

Python Scipy Sparse Csr_matrix Dot

The CSR matrix in Python Scipy has a method dot() that computes the ordinary dot product of the matrix with another array or matrix.

Let’s take an example by following the below steps:

Import the required libraries or methods using the below Python code.

import numpy as np
from scipy import sparse

Create an array for the dot product using the below code.

array_v = np.array([-1, 0, 1])

Create a CSR matrix using the below code.

matrix_A = sparse.csr_matrix([[2, 1, 0], [0, 3, 0], [5, 0, 4]])

Compute the dot of the above-created matrix by applying the dot() method on the matrix using the below code.

matrix_A.dot(array_v)

This is how to find the dot product of any CSR matrix using the method dot() of Python Scipy.
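
As a quick check, the result of dot() can be compared with the same product computed on the dense array. The sketch also shows dot() with a second sparse matrix; the names result, dense_result and sparse_product are chosen only for this illustration.

```python
import numpy as np
from scipy import sparse

matrix_A = sparse.csr_matrix([[2, 1, 0], [0, 3, 0], [5, 0, 4]])
array_v = np.array([-1, 0, 1])

# Sparse matrix times dense vector gives a dense 1-D array
result = matrix_A.dot(array_v)

# The same product computed on the dense form of the matrix
dense_result = matrix_A.toarray().dot(array_v)

# Sparse matrix times sparse matrix stays sparse
sparse_product = matrix_A.dot(matrix_A)

print(result)
print(sparse_product.toarray())
```

When the argument is a dense vector the result is a NumPy array, and when it is another sparse matrix the result is again a sparse matrix.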

Python Scipy Sparse Csr_matrix Shape

The method get_shape() can be applied to the CSR matrix to get the shape. The syntax is given below.

csr_matrix.get_shape()

Let’s take an example by following the below steps:

Import the required libraries or methods using the below Python code.

import numpy as np
from scipy import sparse

Create a sparse CSR matrix using the below code.

matrx = sparse.csr_matrix((4, 4), dtype=np.int8)

Now use the method get_shape() on the above-created CSR matrix.

matrx.get_shape()

This is how to get the shape of any CSR matrix using the method get_shape(), which returns the result as a tuple.
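
The same information is also available through the shape attribute, which behaves like the shape of a NumPy array. A minimal sketch, where the second matrix rect is an extra example added here for illustration:

```python
import numpy as np
from scipy import sparse

matrx = sparse.csr_matrix((4, 4), dtype=np.int8)   # empty 4 x 4 matrix
rect = sparse.csr_matrix((2, 5), dtype=np.int8)    # empty 2 x 5 matrix

print(matrx.shape)
print(rect.shape)

# The tuple can be unpacked into rows and columns
n_rows, n_cols = rect.shape
print(n_rows, n_cols)
```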

Python Scipy Sparse Csr_matrix Sort indices

The method sort_indices() of Python Scipy is applied to the CSR matrix to sort the column indices within each row. The sorting is done in place.

Import the required libraries or methods using the below Python code.

import numpy as np
from scipy import sparse

Create a sparse CSR matrix using indptr, indices, and data values.

indptr_ = np.array([0, 2, 3, 6])
indices_ = np.array([2, 0, 1, 0, 2, 1])
data_ = np.array([1, 2, 3, 4, 5, 6])
matrix_csr = sparse.csr_matrix((data_, indices_, indptr_), shape=(3, 3))
matrix_csr.toarray()

Now apply the method sort_indices() on the above matrix using the below code.

matrix_csr.sort_indices()
matrix_csr.indices

This is how to apply the method sort_indices() on the CSR matrix to sort the column indices of the matrix.
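
The effect of sorting can be sketched by comparing the indices array before and after the call. The example matrix is built inside the sketch with deliberately unordered column indices, which is an assumption made for illustration.

```python
import numpy as np
from scipy import sparse

indptr = np.array([0, 2, 3, 6])
indices = np.array([2, 0, 1, 0, 2, 1])   # rows 0 and 2 are out of order
data = np.array([1, 2, 3, 4, 5, 6])
m = sparse.csr_matrix((data, indices, indptr), shape=(3, 3))

dense_before = m.toarray()
indices_before = m.indices.copy()

m.sort_indices()   # sorts the column indices of every row in place

print("before:", indices_before)
print("after :", m.indices)
print("data  :", m.data)
```

The values in data are reordered together with the indices, so the matrix itself does not change.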

Python Scipy Sparse Csr_matrix Nnz

A sparse matrix keeps its stored elements in several arrays. Basically, nnz reports the number of stored values, which includes any zeros that are stored explicitly.

Let’s understand with an example and find the number of stored elements within the CSR matrix by following the below steps:

Import the required libraries or methods using the below Python code.

import numpy as np
from scipy import sparse

Create a sparse CSR matrix using the below code.

matrix_A = sparse.csr_matrix([[2, 1, 0], [0, 3, 0], [5, 0, 4]])
matrix_A.toarray()

Now read the attribute nnz of the matrix using the below code.

matrix_nnz = matrix_A.nnz
matrix_nnz

This is how to know the size or the number of the non-zero elements in the CSR matrix using the attribute nnz of Python Scipy.
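
One detail worth a sketch is that nnz counts stored values, so a zero that is stored explicitly is counted too. The matrix below stores such a zero on purpose (an assumption for illustration); count_nonzero() and eliminate_zeros() are methods of the CSR matrix.

```python
import numpy as np
from scipy import sparse

# Row 0 stores the values 2 and 0, row 1 stores the value 3
data = np.array([2, 0, 3])
indices = np.array([0, 1, 1])
indptr = np.array([0, 2, 3])
m = sparse.csr_matrix((data, indices, indptr), shape=(2, 3))

stored = m.nnz                  # counts the explicit zero as well
nonzero = m.count_nonzero()     # counts only values different from zero

m.eliminate_zeros()             # drops explicit zeros in place
stored_after = m.nnz

print(stored, nonzero, stored_after)
```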

Python Scipy Sparse Csr_matrix Transpose

By flipping rows into columns or columns into rows, you can find a matrix’s transpose. Python Scipy has a method transpose() that can be applied to the CSR matrix to reverse the sparse matrix’s dimensions.

The syntax is given below.

csr_matrix.transpose(axes=None, copy=False)

Where parameters are:

  • axes: This argument is in the signature only for NumPy compatibility. Do not pass anything other than the default value.
  • copy(boolean): Whether or not the attributes of the matrix should be copied whenever possible. The degree to which attributes are copied depends on the type of sparse matrix being used.

Let’s understand with an example and compute the transpose of the CSR matrix by following the below steps:

Import the required libraries or methods using the below Python code.

import numpy as np
from scipy import sparse

Create a sparse CSR matrix using the below code.

matrix_A = sparse.csr_matrix([[2, 1, 0], [0, 3, 0], [5, 0, 4]])
matrix_A.toarray()

Now apply the method transpose() on the matrix matrix_A using the below code.

matrix_trans = matrix_A.transpose()
matrix_trans.toarray()

This is how to flip the rows into columns or columns into rows using the method transpose() of Python Scipy on the CSR matrix.
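
A short sketch can confirm the result against NumPy and show one detail: the transpose of a CSR matrix comes back in CSC format, and tocsr() converts it back if CSR is needed. The names same_as_numpy and matrix_trans_csr are chosen only for this illustration.

```python
import numpy as np
from scipy import sparse

matrix_A = sparse.csr_matrix([[2, 1, 0], [0, 3, 0], [5, 0, 4]])

matrix_trans = matrix_A.transpose()
print(matrix_trans.format)      # storage format of the transposed matrix

# Compare with the transpose of the dense array
same_as_numpy = np.array_equal(matrix_trans.toarray(),
                               matrix_A.toarray().T)
print(same_as_numpy)

# Convert back to CSR when row-based access is needed
matrix_trans_csr = matrix_trans.tocsr()
print(matrix_trans_csr.format)
```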

Python Scipy Sparse Csr_matrix Argmax

The method argmax() of csr_matrix returns the indices of the maximum elements along an axis.

The syntax is given below.

csr_matrix.argmax(axis=None)

Where parameters are:

axis (0, 1, -1, -2): Along this axis, the argmax is calculated. If None (the default), the index of the maximum element in the flattened data is returned.

Import the required libraries or methods using the below Python code.

import numpy as np
from scipy import sparse

Create a sparse CSR matrix using indptr, indices, and data values.

indptr_ = np.array([0, 2, 3, 6])
indices_ = np.array([0, 2, 1, 0, 1, 2])
data_ = np.array([1, 2, 3, 4, 5, 6])
matrix_csr = sparse.csr_matrix((data_, indices_, indptr_), shape=(3, 3))
print(matrix_csr.toarray())

Now call the method argmax() on the CSR matrix using the below code.

print("Index of the maximum element is :", matrix_csr.argmax())

This is how to find the index of the maximum element in the CSR matrix using the method argmax() of Python Scipy.
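
The axis parameter and the flat index can be sketched as follows. The matrix is defined inside the sketch, and np.unravel_index is used here (as one possible approach) to turn the flat index into a row and a column.

```python
import numpy as np
from scipy import sparse

m = sparse.csr_matrix([[1, 0, 2], [0, 3, 0], [4, 5, 6]])

# axis=None: index into the flattened matrix
flat_index = m.argmax()
row, col = np.unravel_index(flat_index, m.shape)

# axis=1: column of the maximum in each row
per_row = np.asarray(m.argmax(axis=1)).ravel()

# axis=0: row of the maximum in each column
per_col = np.asarray(m.argmax(axis=0)).ravel()

print(flat_index, (row, col))
print(per_row)
print(per_col)
```

The calls with an axis return a two-dimensional result, which is why it is flattened with ravel() in this sketch.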

Python Scipy Sparse Csr_matrix Argmin

The method argmin() of csr_matrix returns the indices of the minimum elements along an axis. Zeros that are not stored explicitly are taken into account as well.

The syntax is given below.

csr_matrix.argmin(axis=None)

Where parameters are:

axis (0, 1, -1, -2): Along this axis, the argmin is calculated. If None (the default), the index of the minimum element in the flattened data is returned.

Import the required libraries or methods using the below Python code.

import numpy as np
from scipy import sparse

Create a sparse CSR matrix using indptr, indices, and data values.

indptr_ = np.array([0, 2, 3, 6])
indices_ = np.array([0, 2, 1, 0, 1, 2])
data_ = np.array([11, 21, 6, 8, 15, 16])
matrix_csr = sparse.csr_matrix((data_, indices_, indptr_), shape=(3, 3))
print(matrix_csr.toarray())

Now call the method argmin() on the CSR matrix using the below code.

print("Index of the minimum element is :", matrix_csr.argmin())

This is how to find the index of the minimum element in the CSR matrix using the method argmin() of Python Scipy.
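
Because the zeros of a sparse matrix take part in the comparison, argmin() on a matrix with only positive stored values points at a zero entry. The sketch below illustrates this and, as one possible workaround, looks at the data array to get the smallest stored value; the matrix and variable names are chosen for illustration.

```python
import numpy as np
from scipy import sparse

m = sparse.csr_matrix([[11, 0, 21], [0, 6, 0], [8, 15, 16]])

# Flat index of the minimum over all entries, zeros included
flat_index = m.argmin()
row, col = np.unravel_index(flat_index, m.shape)

# Smallest value among the stored (non-zero) entries only
smallest_stored = m.data.min()

print(flat_index, (row, col))
print(smallest_stored)
```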

In this tutorial, we have learned how to create a CSR matrix and also explored sorting its indices, getting its shape, and finding the index of its maximum and minimum elements. We covered the following topics.

  • What is Scipy Sparse Csr_matrix
  • Python Scipy Sparse Csr_matrix
  • Python Scipy Sparse Csr_matrix Multiply
  • Python Scipy Sparse Csr_matrix Indptr
  • Python Scipy Sparse Csr_matrix Dot
  • Python Scipy Sparse Csr_matrix Shape
  • Python Scipy Sparse Csr_matrix Sort indices
  • Python Scipy Sparse Csr_matrix Nnz
  • Python Scipy Sparse Csr_matrix Transpose
  • Python Scipy Sparse Csr_matrix Argmax
  • Python Scipy Sparse Csr_matrix Argmin